APOE-ε4 allele for the diagnosis of Alzheimer's and other dementia disorders in people with mild cognitive impairment in a secondary care setting

  • Protocol
  • Diagnostic


  • Lyzel S Elias-Sonnenschein,

    Corresponding author
    1. Maastricht University, Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Maastricht, Netherlands
    • Lyzel S Elias-Sonnenschein, Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Maastricht University, Postbus 616, Maastricht, 6200 MD, Netherlands. lyzel.elias@yahoo.nl.

  • Wolfgang Viechtbauer,

    1. Maastricht University, Department of Psychiatry and Neuropsychology, Maastricht, Netherlands
  • Inez Ramakers,

    1. Maastricht University, Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Maastricht, Netherlands
  • Obioha Ukoumunne,

    1. University of Exeter Medical School, University of Exeter, Peninsula CLAHRC, Exeter, Devon, UK
  • Frans RJ Verhey,

    1. Maastricht University, Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Maastricht, Netherlands
  • Pieter Jelle Visser

    1. Maastricht University, Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience, Maastricht, Netherlands
    2. VU University Medical Centre, Department of Neurology, Amsterdam, Netherlands


This is the protocol for a review and there is no abstract. The objectives are as follows:

Our primary objective is to determine the diagnostic accuracy of the APOE-ε4 allele in detecting AD dementia and other types of dementia in people with MCI in a secondary care setting.

Our secondary objective is to investigate the following potential sources of heterogeneity of test accuracy, where reported:

  • characteristics of participants, namely, baseline MMSE score, age, and gender;

  • APOE-ε4 background prevalence;

  • MCI definition;

  • length of follow-up; and

  • reference standards.


Mild cognitive impairment (MCI) refers to cognitive impairment without dementia (Morris 2001). It is a heterogeneous condition that has been operationalised in several ways. Matthews (Matthews 2008) has categorised MCI into age-related cognitive change, pathological decline, category systems, and the Mayo Clinic criteria. Under age-related cognitive change are age-consistent memory impairment (ACMI) and age-related cognitive decline (ARCD). Pathological decline encompasses the conditions of cognitive impairment no dementia (CIND), age-associated memory impairment (AAMI), age-associated cognitive decline (AACD), mild cognitive disorder (MCD), questionable dementia (QD), minimal dementia (MD), mild neurocognitive disorder (MNCD), limited cognitive disturbance (LCD), and benign senescent forgetfulness (BSF). The category systems cover subjective memory complaint (SMC) and MCI defined as a score of 22 to 26 on the 30-item Mini-Mental State Examination (MMSE). The Mayo Clinic criteria differentiate MCI into three subtypes: non-amnestic MCI (N-MCI), amnestic MCI (A-MCI), and multiple MCI (M-MCI). A-MCI refers to objective memory impairment with generally preserved cognitive functioning, or subjective memory complaints corroborated by an informant (Petersen 2004). Throughout this review, we will use the term 'MCI' to refer to any of the abovementioned conditions. In addition, we will be including in our definition of MCI impairment on a cognitive test with an overall Clinical Dementia Rating (CDR) score of 0.5, or based on an equivalent measure (Morris 2001).

People with MCI are at an increased risk for Alzheimer's disease (AD). About 35% of those with MCI have been reported to progress to AD over a follow-up period of 4.5 years, with a rate of progression three times as fast as that of cognitively healthy individuals (Bennett 2002). AD is a neurodegenerative disorder characterised pathologically by neuritic plaques and neurofibrillary tangles, and clinically by progressive memory loss, impairments in activities of daily living, and behavioural and neuropsychiatric symptoms. AD is the most common cause of dementia (Blennow 2006) and accounts for the majority of dementia cases worldwide (World Health Organization 2012).

There is currently no cure for AD dementia, although previous studies have shown some benefits of drug treatments in delaying cognitive decline and slowing disease progression (Peters 2012; Rountree 2012). Identifying those who are at an increased risk of developing AD dementia is therefore important for prognosis purposes and early intervention, when and if a treatment for reducing incident AD dementia is found.

Among the known risk factors for AD dementia is the ε4 allele of the apolipoprotein E (APOE) gene. The APOE protein transports cholesterol and other lipids in the brain, and is involved in neuronal repair (Bu 2009). APOE has three polymorphic alleles (ε2, ε3, and ε4), the combinations of which result in the six genotypes ε2ε2, ε2ε3, ε2ε4, ε3ε3, ε3ε4, and ε4ε4 (Leoni 2011). The APOE-ε4 allele has been shown to have high-avidity binding to amyloid beta (Strittmatter 1993), which is the main constituent of neuritic plaques.
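The count of six genotypes follows from taking every unordered pair of the three alleles. A minimal Python sketch of that enumeration is given below; the names `apoe_genotypes` and `is_e4_carrier` are illustrative and do not come from any cited source.

```python
from itertools import combinations_with_replacement

ALLELES = ("ε2", "ε3", "ε4")

def apoe_genotypes(alleles=ALLELES):
    """Return every unordered pair of alleles as a genotype string."""
    return ["".join(pair) for pair in combinations_with_replacement(alleles, 2)]

def is_e4_carrier(genotype):
    """A carrier has at least one ε4 allele (heterozygote or homozygote)."""
    return "ε4" in genotype

genotypes = apoe_genotypes()
carriers = [g for g in genotypes if is_e4_carrier(g)]
print(genotypes)
print(carriers)
```

Three of the six genotypes contain at least one ε4 allele, which is the carrier group used as the positive index test result in this review.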

About 15% of the general population has been reported to be APOE-ε4 allele carriers (Bu 2009). In Europe, APOE-ε4 prevalence has been reported to follow a north-south gradient, which was highest in northern Europe (~60%), followed by middle Europe (~40%) and southern Europe (~30%) (Norberg 2011). An estimated 40% of those with AD dementia in the general population carry the ε4 allele (Bu 2009). APOE-ε4 has been associated with a ~3.7-fold increased risk for AD dementia in case-control studies (Bertram 2007), while carriership of two ε4 alleles has been reported to increase the risk to as high as ~15-fold (Farrer 1997). Although other genetic risk factors for AD dementia have been identified in genome-wide association studies in recent years, these genes usually show a much weaker effect (odds ratio of < 1.5) compared to APOE-ε4. In a meta-analysis, we found the APOE-ε4 allele to be a moderately strong predictor of progression from MCI to AD dementia (odds ratio of ~2.3) (Elias-Sonnenschein 2011). The review, therefore, has useful implications for stratifying participants of clinical trials.

The purpose of the current review is to (1) assess through meta-analysis the diagnostic accuracy of the APOE-ε4 allele for identifying AD and predicting progression to AD dementia and other dementia among people with MCI; and (2) update and expand our previous meta-analysis. It differs from our earlier study in the following aspects. First, the period that will be covered is from 1993, when the first publication on the relationship between the APOE-ε4 allele and AD appeared (Strittmatter 1993), to 2013. Second, the definition of MCI will include the concepts of Matthews (Matthews 2008). Third, the methodological quality of the studies that will be included in the review will be assessed using the revised quality assessment of diagnostic accuracy studies (QUADAS-2) (Whiting 2011). Fourth, in the statistical analyses, we will be taking into account correlations between outcome measures, which are ideally assessed using a bivariate model. Fifth, no language restriction will be imposed on the search for relevant literature for the review. And sixth, as the diagnostic accuracy of the APOE-ε4 allele in one setting may be different from other settings, we will be including in this review only studies in a secondary care setting. We have prepared a similar review protocol for studies in primary care and in community settings. We will be evaluating the accuracy of the APOE-ε4 allele at baseline in relation to (1) progression from MCI to AD dementia; and (2) progression from MCI to other types of dementia.

Target condition being diagnosed

The two target conditions in this review are (1) AD dementia; and (2) other types of dementia, which will be assessed at follow-up. We will be comparing the results of the index test obtained at baseline with the results of the reference standards at follow-up (delayed verification).

Index test(s)

The index test is a genetic marker, that is, presence of APOE-ε4 allele as assessed by genotyping. A person is either a carrier or a non-carrier of the APOE-ε4 allele.

Clinical pathway

Dementia develops over several years. There is a presumed period when people are asymptomatic and when pathology is accumulating. Individuals or their relatives may then notice subtle impairments of recent memory. Gradually, more cognitive domains become involved, and difficulty in planning complex tasks becomes increasingly apparent. In the Netherlands, people usually present to their general practitioner, who may administer a cognitive screening test and refer them to a hospital memory clinic. However, many people with dementia do not consult their general practitioner until much later in the disorder and they will follow a different pathway to diagnosis, for example being identified during an admission to a general hospital for a physical illness. Thus, the pathway influences the accuracy of the diagnostic test. The accuracy of the test will vary with the experience of the administrator, and the accuracy of the subsequent diagnosis will vary with the history of referrals to the particular healthcare setting. Diagnostic assessment pathways may vary in other countries and diagnoses may be made by a variety of specialists, including neurologists and geriatricians.

Standard assessment of dementia includes history taking, clinical examination (including neurological, mental state, and cognitive examinations), and informant interview. Prior to diagnosis of dementia, the clinician normally rules out, and if possible treats, other physical or mental conditions that may be causing the cognitive impairment. Neuroimaging (computed tomography or magnetic resonance imaging) is recommended in most recent guidelines (McKhann 2011; NICE 2006). Most neuroimaging tests, and also recently cerebrospinal fluid sampling, are performed after a cognitive deficit is noted. However, individuals with abnormalities on brain imaging, which may be performed for any number of reasons, are more likely to be tested subsequently for cognitive impairment.

Dementia as diagnosed is defined by a deficit in more than two cognitive domains of sufficient degree to impair functional activities of daily living. The different diagnostic criteria for dementia are presented in the 'Reference standards' section and in Appendix 1.


An increasing number of studies suggest genetic susceptibility to AD dementia (Hollingworth 2011; Lambert 2009; Naj 2011; Seshadri 2010). The APOE-ε4 allele is the strongest known genetic risk factor for AD dementia. As a diagnostic tool, APOE genotyping might aid in:

  • identifying individuals with MCI at high risk of developing AD dementia;

  • increasing the accuracy of dementia diagnostics, in addition to the reference standards;

  • selecting candidates for trials with drugs that aim to slow down the progression of AD dementia.

Particular advantages of APOE genotyping are that it is safe and relatively inexpensive to perform.


Our primary objective is to determine the diagnostic accuracy of the APOE-ε4 allele in detecting AD dementia and other types of dementia in people with MCI in a secondary care setting.

Secondary objectives

Our secondary objective is to investigate the following potential sources of heterogeneity of test accuracy, where reported:

  • characteristics of participants, namely, baseline MMSE score, age, and gender;

  • APOE-ε4 background prevalence;

  • MCI definition;

  • length of follow-up; and

  • reference standards.


Criteria for considering studies for this review

Types of studies

We will include longitudinal cohort studies in a secondary care setting of people with MCI in which APOE was genotyped at baseline and the reference standard results were obtained at follow-up. These studies necessarily employed delayed verification of progression to dementia and are sometimes labelled as 'delayed verification cross-sectional studies' (Bossuyt 2008). We will include case-control studies if they incorporate a delayed verification design and indicate the time point wherein participants have developed MCI. As this can only occur in the context of a cohort study, these studies are invariably diagnostic nested case-control studies. We will impose no restriction on length of follow-up.


We will include participants with MCI. MCI will be defined according to the criteria of Matthews, Morris, or Petersen (Matthews 2008; Morris 2001; Petersen 2004), as described above. These criteria include a CDR score of 0.5, memory impairment confirmed by neuropsychological assessment, subjective memory complaints corroborated by an informant, generally preserved cognitive functions, no or minimal impairment of activities of daily living, and absence of dementia.

The participants are those in a secondary care setting. By secondary care, we mean all hospital-based or specialist services with labels of secondary care or tertiary referral. We will include memory clinics and other outpatient services; psychiatry; gerontology and neurology wards; and office-based services with a 'specialist' in dementia in the secondary care rubric. Most individuals in this category will have already had an interaction with medical services and a provisional diagnosis may have been made.

We will exclude participants with cognitive impairment due to alcohol or substance abuse, head injury, major depression or other neuropsychiatric disorders, Parkinson’s disease or other neurological disorders, somatic disorders, or endocrine disorders. We will consider separately studies that have more than 20% dropouts. We will note the causes for the dropouts. If this information is missing, the reliability of the conclusions must be questioned.

Index tests

We will classify participants genotyped for APOE as ε4 allele carriers (heterozygote carriers or homozygote carriers) or non-carriers. We will record the genotypes for the other APOE alleles if available, which may then be the subject of subgroup analyses.

Target conditions

There are two target conditions in this review:

1. AD dementia (conversion from MCI to AD dementia); and

2. any other types of dementia (conversion from MCI to any other types of dementia).

Reference standards

Several definitions of AD dementia are acceptable for the purposes of this review. Post-mortem diagnosis is the 'gold standard'. Pathological confirmation of AD according to the Braak, the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD), or the National Institute for Aging and Ronald and Nancy Reagan Institute for the Alzheimer’s Association (NIA-RIA) criteria is based on the density and distribution of neurofibrillary tangles (NFTs) or neuritic plaques (NP) (Murayama 2004) (Appendix 1).

The acceptable clinical reference standards for diagnosing AD dementia include: the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) criteria (McKhann 1984); the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10) (World Health Organization 1993); the Diagnostic and Statistical Manual of Mental Disorders fourth edition text revised (DSM-IV-TR) (American Psychiatric Association 2000); DSM fourth edition (DSM-IV) (American Psychiatric Association 1994); and DSM third revised edition (DSM-III-R) (American Psychiatric Association 1987).

NINCDS-ADRDA differentiates between probable and possible AD. A diagnosis of probable AD is supported by abnormal biomarkers, whereas possible AD mainly has a clinical presentation. Details of the criteria are shown in Appendix 1. If sufficient data are available, we will perform subgroup analyses for probable and for possible AD.

Recently, the NINCDS-ADRDA criteria have been revised to include as key diagnostic features early episodic memory impairment; the presence of abnormal biomarkers, namely, medial temporal lobe atrophy, abnormal cerebrospinal fluid markers; specific metabolic pattern on neuroimaging methods; and familial genetic mutations (Dubois 2007). Studies that used these revised criteria will be included in the review.

The key distinction in all clinical criteria is the extent of social and occupational functional impairment. The different iterations of these clinical criteria may not be directly comparable over time, and the validity of the diagnoses may vary depending on how they were operationalised. We will take these issues into consideration in interpreting results, assessing sources of heterogeneity, and performing sensitivity analyses, as appropriate.

With regard to other dementia, the reference standard for Lewy body dementia is the McKeith criteria (McKeith 2005); for frontotemporal dementia, the Lund criteria (The Lund and Manchester Groups 1994), Boxer criteria (Boxer 2005), DSM, and ICD; and for vascular dementia, the National Institute of Neurological Disorders and Stroke and Association Internationale pour la Recherche et I'Enseignement en Neurosciences (NINDS-AIREN) criteria (Roman 1993), DSM, and ICD.

In the review, we will perform separate analyses for AD dementia diagnosis based on pathological criteria and on clinical criteria. We will include the reference standards in the investigation of sources of heterogeneity.

Search methods for identification of studies

Electronic searches

We will search ALOIS, the Cochrane Dementia and Cognitive Improvement Group Specialised Register, MEDLINE (OvidSP), EMBASE (OvidSP), BIOSIS Previews (OvidSP), Science Citation Index (ISI Web of Knowledge), PsycINFO (OvidSP), and LILACS (BIREME). See Appendix 2 for a proposed draft strategy to be run in MEDLINE (OvidSP) plus additional narrative on the search process. We will design similarly structured search strategies using search terms appropriate for each database listed above. We will also request a search of the Cochrane Register of Diagnostic Test Accuracy Studies (maintained by the Cochrane Renal Group).

We will make no restriction based on the language of the study report. We will use translation services as necessary. A single researcher with extensive experience of systematic reviews will perform the initial searches.

Searching other resources

  • Grey literature: Chosen electronic databases will include assessments of conference proceedings.

  • Handsearching: We will not perform handsearching as there is little published evidence of the benefits of handsearching for reports of DTA studies (Glanville 2012).

  • Reference lists: We will scan reference lists of all eligible studies and reviews in the field for further possible titles and repeat the process until no new titles are found (Greenhalgh 2005; Horsley 2011). We will also use citation tracking and 'related articles' features to identify studies that could be included in the review.

Data collection and analysis

Selection of studies

One researcher will screen all titles generated by electronic database searches for relevance. The results of these electronic searches will receive an initial screen, including de-duplication, by a team of experienced assessors (predominantly senior medical students). This group has previously been shown to have good inter-rater reliability results.

Two review authors with content expertise will perform a second level of screening and identify studies for inclusion in the review. In case of differences of opinion on whether or not to include a particular study, a third author will serve as arbiter. When multiple publications are based on the same cohort, we will select the publication with the largest sample. We will use a PRISMA flowchart to illustrate the selection process.

Data extraction and management

Two review authors will independently complete a data extraction form for each study (Appendix 3). The data extraction form includes information on the baseline characteristics of the population, APOE genotyping, outcomes at follow-up, MCI definition used in the study, study duration, and inclusion and exclusion criteria.

We will summarise data on the APOE-ε4 allele and outcome in a 2 x 2 table. In this table, we will specify the numbers of carriers of at least one APOE-ε4 allele and of non-carriers who remained stable or progressed to AD. We will summarise information on the specific APOE genotype and type of dementia at follow-up in another table.
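As an illustration of how such a 2 x 2 table maps onto accuracy measures, the following Python sketch stores the four cells and derives sensitivity and specificity. The class name `TwoByTwo` and the counts are hypothetical and serve only to show the arithmetic; they are not data from any study.

```python
from dataclasses import dataclass

@dataclass
class TwoByTwo:
    tp: int  # ε4 carriers who progressed to AD dementia
    fp: int  # ε4 carriers who remained stable
    fn: int  # non-carriers who progressed to AD dementia
    tn: int  # non-carriers who remained stable

    def sensitivity(self):
        """Proportion of progressors who were ε4 carriers."""
        return self.tp / (self.tp + self.fn)

    def specificity(self):
        """Proportion of stable participants who were non-carriers."""
        return self.tn / (self.tn + self.fp)

# Hypothetical counts for one study.
table = TwoByTwo(tp=30, fp=20, fn=20, tn=30)
print(table.sensitivity(), table.specificity())
```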

We will cross-check the data for consistency. We will contact authors in the case of incomplete data. We will use Excel, Endnote, and RevMan to manage the data.

Assessment of methodological quality

We will assess the methodological quality of each study using the QUADAS-2 tool (Whiting 2011), as recommended by The Cochrane Collaboration. The tool consists of four domains: patient selection, index test, reference standard, and patient flow. Each domain is assessed in terms of risk of bias, with the first three domains also considered in terms of applicability (http://www.bris.ac.uk/quadas/quadas-2; Appendix 3). The components of each of these domains and a rubric which details how judgements concerning risk of bias are made are detailed in Appendix 4. Certain key areas that are important to qualify assessment are participant selection, blinding, and missing data. We recognise that QUADAS-2 may still develop and that as yet standard RevMan software does not incorporate QUADAS-2 functionality.

We will pilot QUADAS-2 assessment on two papers. If agreement is poor, we will refine the questions. We will not use QUADAS-2 data to form a summary quality score. We will produce a narrative summary describing numbers of studies that found high, low, and unclear risk of bias as well as concerns regarding applicability, which we will assess using Appendix 5.

Statistical analysis and data synthesis

We will apply the diagnostic test accuracy (DTA) framework for the analysis of a single test and extract the data from each study into a 2 x 2 table, showing the binary test results cross-classified with the binary reference standard and ignoring any loss to follow-up in the primary analyses. Such a reduction in the data may represent a significant oversimplification. We will therefore adopt an intention to diagnose (ITD) approach in sensitivity analyses wherein we will be imputing missing data under the assumption that:

(i) all dropouts developed dementia;

(ii) all dropouts did not develop dementia;

(iii) 50% of all dropouts who were APOE-ε4 carriers developed dementia and 50% did not develop dementia; 50% of all dropouts who were APOE-ε4 non-carriers developed dementia and 50% did not develop dementia; and

(iv) the proportion of people who developed dementia was the same as the overall proportion regardless of APOE-ε4 carrier status.

To check whether the conclusion is affected by the number of dropouts, we will compare the results of analyses without imputed data with those with imputed data.
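The four imputation scenarios can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the planned analysis code: the function names are hypothetical, the counts are invented, fractional counts are allowed, and scenario (iv) is read as applying the observed overall proportion of progressors to dropouts in both carrier groups.

```python
def impute_dropouts(tp, fp, fn, tn, drop_carriers, drop_noncarriers):
    """Return {scenario: (tp, fp, fn, tn)} with dropouts added to the table."""
    # Observed proportion who developed dementia, ignoring carrier status.
    overall = (tp + fn) / (tp + fp + fn + tn)
    # Assumed probability that a dropout developed dementia in each scenario.
    rates = {"i": 1.0, "ii": 0.0, "iii": 0.5, "iv": overall}
    tables = {}
    for name, p in rates.items():
        tables[name] = (
            tp + p * drop_carriers,           # carriers imputed as progressed
            fp + (1 - p) * drop_carriers,     # carriers imputed as stable
            fn + p * drop_noncarriers,        # non-carriers imputed as progressed
            tn + (1 - p) * drop_noncarriers,  # non-carriers imputed as stable
        )
    return tables

def sens_spec(tp, fp, fn, tn):
    """Sensitivity and specificity from the four cells of a 2 x 2 table."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical study: observed table plus 10 dropouts in each carrier group.
observed = (30, 20, 20, 30)
imputed = impute_dropouts(*observed, drop_carriers=10, drop_noncarriers=10)
for name, cells in imputed.items():
    print(name, sens_spec(*cells))
```

Comparing the output of `sens_spec` for the observed table with each imputed table gives a quick view of how far the dropouts could move the estimates.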

The index test, presence of the APOE-ε4 allele, is a genetic marker. The platforms used for the genotyping assay are highly accurate, with a 100% concordance (Ghebranious 2005). The index test is ordinal with three levels: an individual may have no copies of the allele of interest, one copy, or two copies. Thus, we anticipate that the accuracy data may be presented as 3 x 2 tables. We will implement separate analyses based on 2 x 2 tables to assess accuracy when the test result is dichotomised as one or two copies (present) versus no copies (absent), and then two copies (present) versus none or one only (absent). We will then obtain pooled estimates of sensitivity and specificity using the bivariate model (Macaskill 2010; Reitsma 2005). We will conduct these analyses in WinBUGS, Stata, R, and RevMan 5.2 software (Lunn 2009; R Development Core Team 2009; Review Manager 2012; Stata Corporation 2011).
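The two dichotomisations of a 3 x 2 table described above can be sketched in Python as follows. The function name `dichotomise`, the dictionary layout, and the counts are hypothetical choices made for illustration.

```python
def dichotomise(counts, threshold):
    """Collapse a 3 x 2 table into a 2 x 2 table.

    counts: {number of ε4 copies: (progressed, stable)} for 0, 1 and 2 copies.
    threshold: minimum number of ε4 copies counted as test-positive.
    Returns (tp, fp, fn, tn).
    """
    tp = sum(p for copies, (p, _) in counts.items() if copies >= threshold)
    fp = sum(s for copies, (_, s) in counts.items() if copies >= threshold)
    fn = sum(p for copies, (p, _) in counts.items() if copies < threshold)
    tn = sum(s for copies, (_, s) in counts.items() if copies < threshold)
    return tp, fp, fn, tn

# Hypothetical 3 x 2 table from a single study.
three_by_two = {0: (20, 50), 1: (25, 20), 2: (5, 2)}

any_copy = dichotomise(three_by_two, threshold=1)    # one or two copies vs none
two_copies = dichotomise(three_by_two, threshold=2)  # two copies vs none or one
print(any_copy, two_copies)
```

Raising the threshold from one copy to two moves heterozygotes from the test-positive to the test-negative row, so sensitivity can only stay the same or fall and specificity can only stay the same or rise.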

We will explore the implications of any summary accuracy estimates that are not affected by heterogeneity by considering the numbers of false positives and false negatives in populations with different prevalence of MCI, presenting the results as natural frequencies and using alternative metrics such as likelihood ratios and predictive values. We will prepare a summary of results table.
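The conversion of summary estimates into natural frequencies, likelihood ratios, and predictive values can be sketched in Python as below. The function names and the input values are hypothetical, and `prevalence` is treated here as the proportion of the tested population who have the target condition at follow-up, which is our reading for the purpose of this sketch.

```python
def likelihood_ratios(sens, spec):
    """Return (positive LR, negative LR) for given sensitivity and specificity."""
    return sens / (1 - spec), (1 - sens) / spec

def natural_frequencies(sens, spec, prevalence, n=1000):
    """Expected test outcomes in a hypothetical population of n people."""
    with_condition = prevalence * n
    without_condition = n - with_condition
    tp = sens * with_condition
    fn = with_condition - tp
    tn = spec * without_condition
    fp = without_condition - tn
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "PPV": tp / (tp + fp),  # positive predictive value
        "NPV": tn / (tn + fn),  # negative predictive value
    }

# Hypothetical summary estimates applied at three assumed prevalences.
for prevalence in (0.1, 0.3, 0.5):
    print(prevalence, natural_frequencies(0.6, 0.7, prevalence))
print(likelihood_ratios(0.6, 0.7))
```

The likelihood ratios do not depend on prevalence, whereas the numbers of false positives and false negatives and the predictive values change with it.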

Investigations of heterogeneity

We will use bivariate meta-regression models to investigate heterogeneity in test accuracy. Where reported, we will code potential sources of heterogeneity as follows.

  • Characteristics of participants: mean age and mean MMSE score as continuous variables, gender as proportion of male to female.

  • Background prevalence of APOE-ε4: continuous variable.

  • Definition of MCI: dichotomous variable.

  • Length of follow-up: continuous variable.

  • Reference standards used: dichotomous variable.

If there are sufficient studies, we will test these variables by including them as covariates in the bivariate hierarchical model assessing whether there is evidence that sensitivity and specificity differ across the categories. We will check whether continuous sources of heterogeneity have a linear association with the log odds of sensitivity and the log odds of specificity. We will assess whether particular covariates should be included in the model using likelihood ratio tests.
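The log odds (logit) scale on which these associations are checked can be illustrated with a short Python sketch. It computes only the study-level logit sensitivity and specificity with their approximate variances, which are the usual inputs to a bivariate model; it is not the bivariate meta-regression itself. The function name and the 0.5 continuity correction for zero cells are assumptions made for this sketch.

```python
import math

def logit_accuracy(tp, fp, fn, tn, correction=0.5):
    """Logit sensitivity and specificity with approximate variances.

    If any cell is zero, `correction` is added to all four cells first
    (an assumed convention for this sketch).
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    return {
        "logit_sens": math.log(tp / fn),    # log odds of sensitivity
        "var_logit_sens": 1 / tp + 1 / fn,
        "logit_spec": math.log(tn / fp),    # log odds of specificity
        "var_logit_spec": 1 / tn + 1 / fp,
    }

# Hypothetical study counts.
print(logit_accuracy(30, 20, 20, 30))
```

Plotting these logit values against a continuous covariate such as mean age gives a simple visual check of linearity before the covariate is entered in the model.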

Sensitivity analyses

We will investigate the influence of study quality on the accuracy of the index test using QUADAS-2 and the Standards for Reporting Diagnostic Accuracy (STARD) (Bossuyt 2003) (Appendix 4; Appendix 5; Appendix 6). We will exclude studies at high risk of bias as assessed by QUADAS-2 in sensitivity analyses, and we will compare the results with those from the primary analysis.

We will investigate outcomes for individuals with censored data using an ITD approach, as described previously. Differing length of follow-up may affect outcomes. We have set a minimum mean time to follow-up assessment of nine months. If there are sufficient studies spanning the full range of the eligible follow-up period, we will investigate the stability of results when studies with a shorter follow-up period are excluded from the analyses. Where the criteria used for the clinical diagnosis of dementia are not among the acceptable reference standards for this review, we will likewise exclude these studies in a sensitivity analysis to test whether the results differ from the analysis including all studies.

Assessment of reporting bias

We will not investigate reporting bias because of current uncertainty about how it operates in test accuracy studies and the interpretation of existing analytical tools such as funnel plots.


We thank the Cochrane Dementia and Cognitive Improvement Group for their general support.

Dr. Obioha Ukoumunne is supported by the Peninsula Collaboration for Leadership in Applied Health Research and Care (CLAHRC), a collaboration between the University of Exeter, University of Plymouth, and National Health Service South West, funded by the National Institute for Health Research.


Appendix 1. Reference standards for the diagnosis of AD dementia

Pathological criteria

Braak stage I/II indicates normal condition; stage III/IV, cognitive impairment; and stage V/VI, dementia (Murayama 2004). The CERAD differentiates between neuritic and diffuse plaques, classifies neocortical neuritic plaques (NP) as 0, A, B, and C, and in addition uses clinical information in making the diagnosis. A CERAD stage C means abundant NP and corresponds to dementia in the elderly. The NIA-RIA categorises dementia in terms of likelihood and combines the Braak and CERAD criteria. Braak stage I/II and CERAD NP stage A correspond to low probability of AD; stage III/IV and B indicate middle probability of AD; and stage V/VI and C are regarded as high probability of AD (Murayama 2004).
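The NIA-RIA combination rule as summarised above can be written as a small lookup in Python. This sketch encodes only the three concordant combinations stated in this appendix; the function name is hypothetical, and returning `None` for any other combination is our choice, since the text above does not say how discordant stages are classified.

```python
# Group Braak stages and CERAD neuritic plaque scores into three levels.
BRAAK_GROUP = {"I": 1, "II": 1, "III": 2, "IV": 2, "V": 3, "VI": 3}
CERAD_GROUP = {"A": 1, "B": 2, "C": 3}
LIKELIHOOD = {1: "low", 2: "middle", 3: "high"}

def nia_ria_likelihood(braak_stage, cerad_score):
    """Return the probability of AD for concordant Braak and CERAD levels."""
    braak = BRAAK_GROUP.get(braak_stage)
    cerad = CERAD_GROUP.get(cerad_score)
    if braak is None or cerad is None or braak != cerad:
        return None  # combination not covered by the summary in this appendix
    return LIKELIHOOD[braak]

print(nia_ria_likelihood("V", "C"))
print(nia_ria_likelihood("III", "C"))
```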

Clinical criteria

The NINCDS-ADRDA key features for the diagnosis of probable AD are impairment in at least two cognitive domains, progressive impairment in memory and other cognitive functions not owing to other physical or neurological conditions, normal consciousness, and age of onset between 40 and 90 years (McKhann 1984). The supportive features are progressive worsening of specific cognitive functions, such as language, goal-oriented activity, and perception, which is also confirmed by the patient's family; and abnormal results of electroencephalogram (EEG), computed tomography (CT) or magnetic resonance imaging (MRI), or cerebrospinal fluid (CSF) examination (McKhann 1984). Possible AD refers to dementia based on clinical examination and a score on the MMSE or a comparable test. It has a clinical presentation. A diagnosis of possible AD is made in the absence of other neurological, psychiatric or general physical impairment that could be causing the dementia; or in the presence of a general physical impairment or brain disorder that could result in dementia but is, in this case, not the cause of the dementia; or in the presence of a confirmed single gradually progressive serious cognitive impairment without other cause (McKhann 1984).

Appendix 2. MEDLINE search strategy

MEDLINE (OvidSP) search strategy

1. exp Cohort Studies/

2. cohort.ti,ab.

3. longitudinal.ti,ab.

4. follow-up.ti,ab.

5. year risk.ti,ab.

6. (prospective adj2 (study or analysis or evaluation)).ti,ab.

7. epidemiologic studies/

8. "observational study".ti,ab.

9. "preclinical detection".ti,ab. OR predict*.ti.

10. Disease Progression/ OR Risk/

11. or/1-10

12. (dement* OR "cognit* impair*").ti.

13. (alzheimer* or AD).ti.

14. exp Dementia/di [Diagnosis]

15. exp Dementia/ep [Epidemiology]

16. *Cognition Disorders/di [diagnosis]

17. ((endpoint* or outcome*) adj6 (dement* or alzheimer* or AD)).ab.

18. (conversion adj4 (dement* or alzheimer*)).ab.

19. (convert* adj4 (dement* or alzheimer* or AD)).ab.

20. (predict* adj6 (dement* or alzheimer* or AD)).ab.

21. (progress* adj4 ("to dement*" or "to alzheimer*" or "to AD")).ab.

22. or/12-21

23. "Predictive Value of Tests"/

24. (dement* or alzheimer* or AD or "cognit* impair*").ab.

25. 23 and 24

26. Neuropsychological Tests/

27. (dement* or alzheimer* or AD or "cognit* impair*").ab.

28. 26 and 27

29. (neuropsycholog* adj4 (dement* or alzheimer* or AD or "cognit* impair*")).ab.

30. or/22,25,28,29

31. 11 and 30

32. exp animals/ not humans.sh.

33. 31 not 32



Search narrative. This search strategy (above) has been kept deliberately very broad – utilizing only two main concepts: time (concept A) and the dementia aspect/outcome (concept B). APOE-ε4 will not be included in the search terms because this allows the search to be more sensitive to studies that do not contain the term APOE-ε4 in their abstracts yet provide information on APOE genotyping in the main text.

Potential studies for inclusion were initially identified from published non-Cochrane reviews and background literature. This generated a reference set of 35 potential (and probable) studies for inclusion to use to test the search strategy detailed above. All 35 studies are identified by the above strategy. We then further tested the strategy against a set of potential studies that had been identified through an index-test based search.

The yield from the above strategy will be very high. However, a significant proportion of references retrieved will not need to be screened as they will have already been screened from the monthly search detailed below.

Parallel to running non-index test based searches in the databases listed in the search methods section, the offline version of the Cochrane Dementia and Cognitive Improvement Group Specialized Register, ALOIS, will be searched for all longitudinal studies in which the presence of APOE-ε4 allele was measured. ALOIS is made up of both intervention and diagnostic test accuracy (DTA) studies. The DTA side of ALOIS is still under development and is not 100% comprehensive but the MEDLINE search used to identify diagnostic tests (in both longitudinal and normal cross-sectional studies) in any setting is run and screened every month.


The MEDLINE (OvidSP) strategy for ALOIS is:

1. "word recall".ti,ab.

2. "7-minute screen".ti,ab.

3. "6 item cognitive impairment test".ti,ab.

4. "6 CIT".ti,ab.

5. "AB cognitive screen".ti,ab.

6. "abbreviated mental test".ti,ab.

7. "ADAS-cog".ti,ab.

8. AD8.ti,ab.

9. "inform* interview".ti,ab.

10. "animal fluency test".ti,ab.

11. "brief alzheimer* screen".ti,ab.

12. "brief cognitive scale".ti,ab.

13. "clinical dementia rating scale".ti,ab.

14. "clinical dementia test".ti,ab.

15. "community screening interview for dementia".ti,ab.

16. "cognitive abilities screening instrument".ti,ab.

17. "cognitive assessment screening test".ti,ab.

18. "cognitive capacity screening examination".ti,ab.

19. "clock drawing test".ti,ab.

20. "deterioration cognitive observee".ti,ab.

21. "Dem Tect".ti,ab.

22. "fuld object memory evaluation".ti,ab.

23. "IQCODE".ti,ab.

24. "mattis dementia rating scale".ti,ab.

25. "memory impairment screen".ti,ab.

26. "minnesota cognitive acuity screen".ti,ab.

27. "mini-cog".ti,ab.

28. "mini-mental state exam*".ti,ab.

29. "mmse".ti,ab.

30. "modified mini-mental state exam".ti,ab.

31. "3MS".ti,ab.

32. "neurobehavioural cognitive status exam*".ti,ab.

33. "cognistat".ti,ab.

34. "quick cognitive screening test".ti,ab.

35. "QCST".ti,ab.

36. "rapid dementia screening test".ti,ab.

37. "RDST".ti,ab.

38. "repeatable battery for the assessment of neuropsychological status".ti,ab.

39. "RBANS".ti,ab.

40. "rowland universal dementia assessment scale".ti,ab.

41. "rudas".ti,ab.

42. "self-administered gerocognitive exam*".ti,ab.

43. ("self-administered" and "SAGE").ti,ab.

44. "self-administered computerized screening test for dementia".ti,ab.

45. "short and sweet screening instrument".ti,ab.

46. "sassi".ti,ab.

47. "short cognitive performance test".ti,ab.

48. "syndrome kurztest".ti,ab.

49. "six item screener".ti,ab.

50. "short memory questionnaire".ti,ab.

51. ("short memory questionnaire" and "SMQ").ti,ab.

52. "short orientation memory concentration test".ti,ab.

53. "s-omc".ti,ab.

54. "short blessed test".ti,ab.

55. "short portable mental status questionnaire".ti,ab.

56. "spmsq".ti,ab.

57. "short test of mental status".ti,ab.

58. "telephone interview of cognitive status modified".ti,ab.

59. "tics-m".ti,ab.

60. "trail making test".ti,ab.

61. "verbal fluency categories".ti,ab.

62. "WORLD test".ti,ab.

63. "general practitioner assessment of cognition".ti,ab.

64. "GPCOG".ti,ab.

65. "Hopkins verbal learning test".ti,ab.

66. "HVLT".ti,ab.

67. "time and change test".ti,ab.

68. "modified world test".ti,ab.

69. "symptoms of dementia screener".ti,ab.

70. "dementia questionnaire".ti,ab.

71. "7MS".ti,ab.

72. ("concord informant dementia scale" or CIDS).ti,ab.

73. (SAPH or "dementia screening and perceived harm*").ti,ab.

74. or/1-73

75. exp Dementia/

76. Delirium, Dementia, Amnestic, Cognitive Disorders/

77. dement*.ti,ab.

78. alzheimer*.ti,ab.

79. AD.ti,ab.

80. ("lewy bod*" or DLB or LBD).ti,ab.

81. "cognit* impair*".ti,ab.

82. (cognit* adj4 (disorder* or declin* or fail* or function*)).ti,ab.

83. (memory adj3 (complain* or declin* or function*)).ti,ab.

84. or/75-83

85. exp "sensitivity and specificity"/

86. "reproducibility of results"/

87. (predict* adj3 (dement* or AD or alzheimer*)).ti,ab.

88. (identif* adj3 (dement* or AD or alzheimer*)).ti,ab.

89. (discriminat* adj3 (dement* or AD or alzheimer*)).ti,ab.

90. (distinguish* adj3 (dement* or AD or alzheimer*)).ti,ab.

91. (differenti* adj3 (dement* or AD or alzheimer*)).ti,ab.

92. diagnos*.ti.

93. di.fs.

94. sensitivit*.ab.

95. specificit*.ab.

96. (ROC or "receiver operat*").ab.

97. Area under curve/

98. ("Area under curve" or AUC).ab.

99. (detect* adj3 (dement* or AD or alzheimer*)).ti,ab.

100. sROC.ab.

101. accura*.ti,ab.

102. (likelihood adj3 (ratio* or function*)).ab.

103. (conver* adj3 (dement* or AD or alzheimer*)).ti,ab.

104. ((true or false) adj3 (positive* or negative*)).ab.

105. ((positive* or negative* or false or true) adj3 rate*).ti,ab.

106. or/85-105

107. exp dementia/di

108. Cognition Disorders/di [Diagnosis]

109. Memory Disorders/di

110. or/107-109

111. *Neuropsychological Tests/

112. *Questionnaires/

113. Geriatric Assessment/mt

114. *Geriatric Assessment/

115. Neuropsychological Tests/mt, st

116. "neuropsychological test*".ti,ab.

117. (neuropsychological adj (assess* or evaluat* or test*)).ti,ab.

118. (neuropsychological adj (assess* or evaluat* or test* or exam* or battery)).ti,ab.

119. Self report/

120. self-assessment/ or diagnostic self evaluation/

121. Mass Screening/

122. early diagnosis/

123. or/111-122

124. 74 or 123

125. 110 and 124

126. 74 or 123

127. 84 and 106 and 126

128. 74 and 106

129. 125 or 127 or 128

130. exp animals/ not humans.sh.

131. 129 not 130

The concepts for this are:

A Specific neuropsychological tests

B General terms (both free text and MeSH) for tests/testing/screening

C Outcome: dementia diagnosis (unfocused MeSH with diagnostic sub-headings)

D Condition of interest: Dementia (general dementia terms both free text and MeSH – exploded and unfocused)

E Methodological filter: not used to limit the whole search

The concept combinations are:

1. (A OR B) AND C

2. (A OR B) AND D AND E

3. A AND E
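
As an illustration of how the numbered lines of the ALOIS strategy combine, each concept can be treated as the set of records it retrieves. The sketch below mirrors lines 124 to 131 of the strategy above using set operations. The function name `combine_concepts` and the record identifiers are hypothetical and serve only to show the logic.

```python
def combine_concepts(a, b, c, d, e, animal_only):
    """Mirror lines 124-131 of the strategy using Python sets of record IDs.

    a: specific neuropsychological tests (line 74)
    b: general terms for tests/testing/screening (line 123)
    c: dementia diagnosis MeSH with diagnostic subheadings (line 110)
    d: general dementia terms (line 84)
    e: methodological filter (line 106)
    animal_only: records indexed as animals and not humans (line 130)
    """
    tests = a | b                            # lines 124 and 126: 74 or 123
    combo_1 = c & tests                      # line 125: 110 and 124
    combo_2 = d & e & tests                  # line 127: 84 and 106 and 126
    combo_3 = a & e                          # line 128: 74 and 106
    combined = combo_1 | combo_2 | combo_3   # line 129: 125 or 127 or 128
    return combined - animal_only            # line 131: 129 not 130


# Invented record IDs, purely for illustration.
result = combine_concepts(
    a={1, 2, 3}, b={3, 4, 5}, c={2, 4, 9}, d={5, 6}, e={1, 5, 7},
    animal_only={4},
)
print(sorted(result))  # [1, 2, 5]
```

Record 9 matches concept C but neither A nor B, so it is not retrieved; record 4 is retrieved by the first combination and then removed by the animal-only exclusion.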

Appendix 3. Data extraction form




MCI concept:



Inclusion criteria:

Exclusion criteria:



Number of people: 

Age, in years (SD):    

Number of male (%)/female (%):

Education, in years (SD):

MMSE score (SD):   



Length of follow-up, in years (SD):

Number of people who completed the follow-up (%): 

Number of people who became demented (%): 

Number of people who progressed to AD dementia (%): 


Demographic information

 | AD dementia (n=    ) | No dementia (n=    ) | P-value | Remarks
Age at baseline, in years (SD) | | | |
Male (%)/female (%) | | | |
Education, in years (SD) | | | |
MMSE score at baseline (SD) | | | |


Genotyping data

APOE-ε4 allele and outcome at follow-up

 | At least 1 ε4 allele | ε4 allele non-carrier
AD dementia | n=        (          %) | n=        (          %)
No dementia | n=        (          %) | n=        (          %)

APOE genotype and outcome at follow-up

AD dementia | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %)
No dementia | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %)

Other types of dementia


 | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %)
No dementia | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %) | n=     (     %)
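
The counts recorded under "APOE-ε4 allele and outcome at follow-up" form a 2 x 2 table from which accuracy measures can be derived. The following is a minimal sketch only: the function name `accuracy_from_2x2` and the example counts are hypothetical and are not data from any study. It treats carriage of at least one ε4 allele as a positive index test and progression to AD dementia as the target condition, and it assumes that no row or column total is zero.

```python
def accuracy_from_2x2(tp, fp, fn, tn):
    """Return basic accuracy measures from the four cells of a 2 x 2 table.

    tp: ε4 carriers who progressed to AD dementia
    fp: ε4 carriers who did not develop dementia
    fn: ε4 non-carriers who progressed to AD dementia
    tn: ε4 non-carriers who did not develop dementia
    """
    return {
        "sensitivity": tp / (tp + fn),   # carriers among those who progressed
        "specificity": tn / (tn + fp),   # non-carriers among those who did not
        "ppv": tp / (tp + fp),           # progression among carriers
        "npv": tn / (tn + fn),           # no progression among non-carriers
    }


# Hypothetical counts, for illustration only.
measures = accuracy_from_2x2(tp=40, fp=30, fn=10, tn=70)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts, sensitivity is 40/50 and specificity is 70/100.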

Appendix 4. Assessment of methodological quality QUADAS-2 tool

Domain | Patient selection | Index test | Reference standard | Flow and timing
Description | Describe methods of patient selection: Describe included patients (prior testing, presentation, intended use of index test and setting): | Describe the index test and how it was conducted and interpreted | Describe the reference standard and how it was conducted and interpreted | Describe any patients who did not receive the index test(s) and/or reference standard or who were excluded from the 2x2 table (refer to flow diagram): Describe the time interval and any interventions between index test(s) and reference standard
Signalling questions | Was a consecutive or random sample of patients enrolled? Was a case-control design avoided? Did the study avoid inappropriate exclusions? | Were the index test results interpreted without knowledge of the results of the reference standard? | Is the reference standard likely to correctly classify the target condition? Were the reference standard results interpreted without knowledge of the results of the index test? | Was there an appropriate interval between index test(s) and reference standard? Did all patients receive a reference standard? Did all patients receive the same reference standard? Were all patients included in the analysis?
Risk of bias: High/low/unclear | Could the selection of patients have introduced bias? | Could the conduct or interpretation of the index test have introduced bias? | Could the reference standard, its conduct, or its interpretation have introduced bias? | Could the patient flow have introduced bias?
Concerns regarding applicability: High/low/unclear | Are there concerns that the included patients do not match the review question? | Are there concerns that the index test, its conduct, or interpretation differ from the review question? | Are there concerns that the target condition as defined by the reference standard does not match the review question? |

Appendix 5. Anchoring statements for quality assessment of APOE-ε4 allele diagnostic studies

Category | Review question | Inclusion criteria
Patients | Participants with mild cognitive impairment, no dementia | Participants fulfilling the criteria for the clinical diagnosis of MCI at baseline
Index Test | APOE genotype | APOE genotype
Target Condition | Alzheimer’s disease dementia (conversion from MCI to Alzheimer’s disease dementia); any other types of dementia (conversion from MCI to any other types of dementia) | Alzheimer’s disease dementia (conversion from MCI to Alzheimer’s disease dementia); any other types of dementia (conversion from MCI to any other types of dementia)

Reference Standard

AD dementia

Pathological: Braak, CERAD, NIA-RIA


Other dementia


AD dementia

Pathological: Braak, CERAD, NIA-RIA


AD dementia


Outcome | N/A | Data to construct 2 x 2 table
Study Design | N/A | Longitudinal cohort studies and nested case-control studies if they incorporate a delayed verification design and indicate the time point wherein participants have developed MCI (case-control nested in cohort studies)

Anchoring statements to assist with assessment for risk of bias

Patient selection

1. Was the sampling method appropriate?

Where sampling is used, the designs least likely to cause bias are consecutive sampling or random sampling. Sampling that is based on volunteers or selecting subjects from a clinic or research resource is prone to bias.

Weighting: High risk of bias (‘no’)

2. Was a case-control or similar design avoided?

Designs similar to case-control that may introduce bias are those where the study team deliberately increases or decreases the proportion of subjects with the target condition, so that the sample may not be representative. For example, a population study may be enriched with extra dementia subjects from a secondary care setting, who are typically more diseased. Some case-control methods may already be excluded if they mix subjects from various settings.

Weighting: High risk of bias (‘no’)

3. Are exclusion criteria described and appropriate?

The study will be automatically graded as unclear if exclusions are not detailed (pending contact with study authors). Where exclusions are detailed, the study will be graded as “low risk” if exclusions are felt to be appropriate by the review authors. Certain exclusions common to many studies of dementia are: medical instability; terminal disease; alcohol/substance misuse; concomitant psychiatric diagnosis; other neurodegenerative condition. Exclusions are not felt to be appropriate if ‘difficult to diagnose’ patients are excluded.

Post hoc and inappropriate exclusions will be labelled “high risk” of bias.

Weighting: High risk (‘no’)

Index test

1. Was APOE genotyping performed without knowledge of clinical dementia diagnosis?

Terms such as “blinded” or “independently and without knowledge of” are sufficient and full details of the blinding procedure are not required. Interpretation of the results of the index test may be influenced by knowledge of the results of reference standard. If the index test is always interpreted prior to the reference standard then the person interpreting the index test cannot be aware of the results of the reference standard and so this item could be rated as ‘yes’.

For certain index tests the result is objective, and knowledge of the reference standard should not influence the result (for example, the level of protein in cerebrospinal fluid). In this instance the quality assessment may be “low risk” even if blinding was not achieved.

Weighting: High risk (‘no’)

Reference standard

1. Is the assessment used for clinical diagnosis of dementia acceptable?

Commonly used international criteria to assist with clinical diagnosis of dementia include those detailed in DSM-IV and ICD-10. Criteria specific to dementia subtypes include but are not limited to NINCDS-ADRDA criteria for Alzheimer’s dementia; McKeith criteria for Lewy body dementia; Lund criteria for frontotemporal dementias; and the NINDS-AIREN criteria for vascular dementia. Where the criteria used for assessment are not familiar to the review authors or the Cochrane Dementia and Cognitive Improvement group (‘unclear’) this item should be classified as “high risk of bias”.

Weighting: High risk (‘no’)

2. Was clinical assessment for dementia performed without knowledge of the APOE genotype results?

Terms such as “blinded” or “independently and without knowledge of” are sufficient and full details of the blinding procedure are not required. Interpretation of the results of the reference standard may be influenced by knowledge of the results of index test.

Weighting: High risk (‘no’)

Patient flow

1. Was there an appropriate interval between APOE genotyping and clinical dementia assessment?

As we test the accuracy of the APOE-ε4 allele for MCI progression to dementia, there will always be a delay between the index test and the reference standard assessments. The time between reference standard and index test will influence the accuracy (Okello 2009; Visser 2006; Geslani 2005), and therefore we will note time as a separate variable (both within and between studies) and will test its influence on the diagnostic accuracy. We have set a minimum mean time to follow-up assessment of nine months. If more than 30% of subjects have assessment for MCI conversion before nine months, this item will score ‘no’.

Weighting: High risk of bias (‘no’)
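
The interval rule in item 1 can be written down as a simple check. This is a sketch under stated assumptions rather than a prescribed procedure: it assumes per-participant follow-up times in months are available (studies may report only summary values), it assumes that a mean follow-up below nine months also scores ‘no’, and the function name `interval_appropriate` is hypothetical.

```python
from statistics import mean

MIN_FOLLOW_UP_MONTHS = 9    # minimum mean time to follow-up stated in item 1
MAX_EARLY_FRACTION = 0.30   # more than 30% assessed before nine months scores 'no'


def interval_appropriate(follow_up_months):
    """Score patient-flow item 1 from per-participant follow-up times (months)."""
    early = sum(1 for m in follow_up_months if m < MIN_FOLLOW_UP_MONTHS)
    early_fraction = early / len(follow_up_months)
    # Assumption: a mean follow-up shorter than nine months scores 'no'.
    if mean(follow_up_months) < MIN_FOLLOW_UP_MONTHS:
        return "no"
    # Stated rule: more than 30% assessed before nine months scores 'no'.
    if early_fraction > MAX_EARLY_FRACTION:
        return "no"
    return "yes"


# Hypothetical follow-up times, for illustration only.
print(interval_appropriate([12, 24, 36, 6]))   # one of four assessed early
print(interval_appropriate([6, 6, 12, 40]))    # two of four assessed early
```

In the first example 25% of participants are assessed before nine months, so the item scores ‘yes’; in the second the proportion is 50%, so it scores ‘no’.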

2. Did all subjects get the same assessment for dementia regardless of APOE genotyping result?

There may be scenarios where subjects who score “test positive” on index test have a more detailed assessment. Where dementia assessment differs between subjects this should be classified as high risk of bias.

Weighting: High risk (‘no’)

3. Were all patients who underwent APOE genotyping included in the final analysis?

If the number of patients enrolled differs from the number of patients included in the 2 x 2 table then there is the potential for bias. If patients lost to follow-up differ systematically from those who remain, then estimates of test performance may differ.

If there are dropouts, these should be accounted for; the maximum proportion of dropouts for a study to remain at low risk of bias has been specified as 20%.

Weighting: High risk (‘no’)

4. Were missing or uninterpretable APOE genotyping results reported?

Where missing or uninterpretable results are reported, and if there is substantial attrition (we have set an arbitrary value of 50% missing data), this should be scored as ‘no’. If those results are not reported, this should be scored as ‘unclear’ and authors will be contacted.

Weighting: High risk (‘no’ and ‘unclear’)
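
The numerical thresholds in patient-flow items 3 and 4 can be expressed as two small checks. The sketch below is illustrative only. The function names are hypothetical, and it assumes that ‘substantial attrition’ means 50% or more missing data; the text gives the 50% value but not whether the boundary itself counts.

```python
def score_dropouts(n_enrolled, n_analysed, max_dropout=0.20):
    """Item 3: 'yes' (low risk) if dropouts are at most 20% of those enrolled."""
    dropout = (n_enrolled - n_analysed) / n_enrolled
    return "yes" if dropout <= max_dropout else "no"


def score_missing_results(n_genotyped, n_missing=None, max_missing=0.50):
    """Item 4: 'unclear' if not reported, 'no' if attrition is substantial."""
    if n_missing is None:
        # Missing or uninterpretable results were not reported.
        return "unclear"
    # Assumption: 50% or more missing data counts as substantial attrition.
    return "no" if n_missing / n_genotyped >= max_missing else "yes"


# Hypothetical counts, for illustration only.
print(score_dropouts(n_enrolled=100, n_analysed=85))            # 15% dropouts
print(score_missing_results(n_genotyped=100))                   # not reported
print(score_missing_results(n_genotyped=100, n_missing=60))     # 60% missing
```

Under the weighting for item 4, both ‘no’ and ‘unclear’ would be treated as high risk.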


Anchoring statements to assist with assessment for applicability

Patient selection

1. Were included patients representative of the general population of interest?

The included patients should match the intended population as described in the review question. The review authors should consider population in terms of symptoms; pre-testing; potential disease prevalence; setting.

We recognise that identifying all MCI patients in a given population may be particularly hard to achieve; therefore the information about the judgements for this criterion is particularly likely to be suboptimal. We expect that all included studies will be suboptimally reported to some degree. If there is a clear ground for suspecting an unrepresentative spectrum the item should be rated poor applicability.

Index test

1. Are there concerns that the index test differs from the review question?

The APOE genotype ascertained at baseline will not change at follow-up. If the accuracy of the index test was based on genotype results, this item should be rated high applicability.

2. Were sufficient data on APOE genotyping given for the test to be repeated in an independent study?

In genetic analysis, it is important to report the medium, genotyping platform and quality control procedures. A common method of genotyping APOE is by extracting genomic DNA from EDTA anti-coagulated blood and amplifying this via polymerase chain reaction. DNA may also be extracted from a buccal cell sample. Different genotyping platforms may be used. These platforms have very high accuracy and concordance (Ghebranious 2005). In addition, the background, training and expertise of the assessor should be reported and taken into consideration. If APOE genotyping was not performed consistently, this item should be rated poor applicability.

Reference standard

1. Was clinical diagnosis of dementia made in a manner similar to current clinical practice?

For many reviews, inclusion criteria and assessment for risk of bias will already have assessed the dementia diagnosis. For certain reviews an applicability statement relating to reference standard may not be applicable. There is the possibility that a form of dementia assessment, although valid, may diagnose a far larger proportion of subjects with disease than usual clinical practice. In this instance the item should be rated poor applicability.

Appendix 6. Standards for Reporting of Diagnostic Accuracy (STARD) checklist

Section and Topic | Item | Description
 | 1 | Identify the article as a study of diagnostic accuracy (recommend MeSH heading 'sensitivity and specificity').
INTRODUCTION | 2 | State the research questions or study aims, such as estimating diagnostic accuracy or comparing accuracy between tests or across participant groups.
Participants | 3 | The study population: Inclusion and exclusion criteria, setting and locations where data were collected.
 | 4 | Participant recruitment: Was recruitment based on presenting symptoms, results from previous tests, or the fact that the participants had received the index tests or the reference standard?
 | 5 | Participant sampling: Was the study population a consecutive series of participants defined by the selection criteria in item 3 and 4? If not, specify how participants were further selected.
 | 6 | Data collection: Was data collection planned before performing the index test and applying the reference standard (prospective study), or afterwards (retrospective study)?
Test methods | 7 | The reference standard and its rationale.
 | 8 | Technical specifications of material and methods involved including how and when measurements were taken, and/or cite references for index test and reference standard.
 | 9 | Definition of and rationale for the units, cut-offs when applicable and/or categories of the results of the index test and the reference standard.
 | 10 | The number, training and expertise of the persons executing and reading the index test and the reference standard.
 | 11 | Whether or not the readers of the index test and reference standard were blind (masked) to the results of the other test and whether any other clinical information were available to the readers.
Statistical methods | 12 | Methods for calculating or comparing measures of diagnostic accuracy, and the statistical methods used to quantify uncertainty (e.g. 95% confidence intervals).
 | 13 | Methods for calculating test reproducibility, if done.
Participants | 14 | When study was performed, including beginning and end dates of recruitment.
 | 15 | Clinical and demographic characteristics of the study population (at least information on age, gender, spectrum of presenting symptoms).
 | 16 | The number of participants satisfying the criteria for inclusion who did or did not undergo the index tests and/or the reference standard; describe why participants failed to undergo either test (a flow diagram is strongly recommended).
Test results | 17 | Time-interval between the index test and the reference standard, and any treatment administered in between.
 | 18 | Distribution of severity of disease (define criteria) in those with the target condition; other diagnoses in participants without the target condition.
 | 19 | A cross-tabulation of the results of the index test (including indeterminate and missing results) by the results of the reference standard; for continuous results, the distribution of the test results by the results of the reference standard.
 | 20 | Any adverse events from performing the index test or the reference standard.
Estimates | 21 | Estimates of diagnostic accuracy and measures of statistical uncertainty (e.g. 95% confidence intervals).
 | 22 | How indeterminate results, missing data and outliers of the index tests were handled.
 | 23 | Estimates of variability of diagnostic accuracy between subgroups of participants, readers or centers, if done.
 | 24 | Estimates of test reproducibility, if done.
DISCUSSION | 25 | Discuss the clinical applicability of the study findings.

Contributions of authors

LSES and PJV conceived the study. LSES, WV and OU formulated the statistical plan of analysis. The protocol has been written with LSES as principal author and with contributions from all co-authors.

Declarations of interest

None declared.

Sources of support

Internal sources

  • Maastricht University, Netherlands.

  • University of Exeter, UK.

External sources

  • Cochrane Dementia and Cognitive Improvement Group (CDCIG), UK.

    The CDCIG has committed to provide a team of experienced assessors who will perform the initial electronic screening of studies.