rCBF SPECT for detection of frontotemporal dementia in people with suspected dementia

  • Protocol
  • Diagnostic

Authors


Abstract

This is the protocol for a review and there is no abstract. The objectives are as follows:

Primary objectives

  • To determine the diagnostic accuracy of rCBF SPECT in diagnosing FTD in populations with suspected dementia in secondary/tertiary healthcare settings

  • To determine the accuracy of rCBF SPECT in the differential diagnosis of FTD from other dementia subtypes

  • To highlight the quality and quantity of research evidence available about the effectiveness of rCBF SPECT in the target population  

  • To identify gaps in the evidence and where further research is required 

Background

This protocol is based on 'Neuropsychological tests for the diagnosis of Alzheimer's disease dementia and other dementias: a generic protocol for cross-sectional and delayed-verification studies' (Davis 2013).

Target condition being diagnosed

Dementia is a progressive syndrome of global cognitive impairment. In the UK, it affects 5% of the population over 65 and 25% of those over 85 (Alzheimer's Society 2007). In 2010, there were estimated to be 36 million people living with dementia worldwide (Alzheimer's Disease International 2010), and this will increase to over 115 million by 2050. The greatest increases in prevalence will be seen in the developing regions.  By 2040 China and its western-Pacific neighbours are predicted to have 26 million people living with dementia (Ferri 2005). Dementia encompasses a group of neurodegenerative disorders that are characterised by progressive loss of cognitive function and ability to perform activities of daily living, that can be accompanied by neuropsychiatric symptoms and challenging behaviours of varying type and severity. The underlying pathology is usually degenerative and subtypes of dementia include Alzheimer's disease dementia (ADD), vascular dementia (VaD), dementia with Lewy bodies, and frontotemporal dementia (FTD). There may be considerable overlap in the clinical and pathological presentations (Neuropathology Group of MRC CFAS 2001), and there is often co-existence of ADD and VaD (Matthews 2009; Savva 2009). 

FTD is a neurodegenerative disease that affects the anterior temporal and frontal lobes. It is thought to account for up to 16% of all degenerative dementias (Ratnavalli 2002; Miller 1997) and is the second most common young-onset dementia (Seelaar 2011). It is clinically characterised by progressive behavioural change, executive dysfunction and language difficulties and is made up of three main clinical syndromes: behavioural variant FTD, semantic dementia and progressive non-fluent aphasia. Although the core of the diagnostic process in dementia rests firmly on clinical and cognitive assessments, a wide range of investigations are available to further aid diagnosis. These include blood and cerebrospinal fluid (CSF) tests, as well as neuroimaging modalities such as magnetic resonance imaging (MRI), regional cerebral blood flow single photon emission computerised tomography (rCBF SPECT) and positron emission tomography (PET). Structural imaging with MRI shows frontal and temporal atrophy which can be asymmetrical, and rCBF SPECT shows anterior frontal or temporal hypoperfusion. 

Management of patients with dementia demands accurate diagnosis of the underlying neurodegenerative process to guide prognosis, early therapeutic intervention and advice regarding heritability and social and environmental management. This is particularly important in FTD, which has a wide range of presentations often overlapping significantly with other dementias, especially ADD and VaD. Accurate diagnosis of FTD as the underlying aetiology of the dementing illness is important as treatment with cholinesterase inhibitors (used in ADD) may have adverse effects in patients with FTD. A number of studies have established that the currently accepted clinical criteria for FTD are relatively insensitive, particularly in the early stages of the disease process when putative disease-modifying treatments are likely to be most effective (Seelaar 2011). In the early phase of FTD, the sensitivity of clinical criteria decreases to 37% (Seelaar 2011). Standard neuropsychological tests cannot reliably differentiate FTD from ADD (Hutchinson 2007; Nedjam 2004). In autopsy-proven studies, the current clinical criteria correctly classify approximately 80-90% of behavioural variant FTD cases.

rCBF SPECT is an established clinical tool, which uses an intravenously injected radio-labelled tracer to map the blood perfusion in the brain, and is thought to be a particularly useful investigation in the diagnosis of FTD. In FTD the characteristic pattern is hypoperfusion of frontal and anterior temporal lobes (Miller 1997). This pattern has been used to differentiate FTD from ADD, where hypoperfusion is more commonly in the medial temporal, superior temporal, parietal, posterior cingulate cortex and precuneus. rCBF SPECT has been reported to differentiate ADD from FTD with a sensitivity of 71.% and a specificity of 78.2% (Miller 1997). rCBF SPECT has an advantage over structural neuroimaging techniques as functional changes may precede structural ones, and compared to PET it is more widely available.

It has been proposed that a diagnosis of FTD, particularly in the early stages, should be made not only on clinical criteria but using a combination of other diagnostic findings, including rCBF SPECT. However, more extensive testing comes at a financial cost and potential risk to patient safety and comfort. Thus it is important that additional diagnostic tests are of proven benefit. rCBF SPECT is a neuroimaging tool that has been found by some to improve diagnostic accuracy of FTD from other dementias (Miller 1997; Rascovsky 2011), although its use is controversial (Knopman 2001).

This review will assess the value of rCBF SPECT in diagnosing FTD in people with suspected dementia.

Index test(s)

rCBF SPECT

SPECT is a non-invasive means of looking at blood flow to different areas of the brain. Patients with different dementias are thought to have different patterns of abnormal blood flow. Broadly speaking, patients with FTD have reduced perfusion in the frontal and anterior temporal lobes, whereas patients with Alzheimer's disease tend to have reduced flow in the parietal lobe (Catafau 2001). Therefore, rCBF SPECT is usually used clinically when the presence or type of dementia is uncertain after clinical assessment, psychology testing and structural imaging.

We will be evaluating the use of rCBF SPECT in diagnosing FTD. rCBF SPECT uses radio-labelled tracers, most commonly technetium-99m hexamethylpropylene amine oxime (99mTc-HMPAO) or technetium-99m L,L-ethyl cysteinate dimer (99mTc-ECD), to demonstrate differences in rCBF. rCBF is thought to indirectly reflect neural activity in each brain region at rest. The radiotracer enters the brain at first pass, with its incorporation proportional to rCBF in the first few minutes after injection. Modifications in rCBF after injection do not change the initial distribution of the tracer because of intracellular trapping.

Participants assessed with rCBF SPECT will be classified as having either an 'FTD pattern' or a non-'FTD pattern' of blood flow based on visual interpretation and/or a variety of different analysis techniques. Study interpretation may be challenging because of differing analysis methods and differing pre- or post-processing of images. These factors are discussed in more detail below.

Clinical pathway

Presentation

In the UK, people with suspected dementia usually present first to their general practitioner, who may administer basic screening tests (blood tests and simple tests of cognitive function) and will potentially refer to a hospital memory clinic. At this stage, other physical or mental disorders, for example depression or hypothyroidism, which might be contributing to cognitive impairment, are typically excluded or treated. FTD may present with personality changes, disinhibited behaviour, mood disorder and even psychosis, and may therefore be missed initially, and the patient will often be referred for specialist assessment when the diagnosis of FTD is suspected.

Standard diagnostic practice

Standard assessment of dementia includes history, clinical examination (including neurological, mental state and cognitive examination) and an interview with a relative or other informant. A neuroradiological examination (CT or MRI scan of the brain) is also recommended in most recent guidelines (McKhann 2011; NICE 2006). Patients may also receive a full neuropsychological assessment, if appropriate, before a diagnosis of dementia is made. Diagnostic assessment pathways may vary in other countries and diagnoses may be made by a variety of specialists including neurologists and geriatricians. 

Dementia diagnosis is defined by a deficit in more than two cognitive domains of sufficient degree to impair functional activities. These symptoms are usually progressive over a period of at least several months and should not be attributable to any other brain disease. The ICD-10 diagnostic criteria for dementia are detailed in Appendix 2.

FTD subtype is usually diagnosed by clinical presentation. rCBF SPECT is sometimes used to help establish the diagnosis of FTD, but is usually only carried out in secondary or tertiary referral centres.

Role of index test(s)

How might the index test improve diagnoses, treatments and patient outcomes?

If FTD can be diagnosed at an early stage, this will help people with dementia, their families and potential carers to make timely plans for the future. In the early stages of the disease and particularly in young onset behavioural variant, FTD can be misdiagnosed as another sub-type of dementia (often ADD). Coupled with appropriate contingency planning, proper recognition of the disease may also help to avoid costly admissions to hospital or institutional care (NAO 2007).  In addition, the accurate early identification of FTD may improve opportunities for the use of newly evolving interventions designed to delay or prevent the progression to more debilitating stages of dementia.  

Alternative test(s)

We are not including alternative tests in this review because there are currently no standard practice tests available for diagnosis of FTD. We will compare rCBF SPECT results with reference standard results.

Rationale

The public health burden of dementia is of growing concern. With the changing age structure of populations in both high and low income countries, dementia prevalence is increasing (Ferri 2005). At the population level, there are major implications for service provision and planning, given that the condition leads to progressive functional dependence over several years. In the UK, it is estimated that annual expenditure on dementia care is £17 billion (Alzheimer's Society 2007), and the worldwide cost of dementia in 2010 was USD 604 billion (Alzheimer's Disease International 2010). Accurate early diagnosis of dementia and the subtype of FTD may help in planning appropriate care and reducing costs.

It is important that expensive and invasive diagnostic tests are of proven benefit over more established clinical and imaging assessments. The clinical use of SPECT in differentiating ADD, VaD and FTD has been recognised in certain diagnostic guidelines. NICE 2006 recommend that rCBF SPECT should be used to differentiate ADD, VaD and FTD if the diagnosis is in doubt.  Rascovsky 2011 felt that the additional use of SPECT could give greater certainty to a diagnosis of FTD. However other guidelines such as Knopman 2001 found that SPECT could not be recommended for routine use in either initial or differential diagnosis. Thus a systematic review of diagnostic test accuracy studies of rCBF SPECT in FTD is required.

Objectives

Primary objectives

  • To determine the diagnostic accuracy of rCBF SPECT in diagnosing FTD in populations with suspected dementia in secondary/tertiary healthcare settings

  • To determine the accuracy of rCBF SPECT in the differential diagnosis of FTD from other dementia subtypes

Secondary objectives

  • To highlight the quality and quantity of research evidence available about the effectiveness of rCBF SPECT in the target population  

  • To identify gaps in the evidence and where further research is required 

Methods

Criteria for considering studies for this review

Types of studies

There are two main study designs in which rCBF SPECT can be used in diagnosis of FTD: cross-sectional and longitudinal ("delayed verification of diagnosis", see DTA handbook chapter 6). We expect most of the study designs identified in this review to be classic case control studies in which the index test is administered to a sample of people with the diagnosis of FTD and to a sample of people without FTD (most likely ADD). We may also find some studies where a cohort of people with unspecified dementia (i.e. dementia of unknown subtype) are administered the index test and then followed up for confirmation of FTD diagnosis or not, either by clinical course or by neuropathological confirmation. The reason for our expectation is that SPECT is an expensive and invasive test which uses a radiotracer. Thus study participants are unlikely to be an unselected cohort of people with and without dementia. Study participants are also likely to have undergone other imaging investigations (e.g. CT or MRI) to help exclude other subtypes of dementia prior to study recruitment; this will be examined in the analysis and interpretation of the findings of our review.

Classic case-control studies are subject to considerable spectrum bias (Davis 2013). If most of the studies we identify are in this category, we will present the findings of case-control study designs as the current best evidence for diagnostic test accuracy of rCBF SPECT in FTD in a narrative review, with no meta-analysis, in order to avoid a biased estimate of accuracy. We will also highlight the limitations of the clinical implications of the findings.

If we identify any longitudinal cohort studies of participants with unspecified dementia that received rCBF SPECT at baseline we will present these findings separately and with a meta-analysis if we can pool the data.

Settings

Due to the expense and technological expertise required, we expect the studies to be limited to secondary and tertiary healthcare settings. Specialist memory clinics provide the most common source of patient recruitment to rCBF SPECT studies. These patients are likely to have had neuropsychological testing and imaging investigations prior to recruitment.

Participants

For case-control study designs, we will include all participants who have been recruited and clinically diagnosed with FTD or other dementia subtypes using the standard clinical diagnostic criteria (see reference standards).

For longitudinal study designs, we will include studies where all participants with suspected dementia are administered rCBF SPECT at baseline.

We will exclude studies of participants from selected populations (e.g. post-stroke patients or patients with Parkinson's disease), and studies of participants with a secondary cause for cognitive impairment, namely: current or past alcohol/drug abuse, central nervous system trauma (e.g. subdural haematoma), tumour or infection.

Index tests

The use of rCBF SPECT in the characterisation of FTD is dependent on a chain of actions all of which have the potential to affect the quality of the data used for clinical reporting. A radiotracer is injected into a patient, followed by image acquisition and reconstruction to produce a patient blood flow map. The blood flow map is interpreted by one or more clinicians with the aim of identifying patterns representative of FTD i.e. frontotemporal hypoperfusion. Further computerised analysis may then be carried out, typically the comparison of the patient blood flow map with a database of control scans.

Recent procedure guidelines for brain perfusion SPECT using 99mTc-labelled radiopharmaceuticals in the US and Europe make detailed recommendations, which are summarised below (Juni 2009; Kapucu 2009).

  • Patient preparation: place the patient in a quiet, dimly lit room, insert intravenous cannula 10-15 minutes prior to injection, no patient interaction within 5 minutes of injection.

  • Radiopharmaceutical preparation: use 99mTc-ethyl cysteinate dimer (99mTc-ECD) or stabilised 99mTc-hexamethylpropylene amine oxime (99mTc-HMPAO), use 99mTc eluted within 24 hours of test.

  • Data acquisition: detailed recommendations are made, notably concerning the use of multiple detector gamma cameras, collimation and acquired counts.

  • Image processing: general recommendations are made regarding reconstruction, corrections, reformatting of slice data and semi-quantitative evaluation.

  • Interpretation criteria: relevant structural information from CT and MRI must be considered to help interpret the SPECT scans. It is possible to standardise SPECT images and analyse by focussing on a particular region of interest and/or compare to a normal database.

These guidelines provide a framework for the assessment of most of the technical aspects of published studies of the use of rCBF SPECT in the characterisation of FTD with two important provisos. Firstly, published reports generally do not specify how they carried out the study in sufficient detail to allow a complete and impartial judgement of quality to be made. Secondly, both sets of guidelines tentatively recommend that patient images are compared with control databases in order to aid interpretation but list 'database issues' under 'issues requiring further clarification' and relatively limited specific recommendations are made regarding the demographic and technical aspects of control database comparisons. These recommendations concern the need for age matching and that the same type of camera and processing methods are used for both controls and patients. The relative advantages and disadvantages of different analysis methods are not discussed. No recommendations are made regarding how thresholds should be set. It is probably the case that more modern analysis methods utilising multivariate statistics, computationally intensive registration methods and using large multi-centre control databases are likely to be more accurate than older methods. As both sets of guidelines are recent and make no specific recommendations we should be cautious in making our own judgements in these areas.

However the following broad criteria can be used in addition to those from the guidelines in the careful assessment of study quality:

  • Visual rating

    • Was rating carried out by multiple experts blinded to the clinical and/or pathological status of the patient?

    • Did the raters use well-defined criteria for assessing scans? Are these criteria explained in sufficient detail to reproduce independently?

  • Semiquantitative evaluation

    • If quantitative maps were visually assessed then the two criteria above are applicable.

    • Ideally, scans should be assessed with and without quantitative analysis.

    • Ideally there should be an explanation of the methods used to derive any thresholds used in the computation of quantitative results and the effects of threshold setting on sensitivity and specificity.

    • If normal database comparisons are used, details should be given of normal subject screening procedures and to what extent demographic matching (e.g. age, sex, education) was achieved.
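The normal-database comparison referred to in these criteria can be illustrated with a minimal sketch. It assumes that each scan has already been reduced to one normalised uptake value per brain region; the region names, the uptake values and the cut-off of -2 standard deviations are hypothetical and chosen only for illustration. The guidelines discussed above make no recommendation on how thresholds should be set, so the cut-off here is an assumption, not a recommended value.

```python
from statistics import mean, stdev


def regional_z_scores(patient, controls):
    """Return a z-score per region for one patient against a control database.

    patient:  dict mapping region name -> normalised uptake value
    controls: dict mapping region name -> list of control uptake values
    """
    z_scores = {}
    for region, value in patient.items():
        reference = controls[region]
        # Negative z-scores indicate lower uptake than the control mean.
        z_scores[region] = (value - mean(reference)) / stdev(reference)
    return z_scores


def hypoperfused_regions(z_scores, threshold=-2.0):
    """List regions whose z-score is at or below the (assumed) cut-off."""
    return sorted(r for r, z in z_scores.items() if z <= threshold)


# Hypothetical values for illustration only (not data from any study).
controls = {
    "frontal": [1.00, 0.95, 1.05, 0.98, 1.02],
    "parietal": [1.00, 0.96, 1.04, 0.99, 1.01],
}
patient = {"frontal": 0.85, "parietal": 0.99}

z = regional_z_scores(patient, controls)
flagged = hypoperfused_regions(z)
```

In this toy example only the frontal region falls below the cut-off. Moving the cut-off changes which scans are called abnormal, which is why the criteria above ask studies to report how thresholds were derived and how they affect sensitivity and specificity.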

Target conditions

The target condition is FTD.

Reference standards

Ante-mortem clinical diagnosis of FTD will be based on any of the following recognised diagnostic criteria.

  1. Manchester-Lund (The Lund and Manchester Groups 1994).

  2. Neary criteria (Neary 1998; Appendix 2).

  3. NINDS criteria for FTD (McKhann 2001).

  4. Histopathological diagnosis (Mackenzie 2010; Mackenzie 2011).

  5. Presence of a genetic mutation known to be associated with FTD (including MAPT (microtubule-associated protein tau), GRN (progranulin), TARDBP (transactive response DNA binding protein), VCP (valosin-containing protein), C9orf72 (chromosome 9 open reading frame 72) and CHMP2B (charged multivesicular body protein 2B) genes) (Mahoney 2012).

The results will be presented stratified by type of reference standard.

Controls in case-control studies will be diagnosed using a standardised definition of subtype of dementia (usually ADD), including NINCDS-ADRDA (National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association-Criteria) (McKhann 1984); The Consortium to Establish a Registry for Alzheimer's Disease (CERAD) (Mirra 1991); National Institute of Neurological Disorders and Stroke - Association Internationale pour la Recherche et l'Enseignement en Neurosciences (NINDS-AIREN) (Román 1993), Alzheimer's Disease Diagnostic and Treatment Centers (ADDTC) (Chui 1992), and Cambridge Mental Disorders of the Elderly Examination (CAMDEX) criteria (Hendrie 1988).

All-cause (unspecified) dementia in longitudinal cohort studies will be diagnosed using the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) and ICD-10 criteria (American Psychiatric Association 2013; WHO 2010).

We recognise that different iterations of clinical criteria over time may not be directly comparable (e.g. DSM-IIIR vs. DSM-IV or ICD9 vs. ICD10), and the validity of diagnoses may vary with the degree or manner in which the criteria have been operationalised (e.g. individual clinician vs. algorithm vs. consensus determination). Data on the method and application of the reference standard will be collected and, if considered to be a source of bias, will be examined as a source of heterogeneity.

Search methods for identification of studies

The search strategy detailed in Appendix 3 utilises only two search concepts, keeping the search sensitive. The concepts are (a) the index test and (b) the condition of interest, in general and specific terms.

Electronic searches

We will search MEDLINE (OvidSP), EMBASE (OvidSP), BIOSIS (Ovid), Science Citation Index (ISI Web of Knowledge), PsycINFO (Ovid), CINAHL (EBSCO) and LILACS (Bireme).  Science Citation Index includes conference abstracts in its database. See Appendix 3 for a proposed draft strategy to be run in MEDLINE (OvidSP).  Similarly structured search strategies will be designed using search terms appropriate for each database.  Controlled vocabulary such as MeSH terms and EMTREE will be used where appropriate. 

There will be no attempt to restrict studies based on setting in the search strategy. 

Search filters (collections of terms aimed at reducing the number needed to screen) will not be used as those published have not proved sensitive enough (Whiting 2011). No language restriction will be applied to the electronic searches; translation services will be used as necessary.

We will also request a search of the Cochrane Register of Diagnostic Test Accuracy Studies (hosted and maintained by the Cochrane Renal Group), and the specialised register of the Cochrane Dementia and Cognitive Improvement Group, ALOIS (http://alois.cochrane.org/), which includes both intervention and diagnostic test accuracy studies in dementia. 

Initial searches will be performed by a single researcher with extensive experience of systematic review.

Searching other resources

We will check the reference lists of all relevant papers for additional studies.

We will also search:

  • MEDION (Meta-analyses van Diagnostisch Onderzoek);

  • DARE (Database of Abstracts of Reviews of Effects);

  • HTA (Health Technology Assessments database, the Cochrane Library);

  • ARIF (Aggressive Research Intelligence Facility).

We will use relevant studies in PubMed to search for additional studies using the related articles feature. We will examine key studies in citation databases such as Science Citation Index and Scopus to ascertain any further relevant studies. We will also identify grey literature through Science Citation Index and EMBASE, which now include conference proceedings.  We will aim to access theses or dissertations from institutions known to be involved in prospective dementia studies.  We will also attempt to contact researchers involved in studies with possibly relevant, but unpublished data.  We will not perform hand-searching as there is little published evidence of the benefits of handsearching for reports of Diagnostic Test Accuracy studies (Glanville 2010). 

Data collection and analysis

Selection of studies

Studies will initially be selected from title and abstract screening undertaken by the study authors (ANS, EC, NS, SC, HA).  Subsequently, we will locate the full-text for each potentially eligible study identified by the search. These papers will then be independently evaluated for inclusion or exclusion by at least two of the study authors (CJ, EC, HA), after assessment of the sampling frame for each study. Disagreements will be resolved by discussion with a third author (SC, NS). The study selection process will be detailed in a Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram (Moher 2009).

Data extraction and management

We will extract data on study characteristics to a study-specific pro forma and will include data for the assessment of quality and data for investigation of heterogeneity, as described in Appendix 4. The pro forma will be piloted against two primary diagnostic studies.

Data will be extracted by two review authors (two of CJ, EC, HA, NS). We will dichotomise the results if necessary and cross-tabulate them in 2x2 tables of index test result (positive or negative) against target disorder (positive or negative), and will also extract and enter them directly into the Cochrane Collaboration's statistical software, Review Manager 2013. All participants should be classified by the index test as having or not having an 'FTD pattern' and, according to the reference standards used, as 'FTD present' (disease positive) or 'FTD absent' (disease negative). We will create the 2x2 table and calculate the sensitivity and specificity of the index test (Table 1).

Table 1. 2x2 table of index test and disease status for each reference standard

Reference standards (Manchester-Lund; NINDS; histopathological criteria)

Index test                             | FTD present (disease positive) | FTD absent (disease negative)
'FTD pattern' present (test positive)  | TP                             | FP
'FTD pattern' absent (test negative)   | FN                             | TN

TP: true positive; FP: false positive; FN: false negative; TN: true negative.
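
As an illustration of how sensitivity and specificity follow from the four cells of Table 1, the sketch below computes both with approximate 95% confidence intervals. The Wilson score interval is used here as one common choice; the protocol does not specify an interval method, so this is an assumption, as are the function names and the example counts.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity, with intervals, from a 2x2 table."""
    return {
        # Sensitivity: proportion of FTD-present participants with a positive test.
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        # Specificity: proportion of FTD-absent participants with a negative test.
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for illustration only (not data from any study).
result = accuracy_from_2x2(tp=30, fp=10, fn=10, tn=50)
```

With these made-up counts, sensitivity is 30/40 and specificity is 50/60. The intervals are wide because the number of participants in each column is small, which is typical of single-centre imaging studies.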

Assessment of methodological quality

We will assess the methodological quality of each study using the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies 2) tool (Whiting 2011). The tool is made up of four domains: patient selection; index test; reference standard; patient flow (Appendix 5). Each domain is assessed in terms of risk of bias, with the first three domains also considered in terms of applicability. Operational definitions describing the use of QUADAS-2 are detailed in Appendix 6.

QUADAS-2 data will not be used to form a summary quality score. We will produce a narrative summary describing numbers of studies that were found to have high/low/unclear risk of bias as well as concerns regarding applicability.

Statistical analysis and data synthesis

Most study participants are likely to have undergone other imaging investigations (e.g. CT or MRI) prior to study recruitment; we will classify studies according to prior radiological investigations received (if the information is available) and perform a subgroup analysis by category to examine the effect of prior testing.

Given the current state of knowledge about FTD and the broad scope of this review the main priority will be a clear descriptive analysis of the included studies. We do not anticipate that meta-analysis will be a major feature, but where it is possible the following paragraphs as described in Davis 2013 indicate the approach that will be taken.

  • For all included studies (likely to be only delayed verification studies) the data in the two-by-two tables (showing the binary test results cross-classified with the binary reference standard) will be used to calculate the sensitivities and specificities, with their 95% confidence intervals. We will present individual study results graphically by plotting estimates of sensitivities and specificities in both a forest plot and in receiver operating characteristic (ROC) space. These findings will be considered in the light of the previous systematic assessment (using QUADAS-2) of the methodological quality of individual studies. We will use Review Manager 2013 for these descriptive analyses, and to produce summary ROC curves.  If more than one threshold is reported in an individual study, then we will present the graphical findings for all thresholds reported.  However, we will avoid the study data being included in the calculation of a summary statistic on more than one occasion, by using only the threshold which is considered to be "standard practice" for the target population in question. If there is no agreed standard practice for the index test and target population in question then the optimal threshold will be used (the threshold nearest to the upper left corner of the ROC curve) in the calculation of the summary ROC curve in Review Manager 2013 and for any subsequent meta-analysis; we recognise that this may lead to an overestimate of diagnostic accuracy (Leeflang 2008). 

  • We will perform meta-analysis on pairs of sensitivity and specificity if it is appropriate to pool the data. Once the relevant studies have been identified, it will be clear if the majority of the studies report results with consistent thresholds. If so, a bivariate random-effects approach based on pairs of sensitivity and specificity may be appropriate (Reitsma 2005). This approach enables us to calculate summary estimates of sensitivity and specificity, while correctly dealing with the different sources of variation: (i) imprecision by which sensitivity and specificity have been measured within each study; (ii) variation beyond chance in sensitivity and specificity between studies; (iii) any correlation that might exist between sensitivity and specificity. Categorised covariates can be incorporated in the bivariate model to examine the effect of potential sources of bias and variation across subgroups of studies as outlined in the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy (Macaskill 2010). Because of the bivariate nature of the model, effects on sensitivity and specificity can be modelled separately. The results of the bivariate model can be processed to calculate likelihood ratios. Calculating (negative) predictive values requires an estimate of the prevalence in addition to values of sensitivity and specificity. If summary likelihood ratios can be derived, we will calculate predictive values based on population-based estimates of age-specific prevalence to estimate pre-test probability.

  • If different thresholds are reported, we will use hierarchical summary ROC models (Macaskill 2010).

  • We will use the Stata statistical software package, version 11.0 or above (www.stata.com, StataCorp, Texas), to carry out the additional analyses using either the bivariate or hierarchical summary receiver operating characteristic (HSROC) approaches.
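
Two of the calculations described in this list can be sketched in a few lines: choosing the threshold nearest the upper left corner of ROC space, and converting sensitivity, specificity and a pre-test probability into likelihood ratios and predictive values. This is only an illustrative sketch, not the planned analysis (which uses Review Manager and Stata); the function names, threshold labels and all numerical values are hypothetical.

```python
import math


def nearest_upper_left(points):
    """Pick the (label, sensitivity, specificity) tuple closest to the
    upper left corner of ROC space (false positive rate 0, sensitivity 1)."""
    return min(points, key=lambda p: math.hypot(1 - p[2], 1 - p[1]))


def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios (requires 0 < specificity < 1)."""
    if not 0 < specificity < 1:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values for a given prevalence."""
    lr_positive, lr_negative = likelihood_ratios(sensitivity, specificity)
    ppv = post_test_probability(prevalence, lr_positive)
    npv = 1 - post_test_probability(prevalence, lr_negative)
    return ppv, npv


# Hypothetical thresholds reported by a single study (illustration only).
thresholds = [("A", 0.90, 0.60), ("B", 0.80, 0.85), ("C", 0.60, 0.95)]
chosen = nearest_upper_left(thresholds)

# Hypothetical summary estimates and prevalence (illustration only).
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.90, prevalence=0.10)
```

The same sensitivity and specificity give very different predictive values as prevalence changes, which is why the protocol ties predictive values to population-based, age-specific prevalence estimates rather than to the proportion of cases in each study.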

Investigations of heterogeneity

The potential sources of heterogeneity include the following factors.

  • Index test:

    • different image analysis techniques and thresholds;

    • technical features of scanning e.g. camera resolution, scatter correction, total counts acquired;

    • operator characteristics e.g. training.

  • Target disorder:

    • reference standard used;

    • operationalisation of these classifications (e.g. individual clinician/algorithm/consensus group);

    • stage and severity of dementia.

  • Target population:

    • sociodemographic characteristics (age, sex, education);

    • clinical settings;

    • other characteristics e.g. presence of or family history of motor neurone disease.

  • Study quality:

    • study design (Types of studies);

    • blinding;

    • time from index test to reference standard;

    • duration of follow-up (measured in years for delayed-verification studies);

    • loss due to drop-outs (or attrition at follow-up for delayed verification studies).

Heterogeneity will be investigated in the first instance through visual examination of forest plots of sensitivities and specificities and through visual examination of the ROC plot of the raw data. The main sources of heterogeneity are expected to be index test thresholds, the reference standards used for the target disorder (FTD), patient characteristics, in particular age (any studies in which 30% of patients are below the age of 65 will be examined separately), and aspects of study quality (particularly inadequate blinding). We will initially investigate their effect by conducting subgroup analyses in Review Manager 2013, and by including each of these as covariates in the regression analyses. If we identify further likely sources of heterogeneity, we will investigate these by subgroup analyses and, if data allow, also include them as covariates in the regression model.
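The per-study quantities shown in a forest plot of sensitivities and specificities can be sketched as follows. This is an illustrative Python example only; the function names are hypothetical, and the use of the Wilson score interval for the 95% confidence limits is an assumption of the sketch, not a statement about how Review Manager computes them.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score confidence interval for a proportion (z = 1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    proportion = successes / total
    denominator = 1.0 + z * z / total
    centre = (proportion + z * z / (2.0 * total)) / denominator
    half_width = (
        z
        * math.sqrt(proportion * (1.0 - proportion) / total
                    + z * z / (4.0 * total * total))
        / denominator
    )
    return centre - half_width, centre + half_width


def study_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, with intervals, from one 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }
```

Applying `study_accuracy` to each included study's 2x2 table gives one row of the forest plot; widely scattered point estimates with non-overlapping intervals would be a visual indication of heterogeneity.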

Sensitivity analyses

Where appropriate (i.e. if not already explored in our analyses of heterogeneity) and if sufficient data are available, we will explore the sensitivity of any summary accuracy estimates to aspects of study quality such as nature of blinding and loss to follow-up, guided by the anchoring statements developed in our QUADAS-2 exercise. Primary analysis will include all studies (unless incorporation bias is evident); sensitivity analysis will exclude studies of low quality (high likelihood of bias) to determine if the results are influenced by inclusion of the lower quality studies.
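The logic of this comparison (all studies versus studies not judged at high risk of bias) can be sketched in Python. The `Study` class, its fields and the crude pooling by summed counts are all assumptions made for illustration; the summed-count summary is only a simple stand-in for the bivariate or HSROC model that would be used in the actual analysis.

```python
from dataclasses import dataclass


@dataclass
class Study:
    name: str
    tp: int
    fp: int
    fn: int
    tn: int
    high_risk_of_bias: bool  # hypothetical flag derived from QUADAS-2 judgements


def crude_summary(studies):
    """Crude (sensitivity, specificity) from summed 2x2 counts.

    Stand-in only: this ignores between-study variation and the correlation
    between sensitivity and specificity that a bivariate model accounts for.
    """
    if not studies:
        raise ValueError("no studies to summarise")
    tp = sum(s.tp for s in studies)
    fp = sum(s.fp for s in studies)
    fn = sum(s.fn for s in studies)
    tn = sum(s.tn for s in studies)
    return tp / (tp + fn), tn / (tn + fp)


def sensitivity_analysis(studies):
    """Return the summary for all studies and for lower-risk studies only."""
    primary = crude_summary(studies)
    restricted = crude_summary([s for s in studies if not s.high_risk_of_bias])
    return primary, restricted
```

A marked difference between the two summaries would suggest that the primary result is influenced by the inclusion of lower-quality studies.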

Assessment of reporting bias

We will not investigate reporting bias because of current uncertainty about how it operates in test accuracy studies and about the interpretation of existing analytical tools such as funnel plots.

Appendices

Appendix 1. Overview of diagnostic test accuracy reviews in dementia

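Index test: BIOMARKER/IMAGING (columns Community to All settings give the setting)

| Index test | Study type | Community | Primary | Secondary | All settings | No. of reviews |
| --- | --- | --- | --- | --- | --- | --- |
| Abeta | Delayed verification | | | | x | 1 |
| Abeta | Cross-sectional | | | | x | 1 |
| Tau/abeta ratio | Delayed verification | | | | x | 1 |
| Tau/abeta ratio | Cross-sectional | | | | | |
| PET-PiB | Delayed verification | | | | x | 1 |
| PET-PiB | Cross-sectional | | | | | |
| PET-FDG | Delayed verification | | | | x | 1 |
| PET-FDG | Cross-sectional | | | | | |
| Structural MRI | Delayed verification | | | | x | 1 |
| Structural MRI | Cross-sectional | | | | | |
| DAT scan | Delayed verification | | | | | |
| DAT scan | Cross-sectional | | | | x | 1 |
| SPECT scan | Delayed verification | | | | | |
| SPECT scan | Cross-sectional | | | | x | 1 |
| APOE 4 | Delayed verification | x | x | x | | 3 |
| APOE 4 | Cross-sectional | | | | | |

Index test: SCALES (columns Community to All settings give the setting)

| Index test | Study type | Community | Primary | Secondary | All settings | No. of reviews |
| --- | --- | --- | --- | --- | --- | --- |
| IQCODE | Delayed verification | | | | 1 | 1 |
| IQCODE | Cross-sectional | x | x | x | | 3 |
| AD8 | Delayed verification | x | x | x | | 3 |
| AD8 | Cross-sectional | x | x | x | | 3 |
| MMSE | Delayed verification | | | | x | 1 |
| MMSE | Cross-sectional | x | x | x | | 3 |
| MiniCOG | Delayed verification | | | | | |
| MiniCOG | Cross-sectional | x | x | x | | 3 |
| MOCA | Delayed verification | | | | | |
| MOCA | Cross-sectional | | | | x | 1 |
| ACE-R | Delayed verification | | | | | |
| ACE-R | Cross-sectional | | | | | |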
Abbreviations:

Appendix 2. Classification of dementia

WHO ICD-10

Dementia

  • G1. Evidence of each of the following:

    • (1) A decline in memory, which is most evident in the learning of new information, although in more severe cases, the recall of previously learned information may be also affected. The impairment applies to both verbal and non-verbal material. The decline should be objectively verified by obtaining a reliable history from an informant, supplemented, if possible, by neuropsychological tests or quantified cognitive assessments. The severity of the decline, with mild impairment as the threshold for diagnosis, should be assessed as follows:

      • Mild: a degree of memory loss sufficient to interfere with everyday activities, though not so severe as to be incompatible with independent living. The main function affected is the learning of new material. For example, the individual has difficulty in registering, storing and recalling elements in daily living, such as where belongings have been put, social arrangements, or information recently imparted by family members.

      • Moderate: A degree of memory loss which represents a serious handicap to independent living. Only highly learned or very familiar material is retained. New information is retained only occasionally and very briefly. The individual is unable to recall basic information about where he lives, what he has recently been doing, or the names of familiar persons.

      • Severe: a degree of memory loss characterized by the complete inability to retain new information. Only fragments of previously learned information remain. The subject fails to recognize even close relatives.

    • (2) A decline in other cognitive abilities characterized by deterioration in judgement and thinking, such as planning and organizing, and in the general processing of information. Evidence for this should be obtained when possible from interviewing an informant, supplemented, if possible, by neuropsychological tests or quantified objective assessments. Deterioration from a previously higher level of performance should be established. The severity of the decline, with mild impairment as the threshold for diagnosis, should be assessed as follows:

      • Mild. The decline in cognitive abilities causes impaired performance in daily living, but not to a degree making the individual dependent on others. More complicated daily tasks or recreational activities cannot be undertaken.

      • Moderate. The decline in cognitive abilities makes the individual unable to function without the assistance of another in daily living, including shopping and handling money. Within the home, only simple chores are preserved. Activities are increasingly restricted and poorly sustained.

      • Severe. The decline is characterized by an absence, or virtual absence, of intelligible ideation. The overall severity of the dementia is best expressed as the level of decline in memory or other cognitive abilities, whichever is the more severe (e.g. mild decline in memory and moderate decline in cognitive abilities indicate a dementia of moderate severity).

  • G2. Preserved awareness of the environment during a period of time long enough to enable the unequivocal demonstration of G1. When there are superimposed episodes of delirium the diagnosis of dementia should be deferred.

  • G3. A decline in emotional control or motivation, or a change in social behaviour, manifest as at least one of the following:

    • (1) emotional lability;

    • (2) irritability;

    • (3) apathy;

    • (4) coarsening of social behaviour.

  • G4. For a confident clinical diagnosis, G1 should have been present for at least six months; if the period since the manifest onset is shorter, the diagnosis can only be tentative.

Comments: The diagnosis is further supported by evidence of damage to other higher cortical functions, such as aphasia, agnosia, apraxia.

Judgment about independent living or the development of dependence (upon others) need to take account of the cultural expectation and context.

Dementia is specified here as having a minimum duration of six months to avoid confusion with reversible states with identical behavioural syndromes, such as traumatic subdural haemorrhage (S06.5), normal pressure hydrocephalus (G91.2) and diffuse or focal brain injury (S06.2 and S06.3).

Neary criteria for behavioural variant of frontotemporal dementia

  • I. Core diagnostic features

    • A. Insidious onset and gradual progression

    • B. Early decline in social interpersonal conduct

    • C. Early impairment in regulation of personal conduct

    • D. Early emotional blunting

    • E. Early loss of insight

  • II. Supportive diagnostic features

    • A. Behavioral disorder

      • 1. Decline in personal hygiene and grooming

      • 2. Mental rigidity and inflexibility

      • 3. Distractibility and impersistence

      • 4. Hyperorality and dietary changes

      • 5. Perseverative and stereotyped behavior

      • 6. Utilization behavior

    • B. Speech and language

      • 1. Altered speech output

        • a. Aspontaneity and economy of speech

        • b. Press of speech

      • 2. Stereotypy of speech

      • 3. Echolalia

      • 4. Perseveration

      • 5. Mutism

    • C. Physical signs

      • 1. Primitive reflexes

      • 2. Incontinence

      • 3. Akinesia, rigidity, and tremor

      • 4. Low and labile blood pressure

    • D. Investigations

      • 1. Neuropsychology: significant impairment on frontal lobe tests in the absence of severe amnesia, aphasia, or perceptuospatial disorder

      • 2. Electroencephalography: normal on conventional EEG despite clinically evident dementia

      • 3. Brain imaging (structural and/or functional): predominant frontal and/or anterior temporal abnormality

Appendix 3. Search strategy for use with MEDLINE electronic database

1. Tomography, Emission-Computed, Single-Photon/ or Tomography, Emission-Computed/
2. SPECT.ti,ab.
3. SPET.ti,ab.
4. single photon emission tomography.ti,ab.
5. single photon emission computed tomography.ti,ab.
6. "SPECT/CT".ti,ab.
7. or/1-6
8. exp Dementia/
9. Delirium/
10. Delirium, Dementia, Amnestic, Cognitive Disorders/
11. dement*.ti,ab.
12. alzheimer*.ti,ab.
13. (lewy* adj2 bod*).ti,ab.
14. (chronic adj2 cerebrovascular).ti,ab.
15. ("organic brain disease" or "organic brain syndrome").ti,ab.
16. "benign senescent forgetfulness".ti,ab.
17. (cerebr* adj2 deteriorat*).ti,ab.
18. (cerebral* adj2 insufficient*).ti,ab.
19. (pick* adj2 disease).ti,ab.
20. "Frontotemporal lobar degeneration".ti,ab.
21. "progressive non-fluent aphasia".ti,ab.
22. "primary progressive aphasia".ti,ab.
23. (FTD or FTLD).ti,ab.
24. Frontotemporal Lobar Degeneration/
25. Primary Progressive Nonfluent Aphasia/
26. Aphasia, Primary Progressive/
27. or/8-26
28. 7 and 27
29. (animals not (humans and animals)).sh.
30. 28 not 29

Appendix 4. Information for extraction to proforma

Bibliographic details of primary paper:

  • Author, title of study, year and journal

 Details of index test:

  • Method of [index test] administration, including who administered and interpreted the test, and their training

  • Thresholds used to define positive and negative tests

 Reference Standard:

  • Reference standard used

  • Method of [reference standard] administration, including who administered the test and their training

 Study population:

  • Number of subjects

  • Age

  • Gender

  • Other characteristics e.g. ApoE status

  • Settings: i) community; ii) primary care; iii) secondary care outpatients; iv) secondary care inpatients and residential care

  • Participant recruitment

  • Sampling procedures

  • Time between index test and reference standard

  • Proportion of people with dementia in sample

  • Subtype and stage of dementia if available

  • MCI definition used (if applicable)

  • Duration of follow-up in delayed verification studies

  • Attrition and missing data

 

Appendix 5. Assessment of methodological quality QUADAS-2 tool

Domains: patient selection; index test; reference standard; flow and timing.

Description

  • Patient selection: Describe methods of patient selection. Describe included patients (prior testing, presentation, intended use of index test and setting).

  • Index test: Describe the index test and how it was conducted and interpreted.

  • Reference standard: Describe the reference standard and how it was conducted and interpreted.

  • Flow and timing: Describe any patients who did not receive the index test(s) and/or reference standard or who were excluded from the 2x2 table (refer to flow diagram). Describe the time interval and any interventions between index test(s) and reference standard.

Signalling questions (yes/no/unclear)

  • Patient selection:

    • Was a consecutive or random sample of patients enrolled?

    • Was a case-control design avoided?

    • Did the study avoid inappropriate exclusions?

  • Index test:

    • Were the index test results interpreted without knowledge of the results of the reference standard?

    • If a threshold was used, was it pre-specified?

  • Reference standard:

    • Is the reference standard likely to correctly classify the target condition?

    • Were the reference standard results interpreted without knowledge of the results of the index test?

  • Flow and timing:

    • Was there an appropriate interval between index test(s) and reference standard?

    • Did all patients receive a reference standard?

    • Did all patients receive the same reference standard?

    • Were all patients included in the analysis?

Risk of bias (high/low/unclear)

  • Patient selection: Could the selection of patients have introduced bias?

  • Index test: Could the conduct or interpretation of the index test have introduced bias?

  • Reference standard: Could the reference standard, its conduct, or its interpretation have introduced bias?

  • Flow and timing: Could the patient flow have introduced bias?

Concerns regarding applicability (high/low/unclear)

  • Patient selection: Are there concerns that the included patients do not match the review question?

  • Index test: Are there concerns that the index test, its conduct, or interpretation differ from the review question?

  • Reference standard: Are there concerns that the target condition as defined by the reference standard does not match the review question?

Appendix 6. Anchoring statements for quality assessment of [index test] diagnostic studies

Table 1: Anchoring statements to assist with assessment for risk of bias

Question Response and weighting Explanation
Patient Selection
Was the sampling method appropriate?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Where sampling is used, the designs least likely to cause bias are consecutive sampling or random sampling. Sampling that is based on volunteers or selecting subjects from a clinic or research resource is prone to bias.
Was a case-control or similar design avoided?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Designs similar to case control that may introduce bias are those designs where the study team deliberately increase or decrease the proportion of subjects with the target condition, which may not be representative. Some case control methods may already be excluded if they mix subjects from various settings.
Are exclusion criteria described and appropriate?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Study will be automatically graded unclear if exclusions are not detailed (pending contact with study authors). Where exclusions are detailed, the study will be graded as "low risk" if exclusions are felt to be appropriate by the review authors. Certain exclusions common to many studies of dementia are: medical instability; terminal disease; alcohol/substance misuse; concomitant psychiatric diagnosis; other neurodegenerative condition. Exclusions are not felt to be appropriate if 'difficult to diagnose' patients are excluded. Post hoc and inappropriate exclusions will be labelled "high risk" of bias.
Index Test
Was rCBF SPECT assessment/interpretation performed without knowledge of reference standard?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Terms such as "blinded" or "independently and without knowledge of" are sufficient and full details of the blinding procedure are not required. Interpretation of the results of the index test may be influenced by knowledge of the results of reference standard. If the index test is always interpreted prior to the reference standard then the person interpreting the index test cannot be aware of the results of the reference standard and so this item could be rated as 'yes'.
Were rCBF SPECT thresholds pre-specified?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

For scales and biomarkers there is often a reference point (in units or categories) above which subjects are classified as "test positive"; this may be referred to as the threshold, clinical cut-off or dichotomisation point. A study is classified as high risk of bias if the authors define the optimal cut-off post hoc based on their own study data, because selecting the threshold to maximise sensitivity and/or specificity may lead to overoptimistic measures of test performance.

Certain papers may use an alternative methodology for analysis that does not use thresholds and these papers should be classified as not applicable.

Reference Standard
Is the assessment used for clinical diagnosis of FTD acceptable?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Ante-mortem clinical diagnosis of FTD will be based on recognised diagnostic criteria, The Lund and Manchester Groups 1994 or Neary 1998 or McKhann 2001 criteria (as previously outlined) or histopathological diagnosis and/or genetic mutation known to be causative of FTD (if available).

For other types of dementia potentially used to define control groups in our review, clinical diagnosis of dementia will include all-cause (unspecified) dementia, using any recognised diagnostic criteria, for example DSM-IV and ICD-10 (American Psychiatric Association 2013; WHO 2010). The NINCDS-ADRDA criteria (National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association) are the most accepted ante-mortem clinical consensus gold standard for Alzheimer's Dementia (McKhann 1984), defining three ante-mortem groups: 'probable', 'possible' and 'unlikely' Alzheimer's Dementia. The Consortium to Establish a Registry for Alzheimer's Disease (CERAD) (Mirra 1991), ICD-10 and DSM-IV definitions of AD are also acceptable. The National Institute of Neurological Disorders and Stroke - Association Internationale pour la Recherche et l'Enseignement en Neurosciences (NINDS-AIREN) (Román 1993), Alzheimer’s Disease Diagnostic and Treatment Centers (ADDTC) (Chui 1992), DSM-IV, ICD-10 and Cambridge Mental Disorders of the Elderly Examination (CAMDEX) (Hendrie 1988) criteria are all acceptable for the diagnosis of vascular dementia (VaD).

Where the criteria used for assessment are not familiar to the review authors or the Cochrane Dementia and Cognitive Improvement group ('unclear'), this item should be classified as "high risk of bias".

Was clinical assessment for FTD performed without knowledge of the rCBF SPECT biomarker?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Terms such as "blinded" or "independently and without knowledge of" are sufficient and full details of the blinding procedure are not required. Interpretation of the results of the reference standard may be influenced by knowledge of the results of index test.
Patient flow
Was there an appropriate interval between rCBF SPECT and clinical assessment?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

For cross-sectional case control studies the index test and the reference standard are ideally administered on the same day, but a delay is unlikely to introduce bias as the condition of dementia is irreversible.

The time between reference standard and index test will influence the accuracy (Geslani 2005; Okello 2009; Visser 2006), and therefore we will note time as a separate variable (both within and between studies) and will test its influence on the diagnostic accuracy. We have set a minimum mean time to follow-up assessment of 1 year for longitudinal cohort (delayed verification) studies.

Did all subjects get the same assessment for dementia regardless of rCBF SPECT result?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

There may be scenarios where subjects who score "test positive" on index test have a more detailed assessment. Where dementia assessment (i.e. reference standard) differs between groups of subjects this should be classified as high risk of bias.
Were all patients who received rCBF SPECT assessment included in the final analysis?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

If the number of patients enrolled differs from the number of patients included in the 2x2 table then there is the potential for bias.

If participants whose data are missing due to drop-out differ systematically from those who remain, then estimates of test performance may differ. If drop-outs occur, these should be accounted for; a maximum proportion of drop-outs to remain at low risk of bias has been specified as 20%, but this will depend upon length of follow-up in longitudinal cohort studies.

Were missing or uninterpretable rCBF SPECT results reported?

No = high risk of bias

Yes = low risk of bias

Unclear = unclear risk of bias

Where missing or uninterpretable results are reported, and if there is substantial attrition (we have set an arbitrary value of 50% missing data), this should be scored as 'high risk of bias'. If these results are not reported, this should be scored as 'unclear' and authors will be contacted.
Anchoring statements to assist with assessment for applicability
Question Explanation
Were included patients representative of the general population of interest?

The included patients should match the intended population as described in the review question. The review authors should consider population in terms of symptoms; pre-testing; potential disease prevalence; setting

If there are clear grounds for suspecting an unrepresentative spectrum the item should be rated poor applicability.

Index test
Were sufficient data on rCBF SPECT application given for the test to be repeated in an independent study?

Variation in technology, test execution, and test interpretation may affect the estimate of accuracy. In addition, the background and training/expertise of the assessor should be reported and taken into consideration. If the rCBF SPECT biomarker was not performed consistently this item should be rated poor applicability.
Reference Standard
Was clinical diagnosis of dementia made in a manner similar to current clinical practice?

For many reviews, inclusion criteria and assessment for risk of bias will already have assessed the dementia diagnosis. For certain reviews an applicability statement relating to reference standard may not be applicable. There is the possibility that a current reference standard, although valid, may diagnose a far smaller proportion of subjects with disease than in usual clinical practice. In this instance the item should be rated poor applicability.

Table 2: Review question and inclusion criteria

Category Review Question Inclusion Criteria
Patients

Participants with suspected FTD (Primary Objective 1)

 

 

Participants fulfilling the criteria for the clinical diagnosis of any forms of dementia in secondary and tertiary care setting
Index Test: rCBF SPECT biomarker (review question); rCBF SPECT biomarker (inclusion criteria)
Target Condition

Frontotemporal dementia (FTD)

 

Initial diagnosis of FTD

 

Differential diagnosis of FTD from other dementia subtypes

 

Reference Standard

Diagnosis of FTD as determined by the Manchester-Lund or NINDS criteria,

Histopathological confirmation of diagnosis and/or genetic mutation known to be causative of FTD (if available)

 

Diagnosis of FTD as determined by the Manchester-Lund or NINDS criteria,

Histopathological confirmation of diagnosis and/or genetic mutation known to be causative of FTD (if available)

 

NINCDS-ADRDA (National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association-Criteria) is the most accepted ante-mortem clinical consensus gold standard for Alzheimer's Dementia, defining three ante-mortem groups: 'probable', 'possible' and 'unlikely' Alzheimer's Dementia. CERAD, ADDTC, ICD-10 and DSM-IV definitions of ADD are also acceptable.

 

NINDS-AIREN, ADDTC, DSM-IV, ICD-10 and CAMDEX criteria are acceptable for VaD.

 

Outcome: N/A (review question); data to construct 2x2 table (inclusion criteria)
Study Design: N/A (review question); inclusion criteria as follows

Longitudinal cohort studies and nested case-control studies if they incorporate a delayed verification design (case-control nested in cohort studies) (Objectives)

Cross-sectional studies in which: i) rCBF SPECT results and the clinical diagnostic criteria were obtained within a narrow time-frame, and ii) FTD patients were differentiated from patients with other dementia subtypes

 

In assessing individual items, the score of unclear should only be given if there is genuine uncertainty.  In these situations review authors will contact the relevant study teams for additional information.

Abbreviations: FTD = frontotemporal dementia; NINDS = National Institute of Neurological Disorders and Stroke

Contributions of authors

All authors contributed to the drafting of the protocol.

Declarations of interest

None.
