Reliability and validity of the Japanese version of the Frontal Assessment Battery in patients with the frontal variant of frontotemporal dementia
Shutaro Nakaaki, MD, PhD, Department of Psychiatry and Cognitive-Behavioral Medicine, Nagoya City University Graduate School of Medical Sciences, Mizuho-cho, Mizuho-ku, Nagoya 467-8601, Japan. Email: email@example.com
Abstract Patients with the frontal variant of frontotemporal dementia (fv-FTD) exhibit deficits of executive functions. However, no single executive function task has yet been established for detecting these deficits in fv-FTD patients. The frontal assessment battery (FAB) devised by Dubois et al. (2000) has been reported to be a quick and simple bedside screening test that is sensitive for differentiating between FTD and Alzheimer’s disease (AD). The present study was conducted with the aim of ascertaining the reliability and validity of the Japanese version of the FAB among Japanese patients with fv-FTD. The Japanese version of the FAB was given to patients with mild fv-FTD (n = 18) and those with AD (n = 18). The test–retest reliability was evaluated after a 3-week interval by the same interviewer. Data from the Wisconsin Card Sorting Test (Keio version: KWCST) were also collected to ascertain the validity of the FAB. The Japanese version of the FAB exhibited good internal reliability (Cronbach’s α: 0.70, 95% confidence interval [CI] = 0.50–0.84) and good test–retest reliability (intraclass correlation coefficient: 0.89, 95%CI = 0.77–0.95). Significant correlations were observed between the total FAB score and both the category achieved (r = 0.454, P < 0.05) and the number of perseveration errors (r = 0.719, P < 0.01) in the KWCST. A cut-off of 10 for the total FAB score yielded the best combination of sensitivity (85%) and specificity (92%) for discriminating between patients with fv-FTD and AD, with the highest positive likelihood ratio (12.0, 95%CI = 2.6–55.4). The Japanese version of the FAB offers promise as an easy and quick bedside screening test to distinguish fv-FTD from AD.
Although the frontal variant of frontotemporal dementia (fv-FTD) is characterized by executive dysfunction even from the early stage, it is not easy to detect abnormalities of executive functions in these patients, even by detailed neuropsychological assessment. For example, in the case reported by Lough et al., the results of the Wisconsin Card Sorting Test did not reveal any abnormalities of executive functions despite the manifestation of behavioral disturbances in the patient.1 Similarly, standard frontal tasks have been shown to be inadequate for detecting orbital–frontal dysfunction.2 Several cognitive tasks have been developed to detect FTD in patients with preserved cognitive ability (episodic memory, spatial skills, praxis etc.);3,4 but these tests take more than 15 min to complete in subjects with dementia. In addition, the use of Addenbrooke’s Cognitive Examination (ACE)4 is limited by the possibility of a false positive diagnosis of early dementia.
Patients with FTD characteristically exhibit drastic behavioral changes. In particular, these patients exhibit more severe disinhibition and euphoria than patients with Alzheimer’s disease (AD). Recent studies have demonstrated that stereotypic and ritualistic behaviors are distinctive clinical features of FTD.5,6 Therefore, while behavioral assessment is a useful tool to capture the behavioral changes in FTD, information provided by a caregiver living with the patient is also of great significance. Anatomic heterogeneity associated with FTD makes the assessment of behavioral symptoms in these patients even more complex; for example, the behavioral symptoms differ between patients with right-sided and those with left-sided involvement.7
Recently, the Frontal Assessment Battery (FAB) was developed as a bedside test by Dubois et al.8 This test is easily administered at the bedside, and evaluation of executive functions using this test takes no longer than 10 min. The FAB consists of six subtests to explore the different aspects of frontal lobe function. The performance in each subtest is rated on a scale of 0–3, with lower scores indicating greater degrees of executive dysfunction. The sensitivity and validity of this assessment tool have been demonstrated in various neurodegenerative diseases associated with frontal lobe dysfunction.9 The validity of the FAB for distinguishing patients with early stage FTD from those with AD has also been demonstrated.10
Unfortunately, despite its usefulness for clinical assessment, to the best of our knowledge, the Japanese version of the FAB has not been used widely, except in cases of Parkinson’s disease.11 To promote the use of the Japanese version of the FAB, we examined the reliability and validity of this assessment tool in patients with FTD in the present study. In addition, we attempted to define an appropriate cut-off for the total FAB score to distinguish between patients with early stage FTD and those with AD.
Thirty-six Japanese patients attending Nagoya City University Hospital as outpatients participated in the present study. The diagnostic evaluation included complete history and physical examination, routine blood tests (including estimation of the serum vitamin B12 level and thyroid function), magnetic resonance imaging (MRI) of the brain, and neuropsychological testing. Eighteen patients were diagnosed as having probable fv-FTD, based on the Lund and Manchester criteria.12 An informant-based history of progressive change in personality and behavior was obtained for all of the patients with fv-FTD, and all of the fv-FTD patients had either predominantly frontal lobe atrophy on brain MRI or frontal lobe hypoperfusion on brain single-photon emission computed tomography (SPECT). We enrolled patients with early stage fv-FTD as determined according to the clinical dementia rating (CDR). We defined the early stage as being represented by a CDR of either 0.5 (n = 12) or 1 (n = 6). None of the fv-FTD patients had parkinsonism. Patients diagnosed as having the temporal variant of FTD (semantic dementia, progressive non-fluent aphasia) were excluded, because the FAB is designed to assess executive function deficits and is not an appropriate tool for assessing the semantic memory impairment or aphasia that dominates the clinical picture in these patients. Eighteen patients with AD, matched for age, sex ratio, education, and score on the Mini-Mental State Examination (MMSE),13 were also enrolled in the study.
A diagnosis of probable AD was made in accordance with the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer’s Disease and Related Disorders Association (NINCDS/ADRDA) criteria.14 In addition, 18 normal elderly subjects were recruited from the community as a normal control group; all of these subjects had an MMSE score of >27 (out of 30), and none had any previous history of neurologic and/or psychiatric illness, hearing loss, or visual disability.
Patients who met any of the following criteria were excluded from the study: (i) presence of other neurological diseases; (ii) history of mental illness or substance abuse prior to the onset of dementia; (iii) focal brain lesions as seen on MRI; (iv) MMSE score <11; (v) inability to obtain reliable informed consent from the patient and/or his/her relatives; and (vi) current treatment with either acetylcholinesterase inhibitors or neuroleptic medication.
Patient demographic data are summarized in Table 1. There were no significant differences in the demographic variables (age, sex ratio, education) or the duration of illness and MMSE scores between the two patient groups (fv-FTD subjects, AD patients). The study protocol was approved by the Ethics Committee of Nagoya City University Medical School. All the subjects were informed about the purpose and procedures of the present study, and signed an informed consent form prior to their participation in the study.
Table 1. Subject characteristics (mean ± SD)
| |fv-FTD|AD|Normal controls|
|---|---|---|---|
|Age (years)|64.3 ± 6.7|64.4 ± 6.1|65.4 ± 4.9|
|Education (years)|10.0 ± 1.2|10.2 ± 1.3|10.7 ± 1.1|
|Duration of illness (years)|2.5 ± 0.8|2.3 ± 0.6|(–)|
|MMSE score|22.2 ± 2.2*|21.6 ± 1.3*|28.9 ± 0.8|
|FAB|6.7 ± 2.5*†|12.2 ± 1.6*|16.5 ± 1.0|
Takagi et al. translated the original English version of the FAB8 into Japanese.11 We slightly modified this translated Japanese version of the FAB,11 only with respect to subtest 2 (lexical fluency). Takagi et al. specified the letter ‘sa’ for the lexical fluency task but, as they themselves later suggested, this letter is only infrequently used in the Japanese language.11 Therefore, we adopted the letter ‘a’ as more appropriate for the lexical fluency task.15 The FAB test battery consists of six subtests, namely, those for similarities (conceptualization), lexical fluency (mental flexibility), motor series (programming), conflicting instructions (sensitivity to interference), go–no go (inhibitory control), and prehension behavior (environmental autonomy). The score in each subtest was graded on a scale of 0–3, and the maximum possible total score obtained by adding the scores in each of the subtests was 18. Higher scores signaled better performance in the test.
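The scoring rule described above (six subtests, each graded 0–3, summed to a total of 0–18) can be illustrated with a minimal Python sketch. The function name, the dictionary keys, and the example scores are hypothetical and chosen only for illustration; they are not part of the FAB itself or of the present study's data.

```python
# Minimal sketch of FAB total scoring. Names and example values are
# hypothetical; only the rule (six subtests, 0-3 each, sum 0-18) is
# taken from the text.
FAB_SUBTESTS = (
    "similarities",
    "lexical_fluency",
    "motor_series",
    "conflicting_instructions",
    "go_no_go",
    "prehension_behavior",
)


def fab_total(scores):
    """Return the total FAB score (0-18) from six subtest scores (0-3 each)."""
    missing = set(FAB_SUBTESTS) - set(scores)
    if missing:
        raise ValueError(f"missing subtests: {sorted(missing)}")
    for name in FAB_SUBTESTS:
        if scores[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be an integer from 0 to 3")
    return sum(scores[name] for name in FAB_SUBTESTS)


# Hypothetical subject, not taken from the study data.
example_scores = {
    "similarities": 2,
    "lexical_fluency": 1,
    "motor_series": 1,
    "conflicting_instructions": 1,
    "go_no_go": 0,
    "prehension_behavior": 1,
}
print(fab_total(example_scores))
```

Higher totals correspond to better performance, so a subject scoring 3 on every subtest would reach the maximum of 18.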
We used the SPSS 11.0 J software for Windows (SPSS, Chicago, IL, USA) for the statistical analysis.
The reliability of this scale was assessed by two methods. First, the test–retest reliability was assessed on a subset of 18 fv-FTD patients, 3 weeks after the initial evaluation. The test–retest reliability was estimated using the intraclass correlation coefficient derived from analysis of variance (ANOVA-ICC). In general, an ANOVA-ICC >0.7 indicates good reliability. Second, the internal consistency of the scale was estimated by determining Cronbach’s alpha among the 18 patients. A Cronbach’s alpha between 0.70 and 0.90 indicates good internal consistency.
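For readers unfamiliar with these two reliability indices, the following Python sketch shows how each can be computed from raw scores. It is an illustration under stated assumptions, not the analysis performed in SPSS: the text does not specify which ICC model was used, so a one-way random-effects ICC is assumed here, and all function names and data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of subjects, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_oneway(sessions):
    """One-way random-effects ICC for n subjects measured k times each.

    ASSUMPTION: the one-way model is chosen for illustration only; the
    source does not state which ICC form was used.
    """
    n = len(sessions)
    k = len(sessions[0])
    grand_mean = sum(sum(s) for s in sessions) / (n * k)
    subject_means = [sum(s) / k for s in sessions]
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for s, m in zip(sessions, subject_means) for x in s
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: five subjects, six subtest scores each.
items = [
    [2, 1, 1, 1, 0, 1],
    [3, 2, 1, 2, 1, 3],
    [1, 1, 0, 1, 0, 0],
    [2, 2, 1, 1, 1, 2],
    [3, 2, 2, 3, 2, 3],
]
# Hypothetical test and retest totals for the same five subjects.
test_retest = [(6, 7), (12, 11), (3, 4), (9, 9), (15, 14)]

print(round(cronbach_alpha(items), 2))
print(round(icc_oneway(test_retest), 2))
```

Both indices approach 1 when scores are highly consistent, across subtests for alpha and across sessions for the ICC.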
First, in order to examine the concurrent validity, we assessed the correlation between the performance of the fv-FTD patients in the FAB and in the Keio version of the Wisconsin Card Sorting Test (KWCST)16 as the gold standard for the assessment of executive functions. We adopted the category achieved and the number of perseveration errors (the number of repetitions immediately preceding the incorrect response) in the KWCST as the indices for the present study, because these two indices have been considered to reflect aspects of executive dysfunction.8 Second, in order to examine discriminant validity, we used a stepwise discriminant analysis between the fv-FTD group and the AD group. In addition, a receiver operating characteristic (ROC) curve analysis was applied to determine the cut-off for the total FAB score for differentiating between patients with fv-FTD and those with AD. We validated the cut-off by calculating the sensitivity, specificity and likelihood ratio (LR) for distinguishing between patients with fv-FTD and those with AD.
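The cut-off evaluation described above can be sketched as follows. This is a simplified illustration, not the ROC procedure run in SPSS. It assumes that a total FAB score at or below the cut-off is classified as fv-FTD; the text does not state whether the cut-off is inclusive, so this convention is an assumption. The scores are hypothetical.

```python
def sens_spec(ftd_scores, ad_scores, cutoff):
    """Sensitivity and specificity of 'score <= cutoff' as a test for fv-FTD.

    ASSUMPTION: scores at or below the cut-off count as positive.
    """
    true_pos = sum(score <= cutoff for score in ftd_scores)
    true_neg = sum(score > cutoff for score in ad_scores)
    return true_pos / len(ftd_scores), true_neg / len(ad_scores)


# Hypothetical total FAB scores, not the study data.
ftd = [4, 6, 7, 9, 11]
ad = [10, 12, 12, 13, 14]

# Scan candidate cut-offs, as an ROC analysis does over all thresholds.
for cutoff in range(9, 15):
    sensitivity, specificity = sens_spec(ftd, ad, cutoff)
    print(cutoff, sensitivity, specificity)
```

Raising the cut-off classifies more subjects as fv-FTD, so sensitivity can only rise and specificity can only fall; the ROC curve plots this trade-off.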
Cronbach’s alpha for the patients (0.70, 95% confidence interval [CI] = 0.50–0.84) indicated good internal consistency across the six subtests of the FAB. The test–retest reliability (n = 18) of the total score 3 weeks after the initial evaluation was excellent (ICC = 0.89, 95%CI = 0.77–0.95), confirming the external reliability of the FAB.
There was a significant correlation between the FAB score and the category achieved in the KWCST (r = 0.454, P < 0.05). A significant correlation was also observed between the FAB score and the number of perseveration errors in the KWCST (r = 0.719, P < 0.01). A stepwise multiple regression analysis was conducted with the total FAB score of the patients as the dependent variable and age, education, MMSE score, and both the category achieved and the number of perseveration errors in the KWCST as independent variables. The regression analysis indicated that the number of perseveration errors in the KWCST was the only significant predictor of the total FAB score (standardized regression coefficient = −0.828, P = 0.002).
The total FAB scores and the scores in each subtest of the FAB are summarized in Table 1 and Table 2, respectively. One-way ANOVA showed a significant effect of group on both the total FAB score and the score in each subtest (P < 0.01). Post-hoc comparisons (Bonferroni) showed that the patients with fv-FTD had significantly lower total FAB scores as well as lower scores in each of the subtests, except for the lexical fluency (mental flexibility) and motor series (programming) subtests, as compared with the AD patients. Although the subjects in the normal control group had significantly better total FAB scores and better scores in each subtest than the patients (P < 0.01), there were no significant differences between the normal control group and the AD group in regard to the scores in the subtests for similarities (conceptualization) and prehension behavior (environmental autonomy).
Table 2. Mean FAB subtest scores (mean ± SD)
|Subtest|fv-FTD|AD|P (fv-FTD vs AD)|Normal controls|
|---|---|---|---|---|
|Similarities (conceptualization)|1.8 ± 0.7|2.5 ± 0.6|0.025*|2.8 ± 0.3|
|Lexical fluency (mental flexibility)|1.5 ± 0.8|1.7 ± 0.6|0.61|2.5 ± 0.5|
|Motor series (programming)|0.8 ± 0.5|1.2 ± 0.7|0.14|2.4 ± 0.5|
|Conflicting instructions (sensitivity to interference)|1.1 ± 0.6|2.3 ± 1.0|<0.001*|2.9 ± 0.2|
|Go–no go (inhibitory control)|0.6 ± 0.4|1.5 ± 0.7|0.002*|2.7 ± 0.4|
|Prehension behavior (environmental autonomy)|0.8 ± 1.1|2.9 ± 0.2|<0.001*|3.0 ± 0|
A discriminant analysis between the fv-FTD patients and AD patients yielded a canonical discriminant function for the total FAB score (r = 0.850, Wilks’ λ = 0.278, P < 0.001); the total FAB score correctly identified 85.7% of the fv-FTD patients. A stepwise discriminant analysis between the fv-FTD patients and AD patients with the scores in the six FAB subtests as independent variables yielded a canonical discriminant function with two of the subtest scores, that is, conflicting instructions (sensitivity to interference) and prehension behavior (environmental autonomy) (r = 0.836, Wilks’ λ = 0.301, P < 0.001). The scores in these two subtests correctly identified 78.6% of the fv-FTD patients. Both the ROC curve and Table 3 indicate that a cut-off for the total FAB score of either 10 or 11 yielded a high sensitivity and specificity for discriminating between fv-FTD patients and AD patients: the sensitivity was 85% at both cut-offs, and the specificity was 92% at a cut-off of 10 and 85% at a cut-off of 11. In addition, examination of the positive LR also supported the validity of a cut-off of 10 or 11 to discriminate between the fv-FTD patients and AD patients.
Table 3. Cut-off for total FAB score for differentiating between patients with fv-FTD and AD
|Cut-off|Sensitivity (%)|Specificity (%)|Positive LR|95%CI|
|---|---|---|---|---|
|9|78|100| | |
|10|85|92|12.0|(2.6–55.4)|
|11|85|85|6.1|(1.9–18.9)|
|12|100|71|3.5|(1.6–7.5)|
|13|100|63|1.8|(1.1–2.7)|
|14|100|21|1.3|(1.0–1.7)|
We confirmed that the Japanese version of the FAB has good internal as well as external reliability. We also demonstrated the satisfactory concurrent validity and discriminant validity of this assessment tool. To the best of our knowledge, this is the first study to demonstrate the reliability and validity of the Japanese version of the FAB for the evaluation of patients with fv-FTD.
In terms of concurrent validity, regression analysis showed that the number of perseveration errors in the KWCST was the only significant predictor of the total FAB score. Recent studies suggest that the WCST may not assess any single unitary function, but a complex cognitive process.17,18 These studies found that an increase in the number of perseveration errors in the WCST, such as the ‘stuck in set’ perseveration defined by Nelson (repetitions immediately preceding the incorrect response), may reflect executive dysfunction. In a study conducted by Nagahama et al., the regional cerebral blood flow in the dorsal prefrontal area as determined by brain SPECT was significantly correlated with the number of ‘stuck in set’ perseverations.18 Therefore, the association observed in the present study between the total FAB score and the number of perseveration errors in the KWCST suggests that the total FAB score reflects executive dysfunction.
In the present study, a total FAB score of 10 was considered the optimal cut-off for differentiating between fv-FTD and AD, because it yielded a high sensitivity (85%), a high specificity (92%) and also a high positive LR. When the cut-off score was set at >10, the specificity of the test for discriminating between fv-FTD and AD decreased from 85% to 21%, as shown in Table 3. The positive LR is the ratio of the probability of a score below the cut-off (10) in patients with fv-FTD to the probability of such a score in patients with AD. As compared with other cut-off scores, the cut-off score of 10 yielded a high positive LR, indicating that a total FAB score of <10 was 12 times more likely to occur in patients with fv-FTD than in patients with AD. Our proposed cut-off is close to the cut-off of 11 proposed by Slachevsky et al.10 to distinguish between patients with fv-FTD and those with AD.
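The positive LR discussed above follows the standard definition, sensitivity / (1 − specificity). The short Python sketch below applies this formula to the rounded percentages reported for the cut-off of 10; the function name is hypothetical. Because the inputs are rounded, the result (about 10.6) differs from the 12.0 given in Table 3, which was presumably calculated from unrounded proportions.

```python
def positive_lr(sensitivity, specificity):
    """Positive likelihood ratio: sensitivity / (1 - specificity).

    Both arguments are proportions between 0 and 1.
    """
    if specificity >= 1.0:
        # No false positives, so the ratio has a zero denominator.
        raise ValueError("positive LR is undefined when specificity is 100%")
    return sensitivity / (1.0 - specificity)


# Rounded values reported in the text for the cut-off of 10.
print(round(positive_lr(0.85, 0.92), 1))
```

The same formula also explains why no positive LR is listed for the cut-off of 9 in Table 3: with a specificity of 100%, the denominator is zero and the ratio is undefined.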
With regard to the scores in the subtests of the FAB, there were significant differences in the scores in four of the subtests (similarities, conflicting instructions, go–no go, and prehension behavior) between patients with fv-FTD and those with AD. Furthermore, a stepwise discriminant analysis demonstrated that the scores in two subtests, namely, conflicting instructions and prehension behavior, also discriminated between the two groups of subjects. Recent studies suggest that patients with fv-FTD have degeneration of not only the dorsolateral frontal lobe, but also of the orbital frontal lobe.19,20 Therefore, fv-FTD patients also show deficits in some types of tests of attention and executive functions that are sensitive to abnormalities of the orbital frontal cortex. Consistent with this finding, Slachevsky et al. demonstrated that patients with early stage fv-FTD had worse performance on conflicting instructions than patients with AD.10 Dubois et al. suggest that the ‘conflicting instructions’ subtest of the FAB resembles the Stroop task, because both tasks are considered to be sensitive to interference.8 A recent review also indicated that FTD patients had greater impairment of executive functions, such as in the Stroop task, as compared with AD patients.21 In the prehension task on the FAB, Dubois et al. reported that FTD patients might exhibit either imitation behavior or utilization behavior.8 The imitation behavior is considered to have a discriminatory value for differentiating between patients with FTD and those with AD, consistent with the results reported in two previous studies on the usefulness of the FAB.10,22 Taken together, these findings suggest that, among the six subtests, conflicting instructions and prehension behavior may have the best sensitivity for discriminating between fv-FTD and AD patients.
However, a recent study reported that there was a significant difference between FTD and AD patients in relation to the scores in the following subtests of the FAB: lexical fluency, motor series, and prehension behavior.22 The difference between the present results and those reported by Lipton et al.22 may be attributed to the differing characteristics of the FTD patient samples used. The Lipton et al. study included patients with the temporal variant type of FTD (semantic dementia, progressive non-fluent aphasia) and also some FTD patients with parkinsonism.22
We must point out several limitations of the present study. First, the sample size was relatively small. A study of a larger number of FTD patients is necessary to further endorse the validity of the Japanese version of the FAB. Second, recent studies have shown that executive functions may decline even during the early stage of AD. Swanberg et al. reported that approximately 60% of AD patients manifest executive dysfunction.23 Therefore, even if the FAB were adopted, it might be difficult to distinguish between patients with FTD and those with AD, especially AD patients with executive dysfunction. Unfortunately, we could not assess the executive functions in the AD patients in the present study except with the FAB. Thus, further study is necessary to compare FTD patients with AD patients who have executive dysfunction. Third, we did not assess patients with the temporal type of FTD using the FAB. Therefore, it is unclear whether this tool would also be useful to distinguish between patients with the temporal type of FTD and AD patients.
Despite these limitations, the present study did demonstrate that the Japanese version of the FAB has good reliability and validity. We also determined the optimal cut-off for the total FAB score to distinguish between patients with FTD and patients with AD. We recommend the use of the Japanese version of the FAB for evaluation of patients with dementia as an easily administered assessment tool, even in FTD patients with behavioral problems who may find it difficult to complete other types of executive and attention tasks.