Use of questionnaire infeasibility in order to detect cognitive disorders: Example of the Center for Epidemiologic Studies Depression Scale in psychiatry settings


*Takeshi Nishiyama, MD, PhD, Department of Information and Biological Sciences, Graduate School of Natural Sciences, Nagoya City University, 1 Mizuho-cho, Mizuho-ku, Nagoya 467-8601, Japan. Email:


Aim:  To examine the extent to which cognitive disorders influenced the feasibility and accuracy of both the 20-item and the 10-item Center for Epidemiologic Studies Depression Scale (CES-D).

Methods:  Cross-sectional analyses of 223 first-visit patients in a psychiatric clinic and 108 patients in a psychiatric department in a general hospital were conducted. To assess the influence of age, gender, and the presence of cognitive disorders on the feasibility of both versions of the CES-D, multiple logistic regression was performed with feasibility per se as the dummy dependent variable. In order to assess the accuracy of the CES-D, receiver operating characteristic (ROC) analysis was performed.

Results:  The infeasibility of both types of CES-D was so strongly associated with the presence of cognitive disorders that it can be used as an indicator of cognitive impairment. Moreover, the 10-item CES-D had almost as acceptable an internal consistency reliability as the 20-item CES-D in the study settings.

Conclusions:  Information obtained from both versions of the CES-D could be utilized fully in psychiatry settings by using infeasibility as an indicator of cognitive disorders. Other screening instruments with as heavy a cognitive load as the CES-D could be used in the same manner, obviating the need for instruments specifically designed for dementia. Such usage can decrease the burden on both the respondent and the clinician in clinical practice.

THERE HAS BEEN accumulating evidence that mental disorders are underrecognized in routine clinical practice. Mental disorders such as major depression,1–3 dementia,4 and alcohol use disorder5,6 are often undetected or misdiagnosed in primary care settings. In particular, mental disorders comorbid with the principal mental disorder are also detected only one-half to one-third of the time in psychiatry settings.7–10 To avoid such underrecognition and resulting under-treatment, screening for mental disorders in primary care and psychiatry settings has been recommended by several authors and guidelines.1–6,11,12

The routine administration of a questionnaire to all patients regardless of their risk status in practice-based screening has significant limitations, because cognitive impairment has been reported to reduce acceptability and feasibility.13,14 Because the cognitively impaired segment of the population in clinical settings grows with the general aging of the population, the routine use of a screening instrument will become increasingly difficult as acceptability and feasibility decline. The impact of cognitive impairment, however, on the acceptability, feasibility, and performance of a screening instrument has not been fully investigated. For example, a systematic review of the detection of depression in older adults2 found that only one study specifically examined patients with relatively mild dementia using the Center for Epidemiologic Studies Depression Scale (CES-D).15,16 In this study17 the CES-D demonstrated favorable accuracy in mildly demented patients, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.782 (95% confidence interval [CI]: 0.619–0.945).
Another study, however, that used the Geriatric Depression Scale (GDS) found that the accuracy of this questionnaire was substantially attenuated in an Alzheimer's disease group (AUC, 0.66; 95%CI: 0.62–0.70) relative to a cognitively intact group (AUC, 0.78; 95%CI: 0.71–0.85) in which severe dementia had been excluded.18 All other studies using depression screening instruments, which have some methodological concerns about subject spectrum19 or criterion standards, also excluded severely demented patients and did not clarify the acceptability and feasibility of the instruments.20–26 In contrast, two studies regarding quality of life (QOL) questionnaires examined the acceptability, feasibility, and reliability of the questionnaires among patients with an appropriately broad spectrum of cognitive impairment and found that the severity of dementia had only a small effect on the internal reliability and test–retest reliability, but had a greater effect on the acceptability and feasibility of the measure.13,14

Given these limited findings, the acceptability and feasibility of screening instruments in a population of patients displaying a broad spectrum of cognitive impairment seen in clinical practice remain unclear. Therefore, we examined the acceptability and feasibility, as well as the accuracy, of the Japanese version of CES-D,16 a commonly used depression-screening instrument that has been validated in psychiatry settings. The aim of the present study was to determine the extent to which cognitive impairment influenced the acceptability and feasibility of the CES-D in two different types of psychiatry settings, a psychiatric clinic and the psychiatric department in a general hospital, in order to clarify the generalizability of the results obtained. We used the original 20-item CES-D in the psychiatric clinic and the briefer 10-item CES-D,27,28 designed for adequate feasibility, in the general hospital.



We evaluated the internal consistency reliability, validity, and feasibility of both the 20-item CES-D in the first sample and the 10-item CES-D in the second sample. The first sample was obtained from a psychiatric clinic affiliated with a psychiatric hospital in the suburbs of Nagoya, Japan. The second sample was acquired from the psychiatric department in a general hospital in Nagoya. In both facilities, initial comprehensive evaluations of all patients seeking care were conducted before clinical dispositions were made. If inpatient treatment was needed, patients were referred to the affiliated psychiatric hospitals. As shown in Table 1, we consecutively recruited all first-visit patients in the clinic (n = 391) between 25 March 2003 and 31 January 2004, and those in the hospital (n = 140) between 30 August 2004 and 30 March 2005, regardless of psychotropic medication status before study enrollment. We excluded patients (n = 19 in the clinic sample, n = 4 in the hospital sample) who were blind, seriously hearing impaired, or who could not speak Japanese. Of those eligible for the study, 26 subjects from the clinic and five subjects from the hospital refused to participate. Finally, in our analyses, we included only those patients (clinic sample: n = 223, 54.7% female, mean age ± SD, 40.6 ± 16.8 years; hospital sample: n = 108, 56.5% female, mean age ± SD, 52.8 ± 20.9 years) who were consecutively followed up for more than 2 months. Comparison of the subjects based on inclusion in the study analyses indicated that the excluded subjects from the clinic sample were of similar age (41.3 ± 18.1 years) and similarly often female (53.9%) compared with the included subjects. Similar comparisons made from the hospital sample showed that the excluded subjects were slightly younger (50.9 ± 24.6 years) and were more often female (68.8%). Such a biased sample selection might reduce the generalizability of the hospital sample, particularly with regard to gender.
The ethics committees in both facilities approved the study protocol.

Table 1.  Sampling procedure and subject characteristics

                                            Clinic sample    Hospital sample    P
  Discontinued visit                        −124             −23
  Follow-up more than two months            223              108
Age (years) (mean ± SD)                     40.6 ± 16.8      54.4 ± 20.8        <0.001
Female (%)                                  122 (54.7)       61 (56.5)          0.852
Cognitive disorders (%)                     11 (4.9)         18 (16.7)          <0.001
  Dementia with delirium                    0                2
  Other cognitive disorders                 0                1
  Other cognitive disorders with delirium   1                0
  Mental retardation§                       3                0
Mood disorders (%)                          124 (55.6)       56 (51.9)          0.600
  Major depressive disorder                 113              54
  Other mood disorders                      11               2
  Major depressive episode                  100              49
Psychotic disorders (%)                     30 (13.5)        3 (2.8)            0.005
Anxiety disorders (%)                       33 (14.8)        20 (18.5)          0.481
Other mental disorders (%)                  41 (18.4)        25 (23.1)          0.384

  • Because individuals were given more than one diagnosis, the total does not equal the no. subjects included, and the percentage of each diagnostic group does not sum to 100%.
  • Cognitive disorders: this term refers to delirium, dementia, amnesia, and other cognitive disorders included in the DSM-IV.
  • § Mental retardation is included in this diagnostic class, unlike in the DSM-IV classification.
  • Psychotic disorders: this term refers to schizophrenia and other psychotic disorders in the DSM-IV.


All patients who routinely visited the waiting room were invited to participate in the study. After signing informed consent, they were asked to complete the CES-D before seeing the psychiatrists. When we analyzed the feasibility of administering this instrument, we determined a priori the criteria for infeasibility as outlined here. If a subject had difficulty completing the instrument independently, the instrument was administered in a consistent manner by trained nurses, who read the items aloud, not deviating from the item wording. We judged the instrument infeasible for the subject if more than four items of the 20-item CES-D or more than three items of the 10-item CES-D were not completed even with the help described. The item scores were summed to yield a total CES-D score ranging from 0 (not depressive) to 60 (most depressive) in the long form and from 0 (not depressive) to 10 (most depressive) in the short form, according to the conventional method.15,27 CES-D scores with permissible missing items were imputed based on the mean score obtained.15,27 Additionally, to clarify respondent burden, we surveyed the length of time required by the first 20 subjects recruited in both samples to complete the two CES-D forms.
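The scoring and infeasibility rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_cesd` is hypothetical, responses are assumed to be item scores that are already reverse-coded where needed (0–3 per item in the long form and 0–1 in the short form, which yields the stated score ranges), unanswered items are passed as `None`, and "imputed based on the mean score" is read here as replacing each missing item with the mean of the answered items.

```python
from typing import Optional, Sequence

# Largest number of unanswered items still counted as feasible, per the text:
# infeasible if more than four (20-item) or more than three (10-item) are missing.
MAX_MISSING = {20: 4, 10: 3}


def score_cesd(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the total CES-D score, or None when the form is judged infeasible."""
    n_items = len(responses)
    if n_items not in MAX_MISSING:
        raise ValueError("expected a 20-item or 10-item response vector")

    answered = [r for r in responses if r is not None]
    n_missing = n_items - len(answered)
    if n_missing > MAX_MISSING[n_items]:
        return None  # infeasible: too many items left unanswered

    # Assumption: each missing item is replaced by the mean of the answered
    # items, which is equivalent to rescaling that mean to the full form length.
    return sum(answered) / len(answered) * n_items
```

Returning `None` for an infeasible form keeps infeasibility as a separate piece of information instead of folding it into the depression score, which is how the analyses treat it.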

The in-depth diagnostic assessment was based on longitudinal follow up and all available data, rather than on standardized interviews alone.29 To obtain a prospectively confirmed diagnosis based on the DSM-IV criteria, experienced psychiatrists, blinded to the CES-D score, reviewed any document presented at the first interview, interviewed the patient and persons accompanying him or her, and observed changes in the patients' condition over time. Any questions unresolved by each psychiatrist were discussed and resolved with the first author, who reviewed all available medical records and supervised patients' clinical courses. To increase diagnostic accuracy, we included only those patients who were followed up consecutively for more than 2 months, although no measure of inter-rater reliability was obtained. A major depressive episode was diagnosed regardless of whether patients had received antidepressants at the first visit. Furthermore, diagnoses of cognitive disorders, which consisted of not only dementia but also mental retardation in the present study, were based on the judgment of experienced psychiatrists who utilized neurological examination, neuropsychological testing, appropriate laboratory findings, and, if necessary, neuroimaging during follow up.

Data analysis

Estimates of the internal reliability of the CES-D were computed using Cronbach's alpha. To assess the influence of age, gender, and the presence of cognitive disorders on the feasibility of the CES-D, multiple logistic regression was performed with feasibility per se as the dummy dependent variable. Backward stepwise regression was performed until the Akaike information criterion (AIC) was increased by the removal of any variable in a particular step. Similarly, to assess the influence of age, gender, presence of cognitive disorders, and the presence of a major depressive episode on the CES-D score, backward stepwise linear regression based on the AIC was performed with the CES-D score as the dependent variable.
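For readers unfamiliar with Cronbach's alpha, the following is a minimal sketch of the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), using only the Python standard library. The function name `cronbach_alpha` and the row-per-subject data layout are assumptions made for illustration; this is not the code used in the study, which was run in R.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete item responses (one row per subject)."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two subjects")

    # Sample variance of each item across subjects
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the subjects' total scores
    total_variance = variance([sum(row) for row in rows])

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

When all items rise and fall together across subjects, the total-score variance dominates the summed item variances and alpha approaches 1.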

When the sensitivity and specificity, conventional operating characteristics for criterion validity of a screening instrument, are applied to continuous screening scores, much information is lost. To avoid this limitation, stratum-specific likelihood ratios (LR) were assessed for continuous scales,30 while positive and negative LR, sensitivity, and specificity were assessed for dichotomous scores. Positive and negative predictive values depend on disease prevalence, however, and so are not reported herein.
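The likelihood ratios described above share one form: the proportion of subjects with the disorder who show a given result, divided by the proportion of subjects without the disorder who show the same result. The sketch below implements that ratio with a log-scale 95% confidence interval. The function name `likelihood_ratio`, the counts in the usage lines, and the choice of the log method for the interval are all assumptions for illustration; the source does not state which interval method its spreadsheet program used.

```python
import math
from typing import Tuple


def likelihood_ratio(x_dis: int, n_dis: int, x_non: int, n_non: int,
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Return (LR, lower, upper) for one test result or score stratum.

    x_dis / n_dis: subjects with the disorder showing this result / all with it.
    x_non / n_non: subjects without the disorder showing this result / all without.
    Counts of zero are not handled and raise ZeroDivisionError.
    """
    lr = (x_dis / n_dis) / (x_non / n_non)
    # Assumed interval method: standard error of log(LR) for a ratio of
    # two independent proportions
    se_log = math.sqrt(1 / x_dis - 1 / n_dis + 1 / x_non - 1 / n_non)
    return lr, lr * math.exp(-z * se_log), lr * math.exp(z * se_log)


# Hypothetical 2x2 table (not data from this study):
# 20 subjects with the disorder (15 test positive, 5 negative),
# 100 subjects without it (10 test positive, 90 negative).
lr_positive = likelihood_ratio(15, 20, 10, 100)  # positive LR
lr_negative = likelihood_ratio(5, 20, 90, 100)   # negative LR
```

The same function gives a stratum-specific LR when `x_dis` and `x_non` are the numbers of subjects whose score falls within a given stratum.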

The stratum-specific LR were calculated using a spreadsheet program described by Peirce and Cornell.31 In order to assess the validity of a screening instrument that yields continuous scores, we also conducted ROC analysis. The AUC and its standard error were calculated using SPSS (version 10.0; SPSS, Chicago, IL, USA). All analyses, except as otherwise noted, were performed in the R statistical computing environment for Windows (Version 2.2; R Foundation for Statistical Computing, Vienna, Austria). All tests conducted were two-tailed.
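As an illustration of the ROC analysis, the AUC equals the probability that a randomly chosen depressed subject scores higher than a randomly chosen non-depressed subject, with ties counted as one half. The sketch below computes this quantity directly and adds the Hanley and McNeil approximation for its standard error. The function name `auc_with_se` is hypothetical, and the standard error formula is an assumption chosen for illustration; it is not claimed to match the SPSS output reported here.

```python
import math
from typing import Sequence, Tuple


def auc_with_se(scores_pos: Sequence[float],
                scores_neg: Sequence[float]) -> Tuple[float, float]:
    """AUC by pairwise comparison, with an approximate standard error."""
    n_pos, n_neg = len(scores_pos), len(scores_neg)
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0   # case scores higher than non-case
            elif p == q:
                wins += 0.5   # ties count as one half
    auc = wins / (n_pos * n_neg)

    # Hanley-McNeil approximation (assumed here, see lead-in)
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return auc, math.sqrt(max(var, 0.0))
```

An approximate 95% confidence interval then follows as the AUC plus or minus 1.96 standard errors.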


Table 1 provides the demographic and diagnostic characteristics of the two samples. Reflecting the differences in the study settings, the sample from the general hospital consisted of older patients (t = 4.59, d.f. = 142.1, P < 0.001) with more cognitive disorders (χ2 = 11.11, d.f. = 1, P < 0.001), fewer psychotic disorders (χ2 = 8.09, d.f. = 1, P = 0.004), and a comparable proportion of mood disorders (χ2 = 0.28, d.f. = 1, P = 0.60) compared with the clinic sample.

When the time required to complete the CES-D was surveyed, we found that the 20-item CES-D was much lengthier to administer than the 10-item CES-D (average time ± SD: 5.1 ± 1.8 min for the long form and 1.3 ± 0.3 min for the short form).

On examination of the internal consistency reliability, the alpha of the 20-item CES-D was 0.87 in the clinic sample, while that of the 10-item CES-D was 0.77 in the hospital sample.

On backward logistic regression conducted to assess the influence of age, gender, and the presence of a cognitive disorder, the final logistic regression model included the presence of a cognitive disorder (P < 0.001; odds ratio [OR], 35.6; 95%CI: 6.11–195.4) and age (P < 0.001; OR, 1.07; 95%CI: 1.03–1.11) as significant variables influencing the feasibility of the 20-item CES-D in the clinic sample. Similarly, the final model for the 10-item CES-D in the hospital sample included only the presence of a cognitive disorder (P < 0.001; OR, 146.7; 95%CI: 27.0–796.1). Table 2 presents the cross-tabulation data and the operating characteristics that represent the surprisingly strong association between the presence of cognitive disorders and CES-D infeasibility.

Table 2.  CES-D infeasibility and criterion diagnosis of cognitive disorders

               20-item CES-D         10-item CES-D
LR+ (95%CI)    14.02 (7.25–27.08)    16.43 (6.40–42.17)
LR− (95%CI)    0.29 (0.12–0.69)      0.29 (0.14–0.59)

CES-D, Center for Epidemiologic Studies Depression Scale; CI, confidence interval; LR+, positive likelihood ratio; LR−, negative likelihood ratio.
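
To show how the likelihood ratios in Table 2 could be read clinically, the sketch below converts a pre-test probability into a post-test probability through odds. It takes the prevalence of cognitive disorders in the clinic sample from Table 1 (11 of 223) and the 20-item LR+ and LR− from Table 2. The function name `post_test_probability` is hypothetical, and the resulting probabilities are an illustration of the arithmetic only; they are not results reported by the study and would differ in settings with a different prevalence.

```python
def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Clinic sample: 11 of 223 subjects had a cognitive disorder (Table 1).
prevalence = 11 / 223
after_infeasible = post_test_probability(prevalence, 14.02)  # LR+ from Table 2
after_feasible = post_test_probability(prevalence, 0.29)     # LR− from Table 2
```

With these inputs an infeasible questionnaire raises the probability of a cognitive disorder from about 5% to roughly 40%, whereas a feasible one lowers it to under 2%.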

On backward linear regression performed to assess the influence of age, gender, the presence of a cognitive disorder, and presence of a major depressive episode on the CES-D score, the final linear model included only the presence of a major depressive episode both in the 20-item CES-D (P < 0.001; regression coefficient, 12.01; 95%CI: 9.02–15.01) and the 10-item CES-D (P < 0.001; regression coefficient, 5.44; 95%CI: 3.71–7.18) as the significant variable influencing the CES-D score. Therefore, we did not stratify either of the samples according to age, gender, or the presence of a cognitive disorder in the analyses that followed. Table 3 presents the estimates of stratum-specific LR and the results of the ROC analysis conducted to evaluate the ability of the CES-D to discriminate between depressive and non-depressive subjects. Here a moderate accuracy of the CES-D can be appreciated.

Table 3.  CES-D score and criterion diagnosis of major depressive episode

20-item CES-D                          10-item CES-D
AUC (95%CI)    0.77 (0.70–0.83)        AUC (95%CI)    0.80 (0.71–0.89)
Score          SSLR (95%CI)            Score          SSLR (95%CI)
0–15           0.11 (0.04–0.31)        0–2            0.15 (0.04–0.56)
16–44          1.28 (1.08–1.51)        3–7            0.96 (0.66–1.40)
45–60          5.45 (1.76–16.83)       8–10           4.20 (1.80–9.82)

AUC, area under the receiver operating characteristic curve; CES-D, Center for Epidemiologic Studies Depression Scale; CI, confidence interval; SSLR, stratum-specific likelihood ratio.


The major findings of the present study are the following: (i) infeasibility in the administration of either the 20- or 10-item CES-D is so strongly associated with the presence of cognitive disorders that it can be used as an indicator of cognitive impairment; (ii) the internal consistency reliability of the 10-item CES-D is almost as acceptable as that of the 20-item CES-D for use as a screening instrument for major depression in psychiatry settings; (iii) the 10-item CES-D can be administered with an average time saving of approximately 3.8 min (approx. 75% reduction) compared with the 20-item CES-D, which is in agreement with results previously reported;27 (iv) the scores on both the 20- and 10-item CES-D are not influenced by age, gender, or presence of cognitive disorders.

Psychiatry settings may be suitable for investigating the feasibility of a self-administered questionnaire. First, the psychiatric population tends to have an appropriately broad spectrum of cognitive impairment derived from mental disorders including dementia, which may affect the feasibility of administering an instrument. Second, many questionnaires designed for specific mental disorders have been developed, the reliabilities and validities of which have already been demonstrated. We chose the commonly used depression screening instrument, the CES-D, to examine feasibility in these settings for three reasons: major depression, like other mental disorders, is frequently misdiagnosed in routine psychiatric practice when compared with the criterion diagnosis (κ, 0.1–0.3);9,32 it is the most common condition in psychiatry settings;7,33 and it is also a common (30–50%) complication of dementia.34

Although many studies have been conducted on depression screening questionnaires in demented patients;21,23,25 older residents in nursing homes24,26 or retirement communities;20 older primary care patients;35,36 geriatric patients;22 and psychiatric outpatients,37 the impact of cognitive impairment on the acceptability and feasibility of a screening instrument has not been fully investigated. To the best of our knowledge, there have been only two studies that examined the influence of cognitive impairment on the feasibility and acceptability of questionnaires.13,14 These studies reported relatively low sensitivities when the infeasibility of questionnaires was used as an indicator of the presence of a cognitive disorder, compared with the higher sensitivities of dementia screening instruments (e.g. a sensitivity of 0.96 for serial subtraction of 7s backward to 79).11 These results are in agreement with the present findings.

From a clinical perspective, the purpose of screening is to improve diagnostic recognition, which requires high sensitivity and correspondingly few false negatives so that the clinician can be confident that after a negative test result, there is little need to inquire about the symptoms of the target disorder. In contrast, false positives are less of a problem for a screening instrument because their major cost is the time that a clinician takes to determine that the disorder is not present. Presumably, this is time that the clinician would have spent for the same purpose anyway.7 Therefore, in order to use information about the infeasibility of a questionnaire to detect cognitive impairment, sensitivity would need to be increased substantially. For example, adding some items from dementia screening instruments to the CES-D may improve its sensitivity.

Several methodological issues might have affected the validity of the present study. The first issue concerns the reliability of the criterion diagnosis. Although we used a prospectively confirmed diagnosis to assure accuracy, the relatively low diagnostic comorbidity seen in the present study compared with previously reported results7–10 might stem from the lack of diagnostic reliability. This, however, contrasts sharply with our higher success rate in making a diagnosis than a previous study that used structured interviews,29 which confers higher generalizability to the results. The second issue concerns the comparison between the two types of CES-D, which were not used in the same sample. Therefore, the validity of the two types of CES-D could not be compared directly. Third, because we did not check whether a subject had difficulty in completing the instrument independently, the extent to which external help in the completion of the CES-D can affect its feasibility is not clear. The CES-D was administered, however, in a consistent manner, and thus the comparison of the two types of CES-D should be valid at least in the sample used.

Despite these limitations, these data suggest that we can make the most of the information that both versions of the CES-D provide by using infeasibility per se as an indicator of cognitive disorder in psychiatry settings. It is inferred from these findings that other screening instruments that, like the CES-D, impose a heavy cognitive load (e.g. comprehension, information retrieval, judgment, or selection of appropriate responses) can be used as an indicator of cognitive disorder based on infeasibility information. Obviating the need for instruments specifically designed for dementia in this way will decrease the burden on both the respondent and the clinician in clinical practice.


We are grateful to the participating patients, the psychiatrists, and the assistants at the Okehazama Hospital and the Daido Hospital for their collaboration. None of the authors has any conflict of interest. There was no sponsor for this study.