Keywords:

  • Center for Epidemiologic Studies Depression Scale;
  • feasibility;
  • major depressive disorder;
  • receiver operating characteristic

Abstract

  1. Top of page
  2. Abstract
  3. METHODS
  4. RESULTS
  5. DISCUSSION
  6. ACKNOWLEDGMENTS
  7. REFERENCES

Aims:  The Center for Epidemiologic Studies Depression Scale (CES-D) has been validated to avoid misdiagnoses of major depression in routine psychiatric outpatient settings, but it was reported to be only marginally feasible in these specific settings. A briefer and simpler version, known as the 10-item CES-D, meant to attain adequate feasibility, has been validated in geriatric outpatient settings, but it has not yet been examined in psychiatry outpatient settings. The purpose of the present study was therefore to compare the feasibility, reliability, and validity of the two types of CES-D.

Methods:  A cross-sectional analysis was conducted of 86 consecutive outpatients in a psychiatric department in a general hospital.

Results:  The 10-item CES-D has a higher feasibility than the 20-item CES-D, and its internal consistency, reliability, and validity are almost identical to those of the 20-item CES-D.

Conclusions:  The 10-item CES-D is the better instrument to use because of the higher feasibility than the 20-item CES-D in psychiatric outpatient settings. The different answer format used in each questionnaire (a yes or no format in the former vs a multiple-choice format in the latter) may influence the feasibility, rather than the number of items.

ACCUMULATING EVIDENCE SUGGESTS that major depression, in particular major depression comorbid with dementia,1 is underrecognized in routine psychiatric practice.2–4 To avoid such underrecognition and resulting under-treatment, many screening instruments have been developed to detect the presence of major depression. Few of these instruments, however, have been specifically validated for use in routine psychiatric outpatient settings.2,5,6

Among these screening instruments, Schulberg et al. and Furukawa et al. examined the test characteristics of the Center for Epidemiologic Studies Depression Scale7 (CES-D or the 20-item CES-D) in psychiatric outpatients using semistructured interviews for criterion-standard diagnoses.2,5 Despite the demonstrated utility of the CES-D in these studies, the high CES-D incompletion rate of approximately 20–25% suggests that this tool presents problems for psychiatric patients; specifically, the CES-D utilizes a forced four-choice scale format that patients may find difficult to complete. To reduce such respondent burden and to attain an adequate response rate, a briefer and simpler version of the CES-D, known as the 10-item CES-D, has been proposed.8 The 10-item CES-D has been reported to retain acceptable reliability and validity in geriatric outpatients,8–10 but its reliability and validity have not been investigated in psychiatric outpatient settings.

Furthermore, administering a questionnaire to all patients regardless of risk status, as in practice-based screening, faces significant limitations in routine use with psychiatric outpatients. The psychiatric population tends to have a broad spectrum of cognitive impairment derived from mental disorders,11 which may affect questionnaire feasibility. In particular, cognitive disorders have been strongly associated with the infeasibility of the CES-D.12 Because the cognitively impaired segment of the population in psychiatric settings grows with the general aging of the population, the routine use of a screening instrument will become increasingly impractical as feasibility declines with cognitive impairment. To our knowledge, previous work has not fully investigated the feasibility of any depression screening instrument in psychiatric outpatient settings. Here, an instrument is considered infeasible for a patient who fails to complete more than a predefined threshold number of its items, and ‘feasibility’ refers to the absence of such failure.

The first aim of the present study was therefore to compare the feasibility of the 10-item and 20-item CES-D in a psychiatric outpatient setting. The second aim was to compare the reliability and validity of the two types of CES-D in this setting.

METHODS

Subjects

We consecutively recruited all first-visit outpatients in the psychiatric department in a general hospital in Japan between 30 April 2006, and 30 March 2007. Patients were recruited regardless of whether or not they had received any psychotropic medication just before enrollment in the study. All subjects except for one agreed to participate; thus, a total of 86 subjects were included in the study. Table 1 lists demographic and diagnostic characteristics of the subjects. Those of the participants who failed to complete the two types of CES-D are also listed in Table 1. Comorbid diagnoses according to depressive status are shown in Table 2. The study protocol was approved by the ethics committees in the facility.

Table 1.  Subject characteristics

                                       Total n (%)    20-item CES-D failed n (%)    10-item CES-D failed n (%)
No. subjects                           86             10                            3
Age (years) (mean ± SD)                47.0 ± 20.3    70.5 ± 16.5                   79.3 ± 4.7
Sex: female (%)                        56 (67%)       8 (80)                        2 (67)
DSM-IV diagnosis according to MINI
 Mood disorders                        51 (59.3%)     2 (20.0)                      0
  Depressive episode                   36 (41.9%)     2 (20.0)                      0
 Anxiety disorders                     48 (55.8%)     0                             0
 Substance use disorders               1 (1.2%)       0                             0
 Psychotic disorders                   2 (2.3%)       0                             0
 Eating disorders                      4 (4.7%)       0                             0
 Antisocial personality disorder       1 (1.2%)       0                             0
 Cognitive disorders††                 6 (7.0%)       6 (60.0)                      3 (100.0)
 Others                                14 (16.3%)     2 (20.0)                      0

  • Because individuals were given more than one diagnosis, the total does not agree with the number of subjects included. Similarly, the percentage of each diagnostic group does not sum to 100%.
  • Psychotic disorders: this term refers to schizophrenia and other psychotic disorders in the DSM-IV.
  • †† This term refers to delirium, dementia, and amnestic and other cognitive disorders included in the DSM-IV.
  • CES-D, Center for Epidemiologic Studies Depression Scale; MINI, Mini International Neuropsychiatric Interview.

Table 2.  Comorbid diagnoses according to depressive status

DSM-IV diagnosis according to MINI     Depressive n    Not depressive n
Anxiety disorders                      30              18
Substance use disorders                1               0
Psychotic disorders                    0               2
Eating disorders                       2               2
Antisocial personality disorders       0               1
Cognitive disorders                    0               6
Others                                 2               12

  • Depressive status: major depressive episode according to Mini International Neuropsychiatric Interview (MINI).

Measures

All patients who visited the waiting room were invited to participate in the study. After signing informed consent, they were asked to complete the two types of CES-D before seeing psychiatrists. Throughout the study, the order of administration of the two instruments was assigned randomly to eliminate ordering effects. If a subject had difficulty in completing an instrument alone, it was administered in a consistent manner by trained nurses, who read the items aloud to prevent deviation from the item wording. Following the method of a previous study,5 we considered the 20-item CES-D infeasible for a subject if more than four of its items were not completed for any reason, even with the assistance described. For the 10-item CES-D, we set the corresponding threshold at two or more uncompleted items, which we treated as equivalent to the threshold used for the 20-item CES-D.

The CES-D score was summed to yield a total score ranging from 0 (not depressive) to 60 (most depressive) in the long form and 0 (not depressive) to 10 (most depressive) in the short form, and CES-D scores with permissible missing items were imputed based on the mean score obtained according to the conventional method.7,8 Additionally, to measure respondent burden, we surveyed the length of time required by all subjects to complete the two types of CES-D.
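The completion rule and the scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, unanswered items are represented as `None`, and "imputed based on the mean score" is assumed to mean that each missing item is replaced by the mean of the items the respondent did answer.

```python
from typing import Optional, Sequence

# Thresholds taken from the text: the 20-item form is infeasible when MORE THAN
# four items are missing; the 10-item form when two or more items are missing.
MAX_MISSING_20_ITEM = 4
MAX_MISSING_10_ITEM = 1


def is_infeasible(responses: Sequence[Optional[int]], max_missing: int) -> bool:
    """Return True when the number of unanswered items exceeds max_missing."""
    missing = sum(r is None for r in responses)
    return missing > max_missing


def prorated_total(responses: Sequence[Optional[int]]) -> float:
    """Total score with each missing item replaced by the mean of answered items.

    ASSUMPTION: this is one common reading of mean imputation; the source does
    not spell out the exact procedure.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no items answered; a total score cannot be imputed")
    return sum(answered) / len(answered) * len(responses)


# Hypothetical 20-item respondent: 18 items scored 1, two items left blank.
long_form = [1] * 18 + [None, None]
# Hypothetical 10-item (yes = 1 / no = 0) respondent with one blank item.
short_form = [1, 0, 1, 1, 0, 0, 1, 0, 1, None]
```

With these inputs, `is_infeasible(long_form, MAX_MISSING_20_ITEM)` is false and `prorated_total(long_form)` is 20.0, while the short form is also scored because only one item is missing.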

On the same day as their visits, all subjects were examined using the gold standard Mini International Neuropsychiatric Interview (MINI) to identify current DSM-IV disorders.13 The MINI is a standardized diagnostic interview based on the DSM-IV criteria, and it was developed as a short and efficient package to be used in clinical as well as research settings.14,15 As a first step, the initial 11 consecutive subjects were independently assessed by two psychiatrists (including T.N.) using the MINI. Inter-rater reliability, expressed as kappa, was 0.667 for the 18 disorders included in the MINI and 0.744 for major depressive episode. Given this level of reliability, all subsequent subjects and, where present, the persons accompanying them were interviewed by one experienced psychiatrist (T.N.) who was blind to the CES-D scores.

Furthermore, diagnoses of cognitive disorders were made based on the results of the Mini-Mental State Examination,16 neurological examination, appropriate laboratory findings, and cranial X-ray computed tomography during follow up as recommended by the Quality Standards Subcommittee of the American Academy of Neurology.17

Data analyses

Estimates of the internal reliability of the CES-D were computed using Cronbach's alpha.18 When conventional operating characteristics for the criterion validity, sensitivity, and specificity of a screening instrument are applied to continuous screening scores, a great deal of information is lost. We avoided this limitation by assessing stratum-specific likelihood ratios (SSLR) for continuous scales.19,20 Additionally, positive and negative LR, sensitivity, and specificity were also assessed for the results. Positive and negative predictive values depend on disease prevalence, but they are reported herein.
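As an illustration of how a stratum-specific likelihood ratio is obtained, the sketch below divides the proportion of depressed subjects whose score falls in a stratum by the proportion of non-depressed subjects in the same stratum. The counts are not reported in the paper: they are back-calculated here from the rounded sensitivities and specificities in Table 3, assuming 34 depressed and 42 non-depressed subjects among the 76 who completed both forms, and should be read as an assumption. The function name is hypothetical.

```python
def stratum_lr(dep_in: int, dep_total: int, nondep_in: int, nondep_total: int) -> float:
    """Likelihood ratio for one score stratum:
    P(score in stratum | depressed) / P(score in stratum | not depressed)."""
    return (dep_in / dep_total) / (nondep_in / nondep_total)


# ASSUMED group sizes and per-stratum counts (depressed, not depressed) for the
# 10-item CES-D, inferred from Table 3; not published data.
N_DEP, N_NONDEP = 34, 42
STRATA_10_ITEM = {"0-3": (2, 28), "4-6": (7, 11), "7-10": (25, 3)}

sslr_10_item = {
    label: round(stratum_lr(dep, N_DEP, nondep, N_NONDEP), 2)
    for label, (dep, nondep) in STRATA_10_ITEM.items()
}

# Collapsing the scale into two strata at the 5/6 cut-off yields the
# conventional negative and positive likelihood ratios.
lr_negative = round(stratum_lr(4, N_DEP, 34, N_NONDEP), 2)  # scores 0-5
lr_positive = round(stratum_lr(30, N_DEP, 8, N_NONDEP), 2)  # scores 6-10
```

Under these assumed counts the three strata give 0.09, 0.79 and 10.29, and the two-stratum split gives 0.15 and 4.63, which agree with the point estimates in Table 4. The confidence intervals are not reproduced here.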

Receiver operating characteristic (ROC) analysis was also conducted. Using a non-parametric method,21 we tested whether the two correlated values of the area under the ROC curve (AUC), obtained from the 76 subjects who completed both types of CES-D, were statistically different. Finally, a McNemar χ2 test with continuity correction was used to assess the difference in feasibility between the two types of CES-D in all 86 participants. All analyses were performed in the R statistical computing environment for Windows (version 2.6).22 All tests conducted were two tailed.

RESULTS

When the time required to complete the CES-D was surveyed, the 20-item CES-D took considerably longer to administer than the 10-item CES-D (mean time ± SD: 3.4 ± 2.4 min for the long form and 1.1 ± 1.0 min for the short form). For internal consistency reliability, the alpha coefficients for the 20-item and 10-item CES-D were 0.92 and 0.80, respectively. ROC analysis illustrates the excellent ability of the CES-D to discriminate between depressive and non-depressive subjects. The AUC was 0.89 (95% confidence interval [CI]: 0.82–0.96) for the 20-item CES-D and 0.92 (95%CI: 0.86–0.98) for the 10-item CES-D (Fig. 1). These two AUCs, which were obtained from the 76 subjects completing both types of CES-D, were not significantly different (P = 0.52). Table 3 lists the sensitivity, specificity, and predictive values for the various cut-offs of the two types of CES-D. In addition, Table 4 lists the SSLR and the aforementioned operating characteristics as a whole.

Figure 1. Receiver operating characteristic curves for the 10-item and 20-item Center for Epidemiologic Studies Depression Scale to screen for depressive episodes.

Table 3.  Validity characteristics of the 10-item and 20-item CES-D at different cut-offs

                  Cut-off    Sensitivity    Specificity    PPV     NPV
10-item CES-D     3/4        0.94           0.67           0.70    0.93
                  4/5        0.91           0.69           0.70    0.91
                  5/6*       0.88           0.81           0.79    0.89
                  6/7        0.74           0.93           0.89    0.81
                  7/8        0.47           0.98           0.94    0.69
20-item CES-D     16/17      0.97           0.48           0.60    0.95
                  19/20      0.94           0.55           0.63    0.92
                  20/21      0.94           0.57           0.64    0.92
                  22/23      0.91           0.71           0.72    0.91
                  23/24*     0.91           0.76           0.76    0.91
                  24/25      0.88           0.76           0.75    0.89
                  27/28      0.79           0.76           0.73    0.82
                  30/31      0.76           0.86           0.81    0.82
                  36/37      0.65           0.95           0.92    0.77

  1. * Chosen cut-off scores (set in bold in the original table; identified here from the score ranges in Table 4).
  2. CES-D, Center for Epidemiologic Studies Depression Scale; NPV, negative predictive value; PPV, positive predictive value.

Table 4.  Validity characteristics for the 10-item and 20-item CES-D to screen for depressive episodes

                  CES-D-20                            CES-D-10
                  Score    Value                      Score    Value
AUC (95%CI)                0.89 (0.82–0.96)                    0.92 (0.86–0.98)
SSLR (95% CI)     0–20     0.11 (0.03–0.37)           0–3      0.09 (0.03–0.30)
                  21–36    0.73 (0.39–1.35)           4–6      0.79 (0.35–1.75)
                  37–60    13.59 (3.98–46.35)         7–10     10.29 (3.70–26.63)
LR+ (95% CI)      24–60    3.83 (2.24–6.54)           6–10     4.63 (2.51–8.55)
LR− (95% CI)      0–23     0.12 (0.04–0.32)           0–5      0.15 (0.06–0.35)
Sensitivity                0.91                                0.88
Specificity                0.76                                0.81

  1. AUC, area under the ROC curve; CES-D, Center for Epidemiologic Studies Depression Scale; CI, confidence interval; LR+, positive likelihood ratio; LR−, negative likelihood ratio; SSLR, stratum-specific likelihood ratio.

Finally, on the 10-item CES-D, three patients left all 10 items uncompleted and three patients left one item uncompleted, whereas on the 20-item CES-D, nine patients left all 20 items uncompleted, one patient left six items uncompleted, and one patient left one item uncompleted (Fig. 2). The number of subjects who failed to complete more than the predefined threshold number of items was significantly lower for the 10-item CES-D than for the 20-item CES-D (3/86 vs 10/86; McNemar's χ2 = 5.14, d.f. = 1, P = 0.02). The diagnoses assigned to the subjects who failed to complete the 20-item CES-D consisted of six cases of dementia, two of mental retardation, and two of major depression; for the 10-item CES-D, the diagnoses were three cases of dementia (Table 1).
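The McNemar statistic above depends only on the discordant pairs, that is, subjects who failed one form but not the other. The paper does not report these pairs directly. The sketch below assumes that the three subjects who failed the 10-item CES-D also failed the 20-item CES-D, which leaves seven subjects who failed only the 20-item form and none who failed only the 10-item form. The function name is hypothetical, and the P value uses the closed form for a χ2 distribution with one degree of freedom.

```python
import math


def mcnemar_continuity(b: int, c: int) -> tuple[float, float]:
    """McNemar chi-square with continuity correction for discordant counts b and c.

    Returns (statistic, two-sided P value). For one degree of freedom,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# ASSUMED discordant counts (inferred, not reported):
# 7 subjects failed only the 20-item form, 0 failed only the 10-item form.
statistic, p_value = mcnemar_continuity(7, 0)
```

Under that assumption the statistic is 36/7, about 5.14, with a P value of roughly 0.02, in line with the reported result.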

Figure 2. Number of non-completed items on the (a) 10-item and (b) 20-item Center for Epidemiologic Studies Depression Scale.

DISCUSSION

The major findings of the present study are the following: (i) the feasibility of the 10-item CES-D is significantly and substantially higher than that of the 20-item CES-D; (ii) the internal consistency reliability and validity of the 10-item CES-D were almost identical to those of the 20-item CES-D, which supports its use as a screening instrument for major depression in psychiatric outpatient settings; and (iii) the 10-item CES-D can be administered in approximately 30% of the time necessary for the 20-item CES-D. To the best of our knowledge, this study is the first to evaluate the feasibility of any depression-screening instrument in psychiatric outpatient settings. The second finding is in agreement with results reported in older primary care patients, who tend to have as broad a spectrum of cognitive impairments as the psychiatric population.9,10 With regard to administration time, the third finding is also in agreement with previous reports.8

Unfortunately, the psychiatric population tends to have a broad spectrum of cognitive impairments derived from their mental disorders.11 Furthermore, major depression is a common (30–50%) complication of dementia.23 Significant limitations thus hinder the routine administration of a questionnaire to all patients regardless of risk status in practice-based screening. These limitations arise primarily from patient cognitive impairment, which has been reported to reduce questionnaire acceptability and feasibility.24 To cope with this problem, it is desirable to use questionnaires that are as feasible and acceptable as possible. The present results show that almost all of the subjects who failed to complete the 20-item CES-D were unable to answer any of its items, whereas half of the subjects with uncompleted items on the 10-item CES-D left only one item unanswered (Fig. 2). This suggests that the feasibility of each questionnaire may be influenced less by its number of items than by its answer format: the 20-item CES-D uses a multiple-choice format, whereas the 10-item CES-D uses a yes or no format. Therefore, in terms of feasibility, a questionnaire with a yes or no format (e.g. the 10-item CES-D) may be more suitable for psychiatric outpatient settings than one with a multiple-choice format (e.g. the 20-item CES-D).

From a clinical perspective, the purpose of screening is to improve diagnostic recognition. This requires high sensitivity and a correspondingly small false-negative rate, so that the clinician can be confident that a negative test result indicates little need to inquire about the target disorder's symptoms. In contrast, false positives are less of a problem for a screening instrument because their major cost is the time a clinician takes to determine that the disorder is not in fact present. Presumably, this is time the clinician would have spent for the same purpose in any case.6 This perspective assumes that sensitivity and specificity are used to gauge test performance. If the SSLR is used to gauge test performance instead, it is not necessary to tolerate the cost of a high false-positive rate.

Using the data in strata rather than a series of cut-offs for positive versus negative findings is a more efficient use of the information included in a test. First, a patient's pre-test probability of disease is estimated from experience, local data, or published literature. Next, the pre-test probability can be converted to the post-test probability using the formula:

    post-test odds = pre-test odds × likelihood ratio

Note that these are odds, not probabilities. The conversions are simple but not intuitively obvious: odds = probability / (1 − probability) and probability = odds / (1 + odds).19 For example, consider a patient with a pre-test probability of 30% for a major depressive episode. Those patients with a 10-item CES-D score >7 have a post-test probability of 83% for this episode, whereas those with a score <3 have a post-test probability of 4.9%. Thus, we can make our recognition sensitive and specific at the same time by using an SSLR based on a given test score.
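The conversion can be written out step by step: turn the pre-test probability into odds, multiply the odds by the likelihood ratio for the observed score stratum, and turn the result back into a probability. The sketch below applies this to a 30% pre-test probability using the 10-item SSLR point estimates from Table 4 (10.29 for scores 7–10 and 0.09 for scores 0–3). The function name is hypothetical.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# 30% pre-test probability combined with the Table 4 SSLRs for the 10-item CES-D.
high_stratum = post_test_probability(0.30, 10.29)  # scores 7-10
low_stratum = post_test_probability(0.30, 0.09)    # scores 0-3
```

With these inputs the function gives about 0.82 for the highest stratum and about 0.037 for the lowest. These are close to, but not identical with, the percentages quoted in the text; the difference may reflect rounding or the exact strata used in the original calculation.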

The first limitation of the present study concerns the reliability of the cognitive disorder diagnoses, which, unlike the diagnoses of other disorders, were not based on the MINI. It must be noted that their reliability was not confirmed. The second limitation is the relatively small size of the study sample, which did not permit the examination of variables potentially causing CES-D infeasibility. Third, because we did not record whether or not each subject had difficulty in completing the CES-D, the extent to which external help in completing the instrument affects its feasibility is not clear. Each type of CES-D, however, was administered in a consistent manner, and thus the comparison of the two types of CES-D should be valid at least in the present study.

The fourth issue is the histogram comparison of uncompleted items between the two types of CES-D (Fig. 2), on the basis of which we suggested that the feasibility of each questionnaire could be influenced by its answer format rather than by its number of items. There is a possibility, however, that the feasibility of the questionnaires is influenced by the number of items. For example, most subjects who failed to complete the 20-item CES-D found it too hard to answer its items because of their symptoms (such as lack of self-confidence, lack of concentration, or tiredness). Another explanation is the different factor structures that underlie the two types of CES-D. In the 10-item CES-D, 50% of the items belong to the somatic factor, compared with only 35% of the items in the 20-item CES-D.25 Such differences, rather than the answer format used, may account for the difference in feasibility. To eliminate this uncertainty, a better strategy is to compare versions of the same CES-D that differ only in answer format, for example the 10-item CES-D in a yes-no format and in a multiple-choice format.

To make this comparison, we retrospectively created a 10-item CES-D from the 20-item CES-D, referred to here as the post-hoc 10-item CES-D. The number of items not completed on the post-hoc 10-item CES-D was 10 items for nine patients and four items for one patient, and thus significantly more patients failed to complete the post-hoc 10-item CES-D than the original 10-item CES-D (10/86 vs 3/86; P = 0.02). Therefore, the feasibility of the instruments seemed to be influenced by their answer format rather than by the number of items, although the possibility remains that the number of items also influences feasibility.

Despite these limitations, the present study has a higher success rate in making a diagnosis than previous studies5,9; this confers greater generalizability to the results.

In summary, the present data suggest that the 10-item CES-D (a questionnaire with a yes or no format) is a better instrument to use for detecting major depressive episodes in psychiatric outpatient settings because of (i) a substantial reduction of respondent burden; (ii) the resulting greater feasibility over the 20-item CES-D (a multiple-choice format test); and yet (iii) reliability and validity comparable to the 20-item CES-D. The different answer format used in each questionnaire may influence its feasibility, rather than the number of items.

ACKNOWLEDGMENTS

We are grateful to the participating patients, psychiatrists (including Susumu Adachi), and assistants at the Kariya Memorial Hospital for their collaboration.

REFERENCES

  1. Cohen CI, Hyland K, Kimhy D. The utility of mandatory depression screening of dementia patients in nursing homes. Am. J. Psychiatry 2003; 160: 2012–2017.
  2. Schulberg HC, Saul M, McClelland M, Ganguli M, Christy W, Frank R. Assessing depression in primary medical and psychiatric practices. Arch. Gen. Psychiatry 1985; 42: 1164–1170.
  3. Shear MK, Greeno C, Kang J et al. Diagnosis of nonpsychotic patients in community clinics. Am. J. Psychiatry 2000; 157: 581–587.
  4. Ramirez Basco M, Bostic JQ, Davies D et al. Methods to improve diagnostic accuracy in a community mental health setting. Am. J. Psychiatry 2000; 157: 1599–1605.
  5. Furukawa T, Hirai T, Kitamura T, Takahashi K. Application of the Center for Epidemiologic Studies Depression Scale among first-visit psychiatric patients: A new approach to improve its performance. J. Affect. Disord. 1997; 46: 1–13.
  6. Zimmerman M, Sheeran T. Screening for principal versus comorbid conditions in psychiatric outpatients with the Psychiatric Diagnostic Screening Questionnaire. Psychol. Assess. 2003; 15: 110–114.
  7. Radloff LS. The CES-D scale. A self-report depression scale for research in the general population. Appl. Psychol. Meas. 1977; 1: 385–401.
  8. Kohout FJ, Berkman LF, Evans DA, Cornoni-Huntley J. Two shorter forms of the CES-D (Center for Epidemiological Studies Depression) depression symptoms index. J. Aging Health 1993; 5: 179–193.
  9. Lyness JM, Noel TK, Cox C, King DA, Conwell Y, Caine ED. Screening for depression in elderly primary care patients. A comparison of the Center for Epidemiologic Studies-Depression Scale and the Geriatric Depression Scale. Arch. Intern. Med. 1997; 157: 449–454.
  10. Irwin M, Artin KH, Oxman MN. Screening for depression in the older adult: Criterion validity of the 10-item Center for Epidemiological Studies Depression Scale (CES-D). Arch. Intern. Med. 1999; 159: 1701–1704.
  11. Gorwood P, Corruble E, Falissard B, Goodwin GM. Toxic effects of depression on brain function: Impairment of delayed recall and the cumulative length of depressive disorder in a large sample of depressed outpatients. Am. J. Psychiatry 2008; 165: 731–739.
  12. Nishiyama T, Ozaki N, Iwata N. Using infeasibility information of a questionnaire to detect cognitive disorders: Example of the Center for Epidemiologic Studies Depression Scale in psychiatry settings. Psychiatry Clin. Neurosci. 2009; 63: 23–29.
  13. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, 4th edn. American Psychiatric Association, Washington, DC, 1994.
  14. Sheehan DV, Lecrubier Y, Sheehan KH et al. The Mini-International Neuropsychiatric Interview (M.I.N.I.): The development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J. Clin. Psychiatry 1998; 59 (Suppl. 20): 22–33.
  15. Otsubo T, Tanaka K, Koda R et al. Reliability and validity of Japanese version of the Mini-International Neuropsychiatric Interview. Psychiatry Clin. Neurosci. 2005; 59: 517–526.
  16. Folstein MF, Folstein SE, McHugh PR. ‘Mini-mental state’. A practical method for grading the cognitive state of patients for the clinician. J. Psychiatr. Res. 1975; 12: 189–198.
  17. Knopman DS, DeKosky ST, Cummings JL et al. Practice parameter: Diagnosis of dementia (an evidence-based review). Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology 2001; 56: 1143–1153.
  18. Feldt L, Woodruff D, Salih F. Statistical inference for coefficient alpha. Appl. Psychol. Meas. 1987; 11: 93–103.
  19. Peirce JC, Cornell RG. Integrating stratum-specific likelihood ratios with the analysis of ROC curves. Med. Decis. Making 1993; 13: 141–151.
  20. Sackett DL, Richardson WS, Rosenberg W, Haynes RB. Evidence-Based Medicine, 1st edn. Livingstone, New York, 1997.
  21. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics 1988; 44: 837–845.
  22. R Development Core Team. A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, 2007.
  23. Zubenko GS, Zubenko WN, McPherson S et al. A collaborative study of the emergence and clinical features of the major depressive syndrome of Alzheimer's disease. Am. J. Psychiatry 2003; 160: 857–866.
  24. Bureau-Chalot F, Novella JL, Jolly D, Ankri J, Guillemin F, Blanchard F. Feasibility, acceptability and internal consistency reliability of the Nottingham health profile in dementia patients. Gerontology 2002; 48: 220–225.
  25. McCallum J, Mackinnon A, Simons L, Simons J. Measurement properties of the Center for Epidemiological Studies Depression Scale: An Australian community study of aged persons. J. Gerontol. B Psychol. Sci. Soc. Sci. 1995; 50: S182–S189.