Because Drs. Katz and Yelin are Editors of Arthritis Care & Research, review of this article was handled by the Editor of Arthritis & Rheumatism.
Systemic Lupus Erythematosus
Version of Record online: 31 MAY 2011
Copyright © 2011 by the American College of Rheumatology
Arthritis Care & Research
Volume 63, Issue 6, pages 884–890, June 2011
How to Cite
Julian, L. J., Gregorich, S. E., Tonner, C., Yazdany, J., Trupin, L., Criswell, L. A., Yelin, E. and Katz, P. P. (2011), Using the center for epidemiologic studies depression scale to screen for depression in systemic lupus erythematosus. Arthritis Care Res, 63: 884–890. doi: 10.1002/acr.20447
The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the NIH.
- Issue online: 31 MAY 2011
- Accepted manuscript online: 10 FEB 2011 12:08PM EST
- Manuscript Accepted: 25 JAN 2011
- Manuscript Received: 29 APR 2010
- NIH/National Center for Research Resources, University of California, San Francisco Clinical and Translational Science Institutes. Grant Number: UL1-RR024131
- NIH/National Institute of Arthritis and Musculoskeletal and Skin Diseases. Grant Numbers: P60-AR053308, K08-MH072724
- The Rosalind Russell Medical Research Center for Arthritis
Identifying persons with systemic lupus erythematosus (SLE) at risk for depression would facilitate the identification and treatment of an important comorbidity conferring additional risk for poor outcomes. The purpose of this study was to determine the utility of a brief screening measure, the Center for Epidemiologic Studies Depression Scale (CES-D), in detecting mood disorders in persons with SLE.
This cross-sectional study examined 150 persons with SLE. Screening cut points were empirically derived using threshold selection methods, and receiver operating characteristic curves were estimated. The empirically derived cut points of the CES-D were used as the screening measures and were compared to other commonly used CES-D cut points in addition to other commonly used methods to screen for depression. Diagnoses of major depressive disorder or other mood disorders were determined using a “gold standard” structured clinical interview.
Of the 150 persons with SLE, 26% met criteria for any mood disorder and 17% met criteria for major depressive disorder. Optimal threshold estimations suggested a CES-D cut score of 24 and above, which yielded adequate sensitivity and specificity in detecting major depressive disorder (88% and 93%, respectively) and correctly classified 92% of participants. To detect the presence of any mood disorder, a cut score of 20 and above was suggested, yielding sensitivity and specificity of 87% each and correctly classifying 87% of participants.
These results suggest the CES-D may be a useful screening measure to identify patients at risk for depression.
In 1999, the American College of Rheumatology (ACR) Ad Hoc Committee on Neuropsychiatric Lupus developed a nomenclature system for neuropsychiatric syndromes in systemic lupus erythematosus (SLE) (1), which included depressive disorders of at least moderate severity. The Center for Epidemiologic Studies Depression Scale (CES-D) was also suggested by this committee as the preferred screening method for depression in SLE. Despite this recommendation and widespread use of the CES-D, to our knowledge, no studies have evaluated the psychometric utility of this measure in persons with SLE.
Depression is a common and debilitating comorbidity associated with SLE (2, 3). Quickly identifying patients with sufficient depressive symptoms requiring treatment or referral to specialty care is an ongoing challenge in the clinic, particularly with overlapping somatic symptoms common to both SLE and depression. Conventional methods considered to be the gold standards for diagnosing depressive disorders are lengthy, require highly trained evaluators, and are typically unavailable and/or prohibitively costly in the medical setting.
The primary goal of this study was to determine the utility of the CES-D as a screening instrument for depression in a cohort of persons with SLE. Specifically, we proposed using threshold selection techniques to empirically derive cut points of the CES-D in comparison to traditionally utilized cut points, other commonly used single-item measures for depression, and questions embedded in other disease activity scales and quality of life scales, for detecting mood disorders as determined by a gold standard structured clinical interview (4).
SUBJECTS AND METHODS
Subjects and data collection method.
The sample for this study was drawn from participants in the University of California, San Francisco Lupus Outcomes Study (LOS), a prospective study of 957 individuals with diagnostically confirmed SLE (by medical chart review using ACR criteria) (5). Details about enrollment and data collection for this study have been reported previously (6), and are briefly summarized here. Subjects were recruited through academic medical centers (25%), community rheumatology offices (11%), nonclinical sources, including patient support groups and conferences (26%), and various forms of media (38%). Data were collected primarily through annual telephone interviews. Participants who lived in the greater San Francisco Bay area were recruited for an in-person assessment in the Clinical and Translational Science Institutes Clinical Research Center (CRC) for a comprehensive clinical study evaluating depression, body composition, physical function, and disability. Exclusion criteria for this larger study included not speaking English, age younger than 18 years, an oral prednisone dose of 50 mg or greater, pregnancy, uncorrected vision problems interfering with reading ability, and joint replacement within the past year. Three hundred twenty-five individuals were asked to participate; 74 (22.8%) were ineligible (35 lived too far away, 25 were too ill, 9 had recent surgery, 2 were pregnant, 2 had poor English skills, and 1 had cognitive problems). Of the 251 eligible individuals, 84 (33.5%) declined participation. Reasons for declining were primarily related to transportation (n = 12) and scheduling difficulties (n = 39). One hundred sixty-three individuals completed the study visits. Complete data were obtained for 150 persons. This research protocol was approved by the University of California, San Francisco Committee on Human Research. All participants gave their informed consent to participate.
Sociodemographic and disease characteristics.
Sociodemographic characteristics included age, sex, race/ethnicity, education (college degree versus less than a college degree), marital status (married/with partner versus not), employment status (currently working versus not), and poverty status (household income at or below versus above 125% of the federal poverty threshold). Sociodemographics were collected through the LOS telephone interview.
Health and disease characteristics.
Disease activity was assessed using the Systemic Lupus Activity Questionnaire (SLAQ), a validated, self-report measure of disease activity in SLE that includes items assessing constitutional symptoms, mucocutaneous symptoms, musculoskeletal symptoms, and other disease activity domains (7, 8). A modified SLAQ was also calculated, excluding symptoms overlapping with depression (i.e., depressed mood, concentration problems, and decreased energy). Other disease characteristics included disease duration (years), current treatment with glucocorticoids, mean corticosteroid (prednisone) dose, suspected renal involvement (i.e., reported an abnormal urinalysis, including blood or protein in urine), and reported history of stroke or myocardial infarction. Finally, the Medical Outcomes Study Short Form 36 (SF-36) physical function scale was utilized as a general measure of physical limitations (9). With the exception of information about medications, which was collected at the CRC, all health and disease characteristics were collected in the telephone interview. The mean ± SD interval between assessment of health and disease characteristics and the assessment of depression was 3.3 ± 3 months.
The CES-D (4) was used in this study as the screening measure of depression and was administered in questionnaire form at the CRC visit. The CES-D is a 20-item scale commonly used to evaluate current depressive symptom severity, with a score range of 0–60 (higher scores reflect increased symptom severity). Item responses range from 0 to 3, where 0 = rarely or none of the time (less than 1 day in the past week), 1 = some or a little of the time (1–2 days), 2 = occasionally or a moderate amount of the time (3–4 days), and 3 = most or all of the time (5–7 days). The CES-D was recommended by the ACR Ad Hoc Committee on Neuropsychiatric Lupus as the preferred measure of depressive symptom severity for patients with SLE (1). Empirically derived cut scores were compared to conventional cut scores of 16 and 23 (10). Previous studies have also suggested a cut score of 19 for rheumatoid arthritis, using a 13-item modification of the CES-D that removes somatic items (11). We also compared our empirically derived cut scores to this 13-item modification to determine performance in comparison to other validation studies in rheumatic disease.
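The scoring rule described above can be illustrated with a minimal sketch. The function names `cesd_total` and `screens_positive` are hypothetical and are not part of the study. The standard CES-D reverse-scores its positively worded items, a detail not described in this article; the sketch assumes that recoding has already been applied, so every item is coded 0–3 in the depressive direction.

```python
from typing import Sequence


def cesd_total(item_scores: Sequence[int]) -> int:
    """Sum 20 item responses, each already coded 0-3 (assumed pre-recoded)."""
    if len(item_scores) != 20:
        raise ValueError("the CES-D has 20 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item response must be 0, 1, 2, or 3")
    return sum(item_scores)


def screens_positive(total: int, cut_score: int) -> bool:
    """Apply a 'cut score of X and above' rule: positive when total >= cut_score."""
    return total >= cut_score


# Hypothetical respondent who answers "some or a little of the time" to every item.
total = cesd_total([1] * 20)
print(total, screens_positive(total, 16), screens_positive(total, 23))
```

In this hypothetical case the total of 20 screens positive at the conventional cut score of 16 but not at 23, which is why the choice of cut point matters for a screening instrument.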
Mini-International Neuropsychiatric Interview (MINI)
The MINI (12) was utilized as the gold standard to determine the presence versus absence of either current major depressive disorder or any mood disorder, including dysthymia, minor depression, and major depressive disorder. The MINI was designed as a structured interview to determine diagnoses for the major axis I psychiatric disorders in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) and the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10). Validation and reliability studies have been conducted comparing the MINI to the commonly used Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Third Edition, Revised and the Composite International Diagnostic Interview (a structured interview developed by the World Health Organization for lay interviewers for ICD-10) (12), with results indicating that the MINI has acceptably high validity and reliability scores. The MINI has been used in a range of conditions, including rheumatologic populations (13). Evaluators at the CRC were trained to conduct the MINI along with standardized clinically informed prompts by a licensed clinical psychologist (LJJ) in 14 training days over the course of 4 weeks, including assessments observed by the trainer. Results for all patient interviews were reviewed for proper diagnostic conclusions. MINI administration takes 15–25 minutes on average. The MINI was administered at the CRC at least 1 hour after administration of the CES-D by an evaluator blinded to the results of the CES-D.
Depression screening measures used for comparison.
Single- or 2-item measures have been shown to be as useful as longer questionnaires in screening for depression (14). Additionally, mood items are often embedded into other measures of disease status and quality of life, and the degree to which the CES-D may perform better than these measures is not known. In order to test the utility of the CES-D in comparison to these single-item and quality of life approaches, we used the following measures: 1) embedded single-item mood question: as part of the SLAQ (7, 8), all participants were asked, “Over the past three months, has depressed mood been a problem for you?” which yielded a dichotomous (yes versus no) response, and 2) subscale embedded in a quality of life questionnaire: the SF-36 mental component score was used, with a cut score of 29 and below as an index of clinically significant mood problems, as suggested in previous studies (15).
Descriptive statistics were used to characterize participant sociodemographics and disease status. To determine the optimal screening cut point of the CES-D for major depressive disorder and any mood disorder, receiver operating characteristic (ROC) curves were estimated and two threshold selection methods were utilized: the Youden Index and a second technique that determines the proximity to perfect correspondence (referred to in this article as a “Distance to Perfect” Index) (16). Briefly, the Youden Index, which has previously been used as a measure of the accuracy of a diagnostic test in clinical epidemiology (17), determines the maximum vertical distance from the ROC curve to the diagonal reference, or “chance,” line; i.e., the “optimal” cut point corresponds to the point on the ROC curve farthest from the reference line. Similarly, the Distance to Perfect Index selects the point on the ROC curve that is closest to the upper left-hand corner of the graph (0,1), representing perfect classification (18), thereby minimizing misclassification. Cut points were initially determined for the entire sample and compared to traditionally suggested CES-D cut points, the somatic item–free CES-D, responses of “yes” to the single-item measure, and a score of 29 and below on the SF-36 mental component score.
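The two threshold selection rules can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: the function name `select_cut_points` and the toy data are hypothetical, a positive screen is taken to mean a score at or above the cut, and ties are resolved in favor of the lowest cut. On the ROC plane each cut is the point (1 − specificity, sensitivity), so the vertical distance to the chance line is sensitivity + specificity − 1, and the distance to the corner (0,1) is the Euclidean norm of (1 − specificity, 1 − sensitivity).

```python
from math import hypot
from typing import Sequence, Tuple


def select_cut_points(scores: Sequence[int],
                      has_disorder: Sequence[bool]) -> Tuple[int, int]:
    """Return (youden_cut, distance_to_perfect_cut) over all observed scores."""
    n_pos = sum(1 for d in has_disorder if d)
    n_neg = len(has_disorder) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")

    best_youden = None  # (index value, cut)
    best_dist = None    # (distance, cut)
    for cut in sorted(set(scores)):
        # Screen positive when the score is at or above the candidate cut.
        tp = sum(1 for s, d in zip(scores, has_disorder) if d and s >= cut)
        fp = sum(1 for s, d in zip(scores, has_disorder) if not d and s >= cut)
        sens = tp / n_pos
        spec = 1 - fp / n_neg
        youden = sens + spec - 1            # vertical distance to the chance line
        dist = hypot(1 - spec, 1 - sens)    # distance to the corner (0, 1)
        if best_youden is None or youden > best_youden[0]:
            best_youden = (youden, cut)
        if best_dist is None or dist < best_dist[0]:
            best_dist = (dist, cut)
    return best_youden[1], best_dist[1]


# Hypothetical toy data: perfectly separated scores, so both rules agree.
toy_scores = [5, 10, 15, 20, 25, 30]
toy_status = [False, False, False, True, True, True]
print(select_cut_points(toy_scores, toy_status))
```

The two rules need not agree on real data: the Youden Index weighs sensitivity and specificity additively, whereas the distance rule penalizes a large shortfall in either one more heavily.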
Sociodemographic and health-related characteristics of the cohort are shown in Table 1. Participants had a mean ± SD age of 48.8 ± 12.3 years, 93% were women, 63% were white, 69% had a college degree, and 12% had poverty-level household incomes (<125% of the federal poverty threshold). The mean ± SD disease duration was 15.8 ± 9.3 years, the mean ± SD SLAQ disease activity score was 12.1 ± 7.1, 42% reported recently having suspected renal involvement, and 16% had a history of stroke or myocardial infarction. According to the MINI diagnostic criteria, any mood disorder was present in 26% of the sample (Table 1), with major depressive disorder as the most prevalent disorder (17%), followed by minor depression (6%) and dysthymia (4%). Two individuals met criteria for major depressive disorder and dysthymia (1%).
| | Total sample (n = 150) |
|---|---|
| CES-D total score, mean ± SD | 18.3 ± 7.1 |
| Major depressive disorder | 26 (17) |
| Minor depression | 9 (6) |
| Any mood disorder | 40 (26) |
| Sociodemographic and health characteristics | |
| Age, mean ± SD years | 48.8 ± 12.3 |
| African American | 21 (14) |
| Hispanic ethnicity | 27 (18) |
| College degree or above | 104 (69) |
| Married or with partner | 75 (50) |
| Currently working | 62 (41) |
| Below poverty | 18 (12) |
| Disease duration, mean ± SD years | 15.8 ± 9.3 |
| SF-36 physical function scale, mean ± SD | 62.23 ± 27.6 |
| SLAQ disease activity score, mean ± SD | 12.1 ± 7.1 |
| Suspected renal involvement | 63 (42) |
| History of cardiovascular/cerebrovascular event (stroke or myocardial infarction) | 24 (16) |
| Current corticosteroid use | 71 (47) |
| Corticosteroid dose, mean ± SD mg | 3.71 ± 5.0 |
ROC curves are shown in Figure 1. Results from the Youden and Distance to Perfect indices were somewhat disparate for a diagnosis of major depressive disorder. The Youden Index derived a cut score of 21 with a sensitivity, specificity, positive predictive value, and negative predictive value of 96%, 85%, 58%, and 99%, respectively. The Distance to Perfect Index derived a higher cut score of 24 with a sensitivity, specificity, positive predictive value, and negative predictive value of 88%, 93%, 72%, and 98%, respectively. Given these results, an optimal cut point of 24 was selected due to the slightly more favorable balance between sensitivity and specificity, and the improved positive predictive value (Table 2). To detect cases of any mood disorder, the indices both suggested that the optimal CES-D cut point was 20 and greater, with a sensitivity, specificity, positive predictive value, and negative predictive value of 87%, 87%, 69%, and 95%, respectively (Table 3). Psychometrics were not substantially different when patients were stratified by race/ethnicity (white versus nonwhite) (results not shown).
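Predictive values and the percentage correctly classified depend on prevalence as well as on sensitivity and specificity. The sketch below shows that relationship using the rounded Table 2 values for the cut score of 24 (sensitivity 0.88, specificity 0.93) and the sample prevalence of major depressive disorder (26 of 150). The function name `predictive_values` is hypothetical, and because the inputs are rounded, the outputs only approximate the reported figures rather than reproduce them exactly.

```python
from typing import Tuple


def predictive_values(sens: float, spec: float,
                      prevalence: float) -> Tuple[float, float, float]:
    """Return (PPV, NPV, proportion correctly classified) via Bayes' rule."""
    true_pos = sens * prevalence
    false_neg = (1 - sens) * prevalence
    true_neg = spec * (1 - prevalence)
    false_pos = (1 - spec) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    accuracy = true_pos + true_neg
    return ppv, npv, accuracy


# Rounded inputs taken from the text and Table 2; outputs are approximate.
ppv, npv, accuracy = predictive_values(sens=0.88, spec=0.93, prevalence=26 / 150)
print(f"PPV={ppv:.3f} NPV={npv:.3f} correctly classified={accuracy:.3f}")
```

Because the positive predictive value falls as prevalence falls, the same cut score would be expected to yield more false-positive screens in a setting where depression is less common than in this sample.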
| | Empirically derived CES-D cut point, 24 and above | Traditional CES-D cut point, 16 and above | Traditional CES-D cut point, 23 and above | CES-D somatic factor removed, 13 and above | SF-36 mental health component score, 40 and below | Single depressed mood question, yes/no† |
|---|---|---|---|---|---|---|
| Sensitivity (95% CI) | 0.88 (0.69–0.97) | 0.96 (0.78–1.00) | 0.88 (0.69–0.97) | 0.88 (0.69–0.97) | 0.81 (0.61–0.92) | 0.73 (0.74–0.99) |
| Specificity (95% CI) | 0.93 (0.86–0.96) | 0.66 (0.55–0.73) | 0.89 (0.82–0.94) | 0.89 (0.82–0.84) | 0.77 (0.68–0.83) | 0.44 (0.35–0.53) |
| PPV (95% CI) | 0.72 (0.53–0.86) | 0.36 (0.25–0.49) | 0.64 (0.46–0.79) | 0.64 (0.46–0.79) | 0.42 (0.29–0.57) | 0.26 (0.18–0.36) |
| NPV (95% CI) | 0.98 (0.92–0.99) | 0.99 (0.92–1.0) | 0.97 (0.92–0.99) | 0.97 (0.92–0.99) | 0.95 (0.88–0.98) | 0.97 (0.87–0.99) |
| Correctly classified, % | 92 | 70 | 89 | 89 | 77 | 52 |

| | Empirically derived CES-D cut point, 20 and above | Traditional CES-D cut point, 16 and above | Traditional CES-D cut point, 23 and above | CES-D somatic factor removed, 12 and above | SF-36 mental health component score, 44 and below | Single depressed mood question, yes/no† |
|---|---|---|---|---|---|---|
| Sensitivity (95% CI) | 0.87 (0.72–0.95) | 0.95 (0.8–0.99) | 0.74 (0.58–0.86) | 0.79 (0.63–0.90) | 0.80 (0.64–0.90) | 0.93 (0.79–0.98) |
| Specificity (95% CI) | 0.87 (0.79–0.92) | 0.72 (0.62–0.80) | 0.94 (0.87–0.97) | 0.89 (0.81–0.94) | 0.73 (0.65–0.81) | 0.47 (0.38–0.57) |
| PPV (95% CI) | 0.69 (0.54–0.81) | 0.54 (0.41–0.66) | 0.81 (0.63–0.91) | 0.72 (0.56–0.84) | 0.51 (0.38–0.63) | 0.37 (0.28–0.48) |
| NPV (95% CI) | 0.95 (0.88–0.98) | 0.98 (0.91–1.00) | 0.91 (0.84–0.96) | 0.93 (0.86–0.97) | 0.92 (0.84–0.96) | 0.95 (0.85–0.99) |
| Correctly classified, % | 87 | 78 | 88 | 87 | 75 | 59 |
Empirically derived cut points for a diagnosis of major depressive disorder are shown in Table 2 alongside traditionally utilized cut scores of the CES-D, the modified somatic-free CES-D score, and our other two comparison measures (one single-item measure and the mental component score of the SF-36). In these analyses, the empirically derived CES-D cut point yielded the highest overall classification rate (92%). Although threshold analyses suggested a cut point of 24 as optimal, according to McNemar's exact test this cut point was not statistically different from a CES-D cut point of 23 (P = 0.12).
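McNemar's exact test compares two classification rules applied to the same participants by looking only at the discordant pairs, i.e., participants whom one rule classifies correctly and the other does not. The article does not report those counts, so the sketch below uses hypothetical values purely to show the calculation; the function name `mcnemar_exact_p` is also an assumption. Under the null hypothesis, the discordant pairs split evenly, so the two-sided P value is twice the lower binomial tail with success probability 0.5, capped at 1.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Lower tail of Binomial(n, 0.5) up to and including k.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts: 4 participants classified correctly by one cut point
# only, and none classified correctly by the other cut point only.
print(mcnemar_exact_p(4, 0))
```

With so few discordant pairs the test has little power, so a nonsignificant result between adjacent cut points is what one would generally expect in a sample of this size.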
Likewise, the empirically derived cut point of 20 for a diagnosis of any mood disorder is shown in Table 3 along with the comparison measures. This cut point of 20 performed very similarly to the commonly used cut point of 23, with slightly increased sensitivity, suggesting there is little differentiation between these two points in detecting a diagnosis of any mood disorder.
Finally, empirically derived cut points for the CES-D in detecting major depressive disorder and any mood disorder were compared to a single-item measure of depression and the mental health component score of the SF-36. Threshold analyses suggested a cut score of 40 and below for the mental health component score for detection of major depressive disorder, and 44 and below for detection of any mood disorder. In both cases, the CES-D cut scores had improved classification rates (Tables 2 and 3).
The purpose of this investigation was to determine the utility of the CES-D as a screening instrument for depression among persons with SLE. In this sample, 17% of persons met criteria for major depressive disorder and 26% met criteria for any mood disorder, including major depressive disorder, minor depression, and dysthymia, using a comprehensive structured clinical interview based on the DSM-IV. To our knowledge, the present study is one of few studies employing structured clinical interview techniques for depression. Notably, prevalence rates from our study were similar to those of a previous study using comparable (DSM-IV based) interview techniques (19). In this comparable study of women with SLE, Nery and colleagues found mood disorders present in 27% and a diagnosis of major depressive disorder in 22% (19). Results of the threshold estimation techniques suggested a cut point score of 24 and above on the CES-D, which correctly classified 92% of participants as current major depressive disorder present versus absent. Overall, the CES-D appears to be a useful screening measure for detecting major depression in patients across a range of disease activity levels.
This proposed cut score of 24 is very similar to the commonly used cut score of ≥23 for “probable depression” (10) and may have some improved psychometric properties, but was not statistically different. Our results suggest that in this disease population, a higher cut score may be advantageous in identifying patients who likely meet criteria for major depression. This cut score of 24 is also similar to a recently suggested cut score of 25 for older adults (20).
We also investigated the utility of the CES-D for detection of any mood disorder, including major depression, minor depression, and dysthymia. To detect any of these 3 mood disorders, the threshold estimation techniques suggested a cut point score of 20 and above, which was observed to yield optimal diagnostic accuracy. Although our classification of any mood disorder is heterogeneous and represents a broad category of mood disorders, sensitivity and specificity remained reasonably high at 87% for both. These results suggest that the CES-D may also be useful for the detection of this broad category of mood disorders, including subsyndromal depression. Using the CES-D to detect patients at risk for a range of mood disorders may be a useful approach, particularly to identify those with subsyndromal depression. Subsyndromal depressive symptoms not meeting criteria for a depressive disorder have been observed to place patients at risk for poor outcomes (21). Therefore, monitoring those patients who endorse these symptoms of depression more closely in the course of clinical care may be warranted.
It is well understood that assessing depression in a medical population presents specific measurement challenges. One of the most prominent of these challenges is the issue of symptom contamination. Clearly, somatic symptoms of depression overlap with disease-related symptoms, particularly decreased energy, concentration difficulties, and sleep and appetite disturbance. Empirical methods have been previously employed to address this problem by removing contaminating items (22), or by selecting measures that are relatively somatic symptom free (e.g., the Geriatric Depression Scale). In our study, we compared our empirically derived cut scores to a comparably derived cut score on the modified somatic symptom–free measure previously suggested for use in rheumatoid arthritis. These results suggest that by removing the somatic symptoms, we may not be improving detection of depressive disorder in this condition, and in some instances the cut score derived from the original CES-D performed better than a modified version. Overall, we believe that the unmodified CES-D may be useful and has the advantage of maintaining the psychometric integrity of the measure. Maintaining the original item structure facilitates broad use of the CES-D in SLE and enables comparisons across populations. Furthermore, while there is certainly the potential of overlapping symptoms causing inflation of depression severity scores and thereby shifting base rates of depression in populations, this potential relies on the assumption that these somatic symptoms of depression are invariably etiologically linked to the medical condition. It is equally possible that these somatic symptoms of depression are etiologically linked to an underlying depressive disorder (23), and removing these symptoms may underestimate the prevalence and severity of depression.
Recently, Thombs et al observed that somatic symptoms were not endorsed at substantively higher rates by scleroderma patients compared to matched community controls, further supporting this idea (24). In this study, we cannot determine the etiology of participants' somatic symptoms. However, it is important in both the clinical and research settings to remain cognizant of these issues, and future longitudinal studies tracking the evolution of these symptoms in SLE are warranted.
Limitations of this study include the potential for reduced generalizability. First, healthier individuals may have been more likely to participate in this study, particularly since it required travel to our CRC. However, the prevalence of significant disease manifestations in this cohort suggests that disease severity may be representative of persons with SLE. Second, we included only English-speaking patients, limiting the generalization of our results to non-English–speaking patients with SLE. Third, with the exception of SLE diagnostic status, health characteristics were collected by self-report. It is possible that our patients were unaware of their health status across all factors. Fourth, the number of participants who met criteria for depression was relatively small, and further psychometric confirmation is warranted in a larger sample. Fifth, although the CES-D is easily administered and scored, it is also somewhat longer (i.e., 20 items) than other depression scales (e.g., the Patient Health Questionnaire 9 [PHQ-9]). There is increasing reliance on these briefer measures, yet psychometric evaluation of these briefer measures needs to be completed in complex conditions such as SLE. For example, although the PHQ-9 has been shown to be associated with the CES-D in rheumatic disease (26), the PHQ-9 was not validated against a structured clinical interview for mood disorders. In addition, we compared the CES-D to a single-item measure that may lack sensitivity compared to other established ultra-short measures previously studied (27). Furthermore, there is the potential for spectrum bias in that we included patients who were potentially being treated for depression in this sample, and therefore there may be some increase in our estimates of diagnostic accuracy.
Finally, although the screening and the gold standard depression measures were both evaluated at the same visit, disease characteristics, quality of life, and the single-item question were evaluated by the preceding LOS telephone interview, and it is possible that symptoms and some aspects of disease activity may have changed between the telephone interview and their CRC visit.
Depression is a common and debilitating comorbidity associated with SLE. Results from this study suggest that the CES-D, which is quickly administered and easily scored, may be a useful tool in the research and clinical setting to identify depression among persons with SLE. In the clinic, it is important to highlight that screening for depression is only one aspect of appropriate care in both primary care and subspecialty clinics. First, one potential disadvantage of depression screening is its opportunity cost, in that even a relatively brief screen for depression could detract from screening and assessment of other clinical issues. Second, merely screening for depression may not be beneficial in the absence of a comprehensive care model with adequate resources for not only assessment, but also treatment and followup (28). Randomized clinical trials are necessary in SLE and other rheumatic diseases to fully understand the benefit–harm tradeoffs of screening for depression in the context of specialty care.
In sum, depression in the context of SLE, like many chronic conditions, is a substantial risk factor for a number of poor health outcomes, including decreased treatment adherence and increased disability (28, 29). Additionally, mental health status is among the strongest predictors of health care costs among patients with SLE (30). A number of effective treatments for depression are available and identifying patients at risk for significant depressive disorders in the context of a comprehensive care model could substantially improve the quality of life for patients with SLE.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Julian had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Julian, Gregorich, Katz.
Acquisition of data. Julian, Criswell, Yelin, Katz.
Analysis and interpretation of data. Julian, Gregorich, Tonner, Yazdany, Trupin, Yelin, Katz.
- 4. The CES-D Scale: a self-report depression measure for research in the general population. Appl Psychol Meas 1977; 1: 385–401.
- 5. Diagnostic and Therapeutic Criteria Committee of the American College of Rheumatology. Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus [letter]. Arthritis Rheum 1997; 40: 1725.
- 10. Measuring health: a guide to rating scales and questionnaires. 3rd ed. New York: Oxford University Press; 2006.
- 15. SF-36 physical and mental health summary scales: a manual for users of version 1. 2nd ed. Lincoln (RI): QualityMetric; 2001.
- 17. Evaluating medical tests: objective and quantitative guidelines. Newbury Park (CA): Sage Publications; 1992.
- 24. Canadian Scleroderma Research Group. Sociodemographic, disease, and symptom correlates of fatigue in systemic sclerosis: evidence from a sample of 659 Canadian Scleroderma Research Group Registry patients. Arthritis Rheum 2009; 61: 966–73.