Validation of self-report depression rating scales in Huntington's disease


  • Jennifer De Souza,

    Corresponding author
    1. Birmingham and Solihull Mental Health Foundation Trust, Neuropsychiatry Service, Birmingham, United Kingdom
    2. Department of Psychiatry, University of Birmingham, Birmingham, United Kingdom
    • Department of Psychiatry, The Barberry, 25, Vincent Drive, Edgbaston, Birmingham B15 2FG, United Kingdom
  • Lisa A. Jones,

    1. Department of Psychiatry, University of Birmingham, Birmingham, United Kingdom
  • Hugh Rickards

    1. Birmingham and Solihull Mental Health Foundation Trust, Neuropsychiatry Service, Birmingham, United Kingdom
    2. Department of Psychiatry, University of Birmingham, Birmingham, United Kingdom
  • Potential conflict of interest: nothing to report.


The aim of this study was to assess the criterion validity of three self-report measures of depression in a sample of patients with Huntington's disease (HD). Fifty patients with HD completed the Beck Depression Inventory-II (BDI-II), the Hospital Anxiety and Depression Scale (HADS), and the Depression Intensity Scale Circles (DISCs). Current psychiatric status was assessed using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN), and ICD-10 diagnosis was used as the gold standard. Receiver operating characteristic (ROC) curves were obtained and the sensitivity, specificity, and positive and negative predictive values were calculated for different cut-off scores on each rating scale. Twelve patients (24%) met ICD-10 criteria for depressive disorder. The depression sub-scale of the HADS (HADS-D) at an optimal cut-off of 6/7 was found to discriminate maximally between depressed and nondepressed patients in this population. The DISCs at a cut-off of 1/2 also performed well at detecting possible “cases” of depression, whereas the BDI-II performed the least satisfactorily of all scales. The HADS-D and DISCs are good screening measures for depression in the HD population and the DISCs may be particularly useful in those patients with more severe communicative and cognitive deficits. © 2009 Movement Disorder Society


George Huntington's original description of Huntington's disease (HD) noted that “The tendency to insanity, and sometimes that form of insanity which leads to suicide, is marked.”1 Since this time, depression has been recognized as a common comorbid condition in HD. Estimates of depression prevalence in HD range greatly from 9% to 63%2 with a meta-analysis of 16 studies concluding that approximately 1 in 3 people with HD will experience depression at some point during their lives.3

Depression in HD deserves increased attention, owing to the fact that elevated levels of depressive symptoms are associated with functional decline4 and reduced quality of life.5 In addition, depression may predate disease onset by up to 20 years6 and is associated with impairments in working memory in presymptomatic individuals.7 Given that depression is one of the more readily treatable symptoms of HD, it is important that depression is diagnosed and consequently treated as early as possible.

However, a diagnosis of depression can be especially difficult to make in the setting of HD. Overdiagnosis of depression can result because core somatic symptoms of depression such as weight loss, sleep disturbance, decreased concentration, psychomotor retardation, and fatigue overlap with the physical symptoms of HD, which may give the appearance of depression in euthymic patients. Underdiagnosis can also result if affective symptoms are simply attributed to the somatic and cognitive aspects of HD, or if they are explained away as a reaction to the patient's diagnosis or degree of disability. Additionally, in the later stages of the disease, the presence of apathy and parkinsonism may confound a diagnosis of depression.

Depression rating scales are routinely used in clinical practice and research but are often selected arbitrarily even though they can differ greatly in terms of their symptom content8 and cognitive complexity.9 The Beck Depression Inventory (BDI)10 has been the most commonly used self-report measure of depression in HD. However, 8 of its 21 items refer to “somatic” symptoms, and it is one of the more cognitively complex rating scales (measured in terms of length of items, readability, linguistic problems, and number of items). Together, these features are likely to reduce its measurement accuracy in the HD population. To ensure that rating scales are measuring depressive symptoms and not just symptoms of HD and that the most appropriate cut-offs are selected, it is important that such measures of depression are validated in the HD population.

Therefore, the aim of this study was to evaluate the criterion validity of three self-report rating scales as screening measures for depression in the HD population when compared to “gold-standard” operationally-defined diagnoses made using a widely-used, standardized semi-structured psychiatric interview.


Participants and Setting

Fifty patients with a clinical and genetic diagnosis of HD were recruited from the out-patient Huntington's disease service based in the department of Neuropsychiatry at the Queen Elizabeth Psychiatric Hospital, Birmingham, UK. All participants opted into the study having received information about the research project during their routine HD clinic appointment. Patients were excluded from the study if they were sufficiently cognitively impaired to prevent them from giving informed consent, were less than 18 years of age, or were not fluent in English. Participants gave written informed consent for the study, which was approved by the Solihull Local Research Ethics Committee. All participants were interviewed at their homes by one of the authors (JK) in a single session lasting approximately an hour and a half.

Psychiatric Assessment

Current psychiatric status was assessed using section 6 (depressed mood and ideation), section 7 (thinking, concentration, energy and interests), and section 8 (bodily functions) of the Schedules for Clinical Assessment in Neuropsychiatry (SCAN),11 which was considered the gold standard for this study. The SCAN is a widely used semi-structured interview aimed at assessing, measuring, and classifying the psychopathology associated with major psychiatric disorders. The standardized operational diagnostic criteria of the International Classification of Diseases, 10th edition (ICD-10)12 were then used to make formal diagnoses of no, mild, moderate, or severe depression. All ratings were made by the researcher (JK) and another trained psychologist to ensure that consensus was reached. At the same assessment, all participants additionally completed the following self-report measures of depression:

Beck Depression Inventory-II (BDI-II)

The BDI-II13 is the revised version of the BDI. The BDI-II was developed in response to the publication of the DSM-IV,14 which changed many of the diagnostic criteria for depressive disorder. The BDI-II is a 21 item (each item scored 0-3) self-report rating scale that provides a quantitative assessment of the intensity of depression (total score 0-63). The BDI-II does, however, contain several items relating to somatic symptoms and has high overall cognitive complexity.9 Completion time, ∼10 min.

Hospital Anxiety and Depression Scale (HADS)

The HADS15 is a 14 item (each item scored 0-3), self-administered rating scale that consists of two sub-scales assessing the presence and severity of depression (0–21) and anxiety (0–21), with a global score of 0–42. It was designed to diminish the influence of somatic symptoms and consequently does not include items relating to the physical symptoms of depression and has a medium overall cognitive complexity.9 Both the total HADS as a global measure of mood as well as just the depression sub-scale of the HADS (HADS-D) were validated in this study. Completion time, ∼5–10 min.

Depression Intensity Scale Circles (DISCs)

The DISCs16 is a simple screening and severity measure of depression. It consists of a 6-point ordinal graphic rating scale (score range 0–5) portraying six circles with an increasing proportion of gray shading that represents an increasing amount of sadness or depression. Once the scale has been explained to the patients, they are asked to indicate which circle best shows how sad or depressed they are feeling today. The DISCs was designed to improve the assessment of mood in Acquired Brain Injury patients who may have cognitive and/or communicative deficits and who consequently might find more complex assessment tools difficult to complete.16 Completion time, ∼2 min.

HD Assessment

In addition to participants completing the BDI-II, HADS, and DISCs, the assessment also included the Unified Huntington's Disease Rating Scale (UHDRS)17 motor section and Total Functional Capacity as measures of disease severity. The UHDRS is a research tool, which has been developed by the Huntington Study Group to provide a uniform measure of clinical performance and course of HD. The UHDRS has undergone extensive reliability and validity testing and has been used in many research studies as a primary outcome measure.17 The Addenbrooke's Cognitive Examination (ACE)18 was also administered to provide a measure of global cognitive impairment that is also particularly sensitive to the cognitive deficits observed in patients with HD.

Statistical Analysis

The sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV) were calculated for the cut-offs in the mid-range of all the scales. Receiver Operating Characteristic (ROC) curves were obtained by plotting sensitivity against 1-specificity for each score on each depression rating scale. The “area under the curve” (AUC) was also calculated for each rating scale, which provides an indication of the discriminative property of a scale. The analyses were conducted using SPSS version 14.0.
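The four accuracy measures named above all follow from a 2×2 table of screening result against gold-standard diagnosis. The following is a minimal Python sketch of that arithmetic, not the SPSS procedure used in the study. The class and function names are hypothetical, and the example counts are inferred from the figures the paper reports for the BDI-II at the 13/14 cut-off (21 screen positives, 12 depressed and 38 nondepressed patients); the paper itself does not list the raw counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """2x2 table of screening result against the gold-standard diagnosis."""
    tp: int  # depressed, screen positive
    fp: int  # not depressed, screen positive
    fn: int  # depressed, screen negative
    tn: int  # not depressed, screen negative


def screening_metrics(c: ScreeningCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV for one cut-off."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # depressed patients detected
        "specificity": c.tn / (c.tn + c.fp),  # nondepressed patients cleared
        "ppv": c.tp / (c.tp + c.fp),          # screen positives truly depressed
        "npv": c.tn / (c.tn + c.fn),          # screen negatives truly nondepressed
    }


# Inferred (not reported) counts for the BDI-II at cut-off 13/14:
# 10 of 12 depressed patients and 11 of 38 nondepressed patients screen positive.
bdi_13_14 = ScreeningCounts(tp=10, fp=11, fn=2, tn=27)

for name, value in screening_metrics(bdi_13_14).items():
    print(f"{name}: {value:.2f}")
```

Repeating this calculation at every possible cut-off and plotting sensitivity against 1 − specificity gives the points of the ROC curve for a scale.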


Fifty patients with HD participated in this study. The mean age was 51.2 years (S.D. = 10.35); 26 of the participants were male (52%); the mean number of years of education was 12.26 years (S.D. = 1.78), and all of the participants were white. The mean age at disease onset was 44.12 years (S.D. = 10.55) and the mean number of years since disease onset was 6.78 years (S.D. = 3.79). The average ACE score was 71.4 (S.D. = 18.4) and the average MMSE score was 24.5 (S.D. = 5.2).

Using the SCAN, 6 patients met ICD-10 criteria for mild depressive disorder, 5 met criteria for moderate depressive disorder, and 1 met criteria for severe depressive disorder to give an overall prevalence of depression of 24%. Although only 12 patients received a formal diagnosis of depression, a further 18 patients (36%) reported feelings of dysphoria. Table 1 shows the average and range of scores obtained on each depression rating scale for the depressed and non-depressed patients. Using the recommended cut-offs for the depression rating scales (BDI-II, 13/14; HADS, 14/15; HADS-D 7/8; DISCs, 1/2), each scale resulted in more cases of depression than formal diagnoses obtained from the SCAN. The BDI-II produced 9 extra cases of depression, the HADS and HADS-D 4 extra cases and the DISCs gave rise to 6 more cases of depression.

Table 1. Depression rating scales: properties and basic statistics obtained for the depressed and non-depressed patients (according to ICD-10)
Depression rating scale | Range of scores | Number of items | Depressed patients (N = 12): mean (S.D., range) | Nondepressed patients (N = 38): mean (S.D., range)
BDI-II | 0–63 | 21 | 26.08 (13.97, 11–58) | 8.84 (8.89, 0–29)
HADS | 0–42 | 14 | 21.25 (6.90, 14–36) | 7.55 (7.82, 0–32)
HADS-D | 0–21 | 7 | 11.17 (2.72, 7–17) | 3.50 (3.94, 0–13)
DISCs | 0–5 | 1 | 2.83 (0.83, 1–4) | 0.79 (0.81, 0–3)

Sensitivity, specificity, positive, and negative predictive values for different cut-off scores on the BDI-II, HADS, HADS-D and DISCs were calculated. Table 2 shows these results for the recommended and optimal cut-offs on each scale and Figure 1 displays the results in the form of a ROC curve.

Figure 1.

Receiver Operating Characteristic (ROC) curves for the depression rating scales showing the optimal cut-offs for each scale.

Table 2. Performance of the depression rating scales for the standard and optimal cut-offs
Depression measure | Cut-off | Depression cases | AUC | Sensitivity | Specificity | PPV | NPV
SCAN | | 12 (24%) | Gold standard | | | |
BDI-II | 13/14 (a) | 21 (42%) | 0.856 | 0.83 | 0.71 | 0.48 | 0.93
BDI-II | 10/11 (b) | 25 (50%) | | 1.00 | 0.66 | 0.48 | 1.00
HADS | 14/15 (a) | 16 (32%) | 0.900 | 0.75 | 0.82 | 0.56 | 0.91
HADS | 13/14 (b) | 20 (40%) | | 1.00 | 0.79 | 0.60 | 1.00
HADS-D | 7/8 (a) | 16 (32%) | 0.923 | 0.92 | 0.87 | 0.69 | 0.97
HADS-D | 6/7 (b) | 19 (38%) | | 1.00 | 0.82 | 0.63 | 1.00
DISCs | 1/2 (a, b) | 18 (36%) | 0.943 | 0.92 | 0.82 | 0.61 | 0.97

  • AUC, area under curve; PPV, positive predictive value; NPV, negative predictive value.
  • (a) Standard cut-off.
  • (b) Optimal cut-off.

The optimal cut-off score is the point at which the scale best discriminates “caseness” in the population. This is determined by the cut-off with the maximal sum of sensitivity and specificity. For the BDI-II, this cut-off was at 10/11 (sensitivity 1.00, specificity 0.66) where a score of 11 or more is indicative of depression presence and a score of 10 or less indicates the absence of depression.

The optimal cut-off for the HADS was 13/14 (sensitivity 1.00, specificity 0.79) and for the HADS-D, the cut-off that best discriminated between depressed and nondepressed HD patients was 6/7 (sensitivity 1.00, specificity 0.82). However, for the HADS-D, a very similar sum of sensitivity and specificity was observed for the range of cut-off scores 6/7 to 8/9. The DISCs was the only scale where the optimal cut-off was the same as the advocated cut-off of 1/2 (sensitivity 0.92, specificity 0.82).
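The selection rule described above (take the cut-off with the largest sum of sensitivity and specificity) can be sketched in a few lines of Python. The sketch compares only the cut-offs whose values are reported in Table 2, whereas the study evaluated the full range of scores on each scale. The names `reported` and `optimal_cutoff` are hypothetical.

```python
# (sensitivity, specificity) pairs as reported in Table 2 for each cut-off.
reported = {
    "BDI-II": {"13/14": (0.83, 0.71), "10/11": (1.00, 0.66)},
    "HADS": {"14/15": (0.75, 0.82), "13/14": (1.00, 0.79)},
    "HADS-D": {"7/8": (0.92, 0.87), "6/7": (1.00, 0.82)},
    "DISCs": {"1/2": (0.92, 0.82)},
}


def optimal_cutoff(candidates: dict[str, tuple[float, float]]) -> str:
    """Return the cut-off with the maximal sum of sensitivity and specificity."""
    return max(candidates, key=lambda cutoff: sum(candidates[cutoff]))


for scale, candidates in reported.items():
    best = optimal_cutoff(candidates)
    print(f"{scale}: optimal cut-off {best}, sum {sum(candidates[best]):.2f}")
```

For the HADS-D the two sums are close (1.79 at 7/8 against 1.82 at 6/7), which matches the observation that neighbouring cut-offs on this subscale perform similarly.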

It is generally accepted that an AUC of greater than 0.8 is a good indicator that a scale is a valid screening instrument for a particular population, and an AUC of 0.9 or more indicates an excellent screening tool. Whereas the DISCs, HADS-D, and HADS were found to be excellent screening measures for discriminating between depressed and nondepressed HD patients with AUCs of 0.943, 0.923, and 0.900 respectively, the BDI-II was found to be only a good screening tool for depression in this population with an AUC of 0.856.
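One way to read these AUC values is as the probability that a randomly chosen depressed patient scores higher on the scale than a randomly chosen nondepressed patient, with ties counted as one half. The Python sketch below computes the AUC in that pairwise form, which equals the trapezoidal area under the empirical ROC curve, and applies the 0.8 and 0.9 thresholds quoted above. It is an illustration and not the SPSS routine used in the study. The function names are hypothetical, and the toy scores are invented for the example and are not study data.

```python
def auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # depressed patient ranked above nondepressed
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


def band(auc_value: float) -> str:
    """Label an AUC using the thresholds quoted in the text."""
    if auc_value >= 0.9:
        return "excellent"
    if auc_value > 0.8:
        return "good"
    return "not above the 0.8 threshold"


# Hypothetical scores on a 0-5 scale, invented for illustration only.
depressed = [2, 3, 3, 4]
non_depressed = [0, 0, 1, 1, 2, 3]
toy_auc = auc(depressed, non_depressed)
print(f"toy AUC = {toy_auc:.3f} ({band(toy_auc)})")

# AUCs reported in Table 2.
for scale, value in {"DISCs": 0.943, "HADS-D": 0.923, "HADS": 0.900, "BDI-II": 0.856}.items():
    print(f"{scale}: {value:.3f} ({band(value)})")
```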


Clinicians and investigators view self-report rating scales of depression in HD patients with caution as motor symptoms alone may spuriously raise scores. This is the first study to date that has validated self-report rating scales for depression in the HD population, despite their common use in research and clinical practice.

Of the scales validated in this study, the BDI-II was the one with the highest overall cognitive complexity and contained the greatest number of items relating to somatic symptoms. The BDI-II was the least suitable scale for discriminating between depressed and nondepressed patients with HD, confirming the contribution of somatic HD features to inaccuracy. At the optimal cut-off on the BDI-II of 10/11 (sensitivity 1.00, specificity 0.66), the scale has perfect sensitivity but this is at a cost to the specificity, which means that many nondepressed patients may incorrectly be identified as possibly being depressed.

The optimal cut-offs on the total HADS (13/14) and HADS-D (6/7) combine excellent sensitivity with good specificity (HADS: sensitivity 1.00, specificity 0.79; HADS-D: sensitivity 1.00, specificity 0.82). With an AUC of 0.923 for the HADS-D compared to an AUC of 0.900 for the HADS, the results support the original intent for the subscales to be used separately to identify caseness.15 Despite the obvious advantage of the HADS-D for use in the HD population, one item does still relate to a somatic symptom that could be experienced by a nondepressed HD patient (question 8: “I feel as if I am slowed down”). Additionally, by concentrating on the depressive state of anhedonia, the HADS-D omits two of the nonsomatic items of ICD-10 (suicidal ideation and excessive and inappropriate guilt), which reduces the content validity of the scale.

The DISCs was included as a more practical tool for patients with severe cognitive and/or communicative deficits. The results confirm that a score ≥ 2 accurately predicted “cases” of depression according to ICD-10 criteria. Perhaps surprisingly given its simplicity, the DISCs had the highest overall AUC of 0.943.

With the use of any rating scale, there will always be a trade-off between sensitivity and specificity. The optimal cut-off point of a scale should depend on the purpose for which it is to be utilized. Given the evidence that depressive symptoms in HD can lead to functional decline4 and reduce overall quality of life,5 it seems reasonable to use a lower cut-off at which most “cases” can be identified, even at the cost of an appreciable number of false positives.

Given that this is the only study to date that has validated depression rating scales in the HD population, it is not possible to directly compare results with any previous studies. However, the results of this study compare favorably to research on the use of self-report rating scales as screening measures for depression in other neurological disorders. Leentjens et al.19 concluded that the psychometric properties of the BDI are not ideal for patients with Parkinson's Disease (PD) and the HADS has been recommended for use in patients with PD with mild to moderate rather than severe depression.20 The criterion validity of the DISCs was even greater in the HD population than in patients with acquired brain injury (ABI), the target population for which it was initially designed (HD: sensitivity 0.92, specificity 0.82; ABI: sensitivity 0.60, specificity 0.87).16

Limitations of the study arise from the limited sample size, and it is therefore important for the findings to be replicated with a larger sample size looking additionally at the ability of the depression rating scales to detect change in severity of depressive symptoms. Additionally, those patients with severe cognitive deficits were excluded from this study, owing to their assumed inability to consistently respond meaningfully and reliably. This together with the limitations of recruitment from an out-patient HD service may limit the generalizability of the results to the entire HD population. The majority of depressed patients had either mild or moderate depression (91.7%) and consequently the scales were not so rigorously tested in patients with severe depression. This may have been a contributing factor to the superior performance of the HADS-D over the BDI-II.

Some may also criticize the use of ICD-10 diagnoses obtained from the SCAN interview as a gold standard, given that somatic and cognitive symptoms comprise six of the criteria for a formal diagnosis of depressive disorder (loss of initiative, decreased energy, diminished ability to think or concentrate, change in psychomotor activity, sleep disturbance, and change in appetite). Additionally, ICD-10 stipulates that a depressive episode must not be attributable to any organic mental disorder, thereby bringing into question whether such major classification systems can be applied to patients with HD at all. However, with a semi-structured interview, it is possible for the interviewer to explore any symptoms in depth to decide whether such symptoms are related to depression or not. Furthermore, until there is a more suitable alternative, a (semi) structured interview using operational diagnostic criteria provides the only means to standardize diagnoses at all.


As screening instruments for depression in patients with HD, the authors would recommend using the HADS-D with a cut-off of 6/7 and the DISCs with a cut-off of 1/2. So long as one can accept a very low specificity, the BDI-II can also be used as a screening instrument at a cut-off of 10/11. The DISCs may be a useful instrument for HD patients with more complex cognitive and communicative difficulties. For all possible “cases” of depression that arise from screening, a subsequent diagnostic interview should be conducted to confirm or reject a diagnosis of depressive disorder.


We thank the patients for taking part in this study. This research was sponsored by Birmingham and Solihull Mental Health Foundation Trust.

Financial Disclosures: Jennifer De Souza: none; Hugh Rickards: 1 Advisory board for Meda Pharmaceuticals for which an honorarium was paid; Lisa A. Jones: none.

Author Roles:

Jennifer De Souza: 1. Research Project: Conception, Organization, and Execution; 2. Statistical Analysis: Design and Execution; 3. Manuscript: Writing of the first draft; Lisa A. Jones: 1. Research Project: Conception and Design; 2. Statistical Analysis: Review and Critique; 3. Manuscript: Review and Critique; Hugh Rickards: 1. Research Project: Conception and Design; 2. Statistical Analysis: Design, Review and Critique; 3. Manuscript: Review and Critique.