Symptom prevalence differences of depression as measured by BDI and PHQ scales in the Look AHEAD study

Summary Objective To compare depressive symptomatology as assessed by two frequently used measures, the Beck Depression Inventory (BDI‐1A) and Patient Health Questionnaire (PHQ‐9). Methods Investigators conducted a cross‐sectional secondary analysis of data collected as part of the follow‐up observational phase of the Look AHEAD study. Rates of agreement between the BDI‐1A and PHQ‐9 were calculated, and multivariable logistic regression was used to examine the relationship between differing depression category classifications and demographic factors (ie, age, sex, race/ethnicity) or comorbidities (ie, diabetes control, cardiovascular disease). Results A high level of agreement (κ = 0.47, 95% CI (0.43 to 0.50)) was found in the level of depressive symptomatology between the BDI‐1A and PHQ‐9. Differing classifications (minimal, mild, moderate, and severe) occurred in 16.8% of the sample. Higher scores on the somatic subscale of the BDI‐1A were significantly associated with disagreement as were having a history of cardiovascular disease, lower health‐related quality of life, and minority racial/ethnic classification. Conclusions Either the BDI‐1A or PHQ‐9 can be used to assess depressive symptomatology in adults with overweight/obesity and type 2 diabetes. However, further assessment should be considered in those with related somatic symptoms, decreased quality of life, and in racial/ethnic minority populations.


| INTRODUCTION
Depression is 60% more common in individuals with type 2 diabetes than in the general population and is associated with numerous poor health behaviors including physical inactivity and higher dietary fat intake. 1,2 Individuals suffering from depression are at increased risk of adverse cardiovascular outcomes in addition to workplace absenteeism, unemployment, and disability. [3][4][5] Depression increases the activation of the hypothalamic-pituitary-adrenal axis, sympathetic nervous system, and proinflammatory cytokines, resulting in elevated serum glucose levels. 6 However, only half of patients with diabetes and major depression are recognized as depressed by their primary care provider. 7 This may relate to these conditions' overlapping symptoms including fatigue, appetite changes, and decreased libido. 8,9 Brief and easy-to-administer screening questionnaires for individuals with diabetes are critical to improve surveillance of depression. 10 Historically, the 21-item Beck Depression Inventory (BDI) has been the most widely used instrument in healthcare for assessing symptoms of depression. 11 The BDI (BDI, BDI-1A, BDI-II) presents individuals with a series of statements consistent with depressive symptoms and asks them to choose the one closest to how they felt, using both cognitive and somatic variables (eg, mood, muscle aches). 11 However, several barriers to the BDI's use have been noted, including high item difficulty (skill/ability needed to complete an item). 12 In addition, shorter versions (eg, 16-item), though less time consuming, have a lower sensitivity in identifying true cases of depression. 13 A shorter instrument, the Patient Health Questionnaire (PHQ-9), was developed to reduce the time required for completion and is now widely used in clinical settings. 14,15 Both the BDI and PHQ-9 are self-report instruments that use 4-point Likert-type scales to assess symptoms of depression and are well-validated measures. 11,[14][15][16]
A unique characteristic of the PHQ is its inclusion of the criteria for depression stipulated by the Diagnostic and Statistical Manual (ie, DSM-IV, . 17 The PHQ-9 asks individuals to read each of the nine criteria of a major depressive episode from the DSM and indicate how often they have felt that way in the last two weeks. 14,15 Some investigators argue that this direct mapping to the DSM, along with the PHQ-9's shorter length, makes it a potentially more desirable measure for screening for depression. 14,17,18 Although the BDI and PHQ-9 have been used for measuring symptoms of depression in individuals with type 2 diabetes and have been found to be effective, 7,19-21 there are no studies that compare these instruments in a population with type 2 diabetes. Use of the PHQ-9 has increased, and efficacy data are needed to ensure it is an appropriate tool for this patient population. The Look AHEAD (Action for Health in Diabetes) trial, a randomized controlled trial comparing intensive lifestyle intervention (ILI) to diabetes support and education (DSE) in adults with overweight/obesity and type 2 diabetes, offered a unique opportunity to compare the BDI and the PHQ-9 in this population. 22,23 While the active intervention phase of Look AHEAD has concluded, an observational cohort continues to be followed. The objectives of this study were to compare depressive symptomatology as assessed by the BDI-1A and PHQ-9 for Look AHEAD participants in the observational follow-up cohort. Specifically, investigators compared the BDI-1A and PHQ-9 in the following areas: (1) depression prevalence using both instruments' cut points for no (BDI-1A)/minimal (PHQ-9), mild, moderate, and severe depression, (2) differences in degree of reported depressive symptomatology, and (3) differences in degree of reported depressive symptomatology based on demographic factors (ie, age, sex, race/ethnicity) or comorbidities (diabetes control, cardiovascular disease [CVD]).
In addition, investigators examined if the BDI-1A's cognitive and somatic scales explain any observed differences. Investigators hypothesized that the BDI-1A and PHQ-9 can be used interchangeably for depression screening for adults with overweight/obesity and type 2 diabetes.

| METHODS
Look AHEAD was a randomized controlled trial that examined the impact of participation in a long-term ILI designed to produce weight loss on the health outcomes of 5145 adults with overweight/obesity and type 2 diabetes. Participants were randomized to one of two conditions: ILI or DSE (control group). 23 Its principal goal was to determine the impact of the ILI on cardiovascular morbidity and mortality (ie, nonfatal and fatal myocardial infarction and stroke, hospitalization for angina). The design and methods of the trial have been described in detail elsewhere. 23,24 Despite greater weight loss in the ILI than the DSE group at every annual assessment, the Look AHEAD intervention was stopped at a median follow-up of 9.6 years due to a lack of significant differences in cardiovascular-related morbidity or mortality between the two conditions. 25 All living participants at the end of the trial were invited to join a follow-up observational study to determine the longer-term effects of the intervention on several outcomes. Of the 3985 individuals who attended a follow-up clinic examination, 3703 had both the BDI-1A and PHQ-9 performed and their results are the subject of these analyses (Table 1). During the original trial, depressive symptomatology was measured annually with the BDI-1A, whereas in the follow-up observational study, depressive symptomatology was measured once with both the BDI-1A and PHQ-9. The current analysis considers only responses to the BDI-1A and PHQ-9 questionnaires. Data were collected from 2013 to 2014.

| Patient Health Questionnaire
The version of the PHQ used in Look AHEAD was the PHQ-9, a 9-item self-report measure of depressive symptoms in adults that corresponds directly with DSM criteria for a major depressive episode. 14,15 Participants indicated, using 0 to 3 scales, how frequently they experienced each of nine symptoms of depression in the past two weeks, resulting in composite scores ranging from 0 to 27. The PHQ-9 classifies individuals as having no symptoms of depression (ie, score of 0 to 4), mild symptoms (ie, score of 5 to 9), moderate symptoms (ie, score of 10 to 14), moderately severe symptoms (ie, score of 15 to 19), or severe symptoms (ie, score of 20 to 27). 14,15 In the current analyses, the moderate and moderately severe categories were combined to more closely map onto the four categories of the BDI-1A.
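The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the study's code: the function name `phq9_category` and the category labels are assumptions, while the cut points and the merged moderate/moderately severe band come from the text.

```python
def phq9_category(item_scores):
    """Map nine PHQ-9 item responses (each scored 0 to 3) to a four-level category.

    Hypothetical helper for illustration; cut points follow the text above.
    """
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("PHQ-9 needs nine item scores, each 0, 1, 2, or 3")
    total = sum(item_scores)  # composite score, 0 to 27
    if total <= 4:
        return "none"
    if total <= 9:
        return "mild"
    if total <= 19:
        # moderate (10 to 14) and moderately severe (15 to 19) combined,
        # as done in the current analyses
        return "moderate"
    return "severe"


# Example: a respondent answering 1 on every item has a composite score of 9
print(phq9_category([1] * 9))
```

Collapsing the two middle bands gives four PHQ-9 levels that can be cross-tabulated directly against the four BDI-1A levels.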

| Beck Depression Inventory
The version of the BDI used in Look AHEAD was the BDI-1A, a well-validated, 21-item, self-report measure of depressive symptoms in adults. 11 Participants were asked to respond to items based on how they had been feeling in the past week. Each item has four answer choices of increasing severity scored on a scale of 0 to 3, resulting in total scores ranging from 0 to 63. Higher scores indicate greater severity of symptoms. Participants were identified as having minimal symptoms of depression (ie, score of 0 to 9), mild symptoms (ie, score of 10 to 17), moderate symptoms (ie, score of 18 to 29), or severe symptoms (ie, score of 30 to 63). 11 These same classification categories were used in these analyses. The BDI-1A contains two subscales that primarily measure somatic (7 items) and cognitive symptoms (14 items) of depression. The BDI-1A and PHQ-9 use the same names for all depression categories (ie, mild, moderate, severe) except for the
TABLE 1 Baseline characteristics of Look AHEAD participants in the follow-up observational study with both BDI-1A and PHQ-9 data (n = 3703)

| Other measures
Participants provided basic demographic information including sex, race/ethnicity, and history of CVD or depression through question-

| Statistical analyses
Investigators assessed the relationship between certain variables of interest and differing depression category classifications using multivariable logistic regression. The differences of BDI-1A and PHQ-9 scales prevented a nonbinary logistic regression analysis agreement

| Categorization of depressive symptomatology by the BDI-1A and PHQ-9
A descriptive inspection of differences in reporting of symptomatology indicated that 85.9% of participants were categorized as having no or minimal depressive symptoms by the BDI-1A compared with 82.1% by the PHQ-9. The BDI-1A classified 11.8% of participants as having mild/moderate symptoms compared to 14.1% by the PHQ-9. A total of 2.0% of participants were classified as having moderate/moderately severe symptomatology by the BDI-1A compared to 3.6% by the PHQ-9. The percentage of individuals classified in the severe range was 0.3% and 0.2% by the BDI-1A and PHQ-9, respectively.

| Disagreement in the categorization of depressive symptomatology
Of the 3985 participants, 3703 had data from both the BDI-1A and PHQ-9 (Table 1). A total of 622 (16.8%) participants had conflicting levels of depression classification between the PHQ-9 and BDI-1A. Table 2 shows where these differences occur. For 419 (11.3%) participants, the PHQ-9 classified at a higher depression level

| Factors associated with disagreement in categorization
An initial multivariable logistic regression analysis revealed that higher scores on the BDI-1A were associated with disagreement in depres-  Table 3.

| DISCUSSION
Investigators conducted a novel comparison of the BDI-1A and PHQ-9 as screening measures for depression in adults with overweight/obesity and type 2 diabetes. Comparisons revealed a high level of agreement (83%) between measures (κ=0.47) in the classification of depressive symptoms. These results are similar to those found in studies that have compared the BDI-1A and PHQ-9 in other disease states, with kappa values ranging from 0.24 to 0.64. [29][30][31][32] Although the BDI-1A has historically been considered the optimal measure of depressive symptoms, the PHQ-9 is briefer and available at no cost, which may be advantageous to many healthcare providers and particularly those working with resource-limited populations. This moderate level of agreement suggests that either the BDI-1A or PHQ-9 may be used when assessing depressive symptoms in most adults with overweight/obesity and type 2 diabetes. If the questionnaire results are dichotomized into those indicating no depression vs higher levels (possibly prompting further clinical evaluation), the agreement is better (κ=0.56), although still not ideal.
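The gap between the raw agreement rate and κ reported above follows from how kappa discounts agreement expected by chance. The Python sketch below illustrates this under stated assumptions: it computes unweighted Cohen's kappa (this section does not state whether a weighted or unweighted statistic was used), the function name `cohen_kappa` is hypothetical, and the counts are invented for illustration and are not the study's Table 2.

```python
def cohen_kappa(table):
    """Return (raw agreement, unweighted Cohen's kappa) for a square table.

    table[i][j] is the number of participants placed in category i by one
    instrument and in category j by the other.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of participants on the diagonal
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical dichotomized counts (no depression vs any higher level);
# these are made-up numbers, NOT data from Look AHEAD.
hypothetical = [
    [80, 8],
    [9, 3],
]
raw_agreement, kappa = cohen_kappa(hypothetical)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
```

When most participants fall in the lowest category on both instruments, chance agreement is already large, so a high raw agreement rate can coexist with a much lower kappa.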
Although the rate of agreement between these two measures suggests a degree of interchangeability, it should be noted that the BDI-1A and PHQ-9 yielded conflicting levels of depression for a this may be due to the lower cutoff scores used by the PHQ-9. 29 As a result, the PHQ-9 is more likely to pick up minimal levels of depression, which may lead to earlier identification of depressive symptoms.
This may make the PHQ-9 a more desirable measure in individuals who are at higher risk of complications associated with depression, such as medication nonadherence. 33 However, less discrepancy between instruments is found in the moderate to severe ranges.
In these analyses, the most common type of disagreement between measures was identification as mildly depressed on one mea- When considering severity of depression, higher scores on the BDI-1A predicted differing classification, although these discrepancies seemed to be driven primarily by higher scores on the somatic as opposed to the cognitive subscale. These findings suggest that physical, as opposed to psychological, symptoms might have a greater influence on depression screening. Other studies also recognized this difficulty of reporting as well as difficulty in diagnosing depression in the presence of somatic symptoms. 35,36 This is likely related to several factors. The psychopathology of the somatic component, both painful and nonpainful, of depression is understudied and not completely understood. 37 For example, lack of pleasure and sleep abnormalities are related to abnormal serotonin and norepinephrine regulation of the hypothalamus and sleep centers, whereas fatigue and loss of energy appear to be affected by malfunctioning neuronal circuits regulated by multiple neurotransmitters. 38,39 However, the overlapping symptoms of depression and diabetes, particularly somatic symptoms including fatigue, must be considered when these comorbid conditions occur, but they are, unfortunately, underrecognized in clinical settings. 1,2,7,8 Awareness of cognitive and affective symptoms unique to depression (ie, negative thoughts, anxiety) can help clinicians delineate the correct diagnosis of depression. 9 Furthermore, broad terminology is used in the BDI-1A to define somatic symptoms such as somatoform, psychosomatic, vegetative, medically unexplained, masked, and somaticized, 36 making it difficult to differentiate between somatic symptoms related to another psychiatric condition, somatoform disorders, or other medical conditions. 37
These factors provide a rationale for the potential delay of recognition of depression when somatic symptoms are present and may account for discrepancies in reporting. Nevertheless, one meta-analysis found that two-thirds of depressed patients reported somatic symptoms, 40 emphasizing the importance of further investigation for earlier recognition of depression in their presence.
The study findings contrast with a study of low-income women (18 to 60 years) with one or more chronic health conditions (eg, seasonal allergies, asthma, diabetes, CVD) that failed to find any significant predictors of differences in scores between the BDI-1A and PHQ-9. 41 The conflicting findings are likely related to methodological and study population differences. These analyses compared differences in classification levels that are commonly used in clinical prac- This study has multiple strengths that include a large racially/ethnically diverse population and the use of well-validated questionnaires. The focus on a group that has been diagnosed with type 2 diabetes provides specific information for this group; however, these findings may not generalize to individuals without diabetes. A limitation of the current study is that no "gold standard" (eg, a structured clinical interview) was used to determine the presence and severity of depression, with which each of these measures could then be compared. However, the overall goal of this study was to examine differences in reported symptomatology of depression between these two commonly used questionnaires.
Additional limitations of the study include low initial rates of depression (ie, 85.9% were categorized as having no or minimal depressive symptoms by the BDI-1A and 82.1% by the PHQ-9), low rates of severe depression, and the difference in time periods of reporting between the two instruments (previous one vs two weeks). The findings from this study suggest that future research should compare the differences between the BDI-1A and PHQ-9 against a structured clinical interview.

| CONCLUSION
The results suggest that either the PHQ-9 or BDI-1A can be used in depression screening for adults with overweight/obesity and type 2 diabetes. The findings suggest the need for further assessment of individuals with type 2 diabetes who have comorbid conditions such as CVD, report greater depression severity (particularly somatic symptoms of depression), have a decreased quality of life, or who are of a certain race or ethnicity (eg, Hispanic or American Indian/Alaskan Native).

This study is supported by the Department of Health and Human Services through the following cooperative agreements from the

DATA SHARING PLAN
The study protocol, analysis plan, forms, and detailed data description of de-identified participants from the Look AHEAD trial may be located at the National Institute of Diabetes, Digestive, and Kidney