The association of depression and multimorbidity in the elderly: implications for the assessment of depression
Ms Lena Spangenberg, MSc, Department of Medical Psychology and Medical Sociology, University of Leipzig, Philipp-Rosenthal-Str. 55, 04103 Leipzig, Germany. Email: email@example.com
Background: Depression and multimorbidity are common in the elderly. Assessing depression might be difficult because of the overlap of depressive and somatic symptoms, possibly leading to confounded results.
Methods: This study investigates the frequency of depression, multimorbidity and their association, the potential impact of multimorbidity on the assessment of depression by the Patient Health Questionnaire, and whether using a cut point might cause misleading results in the elderly German population (60–85 years, n= 1659).
Results: Depressive syndromes are significantly more frequent in multimorbid respondents. Multimorbidity is associated with higher item scores, especially in the somatic items, and multimorbid respondents show higher depression severity levels in comparison to non-multimorbid persons.
Conclusion: Multimorbidity is associated with depressive symptoms, which potentially confounds prevalence rates. The causal pathways of these associations should therefore be examined in future longitudinal studies.
Despite extensive research, there is no consensus about the prevalence of depressive disorders in the elderly.1–4 Contradictory findings may partly result from methodological aspects, which can cause either an overdiagnosis or underdiagnosis of depressive disorders.3,5 The overlap of depressive and somatic symptoms, especially in higher age groups, leads to diagnostic difficulties (e.g. inflated depression scores due to somatic comorbidity) because multimorbidity (i.e. the co-occurrence of various medical conditions in one person) is a common finding in the elderly.6–9
The depression module of the Patient Health Questionnaire (PHQ-9) is a brief, well-established and well-validated instrument allowing both a categorical diagnosis of major and minor depressive syndromes and a dimensional measurement of depression severity.10–14 Its nine items correspond to the diagnostic criteria of major depression according to the DSM–IV (see Table 3).15 It was originally developed for use in primary care, but numerous studies have since demonstrated very good psychometric properties in other clinical settings and in the general population.16,17 Several studies have found that the PHQ-9 performs well in elderly nursing home residents,18 primary care patients and community-dwelling seniors receiving in-home assessments.19–21 A PHQ-9 score ≥10 is supposed to indicate clinically relevant depression, regardless of diagnostic status.12 It has recently been argued that using a cut point can contribute to false-positive cases.22 The somatic symptoms of the DSM-IV criteria (i.e. fatigue, trouble sleeping, appetite or weight change) are frequent in the general population and are associated not only with depression, but also with several medical conditions.22–26 Thus, elevated PHQ-9 scores might also result from conditions other than depression.
Table 3. Association of multimorbidity and depressive symptoms in ‘non-depressed’ and ‘depressed’ respondents
|PHQ-9 item||F (‘non-depressed’)||η2||F (‘depressed’)||η2|
|(1) Little interest or pleasure in doing things (anhedonia)||98.223***||0.061||5.333*||0.047|
|(2) Feeling down, depressed, or hopeless (depressed mood)||73.686***||0.047||1.863||0.017|
|(3) Trouble falling or staying asleep, or sleeping too muchc||157.497***||0.095||1.464||0.014|
|(4) Feeling tired or having little energyc||151.942***||0.092||6.549*||0.058|
|(5) Poor appetite or overeatingc||52.244***||0.034||1.686||0.016|
|(6) Feeling bad about yourself – or that you are a failure or have let yourself or your family down||5.272**||0.004||3.369||0.031|
|(7) Trouble concentrating on things, such as reading the newspaper or watching television||79.636***||0.050||0.949||0.009|
|(8) Moving or speaking so slowly that other people could have noticed/ being so fidgety or restless that you have been moving around a lot more than usual||11.995***||0.008||0.283||0.003|
|(9) Thoughts, that you would be better off dead or hurting yourself in some way||3.030*||0.002||1.132||0.010|
A study of elderly primary care patients showed through receiver operating characteristics analyses that the PHQ-9 performs better for those with fewer medical conditions (i.e. shows higher area under the curve values).19 This indicates that assessing depression might be difficult in the medically ill even with this psychometrically well-validated instrument because of the overlap of depressive and somatic symptoms. This is especially important, as multimorbidity occurs very frequently in the elderly, and multimorbidity and depression are associated with each other.27,28 Thus, diagnostic tools are required that are not confounded by somatic morbidity and that enable an accurate diagnosis of depressive syndromes. Instruments specifically developed for use in somatically ill patients, such as the Hospital Anxiety and Depression Scale (HADS) or the Geriatric Depression Scale (GDS) for the elderly,29–31 already exist, but they have some disadvantages compared to the PHQ-9: the HADS allows a dimensional screening of depression but does not offer a categorical algorithm,29,32,33 and applying the GDS is less economical because of its length.30,31 Although the PHQ-9 has been proven to be an economical and well-performing instrument in medical settings and clinical elderly samples, some recent results cast doubt on the PHQ-9's valid applicability in persons suffering from multimorbidity.19 Studies on the potential impact of multimorbidity on the PHQ-9 are lacking to date, and it has never been studied in a general population sample.
- Firstly, our study aims to describe the frequency of depression and multimorbidity, as well as their association, in a population-based representative elderly sample (60–85 years).
- The second objective is to examine the potential impact of multimorbidity on the PHQ-9 scores. To this end, we analyze the association of multimorbidity and depressive symptoms on a single-item basis assessed by the PHQ-9.
- In the third step, we examine whether PHQ-9 scores ≥10 indicate clinically relevant depression or rather reflect other conditions (e.g. multimorbidity).
A representative sample of the German general population was selected with the assistance of a demographic consulting company (USUMA, Berlin, Germany). Germany was separated into 258 sample areas representing the different regions of the country. Households of the respective area and household members fulfilling the inclusion criteria (age at or above 14, able to read and understand German) were selected randomly. All participants were informed about the study and gave their informed consent. The survey met the ethical guidelines of the International Code of Marketing and Social Research Practice by the International Chamber of Commerce and the European Society of Opinion and Marketing Research. The sample was aimed to be representative in terms of age, gender, and education. A first attempt was made for 8368 addresses, of which 8116 were valid. If not at home, a maximum of three attempts were made to contact the selected person. All subjects were visited by a study assistant and informed about the investigation, and self-rating questionnaires were presented. The assistant waited until participants answered all questionnaires and offered help if the meaning of questions was not clear. A total of 5033 people agreed to participate and completed the self-rating questionnaires in May and June 2008 (participation rate: 62.1%). Our study includes only participants aged 60 to 85 years (n= 1659; see Table 1 for demographic characteristics).
Table 1. Sociodemographic characteristics of the sample
|Characteristic||Men, n (%)||Women, n (%)||Total, n (%)|
|Age groups|| || || |
| 60–64 years||181 (23.5%)||218 (24.5%)||399 (24.1%)|
| 65–69 years||236 (30.6%)||206 (23.2%)||442 (26.6%)|
| 70–74 years||182 (23.6%)||201 (22.6%)||383 (23.1%)|
| 75–79 years||109 (14.1%)||144 (16.2%)||253 (15.3%)|
| 80–85 years||63 (8.2%)||119 (13.4%)||182 (11.0%)|
|Marital status|| || || |
| Married||580 (75.2%)||362 (40.8%)||942 (56.8%)|
| Single||23 (3.0%)||32 (3.6%)||55 (3.3%)|
| Divorced||52 (6.7%)||53 (6.0%)||105 (6.3%)|
| Widowed||116 (15.0%)||441 (49.7%)||557 (33.6%)|
|Persons in the household|| || || |
| Living alone||173 (22.4%)||491 (55.3%)||664 (40.0%)|
| Two persons||567 (73.5%)||381 (42.9%)||948 (57.1%)|
| More than two persons||31 (4.0%)||16 (1.8%)||47 (2.9%)|
|Educational status|| || || |
| Less than 10th grade||511 (66.3%)||664 (74.8%)||1175 (70.9%)|
| 10th grade||164 (21.3%)||153 (17.3%)||317 (19.1%)|
| High school||28 (3.6%)||21 (2.4%)||49 (3.0%)|
| College or university degree||68 (8.9%)||50 (5.6%)||118 (7.1%)|
Men and women differed significantly in their sociodemographic characteristics. Women are overrepresented in the higher age groups (χ2= 20.33, P≤ 0.001), are more likely to live alone (χ2= 185.57, P≤ 0.001) or be widowed (χ2= 234.48, P≤ 0.001), and are more likely to have less education (χ2= 19.55, P≤ 0.001).
Assessment of depression
Depression was assessed using the German version of the PHQ-9.34 Response categories for the nine items ranged from 0 (‘not at all’) to 3 (‘nearly every day’). For the present study, a continuous depression score was calculated; the sum of the answers to the nine items ranged from 0 to 27. Depression severity is represented by a total score of 0–4 = no significant depressive symptoms, 5–9 = mild depressive symptoms, 10–14 = moderate depression, 15–19 = moderately severe depression and 20–27 = severe depression. A score ≥10 is supposed to indicate clinically significant depression, regardless of diagnostic status.12 Moreover, a categorical assessment of major depression or minor depression is possible. Major depression requires either the item 1 or item 2 and a minimum of five of the nine symptoms to be present ‘more than half of the days’ (2). Minor depression requires two to four symptoms, including item 1 or 2, to be present ‘more than half of the days’ (2).35
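As an illustration only, the scoring rules described above can be sketched in Python. The function names are hypothetical and not part of the PHQ-9 materials. The sketch follows the description in this section literally (a symptom counts as present when rated 2 or 3) and does not model any item-specific rules that the official scoring instructions may contain.

```python
from typing import Sequence

# Severity bands for the PHQ-9 total score, as listed in the text.
SEVERITY_BANDS = [
    (0, 4, "no significant depressive symptoms"),
    (5, 9, "mild depressive symptoms"),
    (10, 14, "moderate depression"),
    (15, 19, "moderately severe depression"),
    (20, 27, "severe depression"),
]


def phq9_total(items: Sequence[int]) -> int:
    """Sum the nine item responses (each coded 0-3) into a 0-27 score."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine item scores, each coded 0-3")
    return sum(items)


def phq9_severity(total: int) -> str:
    """Map a total score to the severity band named in the text."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total score must lie between 0 and 27")


def phq9_syndrome(items: Sequence[int]) -> str:
    """Categorical algorithm as described in the text (simplified sketch)."""
    phq9_total(items)  # reuse the input validation
    present = [score >= 2 for score in items]  # 'more than half of the days'
    core_present = present[0] or present[1]    # item 1 or item 2
    n_present = sum(present)
    if core_present and n_present >= 5:
        return "major"
    if core_present and 2 <= n_present <= 4:
        return "minor"
    return "none"


if __name__ == "__main__":
    answers = [1, 1, 2, 2, 2, 1, 1, 0, 0]  # hypothetical respondent
    total = phq9_total(answers)
    print(total, phq9_severity(total), phq9_syndrome(answers))
```

The hypothetical respondent in the usage lines reaches the cut point of 10 without meeting the categorical criteria, because neither core symptom is rated 2 or higher.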
Assessment of somatic morbidity
The number and severity of chronic diseases were assessed using an instrument developed by Bayliss et al.36–38 It is a self-report questionnaire based on a list of 24 common chronic diseases. Respondents rate each condition on a 5-point scale from 1 (interferes with daily activities ‘not at all’) to 5 (interferes with daily activities ‘a lot’). First, the questionnaire makes it possible to calculate the number of chronic medical conditions. Second, the total score represents the disease burden, which is the sum of diseases weighted by the level of interference for each condition. Validation of the instrument showed that it is strongly associated with a subjective state of health. An earlier validation against medical records revealed that median sensitivity relative to a ‘gold standard’ of chart review was 75% (range 35–100%), and median specificity was 92% (range 61–100%).37 In our analyses we used the number of medical conditions, since this indicator is less confounded by somatization and depression than the disease burden indicator.
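The two indicators described above can be sketched as follows. This is a minimal illustration under stated assumptions: the condition names are invented, only conditions the respondent reports are stored, and the disease burden is taken as the plain sum of the 1–5 interference ratings of those conditions. The instrument's own scoring instructions may differ in detail.

```python
from typing import Mapping


def condition_count(ratings: Mapping[str, int]) -> int:
    """Number of chronic conditions the respondent reports."""
    return len(ratings)


def disease_burden(ratings: Mapping[str, int]) -> int:
    """Sum of reported conditions weighted by their interference rating (1-5)."""
    for name, rating in ratings.items():
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"rating for {name!r} must be between 1 and 5")
    return sum(ratings.values())


if __name__ == "__main__":
    # Hypothetical respondent; condition names are illustrative only.
    reported = {"hypertension": 1, "osteoarthritis": 4, "diabetes": 2}
    print(condition_count(reported), disease_burden(reported))
```

Keeping the two indicators separate mirrors the choice made in the analyses, where only the number of conditions is used.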
Assessment of health-related quality of life (HRQOL)
HRQOL was measured using the German version of the 12-Item Short Form Health Survey, a generic measure of HRQOL. This brief instrument is a shorter, but reliable and valid version of the 36-Item Short Form Health Survey. Physical and mental health composite scores range from 0 to 100. A score of 0 indicates the lowest level of health, and a score of 100 indicates the highest level of health.39
Depression, multimorbidity and their association
The presence of a major or minor depressive syndrome was classified using the described PHQ-9 diagnostic algorithm. Multimorbidity was defined according to the Berlin Aging Study.40 Respondents were classified as multimorbid if they reported five or more of the 24 listed medical conditions. Descriptive analysis for depression (major or minor depressive syndrome, distribution of depression severity) and multimorbidity are presented. Multimorbid (MM+) and non-multimorbid (MM-) respondents are compared by analyses of variance (age, PHQ-9 total score) and χ2-tests (gender, presence of depressive syndromes).
Impact of multimorbidity on PHQ-9 item scores
For further analysis, we classified the respondents as either ‘depressed’ (major or minor depressive syndrome according to diagnostic algorithm), ‘potentially depressed’ (no depressive syndrome according to diagnostic algorithm, PHQ-9 score ≥10) or ‘non-depressed’ (no depressive syndrome according to diagnostic algorithm, PHQ-9 score <10). To determine whether the presence of multimorbidity has an impact on the PHQ-9 item scores, analyses of variance were conducted separately in ‘non-depressed’ and ‘depressed’ respondents with multimorbidity (MM+ vs. MM-) as the between-subjects factor. Effect sizes are presented and interpreted according to Cohen (η2= 0.01 small effect, η2= 0.06 moderate effect, η2= 0.14 large effect).41
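The grouping rule and the effect-size benchmarks described above can be written down compactly. The sketch below uses hypothetical function names. How values between two benchmarks are labeled is an assumption made here (a value takes the label of the highest benchmark it reaches), since the text lists only the benchmark values themselves.

```python
def classify_respondent(has_depressive_syndrome: bool, phq9_total: int) -> str:
    """Assign one of the three analysis groups defined in the text."""
    if has_depressive_syndrome:
        return "depressed"
    if phq9_total >= 10:
        return "potentially depressed"
    return "non-depressed"


def cohen_label(eta_squared: float) -> str:
    """Label an eta-squared value using Cohen's benchmarks (assumed binning)."""
    if not 0.0 <= eta_squared <= 1.0:
        raise ValueError("eta squared must lie between 0 and 1")
    if eta_squared >= 0.14:
        return "large"
    if eta_squared >= 0.06:
        return "moderate"
    if eta_squared >= 0.01:
        return "small"
    return "below small"


if __name__ == "__main__":
    print(classify_respondent(False, 12))  # no syndrome, score at or above 10
    print(cohen_label(0.06))
```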
PHQ-9 cut point and clinically relevant depression
Whether a cut point of ≥10 indicates clinically significant depression or reflects multimorbidity was investigated through comparisons of ‘non-depressed’, ‘potentially depressed’ and ‘depressed’ respondents by further analyses of variance (age, number of chronic conditions, physical health composite score, mental health composite score). Post-hoc differences were analyzed by Scheffé tests. Additionally, the frequency distribution of the items is described for ‘potentially depressed’ respondents.
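For readers unfamiliar with the statistics involved, a one-way analysis of variance and its η2 effect size can be sketched in plain Python. This is a generic textbook computation on toy data, not the statistical software used for the study. The function name and the example values are hypothetical, and post-hoc Scheffé tests are not covered.

```python
from typing import Dict, Sequence, Tuple


def one_way_anova(groups: Dict[str, Sequence[float]]) -> Tuple[float, int, int, float]:
    """Return (F, df_between, df_within, eta_squared) for a one-way design."""
    if len(groups) < 2 or any(len(values) == 0 for values in groups.values()):
        raise ValueError("need at least two non-empty groups")
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_within = ss_total - ss_between

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("not enough within-group variation to compute F")

    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / ss_total
    return f_value, df_between, df_within, eta_squared


if __name__ == "__main__":
    # Toy scores for three hypothetical groups; not study data.
    toy = {"group_a": [1, 2, 3], "group_b": [2, 3, 4], "group_c": [6, 7, 8]}
    print(one_way_anova(toy))
```

Here η2 is the share of total variance attributable to group membership, which is the quantity interpreted against Cohen's benchmarks.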
Depressiveness, multimorbidity and their association in the study sample
Depressive syndrome was found in 6.6% (n= 109) of the respondents (2.8% major depressive syndrome, n= 46; 3.8% minor depressive syndrome, n= 63). The majority of respondents (63.9%, n= 1061) were classified as non-multimorbid (MM-) (11.4% no medical condition, 24.8% one or two medical conditions, 27.7% three or four medical conditions), whereas 36.1% (n= 598) reported five or more medical conditions (MM+) (21.0% between five and seven medical conditions, 9.7% between 8 and 10 medical conditions, 5.4% more than 10 medical conditions). Compared to persons with fewer than five chronic conditions, multimorbid respondents are significantly older (MM+ 71.45 years > MM- 69.21 years) (F(1, 1658) = 44.973, P < 0.001, η2= 0.026), report higher PHQ-9 scores (MM+ 4.87 [SD 3.9] > MM- 2.42 [SD 3.2]) (F(1, 1654) = 194.013, P < 0.001, η2= 0.105) and are more often women (40.7% vs. 30.7%) (χ2(1, 1659) = 17.594, P < 0.001).
The frequency of depressive syndromes and the distribution of PHQ-9 total scores according to depression severity level are presented in Table 2. Depressive syndromes are significantly more frequent in multimorbid respondents.
Table 2. Distribution of depressive syndromes and depression severity in the sample according to multimorbidity
| ||Total, n||%||MM-, n||%||MM+, n||%||χ2|
|Depressive syndromea|| || || || || || || |
| Major depression||46||2.8||17||1.6||29||4.8||15.031**|
| Minor depression||63||3.8||31||2.9||32||5.4||6.194*|
| Any depression||109||6.6||48||4.5||61||10.2||20.205**|
|PHQ-9 total score|| || || || || || ||149.464**|
| 0–4||1180||71.1||864||81.4||316||52.8|| |
| 5–9||356||21.5||149||14.0||207||34.6|| |
| 10–14||94||5.7||38||3.6||56||9.4|| |
| 15–19||24||1.4||9||0.8||15||2.5|| |
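The χ2 comparison of depressive syndromes between groups can be illustrated with a Pearson test for a 2×2 table. The sketch below is a generic computation without continuity correction. The function name is hypothetical, and the group sizes used in the usage lines (1061 and 598) are assumptions back-calculated from the counts and percentages in Table 2. Because of this, and because of possible missing values, the result is not expected to reproduce the published statistic exactly.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


if __name__ == "__main__":
    # 'Any depression' by multimorbidity; group sizes are assumed values.
    depressed_mm_minus, depressed_mm_plus = 48, 61
    size_mm_minus, size_mm_plus = 1061, 598
    statistic = chi_square_2x2(
        depressed_mm_minus,
        depressed_mm_plus,
        size_mm_minus - depressed_mm_minus,
        size_mm_plus - depressed_mm_plus,
    )
    print(round(statistic, 2))
```

Under these assumptions the statistic comes out close to 20.1, in the neighborhood of the tabulated value of 20.205 for any depression.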
Impact of multimorbidity on PHQ-9 item scores
As shown in Table 3, analyses of variance reveal a significant effect of multimorbidity for all PHQ-9 items in ‘non-depressed’ individuals. Effect sizes show that these differences appear particularly large in item 3 and item 4 (trouble sleeping, fatigue). Moderate effect sizes are found in items 1, 2, 5 and 7, whereas the other effects are rather small (see Table 3). In ‘depressed’ respondents, there is a moderate effect of multimorbidity on items 1 and 4. Multimorbid respondents consistently report higher scores.
PHQ-9 cut point and clinically relevant depression
Overall, 7.1% (n= 118) of the total sample had a PHQ-9 score ≥10. Of these, 65.3% (n= 77) had a major or minor depressive syndrome (‘depressed’, 4.6% of the total sample), but 34.7% (n= 41) did not fulfil the diagnostic criteria of major or minor depressive syndrome (‘potentially depressed’, 2.5% of the entire sample). Analyses of variance revealed significant differences in age (F(2, 1653) = 12.296, P < 0.001, η2= 0.015), number of chronic conditions (F(2, 1651) = 47.742, P < 0.001, η2= 0.055), physical health (F(2, 1634) = 71.136, P < 0.001, η2= 0.080) and mental health (F(2, 1634) = 207.766, P < 0.001, η2= 0.203) between ‘depressed’, ‘potentially depressed’ and ‘non-depressed’ respondents (see Table 4). Post-hoc Scheffé tests showed that ‘potentially depressed’ and ‘depressed’ respondents differed significantly from ‘non-depressed’ respondents. Both groups reported a higher number of chronic conditions and lower levels of physical and mental health (see Table 4). ‘Depressed’ participants were significantly older than ‘non-depressed’ participants and reported significantly lower levels of HRQOL mental health than ‘non-depressed’ and ‘potentially depressed’ individuals.
Table 4. Differences in age, number of chronic conditions, physical and mental health in ‘non-depressed’, ‘depressed’ and ‘potentially depressed’ respondents
| ||ND||MD||PD||F||η2||Post hoc (Scheffé)|
|Age||69.77||72.93||71.12||12.296**||0.015||ND < MD**|
|Chronic conditions||3.89||6.39||7.12||47.742**||0.055||ND < MD, PD**|
|PCS||46.92||37.92||39.19||71.136**||0.080||ND > MD, PD**|
|MCS||55.44||39.58||44.42||207.766**||0.203||ND > MD, PD**, MD < PD*|
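The effect sizes in Table 4 can be cross-checked from the F statistics, because for a one-way design η2 equals F·df_between / (F·df_between + df_within). The short sketch below applies this identity. The function name is hypothetical, and a between-group df of 2 is assumed for all four comparisons because three groups are compared.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Recover eta squared from a one-way ANOVA F statistic and its degrees of freedom."""
    if df_between <= 0 or df_within <= 0 or f_value < 0:
        raise ValueError("F must be non-negative and both df must be positive")
    explained = f_value * df_between
    return explained / (explained + df_within)


if __name__ == "__main__":
    # (label, F, df_between, df_within), with F values as reported in Table 4.
    reported = [
        ("Age", 12.296, 2, 1653),
        ("Chronic conditions", 47.742, 2, 1651),
        ("PCS", 71.136, 2, 1634),
        ("MCS", 207.766, 2, 1634),
    ]
    for label, f_value, df_b, df_w in reported:
        print(label, round(eta_squared_from_f(f_value, df_b, df_w), 3))
```

Rounded to three decimals, the recomputed values agree with the η2 column of Table 4.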
Of the ‘potentially depressed’ respondents, 95.1% scored a 1 (‘several days’) on either one or both core symptoms (depressed mood and anhedonia). The symptoms most frequently reported (‘more than half of the days’ or ‘nearly every day’) were fatigue (by 75.7%), trouble sleeping (by 73.1%) and trouble concentrating (by 58.6%), whereas feeling bad about yourself and suicidal thoughts were the most infrequently endorsed items (17.1% and 14.6%).
Assessing depression with the PHQ-9 in the elderly can be difficult because of the overlap of depressive and somatic symptoms and possibly confounded prevalence rates. As multimorbidity is frequent in the elderly, the present study aimed to examine the frequency of depression, multimorbidity and their association as well as the potential impact of multimorbidity on the assessment of depressive symptoms by the PHQ-9 in the elderly population (60–85 years). It was further investigated whether using a cut point might contribute to misleading results.
Our study found the prevalence of depressive syndromes in the elderly German population to be lower than the average rate found in other community-based studies,1–3 but the prevalence is within the usual range and similar to comparable studies.3,4,16,42 Multimorbidity was less prevalent in comparison to other studies. In our study, only 36.1% were classified as multimorbid, whereas other studies reported multimorbidity rates of 78% (age 80+),7 83.2% (age 75+) and up to 99% (age 65+).43 However, these studies were conducted in a general practice setting and applied a less strict definition of multimorbidity (i.e. the co-occurrence of two or more conditions). It is in line with other findings that multimorbidity is associated with age and with depressive symptoms.7,8,42,43
Analyses of variance demonstrated that there is a clear association between multimorbidity and item scores on the PHQ-9 in ‘non-depressed’ respondents. In general, multimorbid respondents report higher item scores (i.e. report more depressive symptoms). Interestingly, effect sizes revealed that this association is greatest for two ‘somatic’ items (fatigue and trouble sleeping), but for other items, moderate effects were also found. This finding can be interpreted in several ways. On the one hand, it is possible that multimorbid persons suffer more often from depressive symptoms, even on a non-clinical level. On the other hand, our results might indicate that somatic morbidity confounds item scores (i.e. inflates them artificially), as fatigue and trouble sleeping, in particular, are more frequently endorsed. This explanation is supported by the finding that even in ‘depressed’ respondents multimorbidity is linked to a higher score in fatigue, though there is a significant effect on anhedonia too. Summing up, it seems likely that some of the PHQ-9 items are confounded by multimorbidity to a certain extent.
The literature discusses whether depression in the elderly is hidden behind somatic symptoms (‘masked depression’) either because of somatization of the depression or because it intensifies the symptoms of an accompanying physical condition.4,44 Therefore, it has been argued that it is necessary to take somatic symptoms into account to improve the detection of depression in the elderly.23 As such, the PHQ-9 has a major advantage compared to other instruments such as the Geriatric Depression Scale and the Hospital Anxiety and Depression Scale because, in addition to its brevity, it includes somatic symptoms.29 On the other hand, it has been argued that the somatic symptoms might cause elevated total scores, especially in chronically ill individuals, and contribute to false positives.22,23 In our sample, 41 respondents reported an elevated PHQ-9 score (≥10), indicating clinically significant depression,12 but did not fulfil the criteria for a major or minor depressive syndrome according to the diagnostic algorithm. Are these individuals' results possibly false positives because of confounding through somatic illness?22 Supporting this position is the present result that those ‘potentially depressed’ respondents reported significantly more chronic conditions than ‘non-depressed’ individuals and, in particular, somatic items (fatigue, trouble sleeping) were frequently endorsed. However, they also reported significantly lower levels of physical and mental health than ‘non-depressed’ individuals and did not differ from ‘depressed’ respondents regarding the number of chronic conditions, age and physical health. The only difference from ‘depressed’ participants was found in mental health: ‘depressed’ respondents reported significantly lower levels of mental health than ‘potentially depressed’ respondents. These findings indicate that, in the group of the ‘potentially depressed’ persons, clinically significant depressive symptoms are present and linked to impaired HRQOL.
The sub-threshold character of their symptoms is underlined by the finding that compared to ‘depressed’ respondents they reported a higher level of HRQOL mental health, but it was lower compared to ‘non-depressed’ respondents.45,46 Furthermore, results suggest that the presence of multimorbidity might act as a confounding factor for PHQ-9 item scores, even at this sub-threshold level of depression.
Despite the major strengths of our approach (i.e. population-based, large and representative sample, age 60–85 years), our study has some limitations too. As we did not include nursing home residents in our study, our findings are only applicable to elderly people living in private homes. Therefore, it is possible that we underestimated somatic morbidity. Furthermore, self-rating instruments were used. Without a clinical interview, the gold-standard criterion, the prevalence of some conditions might have been overestimated or underestimated. However, the instruments employed are well-established and have demonstrated good psychometric properties,18–21,37,47 and the agreement between self-reporting and medical charts was shown to be good.48 Unfortunately, our cross-sectional study design does not allow us to interpret findings in a causal way. Whether depressive symptoms such as sleep problems are a result of a chronic condition, a result of the disease burden or independent of the condition should be studied prospectively in longitudinal designs.6
Notwithstanding the clear-cut association between multimorbidity and PHQ-9 scores in ‘non-depressed’ and partly in ‘depressed’ elderly respondents, the interpretation of our results remains difficult. It appears that multimorbidity is associated with higher item scores, especially in the somatic items, and multimorbid persons show higher depression severity when the PHQ-9 is applied dimensionally. It must be taken into account that PHQ-9 scores may be confounded by multimorbidity. However, it has been demonstrated that the PHQ-9 performs well in the elderly.18–21,49 Future studies should investigate the performance of the PHQ-9 in multimorbid persons by receiver operating characteristics analyses using a gold-standard criterion.