The Outcome of Physical Symptoms with Treatment of Depression


  • Received from the Department of Medicine (TG, GE, KK), Indiana University School of Medicine, and the Regenstrief Institute, Inc., Indianapolis, Ind.

    This study was presented as an oral abstract at the 2001 Society of General Internal Medicine meeting in San Diego, Calif.

Address correspondence and requests for reprints to Dr. Kroenke: Regenstrief Institute, 1050 Wishard Boulevard, RG6, Indianapolis, IN 46202 (e-mail:


OBJECTIVE:  This study examined the prevalence, impact on health-related quality of life (HRQoL), and outcome of physical symptoms in depressed patients during 9 months of antidepressant therapy.

DESIGN:  Open-label, randomized, intention-to-treat trial with enrollment occurring April through November 1999.

SETTING:  Thirty-seven primary care clinics within a research network.

PATIENTS:  Five hundred seventy-three depressed patients started on one of three selective serotonin reuptake inhibitors (SSRIs) by their primary care physician and who completed a baseline interview.

INTERVENTIONS:  Patients were randomized to receive fluoxetine, paroxetine, or sertraline.

MEASUREMENTS AND MAIN RESULTS:  Outcomes assessed included physical symptoms, depression, and multiple domains of HRQoL. Prevalence of physical symptoms was determined at baseline and after 1, 3, 6, and 9 months of treatment. Stepwise linear regression models were used to determine the independent effects of physical symptoms and depression on HRQoL domains.

 Of the 14 physical symptoms assessed, 13 were present in at least a third to half of the patients at baseline. Each symptom showed the greatest improvement during the initial month of treatment. In contrast, depression continued to show gradual improvement over a 9-month period. Physical symptoms had a predominant effect on pain (explaining 17% to 18% of the variance), physical functioning (13%), and overall health perceptions (13% to 15%). Depression had the greatest impact on mental (26% to 45%), social (14% to 32%), and work functioning (9% to 32%).

CONCLUSIONS:  Physical symptoms are prevalent in depressed patients and initially improve in the first month of SSRI treatment. Unlike depression, however, improvement in physical symptoms typically plateaus with minimal resolution in subsequent months.

Physical symptoms are extremely prevalent in a primary care setting. In fact, they account for greater than 50% of outpatient clinic visits or an estimated 400 million visits annually in the United States alone.1 At least one third of these symptoms are medically unexplained.2 Recent research has established a strong relationship between somatization and depression.1–13 Both the existence of unexplained symptoms and the total number of physical symptoms increase the likelihood of a concurrent depressive or anxiety disorder.1 Additionally, greater symptom severity, recent stress, and lower patient ratings of overall health are independent predictors of an affective disorder.4,5

Physical rather than emotional symptoms are the presenting complaints that the majority of depressed patients voice to their primary care physician. An international study in 15 countries revealed that more than two thirds of depressed patients in primary care present exclusively with physical complaints.14 In fact, half of these patients report multiple somatic symptoms. Prior studies have focused on the recognition and diagnosis of depression in the presence of somatic symptoms, but there has been limited research on the outcome of physical symptoms in patients treated for clinical depression.

The ARTIST (A Randomized Trial Investigating SSRI Treatment) study was a “real world” clinical trial in which primary care patients with depression were randomized to one of three selective serotonin reuptake inhibitor (SSRI) antidepressants and followed over 9 months of therapy.15 Depressive and physical symptoms were serially assessed, as were multiple domains of health-related quality of life (HRQoL). Both the initial prevalence of physical symptoms and the change in the prevalence of bothersome symptoms over 9 months of antidepressant treatment were examined. In addition, the relative effects of physical symptoms and depression on various HRQoL domains at baseline were evaluated.


Study Subjects

In the ARTIST study, patients who were deemed clinically depressed by their primary care physician and considered candidates for antidepressant treatment were randomized to paroxetine, fluoxetine, or sertraline.15 Patients were enrolled from 37 clinical practices involving 87 physicians in 2 primary care networks. Subjects were eligible if they were over 18 years of age, received their primary care from a participating physician, had access to a home telephone, and had a depressive disorder for which their primary care physician (PCP) felt antidepressant therapy was warranted. Exclusion criteria included cognitive impairment severe enough to preclude an adequate interview; terminal illness; residence in an extended care facility; active suicidal ideations; current treatment (i.e., past 2 months) with an SSRI antidepressant; use of a non-SSRI antidepressant at any dose for depression or at low doses (>50 mg of amitriptyline or its equivalent) for a nondepressive disorder; history of a bipolar disorder; active cocaine or opiate use; and pregnancy or breastfeeding.

Outcome Assessment

Computer-administered telephone interviews were used to conduct both baseline and follow-up interviews at 1, 3, 6, and 9 months after enrollment. Depression outcome was assessed with two measures of core depressive symptoms, the HSCL-20 and the 9-item Patient Health Questionnaire (PHQ-9) depression scale. The HSCL-20 is a 20-item modified subscale of the 90-item Hopkins Symptom Checklist. It includes the full 13-item depression subscale of these longer instruments plus 7 additional items that allow for an assessment of all Diagnostic and Statistical Manual, fourth edition (DSM-IV) items. The HSCL-20 has been successfully used in primary care depression trials where it has demonstrated the sensitivity to detect differences in depression severity change between treatment groups.16–18 The PHQ-9 is a self-administered questionnaire that evaluates the 9 DSM-IV depressive symptoms and is a validated measure of depression severity.19,20

The physical symptom measure included 14 of the 15 items from the Patient Health Questionnaire (PHQ-15) somatic symptom module.21 The sexual dysfunction item was excluded because the ARTIST outcome assessment included a more detailed sexual function questionnaire. For each physical symptom on the PHQ, subjects are asked to what degree they have been bothered during the past month, with responses scored as 0 for “not bothered at all,” 1 for “bothered a little,” and 2 for “bothered a lot.” Thus, scores on the 14-item PHQ physical symptom scale used in the ARTIST study could range from 0 to 28.
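As an illustration only, the scoring rule described above can be sketched in a few lines of Python. The function name and the example patient are hypothetical; only the three response options, their point values, and the 14-item count come from the text.

```python
# Response options and point values as described in the text.
RESPONSE_POINTS = {
    "not bothered at all": 0,
    "bothered a little": 1,
    "bothered a lot": 2,
}

N_ITEMS = 14  # the 15 PHQ-15 items minus the sexual dysfunction item


def symptom_score(responses):
    """Sum the item scores for one patient's 14 responses (range 0 to 28)."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(RESPONSE_POINTS[r] for r in responses)


# Hypothetical patient: bothered a lot by 3 symptoms, a little by 4, not at all by 7.
example = (["bothered a lot"] * 3
           + ["bothered a little"] * 4
           + ["not bothered at all"] * 7)
print(symptom_score(example))  # 3*2 + 4*1 + 7*0 = 10
```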

A number of health-related quality of life domains were evaluated. The 36-item Short-form Health Survey (SF-36) measures health-related quality of life in 8 domains, including physical functioning, social functioning, mental health, general health perception, pain, vitality, and physical and emotional role functioning.22,23 Three scales from the Work Limitations Questionnaire (WLQ), including output demand, time management, and interpersonal relations, were used to evaluate function in the workplace.24 Selected measures from the Medical Outcomes Study (MOS) were administered to assess social functioning, concentration, positive well-being, hopefulness, sleep, and sexual function.25 Subjects completed screening anxiety and alcohol disorder items from the PRIME-MD.26 Finally, validated questionnaires were used to evaluate quality of close relationships and disposition.27

As a measure of medical comorbidity, the Chronic Disease Score (CDS) was calculated for each patient. The CDS score is based on prescribed medications and increases with the number of different chronic diseases as inferred from the subject's medication profile. Individual medications are mapped to medication classes, which are then mapped to different chronic diseases. The original CDS was calculated by summing the weights of each unique CDS class for each patient.28 A revised version of the CDS, used in this study, employs empirical weights to calculate the CDS score.29 Both scores have been shown to predict mortality and health care resource utilization after adjusting for demographics and previous resource utilization.

Statistical Analysis

The prevalence of symptoms was determined at baseline and all follow-up intervals. For individual symptoms, data were analyzed both as any symptom (“bothered a little” or “bothered a lot”) and severe symptom (“bothered a lot”). To determine whether the prevalence of individual symptoms at follow-up time points differed from baseline prevalence, a generalized estimating equation was applied to a cumulative logistic regression with multiple comparisons, using subjects to define the cluster. To determine the new development of a symptom, an inception case was any patient who was not “bothered a lot” by a particular symptom at baseline, but developed a symptom of this severity during follow-up.
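The two prevalence definitions and the inception-case rule can be made concrete with a short sketch. This is not the study's analysis code: the function names and sample scores are hypothetical, the handling of missed interviews (ignored here) is an assumption, and the generalized estimating equation used for significance testing is not reproduced.

```python
def prevalence(scores, severe_only=False):
    """Percent of interviewed patients reporting the symptom.

    scores: item scores (0, 1, or 2), one per interviewed patient.
    severe_only=False counts "any symptom" (score of 1 or 2);
    severe_only=True counts "severe symptom" (score of 2).
    """
    threshold = 2 if severe_only else 1
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


def is_inception_case(baseline, followups):
    """True if not "bothered a lot" at baseline but so at any follow-up.

    followups: scores at the follow-up interviews; None marks a missed
    interview and is skipped (an assumption, not stated in the text).
    """
    if baseline == 2:
        return False
    return any(s == 2 for s in followups if s is not None)


# Hypothetical item scores for five patients at one interview.
scores = [2, 1, 0, 2, 1]
print(prevalence(scores))                     # 80.0
print(prevalence(scores, severe_only=True))   # 40.0
print(is_inception_case(1, [1, 2, None, 1]))  # True
print(is_inception_case(2, [2, 2, 1, 0]))     # False
```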

A hierarchical linear regression model was used to determine the independent effects of physical symptoms and depression on HRQoL at baseline. Age, gender, and race were entered in block 1. In block 2, physical symptom score and depression severity were entered, controlling for anxiety. Because two separate depression measures were used, two models were constructed, one using the HSCL-20 as the depression severity measure and the other using the PHQ-9.
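The block-wise logic can be illustrated with a minimal sketch that fits two nested ordinary least squares models and reports the variance added by the second block. Everything below is an assumption made for illustration: the data are synthetic, the variable names and coefficients are invented, anxiety is omitted, and the sketch reports the joint contribution of block 2 rather than the separate attributions given in Table 2, whose partitioning method the text does not describe.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of a least squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data only; none of these values come from the ARTIST study.
rng = random.Random(0)
n = 300
age = [rng.uniform(18, 80) for _ in range(n)]
female = [float(rng.random() < 0.79) for _ in range(n)]
white = [float(rng.random() < 0.84) for _ in range(n)]
symptoms = [rng.uniform(0, 28) for _ in range(n)]    # physical symptom score
depression = [rng.uniform(0, 4) for _ in range(n)]   # depression severity
outcome = [50 - 0.1 * a - 1.0 * s - 2.0 * d + rng.gauss(0, 8)
           for a, s, d in zip(age, symptoms, depression)]

# Block 1: demographics. Block 2: adds physical symptoms and depression.
r2_block1 = r_squared([age, female, white], outcome)
r2_block2 = r_squared([age, female, white, symptoms, depression], outcome)
print(f"R^2, block 1: {r2_block1:.3f}")
print(f"R^2 added by block 2: {r2_block2 - r2_block1:.3f}")
```

The difference between the two R² values is the share of outcome variance explained by the block 2 predictors after the demographic variables have been accounted for.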

We also assessed whether physical symptom improvement was associated with the degree of depression improvement, classified as remission, response, or nonresponse. Remitters were defined as having an HSCL-20 score ≤0.5 after 3 months of antidepressant treatment, while partial responders had a ≥50% improvement in HSCL-20 score but not to a level <0.5.30 Patients who did not meet either criterion were classified as nonresponders. A mixed-model analysis of covariance, with baseline score, demographics, randomized drug, site, and month as covariates and with random effects for subject, clinic, and doctor within clinic, was used to compare the three levels of depression response.
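The three-level classification can be expressed as a small decision rule. The sketch below is illustrative: the function name and example scores are hypothetical, and the remission check is applied first so that a patient at or below 0.5 is never counted as a partial responder.

```python
def classify_response(baseline, month3):
    """Classify depression response from HSCL-20 scores at baseline and 3 months."""
    if month3 <= 0.5:
        return "remission"
    # At least 50% improvement from baseline, but not to the remission level.
    if baseline > 0 and (baseline - month3) / baseline >= 0.5:
        return "response"
    return "nonresponse"


# Hypothetical HSCL-20 scores.
print(classify_response(2.0, 0.4))  # remission
print(classify_response(2.0, 0.9))  # response (55% improvement)
print(classify_response(2.0, 1.5))  # nonresponse (25% improvement)
```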


Enrollment occurred from April through November of 1999. Of 601 patients who provided informed consent and who were randomized to treatment, 573 completed the baseline telephone assessment. The 28 prebaseline dropouts were demographically similar to the 573 patients who completed the baseline assessment, but had slightly less severe depression (mean PHQ-9 score of 12.5 vs 14.3). Patients had a mean age of 46 years, with the majority being women (79%) and white (84%). Major depression was present in 74% of subjects, dysthymia alone in 18%, and minor depression in 8%. Approximately one third of the study participants reported a past history of treatment for depression. In the month preceding enrollment, 35% of the patients had experienced an anxiety attack and 45% had reported some use of alcohol. Follow-up interviews were successfully completed in 94% of patients at 1 month, 87% at 3 months, 84% at 6 months, and 79% at 9 months.

Table 1 summarizes the prevalence of specific symptoms in this population of depressed patients at baseline and at 1, 3, 6, and 9 months after randomization to an SSRI treatment group. All physical symptoms were quite prevalent, both the 2 symptoms that constitute actual DSM-IV criteria for depressive disorders (fatigue and sleep complaints) and the 12 symptoms that are not part of the explicit criteria for depression. In fact, most symptoms were present in at least a third to half of the patients and, when present, were severe in 10% to 20% or more of patients.

Table 1. Prevalence of Physical Symptoms in Depressed Patients at Baseline and During 9 Months of Antidepressant Therapy

                      Bothered a Lot or a Little (%)                  Bothered a Lot (%)
Symptom                Baseline     1 mo     3 mo     6 mo     9 mo    Baseline     1 mo     3 mo     6 mo     9 mo
Number interviewed          546      538      504      483      455         546      538      504      483      455
Fatigue                    96.3     86.6     84.6     79.9     80.6        69.1     36.4     33.9     31.1     29.7
Sleep problems             85.0     71.4     67.0     61.9     62.3        57.1     27.3     26.2     23.6     22.0
Headaches                  80.7     66.2     66.7     65.7     64.3        33.2     16.9     17.7     14.5     15.2
Nausea/indigestion         71.2     61.2     62.3     55.8     59.0        25.1     13.6     14.3     13.7     15.2
Back pain                  70.3     54.9     65.5     61.8     61.4        27.7     15.2     19.8     18.0     22.2
Limb pain                  76.0     62.1     69.1     66.4     67.6        30.9     20.8     23.0     20.1     25.3
Stomach pain               63.2     49.1     52.1     43.0     47.8        21.1      9.1      9.7      7.2      9.0
Bowel problems             62.3     56.3     56.1     52.7     48.3        23.4     15.8     13.7     13.0     14.8
Palpitations               57.4     40.3     41.3     33.6     37.4        11.9      3.5      3.6      3.7      3.3
Dyspnea                    55.0     38.9     38.5     40.8     42.5        11.9      7.1      6.3      6.6      7.7
Dizziness                  47.7     36.4     35.7     32.2     32.6         6.5      3.9      4.6      4.3      4.2
Menstrual problems*        38.4     33.3     34.9     30.1     31.2        11.9      8.6      8.3      6.2      8.4
Chest pain                 36.7     23.2     25.2     25.5     27.1         7.2      2.2      3.6      2.7      2.2
Fainting                    6.1      3.9      3.0      3.1      3.3         0.9      0.2      0.6      0.2      0.9

  * Prevalence of menstrual problems is determined only for the women in the sample.

Incident symptoms were uncommon in this group of depressed patients being treated with an antidepressant. In other words, relatively few patients reported being “bothered a lot” by a particular physical symptom at follow-up if they had not reported being “bothered a lot” with that symptom at baseline. For most symptoms, the proportion of patients with an incident severe symptom at any of the four follow-up interviews was less than 5% to 10%, except back pain (13%), limb pain (12%), fatigue (12%), and sleep problems (11%).

The change in prevalence over the 9-month time period for five representative symptoms is displayed in Figure 1. Focusing on these symptoms—fatigue, sleep, stomach problems, headaches, and palpitations—the baseline prevalence of a severe symptom (“bothered a lot”) ranged from 12% for palpitations to 69% for fatigue. Prevalence dropped substantially during the initial 4 weeks of SSRI therapy. Thereafter it plateaued, with only minimal improvement during the remaining 8 months of the trial. This time course was similar for the other 9 physical symptoms not shown in the graph.

Figure 1.

Change in prevalence over the 9-month time period for five representative symptoms: fatigue, sleep, stomach problems, headaches, and palpitations. Illustrated is the baseline prevalence of a severe symptom (i.e., “bothered a lot”). This time course was similar for the other 9 physical symptoms not shown in the graph.

The proportion of variance in different domains of HRQoL attributable to physical symptoms and depression is summarized in Table 2. The variance estimates are adjusted for age, gender, race, anxiety, and comorbid disease. Physical symptoms accounted for the greatest proportion of variance in bodily pain (17% to 18%), role functioning due to physical health (11% to 14%), general health perceptions (13% to 15%), and physical functioning (13%), while depression had the greatest impact on mental health (26% to 45%), social functioning (14% to 32%), work functioning (9% to 32%), and multiple other domains of HRQoL. The possibility of an interaction between physical symptoms and depression was examined. While achieving statistical significance for a few HRQoL domains, adding the interaction term to the model produced only a slight change in the variance.

Table 2. Percent of Variance in Various Domains of Health Status Attributable to Physical Symptom and Depressive Symptom Severity

                            % Variance Attributable to*
HRQoL Domain                Physical Symptom Severity   Depressive Symptom Severity
SF-36 Bodily pain           17 to 18                    1
SF-36 General health        13 to 15                    0 to 1
SF-36 Role-physical         11 to 14                    3
SF-36 Physical function     13                          0
SF-36 Vitality              7 to 13                     16 to 25
MOS Sleep                   3 to 7                      10 to 17
MOS Sexual function         3                           0 to 1
WLQ Time management         0 to 5                      17 to 32
MOS Memory/concentration    0 to 4                      16 to 48
SF-36 Social function       0 to 3                      14 to 32
SF-36 Mental health         0 to 2                      26 to 45
SF-36 Role-emotional        0 to 2                      25 to 40
Disposition                 0 to 1                      9 to 18
WLQ % effective             0                           9

  * A range of variance is presented for those domains where the 2 models (one using the HSCL-20 as a depression measure and the other using the PHQ-9) gave somewhat different variance estimates. All variance estimates are adjusted for age, gender, anxiety, and comorbid medical diseases.

  HRQoL, health-related quality of life; SF-36, 36-item Short-form Health Survey; MOS, Medical Outcomes Study; WLQ, Work Limitations Questionnaire; PHQ-9, 9-item Patient Health Questionnaire depression scale; HSCL-20, 20-item Hopkins Symptom Checklist modified depression subscale.

Among demographic factors, age had the greatest effect. In particular, it accounted for a moderate proportion of the variance in SF-36 physical functioning (17%), MOS sleep (3% to 7%), and general health perceptions (3%). Gender and race had a smaller impact, accounting for less variance in fewer domains. These two demographic characteristics did not account for more than 1% to 3% of the variance in any HRQoL domain, except bodily pain (gender, 2% to 6%).

Additionally, anxiety and comorbid medical diseases were adjusted for within the analysis. Medical comorbidity did not account for any of the variance in the HRQoL domains, except role functioning due to physical health (0% to 1%), general health (2%), and physical functioning (1%). Anxiety affected the domains of mental health (7% to 10%) and work (5%) to the greatest extent. In the other HRQoL domains, anxiety accounted for 0% to 3% of the variance.

Figure 2 shows the time course for improvement for the nonpain (9 items) and pain (5 items) somatic symptom subscales of the PHQ compared to core depressive symptoms and positive well-being. Improvement in the latter two domains reflects a decrease in “negative” affective symptoms and an increase in “positive” affective symptoms, respectively. To standardize comparisons among these four domains, change was measured in effect size, which is the mean change divided by the pooled standard deviation for a measure. For core depressive symptoms and positive well-being, there was a rapid improvement, as reflected by the steep curve in the first month, followed by more gradual improvement in the following months of the trial. In contrast, both pain and nonpain somatic symptoms showed a similar steep improvement in the first month of SSRI treatment but then plateaued. Pain symptoms, in particular, showed the least improvement in terms of effect size.
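The effect size calculation can be sketched as follows. The text does not spell out how the standard deviation was pooled, so the sketch assumes the root mean square of the baseline and follow-up sample standard deviations; the function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(baseline, followup):
    """Mean change divided by the pooled SD of the two time points.

    Assumption: "pooled" is taken as the root mean square of the baseline
    and follow-up sample SDs. Positive values mean the score fell, that is,
    symptoms improved.
    """
    change = mean(baseline) - mean(followup)
    pooled_sd = sqrt((stdev(baseline) ** 2 + stdev(followup) ** 2) / 2)
    return change / pooled_sd


# Hypothetical pain subscale scores for five patients.
baseline = [4, 6, 5, 7, 3]
followup = [3, 4, 4, 5, 2]
print(round(effect_size(baseline, followup), 2))  # about 1.02
```

Expressing change in standard deviation units is what allows scales with different ranges, such as the 5-item pain and 9-item nonpain subscales, to be compared on one axis.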

Figure 2.

Time course for improvement for the nonpain (9 items) and pain (5 items) somatic symptom subscales of the PHQ compared to core depressive symptoms and positive well-being. To standardize comparisons among these four domains, change was measured in effect size, which is the mean change divided by the pooled standard deviation for a measure.

Table 3 shows the degree of physical symptom improvement according to the three levels of depression response at 3 months, classified as remission, response, and nonresponse.30 Remitters and partial responders had significantly more change (P < .001) than nonresponders in both pain and nonpain physical symptoms at both 1 and 3 months. The magnitude of physical symptom improvement in remitters and partial responders ranged from an effect size of 0.6 to 1.0, compared to 0.3 to 0.5 for nonresponders. In contrast, remitters and partial responders did not differ significantly from one another in the degree of improvement in either their pain or nonpain physical symptoms at 1 or 3 months.

Table 3. Physical Symptom Improvement According to the Level of Depression Response

Column groups: Improvement in Pain Symptoms at 1 and 3 Months; Improvement in Nonpain Symptoms at 1 and 3 Months
Columns within each group: Baseline Mean (SD); Change (Effect Size) at 1 Month; Change (Effect Size) at 3 Months

Depression Response*
Remitters             3.7 (1.9)   (2.8)   0.7   0.9
Partial responders    4.6 (2.1)   (2.9)   0.9   1.0
Nonresponders         5.0 (2.1)   (2.8)   0.5   0.4

  * Remitters were defined as having a Hopkins Symptom Checklist modified depression subscale (HSCL-20) score ≤0.5 after 3 months of antidepressant treatment, while partial responders had a ≥50% improvement in HSCL-20 score but not to a level <0.5. Patients who did not meet either criterion were placed in the nonresponders group.

  † Patient Health Questionnaire physical symptom score for the 5-item pain score ranges from 0 (no pain) to 10 (worst pain), and for the 9-item nonpain score ranges from 0 (asymptomatic) to 18 (most symptomatic).

  ‡ To standardize comparisons among domains, change was measured in effect size, which is the mean change divided by the pooled standard deviation for a measure.

  SD, standard deviation.


Consistent with previous studies,1–14 the ARTIST trial confirms that many physical symptoms are highly prevalent in primary care patients who present with clinical depression. This study extends our understanding of physical symptoms in the presence of depression by establishing a time course for improvement in individual symptoms with the treatment of depression and by determining the relative impact that physical symptoms and depression have on various domains of HRQoL. Strengths of the ARTIST study include its large sample size, random assignment to an SSRI agent, outcome assessment with multiple measures during both acute and maintenance periods of depression therapy, and a study design representative of actual clinical practice.

Within the first month of antidepressant treatment, a substantial proportion of depressed patients reported improvement in their physical symptoms. The burden of physical symptoms, as measured by somatic symptom severity score, declined substantially during the first 4 weeks but then leveled off during the remainder of the study. In contrast, depression had both a rapid initial improvement as well as a continued gradual improvement over the entire 9 months of treatment. Relatively few patients who did not have bothersome physical symptoms at the inception of antidepressant therapy developed incident symptoms during treatment.

While there is substantial literature demonstrating a strong cross-sectional association between somatic symptoms and depression, there is much less information about their longitudinal relationship. Widmer and Cadoret compared depressed with nondepressed primary care patients and found that both new and recurrent cases of depression were often heralded by somatic complaints in the preceding months.31,32 In our study, we followed depressed patients treated over 9 months and while somatic symptoms improved in many, there remained a substantial reservoir of unresolved symptoms. In particular, pain symptoms showed the poorest response, and have been shown to adversely affect depression outcomes.33,34 Recently, it has also been shown that while response to antidepressants occurs in 70% or more of depressed primary care patients, complete remission may occur in only 35% to 40%. Whether residual somatic symptoms contribute to lower remission rates needs to be determined.

An important limitation of our study is that all patients were clinically depressed and treated with an antidepressant. Thus, we cannot ascertain whether physical symptom improvement was simply an epiphenomenon of depression improvement or whether it was due to an independent antidepressant effect on physical symptoms, a placebo response, or merely the natural history of physical symptoms in primary care. The fact that the physical symptoms exhibited a different time course of improvement than the core depressive symptoms (as displayed in Fig. 2) coupled with the differential effects of physical symptoms and depression on HRQoL suggests that physical symptoms are at least a somewhat separate entity from depressive symptoms.

Somatic symptoms are extremely prevalent in primary care practice and, in an important proportion of patients, persistent and disabling. At least one third of somatic symptoms are medically unexplained and serve as an important marker of potentially treatable depressive and anxiety disorders.5,9 The fact that bothersome somatic symptoms frequently improve during the first month of antidepressant treatment in many patients is useful for the primary care physician in counseling the depressed patient presenting with physical complaints.

For those depressed patients whose somatic symptoms persist despite depression treatment, further treatment strategies should be investigated. Some may have persistent depression, which might respond to more intensive depression therapy. Others may have minimal residual depression but continued somatic symptoms. Antidepressants as well as cognitive-behavioral therapy (CBT) and other types of psychological and behavioral treatments have proven effective for somatic symptoms and symptom syndromes, and their effect does not appear to be entirely mediated through alleviation of depression or anxiety.35–37 However, the majority of antidepressants used in trials focusing on somatic symptoms have been tricyclics rather than SSRIs or other newer antidepressants. While a stepwise approach toward persistent somatic symptoms integrating these and other types of interventions has been proposed,5 much work remains to be done on developing evidence-based interventions.


The ARTIST trial was supported by a grant from Eli Lilly. Work on this paper was also supported by Grant T-32 PE15001 from the Health Resources and Services Administration.