Based on evidence that psychologic distress often goes unrecognized although it is common among cancer patients, clinical practice guidelines recommend routine screening for distress. For this study, the authors sought to determine whether the single-item Distress Thermometer (DT) compared favorably with longer measures currently used to screen for distress.
Patients (n = 380) who were recruited from 5 sites completed the DT and identified the presence or absence of 34 problems using a standardized list. Participants also completed the 14-item Hospital Anxiety and Depression Scale (HADS) and an 18-item version of the Brief Symptom Inventory (BSI-18), both of which have established cutoff scores for identifying clinically significant distress.
Receiver operating characteristic (ROC) curve analyses of DT scores yielded area under the curve estimates relative to the HADS cutoff score (0.80) and the BSI-18 cutoff scores (0.78) indicative of good overall accuracy. ROC analyses also showed that a DT cutoff score of 4 had optimal sensitivity and specificity relative to both the HADS and BSI-18 cutoff scores. Additional analyses indicated that, compared with patients who had DT scores < 4, patients who had DT scores ≥ 4 were more likely to be women, have a poorer performance status, and report practical, family, emotional, and physical problems (P ≤ 0.05).
Consensus-based guidelines developed by the Distress Management Panel of the National Comprehensive Cancer Network (NCCN) recommend screening all patients with cancer regularly for psychologic distress as part of routine care.1 This recommendation is based on evidence indicating that clinically significant distress often goes unrecognized by oncology professionals even though it is common among cancer patients.2–4 This situation is unfortunate for at least two reasons. First, the presence of heightened distress is associated with a number of negative outcomes, including greater nonadherence to treatment recommendations,5 poorer satisfaction with care,6 and poorer quality of life across multiple domains.7 Second, failure to recognize distress is likely to result in cancer patients not receiving pharmacologic and nonpharmacologic interventions that are known to be effective in relieving distress in this patient population.8, 9
To meet the objective of routine screening, it would be advantageous to have a measure that could be administered and interpreted rapidly by clinical staff. Several brief measures, such as the 14-item Hospital Anxiety and Depression Scale (HADS)10 and the 18-item version of the Brief Symptom Inventory (BSI-18),11 have been evaluated and were identified as useful in screening for distress in cancer patients. Despite their relative brevity, the time and effort required to administer and score these multiitem distress measures represent significant barriers to their widespread use. Although technologic innovations, such as computerized administration and scoring of screening measures, can address these barriers,12, 13 the resources required to take advantage of these innovations are unavailable today in most clinical settings.
Recognizing the need for a means to screen rapidly for distress in cancer patients, Roth and colleagues14 developed the single-item “Distress Thermometer” (DT). Patients who complete this measure are asked to rate their distress using a scale with scores ranging from 0 (“no distress”) to 10 (“extreme distress”). In the NCCN Clinical Practice Guidelines for Distress Management,1 the DT is accompanied by a problem list that asks patients to identify any of 34 issues (grouped into categories such as emotional problems and family problems) that have been a problem for them in the past week.
To date, there has been limited research evaluating the utility of the DT as a means of screening for distress in cancer patients. In their initial report on the DT, Roth and colleagues14 described the results of a study in which the DT and the HADS were administered to 93 men with prostate carcinoma. Adopting a cutoff score of 5 for the DT and using the established cutoff score of 15 for the HADS total score,15 those authors found that 28.6% of patients met the DT cutoff score, and 13% of patients met the HADS cutoff score. These rates reportedly yielded a 74.4% concordance rate between the 2 screening measures. Sensitivity and specificity were not reported. In a similar study, Trask and colleagues16 administered the DT and the HADS to 50 men and women who were potential candidates for bone marrow transplantation. Adopting a cutoff score of 5 for the DT, those authors found that 50% of patients met this cutoff score. Using the established cutoff score of 8 for the individual HADS scales,17 the authors found that 51% of patients met this cutoff score for anxiety, and < 20% of patients met this cutoff score for depression. No further information was reported regarding the correspondence in classification between the DT and the HADS.
Two studies have examined the operating characteristics of the DT as a screening measure in more detail. Akizuki et al.18 administered a Japanese language version of the DT and a psychiatric interview to 275 cancer patients with mixed diagnoses, the most common of which was breast carcinoma (31% of participants). A receiver operating characteristic (ROC) curve analysis was conducted to identify the optimal DT cutoff score relative to the presence or absence of adjustment disorder or major depressive disorder, as diagnosed using the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition.19 These diagnoses do not appear to have been based on a structured clinical interview, and no information is reported about the reliability of the diagnostic process. Procedures for determining the optimal cutoff score consisted of identifying the point of greatest sensitivity at which the likelihood ratio (defined as how much the DT would raise or lower the pretest probability of the target disorders) was > 2. The use of this criterion resulted in the selection of a DT cutoff score of 5, which yielded a sensitivity of 0.84 and a specificity of 0.61.
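The sensitivity and specificity reported for that cutoff score can be checked against the likelihood ratio criterion. The following is a minimal sketch, assuming the conventional definition of the positive likelihood ratio (sensitivity divided by 1 minus specificity); the source describes the ratio only informally, and the function name is illustrative.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """Positive likelihood ratio, LR+ = sensitivity / (1 - specificity).

    Assumption: this is the conventional definition; the study text
    describes the likelihood ratio only in words.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 <= specificity < 1.0:
        raise ValueError("specificity must lie in [0, 1)")
    return sensitivity / (1.0 - specificity)


# Values reported for the DT cutoff score of 5 in the Japanese-language study.
lr_plus = positive_likelihood_ratio(sensitivity=0.84, specificity=0.61)
print(round(lr_plus, 2))
```

Under this definition, the reported values give a ratio of approximately 2.15, which is consistent with the stated requirement that the ratio exceed 2.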
In the second study, Hoffman et al.20 administered the DT, the BSI, and the BSI-18 to 68 cancer patients with mixed diagnoses, the most common of which was breast carcinoma (21% of participants). Once again, ROC curve analysis was conducted to identify the optimal DT cutoff score. In that study, the criterion used was the DT score that yielded the optimal sensitivity and specificity relative to established cutoff scores for identifying “caseness” on the BSI21 and the BSI-18.11 Those authors reported areas under the curve of 0.74 and 0.80, respectively, for the BSI and BSI-18, suggesting that DT scores effectively discriminated patients classified as cases and noncases using established cutoff scores. However, they also reported that visual inspection of each ROC curve revealed no specific DT score that stood out as maximizing sensitivity and specificity. In that study, it was found that the use of the customary DT cutoff score of 5 yielded a sensitivity of 0.59 and a specificity of 0.71 relative to the BSI criteria for caseness and a sensitivity of 0.70 and a specificity of 0.64 relative to BSI-18 criteria for caseness.
In the current study, we sought to characterize further the operating characteristics of the DT as a screening measure for distress in cancer patients. The primary objective was to determine the optimal cutoff score on the DT for identifying clinically significant distress. Previous research on this topic, as discussed earlier, has been characterized either by small sample sizes20 or by the use of criterion measures of unknown reliability.18 To address these issues, we recruited a relatively large sample of cancer patients and used 2 measures (i.e., the HADS and the BSI-18) with established cutoff scores for identifying clinically significant distress in cancer patients. Assuming that an optimal DT cutoff score could be identified, a secondary objective of the current study was to explore whether demographic or clinical factors differentiated patients who scored above or below this cutoff score. In addition, we sought to explore whether patients who scored above or below this score differed in their reports of practical, family, emotional, spiritual, and physical problems.
MATERIALS AND METHODS
Participants were patients at one of five participating institutions: Beth Israel Cancer Center (New York, NY), the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins (Baltimore, MD), Memorial Sloan-Kettering Cancer Center (New York, NY), the H. Lee Moffitt Cancer Center and Research Institute at the University of South Florida (Tampa, FL), and the University of Michigan Comprehensive Cancer Center (Ann Arbor, MI). To be eligible to participate in the study, participants had to be 1) age ≥ 18 years, 2) diagnosed with cancer, 3) scheduled for an outpatient appointment, 4) able to read standard English, and 5) able to provide informed consent.
Individuals were approached in the waiting areas at each institution prior to a scheduled outpatient visit. After they received an explanation of the study and provided informed consent, participants were asked to complete a packet of self-report questionnaires that included a demographic and clinical data form, the DT, the Problem List, the HADS, and the BSI-18. Of 474 individuals who were approached, 380 patients (80%) agreed to participate and provided usable data. Participants and nonparticipants did not differ significantly with regard to gender, race/ethnicity, or receipt of treatment in the past month (P values > 0.05). However, nonparticipants were significantly older (P < 0.001) and had been diagnosed for longer (P < 0.01) than participants.
Demographic data were obtained through use of a standardized self-report questionnaire. Variables assessed were age, gender, race/ethnicity, marital status, education, and annual household income. Disease and treatment data also were obtained by patient self-report. Variables assessed were cancer type, date of diagnosis, and type of treatments received in the last month. In addition, patients completed the self-reported Karnofsky Performance Scale.22
The DT is a single-item, self-report measure of psychologic distress.1 The DT has an 11-point range with endpoints labeled “no distress” (0) and “extreme distress” (10). Respondents are instructed to circle the number (0–10) that describes best how distressed they have been in the past week (Fig. 1). The operating characteristics of the DT are the subject of the current study.
The Problem List was developed by the Distress Management Guidelines Panel of the NCCN.1 It consists of 34 problems commonly experienced by cancer patients that are grouped into 5 categories (practical problems, family problems, emotional problems, spiritual/religious concerns, and physical problems). Respondents are instructed to indicate whether or not (yes or no) any of the items listed has been a problem in the past week (Fig. 1). This version of the Problem List has not been evaluated previously.
The HADS10 is a 14-item, self-report measure of psychologic distress. The measure is distinguished by the general absence of somatic symptoms that may be attributable to either medical or psychiatric conditions. Accordingly, the HADS is suited well for use with cancer patients. For each item, respondents are asked to indicate which of 4 options (rated 3–0) comes closest to describing how they have been feeling in the past week. It has been shown that a total score ≥ 15 is indicative of clinically significant distress.15
The BSI-1811 is an 18-item version of the 53-item BSI.21 Based on findings regarding the prevalence of distress in cancer patients obtained using the BSI,23, 24 Zabora et al.24 have proposed the use of gender-specific cutoff scores for the BSI-18 set at the upper 25th percentile (male score ≥ 10; female score ≥ 13) to identify cancer patients who experience clinically significant distress. A comparison of these cutoff scores with established rules for identifying “caseness” on the full 53-item BSI21 yielded a sensitivity of 0.91 and a specificity of 0.93.24
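The two criterion rules described above can be written out explicitly. The following is a minimal sketch rather than the authors' scoring procedure: the function names and the "male"/"female" labels are illustrative assumptions, and only the cutoff values are taken from the text.

```python
# Cutoff values as stated in the text; all names below are hypothetical.
HADS_TOTAL_CUTOFF = 15                      # HADS total score >= 15
BSI18_CUTOFFS = {"male": 10, "female": 13}  # gender-specific BSI-18 cutoffs


def hads_case(total_score: int) -> bool:
    """True if the HADS total score meets the cutoff for significant distress."""
    return total_score >= HADS_TOTAL_CUTOFF


def bsi18_case(total_score: int, gender: str) -> bool:
    """True if the BSI-18 total score meets the gender-specific cutoff."""
    if gender not in BSI18_CUTOFFS:
        raise ValueError("gender must be 'male' or 'female' for this rule")
    return total_score >= BSI18_CUTOFFS[gender]


# The same BSI-18 total can classify differently depending on gender.
print(bsi18_case(11, "male"), bsi18_case(11, "female"))
```

Because the BSI-18 rule is gender specific, a total score of 11 meets the cutoff for a male patient but not for a female patient.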
Demographic and Clinical Characteristics
Table 1 shows that the 380 participants who were included in the analyses were an average of 56 years of age (range, 21–89 years). The sample was split fairly evenly in terms of gender (51% male and 49% female). The majority of participants were white (85%) and were married or living in a marriage-like relationship (72%). Fifty percent of the sample had earned a college degree, and 56% reported an annual household income ≥ $40,000. A broad range of cancer diagnoses was represented, with no single diagnosis comprising > 16% of the sample. Fifty-five percent of participants had received some form of cancer treatment in the past month, with > 10% of the sample having undergone chemotherapy and/or surgery in the past month. The average self-reported Karnofsky Performance Scale score corresponded to a rating that fell between “able to carry on normal activity or do work even with minor physical complaints” (a score of 2) and “able to carry on normal activity or do work but takes effort because of physical problems” (a score of 3). Participants had been diagnosed with cancer an average of 2.55 years previously (range, from 2 days to 30 years).
Table 1. Demographic and Clinical Characteristics of Study Sample
Table 2 lists the frequency distribution of DT scores. The average score was 3.41 (standard deviation = 2.79). ROC curves were constructed for sensitivity and 1-specificity for the range of possible scores on the DT compared with established HADS and BSI-18 cutoff scores for identifying clinically significant distress (Figs. 2, 3). Based on previous research, the presence of clinically significant distress was defined as a total score ≥ 15 on the HADS15 or a total score ≥ 10 (males) or ≥ 13 (females) on the BSI-18.24 The ROC curves are graphic representations of the trade-off between the sensitivity (true-positive rate) and specificity (true-negative rate) for every possible cutoff score on the DT. The area under the curve (AUC) in each ROC curve provides an estimate of the overall discriminative accuracy of the DT relative to the established cutoff scores for the HADS and the BSI-18. In ROC analysis, an AUC of 1 represents a test with perfect accuracy relative to the established criterion, whereas an AUC of 0.5 represents a test with no apparent accuracy relative to the established criterion. In the current study, the AUC was 0.80 using the HADS cutoff score as the criterion, and the AUC was 0.78 using the BSI-18 cutoff scores as the criterion. These values are in the range typically characterized as representing good overall accuracy. Visual inspection of the ROC curves suggests that a score ≥ 4 is the optimal DT cutoff score for identifying distressed cancer patients using either the HADS or the BSI-18 as the criterion.
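The ROC quantities described above can be illustrated with a short computation. This is a sketch under stated assumptions, not the analysis performed in the study: the data are hypothetical toy values, the function names are illustrative, and the Youden index (sensitivity + specificity − 1) is used as a numerical stand-in for the visual inspection by which the cutoff score was actually selected. The AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen noncase, with ties counted as one-half.

```python
from typing import List, Sequence, Tuple


def split_by_criterion(scores: Sequence[int], is_case: Sequence[bool]):
    """Separate DT scores into criterion cases and noncases."""
    cases = [s for s, c in zip(scores, is_case) if c]
    noncases = [s for s, c in zip(scores, is_case) if not c]
    if not cases or not noncases:
        raise ValueError("need at least one case and one noncase")
    return cases, noncases


def roc_points(scores, is_case, cutoffs=range(0, 11)) -> List[Tuple[int, float, float]]:
    """(cutoff, sensitivity, specificity) for the rule 'DT score >= cutoff'."""
    cases, noncases = split_by_criterion(scores, is_case)
    points = []
    for cut in cutoffs:
        sensitivity = sum(s >= cut for s in cases) / len(cases)
        specificity = sum(s < cut for s in noncases) / len(noncases)
        points.append((cut, sensitivity, specificity))
    return points


def auc(scores, is_case) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    cases, noncases = split_by_criterion(scores, is_case)
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in noncases)
    return wins / (len(cases) * len(noncases))


def youden_cutoff(scores, is_case) -> int:
    """Cutoff maximizing sensitivity + specificity - 1 (an assumed criterion)."""
    return max(roc_points(scores, is_case), key=lambda p: p[1] + p[2] - 1.0)[0]


# Hypothetical toy data: DT scores and criterion status (not study data).
toy_scores = [0, 1, 2, 3, 3, 4, 5, 6, 7, 9]
toy_is_case = [False, False, False, False, True, False, True, True, True, True]
print(auc(toy_scores, toy_is_case), youden_cutoff(toy_scores, toy_is_case))
```

On the toy data, the sketch gives an AUC of 0.94 and a cutoff score of 5; these values only illustrate the mechanics and have no bearing on the estimates reported in the study.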
Table 2. Frequency Distribution of Distress Thermometer Scores
[Table values not recovered. Column: No. of patients.]
The classification of patients based on a DT cutoff score of 4 relative to established HADS and BSI-18 cutoff scores is illustrated in Table 3. Using the HADS as the criterion, a DT cutoff score of 4 yielded a sensitivity of 0.77 and a specificity of 0.68. Using the BSI-18 as the criterion, the same cutoff score yielded a sensitivity of 0.70 and a specificity of 0.70.
Table 3. Correspondence of the Distress Thermometer with the Hospital Anxiety and Depression Scale and the 18-Item Brief Symptom Inventory
Relation of the DT Cutoff Score to Demographic and Clinical Variables
Chi-square analyses (Table 4) and t tests (Table 5) were conducted to explore the relation of the DT cutoff score of 4 to demographic and clinical variables. Chi-square analyses for cancer diagnosis and type of previous treatment were limited to categories for which the observed percentages were > 10%. Of the demographic variables measured, the DT cutoff score was related significantly (P ≤ 0.05) only to gender, with women more likely to report scores above the cutoff score. Of the clinical variables measured, the DT cutoff score was related significantly (P ≤ 0.05) only to performance status, with patients who scored above the cutoff having a poorer performance status than patients who scored below the cutoff.
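For a dichotomous variable such as gender, a comparison of this kind reduces to a chi-square test on a 2 × 2 table. The following is a minimal sketch that assumes the uncorrected Pearson statistic with 1 degree of freedom; the text does not state whether a continuity correction was applied, and the counts are hypothetical rather than study data.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

                 DT < 4   DT >= 4
        group 1    a         b
        group 2    c         d

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must contain observations")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, chosen only to illustrate the calculation.
stat, p = chi_square_2x2(30, 20, 20, 30)
print(round(stat, 2), round(p, 4))
```

With these hypothetical counts, the statistic is 4.0 and the P value is approximately 0.046, which would meet the P ≤ 0.05 threshold used in the analyses.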
Table 4. Relation of a Distress Thermometer Cutoff Score of 4 to Categoric Demographic and Clinical Variables
[Table values not recovered. Columns: No. of patients (%) with DT score < 4 and with DT score ≥ 4. Row labels include: did not graduate college; treatment in the past month; chemotherapy in the past month; surgery for malignancy in the past month; breast carcinoma diagnosis; lung carcinoma diagnosis.]
DT: Distress Thermometer.
Table 5. Relation of a Distress Thermometer Cutoff Score of 4 to Continuous Demographic and Clinical Variables
[Table values not recovered. Each variable was compared between patients with DT below cutoff and DT above cutoff using the Student t test. Row labels include: Yrs since diagnosis.]
SD: standard deviation; DT: Distress Thermometer.
Relation of the DT Cutoff Score to Problem List Items
Chi-square analyses were conducted to explore the relation of the DT cutoff score to endorsement of items on the Problem List. With regard to practical problems, the DT cutoff score was related significantly (P ≤ 0.05) to 1 of 5 problems listed (20%). Patients who scored above the cutoff were more likely to report problems with housing. With regard to family problems, the DT cutoff score was related significantly (P ≤ 0.05) to 2 of 2 problems listed (100%). Patients who scored above the cutoff were more likely to report problems dealing with their children and dealing with their partner. With regard to emotional problems, the DT cutoff score was related significantly (P ≤ 0.05) to 5 of 5 problems listed (100%). Patients who scored above the cutoff were more likely to report problems with depression, fears, nervousness, sadness, and worry. With regard to spiritual problems, the DT cutoff score was not related significantly (P > 0.05) to either of the 2 problems listed (0%). With regard to physical problems, the DT cutoff score was related significantly (P ≤ 0.05) to 14 of 20 problems listed (70%). Patients who scored above the cutoff were more likely to report problems with appearance, bathing or dressing, breathing, changes in urination, constipation, eating, fatigue, feeling swollen, fevers, getting around, nausea, pain, sexuality, and sleep.
The principal findings from the current study were that the 1-item DT compared favorably with the HADS and the BSI-18 as a method of screening for distress in ambulatory cancer patients, and that a cutoff score of 4 on the DT yielded optimal sensitivity and specificity relative to established cutoff scores on the other measures. Additional findings were that patients who scored at or above the cutoff of 4 on the DT were significantly more likely to be female and to have a poorer performance status. Finally, as expected, patients who scored above the cutoff were significantly more likely to report a variety of problems that included practical, family, emotional, and physical concerns.
The conclusion that the 1-item DT compared favorably with the HADS and the BSI-18 as a screening measure is based on the AUC statistics obtained when comparing the full range of DT scores with established cutoff scores for the HADS (AUC = 0.80) and the BSI-18 (AUC = 0.78). In the only other study of the DT that reported AUC statistics, Hoffman et al.20 obtained similar estimates when comparing DT scores with cutoff scores for identifying caseness on the BSI (AUC = 0.74) and the BSI-18 (AUC = 0.80). Taken together, the findings show that the single-item DT can discriminate effectively between patients classified as having or not having clinically significant distress using established cutoff scores on existing multiitem distress measures.
Current NCCN distress management guidelines recommend that patients who score ≥ 5 on the DT should be referred to a psychosocial care team for management of distress.1 The adoption of a score of 5 as a criterion appears to have been based on the original report describing the DT, in which a score of 5 was used as the basis for referring patients for psychiatric evaluation.14 In an attempt to provide an empirical basis for this type of treatment decision, in the current study, we used ROC curve analysis to identify the optimal DT score for identifying clinically significant distress. Using established cutoff scores on existing screening measures as the criteria for comparison, findings consistently indicated that a DT cutoff score of 4 yielded the optimal combination of sensitivity and specificity. This finding differs from the results of an earlier study using ROC curve analysis, in which no single DT score was seen as maximizing sensitivity and specificity.20 This discrepancy may reflect the fact that the current sample size was more than four times as large as the sample size in the earlier study, thereby increasing the ability to detect an optimal value. The current finding also differs from the results of an earlier study using ROC analysis that identified a DT cutoff score of 5 as yielding optimal sensitivity and specificity.18 This discrepancy may reflect differences in study methodology. Whereas the current study used established cutoff scores on existing measures as the criteria against which the DT was compared, the earlier study used diagnoses of adjustment disorder and major depressive disorder based on psychiatric interviews. Cross-cultural differences in the meaning and reporting of psychologic distress also may be a factor, because the current study was conducted in the U.S., and the earlier study was conducted in Japan.
The correlation of the DT cutoff score of 4 with female gender and poorer performance status was not unexpected. Previous studies using other measures of distress have reported similar results. With regard to gender differences, a meta-analysis of 58 studies conducted with cancer patients between 1980 and 1994 found that levels of psychologic distress were higher in studies in which only female patients were studied than in studies in which both male and female patients were studied.25 Performance status was examined too infrequently in that set of studies to allow for a meta-analysis. However, a number of individual studies can be identified that have reported significant correlations between poorer performance status and greater psychologic distress.26–29 In the only other study of the DT that examined these same correlations, it was found that DT scores were related significantly to poorer performance status (as assessed using Eastern Cooperative Oncology Group criteria) but not gender.18
In the current study, patients with DT scores ≥ 4 were more likely to report 22 of the 34 problems on the Problem List. Not surprisingly, patients with scores ≥ 4 on a measure that assessed distress were more likely to report all of the emotional problems and all of the family problems that were listed. The association observed between DT scores ≥ 4 and the increased likelihood that 14 of the 20 physical problems listed would be present is consistent with evidence regarding the distressing nature of many of the symptoms commonly experienced by cancer patients (e.g., pain and fatigue).30 Findings from the current study suggest that practical problems and spiritual problems are less likely to be accompanied by clinically significant psychologic distress. One prior study examined the relation of DT scores to nearly all the same problems that are listed on the Problem List.20 In that study, each problem was rated in terms of how much distress it produced (from 0 [no distress] to 10 [extreme distress]), then ratings were averaged across problem domains. Consistent with the current study, DT scores were correlated significantly with the average ratings of emotional, physical, and family problems but were not correlated with practical problems. In contrast to the current study, DT scores were correlated significantly with the average rating of spiritual problems. The results from these two studies provide insights into the types of problems that are most likely to result in distress in ambulatory cancer patients. The findings also alert clinicians and researchers who use the DT to the types of patient problems they are most likely to encounter when using this measure to screen for distress.
The Problem List used in the current study asked patients to indicate those issues that were a problem for them in the past week as a means of identifying possible sources of distress. This format does not provide an opportunity for patients to identify other potential sources of distress, such as unmet needs. Future researchers may wish to examine whether other problem list formats (such as asking patients to indicate those issues for which they would like help) may be more useful than the current format in identifying potential sources of distress.
In the current study, we successfully addressed several of the methodological limitations of prior research on the DT. Notable strengths included a sample that was relatively large and geographically diverse compared with the samples used in previous research. In addition, the current sample was more diverse with regard to types of cancer and cancer treatment modalities represented than in prior research. These features should ensure greater generalizability of the findings. Other strengths included the use of other screening measures with established cutoff scores as the basis for comparison and the use of statistical methods appropriate for the identification of an optimal cutoff score. However, there were several limitations in the current study that should be noted. First, there was limited diversity in the current sample with regard to race/ethnicity, education, and socioeconomic status. Additional work is needed to establish the operating characteristics of the DT in minority populations and low-literacy populations. Second, the finding that a DT cutoff score of 4 yielded the optimal sensitivity and specificity was not cross-validated in a second sample of patients in the current study. Although the selection of this cutoff score was confirmed using a second measure of distress in the current study, evidence that a similar cutoff score is obtained using the same measures in another sample of patients would increase confidence in the finding. Finally, the design of the current study does not allow for any conclusions to be drawn about the clinical benefit of screening for distress. The findings are limited to the characterization of the DT relative to established methods of screening for distress in cancer patients.
The issue of the clinical benefit of screening cancer patients routinely for distress is not without controversy. It has been argued that screening is unlikely to provide efficient identification of untreated psychiatric morbidity in oncology settings31 and, thus, may not lead to better outcomes. To address these concerns, two types of studies are needed. One type is research investigating whether routine screening for distress using the DT results in improved identification of patients who experience clinically significant distress relative to other forms of clinical care that do not involve screening. The second type is research investigating whether treatment delivered in a manner consistent with NCCN recommendations (i.e., routine screening followed by management of distress according to clinical practice guidelines) results in better health outcomes for cancer patients relative to treatment delivered in a manner that is not consistent with NCCN recommendations. The conduct of studies along these lines would help considerably in developing evidence-based guidelines for the management of distress in cancer patients and may provide valuable evidence documenting the benefits of psychosocial care that could assist in securing improved reimbursement for such services.