Fatigue is a major disease and treatment burden for cancer patients. Several scales have been created to measure fatigue, but many are long and difficult for very ill patients to complete, or they are not easy to translate for non-English speaking patients. The Brief Fatigue Inventory was developed for the rapid assessment of fatigue severity for use in both clinical screening and clinical trials.
The study enrolled 305 consecutive, consenting adult inpatients and outpatients with cancer who could understand and complete the self-report measures used in the study. The same instruments also were administered to 290 community-dwelling adults to obtain a comparison sample. Research staff completed a form that indicated the primary site and stage of the cancer, rated the Eastern Cooperative Oncology Group performance status of the patient, described the characteristics of the pain, and described the current pain treatment being provided to the patients.
The BFI was shown to be an internally stable (reliable) measure that tapped a single dimension, best interpreted as severity of fatigue. It correlated highly with similar fatigue measures. Greater than 98% of patients were able to complete it. A range of scores defining severe fatigue was identified.
Fatigue is commonly reported by individuals suffering from various diseases, including multiple sclerosis, rheumatoid arthritis, and cancer. It is the most frequently reported symptom among cancer patients.1 Because of its prevalence, it is often reported as the symptom that is the most distressing and causes the greatest amount of interference with daily life.2 In a study of breast and lung carcinoma patients, Blesch et al.3 reported that 99% of patients in their sample experienced fatigue. In a study of chemotherapy and radiotherapy patients, Irvine et al.4 found that 61% experienced fatigue after treatment. The occurrence of fatigue across diagnostic and treatment categories at all phases of the life span underscores the need for empirically based interventions.5
Fatigue in cancer patients is associated with psychological disturbance, symptom distress, and decreases in functional status.4 Fatigue in cancer may be caused by the disease itself, or it may be caused by various treatments, such as chemotherapy, radiotherapy, biological therapy, or surgery. In addition, other mechanisms that may be responsible for fatigue include sleep disturbances, environmental conditions, level of activity, nutritional status, and diverse inherent factors.6 Based on the available data, it is unclear how each of these factors may affect the cancer patient's experience of fatigue.
Until very recently, fatigue has not been measured routinely in many diseases, such as cancer, despite its impact on patients and its effect on quality of life.7 Most commonly, fatigue is found as one item on a scale measuring functional status or evaluating mood or as an item in a toxicity report. In recent literature reviews, Irvine et al.8 and Richardson2 both argued for better assessment tools and greater methodological rigor for future research on fatigue in cancer patients. They also noted that assessment scales have rarely been subjected to adequate reliability and validity testing. Richardson2 encouraged the creation of more assessment tools designed specifically for use with the cancer population.
Recommendations for assessing fatigue vary according to the purpose of the assessment. The general approach to assessing fatigue is based on the conceptualization of fatigue as a subjective experience. Therefore, it can be compared with the symptom of pain: It is what the person experiencing it says it is. Most researchers have used self-report instruments to measure fatigue. In clinical practice, the focus is on efficiently obtaining information that is needed for patient care. Further research is needed to compare the various approaches to measuring fatigue and to clearly separate the experience of fatigue from the responses to fatigue.
Recently, several instruments have been employed for the assessment of fatigue. These include: the Pearson–Byars Fatigue Feeling Checklist,9 the Profile of Mood States (POMS) Fatigue and Vigor subscales,10 the Functional Assessment of Cancer Therapy-Fatigue (FACT-F),11 the Piper Fatigue Self-Report Scale,12 the Fatigue Assessment Instrument (FAI),7 the Multidimensional Fatigue Inventory (MFI),13 the Multidimensional Fatigue Symptom Inventory (MFSI),14 and the Fatigue Symptom Inventory (FSI).15 Some of these measures are too long for very ill patients to complete. Others depend on English-based expressions or idioms that make them difficult to translate. A significant number of cancer patients in the United States do not speak English or speak English as a second language, making ease of scale translation a goal of clinical and research assessment measures.
The Pearson–Byars Fatigue Feeling Checklist9 consists of 13 items that are descriptive of energy levels. Examples of items include “slightly tired,” “fairly well pooped,” “very lively,” and “quite fresh.” Patients rate whether they feel “better than,” “the same as,” or “worse than” for each item. The scale is brief, exhibits good internal consistency (alpha 0.82–0.97) and test-retest reliability, and has the ability to discriminate between patients and healthy volunteers. However, this instrument utilizes a three-point scale that may not provide enough variability for the assessment of different levels of fatigue. Also, some of the items on this checklist may be difficult for some patients to comprehend or too idiomatic for easy translation.
Two subscales of the POMS10 have been used to assess fatigue in cancer patients. These subscales are the POMS-Fatigue and the POMS-Vigor. The POMS-Fatigue consists of the following items: “worn-out,” “listless,” “fatigued,” “exhausted,” “sluggish,” “weary,” and “bushed.” The POMS-Vigor includes these items: “lively,” “active,” “energetic,” “cheerful,” “alert,” “full of pep,” “carefree,” and “vigorous.” On both subscales, each item is rated using a five-point scale. These scales are brief enough to assess fatigue in cancer patients but, like the Pearson–Byars, may be difficult to translate.
The FACT-F11 was designed to measure the fatigue symptoms of cancer patients with anemia. It consists of the 28 items of the FACT-General (FACT-G) to assess health-related quality of life and an additional 13 items to assess fatigue. Examples of these fatigue items are “feel weak all over,” “have energy,” and “I am frustrated by being too tired to do the things I want to do.” Each item is rated on a five-point Likert scale ranging from “0” (not at all) to “4” (very much so). The FACT-F demonstrates good internal consistency and test-retest reliability. The main disadvantage of the FACT-F is its length: it may be too long to be given to cancer patients in a clinical setting. The 13-item fatigue subscale, on the other hand, does meet the requirement of a rapidly administered scale, but some items may be difficult to understand or to translate into other languages.
Other scales were designed to portray more than one dimension of fatigue, including the Piper Fatigue Self-Report Scale (PFS),12 the MFI,13 and the MFSI.14 Many of these scales have been developed carefully from a theoretical standpoint, taking into account the multifactorial nature of fatigue, and meet accepted standards of validity and reliability. However, their length makes them difficult for cancer patients to complete, especially for those who are very tired. The time required to complete them makes them difficult to use for clinical screening, or for outcome measures in clinical trials.
We developed a new fatigue measure, the Brief Fatigue Inventory (BFI), to address some of the concerns with existing instruments. We based the BFI on the Brief Pain Inventory (BPI),16 which has been used successfully to assess the severity and impact of cancer pain in the United States17 and in European18 and Asian countries.19 The simple wording of the BPI makes it easy to understand for educationally disadvantaged patients as well as easy to translate. The BPI measures the severity of pain and interference caused by pain using 0–10 scales. In addition, the level of pain assessed by the BPI can be divided into categories of “mild” (1–4), “moderate” (5–6), and “severe” (7–10) pain based on the amount of pain-related interference with function.20 We hoped to use the same method to establish ranges of fatigue severity.
Study Design and Overview
The study consisted of several phases. First, we examined fatigue by using responses to a fatigue questionnaire developed at the University of Wisconsin-Madison. The data were collected from both healthy controls and patients in the Madison area. Second, by using preliminary results from the Wisconsin data, we refined the items and administered a revised fatigue questionnaire to patients at The University of Texas M. D. Anderson Cancer Center and to a sample of community-dwelling adults from the Houston area with a similar age distribution. Third, we proceeded to a formal validation of the final version of the BFI by demonstrating the instrument's psychometric properties. We showed evidence of the BFI's validity with construct, concurrent, and discriminant validity. We also demonstrated its reliability. Fourth, we developed a categorization of severity of fatigue based on how much it interfered with function.
Development of the Brief Fatigue Inventory: Data from the Wisconsin Fatigue Study
The Pain Research Group collected data on fatigue from a fatigue questionnaire that had been administered previously to both patients and normal subjects at the Wisconsin Comprehensive Cancer Center at the University of Wisconsin-Madison. This questionnaire included demographic variables and fatigue-related items. These fatigue-related items assessed the severity of fatigue, the amount of interference with function caused by fatigue, and the presence of factors that worsen fatigue, such as pain and medications. Other items focused on how factors, such as illness, medical treatments, depression, worrying, “keeping up with daily activities” (such as work and school work), too much idle time, and difficulty sleeping, contribute to subjects' fatigue. The questionnaire also included items asking about the quality of sleep. A mood measure, the POMS, also was administered. We used these data to eliminate items that did not contribute to discrimination between patients and controls and items that demonstrated little variability or sensitivity with the patient sample for the final BFI.
The Wisconsin sample was composed of 249 patients and normal subjects. Approximately 29% of the total sample were psychiatric patients receiving treatment for depression, whereas 27% were cancer patients. One-third of the cancer patients were receiving interferon treatment. Thirty-three percent of the sample consisted of normal volunteers who were administered the questionnaire at a benefits fair. The remaining 11% consisted of graduate students who were given the questionnaire after a night of studying during finals week.
In contrast to the commonly held notion that sleep affects level of fatigue, preliminary analyses of the Wisconsin data suggested that items describing how people sleep, such as “feeling rested and refreshed upon awakening,” “difficulty falling asleep,” and “waking up during the night,” were not significantly related to reported fatigue in the patient group, nor was report of the number of hours slept in the previous 24 hours. Based on this preliminary examination of fatigue and related questions in a very diverse sample, a multidisciplinary working group was formed to review the fatigue literature and to develop a strategy for the simplification and validation of the BFI.
Validation Strategy Plan
The validation plan involved establishing validity in three ways: through construct, concurrent, and discriminant validity. Construct validity was examined by using a data reduction technique known as factor analysis. This procedure allowed us to discern the underlying factors or constructs the BFI is supposed to measure. Concurrent validity analysis was performed by correlating the BFI with the Fatigue subscale of the FACT and the Vigor and Fatigue subscales of the POMS. Although there is no “gold standard” for the measurement of fatigue, it is important to evaluate how well the BFI compares with previously validated fatigue instruments. Discriminant validity was established by comparing the mean BFI scores of patient groups who were expected to have different levels of fatigue based on Eastern Cooperative Oncology Group (ECOG) performance status. To show that the BFI items are reliable, Cronbach's coefficient alpha, a measure of internal consistency, was calculated. Finally, we explored the possibility of defining ranges of fatigue severity scores following the procedure that we had used to define ranges of pain severity based on the BPI.
The Brief Fatigue Inventory
The number of items and the item format of the final BFI (see Fig. 1) closely follow those of the BPI,16 a pain-assessment instrument. The one-page BFI has only nine items, with the items measured on 0–10 numeric rating scales. The advantages of using numeric rating scales over other types of rating scales are discussed elsewhere.21 Three items ask patients to rate the severity of their fatigue at its “worst,” “usual,” and “now” during normal waking hours, with 0 being “no fatigue” and 10 being “fatigue as bad as you can imagine.” Six items assess the amount that fatigue has interfered with different aspects of the patient's life during the past 24 hours. Depending on the purposes of measurement, this time interval can be changed to the past week. The interference items include general activity, mood, walking ability, normal work (includes both work outside the home and housework), relations with other people, and enjoyment of life. The interference items are measured on a 0–10 scale, with 0 being “does not interfere” and 10 being “completely interferes.”
The control subjects (n = 290) were members of service groups around the Houston area. Each organization was contacted in advance, and permission was granted to introduce the study and recruit participants at a future organization meeting. Any person attending the organization meeting on the scheduled date was eligible to participate, and participation implied informed consent. No restrictions were placed on eligibility; a current or past cancer diagnosis did not exclude participation. Specific service organizations were chosen in an attempt to gather data from a sample that was diverse in gender and ethnicity and that approximated the age of the cancer patients. Data were collected over a 3-month period.
The data on patients (n = 305) were collected at The University of Texas M. D. Anderson Cancer Center with both inpatients and outpatients from the departments of Bone Marrow Transplantation, Leukemia, Lymphoma, Gastrointestinal Oncology, and Radiation Oncology. Patients were at least 18 years of age, were able to read and understand English, had a pathological diagnosis of cancer, and had given verbal consent to participate. Patients were excluded if their clinical status was felt to be too poor to allow them to complete the survey or if they were diagnosed with a major psychiatric illness at the time of the study. Compliance was excellent. Only five patients refused to participate for the following reasons: “do not want to take the time,” “not up to it,” and “too tired to write.”
Table 1 presents the descriptive characteristics of the controls and patients who participated in the study. The control and patient samples were approximately the same age. Most of the controls were male, white, fully employed, and highly educated. The patient sample was evenly represented by gender and was mostly white. Thirty-four percent of the patients were disabled due to illness. Although they were not as highly educated as the controls, the majority of the patients in the sample had at least a college degree. The three most common primary cancer diagnoses were lymphoma, leukemia (acute), and leukemia (chronic). More than half of the patients had poor performance status (ECOG performance rating of 2 or greater).
Table 1. Description of the Sample of Patients and Control Subjects
Columns: Patients (n = 305); Controls (n = 290)
Rows:
Employed full time
Disabled due to illness
ECOG performance status
  (0) Fully active
  (1) Restricted but ambulatory
  (2) Ambulatory, capable of self care
  (3) Capable of only limited self care
  (4) Completely disabled
Median age in yrs (range)
ECOG: Eastern Cooperative Oncology Group.
Instruments Used and Data Collected
In addition to the preliminary fatigue questionnaire, the survey packet contained two previously validated measures of fatigue and a questionnaire to gather basic demographic information. The primary disease site was also recorded.
The previously validated measures included the following: 1) The FACT-F and FACT-Anemia subscales of the FACT assessment system.11 This 20-item instrument contains 13 items dealing with fatigue and 7 items dealing with anemia. Each question is scored on a five-point rating scale from 0 (“not at all”) to 4 (“very much”). 2) The POMS-Vigor and POMS-Fatigue subscales, which measure subjective mood states.10 The 15-item instrument includes 8 items describing feelings of vigor (“energetic,” “full of pep”) and 7 items describing feelings of fatigue (“listless,” “bushed”). Each question is scored on a five-point rating scale from 0 (“not at all”) to 4 (“extremely”).
Patient Data (Checklist)
In addition to the assessment instruments mentioned above, information on factors that might influence fatigue scores for the patient group also was collected. This information was recorded on a checklist, which assessed disease status, nutritional status, treatment information, and current laboratory values thought to be related to fatigue. Disease status information included cancer diagnosis, stage, infection evidence, ECOG performance status, and eating problems (e.g., loss of appetite, vomiting). Information on usual and current body weight also was recorded. Treatment regimen data consisted of whether the patient had undergone radiotherapy, chemotherapy, surgical therapy, bone marrow transplantation, or blood transfusion within the past 30 days. It was also noted whether or not the patient was taking opioid analgesics and/or was receiving biotherapy (e.g., interleukin-2, epoetin-alpha, growth-colony stimulating factor, tumor necrosis factor). Finally, laboratory results with a possible relation to fatigue were recorded if they were available within the last week. These included hemoglobin level, white blood cell count, platelets, total and absolute neutrophil count, lactic dehydrogenase, alanine aminotransferase, bilirubin, blood urea nitrogen, creatinine, and serum albumin. The study coordinator or designated study nurse completed this checklist.
Statistical Method Used to Categorize Fatigue Severity
We explored the possibility that patients could be grouped as having mild, moderate, and severe fatigue based on the “fatigue worst” item. We investigated possible boundaries for the groupings by examining the correlation between patients' self-reported interference with function and their ratings on the “fatigue worst” item. We chose “fatigue worst” to represent fatigue severity for conceptual simplicity and because this item has the largest correlation with the interference items.
By using the same boundary models we used in establishing cut points between categories for pain severity on the BPI,20 we tested two boundaries for mild levels of fatigue and two boundaries for severe levels of fatigue. For mild fatigue, we tested whether a cut point between 3 and 4 or between 4 and 5 was the optimal point for distinguishing “mild” fatigue from “nonmild” fatigue. For severe fatigue, we evaluated whether a cut point between 6 and 7 or between 7 and 8 was the optimal point for distinguishing “severe” fatigue from “nonsevere” fatigue. With two possible cut points for mild fatigue and two possible cut points for severe fatigue, there are four possible combinations with which to distinguish the three levels of fatigue severity, with “moderate” falling between “mild” and “severe.”
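The four candidate models follow from crossing the two mild cut points with the two severe cut points. The enumeration can be sketched in Python as follows. The function and constant names are illustrative and do not come from the study, and the order in which the models are generated need not match the model numbers in Table 4.

```python
from itertools import product

# Upper bound of the "mild" range: a cut between 3 and 4 means mild = 1-3.
MILD_UPPER_BOUNDS = (3, 4)
# Upper bound of the "moderate" range: a cut between 6 and 7 means severe = 7-10.
MODERATE_UPPER_BOUNDS = (6, 7)


def boundary_models():
    """Return the candidate (mild, moderate, severe) ranges on the 0-10 scale."""
    models = []
    for mild_hi, moderate_hi in product(MILD_UPPER_BOUNDS, MODERATE_UPPER_BOUNDS):
        models.append({
            "mild": (1, mild_hi),
            "moderate": (mild_hi + 1, moderate_hi),
            "severe": (moderate_hi + 1, 10),
        })
    return models


if __name__ == "__main__":
    for model in boundary_models():
        print(model)
```

Each dictionary gives inclusive lower and upper bounds on the “fatigue worst” rating for one candidate grouping.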
We assumed that persons with “mild” fatigue would have significantly less fatigue-related interference than those with “nonmild” fatigue. We also assumed that persons with “severe” fatigue would have significantly greater interference than those with “nonsevere” fatigue. Six fatigue interference items were used in a multivariate analysis to test for cut points between severity categories.
In a multivariate analysis of variance (MANOVA), there is more than one dependent variable, so there is more than one between-group and error sum of squares as well as between-group and error sums of cross products. These sums of squares and cross products form the basis of multivariate test criteria in much the same way as the between-group and error mean squares do in univariate analysis of variance (ANOVA). Unlike in a single-factor ANOVA in which there is only one dependent variable and, hence, one F-value associated with the between-group and error sum of squares, there are several multivariate test criteria in MANOVA. Three of the most commonly used are Pillai's trace, Wilks' lambda, and Hotelling's trace,22, 23 which can each be transformed into statistics approximated by the F distribution. Agreement among these three criteria would reinforce the appropriateness of the cut points.
To determine whether a “fatigue worst” cut point of between 3 and 4 or between 4 and 5 provides a better range to describe “mild” fatigue by the impact of the six interference items, one can perform two separate multivariate analyses.22 The first analysis would create two groups based on a “fatigue worst” rating cut point of between 3 and 4. The second analysis would construct two groups based on a “fatigue worst” rating cut point of between 4 and 5. The larger of the two F statistics (from Hotelling, Wilks, or Pillai) would correspond to the better cut point.
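For reference, the three criteria can be written in their standard textbook form. The notation is an assumption introduced here and is not taken from the study: H denotes the between-group sums-of-squares-and-cross-products matrix, E the corresponding error matrix, and \lambda_1, ..., \lambda_s the nonzero eigenvalues of E^{-1}H. With six dependent variables and three groups, s is at most 2.

```latex
\begin{align*}
\text{Wilks' lambda:}\quad
  & \Lambda = \frac{\det(\mathbf{E})}{\det(\mathbf{H} + \mathbf{E})}
            = \prod_{i=1}^{s} \frac{1}{1 + \lambda_i} \\
\text{Pillai's trace:}\quad
  & V = \operatorname{tr}\!\left[\mathbf{H}(\mathbf{H} + \mathbf{E})^{-1}\right]
      = \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i} \\
\text{Hotelling's trace:}\quad
  & T = \operatorname{tr}\!\left(\mathbf{H}\mathbf{E}^{-1}\right)
      = \sum_{i=1}^{s} \lambda_i
\end{align*}
```

A smaller \Lambda and larger V and T all indicate stronger separation between groups, which is why their F approximations can be compared across candidate cut points.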
We extended the preceding idea by using two cut points instead of one. Thus, instead of creating two groups based on “mild” and “nonmild,” we created three groups based on “mild,” “moderate,” and “severe.” The larger F statistic from MANOVA should be associated with the cut points that maximally discriminate fatigue severity. In this analysis, we performed four separate MANOVAs using the six interference items (relationship with others, work, enjoyment of life, walk, mood, activity) as dependent variables and the three levels of fatigue severity (mild, moderate, severe) as the between-groups factor.
Preliminary analyses indicated that several items of the fatigue questionnaire did not discriminate between controls and patients or did not vary within the patient group and were therefore dropped from the final nine-item BFI. The eliminated items included those concerning sleep quality and also the items “least fatigue” and “ability to think clearly.” Table 2 presents a descriptive summary of the key fatigue outcome measures by the two groups: normal controls and patients. A mean BFI fatigue score, which is discussed further below, was calculated from the nine BFI items. On all of the outcome measures, the patients reported significantly higher levels of fatigue scores than the controls.
Table 2. Means and Standard Deviations on Key Fatigue Outcome Measures for Patients and Control Subjects
Higher levels of fatigue are associated with lower scores on anemia specific items of the Functional Assessment of Cancer Therapy (FACT), the fatigue subscale of the FACT, and the Profile of Mood States (POMS)-vigor and with higher scores on the Brief Fatigue Inventory (BFI) and the POMS-fatigue subscale.
Statistically significant difference between patients and controls at the P < 0.01 level.
Figure 2 indicates that a large proportion of controls report scores at the lower end of the distribution of BFI scores, whereas the scores of cancer patients tend to be distributed uniformly. A greater proportion of patients report higher levels of fatigue compared with controls. These findings give evidence for the sensitivity of the BFI with cancer patients.
Establishing the Validity of the Brief Fatigue Inventory
The validation of the BFI was carried out with the patient data. Construct validity was shown by factor analysis. Concurrent validity was demonstrated by correlating the BFI with commonly used fatigue measures, such as the fatigue subscales of the POMS and FACT, the POMS-Vigor, and the nonfatigue items of the FACT. Finally, discriminant validity was assessed by comparing the BFI scores of patient groups who were expected to have differing levels of fatigue.
Factor analysis is a multivariate procedure which allowed us to determine the underlying constructs measured by the items in the BFI. This procedure identified a single underlying construct among the nine BFI items. The factor loadings were high, and ranged from 0.81 for usual fatigue to 0.92 for activity. This pattern of factor loadings is indicative of the association of the nine BFI items with a single factor.
Single-Factor Model Fit
The scree plot was used to determine the number of factors. The eigenvalues of 6.9, 0.57, and 0.36 for the first three factors indicate that most of the data can be explained by a single construct. In fact, the first factor explains about 75% of the variability in the data. In addition, this single-factor model fits well according to Harman's24 rule that the standard deviation of the residuals be slightly less than or approximately equal to the standard error of a correlation coefficient, which is the reciprocal of the square root of the sample size. In our case, the standard deviation of the residuals is 0.04 compared with the standard error of the correlation coefficient, which is 0.06.
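The threshold in Harman's rule can be checked with a few lines of Python. This is a minimal sketch using the patient sample size (n = 305) and the residual standard deviation (0.04) reported above. The function names are illustrative, and the rule is simplified to a strict "less than or equal" comparison.

```python
import math


def standard_error_of_r(n):
    """Approximate standard error of a correlation coefficient: 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1.0 / math.sqrt(n)


def harman_fit_ok(residual_sd, n):
    """True if the residual SD does not exceed the 1 / sqrt(n) threshold."""
    return residual_sd <= standard_error_of_r(n)


if __name__ == "__main__":
    print(round(standard_error_of_r(305), 2))
    print(harman_fit_ok(0.04, 305))
```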
Because the BFI measures a single construct, the arithmetic mean of the nine BFI items can be used as a global BFI score. If responses to several items are missing, then a mean BFI score adjusted for the number of missing values can be computed. Ware et al.25 recommended that a scale score be calculated if respondents answered at least one-half of the items. Thus, if an individual responds to the majority of the items (at least five items on the BFI), then a mean BFI score can be computed. However, if fewer than five items are completed, then it is inappropriate to compute a mean BFI score.
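The scoring rule can be sketched in Python as follows. The sketch assumes that "adjusted for the number of missing values" means averaging over the answered items only, and it represents a missing response as None. The function and constant names are illustrative.

```python
from statistics import mean

N_ITEMS = 9        # three severity items plus six interference items
MIN_ANSWERED = 5   # at least five of the nine items must be answered


def bfi_global_score(responses):
    """Mean of the answered BFI items, or None if too few were answered.

    responses: sequence of nine ratings on the 0-10 scale, None if missing.
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected exactly nine item responses")
    answered = [r for r in responses if r is not None]
    for rating in answered:
        if not 0 <= rating <= 10:
            raise ValueError("ratings must lie between 0 and 10")
    if len(answered) < MIN_ANSWERED:
        return None  # a mean score is not appropriate
    return mean(answered)


if __name__ == "__main__":
    print(bfi_global_score([7, 6, None, 5, None, 4, 3, None, None]))
```

Returning None rather than a number keeps incomplete questionnaires from being mistaken for low fatigue scores in later analyses.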
Factor Analysis of the Profile of Mood States and Functional Assessment of Cancer Therapy-Fatigue Scales
Separate factor analyses of the items from the fatigue subscale of the FACT and the items from the POMS fatigue scale also resulted in a single factor for each scale. Factor analysis of the items comprising both the Vigor and Fatigue subscales of the POMS yielded two factors. Similarly, a factor analysis of the items in both the fatigue subscale and the anemia-specific items of the FACT resulted in two factors.
Concurrent validity was established by showing that the BFI is closely related to existing instruments that measure fatigue. Two previously validated measures, POMS-Fatigue and the Fatigue subscale of the FACT, were used in this analysis. The BFI was significantly correlated with both the FACT (r = −0.88, P < 0.001) and the POMS (r = 0.84, P < 0.001) Fatigue subscales. The FACT and POMS Fatigue subscales also were significantly correlated with one another (r = −0.92, P < 0.001).
Treatment- or disease-related anemia (represented by hemoglobin level) is commonly related to fatigue in cancer patients. Due to the different hemoglobin standards based on gender, we calculated separate standard scores for males and females. By using these adjusted hemoglobin levels, the correlation between the BFI and hemoglobin was statistically significant (r = −0.36, P < 0.001). Hemoglobin also was significantly related to both the Fatigue subscale of the FACT (r = 0.38, P < 0.001) and the POMS-Fatigue (r = −0.34, P < 0.001).
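One way to compute gender-specific standard scores is to convert each hemoglobin value to a z-score using the mean and standard deviation of its own gender group. The sketch below assumes that the sample's own within-group statistics are used, which the text does not specify. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def standardize_within_groups(values, groups):
    """Return z-scores computed separately within each group label.

    Each group needs at least two values so that a sample SD exists.
    """
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    # Mean and sample SD for each group (for example, males and females).
    stats = {g: (mean(v), stdev(v)) for g, v in by_group.items()}
    return [(value - stats[g][0]) / stats[g][1]
            for value, g in zip(values, groups)]


if __name__ == "__main__":
    hemoglobin = [14.0, 16.0, 12.0, 14.0]   # hypothetical values in g/dL
    gender = ["M", "M", "F", "F"]
    print(standardize_within_groups(hemoglobin, gender))
```

The resulting scores place males and females on a common scale, so a single correlation with the BFI can be computed across the whole patient group.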
Discriminant validity also was examined by comparing the BFI scores of groups formed based on other variables (i.e., ECOG performance status, FACT-Fatigue subscale, and POMS-Fatigue) that were expected to be associated with severity of fatigue. The ECOG performance status rating is widely used to assess disease severity.26 Patients with more severe disease were expected to have greater levels of fatigue. Table 3 shows that the mean BFI scores, in fact, were significantly different across different performance status ratings based on the ECOG scale. The two other fatigue instruments, the Fatigue subscale of the FACT and the POMS-Fatigue, showed similar discriminant validity.
Table 3. Means and Standard Deviations of the Brief Fatigue Inventory, Profile of Mood States, and Functional Assessment of Cancer Therapy across Performance Status Categories
ECOG = 0 (n = 55)
ECOG = 1 (n = 86)
ECOG = 2 (n = 154)
ECOG: Eastern Cooperative Oncology Group; BFI: Brief Fatigue Inventory; FACT: Functional Assessment of Cancer Therapy; POMS: Profile of Mood States.
Significantly different at each level of ECOG performance status (P < 0.001).
To show that the nine items comprising the BFI are reliable, we calculated Cronbach's coefficient alpha for these items. Coefficient alphas range from 0 to a maximum of 1, with higher values indicating less measurement error. Individual alphas for each item (if deleted) were 0.95 for both activity and work and 0.96 for the remaining items. The internal consistency coefficient of 0.96 supports the reliability of the BFI.
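Coefficient alpha and the "alpha if item deleted" values can be computed from a respondents-by-items table using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The following Python sketch uses only the standard library and assumes complete data with no missing responses. The function names are illustrative.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows, index):
    """Alpha recomputed after dropping the item at the given column index."""
    reduced = [[v for j, v in enumerate(row) if j != index] for row in rows]
    return cronbach_alpha(reduced)


if __name__ == "__main__":
    data = [[1, 2], [2, 1], [3, 3]]   # hypothetical two-item example
    print(cronbach_alpha(data))
```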
Physiologic Predictors of Fatigue
Due to potentially different etiologies of fatigue with different cancers, the patient group was divided into two different diagnostic categories. The first group was made up of those with solid tumors, such as breast and lung carcinoma. The second group was made up of those diagnosed with hematologic malignancies, such as leukemia, lymphoma, and multiple myeloma. This categorization was made to examine the correlation between laboratory values and fatigue in these two different groups.
Laboratory data collected from the checklist included hemoglobin level, white blood cell count, platelets, total and absolute neutrophil count, and factors related to nutritional status, such as lactic dehydrogenase, alanine aminotransferase, bilirubin, blood urea nitrogen, creatinine, and serum albumin. These laboratory data were used in an exploratory regression analysis to determine which of these variables would significantly predict BFI scores. In the exploratory analyses, stepwise, backward, and forward regressions arrived at the same results.
In the hematologic group, both albumin and hemoglobin were significant predictors of fatigue. Albumin, with a standardized coefficient of −0.4, is a stronger predictor of fatigue than hemoglobin (standardized coefficient of −0.2). Hemoglobin levels, after adjustment for the effect of gender, were converted back to raw data values for easier interpretation. A unit drop in hemoglobin (g/dL) would increase the BFI score by 0.3 unit provided albumin level is held constant. A unit drop in albumin (g/dL) would increase the BFI score by 1.7 units provided that hemoglobin level is held constant. For the hematologic group, this two-predictor regression model (hemoglobin and albumin) explained 26% of the variance.
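As a worked illustration of the unstandardized coefficients for the hematologic group, consider a hypothetical patient whose hemoglobin falls by 2.0 g/dL and whose albumin falls by 0.5 g/dL. The intercept of the model is not reported, so only the predicted change in the BFI score is shown. Here \Delta_{\mathrm{Hb}} and \Delta_{\mathrm{Alb}} denote the size of each drop in g/dL, notation introduced for this example.

```latex
\begin{align*}
\Delta\mathrm{BFI} &\approx 0.3\,\Delta_{\mathrm{Hb}} + 1.7\,\Delta_{\mathrm{Alb}} \\
                   &= 0.3 \times 2.0 + 1.7 \times 0.5 \\
                   &= 0.6 + 0.85 = 1.45
\end{align*}
```

Under this linear model, the combined drop would correspond to a predicted increase of about 1.5 points on the 0–10 BFI scale. Because the model explains 26% of the variance, individual patients may deviate substantially from this prediction.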
In the solid tumor group, only albumin was a significant predictor of fatigue. A unit drop in albumin level would increase the BFI score by as much as 2.4 units. This regression model accounted for 20% of the variance in the solid tumor group.
Categorizing Severity of Fatigue
We wanted to determine whether there were categories of fatigue severity that might be described as “mild,” “moderate,” or “severe.” We explored these models, as described above in Methods, based on combinations of numerical scale cut points that had been useful previously in categorizing pain severity. Table 4 shows that the cut point categories shown in model 1 were optimal based on the agreement of the multivariate criteria described in Methods: 1–3 for mild, 4–6 for moderate, and 7–10 for “severe” fatigue. However, when we examined the interference for the lower boundaries, we found that these groupings did not discriminate as well as the groupings for the upper boundaries. The two models with the highest F values, models 1 and 2, indicate that the cut point for “severe” fatigue was consistently between 6 and 7.
Table 4. F Statistics for Fatigue Severity Levels Based on Various Test Criteria in Multivariate Analyses of Variance
Figure 3 presents mean BFI interference score (composite of six items) graphed against “fatigue worst.” The optimal cut points are associated with large increases in interference. As fatigue severity increases, so does fatigue interference. However, the rate of increase (slope) changes as both of these items increase. For instance, the steepest slope was between 6 and 7. This suggests that, for every unit increase in fatigue severity, the increase in interference was larger between 6 and 7 than anywhere else along the “fatigue worst” continuum. The slope between 3 and 4 was not as steep as the slope between 6 and 7. This indicates that, although it can be said that 7–10 is the optimal “severe” level of fatigue, the steepness of the slope for the lower boundary model is less pronounced. At this point, finding the optimal cut point for “mild” and “moderate” fatigue severity should be investigated further.
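The slope comparison described above amounts to taking the difference in mean interference between adjacent "fatigue worst" scores and looking for the largest step. A minimal Python sketch of that calculation follows; the function names are hypothetical, and the example values are made up purely for illustration and are not data from this study.

```python
def adjacent_slopes(mean_interference_by_score):
    """Map each pair of adjacent "fatigue worst" scores (lo, lo + 1) to the
    change in mean interference between them (slope per one-unit step)."""
    scores = sorted(mean_interference_by_score)
    return {
        (lo, hi): mean_interference_by_score[hi] - mean_interference_by_score[lo]
        for lo, hi in zip(scores, scores[1:])
        if hi - lo == 1  # skip gaps where a score has no observations
    }


def steepest_step(mean_interference_by_score):
    """Return the (lo, hi) pair of adjacent scores with the largest increase."""
    slopes = adjacent_slopes(mean_interference_by_score)
    return max(slopes, key=slopes.get)


if __name__ == "__main__":
    # HYPOTHETICAL mean interference values, for illustration only.
    example = {5: 3.0, 6: 3.5, 7: 5.0, 8: 5.6}
    print(adjacent_slopes(example))
    print(steepest_step(example))
```

A candidate cut point is then placed at the step where the slope is steepest, which is the reasoning the text applies to the boundary between 6 and 7.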
The nine-item BFI taps into a single dimension that can be thought of as the subjective report of fatigue severity. It is a reliable (internally stable) instrument that is correlated with measures of performance status (patients who are more ill report higher levels of fatigue) and with physiological markers of anemia (hemoglobin) and nutritional status (albumin) known to be associated with fatigue.
The performance of the BFI in this study is virtually interchangeable with that of a 13-item fatigue scale that is part of the FACT system of assessment, as well as with that of the Fatigue scale of the POMS. Both of these other instruments also demonstrate a single factor that can be thought of as assessing severity of fatigue. All three scales are easily administered, and patients can complete them quickly, suggesting that they can be used for clinical screening as well as for outcome measures in clinical trials in which fatigue severity is an outcome of interest.
The BFI, however, may have some advantages over the other two measures. Its use of simple, single-word designations of fatigue severity levels and functional domains makes it very easy to understand. Based on our experience with the BPI, translation of the BFI into other languages should be simple and straightforward. Many patients find the response rating system (which uses 0–10 scales) easy to understand, and this type of rating fits in well with the 0–10 ratings of symptom severity that are often used in clinical practice.
The BFI and the other two measures used in this study do not capture the multiple dimensions that longer instruments were designed to represent, such as the cognitive, affective, and somatic components of fatigue. These multidimensional scales, however, are often too long for tired patients to complete. In clinical practice, one might wish to screen for patients with high levels of fatigue based on a measure such as the BFI, then do additional assessments to determine the causes of fatigue in those patients. The multidimensional measures also are probably too long to be completed when fatigue is of interest in a clinical trial. The inclusion of long assessment instruments in such trials often leads to missing data, because patients find them difficult to complete or because it is difficult to schedule sufficient time for their administration. These longer scales probably will find their application in descriptive studies of fatigue, in which they can be given in a one-on-one situation with study personnel and when enough time can be scheduled for their completion.
The three fatigue assessment tools presented here could serve to rapidly identify those patients with clinically significant fatigue. In this study, we examined BFI scores to identify cut points that might discriminate between patients with different levels of fatigue severity. We used the same approach that we had used previously to identify “mild,” “moderate,” and “severe” levels of pain. This involved comparing fatigue severity ratings with the patients' ratings of how fatigue interfered with common functional domains. With pain severity, three distinct groupings (mild, moderate, and severe) could be formed, and the correlation between functional impairment and pain severity was nonlinear. The three groupings of pain severity clearly stood apart from one another. The correlation between fatigue severity rating and functional interference, however, is more linear at the lower end of the range of fatigue scores; therefore, the development of cut points between mild and moderate categories of fatigue severity is less straightforward.
We found that fatigue severity ratings could be thought of as forming two groups, “severe” and “nonsevere,” with those patients rating their worst fatigue at a 7 or greater as having “severe” fatigue. By using this range (7–10) to indicate severe fatigue, 35% of our patient population and only 5% of our community control population would be identified as having severe fatigue. It is interesting to note that, with pain severity ratings, 7–10 also defines a group with severe pain.20 Clearly, those rating their fatigue severity at a 7 or greater are very tired and most probably need some type of clinical intervention to improve their function. However, because of the linear correlation between fatigue severity and functional interference found at the lower end of fatigue severity, we suggest that the cut points between “mild” and “moderate” fatigue should be regarded as provisional in nature.
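For screening purposes, the cut points discussed here can be expressed as a small lookup on the 0–10 "fatigue worst" rating. The Python sketch below is one possible encoding, not an official scoring routine: the function names are hypothetical, the label "none" for a rating of 0 is an assumption (the text assigns labels only to ratings of 1–10), and the mild/moderate boundary should be treated as provisional, as noted above.

```python
def fatigue_category(worst_fatigue):
    """Classify a 0-10 integer "fatigue worst" rating.

    7-10 -> "severe" (the boundary the analysis supported most clearly)
    4-6  -> "moderate" (provisional)
    1-3  -> "mild" (provisional)
    0    -> "none" (assumed label; not defined in the text)
    """
    if isinstance(worst_fatigue, bool) or not isinstance(worst_fatigue, int):
        raise ValueError("rating must be an integer from 0 to 10")
    if not 0 <= worst_fatigue <= 10:
        raise ValueError("rating must be an integer from 0 to 10")
    if worst_fatigue >= 7:
        return "severe"
    if worst_fatigue >= 4:
        return "moderate"
    if worst_fatigue >= 1:
        return "mild"
    return "none"


def is_severe(worst_fatigue):
    """Two-group reading: severe (7 or greater) versus nonsevere."""
    return fatigue_category(worst_fatigue) == "severe"


if __name__ == "__main__":
    for rating in (0, 3, 4, 6, 7, 10):
        print(rating, fatigue_category(rating), is_severe(rating))
```

Keeping `is_severe` separate reflects the two-group reading (severe versus nonsevere), which does not depend on the provisional lower cut point.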
Fatigue is endemic in those with cancer. Most of those working in oncology are familiar with patients who are totally disabled by their fatigue. Because few treatments are currently available, fatigue is more difficult to treat than pain. There are few clinical trials that assess fatigue as an outcome, and fatigue as a side effect of therapy is often only crudely assessed. The use of simple, easily administered, and easily scored fatigue scales should open the way for epidemiologic studies on fatigue, should improve communication about fatigue between patients and those who care for them, and should facilitate clinical trials that are focused on the development of new treatments for fatigue.
Acknowledgments: The authors thank Hong Guo and Melissa Menzies for data entry and Martha Engstrom for editing.