Physician ratings of the extent to which 15 FKSI items were believed to be treatment- or disease-related were examined for missing data. Less than 1% of the total expected physician responses to the survey were missing. Any responses of “neither” (54 of 465, 11.6% of total) were excluded from the analyses. Table 2 and Figure 1 summarize the physician ratings of the degree to which items are disease- versus treatment-related.
Figure 1. Expert ratings of disease- versus treatment-related attribution by symptom: means and 95% confidence intervals per FKSI item. Items rated by experts as “exclusively disease-related” received a score of +2; “exclusively treatment-related,” −2; “predominantly disease-related,” +1; “predominantly treatment-related,” −1; and ratings of “too close to determine” were assigned a score of 0. FKSI, Functional Assessment of Cancer Therapy-Kidney Symptom Index.
Using the first method of analysis, we identified seven items that were categorized as either exclusively disease-related (i.e., +2) or predominantly disease-related (i.e., +1) by more than 50% of all respondents (i.e., at least 16 of 31). These items assessed pain, weight loss, bone pain, dyspnea (“shortness of breath”), worry, cough, and hematuria (“blood in urine”). Seven other items, although not clearly rated by the majority as disease-related, were quite frequently rated as “too close to determine.” This list included lack of energy, fatigue, ability to enjoy life, appetite, bothered by fevers, ability to work, and sleep. Only the item “I am bothered by side effects of treatment” was clearly rated as predominantly treatment-related.
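The majority rule used in the first method can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example ratings are hypothetical, and the denominator is taken to be all 31 respondents, as the "at least 16 of 31" wording suggests.

```python
# Rating scale from Figure 1: +2/+1 disease-related, 0 too close to
# determine, -1/-2 treatment-related.

def is_majority_disease_related(ratings, n_respondents=31):
    """Return True if more than 50% of all respondents rated the item +1 or +2.

    `ratings` holds the scored responses for one item; responses excluded
    from analysis are simply absent from the list, while the denominator
    stays at the full respondent count (an assumption based on the text).
    """
    disease_votes = sum(1 for r in ratings if r >= 1)
    return disease_votes > n_respondents / 2


# Hypothetical item: 16 disease-related ratings out of 31 respondents.
example = [2] * 10 + [1] * 6 + [0] * 9 + [-1] * 4
print(is_majority_disease_related(example))
```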
Using the second method of analysis, we identified nine items whose 95% CI did not fall below zero (zero representing the “too close to determine” category). These items included all seven items identified as more disease-related by the first method, plus fever and fatigue. “Lack of energy” came close to significance, with the 95% CI touching zero. This process produced a set of 10 FKSI questions as candidates for the FKSI-DRS: lack of energy (borderline), pain, weight loss, bone pain, fatigue, shortness of breath, worry that condition will get worse, coughing, bothered by fevers, and blood in urine. Of these 10 items, nine are physical symptoms. One is psychological (“I worry that my condition will get worse”) and is not likely to be caused directly by disease activity in any physical sense. Because it was highly unlikely that worry could be directly attributed to kidney cancer as a symptom, it was deleted from further consideration in the FKSI-DRS, resulting in a nine-item scale (see Appendix I).
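The confidence-interval rule of the second method can be illustrated with a short sketch. The text does not say how the intervals were computed, so the normal approximation used here is an assumption (a t-based interval would be slightly wider for about 30 raters). The function names and ratings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(ratings, level=0.95):
    """Mean and normal-approximation confidence interval for one item's ratings."""
    m = mean(ratings)
    se = stdev(ratings) / sqrt(len(ratings))   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m, m - z * se, m + z * se


def ci_not_below_zero(ratings, level=0.95):
    """True if the lower confidence bound is at or above 0 ("too close to determine")."""
    _, lower, _ = mean_ci(ratings, level)
    return lower >= 0


# Hypothetical ratings on the -2..+2 scale for a single item.
item = [2, 2, 1, 1, 2, 1]
print(mean_ci(item), ci_not_below_zero(item))
```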
Because the reduction of the FKSI-15 to the nine-item FKSI-DRS was based primarily on expert clinical input, we obtained additional patient input during the course of the FKSI-15 validation study. Specifically, we interviewed 15 people with advanced kidney cancer and asked them to respond to eight items that were rated most associated with disease rather than treatment, based on interim results of the expert survey study (n = 18): pain, weight loss, bone pain, shortness of breath, worry that condition will get worse, coughing, bothered by fevers, and blood in the urine. Patients were asked five questions: 1) whether the list of eight expert-derived symptoms represented the most important set of symptoms relative to their condition; 2) the relative importance of fatigue, sleep, and appetite (which were missing from the short list); 3) whether they had experienced significant fatigue since their diagnosis, and if so, whether they thought their fatigue was more physical or mental, and what percentages they would assign to these attributions; 4) whether they thought “fatigue” was different from “lack of energy”; and 5) whether there were any other symptoms they felt were associated with their condition that we should be asking about.
In response to the first question, a majority of the 15 patients (n = 10, 67%) endorsed the list of items as the most important symptoms relative to advanced kidney cancer. Patients volunteered symptoms that were not on the list, including fatigue/lack of energy (n = 3), reaction of loved ones/concern about family relationships (n = 3), and depression/worry (n = 1). Responses to the second question revealed clear endorsement of fatigue (n = 11) from patients and mixed endorsement of appetite (n = 7) and sleep (n = 6) as problems associated with advanced kidney cancer. Consistent with the prior question, a majority of patients (n = 11, 73%) reported experiencing significant fatigue since their diagnosis, and most (n = 7, 64%) thought of their fatigue as a physical manifestation as opposed to mental (n = 1, 9% mental; n = 3, 27% both). Patients' estimates of the extent to which they thought of their fatigue as physical averaged 72% (range 40–99). Two-thirds reported that fatigue was different than lack of energy, but patients were unable to offer any consistent distinction between the two concepts. When asked to volunteer symptoms that were not on the list, patients failed to generate any new candidate disease-related symptoms.
The FKSI-DRS showed high internal consistency at the baseline assessment (Cronbach's alpha [α] = 0.78), time 2 assessment (0.75), and time 3 assessment (0.78). Cronbach's alpha at all time points exceeded 0.70, a common minimum standard for internal consistency reliability, which suggests that the FKSI-DRS can be used as an independent measure of disease-related symptoms and functioning. The stability of the FKSI-DRS over time was high, with an intraclass correlation coefficient (ICC) of 0.85 between time 1 and time 2 (range of possible values = 0.0–1.0). Thus, the symptom index shows high test–retest reliability between baseline and 3–7 days post baseline.
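For readers unfamiliar with the statistic, Cronbach's alpha can be computed from item-level responses with the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total score). The sketch below uses made-up responses and a hypothetical function name; it is not the study data or the authors' software.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items scored 0-4.
responses = [
    [4, 3, 4],
    [2, 2, 3],
    [3, 3, 3],
    [1, 0, 2],
    [4, 4, 3],
]
print(round(cronbach_alpha(responses), 2))
```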
Convergent validity. The associations between the FKSI-DRS and the FACT-G and its subscale scores were evaluated using Spearman correlations. Because it captures physical symptoms of disease, the FKSI-DRS was expected to correlate most highly with the PWB subscale of the FACT-G, slightly less with the FWB subscale, and least with the psychosocial (EWB and SWB) subscales. Indeed, this was the case at both time 1 and time 3, with very high correlations between FKSI-DRS and PWB scores (r range = 0.84–0.85) and FWB scores (r range = 0.69–0.71). As would be expected because of its content of disease-related physical symptoms, correlations between FKSI-DRS and EWB and SWB were low to moderate (r range = 0.30–0.52). Because the FKSI-DRS contains two PWB items, the PWB correlations were inflated by redundancy. Nevertheless, when the FKSI-DRS was correlated with an abbreviated PWB score that removed all overlap, correlations remained comparable: the FKSI-DRS correlation with the five-item PWB at both time 1 and time 3 was 0.78. In all cases, including those corrected for overlap, correlations between FKSI-DRS and FACT-G scales were significant at P < 0.0001, with the exception of SWB at time 3, which was significant at P < 0.001.
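A Spearman correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows that calculation on hypothetical scores; the function names are illustrative and the data are not from the study.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical symptom-index and subscale scores for six patients.
index_scores = [30, 25, 28, 18, 33, 22]
subscale_scores = [24, 20, 19, 12, 27, 15]
print(round(spearman(index_scores, subscale_scores), 2))
```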
Discriminant (known-groups) validity. The ECOG PSR was trichotomized into PSR = 0, PSR = 1, and PSR > 1. For the FKSI-DRS, all scores across PSR groups were in the appropriate direction; that is, patients with the lowest PSR (i.e., best performance status) had the highest FKSI-DRS scores (i.e., greater well-being and symptom status), and those with higher PSR had FKSI-DRS scores reflecting poorer well-being and symptom status (Table 3). Based on cross-sectional analyses, the FKSI-DRS differentiated patients grouped by PSR (P < 0.0001). Effect sizes were calculated for group comparisons to provide an indication of the clinical significance of group differences. Following Cohen's guidelines for effect sizes, effect sizes for adjacent PSR groups were moderate to large for the FKSI-DRS (0.69–1.01).
Table 3. Effect sizes of FKSI-DRS: cross-sectional (baseline) scores by ECOG performance status and longitudinal scores by GRCS
| Scale | ECOG PSR or GRCS group | n | Mean (SD) | Common SD | Group comparison | Mean difference | Effect size* | P-value |
| | 1 | 50 | 27.22 (4.16) | | 0–2 | 9.22 | 1.71 | |
| | 2 | 25 | 23.48 (6.02) | | 1–2 | 3.74 | 0.69 | |
| FKSI-DRS change‡ | Worse | 13 | −3.15 (3.18) | 3.9 | Worse vs. same | −3.11 | −0.80 | 0.0024† |
| | Same | 108 | −0.04 (3.72) | | Same vs. better | −2.34 | −0.60 | |
| | Better | 10 | 2.30 (4.74) | | Worse vs. better | −5.45 | −1.40 | |
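The longitudinal effect sizes in Table 3 are consistent with dividing each mean difference by the common SD. The short sketch below reproduces them from the tabulated means and the common SD of 3.9; the function name is hypothetical, and the formula is inferred from the table rather than quoted from the authors.

```python
def effect_size(mean_a, mean_b, common_sd):
    """Standardized difference: (mean_a - mean_b) / common SD."""
    return (mean_a - mean_b) / common_sd


# FKSI-DRS change scores by GRCS group, taken from Table 3.
worse, same, better = -3.15, -0.04, 2.30
common_sd = 3.9

for label, a, b in [
    ("Worse vs. same", worse, same),
    ("Same vs. better", same, better),
    ("Worse vs. better", worse, better),
]:
    print(label, round(effect_size(a, b, common_sd), 2))
```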
Responsiveness to clinical change. Changes in FKSI-DRS scores were calculated by subtracting patients' time 1 scores from their time 3 scores. Patients' GRCS scores were categorized as “worse,” “same,” or “better” by collapsing GRCS domain scores that were rated on a 15-point scale (−7 through 0 to +7). As a result of sample size restrictions at the extremes of change, all gradations of change were collapsed into one “changed” category for worse and one category for better. GRCS scores were categorized as “worse” if they were rated <−1, “same” if rated −1 to +1, and “better” if rated >+1. ANOVA techniques were used to compare mean change scores on the FKSI-DRS between categories of change (better, same, worse) in each anchor variable (GRCS domain). It is important to note that the practice of pooling all globally changed patients into one group will extend the magnitude of change beyond “minimal,” because it includes categories of change that exceed minimal by the patient's own judgment.
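The change-score and anchor-categorization rules described above are simple enough to state as code. The following is a minimal sketch with hypothetical function names; the cut points (<−1, −1 to +1, >+1) and the −7 to +7 range come from the text.

```python
def change_score(time1_score, time3_score):
    """FKSI-DRS change: time 3 score minus time 1 score."""
    return time3_score - time1_score


def categorize_grcs(rating):
    """Collapse a GRCS rating on the -7..+7 scale into worse / same / better."""
    if not -7 <= rating <= 7:
        raise ValueError("GRCS ratings range from -7 to +7")
    if rating < -1:
        return "worse"
    if rating > 1:
        return "better"
    return "same"


# Hypothetical patient: FKSI-DRS went from 29 to 26, GRCS rating of -3.
print(change_score(29, 26), categorize_grcs(-3))
```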
Changes in FKSI-DRS scores were in the anticipated direction, such that patients who rated themselves as worse on the GRCS had worsening scores on the FKSI-DRS (mean [SD] = −3.15 [3.18]), patients who rated themselves as improved on the GRCS tended to have improvements in the FKSI-DRS (2.30 [4.74]), and patients who reported remaining the same on the GRCS tended to have change scores that fell between those of the two other groups (−0.04 [3.72]) (see Table 3). In addition, changes in FKSI-DRS scores differed significantly across the “worse,” “same,” and “better” groups (F(2,128) = 6.34, P = 0.0024).
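As a plausibility check, a one-way ANOVA F statistic can be rebuilt from group sizes, means, and SDs alone. The sketch below applies the standard between/within sums-of-squares formulas to the Table 3 summary values; the function name is hypothetical. Because the inputs are rounded, the result (roughly 6.3 on 2 and 128 degrees of freedom) only approximates the reported 6.34.

```python
def anova_f_from_summary(groups):
    """One-way ANOVA F from (n, mean, sd) tuples; returns (F, df_between, df_within)."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between-group sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean change, SD) for the worse, same, and better groups in Table 3.
grcs_groups = [(13, -3.15, 3.18), (108, -0.04, 3.72), (10, 2.30, 4.74)]
f_stat, df_b, df_w = anova_f_from_summary(grcs_groups)
print(round(f_stat, 2), df_b, df_w)
```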
Estimating important differences. Distribution- and anchor-based methods were used to estimate FKSI score changes that represent “clearly” important differences (CIDs) and that might approximate MIDs. Distribution-based estimates included 1/2 SD, 1/3 SD, and 1 standard error of measurement (SEM). Anchors used were the PSR and the GRCS. Because of the coarse nature of the clinical anchors and the collapsing of changes of any magnitude into one group, we considered them to provide conservative estimates of CIDs rather than MIDs. Table 4 displays the distribution-based estimates of these differences at baseline, at time 3, and for the change from baseline to time 3. The full range of distribution-based MIDs for the FKSI-DRS was 1–3 points, with most estimates in the 2–3 point range.
Table 4. Distribution-based estimates of minimally important differences (MIDs) for FKSI-DRS
| | n | 1/3 SD | 1/2 SD | Criterion SEM | Likely range of MID |
| Time 3 | 131 | 1.76 | 2.65 | 2.07 | |
| Baseline to time 3 change | 131 | 1.30 | 1.95 | | |
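The distribution-based estimates are simple functions of a standard deviation. The sketch below reproduces the 1/3 SD and 1/2 SD values for the change scores from an SD of 3.9 (the common SD in Table 3). For the SEM it uses the conventional formula SD × √(1 − reliability); which reliability coefficient the authors used is not stated here, so the ICC of 0.85 and the time 3 SD of about 5.3 (inferred by doubling the tabulated 1/2 SD) are assumptions, and the result need not match the tabulated 2.07 exactly.

```python
from math import sqrt


def third_sd(sd):
    """One-third of a standard deviation."""
    return sd / 3


def half_sd(sd):
    """One-half of a standard deviation."""
    return sd / 2


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


change_sd = 3.9  # SD of baseline-to-time-3 change scores (Table 3 common SD)
print(round(third_sd(change_sd), 2), round(half_sd(change_sd), 2))

# Assumed inputs: time 3 SD inferred from Table 4, ICC used as the reliability.
print(round(sem(5.3, 0.85), 2))
```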
Cross-sectional (PSR) anchor-based criteria yielded larger estimates, again not necessarily reflecting MIDs. As displayed in Table 3, effect sizes for baseline scores of adjacent PSR groups ranged from 0.69 to 1.01 (approximately 4–5 points) for FKSI-DRS. The GRCS was analyzed as a longitudinal clinical anchor to determine MIDs. The longitudinal anchor-based criteria yielded estimates closer to the distribution-based estimates, with effect sizes for adjacent GRCS groups ranging from 0.60 to 0.80 for FKSI-DRS (approximately 2–3 points).
The effect sizes for the cross-sectional and longitudinal anchor-based CID estimates using PSR for FKSI-DRS are in the moderate to large range. Change scores associated with effect sizes >0.50 (moderate) exceed what would be considered minimal. Reconciling the rather large effect sizes of the anchor-based comparisons with the smaller distribution-based estimates, it is reasonable to suggest 2–3 points as the MID range for the FKSI-DRS.