Metaanalysis of the correlation between radiographic tumor response and patient-reported outcomes


  • David Victorson Ph.D.,

    Corresponding author
    1. Center on Outcomes, Research and Education (CORE) Evanston Northwestern Healthcare and Northwestern University, Evanston, Illinois
    • Center on Outcomes, Research and Education (CORE), 1001 University Place, Suite 100, Evanston, IL 60201
    • Fax: (847) 570-8033

  • Mehul Soni Pharm.D.,

    1. Center on Outcomes, Research and Education (CORE) Evanston Northwestern Healthcare and Northwestern University, Evanston, Illinois
  • David Cella Ph.D.

    1. Center on Outcomes, Research and Education (CORE) Evanston Northwestern Healthcare and Northwestern University, Evanston, Illinois



The primary aim of the current study was to determine whether radiographic tumor response is associated with patient-reported outcomes such as symptom response or health-related quality of life.


A metaanalysis was conducted of 21 available studies from 1995–2003 that provided data sufficient for examining the association between tumor response and patient-reported outcomes, including symptom response and health-related quality of life. A second aim was to examine the influence of possible moderating study variables on effect size variation.


As hypothesized, patient-reported outcome improvement was reported most frequently by patients classified as having a complete or partial response (CR/PR), followed by those with stable disease (SD) and then those with progressive disease (PD). Moderate effect sizes were observed between the CR/PR and SD categories (effect size of 0.35) and between the CR/PR and PD categories (effect size of 0.43). A weak effect size was found between SD and PD (effect size of 0.16), raising concern over the meaningfulness of the SD category. No significant correlations were found between effect size and patient or study characteristics. Significant associations existed between treatment duration and age, study duration, survival, and symptom response rates, especially among PD patients.


Despite significant study-to-study heterogeneity, an important association exists between tumor response and formal measures of change in patient-reported outcomes. Understanding of this relation would be enhanced if future reports included estimates of effect size in patient-reported outcome change by tumor response category. Practical implications, limitations, and directions for future research are provided. Cancer 2006. © 2005 American Cancer Society.

Due largely to the introduction of new therapeutic agents and improved supportive care regimens, cancer survival rates continue to climb.1–3 However, progress has been incremental over several decades, and the great majority of people with metastatic or recurrent cancer will still die from their disease.4, 5 Although a significant number of advanced cancer patients respond to first-line regimens, survival rates remain fairly low, and the main benefit of treatment is largely palliative.2, 6, 7 Nevertheless, it is becoming apparent that treatment may improve quality of life even if it does not lengthen survival.8

In today's treatment milieu, physicians face uncertainty as to whether palliative tumor reduction is worth the cost of the potential negative symptom or quality of life disruptions that often occur secondary to treatment toxicity.9 Health-related quality of life has been studied a great deal with respect to the short- and long-term experiences of cancer patients undergoing various surgical procedures and adjuvant therapies. Health-related quality of life studies concern themselves with the degree to which the usual or expected physical, functional, emotional, and social well being is altered by a medical condition or treatment.10

Although it is generally accepted that patients who respond to treatment live longer than those who do not, this correlation is highly complex.11 A similarly complex positive correlation may exist between palliative tumor reduction and patient-reported outcome improvement; however, to the best of our knowledge, the literature has never been formally reviewed.2, 7, 12, 13 The presumed correlation between tumor response and patient benefit is based on the hypothesis that symptoms are caused primarily by tumor burden. These symptoms would therefore be relieved when a tumor is made smaller with therapy, thereby improving symptom reports and health-related quality of life. A longitudinal pilot study was conducted with 30 consecutive metastatic breast carcinoma patients to evaluate the effects of paclitaxel and recombinant human granulocyte–colony-stimulating factor (rhG-CSF) on several patient-reported outcomes, such as pain relief and social, emotional, and functional well being.7 Findings indicated that nearly all patients who experienced a partial response (PR) or maintained stable disease (SD) status described an improvement in patient-reported outcome scores, whereas those with progressive disease (PD) reported an overall decline.

In a randomized clinical trial examining the relation between tumor response and patient-reported outcomes among postmenopausal patients receiving second-line endocrine treatment after tamoxifen failure (n = 128), complete and partial responders (CR/PR) reported significantly better physical well-being, mood, coping, appetite, and less dizziness than nonresponders (SD).12 In contrast to these studies, no significant group differences were found in patient-reported outcome scores between tumor responders and nonresponders (with the exception of chest pain) in a study involving patients with nonsmall cell lung carcinoma (n = 130) after radical radiotherapy.4 Moreover, the authors reported that a substantial number of nonresponding patients attained symptom palliation and improved health-related quality of life. Finally, in a study examining the relation between patient well-being and disease response among 155 breast carcinoma patients, 61% of responders and 17% of those with PD reported improved well being, whereas 24% of partial responders reported diminished well being.14 Conflicting findings from small sample studies raise important questions regarding the palliative impact of anticancer treatments among responders and nonresponders and suggest the potential benefit of a metaanalysis of the available literature.

The primary objective of the current study was to investigate the association between tumor response and patient-reported outcome improvement. Metaanalytic procedures were used to integrate study outcomes from the available published empiric studies. Metaanalysis entails the systematic quantitative review and synthesis of results from individual studies.15 We hypothesized that the magnitude of the effect of anticancer treatments on patient-reported outcome improvement would be strongest among those with CR/PR, whereas those with SD and PD would demonstrate concomitant effect size decreases. A secondary, exploratory aim was to examine the possible role of moderating variables that may be significantly related to tumor response such as type of cancer, type and duration of treatment, rate and duration of tumor response, rate and duration of patient-reported outcome improvement, age, and gender.


Search Strategy and Inclusion Criteria

A MEDLINE search of studies from 1995–2003 was performed using any of the following terms combined with tumor response, quality of life, symptoms or symptom response: lung cancer, breast cancer, prostate cancer, colorectal cancer, head and neck cancer, bladder cancer, ovarian cancer, uterine cancer, and brain cancer. Our search criteria focused on these cancer sites because of their prevalence and the common availability of tumor response data. All identified abstracts were reviewed. Articles from abstracts that reported sufficient data for effect size estimation were reviewed in their entirety to determine eligibility. In addition, articles were drawn from other literature sources available to the investigators (e.g., reference lists, peer consultations, and online sources). Studies were selected if they provided a tumor-reducing treatment (chemotherapy or radiation) to adult cancer patients, reported data on tumor response, and included at least two patient-reported assessments, one of which was a baseline evaluation.

The search described above resulted in 4374 matches. Abstract review resulted in more than 350 articles that reported sufficient effect size estimation data. In addition, more than 250 articles were drawn from other literature sources available to the investigators (online articles and 12 articles from peer consultation). Of the articles obtained from the MEDLINE search, only 14 provided data sufficient for estimating the effect size of the correlation between tumor response and patients' self-reported quality of life or symptom data. In addition, seven articles that provided sufficient data for effect size estimation between tumor response status and patient-reported outcomes were obtained from additional online search engines available to the investigators. Five study authors were contacted for specific information related to their presented results.

Procedures and Study Coding

Following the recommendations of Cooper and Hedges,16 metaanalytic decision rules were established to work with this varied literature. Eligible studies were coded for publication date, demographics (age, gender), sample size, type of cancer, type and duration of treatment, schedule and method of patient-reported outcome measurement, and tumor response duration. For each study the percentage of patients with patient-reported outcome improvement was converted into a proportion for each tumor response category. In this analysis, patient-reported outcome improvement or worsening was defined as any proportional increase or decrease between tumor response groups. Whenever possible, a total health-related quality of life scale score was used to estimate effect size. If no total score was provided, the primary endpoint of the study was used (e.g., psychologic well-being, pain). Proportions were pooled and averaged in studies that administered multiple quality of life measures, insofar as sufficient information was provided for aggregation (a conservative approach). Studies with multiple experimental treatment phases were coded such that each treatment was included as a separate unit of analysis. However, in one study in which no significant differences were found between treatment groups receiving different doses of the same chemotherapy agent,17 the total reported value was used. In the end, 21 studies were included after all conditions for inclusion were met (Table 1).

Table 1. Study Characteristics
| Study | Cancer type (no.) | Treatment protocol | Tumor size assessment: schedule/method | Tumor size assessment: response criteria | QOL assessment: instrument/schedule | QOL assessment: change criteria |
| --- | --- | --- | --- | --- | --- | --- |
| Byrne et al., 1999 [27] | Advanced mesothelioma (n = 23) | Cisplatin 100 mg/m2 on day 1 + gemcitabine 1000 mg/m2 IV on days 1, 8, and 15 of 28-day cycle x 6 cycles | CT scans at entry, before 2nd, 4th, and 6th cycles | CR = complete disappearance on two occasions at least 4 wks apart; PR = > 50% decrease in product of two diameters at least 4 wks apart; NC = < 50% change in size or < 25% increase in size; PD = > 25% increase in size or new lesions | Serial changes in sxs after chemo | Not provided |
| Cella, 2002 [10] | Advanced NSCLC (n = 599) | Cisplatin 75 mg/m2 IV over 1 hr plus etoposide 100 mg/m2 IV x 3 days VS cisplatin 75 mg/m2 over 1 hr + paclitaxel 135 mg/m2 IV over 24 hrs VS cisplatin 75 mg/m2 over 1 hr + paclitaxel 250 mg/m2 IV over 24 hr + GCSF 5 mcg/kg/day | Not provided | Not provided | FACT-L v. 2 at baseline, 6 weeks, 12 weeks, and 6 months | 1/2 and 1/3 SD change between baseline, 12 weeks, and change scores tested as meaningful changes; SEM in the LCS and TOI at baseline & 12 weeks also tested; 1 SEM drop = “declined”; 1 SEM rise = “improved”; other = “no change” |
| Ellis et al., 1995 [1] | Inoperable NSCLC (n = 120) | Mitomycin-C 8 mg/m2; vinblastine 6 mg/m2; cisplatin 50 mg/m2 - repeated q21 days | Baseline + every treatment cycle | CR = disappearance of all disease x 4 weeks; PR = > 50% decrease in product of length and width for at least 4 wks; SD/NC = < 50% decrease or < 25% increase in size without new lesion/progression; PD = > 25% increase in disease or new lesions | Tumor-related sxs (malaise, pain, cough, dyspnea, and others) | CR = complete disappearance of sxs; PR = good improvement of sxs; NC = minor or no change; PD = worsening of sxs - as compared by the patients' grade change (not clarified further) |
| Frasci et al., 1999 [28] | NSCLC: IIIB/IV (n = 75) | Cisplatin 50 mg/m2 and gemcitabine 1000 mg/m2 and paclitaxel (from 50 mg/m2) on days 1 and 8 q3 weeks | Evaluation of tumor response assessed after three cycles; minimum duration of 4 weeks required | WHO response criteria | Modified 10-item LCSS questionnaire upon Dx, after 3 and 6 cycles, and q3 months until death | A decrease in total sum of modified LCSS |
| Geels et al., 2000 [13] | Metastatic breast carcinoma (n = 300) | IV doxorubicin 40 mg/m2 day 1 q3wks VS doxorubicin 40 mg/m2 day 1 + vinorelbine 20 mg/m2 days 1-8 q3wks | Radiologic studies: baseline, before 4th cycle, and repeated after 2 cycles if warranted | CR = disappearance of all clinical evidence (based on 2 observations at least 4 weeks apart); PR (for measurable disease) = > 50% decrease in sum of products of all measured lesions; PR (for evaluable dz) = similar to above if definitely > 50% estimated decrease; SD = < 50% decrease - < 25% increase; PD = > 25% increase | EORTC-QLQ-C30 & Mammary Cancer Checklist (MCC): baseline, day 1 of cycle 3, then q3wks; Case Report Forms (CRF): baseline + q3wks | Improvement: QoL = at least 1+ from baseline; CRF = at least 1 symptom grade -; Worsening: QoL = at least 1+ (& no -); CRF = at least 1 symptom grade + (& no +); Stable = neither of the above; EORTC = lower score means better QoL; CRF = lower grade means better QoL |
| Hobday et al., 2002 [29] | Metastatic colorectal CA (n = 78) | Q28 days unless progression/toxic: eniluracil 50 mg/d on days 1-7; 5-FU (30 mg/m2/day) on days 2-6 | Physical exam/chest X-ray and/or CT, MRTI, ultrasound imaging; response assessed before 2nd and 3rd cycle and then after every other cycle | “Standard criteria” used to define tumor response/progression; Progression = worsening of tumor-related symptoms, > 5% weight loss, or PS drop of more than 1 level | Uniscale (UNI) and EORTC-QLQ-C30 at study entry, before every other treatment, and at discontinuation | > 10 pt change in EORTC-QLQ-C30 (global, functional, and symptom) or UNI scale score |
| Ilson et al., 1999 [30] | Advanced esophageal carcinoma (n = 38) | Cisplatin 30 mg/m2 + irinotecan 65 mg/m2 over 30 min weekly x 4 weeks of 6wk cycle x 3 cycles if SD or until progression | Repeat radiographic studies performed after the 1st and 2nd tx cycles and then after every 2 tx cycles | CR = disappearance of all clinical evidence (for at least 4 weeks); PR (for measurable dzs) = > 50% decrease in sum of products of all measured lesions; PR (for evaluable dzs) = similar to above if definitely > 50% estimated decrease; SD = < 50% decrease - < 25% increase; PD = > 25% increase | FACT-G, EORTC-QLQ-C30, and dysphagia scale at baseline, after 1st and 2nd cycles, and then every other cycle | Paired t-test - two-sided |
| Kris et al., 2003 [17] | NSCLC (n = 221) | Daily oral gefitinib, either 500 mg (two 250 mg tablets) or 250 mg (1 tablet & 1 matching placebo) dispensed on day 1 of 28-day tx cycle | Repeat radiographic imaging studies 4 & 8 weeks after randomization, then every 8 weeks | Radiographic responses > 50% decrease in lesion size | FACT-L at pretreatment and every 28 days; FACT-L subscale weekly | 2-point (pt) increase in summed score for 4 weeks |
| Langendijk et al., 2000 [6] | Inoperable NSCLC (n = 65) | 3 Gy radiation therapy per fraction (QIW) up to 30 Gy dose | Product of 2 largest diameters measured before and 2-6 weeks after RT using CT or chest radiography | Response = complete disappearance or 50% or more decrease in the diameter product; < 50% reduction = nonresponse | SX: cough/hemoptysis/pain arm-shoulder/pain chest wall/appetite loss measured on 4 pt scale (1 = nil; 2 = mild; 3 = moderate; 4 = severe); Multi-item scales (dyspnea, fatigue) on a 0-100 range (0 = nil; 1-34 = mild; 35-67 = moderate; 68-100 = severe); EORTC-QLQ-C30 version 2 + LC13 module for QoL measurement | QoL: Those w/an increase baseline C30 total score of at least 5 pts (to a min of 40) on two consecutive F/U visits = responders; Others with 61-80 (control) and 81-100 (prevention) maintaining their score over two consecutive F/U visits = responders |
| Langendijk et al., 2001 [4] | NSCLC: Stage IIIa/b (88%) or Stage I/II (12%) with extensive disease (n = 167) | 2.25 Gy QIW (total = 45 Gy) + booster dose PRN | Product of two largest diameters assessed at baseline & 2-6 weeks after RT using CT scan | Response = > 50% reduction of product of two largest diameters; Nonresponse = < 50% reduction | EORTC-QLQ-C30/LC13: baseline, in 4th week of RT, then at 2 weeks, 6 weeks, 3 months, 6 months, and 12 months after RT completion | Single items (cough, hemoptysis, pain arm/shoulder, pain chest wall, appetite loss): 1 = nil, 2 = mild, 3 = moderate, 4 = severe; Multi-item (dyspnea and fatigue) converted: 0 = nil, 1-34 = mild, 35-67 = moderate, 68-100 = severe |
| Middleton et al., 1998 [31] | Mesothelioma (n = 39) | Palliative MVP (mitomycin-C 8 mg/m2 q/6 weeks) chemotherapy | Chest CT scan | CR = disappearance of all known disease for at least 4 wks; PR = > 50% decrease in product of length/width of tumor for at least 4 wks w/o appearance of new lesions or progression of any lesion; SD/NC = < 50% decrease or < 25% increase in tumor size, w/o appearance of new lesions or progression of any lesion > 25% for a min of 4 wks; PD = > 25% increase in lesion size or appearance of new lesions | Tumor-related sxs (malaise, pain, cough, dyspnea, and others) | CR = complete disappearance of sxs; PR = good improvement of sxs; NC = minor or no change; PD = worsening of sxs - as compared by the patients' grade change (not clarified further) |
| Modi et al., 2002 [2] | Refractory metastatic breast carcinoma (n = 59) | 3, 24, or 96 hour infusions of paclitaxel; dose range: 135-250 mg/m2 at 3 week intervals | Every 2 courses of therapy (q 6 wks) | CR = complete disappearance of sxs; PR = > 50% reduction in tumor size; MR = < 50% - > 25% reduction; SD = < 25% decrease and < 25% increase; PD = > 25% increase/or new tumor growth | MSAS-GDI and FACT-B: within 7 days prior to first paclitaxel admin (baseline), then every 2 cycles (or q 6 wks) before paclitaxel | MSAS-GDI: 10% or more change from baseline score; FACT-B: > 0.5 SD = > 9.6 pts change from baseline |
| Osoba et al., 2000 [8] | Anaplastic astrocytoma (n = 162) | TMZ 150 mg/m2/d x 5d (chemo-txed) or 200 mg/m2/d x 5d (chemo-naïve) - repeated q28 days for up to 1 yr | CT or MRI baseline or q2 months | CR = complete disappearance; PR = > 50% reduction; SD = < 50% reduction and < 25% increase; PD = > 25% increase | EORTC-QLQ-C30 + BCM-20 module at baseline and before each subsequent cycle | > 10 pt difference in score at 4 wks = clinically significant change |
| Osoba et al., 2000 [32] | Recurrent glioblastoma multiforme (Phase II n = 109; Phase III n = 89; Phase III n = 90) | Phase II & III: TMZ 200 mg/m2/d for chemo-naïve pts and 150 mg/m2/d for pretreated pts q28 days; Phase III: PCB 125 mg/m2/d for pretreated pts and 150 mg/m2/d for chemo-naïve pts on days 1-28 of 56 day cycles | Tumor status assessed q2 months | Standard criteria used | EORTC-QLQ-C30 and BCM20 at baseline and subsequently before each cycle of chemotherapy (TMZ and PCB arms) | 10 pts or greater change in QoL scores lasting for at least two QoL assessments 4 wks apart |
| Seidman et al., 1995 [7] | Metastatic breast carcinoma (n = 49) | Paclitaxel + GCSF as salvage therapy | Not defined | Not defined | MSAS, FLIC, MHI, BPI, MPAC: baseline + | Not defined |
| Steele et al., 2000 [33] | Malignant pleural mesothelioma (n = 29) | Vinorelbine 30 mg/m2 weekly x 6 injections | Spiral CT scans; sum of up to 5 lesions identified at baseline to compute sum longest diameter (SLD); repeated at 4 weeks | CR = disappearance of all target lesions; PR = > 30% reduction in SLD; PD = > 20% increase in SLD | Rotterdam Symptom Checklist: baseline (before CT), after each CT cycle, and at 1st F/U visit after CT completion | > 20% change from baseline score |
| Tester et al., 1997 [5] | Metastatic NSCLC: Stage IV confirmed (n = 20) | Paclitaxel 200 mg/m2 over 3 hours; repeated q21 days (max 6 cycles) | Baseline + after 2 & 4 weeks of treatment | PR = > 50% disappearance of baseline lesions; SD = no significant change over 6 weeks or more (subjective); PD = > 25% increase/new lesions appeared/or ECOG worsened | FACT-L: baseline + prior to q treatment | Not defined |
| Yung et al., 1999 [34] | Malignant astrocytomas (n = 162) | TMZ 200 mg/m2/d x 5 days to chemo-naïve pts and 150 mg/m2/d x 5 days to pretreated pts q28 days | Not provided | Assessed based on MRI scans and corticosteroid use | EORTC-QLQ-C30 and BCM20 - on Day 1 then at every visit | Not provided |

QOL: quality of life; CT: computed tomography; CR: complete response; PR: partial response; NC: no change; PD: progressive disease; sxs: symptoms; chemo: chemotherapy; NSCLC: nonsmall cell lung carcinoma; FACT-L: Functional Assessment of Cancer Therapy – Lung; SD: stable disease; SEM: standard error of measurement; LCS: Lung Cancer Scale; TOI: trial outcomes index; WHO: World Health Organization; LCSS: Lung Cancer Symptom Scale; Dx: diagnosis; EORTC-QLQ-C30: European Organization for Research and Treatment of Cancer – Quality of Life Questionnaire; MRTI: magnetic resonance temperature imaging; MSAS-GDI: Memorial Symptom Assessment Scale–Global Distress Index; FACT-B: Functional Assessment of Cancer Therapy–Breast; BCM-20: Brain Cancer Module-20; FLIC: Functional Living Index–Cancer; MHI: Mental Health Inventory; BPI: Brief Pain Index; MPAC: Memorial Pain Assessment Card.

Because all studies provided a percentage of tumor responders as an outcome, effect sizes were calculated using the difference between proportions. The META program18 was used to calculate effect sizes with a logit transformation (ln[p1/(1 – p1)] – ln[p2/(1 – p2)], in which “ln” is the natural logarithm) to normalize the distribution. This provides an averaged effect size estimate as a logit difference, which was then transformed into a percentage difference (2*[EXP(logit/2)/(EXP(logit/2)+1)–0.5]) to appreciate the magnitude of proportional difference between groups. This transformed value can be interpreted like a correlation coefficient (e.g., 0.00–0.29 = weak; 0.30–0.69 = moderate; and 0.70–1.00 = strong) (unpublished data). After effect size computation for each study, estimates were pooled and tested for homogeneity (chi-square test) to determine whether the computed effect sizes from individual studies were estimates of true (errorless) population parameters. Because studies varied significantly in methodology and content (unpublished data), results were weighted by the square root of the sample size to provide an optimally weighted estimate.20 Like the logit transformed effect size, the optimally weighted value was also converted from a logit difference to a percentage difference for ease of interpretation. Because the adjustment procedure has the potential to artificially inflate values, it is recommended that both effect size estimates be considered.18
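The two transformations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the META program itself, and the function names are hypothetical. The first check uses the 0.90/0.10 improvement and worsening proportions reported for Byrne et al. in Table 2; the second uses the optimally weighted logit difference of 0.8549 reported for the CR/PR versus SD comparison.

```python
import math


def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


def logit_difference(p1, p2):
    """Effect size as a difference between two logit-transformed proportions."""
    return logit(p1) - logit(p2)


def logit_to_percentage_difference(d):
    """Map a logit difference onto a -1..1 scale: 2*[exp(d/2)/(exp(d/2)+1) - 0.5]."""
    half = math.exp(d / 2.0)
    return 2.0 * (half / (half + 1.0) - 0.5)


# Improvement vs. worsening proportions for one Table 2 row (0.90 / 0.10).
effect = logit_difference(0.90, 0.10)
print(round(effect, 3))  # 4.394

# Optimally weighted logit difference reported for CR/PR vs. SD.
print(round(logit_to_percentage_difference(0.8549), 4))  # 0.2105
```

Algebraically, the percentage-difference transformation is the same as tanh(logit/4), which is why the result is bounded between -1 and 1 and can be read on a correlation-like scale.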

To account for possible selective publication bias (e.g., publication of positive results from small studies and nonpublication of negative results from small studies), the “fail-safe N” was calculated, which increases confidence in the stability of results.21 This statistic is defined as “the number of new, unpublished, or unretrieved nonsignificant or ‘null result’ studies that would be required to exist to lower the significance of a metaanalysis to some specified level—for example, to barely significant or nonsignificant.”22
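One widely used formulation of the fail-safe N is Rosenthal's, which asks how many additional studies averaging Z = 0 would be needed to pull the combined one-tailed result down to P = 0.05. The text does not state which formula was applied, so the sketch below is an assumption; the function name is hypothetical and the Z scores are illustrative values, not data from the included studies.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal-style fail-safe N: (sum of Z)^2 / z_alpha^2 - k.

    z_scores: one standard normal deviate per included study.
    z_alpha:  critical Z for the target significance level
              (1.645 corresponds to one-tailed P = 0.05).
    """
    k = len(z_scores)
    total = sum(z_scores)
    return (total ** 2) / (z_alpha ** 2) - k


# Illustrative (hypothetical) Z scores for five studies.
example = [2.0, 2.0, 2.0, 2.0, 2.0]
print(round(fail_safe_n(example), 1))
```

A larger value means more unpublished null results would be required to overturn the finding, which is the sense in which the statistic "increases confidence in the stability of results."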


Table 2 summarizes individual study proportions, averaged effect sizes, and variances, enabling one to reproduce summary statistics. A positive effect size indicates patient-reported outcome improvement, whereas a negative effect size suggests worsening.

Table 2. Study Proportions, Effect Sizes, and Variances for Individual Tumor Response Categories
| Study | Tumor response category | No. of subjects | Proportion* | Effect size | Variance |
| --- | --- | --- | --- | --- | --- |
| Byrne et al., 1999 [27] | CR/PR | 10 | 0.90/0.10 | 4.394 | 12.34 |
| Cella et al., 2002 [10] | CR/PR | 94 | 0.47/0.53 | −0.2403 | 0.1715 |
| Ellis et al., 1995 [1] | CR/PR | 76 | 0.99/0.01 | 9.190 | 102.35 |
| Frasci et al., 1999 [28] | CR/PR | 38 | 0.71/0.29 | 1.791 | 0.6214 |
| Geels et al., 2000 [13] | CR/PR | 41 | 0.69/0.31 | 1.600 | 0.5266 |
| Hobday et al., 2002 [29] | CR/PR | 20 | 0.80/0.20 | 2.773 | 1.953 |
| Ilson et al., 1999 [30] | CR/PR | 27 | 0.96/0.04 | 6.356 | 27.04 |
| Kris et al., 2003 [17] | CR/PR | 21 | 0.96/0.04 | 6.356 | 27.34 |
| Langendijk et al., 2000 [6] | CR/PR | 8 | 0.75/0.25 | 2.197 | 3.555 |
| Langendijk et al., 2001 [4] | CR/PR | 143 | 0.54/0.46 | 0.3207 | 0.1133 |
| Middleton et al., 1998 [31] | CR/PR | 8 | 0.99/0.01 | 9.191 | 115.44 |
| Modi et al., 2002 [2] | CR/PR | 23 | 0.61/0.39 | 0.8946 | 0.7673 |
| Osoba et al., 2000 [8] | CR/PR | 19 | 0.53/0.47 | 0.2403 | 0.8475 |
| Osoba et al., 2000 [8] | CR/PR | 22 | 0.95/0.05 | 5.888 | 22.05 |
| Osoba et al., 2000 [8] | CR/PR | 13 | 0.54/0.46 | 0.3207 | 1.246 |
| Osoba et al., 2000 [32] | CR/PR | 23 | 0.96/0.04 | 6.356 | 27.23 |
| Ramirez et al., 1998 [14] | CR/PR | 44 | 0.82/0.18 | 3.033 | 1.035 |
| Seidman et al., 1995 [7] | CR/PR | 18 | 0.94/0.06 | 5.696 | 22.09 |
| Steele et al., 2000 [33] | CR/PR | 7 | 0.86/0.14 | 3.631 | 9.689 |
| Tester et al., 1997 [5] | CR/PR | 6 | 0.16/0.84 | −3.244 | 8.857 |
| Yung et al., 1999 [34] | CR/PR | 51 | 0.69/0.31 | 1.600 | 0.4258 |

CR: complete response; PR: partial response; SD: stable disease; PD: progressive disease.

* Numerator is the proportion of improvement and the denominator is the proportion of worsening.
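As a rough illustration of weighting by the square root of sample size, the sketch below averages three CR/PR logit effect sizes from Table 2 (Cella et al., Langendijk et al. 2001, and Yung et al.) with and without weights. The exact weighting formula used by the META program is not given in the text, so the normalized weighted mean here is an assumption, and the function name is hypothetical.

```python
import math


def sqrt_n_weighted_mean(effects, sample_sizes):
    """Average effect sizes using sqrt(n) as the weight for each study."""
    weights = [math.sqrt(n) for n in sample_sizes]
    weighted_sum = sum(w * d for w, d in zip(weights, effects))
    return weighted_sum / sum(weights)


# Effect sizes and subject counts copied from three Table 2 rows.
effects = [-0.2403, 0.3207, 1.600]
sample_sizes = [94, 143, 51]

unweighted = sum(effects) / len(effects)
weighted = sqrt_n_weighted_mean(effects, sample_sizes)

print(f"unweighted mean:       {unweighted:.3f}")
print(f"sqrt(n)-weighted mean: {weighted:.3f}")
```

In this subset the largest effect comes from the smallest study, so the weighted mean is lower than the simple average. That is the kind of sample-size adjustment the weighted estimates are meant to capture.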

Study Characteristics

The studies were published from 1995–2003, mostly in oncology specialty journals. Sample sizes ranged from 20–599 participants; the average sample size was 126. The average treatment duration was just over 9 months (±9.4 mos), whereas the average study duration was just over 4 months (±3.2 mos). In this analysis, the study duration refers to the assessment time period in which quality of life data were collected. Observed study durations were similar for CR/PR, SD, and PD groups. The average objective tumor response rate was 41% (±17%) and the average tumor response duration lasted just over 6 months (±3.5 mos). The average patient-reported outcome duration was nearly 4 months (±1.3 mos), and the average survival was 9.8 months (±3.6 mos).

The average patient-reported outcome improvement rate across tumor response groups was 64% (75% in the CR/PR group, 64% in the SD group, and 40% in the PD group). Although these percentages suggest a high rate of patient-reported outcome improvement (especially among nonresponders), it is important to consider that these preliminary calculations made no adjustment for sample size differences, which varied widely. Furthermore, PD patient-reported outcome rates may be artificially inflated given that 29% of studies grouped SD and PD categories together. For these reasons, it is valuable to examine the transformed and weighted values because they reflect more accurate estimations of patient-reported outcome improvement.

The 21 studies included information from 2629 adult cancer patients. The average median age was 55.7 years (±7.4 yrs), with 51% males and 49% females (Table 3). The type of malignancy reported included breast carcinoma (19%), nonsmall cell lung carcinoma (33%), pleural mesothelioma (14%), colorectal carcinoma (5%), glioblastoma (14%), anaplastic astrocytoma/oligoastrocytoma (10%), and esophageal carcinoma (5%). Radiotherapy was used in two studies. Anticancer drugs included single and combined regimens of paclitaxel, vinorelbine, doxorubicin, eniluracil, fluorouracil (5-FU), temozolomide, procarbazine, cisplatin, gemcitabine, mitomycin-C, vinblastine, epirubicin, doxorubicin, iododoxorubicin and cyclophosphamide, methotrexate, and gefitinib.

Patient-reported outcome measures included the Memorial Symptom Assessment Scale-Global Distress Index (MSAS-GDI), European Organization for Research and Treatment of Cancer Quality of Life Questionnaire Core 30 (EORTC QLQ-C30), Rotterdam Symptom Checklist (Psychologic subscale), CRF Pain Index, Functional Living Index – Cancer (FLIC), Lung Cancer Symptom Scale (LCSS), Functional Assessment of Cancer Therapy-General (FACT-G), Functional Assessment of Cancer Therapy – Lung (FACT-L), and tumor-related symptoms (e.g., cough, appetite loss, hemoptysis, dyspnea).

Table 3. Pooled Means, Standard Deviations, and Ranges for Demographic and Clinical Variables across 21 Studies (Overall n = 2629)
| Demographic and clinical variables | Mean (standard deviation) | Range | No. of studies |
| --- | --- | --- | --- |
| Age in yrs | 55.7 (7.37) | 42–68 | 20 |
| Male | 51% | | 21 |
| Treatment duration in mos | 9.1 (9.4) | 1–24 | 19 |
| Study duration in mos | 4.3 (3.1) | 1.5–17.5 | 20 |
| TR rate | 41% (17%) | 16–79% | 21 |
| TR duration in mos | 6.2 (3.6) | 2.5–17.5 | 15 |
| SR rate | 64% (20%) | 38–96% | 21 |
| SR rate: CR/PR | 75% (22%) | 16–99% | 21 |
| SR rate: SD | 64% (22%) | 23–93% | 12 |
| SR rate: PD | 40% (31%) | 6–93% | 18 |
| SR duration in mos | 3.8 (1.3) | 1.5–5.6 | 9 |
| Survival in mos | 9.8 (3.6) | 5–15 | 11 |

TR: tumor response; SR: symptom response; CR/PR: complete/partial response; SD: stable disease; PD: progressive disease.

Pooled means were obtained by computing the average median values for each of the 21 studies.

Patient-Reported Outcome Differences across Tumor Groups

The first set of analyses computed effect sizes to assess the magnitude of symptom response differences between tumor response categories in patients reporting patient-reported outcome improvement (Table 4). In the first analysis, summary statistics were computed for the difference between the CR/PR and SD groups (n = 778). The transformed effect size was 0.3473, a moderate value signifying a 35% difference between groups. After adjusting for sample size, the optimally weighted effect size estimate suggested a 21% difference (logit difference = 0.8549; standard deviation [SD] = 1.983; standard error [SE] = 0.4022), indicating moderate variation between estimates. The logit transformed effect size value approached significance (t = 2.09; P = 0.06) and the fail-safe N was 67. The test of homogeneity indicated significant differences between studies (chi-square = 47.72; P < 0.001), suggesting that effect size differences are likely due to factors outside of sampling error.

Table 4. Tumor Response Category Effect Size Summary Statistics (Improvers Only)
| Tumor response category (Group 1 vs. Group 2) | No. of studies | No. of subjects | Averaged transformed effect size(a) (optimally weighted transformed ES(b)) | P value(c) | Fail-safe N |
| --- | --- | --- | --- | --- | --- |
| CR/PR vs. SD | 12 | 778 | 0.3473 (0.8549) | 0.06 | 67 |
| CR/PR vs. PD | 18 | 629 | 0.4335 (0.2288) | 0.00 | 67 |
| SD vs. PD | 9 | 554 | 0.1623 (0.6276) | 0.21 | 15 |

CR/PR: complete/partial response; SD: stable disease; PD: progressive disease.

(a) Magnitude of the difference between groups.

(b) Magnitude of the difference between groups weighted by square root of sample size.

(c) Two-tailed Student t test of unweighted estimate.

The next analysis compared CR/PR and PD groups (n = 629). The transformed effect size was 0.4335, a moderate value signifying a 43% difference between groups. The optimally weighted effect size estimate suggested a 23% difference (logit difference = 0.9319; SD = 1.282; SE = 0.5305), indicating moderate variability between estimates. The logit transformed effect size value was significant (t = 4.24; P < 0.001) with a fail-safe N of 67. The test of homogeneity indicated significant differences between studies (chi-square = 36.15; P < 0.01), again suggesting that effect size differences are likely due to factors outside of sampling error.

The final analysis examined differences between SD and PD groups (n = 554). The transformed effect size was 0.1623, a small value resulting in a 16% difference between groups. The optimally weighted effect size estimate also suggested a 16% difference (logit difference = 0.6276; SD = 1.249; SE = 0.3764), indicating uniformity between estimate approaches. The logit transformed effect size estimate was not significant (t = 1.195; P > 0.05) and the fail-safe N was 15. The test of homogeneity indicated significant differences between studies (chi-square = 28.21; P = 0.0004), suggesting that effect size differences are likely due to factors outside of sampling error.
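The homogeneity tests reported above are chi-square statistics. A common way to obtain such a statistic is Cochran's Q, which sums the inverse-variance-weighted squared deviations of each study's effect size from the pooled estimate and is compared against a chi-square distribution with k − 1 degrees of freedom. Whether the META program computes exactly this form is not stated in the text, so the sketch below is an assumption; the function name and the input values are hypothetical.

```python
def cochran_q(effects, variances):
    """Return (Q, pooled_effect, degrees_of_freedom) for a set of studies.

    Each study is weighted by the inverse of its variance; Q is the
    weighted sum of squared deviations from the weighted mean effect.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return q, pooled, len(effects) - 1


# Hypothetical effect sizes (logit differences) and variances for four studies.
effects = [0.2, 0.9, 1.6, 0.4]
variances = [0.15, 0.50, 0.45, 0.10]

q, pooled, df = cochran_q(effects, variances)
print(f"Q = {q:.2f} on {df} df; pooled effect = {pooled:.3f}")
```

A Q value well above its degrees of freedom indicates more between-study variation than sampling error alone would produce, which is the interpretation given to the significant chi-square values in this section.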

Analysis of Moderator Effects

Because effect sizes were not homogeneous, it is reasonable to examine the effects of external correlates.17 Pearson correlation coefficients were calculated between all continuous moderating variables. No statistically significant relations were observed between study or demographic variables and CR/PR or SD tumor response categories. However, the median treatment duration was found to be significantly associated with several demographic and study variables, including the pooled symptom response rates of all tumor response categories (correlation coefficient [r](19) = −0.520; P < 0.05), the average symptom response rate of the PD category (r(16) = −0.660; P < 0.01), and age (r(18) = −0.602; P < 0.01). Furthermore, the median treatment duration was correlated with the median study duration (r(18) = 0.670; P < 0.01) and the median survival (r(11) = 0.625; P < 0.05). These significant variables were entered into separate simple regression analyses with median treatment duration as the dependent variable, given its level of correlation with other variables. The results indicated that each independent variable explained a statistically significant proportion of the variability in median treatment duration (Table 5). Overall, longer treatment corresponded with longer study participation, increased survival, and less symptom response (especially among PD patients). Even though younger patients tended to have longer treatments, age and symptom response were found to be statistically unrelated.
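The significance of each correlation can be checked by converting r to a t statistic. The sketch below assumes that the number in parentheses after r is the number of studies contributing to the correlation, which is what the residual degrees of freedom in Table 5 suggest; under that assumption the test has n − 2 degrees of freedom. The function name is hypothetical.

```python
import math

def t_from_r(r, n):
    """t statistic for a Pearson correlation r based on n paired observations."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))

# Correlations with median treatment duration, as reported in the text;
# the second value is assumed to be the number of studies.
reported = {
    "overall symptom response rate": (-0.520, 19),
    "PD symptom response rate": (-0.660, 16),
    "age": (-0.602, 18),
    "median survival": (0.625, 11),
}

t_values = {name: t_from_r(r, n) for name, (r, n) in reported.items()}
```

For age, this gives a t of about −3.02 on 16 degrees of freedom, which is consistent with the reported P < 0.01.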

Table 5. Significant Moderating Variables from Simple Regression Analyses

Dependent variable: median treatment duration

| Independent variable | Adjusted R2 | F (p) | df | B (p) |
| --- | --- | --- | --- | --- |
| Median age | 0.32 | 5.8 (0.04) | 1,16 | −0.602 (0.008) |
| Median study duration | 0.41 | 13 (0.002) | 1,16 | −0.660 (0.005) |
| Overall SR rate | 0.23 | 6.3 (0.02) | 1,17 | −0.520 (.0023) |
| PD SR rate | 0.40 | 10.8 (0.005) | 1,14 | −0.660 (0.005) |
| Median survival | 0.32 | 5.8 (0.04) | 1,9 | 0.625 (0.04) |

R2: R-squared; F: F-test; p: significance level; SR: symptom response; PD: progressive disease; df: degrees of freedom; B: beta weight.
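In a regression with a single predictor, the standardized beta weight equals the Pearson correlation, so the adjusted R-squared can be recovered from B and the residual degrees of freedom. The sketch below assumes that the B column reports standardized coefficients; the function name is hypothetical.

```python
def adjusted_r2_simple(beta, residual_df):
    """Adjusted R-squared for a one-predictor regression.

    beta is the standardized coefficient (equal to Pearson r here);
    residual_df is the second df value, so n = residual_df + 2.
    """
    n = residual_df + 2
    r2 = beta * beta
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)

# Selected rows of Table 5: (B, residual df)
rows = {
    "Median age": (-0.602, 16),
    "Overall SR rate": (-0.520, 17),
    "PD SR rate": (-0.660, 14),
    "Median survival": (0.625, 9),
}

recovered = {name: round(adjusted_r2_simple(b, df), 2) for name, (b, df) in rows.items()}
```

For these four rows the recovered values round to 0.32, 0.23, 0.40, and 0.32, matching the Adjusted R2 column.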


The results of the current study indicate that objective tumor response is indeed significantly associated with subjective symptom response in oncology. Compared with patients whose best overall response was SD or PD, patients with objective tumor responses were more likely to experience health-related quality of life or symptomatic benefit as measured by various patient-reported questionnaires. Meaningful effect size differences in patient-reported outcome benefit rates were not noted between patients with SD and those with PD. Others23–26 have expressed concern over the reliability and meaningfulness of a ‘stable disease’ category in oncology outcome measurement, and these data may be consistent with that concern. Symptom response was observed with some frequency in patients whose disease remained stable or progressed over the course of study follow-up, but in both cases it was less frequent than that observed with CR/PR.

Although the objective and subjective response data are clearly related, they are by no means redundant information. It is interesting to note that patient-reported outcome response duration (mean = 3.8 mos) is shorter than tumor response duration (mean = 6.4 mos), suggesting that the patient might detect a worsening condition before oncologists can detect tumor growth. If so, this suggests that patient-reported outcomes are justifiably as informative as other biomedical endpoints and can result in earlier treatment changes. Both types of data are useful, in complementary ways. Overall, these findings suggest an ordered, monotonically decreasing relation between tumor response categories and patient-reported outcome improvement, with CR/PR achieving the strongest effects and PD attaining the weakest. Continued research is needed to examine the beneficial nature of anticancer therapies across tumor response groups. Although it is apparent that patients whose tumors respond to therapy are more likely to experience symptom response than those whose tumors do not, many patients whose tumors do not respond to therapy nonetheless experience a symptom response. One possible explanation for this is a placebo effect: these patients feel better and report fewer symptoms simply because they are under active treatment. It is also unknown how many of these ‘stable disease’ patients might actually have experienced a ‘minor response’ to treatment. Because the definition of CR/PR depends on the degree of tumor shrinkage, a patient with SD might have tumor shrinkage of 40–45% and still not qualify as a CR/PR. The objective (tumor response) and subjective (symptom response) classifications, although related, each contribute unique information; therefore, one cannot be substituted for the other.
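The point about a 40–45% shrinkage still counting as SD can be made concrete with a threshold rule. The pooled studies did not all define response categories the same way, so the sketch below is only an illustration that assumes World Health Organization style cutoffs (at least a 50% decrease for a partial response, at least a 25% increase for progression). It ignores new lesions and confirmation requirements, and the function name is hypothetical.

```python
def who_style_category(percent_change):
    """Classify best response from percent change in tumor size.

    percent_change is relative to baseline; negative values mean shrinkage.
    Thresholds are assumed WHO-style cutoffs, simplified for illustration.
    """
    if percent_change <= -100:
        return "CR"   # complete disappearance of measurable disease
    if percent_change <= -50:
        return "PR"   # at least 50% decrease
    if percent_change >= 25:
        return "PD"   # at least 25% increase
    return "SD"       # everything in between

# A 45% shrinkage falls short of the partial response cutoff
category_45 = who_style_category(-45)
```

Under these cutoffs the SD category spans everything from a 49% decrease to a 24% increase, which is one reason a symptom response among SD patients is not surprising.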

One of the benefits of metaanalysis is to offer a rigorous quantitative assessment of available empiric data from several small studies.20 Consistent with our hypothesis of a relation between patient-reported outcome improvement and objective tumor response, we found that anticancer treatment had the strongest patient-reported outcome-enhancing effects for patients who achieved a CR or PR, followed by those who achieved SD as their best overall response. The pooling of available studies for this analysis demonstrated that patients with PD as their best overall response reported some symptom relief, which is consistent with other studies that have reported patient-reported outcome gains among PD patients. These findings were supported by moderate observed effect size differences between CR/PR and SD and between CR/PR and PD, and by a weak difference between the SD and PD categories. Possible sources of error variance include moderating variables such as age, treatment duration, study duration, survival, and symptom response rate. In most cases, study and treatment duration were closely related because patients continued therapy until death or until they showed significantly diminished quality of life. The positive relation between treatment duration and survival is also consistent with the intuition that the longer one is treated, the longer one will live. Age was found to be negatively related to treatment duration, highlighting the unfavorable effect of advancing age on participation in clinical trials. The negative association between symptom response rates and treatment duration, especially among PD patients, suggests that as tumor burden increases, symptom response decreases.

Several limitations of this study should be considered. Because this metaanalysis utilized a fixed-effects modeling approach with a predominantly heterogeneous sample of cancer types, treatments, and phases, our findings are generalizable only to similar studies. The greatest limitation of the current study most likely is that this literature is not adequately oriented to the research question at hand. It is troublesome that only 4% of possible studies reported sufficient information; this suggests that a large body of work is not represented in the effect size calculations. Subjective symptom response is usually not a primary outcome, and studies with positive subjective symptom responses are more likely to report subjective symptom metrics than studies with negative outcomes. Although we utilized the fail-safe N to address this potential source of bias, it is very likely that there are other studies that were not considered. The small number of studies also placed limitations on the types of secondary analyses we were able to conduct with moderating variables, such as controlling for cancer type or adverse treatment effects resulting from toxicity differences. Even articles that reported satisfactory information varied significantly, with unequal and small group sizes and different ways of defining tumor response categories (e.g., some studies reported three distinct categories, whereas others grouped subjects as either responders or nonresponders, often pooling SD and PD groups). Several different quality of life measures were used, and in some cases more than one instrument was used in the same study. Although most studies provided a total scale score, a few studies used specific subscales or symptoms. Also, there was inconsistency in how quality of life improvement was defined (e.g., some studies used a 10-point increase from baseline to follow-up, whereas other studies did not include improvement criteria). Finally, there was considerable variability across different cancer types, treatments, and treatment duration.

Future Research

Studying the association between tumor response and patient-reported outcomes is important because more and more individuals are living with cancer. Metaanalysis methods provide a powerful way to analyze data from many different small studies that independently show varying results. There is significant heterogeneity across the studies used in this analysis, which makes the conclusions difficult to generalize. However, to our knowledge, there are few published studies to choose from that report this type of information for each study patient. It will be easier for outside evaluators to compute metaanalytic summary statistics and establish a base of normative data if future reports provide sociodemographic information (e.g., age, gender, ethnicity) and meaningful baseline/follow-up patient-reported outcome data (means, SDs, SEs, effect sizes, proportions, etc.) by tumor response classification. This would advance our knowledge regarding this important relation and facilitate the comparison of single study results to averaged norms, which would further establish conditions under which radiographic tumor response has value in patients' lives.


The authors thank Amy Peterman, Ph.D., Jennifer Beaumont, M.S., and David Kenny, Ph.D., for editorial and statistical input.