### Abstract

- Abstract.
- Background
- Methods
- Results
- New trials identified in 2003
- Trials with binary outcomes.
- Trials with continuous outcomes.
- All trials
- Trials with binary outcomes.
- Trials with continuous outcomes.
- Subgroup analyses.
- Trials without extracted outcome data.
- Clinical conditions with a statistically significant effect of placebo
- Pain.
- Phobia.
- Discussion
- Conflicts of interest
- Acknowledgements
- References
- Appendix

**Background.** It is widely believed that placebo interventions induce powerful effects. We could not confirm this in a systematic review of 114 randomized trials that compared placebo-treated with untreated patients.

**Aim.** To study whether a new sample of trials would reproduce our earlier findings, and to update the review.

**Methods.** Systematic review of trials that were published since our last search (or not previously identified), and of all available trials.

**Results.** Data were available for 42 of 52 new trials (3212 patients). The results were similar to our previous findings. The updated review summarizes data from 156 trials (11 737 patients). We found no statistically significant pooled effect in 38 trials with binary outcomes, relative risk 0.95 (95% confidence interval 0.89–1.01). The effect on continuous outcomes decreased with increasing sample size, and there was considerable variation in effect also between large trials; the effect estimates should therefore be interpreted cautiously. If this bias is disregarded, the pooled standardized mean difference in 118 trials with continuous outcomes was −0.24 (−0.31 to −0.17). For trials with patient-reported outcomes the effect was −0.30 (−0.38 to −0.21), but only −0.10 (−0.20 to 0.01) for trials with observer-reported outcomes. Of 10 clinical conditions investigated in three trials or more, placebo had a statistically significant pooled effect only on pain and phobia, both measured on continuous scales.

**Conclusion.** We found no evidence of a generally large effect of placebo interventions. A possible small effect on patient-reported continuous outcomes, especially pain, could not be clearly distinguished from bias.

### Background

Within a few years in the 1950s it became a common conception that the effects of placebo interventions were large, and that numerous randomized trials had reliably documented these effects in a wide range of clinical conditions. To a considerable extent this prevailing opinion was caused by Beecher's paper ‘The Powerful Placebo’ [1]. However, in 1997, Kienle and Kiene showed that Beecher's influential paper was flawed [2]. Beecher, and the vast majority of placebo investigators, had not compared patients randomized to a placebo-treated group with patients randomized to an untreated group. Instead, the effect had been estimated as the uncontrolled before–after difference in the placebo group of a randomized trial, which fails to distinguish the effect of placebo from spontaneous remission and other factors [3].

Randomized trials comparing placebo and no-treatment groups were considered to be very rare [4], but despite the lack of reliable evidence many commentators continued to believe in dramatic effects of placebos [5]. This opinion was challenged in 2001 when we published a systematic review of 114 randomized clinical trials that had compared placebo-treated with untreated patients [6]. We found no evidence that placebo interventions in general had clinically important effects, and a possible effect on patient-reported continuous outcomes, for example pain, could not be clearly distinguished from bias. This result surprised us and others, attracted considerable public attention, and was described as ‘a challenge to core beliefs’ [7].

In the light of the unexpected findings, it is important to explore whether the results can be reproduced in subsequently published trials. The inclusion of new trials would also provide more data for assessments of the effect of placebos on specific clinical conditions. We have therefore updated our review with trials identified since our literature search in 1999.

### Methods

Our aims were to: (i) review new trials with placebo and no-treatment groups and compare their results with those from the first version of the review, and (ii) update the review by including the new trials and reconducting all analyses.

We wished to study whether there was any tendency of a general effect of placebo interventions (across various health conditions), to investigate the effects on specific health conditions, and to assess whether the effect differed for patient-reported outcomes (for example pain) and observer-reported outcomes (for example hypertension).

We searched for trials published from 1999 to 2002, and also included trials published before 1999 that had not been identified previously. We repeated the methods of the earlier versions of the review [6, 8], except for the use of the *I*^{2}-statistic in the updated version.

We pragmatically defined a placebo intervention as any intervention that was clearly labelled a placebo in a trial report. Early in 2003 we searched The Cochrane Library, Medline, Embase, PsycINFO and Biological Abstracts for randomized clinical trials with a placebo group and a no-treatment group. Trials were excluded if it was clear that: (i) allocation of patients was conducted without concealment, for example by day of month; or (ii) the person who assessed observer-reported outcomes was aware of group assignments; or (iii) the dropout rate exceeded 50%.

One outcome per trial was extracted for the main analyses, preferably the primary outcome of the trial report. For binary outcomes we calculated the relative risk (a value below 1 indicates a positive effect of the placebo intervention). For continuous outcomes we calculated the standardized mean difference (a negative value indicates a positive effect of the placebo intervention). We calculated the pooled relative risks, and the pooled standardized mean differences, with random effects models as suggested by DerSimonian and Laird [9]. A random effects model yields a pooled effect that is essentially a weighted mean of the effects found in the individual trials, while also incorporating the variation in effect between the trials.
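The effect measures and the pooling step described above can be sketched as follows. This is a minimal illustration, not the review's actual software: the function names are hypothetical, the standardized mean difference is computed in its simple Cohen's *d* form without a small-sample correction (an assumption), and the trial numbers in the usage lines are invented.

```python
import math

def log_relative_risk(events_p, n_p, events_c, n_c):
    """Log relative risk (placebo vs. no treatment) and its approximate variance."""
    rr = (events_p / n_p) / (events_c / n_c)
    var = 1 / events_p - 1 / n_p + 1 / events_c - 1 / n_c
    return math.log(rr), var

def standardized_mean_difference(mean_p, sd_p, n_p, mean_c, sd_c, n_c):
    """Cohen's d (no small-sample correction, an assumption) and its approximate variance."""
    pooled_sd = math.sqrt(((n_p - 1) * sd_p**2 + (n_c - 1) * sd_c**2) / (n_p + n_c - 2))
    d = (mean_p - mean_c) / pooled_sd
    var = (n_p + n_c) / (n_p * n_c) + d**2 / (2 * (n_p + n_c))
    return d, var

def dersimonian_laird(effects, variances, z=1.96):
    """Random effects pooled estimate using the DerSimonian-Laird moment estimator."""
    w = [1 / v for v in variances]                      # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                       # between-trial variance
    w_re = [1 / (v + tau2) for v in variances]          # weights now include tau2
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - z * se, pooled + z * se), tau2

# Invented example: three trials with continuous outcomes.
trials = [
    standardized_mean_difference(4.1, 2.0, 30, 4.8, 2.1, 30),
    standardized_mean_difference(3.9, 1.8, 60, 4.2, 1.9, 58),
    standardized_mean_difference(5.0, 2.5, 20, 6.1, 2.4, 22),
]
pooled, ci, tau2 = dersimonian_laird([d for d, _ in trials], [v for _, v in trials])
```

Relative risks would be pooled on the log scale with the same function and then exponentiated.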

The different results reported in various trials can be a result of random variation or of true differences in effect, so-called heterogeneity. We examined heterogeneity by calculating DerSimonian and Laird's Q statistic [9], and the *I*^{2}-statistic [10]. Both were compared with a chi-square distribution with degrees of freedom equal to the number of trials minus one. We used the Q statistic for testing the presence of heterogeneity, and the *I*^{2}-statistic for estimating the degree of heterogeneity. The *I*^{2}-statistic can be interpreted as the proportion of the observed discrepancy in the estimation of effect, within a group of trials, which cannot be accounted for by random variation [10]. All results are reported with 95% confidence intervals and all *P*-values are two-tailed.
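As a rough illustration of how the two heterogeneity statistics relate, the sketch below computes Q from inverse-variance weights and derives *I*^{2} from it using the commonly used definition (Q − df)/Q, truncated at zero. The function name and the example numbers are assumptions for illustration; the *P*-value step (referring Q to a chi-square distribution with df degrees of freedom) is left out to keep the example dependency-free.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared in percent."""
    w = [1 / v for v in variances]                      # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I-squared: share of total variation not attributable to chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Invented example: two trials with clearly different effects.
q, df, i2 = heterogeneity([0.0, 1.0], [0.25, 0.25])
```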

We calculated the pooled effects of placebo overall for trials with binary outcomes and for trials with continuous outcomes. We also calculated the pooled effect on separate clinical conditions when they had been studied in three trials or more, and the pooled effect of trials with patient-reported and observer-reported outcomes. For each trial we plotted the effect by the inverse of its standard error. Asymmetry in such ‘funnel plots’ reflects that the effects of individual studies decrease with increasing sample size. The degree of funnel plot asymmetry was assessed both visually, and formally by a linear regression analysis [11].
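The formal assessment of funnel plot asymmetry can be sketched as a regression of each trial's standardized effect (effect divided by its standard error) on its precision (the inverse of the standard error); an intercept far from zero suggests that small trials report systematically different effects. This is a minimal sketch under the assumption that the analysis follows the usual form of this regression; the function name and data are hypothetical, and the significance test of the intercept is omitted.

```python
def asymmetry_regression(effects, standard_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE); returns (intercept, slope)."""
    x = [1 / se for se in standard_errors]                    # precision
    y = [e / se for e, se in zip(effects, standard_errors)]   # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented example: the smaller trials (larger SE) show larger beneficial effects.
intercept, slope = asymmetry_regression([-0.8, -0.5, -0.2, -0.1], [0.4, 0.3, 0.1, 0.05])
```

With no small-trial effect, every trial estimates the same underlying effect, the points fall on a line through the origin, and the intercept is close to zero.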

We conducted 10 preplanned comparisons of the results obtained in two or more subgroups of trials to explore whether the effect of placebo was related to type of intervention (physical, psychological or pharmacological), outcome (various subgroups of patient-reported and observer-reported outcomes), or aspects of methodological quality (concealment of allocation, blinding of treatment provider, blinding of outcome evaluator, dropout rates 15% or lower, lack of co-intervention, clearly stated primary outcome, clearly stated aim of studying effects of placebo, and non-Gaussian distributions). We conducted two unplanned comparisons. In the first we explored the effect in trials with sample sizes of 50 patients or more. In the second we explored the effect in trials that had clearly adequate concealment, a dropout rate of 15% or lower, and a sample size of 50 patients or more.

### New trials identified in 2003

Of 250 potentially eligible trial reports, we excluded 128 that addressed nonclinical or nonrandomized studies, 53 that did not compare a placebo group with a no-treatment group, six duplicate publications, and a further 11 trials for other reasons, for example dropout rates over 50%. Of the remaining 52 trials, 40 had been published after 1998. We were unable to extract relevant outcome data in nine trials, and one trial investigated adverse effects. The analyses were therefore based on 42 trials with 3212 patients (counting only patients in the placebo and no-treatment groups). There were six trials with binary outcomes (489 patients) and 36 trials with continuous outcomes (2723 patients).

A description of the individual trials and their results can be found in the Appendix or in the forthcoming Cochrane version of this review [12]. The trials investigated 14 clinical conditions: depression, insomnia, pain, nausea, phobia, smoking, vitiligo, hypertension, obesity, jet lag, secondary erectile dysfunction, dry eye, patient involvement in adolescent diabetic care and difficulty of colonoscopy.

### All trials

As there were no statistically significant differences between the pooled results of the previously analysed 114 trials and the newly included 42 trials (data not shown), we present the combined results in the following.

We included 182 trials; data could be extracted from 156 trials (11 737 patients, which is 38% more than in our previous review). There were 38 trials (4284 patients) with binary outcomes and 118 trials (7453 patients) with continuous outcomes.

A description of the individual trials can be found in the forthcoming updated Cochrane version of this review [12]. The trials investigated 46 clinical conditions: depression, insomnia, pain, nausea, phobia, smoking, vitiligo, hypertension, obesity, jet lag, secondary erectile dysfunction, dry eye, patient involvement in adolescent diabetic care, difficulty of colonoscopy, alcohol abuse, Alzheimer's disease, anaemia, anxiety, asthma, attention-deficit hyperactivity disorder, bacterial infections, benign prostatic hyperplasia, carpal tunnel syndrome, common cold, compulsive nail biting, enuresis, epilepsy, faecal soiling, herpes simplex infection, hypercholesterolaemia, hyperglycaemia, ileus, infertility, insufficient cervical dilatation, labour, marital discord, menopause, mental handicap, orgasmic difficulties, Parkinson's disease, poor oral hygiene, Raynaud's disease, schizophrenia, sea sickness, stress related to dental treatment and undiagnosed ailments.

### Trials with continuous outcomes

The funnel plot was asymmetrical, and a single peak could not be identified as the effects of large trials varied considerably (Fig. 2). For example, for the 10 largest trials the standardized mean difference spanned from 0.15 to −0.66. There was no statistically significant association between the effect of placebo and sample size (*P* = 0.24). This is in contrast to the first version of our review (*P* = 0.05). The degree of small trial bias was almost the same in the two versions (close to identical intercepts in Egger's regression analysis), and the lack of statistical significance is therefore probably caused by the large variability amongst big trials in the updated version. There was moderate heterogeneity (*P* < 0.001, *I*^{2} = 45%).

Because of these problems, it is a questionable procedure to pool all the trials, and we show the results mainly for completeness. There was an overall positive effect of placebo for continuous outcomes, standardized mean difference −0.24 (−0.31 to −0.17). The effect for patient-reported outcomes was −0.30 (−0.38 to −0.21), whereas no statistically significant pooled effect was found for observer-reported outcomes, standardized mean difference −0.10 (−0.20 to 0.01). This considerable difference between patient- and observer-reported outcomes is statistically significant (*P* = 0.002) (Table 1).
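A rough plausibility check of the subgroup comparison can be made from the published estimates alone. The sketch below recovers approximate standard errors from the rounded 95% confidence intervals and applies a z-test for the difference between two independent estimates. This is an assumption-laden approximation, not the test the review used, so it is not expected to reproduce *P* = 0.002 exactly; the function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)

def subgroup_difference(est_a, ci_a, est_b, ci_b):
    """z-test for the difference between two independent pooled estimates."""
    se_a, se_b = se_from_ci(*ci_a), se_from_ci(*ci_b)
    diff = est_a - est_b
    z = diff / math.sqrt(se_a**2 + se_b**2)
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided P-value under normality
    return diff, z, p

# Patient-reported vs. observer-reported continuous outcomes, values from the text.
diff, z, p = subgroup_difference(-0.30, (-0.38, -0.21), -0.10, (-0.20, 0.01))
```

Under these assumptions the difference of −0.20 gives a two-sided *P*-value below 0.01, consistent in magnitude with the reported result.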

Ten clinical problems had been investigated in at least three trials with continuous outcomes: pain, obesity, asthma, hypertension, insomnia, nausea, depression, anxiety, phobia and smoking. Confidence intervals were wide for most conditions and placebo had a statistically significant effect only on pain and phobia (Table 2). There was substantial heterogeneity for trials investigating the effect on smoking (*P* = 0.02, *I*^{2} = 75%) (Table 2).

### Discussion

The majority of the 42 new trials we analysed were published after 1998 and therefore reflect contemporary clinical practice and research methodology. The results were very similar, however, to those we have reported on previously, based on 114 older trials [6, 8].

In the combined analyses of all 156 trials, we did not find a statistically significant effect of placebo interventions in trials with binary outcomes, or when continuous outcomes were reported by observers, whereas a statistically significant effect was observed for trials with patient-reported continuous outcomes, especially for pain. Our results are incompatible with placebo interventions in general causing large effects on most clinical conditions.

It is important to realize that the difference between placebo groups and no-treatment groups does not equal the effect of placebo as such a comparison is unblinded. Thus, even if there were no true effect of placebo, one would expect to measure differences due to reporting bias, attrition bias and other forms of bias related to lack of blinding [3]. Reporting bias is particularly problematic. Most patients are polite and prone to please the investigators by reporting improvement, even when no improvement was felt. It is difficult to separate such reporting bias from true effects of placebo, but we suspect reporting bias occurred. For example, the estimated effects of placebo were three times higher for patient-reported continuous outcomes than for observer-reported outcomes (standardized mean difference −0.30 vs. −0.10, *P* = 0.002).

Another reason why the pooled effect from the trials with continuous outcomes should be interpreted with considerable caution is that the funnel plot of the trials was asymmetrical without a clear peak, indicating not only that small trials tended to have larger effects than big trials, but also that the heterogeneity amongst large trials was considerable (see Results). This pattern can be caused by lack of identification of small trials with neutral or negative effects, by lower quality of the small trials or by differential true effects amongst different types of trials [11].

Patients in untreated control groups may seek alternative treatments outside the setting of a trial more often than patients in placebo groups. We expected that this ‘co-intervention’ bias could be more pronounced in trials where placebo was the only treatment offered. However, we found no statistically significant difference in pooled effects between such trials, and trials where placebo was added to a standard treatment (also given to the untreated group).

To find no reliable evidence of an effect is not the same as evidence of no effect. Our sample of trials was very large, but it was also heterogeneous. We conducted several subgroup analyses without finding large effects of placebo (except for the effect on phobia, which we regard as unreliable because it rests on few, small, low-quality trials). However, we cannot exclude the possibility that a subgroup of trials with a large effect of placebo exists and was obscured in the process of pooling heterogeneous trials. Our conclusions are also limited to the clinical conditions and outcomes studied; for example, there were few trials reporting broad outcomes such as patients' quality of life or well-being.

Even if we disregard the likely bias involved in the assessment of pain, the estimated analgesic effect of placebo corresponded to only 6 mm on a 100-mm visual analogue scale. It is doubtful whether this represents a clinically relevant reduction in pain. A systematic review of pain trials studying mechanisms of the effect of placebo (and not primarily the clinical effect) reported higher effects [13]. It is not clear whether this reflects a true difference in effect between the two settings, or a different susceptibility to reporting bias and other biases.

We have no good explanation for the difference between effects of placebo when measured on a binary and on a continuous scale, but continuous scales could be more sensitive to small effects or biases.

It is a question of definition whether the effect of a placebo intervention equals the ‘placebo effect’, as this term is sometimes also used for other aspects of the patient–provider interaction, for example psychologically mediated effects in general, the effect of suggestion, the effect of expectancies, the effect of patients’ experience of meaning, etc. [3]. Patients in a no-treatment group also interact with treatment providers, and the patients are therefore only truly untreated with respect to receiving a placebo intervention. Hence, our results do not exclude the possibility that other aspects of the patient–provider interaction, or interactions between the treatment ritual and different ways of informing patients, could have clinically useful effects.

Despite ethical concerns about the deception inherent in most placebo prescriptions [14], the clinical use of placebo interventions has been advocated in editorials and articles in leading journals [5, 15] and by influential commentators [16]. A survey of 772 randomly sampled Danish clinicians found that 48% (41–55%) of general practitioners used what they regarded as a placebo intervention 10 times or more per year. Many of the interventions were vitamins for unspecified fatigue and antibiotics for viral infections. The main reasons for using a placebo intervention were to avoid confrontation with the patient and to utilize a perceived effect of placebo [17].

Placebo interventions are important in clinical research as a tool for blinding. We excluded trials in which it was clear that an unblinded assessor evaluated observer-reported outcomes, and in most cases we disregarded follow-up data. Our results therefore probably underestimate the bias caused by use of no-treatment control groups rather than placebo groups in randomized trials. Even so, our results illustrate the risk of bias involved in unblinded trials. For example, in double-blind placebo-controlled trials the effect of nonsteroidal anti-inflammatory drugs (NSAIDs) on arthritis pain was a standardized mean difference of −0.84 [18]. If this assessment had been based on trials with no-treatment groups, and assuming the results of our review were correct (standardized mean difference −0.24 for placebo versus no treatment), the estimated effect of NSAIDs would have been −0.84 − 0.24 = −1.08, which is an overestimate of 29%.
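The arithmetic behind the 29% figure can be written out as below. It assumes, as the example in the text does, that the two standardized mean differences simply add; the subscript labels are introduced here for illustration only.

```latex
\[
\mathrm{SMD}_{\text{NSAID vs. no treatment}}
  = \underbrace{-0.84}_{\text{NSAID vs. placebo}}
  + \underbrace{(-0.24)}_{\text{placebo vs. no treatment}}
  = -1.08,
\qquad
\frac{\lvert -1.08 \rvert - \lvert -0.84 \rvert}{\lvert -0.84 \rvert}
  = \frac{0.24}{0.84} \approx 0.29 .
\]
```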

In conclusion, we reproduced the findings of our previous review and found no evidence that placebo interventions in general have large clinical effects, and no reliable evidence that they have clinically useful effects. A possible effect on patient-reported continuous outcomes, especially on pain, could not be clearly distinguished from bias.