• allergic rhinitis;
  • Cochrane meta-analysis;
  • heterogeneity;
  • influence statistics;
  • specific immunotherapy

Results from Cochrane meta-analyses are regarded as the ‘gold standard’ within evidence-based medicine. Despite compliance with strict selection criteria for studies and specific statistical guidelines for a Cochrane meta-analysis (1), a significant limitation of the current methodology is the common lack of attention to the influence of heterogeneity on the overall outcome.

The aim of this article is to illustrate how valuable information regarding the impact and size of heterogeneity can be obtained using influence statistics. A recently published meta-analysis evaluating the effect of specific immunotherapy (SIT) in patients with seasonal allergic rhinitis (AR) is used as an example (2). This meta-analysis confirms that SIT is effective in reducing symptoms and the use of rescue medication; however, it is impaired by significant heterogeneity. The Cochrane statistical guideline recommends testing for the presence (χ² test) and quantifying the size (I²) of heterogeneity (1), but it provides no recommendations on how to assess the influence of heterogeneity on the final result of the meta-analysis.
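For orientation, the two heterogeneity measures mentioned above are usually written as follows. This is a sketch in standard notation that is not taken from the original text: y_i is the effect estimate of study i, v_i its variance and k the number of studies, all symbols assumed here for illustration.

```latex
\[
w_i = \frac{1}{v_i}, \qquad
\hat\theta_F = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}, \qquad
Q = \sum_{i=1}^{k} w_i \,(y_i - \hat\theta_F)^2
\]
\[
I^2 = \max\!\left(0,\; \frac{Q - (k-1)}{Q}\right) \times 100\%
\]
```

Under the assumption of homogeneity, Q approximately follows a χ² distribution with k − 1 degrees of freedom, which is the basis of the χ² test. I² expresses the share of the total variation across studies that is attributed to heterogeneity rather than chance.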

Here, we use the Cochrane AR meta-analysis (2) to illustrate how influence plots and statistics can be applied. An influence plot is easily interpretable and shows the overall result of the meta-analysis when leaving one study out at a time (3–5). Figure 1 is an influence plot of the symptom scores (Fig. 1A) and the medication scores (Fig. 1B) from the Cochrane AR meta-analysis (2). A large difference between the results obtained with fixed- and random-effects models indicates heterogeneity (3–5), which is confirmed by a small P value for the χ² test and a large I² (Fig. 1).
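The leave-one-out idea behind an influence plot can be sketched in a few lines of Python. This is a minimal illustration, not the software used for Figure 1: it assumes inverse-variance pooling with a DerSimonian-Laird random-effects estimate, the function names `pool` and `leave_one_out` are hypothetical, and the effect sizes and variances are invented numbers, not data from the AR meta-analysis.

```python
def pool(effects, variances):
    """Pool study effects with fixed- and random-effects (DerSimonian-Laird) models."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"fixed": fixed, "random": random, "q": q, "i2": i2, "tau2": tau2}


def leave_one_out(effects, variances):
    """Re-pool the meta-analysis k times, omitting one study each time."""
    rows = []
    for i in range(len(effects)):
        rows.append(pool(effects[:i] + effects[i + 1:],
                         variances[:i] + variances[i + 1:]))
    return rows


# Invented example: four similar studies and one extreme study (index 4).
effects = [-0.5, -0.4, -0.6, -0.5, -2.0]
variances = [0.04, 0.05, 0.04, 0.06, 0.10]

full = pool(effects, variances)
loo = leave_one_out(effects, variances)

print("all studies: fixed=%.2f random=%.2f I2=%.0f%%"
      % (full["fixed"], full["random"], full["i2"]))
for i, row in enumerate(loo):
    print("without study %d: fixed=%.2f random=%.2f I2=%.0f%%"
          % (i, row["fixed"], row["random"], row["i2"]))
```

In this invented data set the fixed and random estimates diverge while the extreme study is included and coincide once it is left out, which is the pattern an influence plot makes visible.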


Figure 1. Influence plots of the symptom and medication scores based on data from the AR meta-analysis (2). An influence plot shows the effect on the overall result of removing each single study in turn. (A) Influence plot for the symptom scores. The square frames the two outlier studies (Ortolani et al.). Significant heterogeneity is present [χ² test, P = 0.0005; I² = 63% (36%; 79%)]. (B) Influence plot for the medication scores. The square frames the outlier study (Dolz et al.). Significant heterogeneity is present [χ² test, P = 0.0009; I² = 64% (34%; 80%)]. (C) Influence plot for the symptom scores after omitting the two outlier studies by Ortolani et al. Significant heterogeneity is not present [χ² test, P = 0.41; I² approaches zero, 3.2% (0%; 58%)]. (D) Influence plot for the medication scores after omitting the outlier study (Dolz et al.). Significant heterogeneity is not present [χ² test, P = 0.56; I² = 0% (0%; 53%)].


For the symptom score, two outlier studies (6, 7) (dfbetas < −0.5) are identified as major contributors to the heterogeneity. The two studies performed by Ortolani et al. (6, 7) comprise 26 actively treated patients and 24 placebo group patients (2). After exclusion of these two studies, which comprise 4.7% of the patients, the overall standardized mean difference (SMD) [95% CI] changes from −0.73 [−0.97; −0.50] to −0.53 [−0.66; −0.39] and the heterogeneity disappears (Fig. 1C).

For the medication score, a single outlier study (8) (dfbetas < −1.3), comprising only 2.9% of all patients (18 actively treated and 10 placebo group patients), causes the heterogeneity (2). After exclusion of this study, the overall SMD for the medication score changes from −0.57 [−0.82; −0.32] to −0.41 [−0.54; −0.27] (Fig. 1D).
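The text does not state how the dfbetas values were computed. As a hedged illustration only, the sketch below uses one possible standardisation: the shift in the random-effects estimate when a study is removed, divided by a standard error calculated with the between-study variance estimated without that study. The function names `dl_random` and `dfbetas` are hypothetical, the DerSimonian-Laird estimator is an assumption, and the numbers are invented rather than taken from the AR meta-analysis.

```python
import math


def dl_random(effects, variances):
    """DerSimonian-Laird random-effects estimate and between-study variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return mu, tau2


def dfbetas(effects, variances):
    """Standardised change in the pooled estimate when each study is omitted.

    Assumed definition (not confirmed by the source): (mu_all - mu_without_i)
    divided by a standard error that uses tau^2 estimated without study i.
    """
    mu_all, _ = dl_random(effects, variances)
    values = []
    for i in range(len(effects)):
        mu_i, tau2_i = dl_random(effects[:i] + effects[i + 1:],
                                 variances[:i] + variances[i + 1:])
        se = math.sqrt(1.0 / sum(1.0 / (v + tau2_i) for v in variances))
        values.append((mu_all - mu_i) / se)
    return values


# Invented example: study 4 reports a far larger effect than the others.
effects = [-0.5, -0.4, -0.6, -0.5, -2.0]
variances = [0.04, 0.05, 0.04, 0.06, 0.10]

values = dfbetas(effects, variances)
for i, d in enumerate(values):
    print("study %d: dfbetas = %.2f" % (i, d))
```

With this sign convention, a study that pulls the pooled SMD towards more negative values receives a negative dfbetas, in line with the negative values reported for the outlier studies above.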

Obvious reasons for the heterogeneity caused by outlier studies should be sought. In both studies by Ortolani et al. (6, 7), experimental extracts were used (2). The study by Dolz et al. (8), which showed an impressive medication score, was performed with an established product (Alutard SQ, ALK-Abello, Denmark) given continuously for 3 years, whereas most of the remaining studies reported results following shorter treatment regimens (2). These factors may at least partly explain the extreme outcomes of the three influential studies that caused the heterogeneity of the original AR meta-analysis. The lesson learned could be that the characteristics of individual SIT products might also contribute to different clinical outcomes.

Thus, influence statistics helped to identify the outlier studies and possible reasons for the heterogeneity in the AR meta-analysis. After exclusion of the studies identified as extreme outliers (as generally advised within statistics), the overall result of the AR meta-analysis was still clearly significant and without heterogeneity (Fig. 1C and D). However, there might be other cases where the outcome is less clear and where very few patients can be pivotal for the final result.

In conclusion, caution should be exercised when interpreting results from meta-analyses with significant heterogeneity. We recommend including influence statistics and influence plots in the statistical guidelines of the Cochrane Library and in all meta-analyses impaired by significant heterogeneity.



The authors wish to thank Dr Moises Calderon and Jørgen Nedergaard Larsen for valuable input to this manuscript.

