Sirs, The systematic review by Gisbert and Morena¹ attempted to answer an important clinical question concerning the efficacy and tolerability of Helicobacter pylori eradication rescue regimens. The authors claimed that levofloxacin-based triple regimens are more effective and better tolerated than quadruple therapy. However, the pooled estimate did not achieve statistical significance and there was significant heterogeneity between the studies. The authors then re-analysed the data after excluding the study which most favoured quadruple therapy, so as to ‘decrease’ the heterogeneity and ‘increase’ the odds ratio (OR), which then significantly favoured the levofloxacin-based rescue regimen.
We believe that the results of this meta-analysis should be interpreted with caution. Several methodological and analytical limitations affect the validity and generalizability of the results. For example, only 14 of 241 retrieved articles or abstracts were finally assessed, but exclusion reasons were given for only three, perhaps because exclusion criteria were not specified at all. One study was excluded because ‘the patients were known to be resistant to both metronidazole and clarithromycin previously to the treatment with’. As most rescue regimens were prescribed without tests for bacterial resistance, a sensitivity analysis across the range of the studies would be more appropriate than simply excluding studies.
We are particularly concerned by the authors’ handling of tests for heterogeneity. The random effects model was used, but the pooled estimate was not significant and showed substantial heterogeneity between the included studies. The authors chose to exclude the one study (ref. 19) which favoured quadruple therapy in order to increase the pooled estimate of the OR, which then became statistically significant. The authors state that ‘heterogeneity markedly decreased’ without providing the result of the test for heterogeneity. Our calculation indicates that there is significant evidence against the null hypothesis of homogeneity (Q = 19.68, d.f. = 8, I² = 59.4%, P = 0.01). The authors called the excluded paper an ‘outlier study’, but the study furthest from the null effect is actually the Nista study (ref. 28), which favoured levofloxacin therapy (OR 9.75 and 5.32 for the two arms compared). The authors should have performed a proper sensitivity analysis to assess the robustness of the result when ‘extreme’ studies are excluded. We performed this sensitivity analysis by excluding the Nista study, which also decreased the heterogeneity to the same extent (P = 0.01, I² = 60.2% when excluding both arms; P = 0.001, I² = 69.1% when excluding only the arm which most favoured levofloxacin) but also decreased the pooled OR (OR = 1.26, 95% CI 0.71–2.27 and OR = 1.51, 95% CI 0.81–2.79, respectively), which no longer significantly favoured levofloxacin.
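The heterogeneity statistics quoted above are linked by simple formulae: I² = max(0, (Q − d.f.)/Q), and P is the upper tail of a chi-square distribution with d.f. degrees of freedom. The following Python sketch is an illustration only and is not the software used for the calculations reported in this letter. The function names `chi2_sf` and `i_squared` are hypothetical, and the only inputs are the rounded Q and d.f. given above.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function,
    which is adequate for the moderate values of Q met in meta-analysis.
    """
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= h / (a + n)
        total += term
    lower = total * math.exp(-h + a * math.log(h) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def i_squared(q, df):
    """Higgins' I-squared (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return 100.0 * max(0.0, (q - df) / q)


# Rounded values as quoted in the text; small rounding differences are expected.
q, df = 19.68, 8
print(f"I2 = {i_squared(q, df):.1f}%, P = {chi2_sf(q, df):.3f}")
```

With the rounded Q above, this gives an I² of roughly 59% and a P value of roughly 0.01, in line with the figures reported.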
As H. pylori rescue studies are diverse in many aspects, it is not clear whether the statistically non-significant advantage of levofloxacin was caused by confounding. The effect of clinical differences, population differences and other clinical factors should always be considered, as they can cause heterogeneity in trial results.² To address heterogeneity, we believe the authors should identify variables that could potentially explain the results, and use either meta-regression or subgroup analysis.³ However, in all the subgroup analyses, either no statistically significant difference was seen, or statistical heterogeneity between studies was observed (e.g. high-quality studies only), or some important clinical characteristics were not considered. For example, in the subgroup analysis of efficacy, treatment duration was compared within levofloxacin studies without testing for heterogeneity. In our opinion, with such diverse studies it is important to provide more details of all the subgroup analyses together with the summary estimate and test of homogeneity; to discuss the studies critically and consider all possible reasons for heterogeneity; not to group studies which are too diverse; and to give clinical suggestions based on clinical scenarios rather than on the pooled figures.
There are also some other uncertainties in their study. The authors stated that ‘per-protocol’ (PP) data would be used if studies did not specify the type of analysis. PP data usually provide higher eradication rates than intention-to-treat data. However, there are no data on how many papers finally used PP data, and no sensitivity analysis appears to have been carried out. The generalizability of the results is also limited, given that 5 of 10 studies, including the two which most favoured levofloxacin, came from the same investigators (Nista et al.). Whether the outcome measures are validated and consistent between studies (e.g. how cure of infection was confirmed) is not mentioned. In the adverse events analysis, significant heterogeneity was also mentioned, but without discussion or subgroup analysis. The authors conducted a further analysis including only severe adverse events, and homogeneity was reported. However, the incidence is much lower for severe adverse events, with many zero cells in the levofloxacin group, and so the Peto method would be preferable to the Mantel–Haenszel method for handling zero cells.⁴
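To illustrate the point about zero cells, the following Python sketch computes a pooled Peto odds ratio. It is built from observed-minus-expected event counts and hypergeometric variances, so no continuity correction is needed when one arm has no events. The 2×2 counts are invented purely for illustration and are not taken from any of the trials discussed; the function name `peto_pooled_or` is likewise hypothetical.

```python
import math


def peto_pooled_or(tables):
    """Pooled Peto odds ratio with an approximate 95% confidence interval.

    tables: iterable of (events_treatment, n_treatment, events_control, n_control).
    Returns (odds_ratio, lower_95, upper_95).
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for a, n1, c, n2 in tables:
        n = n1 + n2
        m1 = a + c       # total events in the trial
        m2 = n - m1      # total non-events in the trial
        if m1 == 0 or m2 == 0:
            continue     # trial carries no information (zero variance)
        expected = n1 * m1 / n
        variance = n1 * n2 * m1 * m2 / (n * n * (n - 1))
        sum_o_minus_e += a - expected
        sum_v += variance
    if sum_v == 0:
        raise ValueError("no informative trials")
    log_or = sum_o_minus_e / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))


# Hypothetical severe adverse event counts (treatment arm first); illustrative only.
example_tables = [(0, 50, 3, 50), (1, 40, 4, 40), (0, 30, 2, 30)]
or_, lo, hi = peto_pooled_or(example_tables)
print(f"Peto OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

The Peto approach is generally regarded as most suitable when events are rare, the treatment arms are of similar size and the effect is not extreme.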
The authors’ conclusion favouring levofloxacin regimens over quadruple regimens is misleading because it is based on a summary estimate that is statistically questionable. Rescue H. pylori eradication therapy is an important clinical issue, and any new treatment requires consideration based on individual patient trials and cost-effectiveness analysis at the population level. Much evidence is needed to support any new strategy. Until more homogeneous clinical trials are published providing more robust data, meta-analysis should be undertaken and interpreted with caution.