Meta-analysis: the efficacy of proton pump inhibitors for laryngeal symptoms attributed to gastro-oesophageal reflux disease

Authors


Dr N. Vakil, University of Wisconsin Medical School, Aurora Sinai Medical Center, 945 North 12th Street, Room 4040, Milwaukee WI 53212, USA.
E-mail: nvakil@wisc.edu

Abstract

Summary

Background

Many investigators have proposed an association between gastro-oesophageal reflux disease and laryngo-pharyngeal symptoms, suggesting that medical or surgical therapy for reflux may be useful.

Aim

To perform a meta-analysis assessing the effectiveness of medical or surgical therapy for reflux disease in adult patients with laryngeal or pharyngeal symptoms presumed to be due to gastro-oesophageal reflux disease.

Methods

Randomized controlled trials comparing medical or surgical treatments for gastro-oesophageal reflux disease against placebo were identified by searching MEDLINE (1966–September 2005), EMBASE (1974–September 2005), the Cochrane Central Register of Controlled Trials (until September 2005) and abstracts from gastroenterology and ENT meetings. The relative risk of reporting symptomatic improvement or resolution of symptoms was evaluated using a random-effects model.

Results

Five studies using a high-dose proton pump inhibitor as the intervention met the inclusion criteria and were included in the meta-analysis. No surgical studies met the inclusion criteria. The pooled relative risk was 1.18 (95% confidence interval: 0.81–1.74). There was no heterogeneity between studies, but there was evidence of significant publication bias. Sub-group analyses by Jadad score and symptom type did not change the relative risk.

Conclusions

Therapy with a high-dose proton pump inhibitor is no more effective than placebo in producing symptomatic improvement or resolution of laryngo-pharyngeal symptoms. Further studies are necessary to identify the characteristics of patients that may respond to proton pump inhibitor therapy.

Introduction

Gastro-oesophageal reflux disease (GERD) is a disorder caused by the backflow of gastric contents into the oesophagus.1 A recent epidemiological review reported a prevalence of GERD of 10–20% and 5% in Western and Asian countries respectively.2

Many investigators have proposed an association between GERD and laryngo-pharyngeal symptoms such as hoarseness, globus pharyngeus, vocal fatigue, frequent sore throat, frequent throat clearing and chronic cough.3–7 Estimates of the proportion of laryngitis cases caused by acid reflux have ranged from 18 to 80%.8, 9 The two major proposed mechanisms for GERD-associated laryngeal disorders are acid stimulation of vagal afferents in the distal and/or proximal oesophagus and direct laryngeal contact with acid, pepsin, or other substances present in the gastro-oesophageal refluxate.7

As GERD is implicated in the pathogenesis of symptoms, it has been suggested that medical or surgical therapy for reflux disease may be useful to manage these patients. However, the results of therapy are variable and it is uncertain if either medical or surgical therapy is effective.7, 10, 11 Therefore, the aim of this study was to perform a systematic review to determine the effectiveness of medical or surgical therapy for reflux disease in adult patients with laryngeal or pharyngeal symptoms presumed to be due to GERD.

Materials and methods

Study retrieval and selection

The present meta-analysis follows the Quality of Reporting of Meta-analyses (QUOROM) statement guidelines.12 We performed an electronic search of the following databases: the Cochrane Central Register of Controlled Trials (until September 2005), MEDLINE (1966–September 2005) and EMBASE (1974–September 2005). The search strategy for MEDLINE and EMBASE had four sets of terms: (i) terms to search for the condition of interest (pharyngeal diseases, pharyngitis, epipharyngitis, peritonsillar abscess, retropharyngeal, parapharyngeal, retrotonsillar, sore throat, throat mucus, throat clearing, throat tickle, cough, hoarseness, vocal fatigue, dysphagia, vocal cord edema, vocal cord granuloma, globus, pharyngeal reflux, laryngeal diseases, posterior laryngitis, reflux laryngitis, reflux pharyngitis, laryngeal reflux, pharyngeal reflux, laryngomalacia); (ii) terms to search for the pharmaceutical interventions evaluated (anti-ulcer agents, histamine H2 antagonists, cimetidine, cisapride, ranitidine, famotidine, nizatidine, alginate, aluminium hydroxide, aluminum-hydroxide, calcium-carbonate, magnesium-hydroxide, magnesium-oxide, sodium-bicarbonate, sucralfate, proton pump inhibitor, proton pump blocker, omeprazole, lansoprazole, pantoprazole, rabeprazole, esomeprazole); (iii) terms to search for the surgical interventions evaluated (fundoplication, gastroplasty, laparoscopy, antireflux operation, antireflux surgery, gastro-esophageal reflux surgery, laparoscopic surgery, fundoplication); and (iv) the search strategy developed by the Cochrane Collaboration13 for the identification of randomized controlled trials (RCTs) and controlled clinical trials (CCTs). The sets of terms (i), (ii), (iv) and, separately, (i), (iii), (iv) were joined with the ‘AND’ operator in order to retrieve appropriate studies. The search strategy for the Cochrane Central Register of Controlled Trials was based only on the sets of terms (i), (ii) and (i), (iii), joined with the ‘AND’ operator.
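The way the term sets are combined can be sketched in a few lines of Python. This is only an illustration: the helper names `any_of` and `all_of` are hypothetical, the terms are a small subset of those listed above, the single phrase standing in for set (iv) is a placeholder rather than the Cochrane filter itself, and the output is generic Boolean syntax rather than the query language of any particular database.

```python
def any_of(terms):
    """Join the terms of one set with OR and wrap the set in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def all_of(*term_sets):
    """Join several term sets with AND, so a record must match every set."""
    return " AND ".join(any_of(s) for s in term_sets)


# Abbreviated, illustrative subsets of the four sets of terms.
condition = ["hoarseness", "globus", "reflux laryngitis"]   # set (i)
medical = ["proton pump inhibitor", "omeprazole"]           # set (ii)
surgical = ["fundoplication", "antireflux surgery"]         # set (iii)
trial_filter = ["randomized controlled trial"]              # placeholder for set (iv)

# MEDLINE/EMBASE style: (i) AND (ii) AND (iv), and (i) AND (iii) AND (iv).
medical_query = all_of(condition, medical, trial_filter)
surgical_query = all_of(condition, surgical, trial_filter)

# Trials register style: the trial filter (iv) is omitted.
register_medical_query = all_of(condition, medical)

print(medical_query)
print(surgical_query)
print(register_medical_query)
```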
A manual and electronic search of the abstracts from the American Gastroenterological Association (1975–2005), United European Gastroenterology Federation (1992–2005), and American Academy of Otolaryngology – Head and Neck Surgery Foundation (1994–2005) meeting proceedings was also performed. There was no language restriction in the search strategy. We did not include review articles, position papers, editorials, commentaries, or book chapters. The criteria for study inclusion are given in Table 1. Two investigators (LG and NV) separately performed the search and selected the studies, and jointly performed data extraction using pre-defined data extraction forms. Quality assessment was performed using the scale described by Jadad et al.14 A third investigator (DV) arbitrated in the event of a lack of agreement.

Table 1.   Criteria for inclusion of studies in the meta-analysis of efficacy of medical or surgical treatment in adult patients with laryngeal or pharyngeal symptoms presumed to be due to gastro-oesophageal reflux disease

Patients aged ≥18 years
Patients with laryngeal or pharyngeal symptoms presumed secondary to GERD lasting at least 2 weeks
Use of medical or surgical antireflux therapy
Randomized controlled trials (RCTs) or controlled clinical trials (CCTs)
Comparison with placebo
Exclusion of other primarily identifiable causes of laryngeal or pharyngeal symptoms with physical examination and/or appropriate tests
RCTs or CCTs from which it was possible to extract data on the proportion of patients meeting pre-defined criteria for treatment success or symptom resolution after treatment
Intention-to-treat analysis

GERD, gastro-oesophageal reflux disease.

Outcome measure and statistical methods

The outcome measure, predefined as a binary variable, was the proportion of patients reporting symptomatic improvement or resolution of symptoms after the treatment they received (medical, surgical or placebo). For this, we used the criteria for defining a ‘responder’ chosen by the authors of each paper, as these differed slightly across studies. Sub-group analysis was performed according to Jadad score and the symptom reported when patients were enrolled. The relative risks (RR) of reporting symptomatic improvement or resolution of symptoms were pooled using a random-effects model.15 An analysis of treatment effect was performed on an intention-to-treat basis on evaluable patients. Proportions of patients responding to active treatment or placebo, their differences and 95% confidence intervals (CI) were calculated using the method recommended by Newcombe.16 The number needed to treat (NNT) was calculated as the reciprocal of the pooled absolute risk reduction. A funnel plot was drawn to assess the potential for publication bias. We plotted the studies’ risk ratios vs. the square root of the studies’ sample sizes to detect asymmetry in the distribution of trials and regressed the individual studies’ risk ratios on their respective sample sizes. In a funnel plot, larger studies, which provide a more precise estimate of the true effect of the intervention in question, form the spout of the funnel, whereas smaller studies provide less precise estimates and form the cone of the funnel. A gap in the funnel plot would indicate the potential for publication bias. For Egger's linear regression test for publication bias, a two-sided P-value of 0.10 or less was regarded as significant.17 For all other tests, a two-sided P-value of 0.05 or less was regarded as significant.
The NNT and its 95% CI, presented as Number Needed to Harm (NNTH) to Number Needed to Benefit (NNTB), were calculated according to the method recommended by Altman.18 All the calculations were performed with Intercooled STATA (version 8.1; Stata Corporation, College Station, TX, USA) using the metan and metabias commands.
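As an illustration of how relative risks can be pooled under a random-effects model, the following Python sketch assumes the DerSimonian-Laird estimator with inverse-variance weights on the log scale. That choice is an assumption made for the example and may differ in detail from the calculation performed in Stata. The function name `pool_relative_risk` and the trial counts are hypothetical and are not data from the included studies.

```python
import math


def pool_relative_risk(trials, z=1.96):
    """Pool relative risks with a DerSimonian-Laird random-effects model.

    trials: list of (responders_treated, n_treated, responders_placebo, n_placebo).
    Every arm must contain at least one responder (no zero-cell correction here).
    """
    log_rr, var = [], []
    for a, n1, c, n2 in trials:
        log_rr.append(math.log((a / n1) / (c / n2)))
        # Large-sample variance of the log relative risk.
        var.append(1 / a - 1 / n1 + 1 / c - 1 / n2)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in var]
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    df = len(trials) - 1

    # Method-of-moments estimate of the between-study variance tau^2.
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in var]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "q": q,
        "df": df,
        "tau2": tau2,
    }


# Hypothetical counts, for illustration only.
example_trials = [(6, 10, 4, 11), (5, 12, 5, 10), (3, 9, 2, 8)]
print(pool_relative_risk(example_trials))
```

When Q does not exceed its degrees of freedom, τ² is estimated as zero and the random-effects result coincides with the fixed-effect one.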

Results

Study retrieval and inclusion

A flow diagram of this systematic review, with the number of papers retrieved, included and excluded with the reasons for exclusion, is shown in Figure 1. The characteristics of the studies included in the meta-analysis are shown in Table 2.

Figure 1. Meta-analysis flow.

Table 2.   Characteristics of the studies included in the meta-analysis

Reference | Country | Setting | Type of study | Symptoms | No of patients | Medication, dose (daily), duration | Outcome measure
Eherer et al.19 | Austria | Specialty care | PCC | Cough, nocturnal cough, globus, sore throat, hoarseness, dysphonic attacks | 21 | Pantoprazole (80 mg, 12 weeks) | SQ (response to treatment was defined as improvement in SQ)
El-Serag et al.20 | USA | Specialty care | PCPG | Hoarseness, frequent clearing of the throat, dry cough, globus, persistent sore throat | 22 | Lansoprazole (60 mg, 12 weeks) | SQ (complete symptomatic response defined by the total resolution of all presenting symptoms of laryngitis)
Ours et al.21 | USA | Specialty care | PCPG | Cough | 17 | Omeprazole (80 mg, 12 weeks) | SQ (response to treatment defined as a weekly cough, frequency combined with severity score for daytime or night-time of ≤1 for ≥2 weeks consecutively)
Steward et al.22 | USA | Specialty care | PCPG | Hoarseness, throat clearing, non-productive cough, globus sensation, sore throat | 42 | Rabeprazole (40 mg, 8 weeks) | SQ (proportion of subjects noting significant global improvement defined by post-treatment symptoms reported as ‘much better’ or ‘gone’ vs. ‘worse’, ‘unchanged’, or ‘slightly better’)
Vaezi et al.23 | USA | Specialty care in 7 US centres | PCPG | Throat clearing, cough, globus, sore throat, hoarseness | 145 | Esomeprazole (80 mg, 16 weeks) | SQ [percentage of patients who had resolution of the primary symptom, defined as a primary symptom severity score of 0 (none) during the last 7 days of the study, but allowing a score of 1 (minimal severity) for up to 3 days]

PCPG, placebo-controlled, parallel groups; PCC, placebo-controlled cross-over; No, numbers; SQ, Symptom Questionnaire.

Efficacy of medical therapy in improving or resolving laryngo-pharyngeal symptoms

Five trials19–23 met the criteria for inclusion in the meta-analysis of effect, with a total of 247 patients. Four studies were RCTs.20–23 Data from one trial were initially retrieved from the published abstract.24 However, the analysis was updated after publication of the full paper.23

The last study included was a randomized controlled crossover trial.19 Because of the unexpected carry-over and period effects, only data from the first period were included in the analysis.13

All the studies used proton pump inhibitors (PPIs) as medical treatment. As shown in Figure 2, the pooled RR for symptomatic improvement or resolution of symptoms in adult patients with laryngo-pharyngeal symptoms presumed secondary to GERD was 1.18 (95% CI: 0.81–1.74; heterogeneity chi-squared = 3.76; degrees of freedom = 4; P = 0.439) using the individual studies’ definition of response, giving an NNT of 54 (95% CI: NNTH 7.6 to infinity to NNTB 11.4), indicating a lack of clinically meaningful effect.
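Two of the reported quantities can be checked with simple arithmetic, sketched here in Python. The heterogeneity P-value follows from the chi-squared statistic and its degrees of freedom, and an NNT interval in the NNTH/NNTB notation follows from a risk difference and its confidence limits. The function names are hypothetical. The risk-difference inputs are back-calculated from the reported NNT figures (1/54, −1/7.6 and 1/11.4) purely for illustration, since the pooled risk difference itself is not reported. The I² statistic is added as a common companion measure and is not reported in the text.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-squared statistic (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def nnt_with_ci(ard, ard_low, ard_high):
    """Express a risk difference and its CI as NNTB/NNTH (all inputs non-zero)."""
    def label(value):
        kind = "NNTB" if value > 0 else "NNTH"
        return f"{kind} {abs(1 / value):.1f}"

    if ard_low < 0 < ard_high:
        # The interval crosses zero, so the NNT scale passes through infinity.
        return f"{label(ard)} ({label(ard_low)} to infinity to {label(ard_high)})"
    ends = sorted([ard_low, ard_high], key=lambda v: abs(1 / v))
    return f"{label(ard)} ({label(ends[0])} to {label(ends[1])})"


q, df = 3.76, 4                              # heterogeneity statistic as reported
p_het = chi2_upper_tail_even_df(q, df)
i_squared = max(0.0, (q - df) / q) * 100     # percentage of variation beyond chance

# Risk differences back-calculated from the reported NNT values (illustrative).
nnt_text = nnt_with_ci(1 / 54, -1 / 7.6, 1 / 11.4)

print(f"P = {p_het:.3f}, I^2 = {i_squared:.0f}%")
print(nnt_text)
```

Because Q is smaller than its degrees of freedom, I² is truncated at zero, which is consistent with the statement that there was no heterogeneity between studies.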

Figure 2. Forest plot showing the effect of PPIs on symptom improvement or resolution. Estimates of relative risk for improvement or resolution of symptoms are presented, with their 95% confidence intervals using the authors’ definition of response for each study. RR, relative risk; PPI, proton pump inhibitors.

The response rate in the treatment group was 25% (95% CI: 18.9–32.3) and in the placebo group was 21.4% (95% CI: 14.8–29.9). The difference between the two rates was 3.6% (95% CI: −6.9 to 13.4).
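The interval for the difference can be reproduced from the two response rates and their individual intervals. The sketch below assumes that the method used is Newcombe's hybrid score approach, which combines the distance from each proportion to its own confidence limits, and it takes the reported limits as given instead of recomputing them from patient counts. The function name `newcombe_difference` is hypothetical.

```python
import math


def newcombe_difference(p1, ci1, p2, ci2):
    """CI for p1 - p2 built from the two single-proportion intervals.

    p1, p2 are proportions (here in percent); ci1, ci2 are (lower, upper) limits.
    """
    l1, u1 = ci1
    l2, u2 = ci2
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return diff, lower, upper


# Response rates and 95% CIs as reported above, in percent.
diff, lower, upper = newcombe_difference(25.0, (18.9, 32.3), 21.4, (14.8, 29.9))
print(f"difference = {diff:.1f}% (95% CI: {lower:.1f} to {upper:.1f})")
```

With the reported inputs, the limits agree with the published interval to within rounding.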

Sub-group analyses according to Jadad score and to the type of symptoms at enrolment (cough only vs. multiple symptoms) did not produce any significant change in the pooled RR (Figures 3 and 4).

Figure 3. Forest plot of the sub-group analysis according to the Jadad scale. Estimates of relative risk for improvement or resolution of symptoms are presented with their 95% confidence intervals using the authors’ definition of response for each study. RR, relative risk; PPIs, proton pump inhibitors.

Figure 4. Forest plot of the sub-group analysis according to the type of symptoms. Estimates of relative risk for improvement or resolution of symptoms are presented with their 95% confidence intervals using the authors’ definition of response for each study. RR, relative risk; PPIs, proton pump inhibitors.

The funnel plot (Figure 5) showed asymmetry (small trials favouring placebo are missing from the bottom left of the plot) and Egger's test was significant (coefficient = −0.857; 90% CI: −1.53 to −0.18, P = 0.054), suggesting the potential for publication bias or small-study effects.

Figure 5. Funnel plot for detection of publication bias. There is plot asymmetry and a significant bias (Egger's test: coefficient = −0.857; 90% CI = −1.53 to −0.18, P = 0.054). PPIs, proton pump inhibitors.
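For readers unfamiliar with Egger's test, the following Python sketch shows its classic form: each study's effect divided by its standard error is regressed on the inverse of the standard error, and an intercept that differs from zero suggests funnel-plot asymmetry. This is a generic illustration under that formulation, not a reconstruction of the Stata metabias output. The function name `egger_regression` and the five effect sizes and standard errors are hypothetical.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares of effect/SE on 1/SE.

    Returns (intercept, standard error of the intercept, slope).
    The intercept estimates small-study asymmetry; the slope estimates the
    underlying effect.
    """
    x = [1 / s for s in std_errors]
    y = [e / s for e, s in zip(effects, std_errors)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept, slope


# Hypothetical log relative risks and standard errors for five trials.
log_rr = [0.60, 0.45, 0.30, 0.20, 0.10]
se = [0.60, 0.45, 0.35, 0.25, 0.12]

intercept, se_intercept, slope = egger_regression(log_rr, se)
t_stat = intercept / se_intercept if se_intercept > 0 else float("inf")

# With five trials there are 3 degrees of freedom; the two-sided 10%
# critical value of Student's t with 3 df is about 2.353.
print(f"intercept = {intercept:.3f}, t = {t_stat:.2f}, significant = {abs(t_stat) > 2.353}")
```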

Efficacy of surgical therapy in improving or resolving laryngo-pharyngeal symptoms

No trials of surgery met the inclusion criteria for this analysis.

Discussion

This study shows that treatment with PPIs is no more effective than placebo in resolving or improving laryngo-pharyngeal symptoms presumed to be due to GERD. Sub-group analysis also shows that neither the quality of the studies, based on the Jadad scale, nor the symptoms reported by the patients at enrolment influenced the results. The studies we evaluated used different PPIs at doses higher than those usually used in GERD patients, with treatment lasting at least 2 months.

Most clinicians realize that there are some patients with laryngeal symptoms who seem to benefit from PPI therapy, but our results cast serious doubt on the current practice of using an ENT examination and symptoms as a presumptive basis for a diagnosis of GERD. The accurate identification of patients with laryngo-pharyngeal symptoms who might have their symptoms caused by GERD is therefore a critical issue. Exclusion of laryngeal disorders with laryngoscopy is an appropriate first step. However, if this test is negative, there are no accepted guidelines on how to proceed further in the diagnostic process. ENT specialists have developed the Reflux Symptom Index (RSI) and Reflux Finding Score (RFS) to assess laryngo-pharyngeal symptoms and to record laryngoscopic findings.25, 26 The RSI is a self-administered, nine-item questionnaire for laryngo-pharyngeal symptoms and is entirely subjective. The RFS is an eight-item grading scale, but it is also subjective because it depends on the experience of the laryngologist who grades it. Neither of these has been validated as a tool to identify patients with reflux-induced laryngo-pharyngeal symptoms. There is marked variability in the interpretation of the laryngoscopic findings of reflux disease. In 1992, Johnson et al.27 found that there was tremendous variability among both ENT and GI experts in the interpretation of laryngeal findings because of a lack of standardization and definition of nomenclature. More recently, Milstein et al.28 found that several signs of posterior laryngeal irritation (e.g. interarytenoid bar, erythema of the medial wall of the arytenoids), which are generally considered to be signs of laryngo-pharyngeal reflux, are present in a high percentage of asymptomatic individuals, raising questions about their diagnostic specificity.

The use of laryngo-pharyngeal pH-metry has been proposed as a tool to identify patients with GERD as the cause of their symptoms. However, there is a lack of universally agreed-upon pH criteria to optimally define a pharyngeal regurgitation event.29 Although most studies of laryngo-pharyngeal pH have adopted the same criteria used to define gastro-oesophageal acid reflux in oesophageal pH testing (i.e. time of pH < 4), these criteria may not be valid in the pharynx, which is a less stable environment for intraluminal pH-metry than the oesophagus.30 Williams et al.31 concluded that accepted criteria for gastro-oesophageal reflux are not applicable to the detection of oesophago-gastro-laryngeal acid regurgitation.

Recently, Ahmed et al.32 reported the results of a survey mailed randomly to 2000 members of both the American Academy of Otolaryngology Head and Neck Surgery and the American Gastroenterological Association. The study demonstrated considerable variability in treatment dose, duration and perceived patient response to therapy between the two specialties. Indeed, 74% of ENT physicians reported that they made the diagnosis more on symptoms than on laryngeal signs, and initiated therapy most often with a PPI once daily for 2 months. Gastroenterologists were divided on pretherapy testing, 50% reporting testing with oesophagogastro-duodenoscopy followed by pH monitoring (distal more than proximal) prior to therapy, while the remaining 50% reported treating empirically with a PPI twice daily for 3 months. Seventy per cent of gastroenterologists reported a treatment response of <60%, while 62% of ENT physicians reported a response rate of >60%.

We did not include studies that reported symptom improvement by measuring changes in symptom scores without predefining an end-point that characterized a responder. This is because it is impossible to determine a clinically meaningful change as the scores used by different investigators varied widely.

We also chose to use a random-effects model as the statistical technique to synthesize data. This model assumes that each trial estimates a different underlying effect and corrects for this by adding the among-study variance (τ²) to the within-study variance from which each study's weight is derived. The problem with this approach is that, by adding a constant (τ²) to the variance of each study, the relative contributions of the trials become more equal. Small studies therefore become more prominent and larger trials contribute less to the overall effect estimate.33 This explains why in the forest plots the trial by Eherer et al.19 has the largest weight, despite being considerably smaller than the trials by Vaezi et al.23 and Steward et al.22
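The equalizing effect of τ² on study weights can be seen with a short calculation. In the usual inverse-variance formulation, each study's weight is the reciprocal of its within-study variance plus τ². The Python sketch below uses hypothetical variances for a small, a medium and a large trial, and a hypothetical helper `relative_weights`, and compares the relative weights with and without a between-study variance.

```python
def relative_weights(variances, tau2):
    """Relative inverse-variance weights, 1 / (v_i + tau^2), normalised to sum to 1."""
    raw = [1 / (v + tau2) for v in variances]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical within-study variances: small, medium and large trial.
variances = [0.50, 0.10, 0.02]

fixed_weights = relative_weights(variances, tau2=0.0)
random_weights = relative_weights(variances, tau2=0.2)

for name, weights in (("tau2 = 0.0", fixed_weights), ("tau2 = 0.2", random_weights)):
    print(name, [f"{100 * w:.1f}%" for w in weights])
```

In this example the largest trial dominates when τ² is zero, whereas a non-zero τ² shifts weight towards the smaller trials.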

Potential limitations of our study do need to be considered in interpreting our results. First, with the exception of one study, the number of patients in the individual studies was relatively small. Secondly, the treatment effect sizes shown may vary according to different inclusion criteria. For example, higher proportions of subjects with oesophagitis or predominant heartburn in the study population could increase the likelihood of a response to PPIs. We were unable to measure this effect in a formal metaregression as it would have been under-powered given the small number of studies included.34

In conclusion, our evidence does not support the common clinical practice of prescribing PPIs in adult patients with laryngeal or pharyngeal symptoms presumed to be secondary to GERD. Further study is necessary to identify the characteristics of the small group of patients who respond to PPI therapy. In many patients, the cause of laryngo-pharyngeal symptoms may be multifactorial, and identifying patients in whom GERD may be playing a role remains a challenge. Standardized criteria for detection of potential responders that are acceptable to gastroenterologists and ENT physicians are therefore necessary for further research in this area. At the present time, identifying the patient with heartburn and/or regurgitation who has laryngeal symptoms may be the most useful clinical strategy to determine who might receive PPI therapy.

Acknowledgement

Authors' declaration of personal interests: D. Vaira is a consultant for Orexo; N. Vakil is a consultant for Astra-Zeneca, Infai, Orexo, Novartis and Shire.

Declaration of funding interests: N. Vakil received grant support from Astra-Zeneca, Novartis, Boston-Scientific, Medtronics and Altana.
