Address for Correspondence: P. W. Weijenborg, MD, Department of Gastroenterology and Hepatology, Academic Medical Center Amsterdam, Meibergdreef 9, Postbox 22660, 1100 DD, Amsterdam, The Netherlands. Tel: +31 (0)20 566 8091; fax: +31 (0)20 5669478; e-mail: email@example.com
Background Symptomatic response to proton pump inhibitor (PPI) therapy in patients with non-erosive reflux disease (NERD) is often reported as lower than in patients with erosive reflux disease (ERD). However, the definition of NERD differs across clinical trials. This meta-analysis aims to estimate the rate of symptom relief in response to PPI in NERD patients.
Methods MEDLINE (1966–2010), Cochrane Comprehensive Trial Register (1997–2010) and EMBASE (1985–2010) databases were searched and manual searches from studies’ references were performed. Randomized clinical trials were selected that included patients with heartburn, and analyzed the effect of short-term PPI treatment. The primary outcome of selected studies was defined as complete or partial heartburn relief. Two reviewers independently extracted data and assessed study quality of selected articles. Random effects models and meta-regression were used to combine and analyze results.
Key Results The pooled estimate of complete relief of heartburn after 4 weeks of PPI therapy in patients with ERD was 0.72 (95% CI 0.69–0.74) (32 studies), vs 0.50 (0.43–0.57) (eight studies) in empirically treated patients, 0.49 (0.44–0.55) (12 studies) in patients defined as non-erosive by negative endoscopy, and 0.73 (0.69–0.77) (two studies) in patients defined as non-erosive by both negative endoscopy and a positive pH-test.
Conclusions & Inferences In well-defined NERD patients, the estimated complete symptom response rate after PPI therapy is comparable to the response rate in patients with ERD. The previously reported low response rate in studies with patients classified as NERD is likely the result of inclusion of patients with upper gastrointestinal symptoms that do not have reflux disease.
Gastro-esophageal reflux disease (GERD) is described by the Montreal definition as a condition in which reflux of gastric content into the esophagus leads to troublesome symptoms and/or complications.1 The most frequently reported symptom of GERD is heartburn, followed by regurgitation.2 The prevalence of the symptom heartburn is high, with 12–20% of the general population experiencing heartburn at least weekly.3,4 The symptom regurgitation is less common, with a prevalence of 6%.3 Reflux symptoms are accompanied by a decrease in health-related quality of life and an increase of healthcare costs due to frequent physician visits and prescription of chronic medication.5
Patients presenting with symptoms suggestive of reflux disease are often empirically treated with lifestyle advice and acid suppressive drugs, including proton pump inhibitors (PPIs).6 When patients do not respond to standard therapy, testing is performed to challenge the initial diagnosis and to investigate the reasons for treatment refractoriness. An upper endoscopy can confirm the diagnosis of reflux disease and is appropriate to rule out malignancy and other mucosal abnormalities of the esophagus.7 While the presence of erosive reflux disease (ERD) confirms the diagnosis, a negative endoscopy cannot be used to rule out reflux disease as a substantial part of the patients with reflux disease have no abnormalities seen on endoscopy, a condition known as non-erosive reflux disease (NERD).8,9 Once no abnormalities are found at endoscopy, an ambulatory 24-h pH monitoring, combined with intraluminal impedance measurements, can help define entity, timing, and chemical characteristics of reflux as well as the relationship of the reflux episodes to the patient’s symptoms.10,11 The severity of esophageal acid exposure correlates with the degree of mucosal damage,12 and finding a pathological amount of acid reflux and/or an association between heartburn symptoms and reflux events confirms the diagnosis of GERD. Patients with positive findings at pH-impedance monitoring without distal esophageal mucosal breaks are classified as having NERD. However, if no relationship is found between acid reflux and perceived heartburn and no excessive esophageal acid exposure is present, gastro-esophageal reflux is not the cause of the patient’s symptoms, and patients are classified as having ‘functional heartburn’ according to the Rome III criteria 13 or functional dyspepsia depending on the symptoms that are dominant.
The efficacy of PPI therapy in healing esophageal erosions is very high with a number needed to treat for benefit of 1.7 (95% CI 1.5–2.1), one of the lowest seen in clinical medicine.14 Besides healing reflux esophagitis, treatment with a PPI is very effective for the symptomatic relief of heartburn.6 Clinical trials evaluating the efficacy of PPI in relieving symptoms among different types of GERD generally state a larger rate of non-response in patients with NERD than in patients with ERD.15–17
In most clinical trials, NERD patients are defined only by the presence of typical reflux symptoms and a negative endoscopy. However, without appropriate functional testing it is difficult, if not impossible, to distinguish between functional heartburn, functional dyspepsia, and true NERD. The heterogeneity of the trial participants across studies could cause underestimation of the response rates to PPI treatment in NERD.
We performed a systematic review and meta-analysis of the literature aiming: (i) to estimate the response to PPI therapy in ‘true’ NERD patients (in whom diagnosis was confirmed with pH testing) compared to patients with ERD, and (ii) to estimate whether the response to PPI therapy would vary according to different diagnostic criteria for NERD.
We conducted a comprehensive search of MEDLINE (1966–2010), Cochrane Comprehensive Trial Register (1997–2010) and EMBASE (1985–2010) databases. The search strategy consisted of a combination of the following MESH terms and text words: gastro-esophageal reflux, GERD, non-erosive reflux, NERD, heartburn, pyrosis, regurgitation, and terms describing treatment: proton pump inhibitor, PPI, omeprazole, esomeprazole, rabeprazole, pantoprazole, lansoprazole, and dexlansoprazole. A Cochrane filter for identifying randomized trials was applied to the search results. Furthermore, we manually searched reference lists from trials obtained through electronic searching to identify additional studies of interest.
Selection of studies
Articles were eligible for inclusion in the meta-analysis if they fulfilled the following criteria: Studies included adults of either gender with heartburn as the predominant symptom. Subjects underwent short-term treatment with a PPI (esomeprazole, lansoprazole, omeprazole, pantoprazole, rabeprazole, or dexlansoprazole) and study outcome measures were compared using a control group treated with another pharmacological intervention or placebo. The studies were divided into groups treating ERD or NERD patients (Fig. 1). Subclassification of studies treating NERD patients was based on the way NERD was defined in the inclusion criteria of trials. Subcategory (a) contains patients with heartburn in whom no endoscopy was performed, subcategory (b) contains patients who were defined as NERD by the presence of heartburn and a normal endoscopic assessment, and subcategory (c) contains patients defined as NERD by the presence of heartburn, a normal endoscopic assessment and a positive 24-h pH-measurement (Fig. 1). The pH-measurement was considered positive when there was either abnormal esophageal acid exposure or a positive temporal relationship between symptoms and reflux events during the measurement. The primary outcome of selected studies was defined as complete heartburn relief or partial heartburn relief. Complete heartburn relief is defined as the complete absence of the symptom at the end of the treatment period. Partial heartburn relief is specified as ‘no more than 1 episode in the previous 7 days of mild severity’ in all studies, except in three studies where it is specified as an improvement in frequency from the baseline score.
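The grouping rules above can be written down as a small decision function. The following is only an illustrative sketch: the function names, argument names, and group labels are hypothetical and do not come from the data extraction form used in this study.

```python
def ph_measurement_positive(abnormal_acid_exposure: bool,
                            positive_symptom_association: bool) -> bool:
    """A pH-measurement counts as positive if either criterion is met."""
    return abnormal_acid_exposure or positive_symptom_association


def classify_study(endoscopy_performed: bool,
                   erosions_present: bool = False,
                   positive_ph_required: bool = False) -> str:
    """Assign a trial population to ERD or to NERD subcategory (a), (b) or (c).

    The arguments describe the inclusion criteria of a trial, not an
    individual patient. The labels are illustrative only.
    """
    if not endoscopy_performed:
        # Heartburn treated empirically, no endoscopy performed.
        return "NERD (a)"
    if erosions_present:
        return "ERD"
    if positive_ph_required:
        # Normal endoscopy plus a positive 24-h pH-measurement.
        return "NERD (c)"
    # Normal endoscopy only.
    return "NERD (b)"
```

Under these assumptions, a trial that enrolled patients with heartburn and a normal endoscopy but did not require pH testing falls into subcategory (b).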
Two reviewers independently extracted data and assessed study quality of selected articles using a data extraction form. The following clinical features mentioned in the trials were extracted: setting (primary, secondary, or tertiary centers); number of participating centers; country of origin; inclusion and exclusion criteria used; baseline characteristics of study participants; study drug; frequency of dosing; study duration; reported complete relief of heartburn; definition of partial heartburn relief; reported partial relief of heartburn. Assessment of study quality was performed according to the Jadad scoring system.18 A score of 0–5 was given depending on whether the article adequately described randomization, the method of blinding, and drop-outs during the trial.
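As an illustration of how a 0–5 quality score of this kind is assembled, the sketch below follows the commonly described form of the Jadad scale: one point each for mentioning randomization and double blinding, one further point (or one point deducted) when the method is described and appropriate (or inappropriate), and one point for describing withdrawals. The function and argument names are hypothetical, and the exact rules applied by the reviewers may have differed.

```python
from typing import Optional


def jadad_score(randomized: bool,
                randomization_method: Optional[str],
                double_blind: bool,
                blinding_method: Optional[str],
                withdrawals_described: bool) -> int:
    """Return a Jadad-style quality score between 0 and 5.

    The method arguments take "appropriate", "inappropriate", or None
    (method not described).
    """
    def item(mentioned: bool, method: Optional[str]) -> int:
        if not mentioned:
            return 0
        if method == "appropriate":
            return 2   # 1 for mentioning it, +1 for an appropriate method
        if method == "inappropriate":
            return 0   # 1 for mentioning it, -1 for an inappropriate method
        return 1       # mentioned, method not described

    return (item(randomized, randomization_method)
            + item(double_blind, blinding_method)
            + int(withdrawals_described))
```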
All statistical analyses were performed using JMP v.7 and Comprehensive Meta-Analysis 2.2. First, after obtaining the proportion of individuals achieving symptom relief, the relief rate and its respective 95% confidence interval (CI) on PPI treatment were calculated for each study. Statistical heterogeneity across the various studies was then tested with the Q-statistic.19 A PQ-value <0.10 indicated significant statistical heterogeneity across studies, supporting the use of a random effects model. Heterogeneity was quantified with the I2 metric, which is independent of the number of studies included in the meta-analysis.20 I2 takes values between 0% and 100%, with higher values suggesting a greater degree of heterogeneity (I2 = 0–25%: no heterogeneity; I2 = 25–50%: moderate heterogeneity; I2 = 50–75%: large heterogeneity; I2 = 75–100%: extreme heterogeneity).
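For readers who want to see how the Q-statistic and I2 relate, the following is a minimal sketch. It assumes that relief rates are analyzed on the logit scale with inverse-variance weights, which is a common choice but is an assumption here; the actual analysis was run in Comprehensive Meta-Analysis, and the function names and input data below are hypothetical. The P-value for Q (a chi-square test with k − 1 degrees of freedom) is omitted because it requires a statistics library.

```python
import math


def logit_and_variance(events: int, total: int):
    """Logit of the relief rate and its approximate variance."""
    p = events / total
    return math.log(p / (1 - p)), 1 / events + 1 / (total - events)


def heterogeneity(studies):
    """Return (Q, degrees of freedom, I2 in percent).

    `studies` is a list of (patients with relief, patients treated) pairs.
    """
    effects = [logit_and_variance(e, n) for e, n in studies]
    weights = [1 / v for _, v in effects]
    # Inverse-variance weighted mean of the logit rates.
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled mean.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(studies) - 1
    # I2: share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2
```

With two hypothetical trials of 100 patients each in which 30 and 70 patients respond, Q far exceeds its single degree of freedom and I2 falls in the range labeled extreme above.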
As significant heterogeneity was found across all studies included, a random effects (RE) model was used to calculate pooled rates for symptom relief.21 These rates indicate the estimated proportion of patients reporting heartburn relief across studies. As these rates are among patients on active treatment, and no direct comparisons with placebo or control treatment are done in the studies, neither the relative risk nor the odds ratio for improvement vs placebo or control, nor the number needed to treat, can be calculated.
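A pooled rate under a random effects model can be sketched as follows. This example assumes the DerSimonian-Laird estimator of the between-study variance and logit-transformed rates; both are assumptions made for illustration, as the text does not specify the estimator used by the software. The function name and the input data are hypothetical.

```python
import math


def pooled_rate_random_effects(studies):
    """Return (pooled rate, lower 95% limit, upper 95% limit).

    `studies` is a list of (patients with relief, patients treated) pairs.
    """
    ys, vs = [], []
    for events, total in studies:
        p = events / total
        ys.append(math.log(p / (1 - p)))               # logit relief rate
        vs.append(1 / events + 1 / (total - events))   # its variance

    # Fixed-effect quantities needed for the between-study variance.
    w = [1 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    df = len(studies) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0    # DerSimonian-Laird

    # Random effects weights add tau2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in vs]
    mean = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(x: float) -> float:
        return 1 / (1 + math.exp(-x))

    # Back-transform the logit estimate and its limits to proportions.
    return expit(mean), expit(mean - 1.96 * se), expit(mean + 1.96 * se)
```

When the trials disagree, tau2 becomes positive, the weights move closer to each other, and the confidence interval widens compared with a fixed-effect analysis.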
The main analyses on symptom response calculated the pooled estimates of complete relief at week 4, partial relief at week 4, complete relief at week 8, and partial relief at week 8.
Potential association between the symptom response and study quality as measured by the Jadad score was determined by calculating contingency coefficients in the meta-regression. The log event rate of symptom relief on PPI therapy was the dependent variable in the meta-regression model, and the independent variable was the quality of study estimated by the Jadad score. Coefficients that reflect the percent increase in symptom response for each unit increase of the independent variable and the 95% CI for the respective coefficients were calculated. Additional analyses were performed using type and dose of PPI used, and patients’ age and gender, as independent variables.
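A meta-regression of this type can be sketched as a weighted least-squares fit of the transformed relief rate on a single study-level covariate such as the Jadad score. The example below is a simplified illustration, not the procedure implemented in the software used: the function names are hypothetical, the between-study variance `tau2` is passed in rather than estimated, and the P-value uses a normal approximation.

```python
import math


def meta_regression(effects, variances, covariate, tau2=0.0):
    """Weighted regression of study effects on one covariate.

    Returns (slope, intercept, standard error of slope, two-sided P-value).
    """
    w = [1 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1 / sxx)
    z = slope / se_slope
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided, normal approx.
    return slope, intercept, se_slope, p_value


def percent_change_per_unit(slope: float) -> float:
    """Percent change in the rate per unit of covariate, for log-scale effects."""
    return (math.exp(slope) - 1) * 100
```

A slope close to zero with a P-value above 0.05 would correspond to the absence of a detectable association between the covariate and the symptom response.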
Assessment of publication bias
Publication bias was assessed with Begg and Mazumdar’s rank correlation test 22 and Egger’s intercept test.23 A funnel plot of studies’ standard error vs logit symptom relief rates is also included in this report.
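The idea behind Egger’s intercept test can be sketched as an ordinary least-squares regression of each study’s standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The code below is an illustrative sketch with hypothetical names. It needs at least three studies and returns the t statistic only, because the P-value requires a t distribution with k − 2 degrees of freedom, which is not available in the Python standard library.

```python
import math


def egger_intercept(effects, standard_errors):
    """Return (intercept, standard error of intercept, t statistic)."""
    k = len(effects)
    x = [1 / se for se in standard_errors]                     # precision
    z = [y / se for y, se in zip(effects, standard_errors)]    # standardized effect
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance with k - 2 degrees of freedom.
    residual_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + x_bar ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t
```

If all studies estimate the same effect regardless of their size, the standardized effect is proportional to the precision and the intercept is zero; effects that grow as studies become less precise push the intercept away from zero.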
The search strategy identified a total of 4995 references that met the search criteria. Figure 2 shows the flow of this meta-analysis, listing studies excluded and citing reasons for exclusion. A total of 59 randomized controlled trials met the criteria for inclusion in the meta-analysis comprising a total of 26 885 patients in active treatment arms (see supplementary Table S1 online).15,16,24–80 The studies were published between 1988 and 2009. Thirty-seven studies described multiple active treatment arms with different dosages or types of PPI. Fifty-seven studies used a parallel group design and the remaining two studies used a crossover design. Fifty-seven studies were double blinded, while two studies used an open label design. Three studies added pH-metry to the inclusion criteria for NERD; no clinical trial has combined pH-metry with impedance tracings.
Significant heterogeneity was found (P < 0.0001, I2 = 93.9%) across clinical trials using the I2-metric. Heterogeneity was quantified across 84 studies and sub-studies (separate treatment groups within a trial) that reported complete relief at week 4 as the main outcome. Complete relief of heartburn at week 4 was chosen as the primary outcome for our analysis as it was the most commonly reported outcome.
Symptom relief after 4 weeks of PPI therapy
Thirty-two studies described the rate of complete relief of heartburn after 4 weeks of PPI therapy in patients with ERD. The pooled estimate of complete relief after 4 weeks in patients with ERD was 0.72 (95% CI 0.69–0.74) (Fig. 3). Twenty-two studies described the complete relief of heartburn after 4 weeks of PPI therapy in patients with NERD, of which eight studies from subcategory (a), defining NERD by the presence of heartburn without the performance of an upper endoscopy, 12 studies in subcategory (b), defining NERD by the presence of heartburn and a normal endoscopy, and two studies in subcategory (c), defining NERD by the presence of heartburn, a normal upper endoscopy and a positive pH measurement. One of these studies reported outcome after 2 weeks of PPI treatment. The pooled estimate of complete symptom relief after 4 weeks for patients with NERD was 0.50 (95% CI 0.43–0.57), 0.49 (95% CI 0.44–0.55), and 0.73 (95% CI 0.69–0.77) for NERD subcategories (a–c) respectively (Figs 4–6).
Partial relief of heartburn after 4 weeks of PPI therapy was reported by six studies including patients with ERD and the pooled estimate was 0.75 (95% CI 0.71–0.78) (data not shown). Nineteen studies described the partial relief of heartburn after 4 weeks of PPI therapy in patients with NERD, of which eight studies from subcategory (a), 10 studies from subcategory (b), and one study from subcategory (c). The pooled estimate of partial symptom relief after 4 weeks was 0.71 (95% CI 0.59–0.81), 0.65 (95% CI 0.61–0.69), and 0.85 (95% CI 0.55–0.96) for NERD subcategories (a–c) respectively (data not shown).
Symptom relief after 8 weeks of PPI therapy
Complete relief of heartburn after 8 weeks of PPI therapy was reported by six studies including patients with ERD and the pooled estimate was 0.73 (95% CI 0.59–0.84) (data not shown). Three studies described the complete relief of heartburn after 8 weeks of PPI in patients with heartburn in whom no endoscopy was performed; the pooled estimate was 0.47 (95% CI 0.43–0.51) (data not shown). No studies reported complete symptom relief at 8 weeks in NERD groups defined by endoscopy or by pH measurement.
Partial relief of heartburn after 8 weeks of PPI therapy was reported by three studies including patients with ERD and the pooled estimate was 0.76 (95% CI 0.72–0.80) (data not shown). Three studies described partial relief of heartburn after 8 weeks of PPI therapy in patients with NERD, of which two studies treated patients empirically and one study classified NERD by a normal endoscopy. No studies classifying NERD by a normal endoscopy and a positive pH measurement reported partial symptom relief at 8 weeks. The pooled estimate of partial symptom relief after 8 weeks was 0.69 (95% CI 0.64–0.74) for empirically treated patients and 0.76 (95% CI 0.66–0.84) for NERD defined by endoscopy respectively (data not shown).
Meta-regression analysis and publication bias
Symptom response to PPI did not significantly correlate with Jadad score, age or gender of the study subjects, study setting, type or dosage of PPI used in the study (P > 0.05). Funnel plot analysis showed no immediately apparent asymmetry, arguing against the presence of a publication bias (Fig. S1).
This meta-analysis demonstrates that the estimated rates of complete and partial symptom relief after short-term PPI therapy are lower in poorly characterized NERD patients compared to the response rate in patients with ERD. However, in patients in whom the diagnosis of NERD is supported by a positive pH-measurement, the estimated complete and partial response rates after 4 weeks of PPI therapy are comparable to the response rate in ERD.
This finding is in contrast to the widespread belief that NERD patients are less responsive to PPI,6,17,81,82 indicating that when characterized correctly, NERD patients do not have a poorer response to PPI than patients with ERD. The apparent contrast with prior views on NERD derives from the issue that in the majority of trials patients have been labeled as NERD because they suffered from symptoms of heartburn in the absence of esophageal erosions. While to many clinicians heartburn and regurgitation are equivalent to acid reflux, in effect during 24-h pH measurement, up to 50% of subjects with typical reflux symptoms have a physiologic amount of acid exposure.83,84 Using the symptom association probability, a score measuring the correlation between acid reflux episodes and symptoms,85 a similar distribution is found, and in half of the subjects reporting typical reflux symptoms these symptoms are not related to actual reflux.86 This supports the theory that patients with heartburn are a heterogeneous group and not all patients with a normal endoscopy can be regarded as NERD. In those patients with heartburn in whom no abnormalities during upper endoscopy and a 24-h pH measurement are found, gastro-esophageal acid reflux can be excluded as a cause of their symptoms, and they are classified as functional heartburn according to the Rome III criteria.13 Functional heartburn is a functional disorder in which symptoms are related to psychological factors and disturbed visceral perception, while reflux of acidic gastric content does not play a role. Acid suppressive therapy is thus not effective for the treatment of functional heartburn, and tricyclic antidepressants and cognitive behaviour therapy can be used instead.87
A careful characterization of NERD patients thus seems imperative to obtain representative results for future clinical trials, both in studies with new acid-suppressive agents and in studies aimed at NERD patients refractory to PPI; otherwise, pollution of the study population with functional heartburn and dyspepsia patients will lead to significant underestimation of the true effect of the drug or operation on reflux. Characterization can be done with 24-h pH measurements, as has been done in the studies by Richter et al. and Johnsson et al.79,80 However, the addition of 24-h catheter-based pH measurements to clinical trial inclusion criteria could have a negative effect on patient enrolment. A validated questionnaire such as the RDQ could be used as an alternative to characterize true GERD patients, as it has been shown to increase sensitivity of GERD diagnosis.88
The rate of antireflux therapy failure is an important statistic in the clinical setting as well, as persisting reflux symptoms are one of the most frequent reasons for consultation of a gastro-enterologist. PPI failure in the clinical setting can be defined as the persistence of erosions during endoscopy, pathological acid reflux during pH-metry performed under PPI, or the persistence of typical symptoms. In NERD patients the analysis of symptoms is the most appropriate, as erosions are by definition not present in NERD patients and a clear-cut relation between acid reflux episodes and symptoms (a positive SAP or SI 11) can be present in the absence of pathological acid reflux. In this meta-analysis PPI failure rate was calculated for both complete and partial symptom relief. We believe the complete relief of symptoms is closest to an objective definition of PPI success, being the most stringent method and thereby reducing the concomitant interindividual variance of subjective symptom improvement scores. When using the latter criterion and when adequately characterizing NERD patients by a negative endoscopy and a positive 24-h pH measurement, the PPI failure rate is only between 15% and 27%. This figure is compatible with failure rates in patients with ERD, and far from the 40–50% of PPI failure, which is often reported.
An additional observation in this meta-analysis is that the estimated complete symptom relief rate in patients with ERD and in empirically treated patients after 4 weeks of PPI therapy does not differ from the estimated rates in the same groups after 8 weeks of PPI therapy. This indicates that a short follow-up in the clinical setting is sufficient to estimate whether a patient will respond symptomatically to a PPI, and prolonging of the same treatment regimen in non-responders would be of limited value. This rapid response is in accordance with a previously calculated optimal cut-off value to assess symptomatic response to PPI of 1 week.89 A similar inference for research purposes could be that a follow-up of 4 weeks in clinical trials should be sufficient to observe a treatment effect, if present. It should be taken into consideration, however, that no 8-week data was available for patients with NERD defined by endoscopy with or without pH-measurement. A potential limitation in this respect is that our meta-analysis is conducted using heartburn as the main outcome, whereas other GERD-related symptoms might take longer to respond to PPI treatment.
The main potential limitation of the present meta-analysis is the small number of clinical trials including NERD patients after appropriate functional testing, although it was performed on sizeable groups. The definition for a positive pH-measurement differed slightly between studies. To define pathological esophageal acid exposure Richter et al.79 used a cut-off value of 5% time with pH < 4, while Johnsson et al.80 used 3.2% of total time or 3.4% of supine time with pH < 4 as the cut-off value and these authors also included patients with a SAP > 95%. Both approaches lead to a selection of NERD patients in which acid reflux underlies the present symptoms; however, the approach by Johnsson et al. leads to a less stringent selection and will certainly not lead to an overestimation of PPI response. Additionally, Johnsson et al. reported the response to PPI after 2 weeks instead of the regularly used period of 4 weeks. The expected response rate is higher after 4 than after 2 weeks, so the inclusion of the study in the meta-analysis will certainly not lead to an overestimation of the response to PPI in this group. A potential limitation of the present study is that it was conducted with relief of heartburn as the predominant outcome measure, as this is the most frequently reported reflux symptom.2 It must be noted that part of the GERD patients present with other dominant symptoms,88 and no analysis was performed on the response to PPI of less prevalent reflux-related symptoms such as regurgitation or chest pain. Therefore no conclusions can be drawn regarding the response of those symptoms to PPI in NERD patients.
The management of the non-responding GERD patients with proven acid-related symptoms remains challenging, however. Firstly, careful instructions on the use of PPIs should be given, as only 36% of patients treated in primary care appropriately administer their PPI.90 Frequently applied strategies include a switch of PPI type or doubling the PPI dosage. There is no evidence to support these strategies, a finding supported by the lack of association between symptomatic response and PPI type and dosage in this study. There might be a role for new drug formulations or for drugs with more extensive acid suppressive potential; these are extensively discussed elsewhere.91,92 New drugs that inhibit the occurrence of both acidic and weakly acidic reflux episodes will not be on the market for the next few years. Finally, if maximum medical treatment provides insufficient symptom relief in patients with proven acid-related symptoms, surgical intervention may be warranted.
In conclusion, our analyses support the conclusion that when NERD is well defined with functional studies, PPI therapy is as effective in NERD patients as it is in patients with ERD and the PPI failure rate is only around 20%. We argue that the previously reported lower response rates in patients with NERD are the result of contamination of the NERD study populations with subjects with functional heartburn and functional dyspepsia.
A.J. Bredenoord is supported by The Netherlands Organisation for Scientific Research (NWO).
P. Weijenborg, F. Cremonini, and A. Smout have no conflicts of interest. A. Bredenoord: research funding of Astra Zeneca and Movetis-Shire; lecture fees of AstraZeneca and MMS. P.W. Weijenborg had full access to all of the data in the study and takes responsibility for integrity of the data and the accuracy of the data analysis.
PWW studied concept and design, acquired data, analyzed and interpreted data, drafted manuscript, approved final submitted draft; FC studied concept and design, analyzed and interpreted data, critically revised manuscript for important intellectual content, approved final submitted draft; AJB studied concept and design, acquired data, interpreted data, supervised, critically revised manuscript for important intellectual content, approved final submitted draft; AJPMS studied concept and design, interpreted data, supervised, critically revised manuscript for important intellectual content, approved final submitted draft.