The diagnosis of gastro-oesophageal reflux disease (GERD) remains a challenge as both invasive methods and symptom-based strategies have limitations. The symptom-based management of GERD in primary care may be further optimised with the use of a questionnaire.
To assess the diagnostic validity of the GerdQ questionnaire in patients with symptoms suggestive of GERD.
Patients with symptoms suggestive of GERD and no alarm features underwent upper endoscopy and, if this was normal, pH-metry. Patients were followed for 4 weeks and completed the GerdQ at both visits, with the investigator blinded to the responses. Reflux oesophagitis or pathological acid exposure was used as the diagnostic reference for GERD. The diagnostic accuracy of symptom response to proton pump inhibitor (PPI) therapy for GERD was also assessed.
Among the 169 patients, a GerdQ cutoff ≥9 gave the best balance with regard to sensitivity, 66% (95% CI: 58–74), and specificity, 64% (95% CI: 41–83), for GERD. The high prevalence of reflux oesophagitis (81%) resulted in a high proportion of true positives, but at the same time a high proportion of false-negatives. Consequently, GerdQ had a high positive predictive value, 92% (95% CI: 86–97), but a low negative predictive value, 22% (95% CI: 13–34), for GERD. Symptom resolution on PPI therapy had high sensitivity, 76% (95% CI: 66–84), but low specificity, 33% (95% CI: 17–53), for GERD.
GerdQ is a useful complementary tool for the diagnosis of gastro-oesophageal reflux disease in primary care. The implementation of GerdQ could reduce the need for upper endoscopy and improve resource utilisation. Symptom resolution on proton pump inhibitor did not predict gastro-oesophageal reflux disease.
There is no single perfect test for establishing the diagnosis of gastro-oesophageal reflux disease (GERD), either by symptom evaluation or by objective investigations, including oesophagogastroduodenoscopy (EGD) and pH-metry; each of these methods has limitations. EGD has high specificity but moderate sensitivity for untreated GERD since many of these patients do not have reflux oesophagitis. In patients with endoscopy-negative reflux disease (ENRD), oesophageal pH-metry has moderate sensitivity and high specificity for the diagnosis of GERD. EGD and pH-metry can be unpleasant for some patients, have limited availability in most healthcare systems and are resource demanding and costly. Symptom response to a short course of proton pump inhibitor (PPI) therapy is of questionable value as a diagnostic method for GERD.[3-7]
The awareness of the fact that many patients with symptoms of GERD have no visible oesophageal mucosal breaks on endoscopy, i.e. the ENRD patients, has led to a change in how these patients are diagnosed and treated. Current guidelines recommend a symptom-based approach for the diagnosis and treatment in primary care of patients who are young, have a short disease history and no alarm symptoms.[8-11]
However, the symptom-based assessment is not straightforward either and often leads to misinterpretation of symptoms, including their localisation and burden. The sensitivity and specificity of heartburn and/or regurgitation for GERD vary considerably depending on the criteria set for frequency and intensity of symptoms. GERD symptoms also overlap with those of differential diagnoses such as functional dyspepsia and irritable bowel syndrome, and with extra-oesophageal syndromes like chronic cough, laryngitis and asthma, adding to the complex symptomatology of patients with upper GI symptoms.[13, 14]
There have been several attempts to develop questionnaires to facilitate the symptom-based diagnosis and management of GERD. However, most of these questionnaires such as the Gerd Impact Scale (GIS) and the ReQuest questionnaire are not developed or validated as diagnostic tools per se and are often too long and complicated for use in routine clinical care.
GerdQ is a 6-item, easy-to-use questionnaire that was developed primarily as a diagnostic tool for GERD in primary care patients consulting for upper GI complaints. In a previous study, GerdQ achieved diagnostic precision similar to that of a routine symptom-based diagnosis made by a gastroenterologist and performed slightly better than the symptom-based diagnosis of GERD in primary care.
However, validation of GerdQ for the diagnosis of GERD is needed in a primary care population in which the GerdQ was developed and is primarily intended to be used. The aim of this study was therefore to assess the diagnostic validity of GerdQ in a population with suspected GERD referred for open-access EGD.
Patients And Methods
This diagnostic validation study was done as a part of an open, randomised parallel-group study performed at 18 gastroenterology out-patient clinics in Norway from March to December 2009 (called the Symptom-based vs. Endoscopy-based Approach – the SVEA study). This study compared an investigation-based approach (called Ordinary Clinical Pathway – OCP) with a symptom-based approach (called New structured Pathway – NSP) for the diagnosis and initial treatment of GERD. This design gave us the unique possibility to utilise the data from patients randomised to the OCP arm, and to perform a diagnostic validation study of the GerdQ.
Eligible patients provided written informed consent followed by completion of the GerdQ questionnaire, blinded to the investigator. EGD was then performed and if EGD revealed no reflux oesophagitis (Los Angeles grade A–D), subsequent pH-metry was scheduled. Following this, acid-suppressive treatment was started and after 4 weeks all patients returned for follow-up, again completing the GerdQ. GerdQ was completed blinded to the investigator at both visits.
The study protocol was approved by the Regional Committee for Health Research Ethics in Western Norway. This study was a part of a pan-European project evaluating GerdQ as a tool for a more structured management of GERD. ClinicalTrials.gov identifier NCT00842387.
The GerdQ questionnaire
The GerdQ questionnaire is a simple, self-administered and patient-centred questionnaire including six items (Table 1). The questionnaire was developed as an exploratory part of the Diamond study[1, 17] and the six items were derived from three questionnaires (Gastrointestinal Symptom Rating Scale – GSRS, Reflux Disease Questionnaire – RDQ and the GERD Impact Scale – GIS) used in the study.
Table 1. The GerdQ questionnaire asks patients to score the number of days with symptoms and use of over-the-counter (OTC) medications during the previous 7 days. It uses a four graded Likert scale (0–3) to score the frequency of four positive predictors of GERD (heartburn, regurgitation, sleep disturbance due to reflux symptoms or use of over-the-counter (OTC) medications for reflux symptoms) and a reversed Likert scale (3–0) for two negative predictors of GERD (epigastric pain and nausea) giving a total GerdQ score range of 0–18. The sleep disturbance and use of OTC medication are also used for assessment of the impact of GERD, giving a separate ‘impact score’ ranging from 0 to 6
Frequency score (points) for symptom
How often did you have a burning feeling behind your breastbone (heartburn)?
How often did you have stomach contents (liquid or food) moving upwards to your throat or mouth (regurgitation)?
How often did you have pain in the centre of the upper stomach?
How often did you have nausea?
How often did you have difficulty getting a good night's sleep because of your heartburn and/or regurgitation?
How often did you take additional medication for your heartburn and/or regurgitation, other than what the physician told you to take (such as Tums, Rolaids, Maalox)?
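The scoring rule described in the Table 1 caption can be sketched as follows. This is a minimal illustration, not the official scoring software. The item keys and function names are hypothetical, and the mapping from days to points (0 days = 0, 1 day = 1, 2–3 days = 2, 4–7 days = 3) is an assumption, since the frequency columns of Table 1 are not reproduced in the text above.

```python
# Hypothetical item keys; the questionnaire itself only gives the question wording.
POSITIVE_ITEMS = ("heartburn", "regurgitation", "sleep_disturbance", "otc_medication")
NEGATIVE_ITEMS = ("epigastric_pain", "nausea")
IMPACT_ITEMS = ("sleep_disturbance", "otc_medication")


def frequency_points(days):
    """Map days with the symptom in the previous 7 days to 0-3 points.

    Assumed bins: 0 days -> 0, 1 day -> 1, 2-3 days -> 2, 4-7 days -> 3.
    """
    if not 0 <= days <= 7:
        raise ValueError("days must be between 0 and 7")
    if days == 0:
        return 0
    if days == 1:
        return 1
    if days <= 3:
        return 2
    return 3


def gerdq_scores(days_by_item):
    """Return (total score 0-18, impact score 0-6) for one completed questionnaire."""
    total = 0
    for item in POSITIVE_ITEMS:
        total += frequency_points(days_by_item[item])
    for item in NEGATIVE_ITEMS:
        # Negative predictors use the reversed scale (3-0).
        total += 3 - frequency_points(days_by_item[item])
    impact = sum(frequency_points(days_by_item[item]) for item in IMPACT_ITEMS)
    return total, impact
```

For example, a patient reporting heartburn on 5 days, regurgitation on 2 days, sleep disturbance on 1 day and none of the other items would score 3 + 2 + 1 + 0 on the positive items and 3 + 3 on the reversed items, giving a total of 12 and an impact score of 1.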
For recruitment a referral list from primary care physicians containing patients with symptoms suggestive of GERD and requesting further evaluation was used. Eligible patients were ≥18 years of age and had symptoms suggestive of GERD. Patients presenting with alarm symptoms such as unintentional weight loss, severe or progressive dysphagia or GI bleeding were excluded from participation as well as those who had undergone endoscopy and/or pH-metry during the last year.
Criteria for objectively verified GERD
Pre-endoscopy use of acid-suppressive medications was restricted in accordance with existing local guidelines, which usually prohibited continuous use of both PPI and histamine-2-receptor antagonists (H2RA) during the two weeks before EGD but allowed limited on-demand use of antacids or H2RA. All evaluations of EGD and pH-metry were performed by experienced, board-certified endoscopists. EGD was performed according to local routines and the Los Angeles (LA) classification was used for grading of reflux oesophagitis. Patients with normal EGD were referred for 24-h conventional or, if available, 48-h wireless oesophageal pH-metry. Percent time with pH <4 was registered, including measurements in the upright and supine positions. The total number of reflux episodes and the number of reflux episodes longer than 5 min were also recorded, and given as an average per 24 h. The same reference range (see below) was applied for conventional 24-h and for 48-h pH-metry. During the pH-metry, patients were instructed to record typical reflux symptoms on the receiver, and a Symptom Association Probability (SAP) of ≥95% for association with acid reflux indicated a positive test for GERD irrespective of the level of oesophageal acid exposure. For patients with normal endoscopy in whom pH-metry could not be undertaken within the timeframe of the study, an effort was made to collect data from pH-monitoring performed off medication within the 3 months immediately after the study.
GERD was diagnosed if at least one of the following criteria was fulfilled:
LA grade A-D reflux oesophagitis at endoscopy
Pathological oesophageal pH-metry, defined by fulfilling one of the following criteria:
Total oesophageal pH<4 for ≥5.5% of time.
Total oesophageal pH<4 for <5.5% of time but ≥6.9% (supine) or ≥6.7% (upright) of time.
Positive SAP ≥95% for association of symptoms with acid reflux.
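The reference criteria listed above can be expressed as a small decision function. This is an illustrative sketch only: the function and parameter names are hypothetical, and missing pH-metry is treated as "not verified", which is an assumption about how such patients would be classified.

```python
def pathological_ph_metry(total_pct, supine_pct, upright_pct, sap_pct=None):
    """Apply the three pH-metry criteria; each argument is a percentage (0-100)."""
    if total_pct >= 5.5:
        return True
    # Total exposure below 5.5% but pathological in one body position.
    if supine_pct >= 6.9 or upright_pct >= 6.7:
        return True
    # Positive symptom association probability, irrespective of acid exposure.
    return sap_pct is not None and sap_pct >= 95.0


def objectively_verified_gerd(la_grade=None, ph_metry=None):
    """Return True if at least one reference criterion for GERD is fulfilled.

    la_grade: "A".."D" for reflux oesophagitis, or None for a normal endoscopy.
    ph_metry: dict of arguments for pathological_ph_metry, or None if not done
              (assumption: no pH-metry means GERD is not verified).
    """
    if la_grade in ("A", "B", "C", "D"):
        return True
    if ph_metry is None:
        return False
    return pathological_ph_metry(**ph_metry)
```

Under this sketch a patient with a normal endoscopy, total acid exposure of 3.0% and an SAP of 96% would be classified as having GERD, in line with the third pH-metry criterion.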
Acid-suppressive therapy in the study
Acid-suppressive treatment was not prespecified in the study protocol and the investigator initiated treatment according to ordinary clinical practice and in line with current rules for reimbursement in Norway. First-line treatment for GERD in Norway specifies generic PPI (lansoprazole, omeprazole or pantoprazole once daily in standard doses) or H2RA in most patients and esomeprazole once daily reserved for patients with severe reflux oesophagitis, complications such as oesophageal strictures and metaplasia or symptoms refractory to other treatment. For patients with no objective confirmation of GERD, in those with mild symptoms of GERD, or suspicion of other upper GI morbidity, it was an alternative not to prescribe any acid-suppressive treatment at baseline.
PPI treatment test
The study also assessed whether the response to 4 weeks of treatment with a PPI could be utilised as a separate diagnostic test for GERD and, furthermore, whether the PPI treatment test could be included as a third diagnostic modality in addition to EGD and pH-metry. In patients with heartburn on two or more days per week at baseline, a positive PPI test was defined as no more than one day with heartburn during the last of the four weeks of PPI treatment. Patients who were prescribed no acid-suppressive treatment, or treatment other than a PPI, at baseline were not included in this analysis.
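The definition of the PPI treatment test can be written as a short rule. The sketch below uses hypothetical parameter names and returns None for patients who fall outside the analysis population described above.

```python
def ppi_test_result(baseline_heartburn_days, last_week_heartburn_days, prescribed_ppi):
    """Classify the PPI treatment test for one patient.

    Returns True (positive), False (negative) or None (not eligible for the analysis).
    """
    # Only PPI-treated patients with heartburn on >= 2 days per week at baseline qualify.
    if not prescribed_ppi or baseline_heartburn_days < 2:
        return None
    # Positive test: not more than one day with heartburn in the last treatment week.
    return last_week_heartburn_days <= 1
```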
Receiver operating characteristic (ROC) curves with associated 95% confidence intervals (CI) were used for estimating the optimal cutoff value for a GERD diagnosis and its associated sensitivity and specificity (SE and SP), positive and negative likelihood ratios (LR+ and LR−) and positive and negative predictive values (PPV and NPV). Chi-squared statistics were used to compare response rates in confirmed GERD vs. non-GERD patients with different GerdQ baseline scores and the response to PPI therapy. Descriptive data are expressed as mean and standard deviation for continuous variables and as numbers and percentages for categorical variables. A two-sided P-value of less than 0.05 was considered to be statistically significant. Statistical analyses were performed using the statistical package Stata version 12 (StataCorp LP, College Station, TX, USA).
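As an illustration of how a cutoff can be chosen from a ROC-type analysis, the sketch below scans candidate GerdQ cutoffs and keeps the one that maximises Youden's J (sensitivity + specificity − 1). Youden's J is an assumption made for this example; the text does not state which criterion was used to define the "optimal" cutoff, and the analysis itself was run in Stata. The function names and the toy data in the usage note are hypothetical.

```python
def sensitivity_specificity(scores, has_gerd, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is called test-positive."""
    pairs = list(zip(scores, has_gerd))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)
    fn = sum(1 for s, d in pairs if d and s < cutoff)
    tn = sum(1 for s, d in pairs if not d and s < cutoff)
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, has_gerd, candidates=range(0, 19)):
    """Return (cutoff, J, sensitivity, specificity) maximising Youden's J.

    Ties are resolved in favour of the lowest cutoff.
    """
    best = None
    for cutoff in candidates:
        se, sp = sensitivity_specificity(scores, has_gerd, cutoff)
        j = se + sp - 1
        if best is None or j > best[1]:
            best = (cutoff, j, se, sp)
    return best
```

With made-up scores `[3, 5, 6, 7, 8, 9, 10, 11, 12, 13]` and reference diagnoses `[0, 0, 0, 1, 0, 1, 1, 1, 1, 1]`, `best_cutoff` selects a cutoff of 9; these ten values are invented for illustration and are not study data.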
Of the 347 patients randomised to the original study, 173 patients were randomised to the OCP arm. Three had endoscopic findings indicating disease other than GERD. One additional patient did not complete the GerdQ questionnaire and was excluded. In total, 169 patients (98%) were included in the validation analysis of the GerdQ. Fifty-three per cent were men, mean age was 47 years and mean duration of symptoms 9 years. Three patients had Barrett's metaplasia, but no malignancies were discovered during the course of the study (Table 2). Hiatal hernia was noted in 54% of the patients and none had suspected eosinophilic oesophagitis. During the 4 weeks before entering the study most patients used over-the-counter (OTC) H2RA and antacids (38%) or had no treatment at all (36%), while 26% used a PPI. Following investigation, 152 patients (90%) were prescribed a PPI, eight patients (5%) a H2RA and, according to the protocol, nine patients (5%) were not prescribed any acid-suppressive treatment at baseline (Table 2).
Table 2. Baseline characteristics on demographics, acid-suppressive medications, GerdQ symptoms scores and endoscopic findings on the 169 included patients
n = 169
PPI, proton pump inhibitor; H2RA, histamine-2 receptor antagonist; LA, Los Angeles; BMI, body mass index.
Mean age, years (s.d.)
Mean BMI, kg/m2 (s.d.)
Mean symptom duration, years (s.d.)
Medications prescribed at baseline
Symptom scores at baseline
Mean GerdQ total (s.d.)
GerdQ score, ≥8
GerdQ score, ≥9
GerdQ score, ≥10
LA grade A
LA grade B
LA grade C
LA grade D
Normal endoscopy patients with pH-metry performed
Objectively verified GERD
Of the 169 included patients, 147 patients (87%) fulfilled the criteria for objectively verified GERD (see Methods). Of these 147 patients, 137 (93%) had reflux oesophagitis on EGD and 10 (7%) did not, but had pathological pH-metry (Table 2). Among the 32 patients who had no reflux oesophagitis (LA grade A–D) at baseline, 18 underwent subsequent pH-metry. In the other 14 patients without reflux oesophagitis at baseline, pH-metry was not undertaken, mainly for practical reasons.
Total GerdQ scores at baseline
Forty-three patients (25%) had a total GerdQ score below 8 and 129 patients (75%) had a total score ≥8 (Table 2). Symptoms reported on two or more days the last week prior to inclusion are reported in Table 3; heartburn 137 patients (81%), regurgitation 82 patients (49%), heartburn and/or regurgitation combined 146 patients (86%) and epigastric pain 59 patients (57%). A histogram of the distribution of total GerdQ scores is displayed in Figure 1.
Table 3. Baseline frequency distribution of symptoms from the six items included in the GerdQ questionnaire
Use of Over-the-Counter medications
The number of patients with objectively confirmed GERD is plotted against total GerdQ score categories in Figure 2. The total GerdQ score increased with increasing grade of reflux oesophagitis. For patients with a total GerdQ score of eight or more, 91% had objectively verified GERD while for patients with a total GerdQ score of less than eight, 74% had objectively confirmed GERD (P = 0.005). For patients with a total GerdQ score of nine or more 92% had an objectively verified GERD, while for patients with a total GerdQ score of less than nine, 78% had objectively confirmed GERD (P = 0.008).
There was a direct correlation between the ‘impact score’ (sleep disturbance and OTC medications caused by reflux symptoms) and the total GerdQ score, as displayed in Figure 3.
Diagnostic accuracy of the GerdQ
The ROC analysis gave the optimal balance between sensitivity (66%) and specificity (64%) for GERD with a cutoff of 9 in total GerdQ score (Figure 4). Calculations of the negative and positive predictive values (NPV and PPV) as well as the positive and negative likelihood ratios (LR+ and LR−) also confirmed the optimal cutoff of 9 in this study population. A cutoff of 10 or 8 resulted in reduced sensitivity and specificity, respectively (Table 4). Of the 105 patients with a GerdQ score of 9 or above, 97 (92%) had proven GERD and 8 (8%) did not. By comparison, of the remaining 64 patients with a GerdQ score below 9, 50 (78%) had proven GERD and 14 (22%) did not.
Table 4. Test characteristics of total GerdQ cutoff score at 8, 9 or 10. Objective diagnostic criteria for proven GERD defined by endoscopy verified reflux oesophagitis or normal endoscopy with pathological oesophageal acid exposure
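The test characteristics at a cutoff of 9 follow directly from the 2×2 counts given in the text (97 true positives, 8 false positives, 50 false negatives, 14 true negatives). The sketch below recomputes them; the function name is hypothetical and confidence intervals are not reproduced.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute standard test characteristics from a 2x2 table of counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Counts reported in the text for GerdQ >= 9 vs. objectively verified GERD.
cutoff_9 = diagnostic_accuracy(tp=97, fp=8, fn=50, tn=14)
for name, value in cutoff_9.items():
    print(f"{name}: {value:.2f}")
```

Rounded to whole percentages, these counts give a sensitivity of 66%, a specificity of 64%, a PPV of 92% and an NPV of 22%, matching the values reported above.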
A sensitivity analysis was done by excluding all patients with low-grade reflux oesophagitis (LA grade A). The ROC sensitivity analysis confirmed 9 as the optimal cutoff GerdQ value, with sensitivity slightly improved to 70% (95% CI: 57–80) while specificity remained constant at 64% (95% CI: 41–83). NPV and PPV were 41% (95% CI: 25–59) and 85% (95% CI: 73–94), respectively.
PPI treatment test
Of the 124 PPI-treated patients with heartburn on 2 or more days per week at baseline, 94 patients (76%) had symptom resolution after 4 weeks (≤1 day of symptoms during the last week). A positive response to PPI therapy was observed in 90 of 117 patients (77%) with confirmed GERD (EGD or pH-metry) compared with four of seven patients (57%) without proven GERD (P = 0.24).
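The comparison of response rates can be checked with a Pearson chi-squared test on the 2×2 table implied by the text (90 responders and 27 non-responders with confirmed GERD; 4 responders and 3 non-responders without). The sketch below assumes a test without continuity correction, which is not stated in the Methods; with only seven patients in the non-GERD group the expected counts are small, so an exact test might be preferred in practice. Function and variable names are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi-squared statistic, two-sided P-value with 1 degree of freedom).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: confirmed GERD, no proven GERD. Columns: responder, non-responder.
chi2, p_value = chi2_2x2(90, 27, 4, 3)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

Under these assumptions the P-value comes out close to the reported P = 0.24.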
Symptom resolution on PPI treatment alone had a sensitivity of 76% (95% CI: 66–84) and a specificity of 33% (95% CI: 17–53) for a diagnosis of GERD when applying GerdQ cutoff score of 9.
Adding a positive response to PPI as a diagnostic criterion, in patients with normal EGD or pH-metry, contributed an additional four patients (in total 151 patients) with confirmed GERD. However, the diagnostic accuracy was essentially unchanged with sensitivity at 65% (95% CI: 57–73) and specificity at 61% (95% CI: 36–83) when adding a positive PPI test to the definition of GERD, again at GerdQ cutoff ≥9.
This is the first validation study of the diagnostic accuracy of the GerdQ in a population with symptoms suggestive of GERD. We found that the optimal GerdQ cutoff score for GERD in our primary care population referred for open-access endoscopy for suspected GERD was 9, corresponding to a sensitivity of 66% and a specificity of 64% for the diagnosis of GERD. Furthermore, our analyses show that symptom resolution on a short course of PPI treatment does not add value as a diagnostic test for GERD, either used alone or when combined with EGD/pH-metry.
The GerdQ was developed as an exploratory part of the Diamond study in which upper gastrointestinal symptoms were correlated with several objective markers of GERD.[1, 17] Cutoff score for GERD at 8, in patients without acid-suppressive therapy, yielded a sensitivity of 65% and a specificity of 71% for symptom-defined GERD compared with an investigation-based diagnosis of GERD. The primary care patients enrolled in the Diamond study consulted for a wide spectrum of frequent upper gastrointestinal symptoms (e.g. reflux and/or dyspeptic symptoms). In contrast, our study enrolled primary care patients suspected of suffering from GERD, and referred for open-access EGD. Hence, the population in our study represents a more selected reflux population compared with that of the Diamond study and this difference in pre-test probability of GERD helps to explain why we report a higher prevalence of reflux oesophagitis and a higher GerdQ cutoff score.
We found that the GerdQ questionnaire had moderate sensitivity and specificity for the diagnosis of GERD, reflecting the inherently imperfect correlation between reflux symptoms and objective markers of GERD. As there is no gold standard for the GERD diagnosis, the sensitivity and specificity we report are what can be expected within the limitations of the objective reference criteria we used (EGD/pH-metry). Applying a GerdQ cutoff score of 9 or above correctly identified two-thirds of our patients with proven GERD, but failed to identify the remaining third. Ninety-two per cent of those with a GerdQ score of 9 or above had proven GERD, while only 22% of those with a GerdQ score below 9 had neither reflux oesophagitis nor pathological pH-metry. While this documents a high rate of true positives, it also illustrates the dilemma of a high rate of false-negative tests, owing to the high prevalence of reflux oesophagitis in patients with low GerdQ scores. A sensitivity analysis, excluding all patients with low-grade oesophagitis (LA grade A), showed roughly the same sensitivity and specificity of the questionnaire. However, the negative predictive value, i.e. the proportion of patients with a GerdQ score under 9 who did not have EGD/pH-metry verified GERD, increased from 22% to 41%, related to a reduction in the prevalence of endoscopically verified GERD.
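The dependence of the predictive values on prevalence can be made explicit with Bayes' rule. The sketch below keeps sensitivity and specificity fixed at the values implied by the reported counts (97/147 and 14/22) and varies the prevalence. The study prevalence of 147/169 comes from the text; the lower prevalences of 50% and 30% are hypothetical values chosen only to show the direction of the effect. Function and constant names are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


SENSITIVITY = 97 / 147  # from the reported counts at cutoff 9
SPECIFICITY = 14 / 22
STUDY_PREVALENCE = 147 / 169

# 0.5 and 0.3 are hypothetical prevalences, not study results.
for prevalence in (STUDY_PREVALENCE, 0.5, 0.3):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At the study prevalence this reproduces the reported PPV of 92% and NPV of 22%. With the same sensitivity and specificity, a lower prevalence gives a higher NPV and a lower PPV.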
Against this background, and for routine clinical care, the GerdQ questionnaire should not be used as a standalone diagnostic tool for GERD, but should serve as a complement to other investigations and management by the treating physician. We have previously documented the usefulness of the GerdQ in an algorithm to structure the initial management of GERD in patients without alarm features. We found that patients with a high likelihood of GERD (high scores) profited from a symptom-based approach with immediate acid-suppressive therapy, while patients with a low likelihood of GERD (low scores) benefited from further investigations with EGD/pH-metry. Therefore, to investigate for differential diagnoses, such as functional dyspepsia and extra-oesophageal syndromes that cannot be captured by the GerdQ questionnaire, it is even more important to intensify the follow-up and referral of patients with low GerdQ scores. The pragmatic use of GerdQ for diagnostic work-up in primary care, after excluding patients with alarm features, is as a tool for picking out patients in need of referral to EGD or pH-metry. We can argue that endoscopy in younger patients without alarm features and with a GerdQ score of 9 or above is unnecessary for the primary diagnosis of GERD. This strategy could potentially lead to a reduction of up to 60% of upper endoscopies for primary diagnosis of GERD. However, other reasons such as anxiety about underlying malignancy, co-morbidities and co-medication will still be indications for diagnostic endoscopy. This approach will lead to a higher proportion of patients kept in primary care, with appropriate referral to secondary care for EGD/pH-metry when symptoms persist despite treatment, or in the case of a long symptom history, high age or the development of alarm features.
A PPI test alone had good sensitivity, but poor specificity for GERD in our population and was not considered an accurate method alone for diagnosing GERD. Adding a positive PPI test (symptom resolution after 4 weeks on PPI) as an additional criterion on proven GERD, in patients with normal EGD or pH-metry, did not enhance the diagnostic accuracy. This is in line with other studies confirming the limited diagnostic value of symptomatic response to a PPI as a viable diagnostic test strategy for GERD.[3-7]
Lacy et al. tested the association between the total GerdQ score and oesophageal acid exposure in 358 consecutive patients referred for 48 h wireless pH-metry and the authors concluded that GerdQ was not sufficiently accurate for the diagnosis of GERD. Their study is based on a tertiary care population enriched with patients with refractory GERD or functional and atypical complaints, and many were tested on PPIs. Hence, their population is not representative of an untreated primary care population, for which the GerdQ is primarily intended. Furthermore, only one modality was used for reference diagnosis, and EGD, which is an important method for diagnosing GERD, was not a part of their study.[24, 25]
A special feature of our study is the high prevalence of reflux oesophagitis (81%). A plausible explanation for this is that the Norwegian reimbursement scheme requires objectively verified disease (LA grade A-D reflux oesophagitis or pathological pH-metry) in order for GERD medications to be reimbursed. Consequently, EGD units need to be meticulous in stopping acid-suppressive medications before EGD, leading to a higher prevalence of underlying reflux oesophagitis. Since we identified potential patients from open-access endoscopy waiting lists, a further selection of patients was made, leading to an enriched reflux population. An unselected primary care population consulting for upper GI complaints would probably result in more patients with functional complaints and a higher proportion of non-erosive reflux disease. We do not consider rebound acid hypersecretion to have impacted the results since the majority of patients were either treatment naïve or used only OTC medications at study entry. One limitation of our study is that for 14 patients having no reflux oesophagitis at baseline, pH-metry was not undertaken, mainly for practical reasons. However, it is unlikely that these few patients would have altered the outcome of the study.
Upper gastrointestinal symptoms, including those related to GERD, constitute one of the most frequent reasons for consultation in primary care. Consequently, efficient methods for diagnosis, management and referral of these patients are important to further optimise care and drive cost efficiency. A properly validated questionnaire, like the GerdQ, is therefore a valuable tool for facilitating the symptom-based diagnosis of GERD. It also contributes to standardisation and improves awareness among healthcare providers of the correct phrasing of questions on upper GI symptoms. The GerdQ can hopefully also empower primary care physicians to take on greater responsibility for managing patients with GERD, leading to a reduced number of costly and sometimes unnecessary referrals.
Guarantor of the article: C. Jonasson.
Author contributions: CJ, JGH and BW contributed to study concept and design. JGH and DALH contributed to data collection. CJ analysed the data. CJ, JGH, DALH and BW interpreted the data and wrote the paper. All authors have approved the final version of the manuscript.
Declaration of personal interests: C. Jonasson is a former employee of AstraZeneca. B. Wernersson is an employee of AstraZeneca R&D Mölndal, Sweden.
Declaration of funding interests: This study was funded by AstraZeneca.
The eighteen participating sites in Norway: J. Hatlebakk, Bergen, B. Moum, Oslo, O. Sandstad, Oslo, T. Sandanger, Asker, G. Noraberg, Arendal, J.Matre, Kristiansand, C.Bang, Nesttun, J. Takvam, Tønsberg, R. Breckan, Bodø, V. Høeg, Tynset, J. Langtind, Orkanger, A. Wilskow, Mosjøen, U. Fjøsne, Levanger, O. Lange, Molde, V. Glazkov, Haugesund, E Melsom, Kristiansund, D. A. Hoff, Ålesund, J. A. Sparby, Kongsvinger.