Non-invasive diagnostic tests for Helicobacter pylori infection

  • Protocol
  • Diagnostic

Abstract

This is the protocol for a review and there is no abstract. The objectives are as follows:

To compare the diagnostic accuracy of urea breath test, serology, and stool antigen test, either alone or in combination, in the diagnosis of Helicobacter pylori (H. pylori) infection in symptomatic and asymptomatic people in whom H. pylori infection status is sought, so that eradication therapy for H. pylori can be started.

If we identify heterogeneity, we plan to explore the following potential sources of heterogeneity: risk of bias, publication status, prospective or retrospective studies, symptomatic versus asymptomatic participants, recent or current use of proton pump inhibitors or antibiotics, different subtypes of tests, and the interval between the index test and reference standard.

Background

Helicobacter pylori (H. pylori) is a Gram-negative spiral bacterium (NCBI 2014). Approximately 13% to 81% of people have H. pylori infection (Peleteiro 2014). Prevalence of the bacterium varies according to age (generally increasing with age, although infection rates tend to fall among older age groups in some Latin American and Northeast Asian countries), region (lower infection rates are seen in Australia and the United Kingdom, while higher rates are reported in Chile, China, Japan, Korea and Latvia), race (more prevalent amongst Afro-Caribbeans compared to Caucasians), and socioeconomic class (more common in poorer settings) (Graham 1991; Laszewicz 2014; Muhsen 2012; Peleteiro 2014).

Based on observational studies, H. pylori infection has been implicated in a number of malignant and premalignant conditions including gastric cancer, premalignant lesions of the stomach (atrophic gastritis and intestinal metaplasia), gastric lymphoma, pancreatic cancer, colorectal cancer, and laryngeal cancer (Huang 1998; Huang 2003; Wu 2013; Xiao 2013; Xue 2001; Zhuo 2008). However, H. pylori is associated with a lower incidence of oesophageal adenocarcinomas (Islami 2008). H. pylori is also associated with a number of non-malignant conditions including peptic ulcers, non-ulcer dyspepsia, recurrent peptic ulcer bleeding, unexplained iron deficiency anaemia, idiopathic thrombocytopenic purpura, and colorectal adenomas (DuBois 2005; Franchini 2007; Gisbert 2004; Huang 2002; Jaakkimainen 1999; Wu 2013).

Although a number of pathogenic factors such as cytotoxin-associated gene A (CagA), vacuolating cytotoxin A (VacA), and blood group antigen binding adhesin (BabA) are associated with increased virulence of H. pylori (Huang 2003; Malfertheiner 2012), detection of these pathogenic factors currently has no role in the management of H. pylori infection (Malfertheiner 2012). The recommended initial treatment for H. pylori infection is a combination of a proton pump inhibitor, clarithromycin, and amoxicillin or metronidazole (triple therapy) in regions with low resistance to clarithromycin (< 20% resistance rate in the area), and triple therapy along with bismuth (quadruple therapy) in regions with high resistance to clarithromycin (> 20% resistance rate in the area) (Malfertheiner 2012). If initial treatment fails to eradicate the infection, the recommended second-line treatment depends on the initial treatment: bismuth quadruple therapy or levofloxacin triple therapy (classical triple therapy with levofloxacin in place of clarithromycin) when triple therapy was used initially, and levofloxacin triple therapy when bismuth quadruple therapy was used initially (Malfertheiner 2012). If this treatment also fails to eradicate H. pylori, further treatment should be based on antibiotic susceptibility (Malfertheiner 2012). Eradication of H. pylori might lead to a decrease in malignant and non-malignant conditions associated with H. pylori infection. Adverse events related to H. pylori treatment include taste disturbance, diarrhoea, nausea, headache, skin rash, abdominal pain, dizziness, bloating, myalgias (muscle pain), and constipation (Ye 2014).

A glossary of terms is included in Appendix 1.

Target condition being diagnosed

Helicobacter pylori (H. pylori) infection.

Index test(s)

Urea breath test

The urea breath test is based on the presence of urease enzyme in live Helicobacter pylori (H. pylori), which breaks down urea into ammonia and carbon dioxide (McNulty 2005; Ricci 2007). After ingestion of urea labelled with either 13C or 14C, breath samples are collected for up to 20 minutes by exhaling into a carbon dioxide trapping agent (Ricci 2007). The urea breath test is performed by the clinician or the clinician's assistant. The thresholds used include the percentage of carbon recovered during the collection time or counts per minute (Ferwana 2015). Threshold levels above 4% or 5% are commonly used to diagnose H. pylori infection (Ferwana 2015). A wide range of threshold counts per minute, from more than 25 counts per minute to 1000 counts per minute, has been used for the diagnosis of H. pylori infection (Ferwana 2015).

Serology

These tests are based on circulating antibodies to H. pylori. There are three main methods for these tests: the enzyme-linked immunosorbent assay (ELISA) test, latex agglutination tests, and Western blotting (Ricci 2007). Of these, ELISA is the most commonly used method. The total immunoglobulin, immunoglobulin subtypes, and antibody response to specific antigens can all be tested. Since they do not require any special equipment, they can be performed easily (Ricci 2007). However, serology may be positive because of active infection at the time of the test, previous infection, or non-specific cross-reacting antibodies (McNulty 2005). Tests that use whole blood (rather than serum) and other bedside tests (that involve using a bedside centrifuge) are also available, although these whole blood tests and bedside serum tests are generally considered unreliable (Ricci 2007). Routine serum tests are performed by the laboratory technician and interpreted by the clinician. The bedside serum tests and whole blood tests are performed by the clinician or the clinician's assistant. Different threshold levels have been used by different researchers; for example, a titre ≥ 300 was interpreted as positive serology by Lindsetmo et al. (Lindsetmo 2008), while a titre ≥ 500 was interpreted as positive serology by Granberg et al. (Granberg 1993) when evaluating the prevalence of H. pylori.

Stool antigen tests

These tests use monoclonal and polyclonal antibodies to detect the presence of H. pylori antigen in stools, so that active H. pylori infection can be diagnosed (McNulty 2005; Ricci 2007). Stool antigen tests are performed by the laboratory technician and interpreted by the clinician. As for other tests, several threshold levels have been used by different researchers; for example, optical densities of ≥ 0.15, ≥ 0.16, and ≥ 0.19 have all been used as thresholds for the diagnosis of H. pylori using monoclonal antibodies for stool antigen tests.

Clinical pathway

Evidence from randomised controlled trials showed that screening and eradication programmes for H. pylori in populations at high risk of gastric cancer (e.g. East Asians) lowered the incidence of gastric cancer (Ford 2014). The Asia-Pacific Gastric Cancer Consensus conference recommended screening for and eradication of H. pylori in populations in countries at high risk of gastric cancer (i.e. Japan and Korea) (Talley 2008). The updated European Helicobacter Study Group (EHSG) Fourth Maastricht/Florence Consensus Conference guidelines recommend testing for H. pylori, and eradication when it is present, in the following groups (Malfertheiner 2012).

  1. People at high risk of gastric cancer.

  2. Adults with dyspepsia with a locally determined age cut-off point (depending on local incidence of gastric cancer in different age groups), and without ‘alarm’ symptoms or signs associated with an increased risk of gastric cancer such as weight loss, dysphagia, upper gastrointestinal bleeding, abdominal mass, or iron deficiency anaemia.

  3. Unexplained iron deficiency anaemia.

  4. Idiopathic thrombocytopenic purpura.

  5. Uninvestigated young patients with dyspepsia should also be considered for testing for H. pylori when the prevalence of H. pylori is high (≥ 20%).

The clinical pathway is shown in Figure 1.

Figure 1. Clinical pathway

Prior test(s)

The index tests can be performed without any prior test.

Role of index test(s)

The index tests are used for the screening and diagnosis of H. pylori.

Alternative test(s)

Other tests used in the screening and diagnosis of H. pylori infection include non-invasive saliva and urine antigen-based tests (Ricci 2007), and invasive gastric biopsy followed by Campylobacter-like organism (CLO) test, culture, histology, and polymerase chain reaction (PCR) (van Doorn 2000). We have not included the non-invasive saliva and urine antigen-based tests, since these are not commonly used (Ricci 2007).

Rationale

Testing for Helicobacter pylori (H. pylori) and eradication of H. pylori has been recommended for a number of population groups (Clinical pathway). These tests have to be non-invasive so that testing can be carried out for a large number of people. Patients with undetected H. pylori infection continue to be at high risk of gastric cancer or continue to have dyspepsia, anaemia, or purpura. Overdiagnosis (false positive tests) of H. pylori means that patients are subject to unnecessary adverse events related to eradication therapy (approximately 27% of patients receiving eradication therapy develop mild adverse events such as bitter taste, nausea, and diarrhoea). Comparing the diagnostic accuracy of different index tests will highlight the best test for the diagnosis of H. pylori infection. There has been no Cochrane review on the diagnostic accuracy of the different non-invasive tests for the diagnosis of H. pylori infection.

Objectives

To compare the diagnostic accuracy of urea breath test, serology, and stool antigen test, either alone or in combination, in the diagnosis of Helicobacter pylori (H. pylori) infection in symptomatic and asymptomatic people in whom H. pylori infection status is sought, so that eradication therapy for H. pylori can be started.

Secondary objectives

If we identify heterogeneity, we plan to explore the following potential sources of heterogeneity: risk of bias, publication status, prospective or retrospective studies, symptomatic versus asymptomatic participants, recent or current use of proton pump inhibitors or antibiotics, different subtypes of tests, and the interval between the index test and reference standard.

Methods

Criteria for considering studies for this review

Types of studies

We will include studies that evaluate the accuracy of the different index tests mentioned above, in the appropriate patient population (see below), irrespective of language or publication status, or whether data are collected prospectively or retrospectively. However, we will exclude case reports (that describe how the diagnosis of Helicobacter pylori (H. pylori) was made on an individual patient or group of patients, and which do not provide sufficient diagnostic test accuracy data i.e. true positive, false positive, false negative, and true negative). We will also exclude case-control studies because case-control studies are prone to bias (Whiting 2011).

Participants

Symptomatic and asymptomatic people in whom H. pylori infection status is sought, so that eradication therapy for H. pylori can be started. We will exclude people with acute upper gastrointestinal bleeding, as such patients are likely to undergo endoscopy, and invasive testing can be performed, if required.

Index tests

Urea breath test, serology, and stool antigen test, either alone, or in combination. We will include only initial testing and not repeat testing (monitoring), since the diagnostic test accuracy may vary depending upon whether the test is used for initial testing versus monitoring the success of the treatment (Ricci 2007).

Target conditions

H. pylori infection.

Reference standards

There is no true gold standard for the diagnosis of H. pylori infection; endoscopic biopsy followed by histology, endoscopic biopsy followed by polymerase chain reaction (PCR), and endoscopic biopsy followed by rapid urease testing all have excellent sensitivity and specificity (Chey 2007). However, the PCR methodology is not standardised across laboratories (Chey 2007), so it is an unreliable reference standard. Endoscopic biopsy followed by rapid urease testing has poor sensitivity following treatment with proton pump inhibitors (Chey 2007). Endoscopic biopsy with culture has high specificity but poor sensitivity (Chey 2007). So, we will only consider endoscopic biopsy followed by histology as the reference standard in this review. We will stratify the studies according to the type of stains (haematoxylin and eosin (H&E) stain versus special histological stains such as Giemsa stain, Warthin-Starry silver, or Genta stain versus immunohistochemical stains). It is generally considered that special histological stains and immunohistochemical stains have better specificity than H&E stains in the diagnosis of H. pylori infection (Laine 1997; Lee 2015).

In terms of interpretation, we will consider endoscopic biopsy with histology using immunohistochemical stain as the best reference standard and endoscopic biopsy with histology using H&E stains as the worst reference standard. This is because immunohistochemical stains have better test accuracy than special stains, which in turn have better test accuracy than H&E stains (Lee 2015).

Search methods for identification of studies

We will include all studies irrespective of the language of publication and publication status. If we find non-English articles, we will obtain translations.

Electronic searches

We will search the following databases.

  1. MEDLINE via OvidSP (January 1946 to present) (Appendix 2).

  2. EMBASE via OvidSP (January 1947 to present) (Appendix 3).

  3. Science Citation Index Expanded via Web of Knowledge (January 1980 to present) (Appendix 4).

  4. National Institute for Health Research (NIHR HTA) via Centre for Reviews and Dissemination (present) (Appendix 5).

Searching other resources

We will search the references of the included studies to identify additional studies. We will also search for articles related to the included studies by performing the 'related search' function in MEDLINE (OvidSP) and EMBASE (OvidSP) and a 'citing reference' search (by searching the articles which cite the included articles) (Sampson 2008) in MEDLINE (OvidSP) and EMBASE (OvidSP).

Data collection and analysis

Selection of studies

Two review authors (research assistants to KG) will independently search the references to identify relevant studies. We will obtain the full texts for references considered relevant by at least one of the authors. Two authors will independently screen the full text papers against the inclusion criteria. Any differences in study selection will be arbitrated by KG. We will contact the study authors if there are any doubts about the study eligibility.

Data extraction and management

Two review authors will independently extract the following data from each included study using a prepiloted data extraction form, and any differences will be resolved by discussion with KG.

  1. First author.

  2. Year of publication.

  3. Study design (prospective or retrospective cohort studies; cross-sectional studies or randomised controlled trials).

  4. Inclusion and exclusion criteria for individual studies.

  5. Total number of patients.

  6. Number of females.

  7. Average age of the participants.

  8. Initial testing versus testing after eradication.

  9. Number of people with bleeding ulcers, gastric atrophy, lymphoma, and recent or current use of proton pump inhibitors or antibiotics.

  10. Number of symptomatic participants.

  11. Tests carried out prior to the index test.

  12. Description of the index test.

  13. Threshold used for the index test.

  14. Reference standard.

  15. Number of true positives, false positives, false negatives, and true negatives.

If the same study reports multiple index tests, we will extract the number of true positives, false positives, false negatives, and true negatives for each index test at each threshold. If the same study reports the number of true positives, false positives, false negatives, and true negatives for each index test at different thresholds, we will extract this information for each threshold. If the study reports the results for a combination of tests, we will extract the number of true positives, false positives, false negatives, and true negatives for each different combination of tests.

The diagnostic accuracy of a combination of tests is commonly assessed in one of two ways: the combination is considered positive when at least one test is positive, or only when all tests are positive. We will extract the number of true positives, false positives, false negatives, and true negatives for both scenarios. If results for different reference standards (endoscopic biopsy with histological confirmation using H&E stain, special stains, or immunohistochemical stains) are available from the same study, we will extract the true positives, false positives, false negatives, and true negatives for only one of the reference standards. For this purpose, immunohistochemical stains will be preferred over special stains, which in turn will be preferred over H&E stains. This is because immunohistochemical stains have better test accuracy than special stains, which in turn have better test accuracy than H&E stains (Lee 2015).
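
The two ways of scoring a combination of tests can be illustrated with a minimal Python sketch. The participant data, the `TwoByTwo` class, and the function names are hypothetical and are used only for illustration; they are not part of the planned data extraction form.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class TwoByTwo:
    """Counts of a diagnostic 2x2 table against the reference standard."""
    tp: int = 0
    fp: int = 0
    fn: int = 0
    tn: int = 0


def tabulate(results: Iterable[Tuple[bool, bool]]) -> TwoByTwo:
    """Build a 2x2 table from (index_positive, reference_positive) pairs."""
    table = TwoByTwo()
    for index_positive, reference_positive in results:
        if index_positive and reference_positive:
            table.tp += 1
        elif index_positive:
            table.fp += 1
        elif reference_positive:
            table.fn += 1
        else:
            table.tn += 1
    return table


def combine(participants: List[Tuple[List[bool], bool]],
            rule: Callable[[Iterable[bool]], bool]) -> TwoByTwo:
    """Score a combination of tests.

    rule=any: positive if at least one test is positive.
    rule=all: positive only if all tests are positive.
    """
    return tabulate((rule(tests), reference) for tests, reference in participants)


# Hypothetical participants: ([urea breath test, stool antigen test], histology)
participants = [
    ([True, True], True),
    ([True, False], True),
    ([False, False], True),
    ([True, False], False),
    ([False, False], False),
]

any_positive = combine(participants, any)
all_positive = combine(participants, all)
print(any_positive)
print(all_positive)
```

In this sketch the 'any test positive' rule can only gain true positives and false positives relative to the 'all tests positive' rule, which is why the two scenarios are extracted separately.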

We will exclude patients with uninterpretable index test results (no matter the reason given for lack of interpretation) since in clinical practice, uninterpretable index test results will result in additional tests for diagnosis of H. pylori infection. However, we will record the number of uninterpretable index test results, as this will provide information on the applicability of the test in clinical practice, and may affect the cost-effectiveness of a test (although cost-effectiveness is outside the scope of this review, cost-effectiveness studies may use data from this review).

If there is an overlap of participants between multiple reports, as suspected by common authors and centres, we will attempt to contact the study authors to seek clarification about the overlap. If we are unable to contact the authors, we will extract the maximum possible information from all the reports. We will seek further information from study authors, if necessary.

Assessment of methodological quality

Two authors will independently assess study quality using the QUADAS-2 assessment tool (Whiting 2006; Whiting 2011). Any differences will be resolved by KG. The criteria that will be used to classify the different studies are shown in Table 1. We will consider studies which are classified as 'low risk of bias' and 'low concern' in all the domains as studies with high methodological quality. We will present the results in a 'Risk of bias' summary and graphs, in addition to a narrative summary.

Table 1. QUADAS-2 classification

Domain 1: Patient selection

Patient sampling: Symptomatic people and asymptomatic people in whom Helicobacter pylori (H. pylori) infection status is sought so that eradication therapy for H. pylori can be started

Was a consecutive or random sample of patients enrolled?
Yes: If a consecutive sample or a random sample of symptomatic people and asymptomatic people in whom H. pylori infection status is sought was included in the study
No: If a consecutive sample or a random sample of symptomatic people and asymptomatic people in whom H. pylori infection status is sought was not included in the study
Unclear: If this information was not available

Was a case-control design avoided?
Yes: If a cohort of symptomatic people and asymptomatic people in whom H. pylori infection status is sought was studied
No: If people with H. pylori infection were compared with people without H. pylori infection (controls). Such studies will be excluded
Unclear: We anticipate that we will be able to determine whether the design was case-control. So, we anticipate that all studies included in the review will be classified as 'yes' for this item

Did the study avoid inappropriate exclusions?
Yes: If all symptomatic people and asymptomatic people in whom H. pylori infection status is sought were included
No: If the study excluded patients based on high probability of false negative results (for example, people with bleeding ulcers, gastric atrophy, lymphoma, and recent or current use of proton pump inhibitors or antibiotics)
Unclear: If this information was not available

Could the selection of patients have introduced bias?
Low risk of bias: If 'yes' classification for all the above three questions
High risk of bias: If 'no' classification for any of the above three questions
Unclear risk of bias: If 'unclear' classification for any of the above three questions, but without a 'no' classification for any of the above three questions

Patient characteristics and setting
Yes: If all symptomatic people and asymptomatic people in whom H. pylori infection status is sought were included
No: If a proportion of symptomatic people and asymptomatic people in whom H. pylori infection status is sought were excluded on the basis of the high probability of false negative results (for example, people with bleeding ulcers, gastric atrophy, lymphoma, and recent or current use of proton pump inhibitors or antibiotics)
Unclear: If it is not clear whether the patients have been included on the basis of the probability of H. pylori infection

Are there concerns that the included patients and setting do not match the review question?
Low concern: If the patient characteristics and setting is classified as 'yes'
Unclear concern: If the patient characteristics and setting is classified as 'unclear'
High concern: If the patient characteristics and setting is classified as 'no'

Domain 2: Index test

Index test(s): Urea breath test, serology, and stool antigen test

Were the index test results interpreted without knowledge of the results of the reference standard?
Yes: If the index test is conducted and interpreted without the knowledge of the results of the reference standard
No: If the index test is interpreted with the knowledge of the results of the reference standard
Unclear: If it is not clear whether the index test was interpreted without the knowledge of the results of the reference standard

If a threshold was used, was it prespecified?
Yes: If a prespecified threshold was used
No: If a prespecified threshold was not used
Unclear: If it was not clear whether the threshold used was prespecified

Could the conduct or interpretation of the index test have introduced bias?
Low risk of bias: If 'yes' classification for both questions above
High risk of bias: If 'no' classification for any of the above two questions
Unclear risk of bias: If 'unclear' classification for any of the above two questions, but without a 'no' classification for any of the above two questions

Are there concerns that the index test, its conduct, or interpretation differ from the review question?
Low concern: If the criteria for a positive index test are clearly stated
High concern: If the criteria for a positive index test are not stated

Domain 3: Target condition and reference standard

Target condition and reference standard(s)
Target condition: H. pylori infection
Reference standard: Endoscopic biopsy with histology

Is the reference standard likely to correctly classify the target condition?
Yes: If H. pylori infection is confirmed by endoscopic biopsy with special stains or immunohistochemical stains
No: If the reference standard is endoscopic biopsy with haematoxylin and eosin stain in some or all participants
Unclear: If the reference standard was not described adequately. Such studies will be excluded

Were the reference standard results interpreted without knowledge of the results of the index tests?
Yes: If the reference standard is interpreted without the knowledge of the results of the index test
No: If the reference standard is interpreted with the knowledge of the results of the index test
Unclear: If it is not clear whether the reference standard is interpreted without the knowledge of the results of the index test

Could the reference standard, its conduct, or its interpretation have introduced bias?
Low risk of bias: If 'yes' classification for both questions above
High risk of bias: If 'no' classification for any of the above two questions
Unclear risk of bias: If 'unclear' classification for any of the above two questions, but without a 'no' classification for any of the above two questions

Are there concerns that the target condition (as defined by the reference standard) does not match the question?
Considering the inclusion criteria for this review, we anticipate that all the included studies will be classified as 'low concern'

Domain 4: Flow and timing

Flow and timing: People with H. pylori infection may have resolution of infection (usually with treatment) and people without H. pylori infection may get infected with H. pylori if there is a long delay between the index test and reference standard. An arbitrary interval of two weeks was chosen as the acceptable delay between the index test and reference standard

Was there an appropriate interval between index test and reference standard?
Yes: If the time interval between index test and reference standard was less than two weeks
No: If the time interval between index test and reference standard was more than two weeks, or if any treatment for H. pylori had been performed
Unclear: If the time interval between index test and reference standard was unclear or it was not clear if any treatment for H. pylori had been performed

Did all patients receive a reference standard?
Yes: If all patients received a reference standard
No: If some of the patients did not receive a reference standard. Such studies will be excluded
Unclear: If it was not clear whether all patients received a reference standard. Such studies will be excluded. So, we anticipate that all studies included in the review will be classified as 'yes' for this item

Did all patients receive the same reference standard?
Yes: If all the patients received the same reference standard
No: If different patients received different reference standards
Unclear: If this information was not clear

Were all patients included in the analysis?
Yes: If all the patients are included in the analysis, irrespective of whether the results were uninterpretable
No: If some patients are excluded from the analysis because of uninterpretable results
Unclear: If this information is not clear

Could the patient flow have introduced bias?
Low risk of bias: If 'yes' classification for all the above four questions
High risk of bias: If 'no' classification for any of the above four questions
Unclear risk of bias: If 'unclear' classification for any of the above four questions, but without a 'no' classification for any of the above four questions
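
The rule used in Table 1 to turn signalling-question answers into a domain-level risk of bias judgement can be sketched in Python as follows. The function names and the string encoding of the answers are assumptions made for illustration; the rule itself follows the table, and the definition of high methodological quality follows the text above.

```python
from typing import Iterable


def risk_of_bias(answers: Iterable[str]) -> str:
    """Domain-level judgement from 'yes' / 'no' / 'unclear' answers.

    Any 'no' gives high risk; otherwise any 'unclear' gives unclear risk;
    all 'yes' gives low risk.
    """
    normalised = [answer.strip().lower() for answer in answers]
    if any(answer == "no" for answer in normalised):
        return "high"
    if any(answer == "unclear" for answer in normalised):
        return "unclear"
    return "low"


def high_methodological_quality(domain_risks: Iterable[str],
                                domain_concerns: Iterable[str]) -> bool:
    """True only if every domain is 'low' risk of bias and 'low' concern."""
    return (all(risk == "low" for risk in domain_risks)
            and all(concern == "low" for concern in domain_concerns))


# Hypothetical study: answers to the three patient selection questions
patient_selection = risk_of_bias(["yes", "yes", "unclear"])
print(patient_selection)
```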

Statistical analysis and data synthesis

We plan to perform a separate meta-analysis (i.e. stratify the analysis) for each index test, at each different threshold (i.e. we will consider tests at different thresholds as different index tests), and for each different reference standard (i.e. we will consider index tests evaluated against different reference standards as different index tests). We will plot study estimates of sensitivity and specificity on forest plots and in receiver operating characteristic (ROC) space to explore between-study variation in the performance of each test stratified by different thresholds of index tests and different reference standards.
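
As an illustration of the study-level estimates that would be plotted, the following Python sketch computes sensitivity, specificity, and the corresponding point in ROC space from the extracted 2x2 counts. The study counts and function names are hypothetical.

```python
from typing import Tuple


def sensitivity(tp: int, fn: int) -> float:
    """Proportion of reference-standard positives that the index test detects."""
    if tp + fn == 0:
        raise ValueError("no participants with the target condition")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of reference-standard negatives that the index test clears."""
    if tn + fp == 0:
        raise ValueError("no participants without the target condition")
    return tn / (tn + fp)


def roc_point(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Coordinates in ROC space: (1 - specificity, sensitivity)."""
    return 1.0 - specificity(tn, fp), sensitivity(tp, fn)


# Hypothetical counts (tp, fp, fn, tn) for two studies of one index test
studies = {
    "Study A": (90, 20, 10, 80),
    "Study B": (45, 5, 15, 35),
}

for name, (tp, fp, fn, tn) in studies.items():
    x, y = roc_point(tp, fp, fn, tn)
    print(f"{name}: sensitivity={y:.2f}, specificity={1.0 - x:.2f}")
```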

To estimate the summary sensitivity and specificity of each test at each threshold level and different reference standard, we will perform the meta-analysis by fitting the bivariate model (Chu 2006; Reitsma 2005). This model accounts for between-study variability in estimates of sensitivity and specificity through the inclusion of random effects for the logit sensitivity and logit specificity parameters of the bivariate model. If sparse data result in unreliable estimation of the covariance matrix of the random effects, as indicated by very large variances of logit sensitivity and specificity, we will simplify the model by assuming an exchangeable covariance structure (i.e. common variances for the random effects and one common pairwise covariance) instead of the more complex unstructured covariance matrix that allows for separate variances for each random effect and distinct covariances. Alternate models that we will try include the random-effects model, ignoring the inverse correlation between sensitivities and specificities in the different studies due to intrinsic threshold effect, and the fixed-effect model for either sensitivity or specificity, or both. The choice between the different models will be based on the distribution of sensitivities and specificities as noted in the forest plots or ROC space (Takwoingi 2015).
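
For orientation, the bivariate model can be written out as below. This is a sketch of the commonly used formulation with a binomial within-study likelihood; the symbols are notation introduced here for illustration and do not appear in the protocol.

```latex
% Study i: y_{Ai} true positives among n_{Ai} reference-standard positives,
%          y_{Bi} true negatives among n_{Bi} reference-standard negatives.
% Se_i and Sp_i are the study-specific sensitivity and specificity.
\begin{align}
y_{Ai} &\sim \mathrm{Binomial}(n_{Ai}, \mathrm{Se}_i), &
y_{Bi} &\sim \mathrm{Binomial}(n_{Bi}, \mathrm{Sp}_i), \\
\begin{pmatrix}
  \operatorname{logit}(\mathrm{Se}_i) \\
  \operatorname{logit}(\mathrm{Sp}_i)
\end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix},
\begin{pmatrix}
  \sigma_A^2 & \sigma_{AB} \\
  \sigma_{AB} & \sigma_B^2
\end{pmatrix}
\right).
\end{align}
```

In this notation the summary sensitivity and specificity are the inverse logits of \mu_A and \mu_B. Under one reading of the simplifications described above, the exchangeable structure corresponds to \sigma_A^2 = \sigma_B^2, and ignoring the correlation between sensitivity and specificity corresponds to \sigma_{AB} = 0.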

We will compare the diagnostic accuracy of the different tests by including a single covariate term for test type in the bivariate model to estimate differences in the sensitivity and specificity of the tests. We will consider a combination of tests under each scenario (any test positive or all tests positive) as a different index test. We will allow the variances of the random effects and their covariance to also depend on test type, thus allowing the variances to differ between tests/thresholds. We will use the hierarchical summary receiver operating characteristics curve (HSROC) to test hypotheses about whether one test is superior to another, and to investigate heterogeneity (Rutter 2001). For this purpose, we will combine tests, irrespective of the thresholds and reference standards. If a study reports results at multiple thresholds, we will use the threshold used for the primary analysis by the authors for inclusion in the HSROC model. We will use likelihood ratio tests to compare the model with and without the covariate (test type/thresholds). A P value of < 0.05 for the likelihood ratio test will indicate differences in the diagnostic accuracy between the tests.
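
The likelihood ratio test described above can be sketched in Python. The log-likelihood values and the number of extra parameters are hypothetical; in practice they would come from the fitted models. The chi-square tail probability is computed with the standard library only, and the sketch assumes the usual chi-square reference distribution with degrees of freedom equal to the number of additional parameters.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        a = 1.0
        q = math.exp(-half)             # closed form for df = 2
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while 2 * a < df:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_without: float, loglik_with: float,
                          extra_parameters: int) -> Tuple[float, float]:
    """Compare nested models without and with the covariate."""
    statistic = 2.0 * (loglik_with - loglik_without)
    return statistic, chi2_sf(statistic, extra_parameters)


# Hypothetical log-likelihoods; the covariate adds two parameters
# (one for logit sensitivity, one for logit specificity).
statistic, p_value = likelihood_ratio_test(-105.2, -101.0, extra_parameters=2)
print(f"statistic={statistic:.2f}, P={p_value:.4f}, differs={p_value < 0.05}")
```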

We will also compare the estimates of sensitivity and specificity between models to check the robustness of our assumptions about the variances of the random effects. If studies that evaluate different tests in the same study population are available (for example, in studies that perform more than one index test in all the participants, individual index tests and combination of index tests in all the participants, or randomised controlled trials in which participants have been randomised to the different index tests) from at least four studies, we will perform a direct head-to-head comparison by limiting the test comparison to such studies. We will also present the relative sensitivities and relative specificities of the index tests from the direct comparisons in a table.

We will perform the meta-analysis using the NLMixed command in SAS version 9.3 (SAS Institute Inc, Cary, North Carolina, USA). We will create a graph of pretest probabilities (using the observed median and range of prevalence from the included studies) against post-test probabilities for each test stratified by different thresholds and reference standards. We will calculate the post-test probabilities using these pretest probabilities and the summary positive and negative likelihood ratios. We will calculate the summary likelihood ratios and their confidence intervals from the functions of the parameter estimates from the bivariate model that we will fit to estimate the summary sensitivities and specificities. Post-test probability associated with a positive test is the probability of having the target condition (H. pylori infection) on the basis of a positive test result, and is the same as the term 'positive predictive value' used in a single diagnostic accuracy study. Post-test probability associated with a negative test is the probability of having the target condition (H. pylori infection) on the basis of a negative test result and is 1 - 'negative predictive value'. Negative predictive value is the term used in a single diagnostic accuracy study to indicate the chance that the patient has no target condition when the test is negative. We will report the summary sensitivity, specificity, positive and negative likelihood ratios, and post-test probabilities for the median, lower quartile, and upper quartile of the pretest probabilities.
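
The conversion from pretest to post-test probability can be sketched in Python. The summary sensitivity, summary specificity, and prevalences below are hypothetical values chosen for illustration, and the function names are not taken from any particular package.

```python
import statistics
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Positive and negative likelihood ratios (undefined if specificity is 0 or 1)."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert to odds, multiply by the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical summary estimates and study prevalences (pretest probabilities)
summary_sensitivity = 0.90
summary_specificity = 0.80
prevalences = [0.20, 0.30, 0.40, 0.50, 0.60]

positive_lr, negative_lr = likelihood_ratios(summary_sensitivity, summary_specificity)
lower_quartile, median, upper_quartile = statistics.quantiles(
    prevalences, n=4, method="inclusive"
)

for pretest in (lower_quartile, median, upper_quartile):
    after_positive = post_test_probability(pretest, positive_lr)  # like PPV
    after_negative = post_test_probability(pretest, negative_lr)  # like 1 - NPV
    print(f"pretest={pretest:.2f}: positive test -> {after_positive:.3f}, "
          f"negative test -> {after_negative:.3f}")
```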

Investigations of heterogeneity

We plan to explore heterogeneity by using the following sources of heterogeneity as covariate(s) in the regression model.

  1. Studies at low risk of bias in all the domains versus those at unclear or high risk of bias (as assessed by the QUADAS-2 tool, recommended by the Cochrane Diagnostic Test Accuracy Group) (Whiting 2006; Whiting 2011).

  2. Full text publications versus abstracts (this can give an idea about publication bias since there may be an association between the results of the study and the study reaching full publication status) (Eloubeidi 2001).

  3. Prospective studies versus retrospective studies.

  4. Symptomatic versus asymptomatic participants.

  5. Recent or current use of proton pump inhibitors or antibiotics as these patients are at higher risk of false negative results for the urea breath test and stool antigen test, with serology being the only non-invasive test unaffected by the use of proton pump inhibitors or antibiotics (Malfertheiner 2012; Ricci 2007).

  6. Different subtypes of tests (13C versus 14C urea breath tests; ELISA, latex agglutination test, and Western blot methods of serological tests; formal serological tests versus bedside serological tests; and monoclonal versus polyclonal antibodies for stool antigen tests).

  7. Interval between index test and reference standard. If there is a long interval between the index test and the reference standard, H. pylori infection may resolve in people who were infected (usually with treatment), and new infection may occur in those who were not. This may alter the diagnostic test accuracy.

Of the seven sources of heterogeneity mentioned above, we will use risk of bias, publication status, prospective or retrospective studies, recent or current use of proton pump inhibitors or antibiotics, and different subtypes of tests as categorical covariates; we will use the proportion of symptomatic participants, and the interval between the index test and reference standard, as continuous covariates in the regression model. We will include one covariate at a time in the regression model. We will use the likelihood ratio test to determine whether the covariate is statistically significant.
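As an illustration of the likelihood ratio test mentioned above, the following sketch compares the log-likelihoods of a model without the covariate and a model with it. It is written in Python for illustration only; the function names and the log-likelihood values are hypothetical, and the number of extra parameters is an assumption (for example, two if a binary covariate is allowed to affect both sensitivity and specificity in a bivariate model).

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(chi) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total


def likelihood_ratio_test(loglik_without, loglik_with, extra_params):
    """Return (test statistic, p value) for two nested models."""
    statistic = max(2.0 * (loglik_with - loglik_without), 0.0)
    return statistic, chi2_upper_tail(statistic, extra_params)


if __name__ == "__main__":
    # Hypothetical log-likelihoods from models fitted without and with a covariate.
    stat, p_value = likelihood_ratio_test(-105.0, -101.0, extra_params=2)
    print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
```

The statistic is twice the difference in log-likelihoods and is referred to a chi-square distribution with degrees of freedom equal to the number of parameters added by the covariate.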

Sensitivity analyses

We do not plan any sensitivity analyses except when the data available from the studies are ambiguous (for example, the numbers in the text differ from the numbers in the figures), in which case we will perform a sensitivity analysis to assess the impact of using the different data.

Assessment of reporting bias

We plan to investigate whether the summary sensitivity and specificity are different between studies that are published as full texts and those that are available only as abstracts (at least two years prior to search date), using the methods described in Investigations of heterogeneity.

Acknowledgements

We thank the Cochrane Upper Gastrointestinal and Pancreatic Diseases (UGPD) Group, the UK Support Unit for Diagnostic Test Accuracy (DTA) Reviews, and the DTA editorial team for their advice in the preparation of this review.

Appendices

Appendix 1. Glossary

Adenomas: non-cancerous growths that arise from glands and have a structure similar to glands.

Asymptomatic: without symptoms.

Dyspepsia: indigestion.

Heterogeneity: differences in results between studies.

Idiopathic thrombocytopaenia purpura: purpura (purplish spots or patches on the skin and inner lining of the mouth) resulting from bleeding due to a reduction in circulating blood platelets caused by antibodies against platelets.

Appendix 2. MEDLINE search strategy

1. exp Helicobacter pylori/ or exp Helicobacter/ or exp Helicobacter infection/

2. (pylori or pyloridis).mp.

3. helicobacter.mp.

4. HP.mp.

5. Campylobacter.mp.

6. 1 or 2 or 3 or 4 or 5

7. exp Breath Tests/

8. (breath adj3 test).mp.

9. exp Enzyme-Linked Immunosorbent Assay/

10. (Enzyme-Linked Immunosorbent Assay or ELISA).mp.

11. exp Blotting, Western/

12. (Western adj1 (blot or blotting or immunoblot or immunoblotting)).mp.

13. exp Latex Fixation Tests/

14. ("latex agglutination test" or "latex fixation test" or LAT).mp.

15. ((stool or "stool antigen" or feces or faeces or fecal or faecal) adj3 test).mp.

16. 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15

17. 6 and 16

18. exp animals/ not humans.sh.

19. 17 not 18

Appendix 3. EMBASE search strategy

1. exp Helicobacter pylori/ or exp Helicobacter/ or exp Helicobacter infection/

2. helicobacter.mp.

3. (pylori or pyloridis or HP).mp.

4. Campylobacter.mp.

5. 1 or 2 or 3 or 4

6. exp urea breath test/

7. (breath adj3 test).mp.

8. exp enzyme linked immunosorbent assay/

9. (Enzyme-Linked Immunosorbent Assay or ELISA).mp.

10. exp Western blotting/

11. (Western adj1 (blot or blotting or immunoblot or immunoblotting)).mp.

12. exp latex agglutination test/

13. ("latex agglutination test" or "latex fixation test" or LAT).mp.

14. exp feces analysis/

15. ((stool or "stool antigen" or feces or faeces or fecal or faecal) adj3 test).mp.

16. 6 or 7 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 15

17. 5 and 16

18. exp animal/ not exp human/

19. 17 not 18

Appendix 4. Science Citation Index search strategy

#1 TS=(pylori or pyloridis or helicobacter or HP or Campylobacter)

#2 TS=("breath test" or Enzyme-Linked Immunosorbent Assay or ELISA or "Western blot" or "Western blotting" or "Western Immunoblot" or "Western Immunoblotting" or ("latex agglutination test" or "latex fixation test" or LAT) or "stool test" or "stool antigen test" or "feces test" or "faeces test" or "fecal test" or "faecal test")

#3 #1 AND #2

Appendix 5. National Institute for Health Research - Health Technology Assessment

helicobacter pylori and accuracy

Contributions of authors

KSG, MY and BRD wrote the protocol.

Declarations of interest

This report is independent research funded by the National Institute for Health Research (NIHR Cochrane Programme Grants, 13/89/03 - Evidence-based diagnosis and management of upper digestive, hepato-biliary, and pancreatic disorders). The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research or the Department of Health.

KSG: none known.

MY: Mohammad Yaghoobi is an Editor with the Cochrane Upper GI and Pancreatic Diseases (UGPD) Review Group; however, other UGPD Editors were responsible for the editorial processing of this protocol.

BRD: none known.

Sources of support

Internal sources

  • University College London, UK.

External sources

  • National Institute for Health Research, UK.

    This project was supported by the National Institute for Health Research, via a Cochrane Programme Grant to the CHBG and UGPD groups. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Systematic Reviews Programme, NIHR, NHS or the Department of Health.
