Biomarkers for assessing disease activity in inflammatory bowel disease

  • Protocol
  • Diagnostic

Abstract

This is the protocol for a review and there is no abstract. The objectives are as follows:

The primary objective is to determine the diagnostic accuracy of three commonly studied biomarkers (serum CRP, FC and SL) for assessing disease activity in patients with established IBD (i.e. UC or CD) presenting with symptoms suggestive of active disease.

A secondary objective is to investigate sources of heterogeneity by disease type (i.e. UC or CD), disease severity (e.g. mild, moderate or severe disease), disease location (e.g. distal or proximal involvement) and age.

Background

Significant morbidity and mortality are associated with the progression of untreated inflammatory bowel disease (IBD) (Casellas 2001). It is, however, frequently difficult to determine clinically whether a patient with known IBD (ulcerative colitis (UC) or Crohn's disease (CD)) is having a disease flare, or whether symptoms suggestive of active disease can be attributed to other causes such as irritable bowel syndrome or gastroenteritis (Lasson 2008). This can lead to unnecessary treatment with corticosteroids or immunosuppressive medications with potentially serious adverse events (Lichtenstein 2006). Invasive measures to determine disease activity, such as colonoscopy, are not ideal due to cost, lack of immediate availability, need for bowel preparation and risk of complications (Cobb 2004). It is therefore important to be able to identify acute flare-ups of IBD through non-invasive measures that are reliable, readily available and relatively affordable. As such, many non-invasive biomarkers have been studied for assessing disease activity in patients with IBD, with mixed results (Langhorst 2008). These biomarkers include blood, urine and stool tests. Fecal calprotectin (FC), stool lactoferrin (SL) and C-reactive protein (CRP) are three biomarkers commonly used for the evaluation of patients with IBD. CRP is a serum acute phase reactant that has been used for decades as a general marker of inflammation and is not specific to the gastrointestinal tract (Langhorst 2008). SL is a component of neutrophils that is secreted by mucosal membranes during inflammatory states; it is elevated in response to gut inflammation and has recently been introduced as an evaluative tool (Walker 2007). FC is a protein that serves as a surrogate marker of bowel inflammation. FC correlates with the excretion of leukocytes labelled with indium-111, which is considered a fairly accurate marker of inflammation (Roseth 1999).
The use of CRP, SL and FC has expanded over the years to include: diagnosing IBD (Tibble 2000), monitoring therapy (Sipponen 2008), confirming mucosal healing (Langhorst 2008), predicting relapse and investigating symptoms suggestive of active IBD (Kallel 2010; Van Rheenen 2010; Manz 2012). As the latter is a clinically important indication, the reliability of these evaluative methods in this clinical setting needs to be evaluated and confirmed. Further, it is important to determine whether or not these three tests are more useful in certain subgroups depending on patient age (pediatric or adult), disease type (UC or CD) and disease location (proximal or distal disease) as this would have a significant effect on clinical practice.

Target condition being diagnosed

The target condition is disease activity in patients with established IBD (i.e. UC or CD) presenting with symptoms suggestive of active disease.

Index test(s)

The index tests will include: serum CRP, SL and FC.

Clinical pathway

Current practice for assessing patients with IBD presenting with symptoms of active disease includes performing a limited or full colonoscopy to confirm activity prior to initiating therapy. However, given the limitations of endoscopy, including the time required, high cost and risks associated with performing colonoscopy, non-invasive biomarkers are safer and more cost-effective. Therefore, a cost-effective pathway incorporating such non-invasive markers should be developed to triage patients truly in need of colonoscopy. The clinical role of non-invasive tests within this pathway will depend on the diagnostic performance characteristics of the test. For example, non-invasive tests with high negative predictive value could be used as a triage test to minimise the use of unnecessary colonoscopy, whereas those with high positive predictive value could be used for selecting patients who need confirmation by colonoscopy. The choice of which non-invasive test to use will depend on performance characteristics in different patient groups, availability, the ease with which a test can be performed in a patient in this clinical setting, and the cost of the test. This review will focus on the ability of non-invasive tests to triage patients for colonoscopy.
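
As an illustration of why predictive values matter for triage, the following minimal Python sketch derives positive and negative predictive values from sensitivity, specificity and the prevalence of active disease using Bayes' theorem. The function name and the numeric inputs are hypothetical and are not taken from any study; they only show how the quantities relate.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence of active disease."""
    # Expected proportions of the tested population in each cell of the 2x2 table
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs chosen only for illustration
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.80, prevalence=0.50)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Because predictive values depend on prevalence, the same test may perform differently as a triage tool in settings where active disease is more or less common.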

Rationale

These tests, if proven accurate in assessing disease activity in patients with IBD presenting with active symptoms, can prevent unnecessary endoscopic evaluations and treatment with corticosteroids and immunosuppressive medications. This may ultimately prove to be a cost-effective strategy for individualizing the management of patients with UC or CD.

Objectives

The primary objective is to determine the diagnostic accuracy of three commonly studied biomarkers (serum CRP, FC and SL) for assessing disease activity in patients with established IBD (i.e. UC or CD) presenting with symptoms suggestive of active disease.

Secondary objectives

A secondary objective is to investigate sources of heterogeneity by disease type (i.e. UC or CD), disease severity (e.g. mild, moderate or severe disease), disease location (e.g. distal or proximal involvement) and age.

Methods

Criteria for considering studies for this review

Types of studies

Diagnostic cohort studies and diagnostic case-control studies that evaluate the diagnostic accuracy of FC, SL or CRP for assessing disease activity in patients with previously diagnosed IBD (i.e. UC or CD) presenting with symptoms suggestive of active disease will be included. Studies will be included irrespective of publication status or language.

Participants

Participants will include pediatric or adult patients of any age previously diagnosed with IBD (i.e. UC or CD), presenting with symptoms suggestive of active disease, in whom the presence or absence of active disease is confirmed by colonoscopy or sigmoidoscopy.

Index tests

Studies that examine the accuracy of serum CRP, FC and SL will be considered for inclusion.

Target conditions

Studies will be required to identify active IBD (i.e. UC or CD) as the target condition. Disease activity must be dichotomized (i.e. active "Yes" or inactive "No") according to the endoscopic scoring system adopted by the investigators.

Reference standards

The reference standard will be ileocolonoscopy in patients with CD and colonoscopy or sigmoidoscopy in patients with UC. Studies that do not apply this reference standard to all patients will not be included. For patients with suspected isolated small bowel CD, tests such as computed tomography (CT) enterography, magnetic resonance (MR) enterography, CT enteroclysis, small bowel follow-through, capsule endoscopy, small bowel ultrasound or technetium labelled white blood cells (WBC) scan will be used as the reference standard if the initial assessment using ileocolonoscopy was normal.

Search methods for identification of studies

Electronic searches

We will search MEDLINE using Ovid SP, EMBASE using Ovid SP, the Cochrane Library, and the ISI Web of Knowledge from inception to December 2013 for diagnostic studies. No language or document type restrictions will be applied. The multipurpose search command for the Ovid SP interface (.mp.) will be used to search both text and database subject heading fields. To capture variations in suffix endings, the unlimited truncation symbol ‘*’ will be used. The search strategies are listed in Appendix 1.

Searching other resources

Additional studies will be identified by manually searching the references of articles retrieved from the computerised databases and relevant review articles. Unpublished studies will be sought by contacting experts in the field. Conference proceedings from the last 10 years including Digestive Disease Week (DDW) and the United European Gastroenterology Week (UEGW) will be searched to identify studies published as abstracts. Grey literature databases (e.g. SIGLE) will also be searched to identify additional studies not indexed in traditional databases.

Data collection and analysis

Selection of studies

Studies identified by the literature search will be independently screened for eligibility by at least two authors (i.e. MM, MF, SKG, SGF, KAB, or JKM). Abstracts of selected titles will be reviewed and all potentially eligible studies will be selected for full text review. Two review authors will independently assess full manuscripts against the inclusion criteria described above. Any disagreements will be resolved by discussion and consensus with the senior authors (i.e. WJS or NC). Eligible studies published as abstracts will be included if sufficient data are provided for analysis.

Data extraction and management

At least two authors (i.e. MM, MF, SKG, SGF, KAB or JKM) will independently complete a data extraction form for all included studies. The following data will be retrieved:

1. General information: title, journal, year, publication status, and study design (prospective versus retrospective).

2. Sample size: number of participants meeting the criteria and total number screened.

3. Baseline characteristics: baseline diagnosis, age, sex, race, disease severity, and concurrent medications used.

4. The index test with all cutoffs tested.

5. Clinical reference standard test: features of disease activity, scoring system used, number of endoscopists and handling of interobserver error of colonoscopy or sigmoidoscopy.

6. Numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).
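
The extracted counts in item 6 determine the per-study accuracy estimates. The sketch below, in Python, shows one way sensitivity and specificity with approximate 95% confidence intervals could be computed from a single two-by-two table. The Wilson score interval is an assumption made for this illustration (the protocol does not state which interval method will be used), and the counts are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def accuracy_from_counts(tp, fp, fn, tn):
    """Sensitivity and specificity, each with an interval, from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for one study and one index test
result = accuracy_from_counts(tp=45, fp=10, fn=5, tn=40)
print(result)
```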

Assessment of methodological quality

QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies) is an evidence-based tool used for the assessment of quality in systematic reviews of diagnostic accuracy studies (Whiting 2011). QUADAS-2 is structured so that four key domains are each rated in terms of the risk of bias and concerns regarding applicability to the research question. Each key domain has a set of signalling questions to help reach the judgments regarding bias and applicability. The tool is based on items that cover a wide range of methodological issues in diagnostic test studies. The QUADAS-2 tool has been tailored to reflect our review question (see Appendix 2). Two authors will independently pilot the tool until satisfactory inter-rater agreement has been achieved. Two authors will then independently assess the methodological quality of included studies using the finalized QUADAS-2 tool. Any disagreement will be resolved by consensus with the senior authors. If the studies do not report sufficient detail to allow for quality assessment, the study authors will be contacted for the missing information. We will use tabular and graphical displays to summarise the QUADAS-2 assessments.

Statistical analysis and data synthesis

The reported numbers of true positives (TP), false negatives (FN), true negatives (TN) and false positives (FP) will be used to construct two-by-two tables for each index test. The data will be entered into Review Manager 5 (RevMan 5.2) to produce forest plots of sensitivity and specificity with 95% confidence intervals (95% CI) and to plot study-specific estimates of sensitivity and specificity with 95% CI in the receiver operating characteristic (ROC) space for each index test. Meta-analysis will be performed where appropriate. We will use the hierarchical summary ROC (HSROC) model to pool data (i.e. sensitivities and specificities) and to plot a summary ROC curve (Rutter 2001), as we expect the primary studies to report accuracy estimates of the index tests using different cut-off points. We will report sensitivities and specificities for all cut-off points for primary studies that report accuracy results for more than one cut-off point. However, we will use a single cut-off point for each study in the HSROC analysis. The choice of the cut-off will be based on the maximum of Youden's index (i.e. sensitivity + specificity - 1). Although this method may lead to an overestimate of diagnostic accuracy (Leeflang 2008), there are no accepted thresholds for the index tests that define active disease a priori. It is also likely that studies that report accuracy estimates using one cut-off point based the selection of the cut-off on Youden's index. We plan to use the approach suggested by Zou and Donner to compare the summary estimates of accuracy among the index tests (Zou 2008). The essence of this approach is that the confidence interval for a difference can be obtained from the confidence limits for each parameter. These analyses will be performed using the SAS (release 9.2) statistical software.
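
The cut-off selection rule described above can be sketched in a few lines of Python. The cut-off values and accuracy estimates below are hypothetical and serve only to show how the maximum of Youden's index (sensitivity + specificity - 1) picks a single cut-off per study.

```python
def youden_index(sensitivity, specificity):
    """Youden's index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def best_cutoff(results):
    """Return the (cutoff, sensitivity, specificity) tuple with the largest J."""
    return max(results, key=lambda r: youden_index(r[1], r[2]))


# Hypothetical results reported by one study at three cut-off points
reported = [
    (50, 0.95, 0.60),
    (100, 0.88, 0.78),
    (250, 0.70, 0.90),
]

chosen = best_cutoff(reported)
print("Cut-off entered into the pooled analysis:", chosen[0])
```

In this hypothetical example the middle cut-off is selected because it gives the largest value of J among the three reported points.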
If sufficient data are available, subgroup analyses will be conducted to determine whether or not the index tests are more sensitive in different subgroups including disease type (e.g. UC or CD), age (e.g. pediatric or adult), and disease location (e.g. distal or proximal involvement).
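
To make the Zou and Donner comparison described in this section concrete, the following Python sketch builds a confidence interval for the difference between two summary estimates from each estimate's own confidence limits. It assumes the two estimates are independent, and the numbers are hypothetical; it is an illustration of the idea rather than the SAS analysis that will actually be run.

```python
from math import sqrt


def difference_interval(est1, ci1, est2, ci2):
    """Interval for est1 - est2 recovered from the limits of each estimate.

    ci1 and ci2 are (lower, upper) limits at the same confidence level.
    Assumes the two estimates are independent.
    """
    l1, u1 = ci1
    l2, u2 = ci2
    diff = est1 - est2
    lower = diff - sqrt((est1 - l1) ** 2 + (u2 - est2) ** 2)
    upper = diff + sqrt((u1 - est1) ** 2 + (est2 - l2) ** 2)
    return diff, lower, upper


# Hypothetical summary sensitivities for two index tests
diff, lower, upper = difference_interval(0.88, (0.82, 0.93), 0.75, (0.66, 0.83))
print(f"difference = {diff:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes zero would suggest a difference in accuracy between the two tests at the chosen confidence level.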

Investigations of heterogeneity

Potential sources of heterogeneity that will be investigated include disease subtype (i.e. UC or CD), disease severity (e.g. mild, moderate or severe disease), disease location (e.g. in CD, where isolated small bowel disease may not be seen on colonoscopy), disease extent (e.g. in UC, where inflammation may affect only the rectum or may extend through the entire large intestine) and age (e.g. pediatric versus adult patients). Heterogeneity, other than that arising from different cut-offs, will be addressed in both clinical and statistical ways. Depending on the number of eligible studies, covariates that may explain heterogeneity (disease subtype, disease location, disease severity and age) will be added to a regression model. Separate SROC curves will be calculated for each subgroup, depending on the number of included studies.

Sensitivity analyses

In order to assess the robustness of the eligibility criteria, sensitivity analyses will be undertaken to explore the effect on the overall results of removing poor quality studies and abstract publications. Poor quality studies will include any study that was assessed to be at high risk of bias for the tailored QUADAS-2 assessment. High risk of bias studies will include studies that used convenience samples or inappropriate exclusions (QUADAS item 1); studies that used thresholds where the thresholds were not prespecified (QUADAS item 2); studies where reference standards were interpreted with knowledge of the index test results (QUADAS item 3); and studies where there was a high proportion of missing patients or patients excluded from the analysis and there was no explanation given (QUADAS item 4).
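
The exclusion step in this sensitivity analysis amounts to a simple filter over the included studies. The Python sketch below is illustrative only: the `Study` class, the study names and the reading that a study is poor quality when any QUADAS-2 domain is judged high risk are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Study:
    name: str
    abstract_only: bool
    # QUADAS-2 risk-of-bias judgement per domain: "low", "high" or "unclear"
    risk_of_bias: dict = field(default_factory=dict)


def is_poor_quality(study):
    """Assumption: high risk of bias in any domain marks a study as poor quality."""
    return any(judgement == "high" for judgement in study.risk_of_bias.values())


def sensitivity_subset(studies):
    """Studies remaining after removing poor quality studies and abstracts."""
    return [s for s in studies if not is_poor_quality(s) and not s.abstract_only]


# Hypothetical studies
studies = [
    Study("Study A", False, {"selection": "low", "index": "low", "reference": "low", "flow": "low"}),
    Study("Study B", False, {"selection": "high", "index": "low", "reference": "low", "flow": "low"}),
    Study("Study C", True, {"selection": "low", "index": "unclear", "reference": "low", "flow": "low"}),
]

kept = sensitivity_subset(studies)
print([s.name for s in kept])
```

The pooled analysis would then be repeated on the remaining studies and compared with the result from the full set.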

Assessment of reporting bias

We do not plan to carry out a formal assessment of publication bias using funnel plots or regression tests because these techniques have not been found to be useful for diagnostic test accuracy studies (Macaskill 2010).

Acknowledgements

Funding for the IBD/FBD Review Group (September 1, 2010 - August 31, 2015) has been provided by the Canadian Institutes of Health Research (CIHR) Knowledge Translation Branch (CON - 105529) and the CIHR Institutes of Nutrition, Metabolism and Diabetes (INMD); and Infection and Immunity (III) and the Ontario Ministry of Health and Long Term Care (HLTC3968FL-2010-2235).

Miss Ila Stewart has provided support for the IBD/FBD Review Group through the Olive Stewart Fund.

Appendices

Appendix 1. Preliminary Search Strategies

MEDLINE – SEARCH STRATEGY

  1. Crohn's Disease.mp. or exp Crohn Disease/

  2. Ulcerative Colitis.mp. or exp Colitis, Ulcerative/

  3. Inflammatory Bowel Disease.mp. or exp Inflammatory Bowel Diseases/

  4. 1 or 2 or 3

  5. exp "Severity of Illness Index"/ or disease activity.mp.

  6. limit 5 to humans

  7. C-reactive protein.mp. or exp C-Reactive Protein/

  8. lactoferrin.mp. or exp Lactoferrin/

  9. calprotectin.mp. or exp Leukocyte L1 Antigen Complex/

  10. 7 or 8 or 9

  11. 4 and 6 and 10

EMBASE – SEARCH STRATEGY

  1. Crohn's Disease.mp. or exp Crohn disease/

  2. Ulcerative Colitis.mp. or exp ulcerative colitis/

  3. Inflammatory Bowel Disease.mp. or exp enteritis/

  4. 1 or 2 or 3

  5. disease activity.mp. or exp disease activity/

  6. limit 5 to human

  7. C-reactive protein.mp. or exp C reactive protein/

  8. calprotectin.mp. or exp calgranulin/

  9. lactoferrin.mp. or exp lactoferrin/

  10. 7 or 8 or 9

  11. 4 and 6 and 10

Cochrane Library (CENTRAL) – SEARCH STRATEGY

  1. “Inflammatory Bowel Disease” OR “Ulcerative Colitis” OR “Crohn’s Disease”

  2. “disease activity”

  3. “C-reactive protein” OR calprotectin OR lactoferrin 

ISI Web of Knowledge - SEARCH STRATEGY

  1. “ulcerative colitis” OR “Crohn’s disease” OR “inflammatory bowel disease”

  2. “disease activity”

  3. “c-reactive protein” OR “CRP” OR lactoferrin OR calprotectin

Appendix 2. Assessment of Methodological Quality: QUADAS-2

Domain 1: Patient Selection
Risk of bias: Could the selection of patients have introduced bias?
Signalling question 1: Was a consecutive or random sample of patients enrolled?
We will score ‘yes’ if the study enrolled a consecutive or random sample of eligible patients; ‘no’ if the study selected patients by convenience; and ‘unclear’ if the study did not report the manner of patient selection.
Signalling question 2: Was a case-control design avoided? We will score 'yes' if a case-control design was avoided; 'no' if a case-control design was used; and 'unclear' if insufficient information was reported to allow a judgement.
Signalling question 3: Did the study avoid inappropriate exclusions? We will score ’yes’ if the study avoided inappropriate exclusions; ’no’ if the study did not avoid inappropriate exclusions (e.g. only patients with severe disease included); and ’unclear’ if insufficient information was reported to allow a judgement.
Risk of bias will be scored as ‘low concern’ if selection was done in a random or consecutive manner and the study avoided inappropriate exclusions; ‘high concern’ if selection was by convenience or the study had inappropriate exclusions; and ‘unclear concern’ if the manner of participant selection was unclear and no clinical information was provided.
Applicability: Are there concerns that the included patients and setting do not match the review question? We are interested in diagnosing disease activity in patients with established IBD. We will score ’low concern’ for applicability if the patients clearly have established IBD and ’high’ concern if it is possible that the sample includes patients with their first presentation of IBD. We will judge applicability to be of ‘unclear concern’ if the study does not provide enough clinical information to make a judgement about applicability.
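
The Domain 1 risk-of-bias rule above can be written as a small decision function. This Python sketch is illustrative: the function name and argument names are hypothetical, and treating any combination not explicitly covered by the rule (for example one 'yes' and one 'unclear' answer) as 'unclear concern' is an assumption.

```python
def domain1_risk_of_bias(consecutive_or_random, avoided_inappropriate_exclusions):
    """Map answers to signalling questions 1 and 3 onto a risk-of-bias rating.

    Each argument is one of 'yes', 'no' or 'unclear'.
    """
    answers = (consecutive_or_random, avoided_inappropriate_exclusions)
    if any(a not in ("yes", "no", "unclear") for a in answers):
        raise ValueError("answers must be 'yes', 'no' or 'unclear'")
    # Convenience sampling or inappropriate exclusions: high concern
    if "no" in answers:
        return "high concern"
    # Consecutive or random sampling and no inappropriate exclusions: low concern
    if answers == ("yes", "yes"):
        return "low concern"
    # Assumption: every remaining combination is rated unclear
    return "unclear concern"


print(domain1_risk_of_bias("yes", "yes"))
print(domain1_risk_of_bias("unclear", "no"))
print(domain1_risk_of_bias("unclear", "unclear"))
```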

Domain 2: Index Test
Risk of bias: Could the conduct or interpretation of the index test have introduced bias?
Signalling question 1: Were the index test results interpreted without knowledge of the results of the reference standard? We will not score this signalling question as it is not applicable to our review. The index tests for this review include serum CRP, SL and FC. These are objective tests based on laboratory results which would not be influenced by blinding the test interpreter to the results of the reference standard.
Signalling question 2: If a threshold was used, was it prespecified? We will score 'yes' if the threshold was prespecified; 'no' if the threshold was not prespecified; and 'unclear' if insufficient information was reported to allow a judgement.
Risk of bias will be scored as ‘low concern’ if prespecified thresholds were used; ‘high concern’ if thresholds were not prespecified; and ‘unclear concern’ if insufficient information was reported to allow a judgement.
Applicability: Are there concerns that the index test, its conduct, or its interpretation differ from the review question? Although variations in test technology, execution, or interpretation might affect estimates of the diagnostic accuracy of a test, we feel that applicability of the index tests is not a concern for this review.

Domain 3: Reference Standard
Risk of bias: Could the reference standard, its conduct, or its interpretation have introduced bias?
Signalling question 1: Is the reference standard likely to correctly classify the target condition? Endoscopy is considered to be the gold standard for diagnosis of IBD. We will score 'yes' for all studies.
Signalling question 2: Were the reference standard results interpreted without knowledge of the results of the index test? We will score ’yes’ if blinding was explicitly stated, or it was clear that the reference standard was performed at a separate outpatient location or performed by different people. We will score ‘no’ if the study stated that the reference standard result was interpreted with knowledge of index test results. We will score ’unclear’ if insufficient information was reported to allow a judgement.
Risk of bias will be scored ‘low concern’ if the reference standard results were interpreted without knowledge of the results of the index test; ‘high concern’ if the study explicitly stated the result of the reference standard was interpreted with knowledge of the index test results. We will score ‘unclear concern’ if insufficient information was reported to allow a judgement.
Applicability: Are there concerns that the target condition as defined by the reference standard does not match the question? Endoscopy is considered to be the gold standard for diagnosis of IBD. We feel that applicability of the reference standard is not a concern for this review.

Domain 4: Flow and Timing
Risk of bias: Could the patient flow have introduced bias?

Signalling question 1: Was there an appropriate interval between the index test and reference standard? In the majority of included studies, we expect specimens for the index tests to be obtained two weeks (or less) before endoscopy. However, even if there is a delay of several days or weeks between index test and reference standard, UC and CD are chronic diseases and we consider misclassification of disease status to be unlikely. We will score ‘yes’ for studies that report an appropriate interval and 'unclear' for studies that do not report the interval.
Signalling question 2: Did all patients receive a reference standard? We will score ’yes’ if all patients received a reference standard; ‘no’ if some patients did not receive the reference standard; and ’unclear’ if insufficient information was reported to allow a judgement.
Signalling question 3: Did all patients receive the same reference standard? We will score ‘yes’ if it is clear that all patients in the study received the same reference standard; 'no' if it is clear that some patients received a different reference standard (e.g. patients with UC receiving colonoscopy or sigmoidoscopy); and 'unclear' if insufficient information was reported to allow a judgement.
Signalling question 4: Were all patients included in the analysis? We will score 'yes' if the number of patients enrolled matches the number of patients included in the analysis; 'no' if the number of patients enrolled does not match the number of patients included in the analysis; and 'unclear' if insufficient information was reported to allow a judgement.
Risk of bias will be scored ‘low concern’ if the number of patients enrolled was clearly reported and corresponds to the number of patients in the analysis or if exclusions were adequately described. We will score ’high concern’ if there was a high proportion of missing patients or patients excluded from the analysis and there was no explanation given; and ’unclear concern’ if insufficient information was reported to allow a judgement (i.e. the number of patients originally enrolled in the study was not explicitly reported).

Contributions of authors

MM formulated the research question and drafted the protocol. MM will screen the studies, perform data extraction and write the review.

MF will assist with screening studies and data extraction.

SKG will assist with screening studies and data extraction.

SGF will assist with literature search, screening studies and data extraction.

KAB will perform the literature search and assist with screening studies and data extraction.

GYZ will perform the statistical analysis.

JKM provided methodological expert opinion, was involved in decision making and helped to draft the protocol. JKM will assist with screening studies, data extraction and writing the review.

WJS reviewed the protocol.

NC reviewed the protocol.

Declarations of interest

MM: None known.

MF: None known.

SKG: None known.

SGF: None known.

KAB: None known.

GYZ: None known.

JKM has received fees for consultancy from Tillotts Pharma AG. All of these financial activities are outside the submitted work.

WJS: All potential conflicts of interest for William Sandborn have been noted on the form. All of these financial activities are outside the submitted work.

NC has received fees for consultancy from Abbott/AbbVie and Ferring, fees for lectures from Abbott and Janssen, travel expenses from Merck and has stock/stock options in Pfizer, GlaxoSmithKline, Procter and Gamble and Johnson and Johnson. All of these financial activities are outside the submitted work.

Notes

None
