ACADEMIC EMERGENCY MEDICINE 2012; 19:48–55 © 2012 by the Society for Academic Emergency Medicine
Objectives: Over the past decade, clinicians have become increasingly reliant on computed tomography (CT) for the evaluation of patients with suspected acute appendicitis. To limit the radiation risks and costs of CT, investigators have searched for biomarkers to aid in diagnostic decision-making. We evaluated one such biomarker, calprotectin or S100A8/A9, and determined the diagnostic performance characteristics of a developmental biomarker assay in a multicenter investigation of patients presenting with acute right lower quadrant abdominal pain.
Methods: This was a prospective, double-blinded, single-arm, multicenter investigation performed in 13 emergency departments (EDs) from August 2009 to April 2010 of patients presenting with acute right lower quadrant abdominal pain. Plasma samples were tested using the investigational S100A8/A9 assay. The primary outcome of acute appendicitis was determined by histopathology for patients undergoing appendectomy or 2-week telephone follow-up for patients discharged without surgery. The sensitivity, specificity, negative likelihood ratio (LR–), and positive likelihood ratio (LR+) of the biomarker assay were calculated using the prespecified cutoff value of 14 units. A post hoc stability study was performed to investigate the potential effect of time and courier transport on the measured value of the S100A8/A9 assay test results.
Results: Of 1,052 enrolled patients, 848 met criteria for analysis. The median age was 24.5 years (interquartile range [IQR] = 16–38 years), 57% were female, and 50% were white. There was a 27.5% prevalence of acute appendicitis. The sensitivity and specificity for the investigational S100A8/A9 assay in diagnosing acute appendicitis were estimated to be 96% (95% confidence interval [CI] = 93% to 98%) and 16% (95% CI = 13% to 19%), respectively. The LR− was 0.24 (95% CI = 0.12 to 0.47), and the LR+ was 1.14 (95% CI = 1.10 to 1.19). The post hoc stability study demonstrated that in the samples that were shipped, the estimated time coefficient was 7.6 × 10⁻³ ± 2.0 × 10⁻³ log units/hour, representing an average increase of 43% in the measured value over 48 hours; in the samples that were not shipped, the estimated time coefficient was 2.5 × 10⁻³ ± 0.4 × 10⁻³ log units/hour, representing a 13% increase on average in the measured value over 48 hours, which was the maximum delay allowed by the study protocol. Thus, adjusting the cutoff value of 14 units by the magnitude of systematic inflation observed in the stability study at 48 hours would result in a new cutoff value of 20 units and a “corrected” sensitivity and specificity of 91% and 28%, respectively.
Conclusions: In patients presenting with acute right lower quadrant abdominal pain, we found the investigational enzyme-linked immunosorbent assay (ELISA) test for S100A8/A9 to perform with high sensitivity but very limited specificity. We found that shipping effect and delay in analysis resulted in a subsequent rise in test values, thereby increasing the sensitivity and decreasing the specificity of the test. Further investigation with hospital-based laboratory analyzers is the next critical step for determining the ultimate clinical utility of the ELISA test for S100A8/A9 in ED patients presenting with acute right lower quadrant abdominal pain.
Abdominal pain is the most common reason for seeking emergency care in the United States, and acute appendicitis is the most common surgical etiology.1 As clinical assessment is difficult, with only half of all appendicitis cases presenting in a “classic” pattern,2 clinicians frequently rely on imaging, such as computed tomography (CT), to help distinguish self-limited causes of abdominal pain from those requiring timely surgical or medical intervention. The risk of perforation or complication is over 90% in adults in whom appendicitis is missed in the initial presentation.3 Despite a dramatic increase in the use of abdominal CT, many of these examinations are thought to be unnecessary or of marginal clinical benefit, with unnecessary radiation exposure and costs.4–6 Health services literature has demonstrated more than a doubling in abdominal CT use over a 5-year period, with no increase in detection rate for appendicitis or reduction in hospitalizations observed in a national sample of patients presenting with abdominal pain.6 Previous studies have sought to identify biomarkers that could increase diagnostic accuracy and limit the utilization of CT, with limited success thus far.7–13
Calprotectin, or S100A8/A9, is a calcium-binding protein associated with acute inflammation. Growing evidence from laboratory, animal, and now human studies suggests utility as a biomarker for a number of inflammatory conditions,14–16 including intestinal inflammatory conditions.17–21 In one pilot study of 181 acute abdominal pain patients with a 23% prevalence of acute appendicitis, the sensitivity and specificity of S100A8/A9 for appendicitis were estimated to be 93% and 54%, respectively.22
The objective of this study was to evaluate the diagnostic performance characteristics of the enzyme-linked immunosorbent assay (ELISA) test for S100A8/A9 in a large multicenter investigational study of adult and pediatric patients with acute right lower quadrant abdominal pain. As this study used transported shipments, we performed a post hoc stability study to investigate the potential effect of time and courier transport on the measured value of the S100A8/A9 assay test results.
This was a prospective, double-blinded, single-arm, multicenter investigation to evaluate the diagnostic performance characteristics of an ELISA test for S100A8/A9. Institutional review board approval was obtained at each institution, and written informed consent was obtained from each patient or parent, with pediatric assent from minors.
Study Setting and Population
The study population was a convenience sample of patients presenting to participating emergency departments (EDs) with acute right lower quadrant abdominal pain. The study was performed between August 2009 and April 2010 at 13 ED sites with annual censuses ranging from 36,000 to 112,000 visits (see acknowledgements for a site listing).
Adult and pediatric patients of all ages presenting with abdominal pain to the ED of a participating site were screened for study eligibility. Inclusion criteria were a chief complaint of abdominal pain lasting 72 hours or less (acute) and pain located primarily in the right lower quadrant. Exclusion criteria were active inflammatory bowel disease, history of abdominal trauma or invasive abdominal medical procedure within the prior 14 days, prior appendectomy, inability or unwillingness to provide informed consent or pediatric assent, participation in a research protocol in the preceding 28 days, inability or unwillingness to comply with study protocols and follow-up requirements, or completion of radiologic imaging procedures (CT, magnetic resonance imaging, or ultrasound) for the current episode of abdominal pain before evaluation at the participating site. Patients who met all inclusion criteria and no exclusion criteria and who granted consent were enrolled. Consent covered both a blood draw for this study and contact for follow-up.
Standardized data collection forms were used to record demographic data, historical and physical examination information, and the results of pertinent laboratory or radiographic imaging studies. Imaging and other diagnostic tests were not required and were determined by the treating physicians. Participants who did not proceed to appendectomy during the initial episode of care had telephone follow-up performed at 2 weeks (±3 days) to determine whether the subject had an appendectomy or persistent symptoms during the follow-up period.
A whole blood sample for plasma was obtained from each subject and processed on site for the determination of S100A8/A9 levels. Samples were centrifuged within 2 hours of the initial blood draw, and the plasma was then transferred to a test tube and refrigerated. Plasma samples were shipped daily from the sites to a central laboratory where analysis of S100A8/A9 levels was completed. A sandwich ELISA (AspenBio Pharma, Inc., Castle Rock, CO) was performed by laboratory technicians following the Investigational Use Product Insert. A preliminary normal range was determined in a prior pilot study, and a cutoff value of 14 units was prospectively defined to interpret the assay results.22 The laboratory personnel performing the assay were blinded to the patients’ medical information and final diagnoses to minimize observer bias. Results of the ELISA test for S100A8/A9 were neither made available to, nor were they accessible by, the clinical study sites or anyone having clinical patient information during the course of the study and were not used to direct patient care. For those patients who underwent appendectomy, tissue slides were obtained for review.
The criterion standard for the presence of acute appendicitis was established in one of two ways. For patients who underwent an appendectomy at the enrolling hospital site, the presence or absence of appendicitis was determined by histopathologic examination of the excised appendix tissue by an independent pathology panel of three blinded, board-certified pathologists using protocol-specified definitions. Each of the three members of the pathology panel reviewed the slides of the appendix tissue removed by appendectomy, and evaluated the slides using the following definitions: 1) not appendicitis—no neutrophilic infiltrate or a process other than inflammation present; 2) chronic appendicitis—lymphocytic/eosinophilic predominance of infiltrate equal to or greater than 75% of cells and no area of only neutrophilic infiltrate (acute inflammation); and 3) acute appendicitis—significant neutrophilic infiltrate involving at least the mucosa, submucosa, and/or muscularis propria. Acute appendicitis was considered the positive diagnosis; the other two categories were considered negative. If discrepancies between the three pathology panel members were noted in the primary tissue categorization, a sponsor designee, who was blinded to clinical information, coordinated a face-to-face meeting with all three reviewers to review the relevant slides on a multihead microscope and discuss tissue characteristics and their rationale for categorization. A final decision was made by consensus agreement as to the category that best characterized the tissue.
For patients who were discharged from the ED visit or hospital without surgery, the presence of appendicitis was established by telephone follow-up that determined whether the patient underwent an appendectomy within 2 weeks following study enrollment. For patients who underwent an appendectomy within the 2-week follow-up period at an institution other than the enrolling hospital, the final diagnosis was established by the results of the institution’s reviewing clinical pathologist (if available). If the pathology report was not available for these patients, the surgical report served as the source for establishing the final diagnosis. The final population of subjects for analysis included those who were enrolled with a valid blood sample collected, met the inclusion/exclusion criteria, and completed the 2-week follow-up.
After the primary analysis was completed, a post hoc stability study was performed to investigate the potential effect of time and courier transport on the measured value of the S100A8/A9 assay test results. To examine the effect of time alone, blood samples from 20 asymptomatic volunteers were analyzed immediately after collection and were stored at 4°C before reanalysis at 24, 48, 72, and 96 hours postcollection, without transportation between locations. To investigate the combined effect of time and transportation, samples from 12 of the subjects were shipped via overnight delivery service in a manner identical to that used in the protocol, but back to the original site. These samples were analyzed 24 and 48 hours after collection. The sample sizes of 20 and 12 were chosen arbitrarily, as there was no background information on which to base a sample size calculation.
Skewed continuous variables were summarized with medians and interquartile ranges (IQRs) and compared with the Wilcoxon rank sum test. Categorical variables are presented as percentages and were compared using Fisher’s exact test. The sensitivity, specificity, predictive values, and likelihood ratios of the ELISA test for S100A8/A9 were calculated using the prespecified cutoff value of 14 units. Ninety-five percent confidence intervals (95% CIs) for the accuracy indexes were calculated. The area under the receiver operating characteristic (ROC) curve, with the associated 95% CI, was determined for the ELISA test for S100A8/A9.
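The text does not state which interval method was used for the accuracy indexes. As an illustrative sketch only, the Wilson score interval is one common choice for a binomial proportion. The function name `wilson_interval` is hypothetical, and the counts are those reported for the assay at the 14-unit cutoff (225 of 234 patients with appendicitis at or above the cutoff; 98 of 614 patients without appendicitis below it).

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

# Sensitivity: 225 of 234 patients with appendicitis tested >= 14 units.
sens_lo, sens_hi = wilson_interval(225, 234)
# Specificity: 98 of 614 patients without appendicitis tested < 14 units.
spec_lo, spec_hi = wilson_interval(98, 614)

print(f"sensitivity 95% CI: {sens_lo:.3f} to {sens_hi:.3f}")
print(f"specificity 95% CI: {spec_lo:.3f} to {spec_hi:.3f}")
```

Under this assumed method the intervals fall close to the reported 93% to 98% and 13% to 19%; an exact (Clopper-Pearson) interval would be slightly wider.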
For the post hoc stability study, we performed a linear regression analysis of log-transformed assay values to determine the effect of time, both with and without transport, on the test value. We employed a mixed linear model with a random, subject-specific intercept at time = 0 and a fixed coefficient or slope for the time variable. Visual inspection of the data failed to reveal any clear nonlinear trends, significant outliers, or heteroscedasticity. Because of the log transformation, the slope represents the average exponential rate of change, if any, that occurs in the assay results over time. Transported samples were analyzed separately from the nontransported specimens. All analyses were performed using SAS (Version 9.1.3, SAS Institute, Cary, NC), JMP 9.0 (SAS Institute, Cary, NC), and StatXact for Windows (Version 8, Cytel, Inc., Cambridge, MA).
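To make the role of the subject-specific intercept and the shared slope concrete, the following is a simplified stand-in, not the SAS mixed model used in the study: it centers log values and times within each subject and fits a single least-squares slope. The function name, the data, and the use of the natural logarithm are all assumptions for illustration.

```python
import math

def common_log_slope(records):
    """Estimate one shared slope of log(value) on time, allowing each
    subject its own intercept, by centering within subject."""
    by_subject = {}
    for subject, hours, value in records:
        by_subject.setdefault(subject, []).append((hours, math.log(value)))
    sxy = sxx = 0.0
    for points in by_subject.values():
        t_mean = sum(t for t, _ in points) / len(points)
        y_mean = sum(y for _, y in points) / len(points)
        for t, y in points:
            sxy += (t - t_mean) * (y - y_mean)
            sxx += (t - t_mean) ** 2
    return sxy / sxx

# Hypothetical, noise-free data: two subjects with different baselines,
# both rising at 0.0025 log units/hour.
records = [
    (subject, hours, baseline * math.exp(0.0025 * hours))
    for subject, baseline in (("A", 8.0), ("B", 15.0))
    for hours in (0, 24, 48, 72, 96)
]
slope = common_log_slope(records)
print(f"estimated slope: {slope:.4f} log units/hour")
```

Because each subject is centered on its own mean, differences in baseline level do not affect the slope, which is the same separation the random intercept provides in the mixed model.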
From August 2009 until April 2010, a total of 1,052 patients were enrolled, of whom 204 were excluded, leaving 848 patients for analysis (Figure 1). The median age of the study group was 24.5 years (IQR = 16 to 38 years), 57% were female, and the prevalence of acute appendicitis was 27.5%. Patients with appendicitis were more likely to be male and younger than those without appendicitis. The demographic and clinical characteristics of the appendicitis and nonappendicitis groups are shown in Table 1.
|Patient Characteristics||Appendicitis, n (%)||Not Appendicitis, n (%)|
|Total||N = 234||N = 614|
|Age, yr: median (IQR)||22 (13–36.3)||25 (17–39)|
|≤12 yr||48 (21)||71 (12)|
|Female||94 (40)||393 (64)|
|White||126 (54)||299 (49)|
|Black or African American||19 (8)||131 (21)|
|Hispanic||72 (31)||153 (25)|
|Asian||10 (4)||19 (3)|
|Other||7 (3)||12 (2)|
|Duration of symptoms|
|≤24 hours||145 (62)||350 (57)|
|>24 to 72 hours||89 (38)||264 (43)|
|Locations of pain|
|RLQ||234 (100)||614 (100)|
|RUQ||13 (6)||67 (12)|
|LLQ||24 (10)||97 (16)|
|LUQ||4 (2)||24 (4)|
|Epigastric||14 (6)||39 (6)|
|Pelvic||1 (0.5)||12 (2)|
|Other||53 (23)||156 (25)|
|Multiple locations||88 (38)||281 (46)|
|Similar abdominal pain previously|
|Yes||20 (9)||127 (21)|
|No||213 (91)||487 (79)|
|Inflammatory bowel disease||0 (0)||6 (1)|
|Irritable bowel syndrome||3 (1)||10 (2)|
|Cholecystitis||2 (1)||19 (3)|
|Diverticulitis||0 (0)||10 (2)|
|None||212 (91)||501 (82)|
|Periumbilical pain with migration to RLQ||155 (66)||389 (63)|
|Anorexia||113 (48)||175 (29)|
|Vomiting||105 (45)||254 (41)|
|Dysuria||10 (4)||50 (8)|
|Diarrhea||37 (16)||127 (21)|
|Nausea||155 (66)||389 (63)|
|Constipation||19 (8)||57 (9)|
|Tenderness present||228 (97)||582 (95)|
|Rebound tenderness||66 (28)||67 (12)|
|Rigidity and guarding||81 (35)||90 (15)|
|Rovsing’s sign||45 (19)||44 (7)|
|Obturator sign||11 (5)||20 (3)|
|Psoas sign||23 (10)||29 (5)|
|No tenderness||3 (1)||28 (5)|
Diagnostic imaging obtained was CT in 66% and ultrasound in 22%. Of the 823 patients (97%) who underwent laboratory testing, 47% had an elevated white blood cell count (>10.5 × 10⁹/L). Imaging and laboratory characteristics are listed in Table 2. The test performance characteristics of the ELISA test for S100A8/A9 for diagnosing acute appendicitis are presented in Table 3. Figure 2 demonstrates an ROC curve of assay results with an area under the curve of 0.66 (95% CI = 0.617 to 0.694).
|Characteristic||Appendicitis||Not Appendicitis|
|White blood cell count (×10⁹/L)||N = 233||N = 590|
|>10.5, n (%)||195 (84)||190 (32)|
|Median (IQR 25%–75%)||14.2 (11.4–17.1)||8.6 (6.7–11.6)|
|Neutrophil %||n = 217||n = 550|
|>70%, n (%)||179 (82)||252 (46)|
|Median (IQR 25%–75%)||81 (74–86)||69 (59–80)|
|Absolute neutrophil count (×10⁹/L)||n = 208||n = 531|
|Median (IQR 25%–75%)||11.4 (8.8–14.6)||5.6 (4.0–8.9)|
|Imaging data,* n (%)|
|CT scan||n = 159||n = 399|
|Positive for appendicitis||149 (94)||14 (4)|
|Perforated appendicitis||5 (3)||1 (7)|
|Negative for appendicitis||2 (1)||253 (63)|
|No visualized appendix||1 (<1)||27 (7)|
|Inconclusive||5 (3)||10 (3)|
|Other||2 (1)||95 (24)|
|Ultrasound||n = 41||n = 145|
|Positive for appendicitis||23 (56)||2 (1)|
|Negative for appendicitis||1 (2)||32 (22)|
|No visualized appendix||4 (10)||62 (43)|
|Inconclusive||5 (12)||9 (6)|
|Other||8 (20)||40 (28)|
|Imaging not performed||n = 34||n = 70|
|S100A8/A9 result||Appendicitis, n (%)||Not Appendicitis, n (%)||Total, n|
|<14 units||9 (4%)||98 (16%)||107|
|≥14 units||225 (96%)||516 (84%)||741|
|Measure||Result||95% CI|
|Negative predictive value, %||91.6||84.8–95.5|
|Positive predictive value, %||30.4||27.2–33.8|
|Negative likelihood ratio||0.24||0.12–0.47|
|Positive likelihood ratio||1.14||1.10–1.19|
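As a worked check, the accuracy indexes can be recomputed directly from the four cell counts in the table above. This is a minimal sketch using the standard definitions; the variable names are illustrative.

```python
# Counts from the 2x2 table above (cutoff: 14 units).
tp, fn = 225, 9      # appendicitis: test >= 14 units, test < 14 units
fp, tn = 516, 98     # not appendicitis: test >= 14 units, test < 14 units

sensitivity = tp / (tp + fn)          # positives among those with appendicitis
specificity = tn / (tn + fp)          # negatives among those without
ppv = tp / (tp + fp)                  # positive predictive value
npv = tn / (tn + fn)                  # negative predictive value
lr_pos = sensitivity / (1 - specificity)
lr_neg = (1 - sensitivity) / specificity

print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

The predictive values depend on the prevalence in this sample and would differ in a population with a different rate of appendicitis; the likelihood ratios do not have that dependence.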
Stability Study Results
The results of the post hoc stability study demonstrated that in the samples that were not shipped, the estimated time coefficient was 2.5 × 10⁻³ ± 0.4 × 10⁻³ log units/hour, representing a 13% increase on average in the measured value over 48 hours, which was the maximum delay allowed by the study protocol. For the samples that were shipped, the estimated time coefficient was 7.6 × 10⁻³ ± 2.0 × 10⁻³ log units/hour, representing an average increase of 43% in the measured value over 48 hours. For example, adjusting the cutoff value of 14 units by the magnitude of systematic inflation observed in the stability study at 48 hours would result in a new cutoff value of 20 units and a “corrected” sensitivity and specificity of 91% and 28%, respectively.
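The conversion from a slope on the log scale to a percent change can be sketched as follows, assuming the log transformation was the natural logarithm (the text does not say, but this assumption matches the reported percentages). The function name is hypothetical.

```python
import math

def percent_increase(slope_per_hour, hours):
    """Percent change implied by a constant slope on the natural-log scale."""
    return (math.exp(slope_per_hour * hours) - 1) * 100

not_shipped = percent_increase(2.5e-3, 48)   # reported as 13%
shipped = percent_increase(7.6e-3, 48)       # reported as 43%

# Scaling the prespecified cutoff by the reported 43% inflation.
adjusted_cutoff = 14 * 1.43

print(f"not shipped: {not_shipped:.1f}% over 48 hours")
print(f"shipped:     {shipped:.1f}% over 48 hours")
print(f"adjusted cutoff: {adjusted_cutoff:.1f} units")
```

With the rounded published slopes, the shipped figure comes out about one percentage point above the reported 43%, presumably because the authors worked from unrounded coefficients.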
For the diagnosis of acute appendicitis, our study found a high sensitivity (96%) and high negative predictive value (92%), but a low specificity (16%) for the ELISA test for S100A8/A9 in patients presenting to the ED with acute right lower quadrant abdominal pain. This suggests potential use of this test as a screening tool in a subset of low-risk patients with suspicion of appendicitis, the results of which may obviate the need for further radiographic testing, surgical consultation, and/or admission.
Given the wide array of clinical presentations of acute appendicitis and the lack of useful risk stratification tools, clinicians rely on diagnostic studies to aid in the accurate diagnosis of ED patients with abdominal pain suspicious for acute appendicitis. Imaging studies such as ultrasound and CT are often used in this evaluation, but as ultrasound has lower than ideal sensitivity, CT use has increased dramatically in the United States. Fear of malpractice and poor outcome may also lead to increased use of radiographic imaging. Appendicitis is one of the top three acute medical conditions associated with litigation against emergency physicians, resulting in claims paid to patients in almost one-third of cases.23
Computed tomography presents several well-documented downsides, including exposure to a significant amount of radiation, risks of intravenous contrast, limited availability of skilled technicians at all hours at many institutions, increased resource use and length of stay, and cost.24 As such, there is a need for better tools to help distinguish patients who should proceed to CT for further diagnostic assessment from those for whom limited or no further diagnostic evaluation related to presumption of appendicitis is necessary. The ELISA test for S100A8/A9 has the potential to fill this need and, when used appropriately, could significantly affect patient management in the ED setting.
Earlier studies examining the utility of other inflammatory biomarkers for the diagnosis of appendicitis have demonstrated only moderate sensitivities ranging from 63% to 85%, restricting clinical utility.7,25,26 Calprotectin, or S100A8/A9, has been shown to be useful as a biomarker in a number of inflammatory conditions. Calprotectin levels have been shown to rise in acute inflammatory conditions of the gut, leading to the use of fecal levels in the diagnosis of inflammatory bowel disease.27
This study shows a sensitivity of 96%, higher than that seen in the prior pilot study that was conducted in a smaller, homogeneous study population.22 This high degree of sensitivity found for the ELISA test for S100A8/A9 could allow for its use in patients presenting with signs and symptoms suspicious for possible acute appendicitis to provide valuable additional information to clinicians. In patients with a low pretest probability of disease, a negative test could decrease the likelihood of appendicitis as the diagnosis, and the need for further testing such as CT or admission could be avoided in a select population in the ED. This reduction in unnecessary diagnostic evaluation could lead to more efficient utilization of ED space, hospital personnel, and diagnostic resources, while also decreasing radiation exposure to patients.
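To illustrate how a negative result shifts the probability of disease, the sketch below applies the reported LR− of 0.24 to several pretest probabilities using the odds form of Bayes' theorem. The 5% and 10% pretest values are hypothetical examples of a low-risk patient; 27.5% is the prevalence observed in this study. The function name is illustrative.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

LR_NEG = 0.24  # negative likelihood ratio reported for the 14-unit cutoff

for pretest in (0.05, 0.10, 0.275):
    post = post_test_probability(pretest, LR_NEG)
    print(f"pretest {pretest:.1%} -> post-test {post:.1%} after a negative result")
```

At the study prevalence, the post-test probability after a negative result is roughly 8%, which agrees with the complement of the reported negative predictive value; at lower pretest probabilities the residual risk is correspondingly smaller.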
Stability Study Discussion
The observed sensitivity in our study (96.2%) was greater than that observed in the prior pilot study (90%) despite using the same cutoff value. In contrast, our observed specificity (16.0%) was less than that observed previously (33%). This combination of observations suggests that the values in the current study were systematically shifted upward, resulting in a cutoff that effectively was lower than in the pilot study. The stability study results provide one possible explanation. In the pilot study, samples were analyzed on site within 2 to 4 hours of collection time. However, in this study, samples were shipped to an offsite location for processing. There was a bimodal distribution of the time between sample collection and sample analysis with most samples being analyzed within either 24 ± 6 or 42 hours. The mean and median times between sample collection and analysis were 32 and 29.5 hours, respectively. The stability study demonstrated that the measured value of the ELISA test for S100A8/A9 can increase 13% to 43% due to a shipping effect and a delay in analysis. A systematic inflation in test values due to these effects would be equivalent to lowering the test threshold for positivity, increasing the sensitivity, and decreasing the specificity. Further research is needed to determine how the test will perform when run in real time with on-site laboratory-based analyzers.
Other unforeseen factors that may have influenced the specificity of the test include comorbid conditions such as bacterial diarrhea, rheumatoid arthritis, or secondary infections, each of which has been shown to increase calprotectin levels.28,29 Knowledge of how these conditions affect the test may be beneficial in creating decision rules for its use, similar to rules created for the use of D-dimer in the evaluation of thromboembolic disease.30 We do not foresee this test being used as a definitive diagnostic test, but rather as an adjunctive test for the diagnosis of appendicitis.
Our limitations include the use of telephone and record follow-up for those who were discharged at the enrolling visit without surgery, which was determined to be the best way to verify the outcome of interest. Of note, rates of appendicitis differed by site, from 5.9% to 43.3%. Our methods did not track all missed eligible patients, so enrollment bias could not be assessed more rigorously. The ELISA test for S100A8/A9 was used on all patients who had appendicitis in the differential diagnosis. Future studies may evaluate specific subsets of patients that could not be stratified in a study of this size, such as those with migrating pain or with pain and fever. In our population, patients with appendicitis were more likely to be male and younger than those without appendicitis, which likely reflects the population of patients who develop appendicitis. While we have no evidence that demographic characteristics affect the results of the test, we cannot exclude this possibility. The last potentially important limitation is that shipping and delay in sample analysis may have affected the ELISA test results, as described in the Results section. While the post hoc stability analysis provides an estimate of the effect, it is unclear how the results would have differed had testing been performed in real time on site.
In patients presenting with acute right lower quadrant abdominal pain, we found the enzyme-linked immunosorbent assay test for S100A8/A9 to perform with high sensitivity but very limited specificity and an area under the curve of only 0.66. We have found that shipping effect and delay in analysis resulted in a subsequent rise in test values, thereby increasing the sensitivity and decreasing the specificity of the test. Further investigation with on-site hospital-based laboratory analyzers is the next critical step for determining the ultimate clinical utility of the ELISA test for S100A8/A9 in ED patients presenting with acute right lower quadrant abdominal pain.
The authors acknowledge South Shore Hospital, South Weymouth, MA (John Benanti, MD, Maureen DeMenna, BSN); University of Pennsylvania Medical Center, Philadelphia, PA (Angela M. Mills, MD, Christine McCusker, BSN: two separate sites); Cooper University Hospital, Camden, NJ (Brigitte Baumann, MD, Lisa Capano-Wehrle); Olive View UCLA Medical Center, Sylmar, CA (David Talan, MD, Kavitha Pathmarajah, MPH); Harbor-UCLA Medical Center, Torrance, CA (Roger Lewis, MD, PhD, Heemun Kwok, MD, Ekaterina Tzvetkova, Si-Kyung Jung); Johns Hopkins University Medical Center, Baltimore, MD (Richard Rothman, MD, Alex Kecojevic, MPH); Maricopa Integrated Health System, Phoenix, AZ (Frank Lovecchio, MD, Cheri Pantoja); North Shore Long Island Jewish Medical Center, New Hyde Park, NY (William Krief, MD, Sandra DeCicco, MD); Bay State Health Center, Springfield, MA (Aaron Hexdall, MD, Howard Smithline, MD, Fidela Blank, Rick Barus); Newton-Wellesley Hospital, Newton, MA (David Huckins, MD, Adam Potter); Metrohealth Medical Center, Cleveland, OH (Jon Schrock, MD, Julie Nichols, RN, Heather Federle, RN); and Cincinnati Children’s Hospital Medical Center, Cincinnati, OH (Richard Ruddy, MD, Richard Strait, MD, Nicole McClanahan)