ACADEMIC EMERGENCY MEDICINE 2011; 18:988–1000 © 2011 by the Society for Academic Emergency Medicine
Objectives: The objectives were to conduct a comprehensive, systematic review of the literature for risk adjustment measures (RAMs) and outcome measures (OMs) for prehospital trauma research and to use a structured expert panel process to recommend measures for use in future emergency medical services (EMS) trauma outcomes research.
Methods: A systematic literature search and review was performed identifying the published studies evaluating RAMs and OMs for prehospital injury research. An explicit structured review of all articles pertaining to each measure was conducted using the previously established methodology developed by the Canadian Physiotherapy Association (“Physical Rehabilitation Outcome Measures”).
Results: Among the 4,885 articles reviewed, 96 RAMs and/or OMs were identified from the existing literature (January 1958 to February 2010). Only one measure, the Glasgow Coma Scale (GCS), currently meets Level 1 quality of evidence status and a Category 1 (strong) recommendation for use in EMS trauma research. Twelve RAMs or OMs received Category 2 status (promising, but not sufficient current evidence to strongly recommend), including the motor component of GCS, simplified motor score (SMS), the simplified verbal score (SVS), the revised trauma score (RTS), the prehospital index (PHI), EMS provider judgment, the revised trauma index (RTI), the rapid acute physiology score (RAPS), the rapid emergency medicine score (REMS), the field trauma triage (FTT), the pediatric triage rule, and the out-of-hospital decision rule for pediatrics.
Conclusions: Using a previously published process, a structured literature review, and consensus expert panel opinion, only the GCS can currently be firmly recommended as a specific RAM or OM for prehospital trauma research (along with core measures that have already been established and published). This effort highlights the paucity of reliable, validated RAMs and OMs currently available for outcomes research in the prehospital setting and hopefully will encourage additional, methodologically sound evaluations of the promising, Category 2 RAMs and OMs, as well as the development of new measures.
While emergency medical services (EMS) systems have developed widely in the United States and much of the developed world, little is known about the actual effect of prehospital interventions on patient outcomes. Identifying “what works” in EMS has been significantly hampered by the absence of validated scientific methodologies by which to reliably evaluate the effect of prehospital care and the systems in which it is deployed. The identification of this “vacuum” led the National Highway Traffic Safety Administration to fund the Emergency Medical Services Outcomes Project (EMSOP). The purpose of EMSOP is to develop a foundation and framework for prehospital outcomes research.1 In prior work, the EMSOP team delineated the priority conditions2 for emphasis in future EMS outcomes research, described conceptual models for carrying out outcomes research in the prehospital environment,3,4 suggested core risk adjustment (severity) measures (RAMs) and outcome measures (OMs)5 considered essential for conducting meaningful EMS research, and recommended specific RAMs and OMs for evaluating the prehospital treatment of pain6 and respiratory distress.7 In this article, we continue the work of EMSOP by reporting a comprehensive literature search and recommending RAMs and OMs for use in prehospital outcomes research on patients presenting with trauma.
The objectives of this effort were to 1) perform an explicit literature search on current RAMs and OMs that have been developed for trauma to determine which ones can be reliably and reproducibly measured in the prehospital setting and 2) make recommendations for RAMs/OMs to be used in EMS outcomes research. The identification of valid, reliable measures that can be obtained in the field will provide the basis for linkage to distal outcomes obtained after arrival at the hospital and will aid in the development of a solid foundation for evaluating the effectiveness of EMS interventions in the care of trauma patients.
This study involved repeated literature searches using an explicit search methodology for identifying pertinent studies that reported on the development and/or validation of RAMs and OMs for use in EMS trauma outcomes research. The studies that met inclusion criteria were then evaluated by group consensus reviews using an established process for determining the quality of evidence and for making recommendations related to the potential use of the identified RAMs and OMs in prehospital outcomes research.8 We did not evaluate most of the measures that we have previously described as core measures,5 since these are considered routine RAMs/OMs for all EMS conditions, including trauma (Figure 1). However, we did include the studies related to the Glasgow Coma Scale (GCS) and prehospital provider judgment/impression in this review because there is a substantial body of literature specifically evaluating the use of these measures in the setting of prehospital trauma care. In addition, we did not evaluate measures related to pain or discomfort, since we have previously reviewed these RAMs and OMs in detail and made recommendations for their use in many EMS conditions, including trauma.6
Definitions and Usage of Terms
As with previous EMSOP work, we sought to identify measures that could be obtained during the prehospital care interval. For clarity we use the following definitions developed during previous work and based on classical usages from the disciplines of EMS research and outcomes research.1–5
Outcome. The outcome is classically described as one of the “Six Ds”: death, discomfort (pain), disability (functional impairment), disease (physiological parameters), destitution (cost), and dissatisfaction (satisfaction)1–4,9,10 (Table 1).
Table 1

| Outcome | Description | Examples |
|---|---|---|
| Death | Failure to survive the health event that generated the EMS response | Mortality |
| Disease | Physiologic/anatomic abnormalities | Vital signs, GCS, ISS, organ damage |
| Disability | Failure to return to preevent health status | Ability to return to work, ability to do activities of daily living, Cerebral Performance Category, SF-36 |
| Destitution | Cost, cost benefit, cost utility, cost-effectiveness | Comparison of alternative treatments in terms of cost and effect, personal and societal economic impact of health event |
| Dissatisfaction | Patient/family satisfaction | Methods of assessing how satisfied patients/family are with their care and with their other outcomes |
| Discomfort | Uncomfortable sensations or physical suffering | Pain, shortness of breath, visual analog scale for pain, peak expiratory flow rate |
EMS Condition. The EMS condition is an illness, injury, or combination of signs and symptoms that result in EMS system activation. A single EMS condition can encompass multiple diagnoses or diseases. Conversely, a given disease entity may present as various clinical conditions in the EMS setting.
Trauma. Multiple specific injuries and combinations of injuries (in a nearly unlimited number of permutations) are included in this definition. This search dealt exclusively with injury caused by transfer of physical energy (e.g., falls, motor vehicle crashes, and gunshot wounds, but not chemical or thermal burns).
Risk Adjustment Measure (RAM). A RAM is a variable that: 1) meaningfully reflects patient characteristics and clinical attributes, 2) has the potential to affect patient outcomes, and/or 3) has the potential to confound the results of outcomes evaluations. Since measures must be quickly obtainable to be feasible in the prehospital setting, RAMs can often also serve as OMs. An example is measurement of blood pressure before (RAM), and after (OM), an intervention. Previous work by the EMSOP group has described a conceptual framework for risk adjustment and outcomes evaluation in EMS research3,5 and has recommended a standard set of “core” EMS RAMs (described in the Results section).5 Some RAMs will be similarly applied across a wide variety of conditions (e.g., heart rate), while others are specifically developed for particular conditions (e.g., revised trauma score [RTS] is used exclusively for injury). RAMs are essential for evaluating the effectiveness of health care interventions to help minimize the effect of confounding factors.
Outcome Measure (OM). An OM is a variable that meaningfully reflects one or more of the Six Ds (Table 1). Some OMs may be applicable across conditions, whereas others are specific to a particular condition. Ideal OMs for prehospital use are easily and quickly applied, are applicable to all ages, and do not require prolonged training or expensive, complicated, or bulky equipment to acquire.5
Study Phase I
RAMs and OMs were identified by a systematic literature search and a structured review of original research articles pertaining to each potential measure. Measures were evaluated by a method previously used to develop OMs in physical therapy by the Canadian Physiotherapy Association11 and used in previous EMSOP study methodologies.3–7 After evaluation, each measure (and each study evaluating that measure) was discussed by the EMSOP investigators and a decision made about the level of evidence and the category of recommendation (Figure 2).
Step 1: Literature Search Strategy. The initial phase consisted of a systematic literature search and review of the existing published studies evaluating RAMs and OMs for potential use in prehospital injury research. The following sources were searched: Medline/PubMed (1950–January 2008), EMBASE (1980–January 2008), CINAHL (1982–January 2008), the Cochrane Database of Systematic Reviews, the Database of Abstracts of Reviews of Effects (DARE), and the Cochrane Central Register of Controlled Trials. Search results were limited to English language only, and all references contained in the identified studies were subsequently evaluated to see if they met inclusion criteria. The broad search strategy used for OVID Medline/PubMed and EMBASE is given in the next paragraph. Search terms were taken from both the Medical Subject Headings (MeSH) and EMBASE subject headings. These often encompassed broad topical areas. To identify additional articles with a potential connection to EMS RAMs and OMs, specific terms and keyword search combinations were used when logically appropriate. The same approach was used for the other databases, and the search strategy was amended as needed to comply with the subject headings and syntax requirements unique to each.
Broad Search Strategy
- 1 exp *injury scale/or exp *trauma severity indices/or exp *illness Severity of Illness Index/or exp health status indicators/or risk assessment or exp risk assessment/or risk adjustment or exp risk adjustment/or outcome* or exp “outcome and process assessment (health care)”/or exp treatment outcome/or exp outcomes research/or patient satisfaction/
- 2 “Out of hospital” or “out-of-hospital” or prehospital or pre-hospital or exp emergency health service/or exp emergency medical services
- 3 (exp *“wounds and injuries”/or exp *injury/) and (emergency* or trauma*)
- 4 Evaluation or validation study or evaluation studies or validation studies or comparative study or scientific integrity review
- 5 #1 and (#2 or #3) and #4
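The Boolean logic of line 5 can be illustrated with a minimal sketch that treats each numbered search line as a set of record identifiers. The function name and the record IDs below are hypothetical and are used only to show how the four sets combine; they are not part of the actual search.

```python
def combine_search_sets(s1, s2, s3, s4):
    """Return records matching #1 AND (#2 OR #3) AND #4.

    Each argument is a set of record IDs returned by one numbered
    line of the broad search strategy.
    """
    return s1 & (s2 | s3) & s4


# Hypothetical record IDs, for illustration only.
measure_terms = {1, 2, 3, 4, 5}      # line 1: severity, risk, and outcome terms
prehospital_terms = {2, 3, 9}        # line 2: out-of-hospital / EMS terms
injury_terms = {4, 7}                # line 3: injury terms with emergency/trauma
study_type_terms = {2, 4, 5, 9}      # line 4: evaluation / validation studies

hits = combine_search_sets(
    measure_terms, prehospital_terms, injury_terms, study_type_terms
)
print(sorted(hits))
```

A record is retained only if it matches the measure terms and the study-type terms and, in addition, either the prehospital terms or the injury terms.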
Step 2: References Limited. An explicit process for study inclusion was determined and agreed upon a priori and matched that of the previous EMSOP reports identifying RAMs/OMs for other conditions.6,7 The investigators reviewed the titles in a structured manner. One investigator (DB) performed the preliminary title reviews. The sole criterion for inclusion at this step was whether the reviewer thought there might be any possibility that the paper reported an evaluation, comparison, or development of, or otherwise looked at, the validity, reliability, or feasibility of use of any tool or scoring method with potential for use as a prehospital RAM or OM for injury-related research. Studies that simply used measures in clinical trials, but did not specifically develop and/or evaluate RAMs or OMs, were not included.
Step 3: Abstracts Reviewed. The abstracts of the articles selected in Step 2 were reviewed independently by three authors (DB, DS, SK). Studies that did not specifically report on the development, feasibility, reliability, or validity of RAMs or OMs, or evaluate the performance of RAMs or OMs, were excluded from the next part of the process. However, to ensure that potentially eligible studies were not inappropriately removed, unanimous agreement was required to exclude an article from advancing to the next step (i.e., if even one investigator thought that the study might meet inclusion criteria, the reference was advanced to Step 4).
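The unanimous-exclusion rule in Step 3 can be expressed as a short sketch. The function name and the vote representation (one Boolean per reviewer, True meaning "exclude") are assumptions made for illustration.

```python
def advances_to_full_review(exclude_votes):
    """Return True if the abstract advances to Step 4.

    exclude_votes holds one Boolean per reviewer (True = exclude).
    An abstract is dropped only when every reviewer votes to exclude.
    """
    if not exclude_votes:
        raise ValueError("at least one reviewer vote is required")
    return not all(exclude_votes)


# Two of three reviewers vote to exclude: the abstract still advances.
print(advances_to_full_review([True, True, False]))
# All three reviewers vote to exclude: the abstract is dropped.
print(advances_to_full_review([True, True, True]))
```

This rule deliberately favors sensitivity over efficiency at the abstract stage, since a single dissenting reviewer is enough to keep a study in the pool.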
Step 4: Articles Reviewed and Sorted. The full-length papers that reported on the development or evaluation of a RAM or OM were then examined. The articles were sorted based on the measure addressed (e.g., trauma scores, GCS, pediatric trauma). A single investigator using the established guidelines from the “Physical Rehabilitation Outcome Measures” methodology conducted an explicit structured review of each group of articles pertaining to each measure.11 Studies were included if they contained original data evaluating the development and/or performance characteristics (feasibility, reliability, and/or validity) of a RAM and/or OM. The characteristics evaluated for each measure included time taken to collect or complete the measure, cost and training, scaling, reliability, and validity. These were used to determine the level of evidence (Figure 2: Hierarchy of Evidence). The “Physical Rehabilitation Outcome Measures” guidelines were modified to also include feasibility of use in the prehospital setting. This was a key factor in the authors’ deliberations. The measure had to be feasible to use in the prehospital care interval to be considered. For example, the Abbreviated Injury Scale (AIS) was excluded because it cannot be determined in the prehospital setting.
Published literature reviews were also included if they pertained to the development and/or performance characteristics of a measure. Critical appraisals by the investigators were conducted independently and documented in a standardized fashion. The quality of evidence was classified by accepted standards for outcomes or prognosis investigations established by the Oxford Centre for Evidence Based Medicine.8
Step 5: Group Presentations and Consensus. The written individual reviews were presented to the entire group of investigators. Each investigator orally presented the results of his or her review following the modified Canadian Physiotherapy Association guidelines and made recommendations regarding the appropriateness of the measure for EMS trauma outcomes research (Figure 2: Categories of Recommendations). This did not necessarily require the reviewer to present every paper in detail (e.g., unanimous consensus was often reached after the reviewer gave only a brief overview for RAMs/OMs that are inherently not feasible for use as prehospital research tools). After each group of papers (representing a RAM/OM) was presented, a discussion resulted in a decision regarding the level of evidence and a recommendation category for the measure. As with previous EMSOP reports, discussion continued until unanimous agreement existed both for the level of evidence for a RAM/OM and for the category of recommendation. The exception to this was that, with Category 5 measures that were quickly identified as nonfeasible in EMS, the level of evidence was not deliberated upon extensively since this became moot once nonfeasibility was established.
Study Phase II: Repeat Search and Review
Following the identification of each measure, another structured literature search was performed. A MEDLINE/PubMed search using the Ovid search engine was performed of English-language articles from 1959 to February 2010 inclusive. The specific name of each measure was used as a search term, and the title, abstract, and body of each article were searched to find all manuscripts containing the measures. A single investigator (DB) reviewed the title list generated from this search, and any title that might deal with the development or evaluation of the targeted measure was selected. Three investigators (DB, DS, SK) reviewed the abstracts of these articles and, if any reviewer felt that the paper might be appropriate for inclusion, a review of the full-length article was performed. In addition, a careful review of the reference lists for all articles chosen for full review was performed to identify any other potential articles. The results of this review were then discussed with all investigators using the same process, with a written and oral presentation to the entire team, and a consensus was reached on measures to recommend.
Search and Selection
The initial search resulted in a set of 4,885 references. The preliminary review (Step 2) of these titles was conducted by DB and yielded a total of 200 articles that were kept for further review. Step 3 was a review of the abstracts of these articles by three investigators (DB, DS, SK). Twenty-six of the 200 articles were excluded at this step, requiring unanimous agreement that they did not report on the development and/or evaluation of a RAM or OM. Thus, Step 3 resulted in a total of 174 studies selected for full-length article review. Examination of the full papers in Step 4 yielded 162 studies that, after full review, were determined to actually report on the development/evaluation of a RAM or OM (12 were excluded). These underwent critical appraisal via the structured review and consensus process described for Step 5 (Figure 3).
In parallel with the ongoing consensus reviews of the 162 papers (Step 5), a second search was conducted to minimize the likelihood of missing any measures (Phase II). Using the list of identified RAMs and OMs that had been generated from Steps 1–5, a second search using the specific names of the measures (e.g., “Revised Trauma Score”) was conducted to identify additional potential articles that might have been missed with the previous searches (Phase II, Figure 3). This second search resulted in the identification of 2,171 new references. Then, the relevant steps from Phase I were repeated for these newly identified studies. A single investigator review (Step 2, DB) resulted in 26 titles being selected for abstract review. Three investigators (DB, DS, SK) reviewed these 26 abstracts and 12 were unanimously excluded (Step 3). The remaining 14 were then critically appraised (Step 4) and added to the first cohort of standardized reviews. After consensus discussion (Step 5), five newly discovered trauma RAMs and OMs were identified (Rapid Acute Physiology Score [RAPS]; rapid emergency medicine score [REMS]; alert, confused, drowsy, unresponsive [ACDU]; alert, verbal response, painful response, or unresponsive [AVPU]; and Trauma Score [TS] plus mechanism of injury [MOI]). For clarity, the repeat of Steps 2 through 5 in Phase II is shown as Step 6 in Figure 3. The final, comprehensive list of RAMs and OMs identified from the entire search/review is shown in Figure 4.
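The screening counts reported for both phases can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the helper name `remaining` is an assumption introduced for illustration.

```python
def remaining(before, excluded):
    """Return how many records remain after a screening step."""
    if excluded > before:
        raise ValueError("cannot exclude more records than were screened")
    return before - excluded


# Phase I: 4,885 references -> 200 titles kept -> abstracts -> full text.
phase1_titles_kept = 200
phase1_full_text = remaining(phase1_titles_kept, 26)   # abstracts excluded
phase1_appraised = remaining(phase1_full_text, 12)     # full texts excluded

# Phase II: 2,171 references -> 26 titles kept -> abstracts.
phase2_titles_kept = 26
phase2_appraised = remaining(phase2_titles_kept, 12)   # abstracts excluded

print(phase1_full_text, phase1_appraised, phase2_appraised)
```

The computed values match the 174, 162, and 14 studies reported at the corresponding steps.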
Category 1 RAMs/OMs: Recommended for Use in EMS Outcomes Research
Glasgow Coma Scale (GCS). The GCS is a neurologic scale that assesses the level of consciousness of an individual. The scale measures a patient’s best eye response (1–4), best verbal response (1–5), and best motor response (1–6). The three values are summed together to give a score ranging from 3 (deep coma) to 15 (fully awake and verbal). The scale was published in 1974 by Teasdale and Jennett.12 The use of the GCS to assess trauma patients in both the in-hospital and prehospital settings has been studied extensively.13–24 The GCS has been validated in multiple studies and has performed well in predicting four clinically relevant traumatic brain injury (TBI) outcomes (emergency intubation, neurosurgical intervention, brain injury, and mortality).18 GCS measured 6 hours postinjury has been shown to be predictive of outcome in patients with head injury, and it is a reliable physiologic parameter for predicting hospital admission after motor vehicle crashes.22,25
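As a minimal sketch of the scoring arithmetic described above, the total GCS can be computed from its three components with range checks. The function name and error handling are illustrative assumptions rather than a standard implementation.

```python
def gcs_total(eye, verbal, motor):
    """Sum the three GCS components after validating their ranges.

    eye: 1-4, verbal: 1-5, motor: 1-6, giving a total from 3 to 15.
    """
    limits = {"eye": (eye, 4), "verbal": (verbal, 5), "motor": (motor, 6)}
    for name, (value, upper) in limits.items():
        if not isinstance(value, int) or not 1 <= value <= upper:
            raise ValueError(f"{name} must be an integer from 1 to {upper}")
    return eye + verbal + motor


print(gcs_total(1, 1, 1))  # lowest possible total (deep coma)
print(gcs_total(4, 5, 6))  # highest possible total (fully awake and verbal)
```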
Category 2 RAMs and OMs: Promising for Prehospital Trauma Research, but Further Study and Validation Are Needed Before They Can Be Recommended for Use
Motor Score of the GCS. The MGCS is the 6-point subscale of the GCS that measures a patient’s best motor response and ranges from 1 (no movement) to 6 (obeys commands). Multiple studies indicate that MGCS may be as accurate as the total GCS for predicting survival and the need for admission to an intensive care unit (ICU),26 for predicting mortality,27,28 and for prehospital triage.29 However, it did not reach Category 1 (recommended) status because the volume of the literature supporting it was substantially smaller than that of the GCS.
Revised Trauma Score. The RTS was derived by Champion in 1989.30 It consists of a score from 0 to 12, where the GCS score, respiratory rate, and systolic blood pressure (sBP) are given scores from 0 to 4 and then summed. The RTS is a revision of the Trauma Score (TS), which has two other components: capillary refill and respiratory expansion. The TS has been essentially replaced by RTS because these two components are often difficult to assess. The RTS has been validated in a few studies and has generally performed well in predicting survival. It was previously recommended for use in evaluating trauma care by the American College of Surgeons, Committee on Trauma.22,25,31–35 However, after reviewing the studies and the practicality of RTS as a triage criterion, the National Expert Panel on field triage determined that RTS is not useful and deleted it from the 2006 Decision Scheme.36 The Panel noted that the complexity of the formula used to calculate RTS makes doing so in the field unwieldy and time-consuming. In addition, the RTS has not been validated for predicting any outcome other than mortality, and the difficulty in collecting the components of the RTS creates issues for data completeness and validity.36 Although the weighted RTS has been developed to improve its predictive capacity, studies reporting its use are rare, and there is debate regarding the applicability of the published coefficients for broad use.37
Rapid Acute Physiology Score (RAPS). RAPS consists of four variables: pulse, mean arterial pressure, respiratory rate, and GCS. The scores range from 0 (normal) to 16 (very severely ill). It is generally reliable and predictive of a patient’s severity/physiologic stability before and after transport to critical care.38,39
Rapid Emergency Medicine Score (REMS). REMS was adapted from the RAPS score in an attempt to improve its predictive ability.40,41 REMS includes the four data elements of the RAPS and adds pulse oximetry (SpO2; 0–4 points) and patient age (0–6 points). REMS has been shown to have an equivalent predictive power to the APACHE II score for in-hospital mortality.42 A single study has shown that the REMS is more accurate than RAPS in predicting mortality and length of hospital stay.39 However, it has not been evaluated for use in trauma patients. The need for an accurate patient age and pulse oximetry creates additional challenges for its utility in the prehospital setting.
Simplified Motor Score (SMS). The SMS is an easily remembered three-point scale (2 = obeys commands, 1 = localizes pain, 0 = withdraws to pain or less response) that has been shown to be as accurate as GCS in the assessment of patients with altered level of consciousness from both nontraumatic and traumatic causes in a few studies.18,19,43,44 In one validation study, the SMS demonstrated similar test performance when compared with GCS and its components for the prediction of four clinically important TBI outcomes.44
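The three-point SMS can be sketched as a mapping from the GCS motor component. This assumes the standard GCS motor coding in which 6 = obeys commands and 5 = localizes pain; the function name is hypothetical.

```python
def simplified_motor_score(gcs_motor):
    """Map a GCS motor component (1-6) to the three-point SMS.

    Assumes standard GCS motor coding: 6 = obeys commands,
    5 = localizes pain, 4 or below = withdraws to pain or less.
    """
    if not isinstance(gcs_motor, int) or not 1 <= gcs_motor <= 6:
        raise ValueError("GCS motor component must be an integer from 1 to 6")
    if gcs_motor == 6:
        return 2  # obeys commands
    if gcs_motor == 5:
        return 1  # localizes pain
    return 0      # withdraws to pain or less response


for motor in range(1, 7):
    print(motor, simplified_motor_score(motor))
```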
Prehospital Index (PHI). The PHI is a triage-oriented trauma severity scoring system that consists of four components: sBP, pulse rate, respiratory status, and level of consciousness. Each component is scored from 0 to 5, with 0 indicating normal function and 5 indicating maximum physiologic dysfunction. An additional four points are added for the presence of penetrating abdominal or thoracic trauma. The PHI ranges between 0 and 24, with 0 to 3 indicating minor injury, 4 to 7 moderate injury, and >7 severe injury. Retrospective evaluation has shown a negative predictive value (NPV) of 99.7% for those patients categorized as minor injury for needing general surgery. The PHI had a positive predictive value (PPV) of 45.9% for predicting mortality or the need for general surgery.45 A prospective study revealed a NPV of 99.4% and PPV of 52.1% for emergency surgery.46 In this study, among the 3,120 patients scored as “minor trauma” in the field (PHI = 0–3), there was a 0% mortality rate and only a 0.6% emergency operative rate.46
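The PHI arithmetic and severity bands described above can be sketched as follows. The function names are assumptions, and the component values are taken as already-coded integers from 0 to 5, since the coding rules for each component are not reproduced here.

```python
def phi_score(sbp, pulse, respiratory, consciousness, penetrating_torso):
    """Sum four coded PHI components (each 0-5) and add 4 points for
    penetrating abdominal or thoracic trauma. Range: 0 to 24."""
    components = (sbp, pulse, respiratory, consciousness)
    for value in components:
        if not isinstance(value, int) or not 0 <= value <= 5:
            raise ValueError("each component must be an integer from 0 to 5")
    return sum(components) + (4 if penetrating_torso else 0)


def phi_category(score):
    """Classify a PHI score: 0-3 minor, 4-7 moderate, >7 severe."""
    if not 0 <= score <= 24:
        raise ValueError("PHI score must be between 0 and 24")
    if score <= 3:
        return "minor"
    if score <= 7:
        return "moderate"
    return "severe"


score = phi_score(2, 0, 0, 0, penetrating_torso=True)
print(score, phi_category(score))
```

In this hypothetical case, a single mildly abnormal component plus a penetrating torso wound is enough to move the patient out of the minor category.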
Simplified Verbal Score (SVS). The SVS is a three-point scoring system (2 = oriented, 1 = confused conversation, 0 = inappropriate words or less response). In one small prospective study, despite its simplicity, it demonstrated similar test performance to the total GCS score for the prediction of four clinically relevant TBI outcomes (emergency intubation, neurosurgical intervention, brain injury, and mortality).19
EMS Provider Judgment/Impression. Multiple studies have evaluated EMS provider judgment in predicting the “need” for triage to a trauma center.47 Ornato et al.48 found that EMTs were better at identifying critical patients who needed to go to the operating room than the TS or the CRAMS score (circulation, respiration, abdomen, motor, speech). Hedges et al.49,50 initially identified that paramedics could recognize serious injury and determine the need for triage to a trauma center. However, on further study, the authors were concerned about inadequate sensitivity for this purpose. EMS provider judgment was not sufficiently sensitive to be used for determining trauma team activation for pediatric patients.51 In addition, Mulholland et al.52 found no clear evidence supporting EMS provider judgment as an accurate trauma triage method. Lavoie and colleagues53 compared a PHI > 4, high-velocity impact, and EMT judgment and found that combining these three criteria had a sensitivity of 74.2% for making proper triage decisions. However, this sensitivity was associated with a trauma center overtriage rate of 85.1%. While EMT judgment had the highest single-measure sensitivity, it was inferior to the accuracy of combining the measures.
Out-of-hospital Decision Rule for Pediatrics. This rule combines GCS, intrusion into passenger space > 5 inches, and restraint use (yes/no). The rule had a sensitivity of 92% and a specificity of 73% for determining an injury severity score (ISS) ≥16.54 The small prospective validation of the decision rule by Newgard et al.55 had a sensitivity and specificity of 100 and 73%, respectively, for an ISS > 15 or the need for emergent intubation, major nonorthopedic operative intervention, death within 24 hours, or pediatric ICU stay >24 hours. Unfortunately, 20% of the patients were excluded from the analysis due to insufficient data.
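The sensitivity, specificity, and predictive values quoted for triage rules such as this one all derive from a 2×2 table of rule result against true outcome. The sketch below shows the standard formulas; the function name and the counts are hypothetical and were chosen only so that sensitivity and specificity come out at 92% and 73%.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts.

    tp/fn: patients with the outcome who were rule-positive/negative.
    fp/tn: patients without the outcome who were rule-positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, not taken from any cited study.
metrics = diagnostic_metrics(tp=92, fp=27, fn=8, tn=73)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the outcome is in the study population, so they do not transfer directly between settings.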
Revised Trauma Index (RTI). RTI is a derived triage tool calculated by determining the body region, type of injury, cardiovascular response, respiratory effort, and neurologic status. Summing the values of each category from 1 to 6 gives a total score ranging from 3 (minor injury) to >20 (critical).56 Smith and Bartholomew56 showed that an RTI of >15 resulted in an “acceptable” undertriage rate, while obtaining a better rate for overtriage than the TS, CRAMS, PHI, or MOI scales.
Field Trauma Triage (FTT). The FTT combines the PHI with criteria related to MOI in an attempt to increase the accuracy of trauma triage. A combined PHI/MOI score had a sensitivity of 78% with a similar specificity to the PHI alone for identifying those patients with an ISS of ≥16.57
A Pediatric Triage Rule Combining GCS and Heart Rate. This rule was determined retrospectively by combining a GCS < 12 with a heart rate >160 beats/min in children. It had a sensitivity of 98.9% and a specificity of 90.1% measuring morbidity, survival, and hospital charges.58 While this may be promising, there is insufficient evidence to know whether it will be an effective measure when tested prospectively in varying settings.
We conducted a systematic literature search for all studies evaluating RAMs and OMs that could potentially be utilized in EMS trauma outcomes research. The studies that met inclusion criteria were then included in a validated review process carried out by the EMSOP investigators using the process that has been developed and used for previous reports for other measures (core measures, pain, respiratory distress).4–7 While a large number of potential measures were identified in the literature, nearly all of them fell short of the necessary validity, reliability, and feasibility for use in prehospital trauma research.
One of the challenges in the development and validation of useful prehospital RAMs/OMs is that, in classical outcomes and effectiveness research, these measures are developed in a setting where patient populations have specific medical diagnoses. However, in the field, patients present with symptoms, complaints, and/or conditions (e.g., multisystem injury with altered level of consciousness) rather than diagnoses. Thus, any attempt to overlay a research typology that is diagnosis-driven would be artificial, since EMS providers are trained to make and act on assessments of presenting conditions, while diagnoses are established after the fact. It has been recognized that the mixing of multiple diagnoses into a condition may introduce substantial heterogeneity among patients assigned to a specific group. While this creates a significant challenge in the evaluation of EMS, it has not prevented meaningful outcomes research when care has been taken to develop sound methodologies and proper attention is paid to this limitation.59–65
Based on the quality of studies and supporting evidence, GCS is the only measure that can currently be given a strong recommendation for use. GCS was also recommended in EMSOP III as one of the core measures to be used broadly, across conditions, in prehospital research (Figure 1).5 In this effort, we did not specifically examine the core measures even though many of them (e.g., BP) are applicable to trauma. However, the reasons for including GCS (and provider judgment) in this search/analysis were discussed above.
As with all of the measures that can be quickly obtained in the prehospital setting, GCS can be used as both a RAM and an OM. That is, a baseline measurement can be obtained to establish the pretreatment score (thus being used as a risk adjuster) and then be used as an OM to assess the effect of an intervention. The way that this is applied in prehospital outcomes research has been described in previous EMSOP publications.3,4,7
It should be noted that the GCS has several limitations, including issues with interrater reliability and inaccuracy.18,19,21 In addition, different combinations that add up to a given total GCS do not necessarily reflect the same risk and can lead to significantly different outcomes. For example, there are three permutations that give a total GCS of 4. However, the survival rates vary significantly (m/v/e of 2/1/1 has a survival rate of 52%, 1/2/1 has a rate of 73%, and 1/1/2 a rate of 81%).27 There have also been concerns about the inaccuracy of the GCS in intubated and chemically paralyzed patients. Because of these limitations, simplified neurologic scales have been recommended for use in the prehospital setting. These include the motor score of the GCS, the SVS, and the SMS.18,19,26,27,43,44 These measures show promise, but require additional prospective validation before they can be recommended.
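The point about component combinations can be made concrete with a short sketch that enumerates every motor/verbal/eye (m/v/e) combination yielding a given total. The function name is an assumption; the survival rates attached to a total of 4 are those quoted in the text from reference 27.

```python
from itertools import product


def gcs_combinations(total):
    """List all (motor, verbal, eye) tuples that sum to the given total.

    Component ranges: motor 1-6, verbal 1-5, eye 1-4.
    """
    return [
        (m, v, e)
        for m, v, e in product(range(1, 7), range(1, 6), range(1, 5))
        if m + v + e == total
    ]


# Survival rates for a total GCS of 4, as quoted in the text (m, v, e).
survival_at_gcs_4 = {(2, 1, 1): 0.52, (1, 2, 1): 0.73, (1, 1, 2): 0.81}

for combo in gcs_combinations(4):
    print(combo, survival_at_gcs_4[combo])
```

The same enumeration for mid-range totals returns many more combinations, which is one reason a single summed score can mask clinically different presentations.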
Recommendation Category 2 was reached when the consensus was that a measure was promising, but had not yet been shown to be valid, reliable, precise, or feasible enough to merit recommendation (Figure 4). One Category 2 measure, the RTS, deserves discussion because it is used extensively to determine the effectiveness of trauma care. No doubt, some will question our giving the RTS a Category 2 recommendation rather than Category 1. However, pervasiveness of use was not a criterion considered in our deliberations. Thus, the clearly identified shortcomings of the RTS prevented it from being a Category 1 recommendation for EMS use.
We did not consider anatomic scoring measures (e.g., AIS, ISS, TRISS) that have been used extensively in trauma research because our objective was to identify RAMs and OMs that can be obtained and used in the field. However, it is important to highlight the fact that these kinds of measures can be linked to the measures that are obtained in the field. Linking to anatomic scores and other distal risk adjusters and outcomes will be essential in the future evaluation of EMS trauma care.
It is encouraging that recent investigations conducted as part of the Ontario Prehospital Advanced Life Support (OPALS) Study have linked intermediate and distal outcomes to advanced life support interventions performed in the field.59,66–69 These have included several measures related to disability (functional measures) and discomfort: the SF-36, the Cerebral Performance Category (CPC), the Health Utility Index Mark III (HUI-III), the Functional Independence Measure (FIM), and self-reported symptom relief.68,70 The SF-36 has become a widely used tool. It is a 36-item survey that was constructed to evaluate health status in the Medical Outcomes Study.71 The OPALS investigators also published their methods of determining costs.72,73 They have reported cost-effectiveness in terms of dollars per life saved74 and dollars per quality-adjusted life-year.75 Because EMSOP evaluated the feasibility, reliability, and validity of measures for use in the prehospital setting itself, these distal measures do not apply to the present recommendations. However, a growing number of EMS studies are beginning to link EMS measures to these intermediate and long-term outcomes with the intent of identifying the effects of prehospital care. We hope that future studies will add much to our knowledge of the potential linkage of prehospital RAMs/OMs to these more complex and robust distal OMs.
This article underscores a number of important research issues that need to be addressed in the future: 1) the paucity of RAMs and OMs that have been clearly shown to be valid, reliable, and feasible in EMS trauma outcomes research; 2) the absence of any identified RAMs or OMs for disability, satisfaction, or cost; and 3) the encouraging fact that there are multiple measures (Category 2) that show promise for future research.
This study has two main limitations. First, while the search parameters were very broad, we cannot be absolutely sure that we have identified all potentially meaningful measures. We believe that including full article and reference list review, followed by conducting the entire search again with the addition of the RAMs/OMs identified in Phase I, maximized the likelihood of identifying all pertinent studies. For example, the RAPS score did not show up in our initial review because we did not originally include the MeSH term “Severity of Illness Index.” However, this omission was discovered by our secondary review. This MeSH term was then added, along with other keywords that corresponded to known trauma scoring systems, and this resulted in an additional 313 references. Each of these then underwent the entire structured review process. Second, the EMSOP methodology depends on the validity of the use of an expert panel. Such methodologies always raise the possibility of bias or other attributes that can compromise the validity of the recommendations. However, both the literature search and the review strategy followed currently accepted and validated models for accomplishing such a task, and this specific methodology has previously withstood the rigors of peer review.
Using an explicit, systematic literature search, a previously published process for structured review of studies, and consensus expert panel opinion, we recommend the use of the Glasgow Coma Scale, along with core measures, as risk adjustment and outcome measures for prehospital trauma outcomes research. Our findings highlight the paucity of reliable, validated risk adjustment measures and outcome measures for trauma outcomes research in the prehospital setting. Thus, there is much work to be done in the future in the exceptionally important endeavor to create a solid foundation for identifying “what works” in EMS and in what settings.