Prospective Evaluation of a Pediatric Inpatient Early Warning Scoring System


  • Karen M. Tucker, MSN, RN, is a Clinical Director; Tracy L. Brewer, DNPc, MSN, RNC, is an Education Specialist; Rachel B. Baker, PhD, RN, is an Outcomes Manager; and Brenda Demeritt, BSN, RN, is a Clinical Manager, Patient Services, Cincinnati Children's Hospital Medical Center; and Michael T. Vossmeyer, MD, is Medical Director, Division of General and Community Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH.


PURPOSE. The present study evaluated the use of the Pediatric Early Warning Score (PEWS) for detecting clinical deterioration among hospitalized children.

DESIGN/METHODS. A prospective, descriptive study design was used. The tool was used to score 2,979 patients admitted to a single medical unit of a pediatric hospital over a 12-month period.

RESULTS. PEWS discriminated between children who required transfer to the pediatric intensive care unit and those who did not require transfer (area under the curve = 0.89, 95% CI = 0.84–0.94, p < .001).

IMPLICATIONS. The PEWS tool was found to be a reliable and valid scoring system to identify children at risk for clinical deterioration.

Between 0.7% and 3% of hospitalized children have a cardiopulmonary arrest during their hospitalization (Nadkarni et al., 2006; Reis, Nadkarni, Perondi, Grisi, & Berg, 2002; Suominen et al., 2000). While pediatric cardiopulmonary arrest is uncommon, when it does occur the outcome is poor, with only 15–36% of children surviving to discharge. Despite technological and pharmaceutical advances, the survival rate for children who experience cardiopulmonary arrest in the hospital has not improved in the past 10 years (Nadkarni et al.; Reis et al.; Tibballs & Kinney, 2006; Young & Seidel, 1999).

The challenge in preventing cardiopulmonary arrest lies in the ability of healthcare providers to identify the early signs of deterioration and to intervene. Two approaches to early identification of, and intervention for, deteriorating patients are the implementation of a medical emergency team (MET) and the use of an early warning score.

The purpose of an MET is to bring critical care expertise to the bedside of patients who show early signs of deterioration (Mistry, Turi, Hueckel, Mericle, & Meliones, 2006). In one study at a large tertiary children's hospital, the implementation of an MET was associated with a reduction in the risk of respiratory and cardiopulmonary arrest outside the intensive care unit (Brilli et al., 2007). For those patients who suffered from a cardiopulmonary arrest prior to the implementation of the MET, the mortality rate was 0.12 per 1,000 days compared with 0.06 per 1,000 days (p = .13) post-MET implementation.

An additional concept for identifying early signs of deterioration is the use of an early warning score tool that combines clinical parameters into a single score. Several objective early warning score tools have been developed and validated to increase recognition of clinical deterioration among adult patients (Cuthbertson, Boroujerdi, McKie, Aucott, & Prescott, 2007; Hodgetts, Kenward, Vlachonikolis, Payne, & Castle, 2002). An increased early warning score has been associated with intensive care unit admission and mortality (Goldhill, McNarry, Mandersloot, & McGinley, 2005; Harrison, Jacques, McLaws, & Kilborn, 2006; Subbe, Kruger, Rutherford, & Gemmel, 2001). Additionally, the use of objective early warning scores with adult patient populations has been associated with increased confidence among nurses and increased communication among healthcare providers (Andrews & Waterman, 2005).

While the advantages of using early warning scores with adult patients are clear, their effectiveness in pediatric patients has not been well studied. To date, three early warning scoring systems have been developed for use in pediatric populations (Duncan, Hutchison, & Parshuram, 2006; Haines, Perrott, & Weir, 2006; Monaghan, 2005). Two of these early warning systems have demonstrated good ability to identify patients who were deteriorating (Duncan et al.; Haines et al.). Both of these tools require the use of multiple items, which raises concern about their practical utility on a busy pediatric inpatient service. Additionally, neither of the pediatric early warning score tools was evaluated for reliability. Of note is that Duncan et al. and Haines et al. titled their tools Pediatric Early Warning System and Pediatric Early Warning tool, respectively; however, these tools were different from the Pediatric Early Warning Score (PEWS) used in the present study. The third early warning score tool for children, the PEWS, consists of three items related to the patient's behavior, cardiovascular status, and respiratory status (Monaghan) (see Figure 1). Scores for the PEWS scale can range from 0 to 13, with a higher number representing a higher risk of clinical deterioration. In a preliminary evaluation, Monaghan found that scoring a patient using the PEWS tool added 30 sec to the routine bedside assessment. On face value, the PEWS system appears to be the least cumbersome and perhaps the most reasonable tool to be incorporated into bedside clinicians’ assessments. However, to date, Monaghan's PEWS tool has not been studied prospectively for reliability or validity. With permission from Monaghan, the PEWS tool was adapted and used to identify patients at risk for clinical deterioration. This paper presents the evaluation of PEWS in the clinical setting of a busy pediatric inpatient unit.

Figure 1.

Pediatric Early Warning Score (PEWS)



The present study was conducted on a 24-bed inpatient general medical unit. The unit is housed in a quaternary regional pediatric medical center in the Midwestern United States, with 475 inpatient beds. All patients admitted to this unit during a 1-year period were included in the study. Approval for the study was granted by the medical center's institutional review board.


Registered nurses (RNs) were trained in the use of PEWS through learning modules and case studies. PEWS became a standard component of the assessment conducted every 4 hr on all patients admitted to the unit. Bedside RNs documented PEWS in patients’ electronic patient records every 4 hr for the duration of the patient's admission.

In addition to the PEWS tool, an algorithm was developed to prescribe actions to be taken based on the PEWS. In the Monaghan study (2005), there were four possible actions chosen by the nursing staff based on the calculated PEWS. These actions could be as simple as informing the charge nurse for a lower PEWS or initiating the MET for a higher PEWS.

For this study, a process using multiple rapid Plan, Do, Study, Act cycles (Langley, Nolan, Nolan, Norman, & Provost, 1996) was used in developing the algorithm. Data from several weeks of PEWS were analyzed with respect to patient outcomes (cardiopulmonary arrest, medical emergency team calls, and unexpected transfer to the pediatric intensive care unit [PICU]). Front-line nurse experts evaluated the data and made a best estimate of determining the scores for the minimally required action steps. The PEWS, patient outcomes, and feedback from unit staff were reviewed at weekly unit leadership meetings. The algorithm was adjusted based on that feedback. It was nearly 15 months before the current version of the algorithm was completed.

The algorithm incorporated a tiered response to scores; increased PEWS corresponded to increased allocation of resources to the patient. A score of 0–2 required no additional intervention; a 3 required that the senior RN assess the patient; a 4 required that the bedside RN notify the pediatric resident of the patient's PEWS; a 5 required that the senior RN and pediatric resident assess the patient; a 6 required that the senior RN, pediatric resident, and senior resident assess the patient at the bedside; and a 7 or above required that the bedside RN activate the hospital's MET.
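The tiered response described above amounts to a lookup from score to minimum action. The following is a minimal sketch of that mapping, not the unit's actual implementation; the function name `minimum_required_action` and the wording of the returned strings are assumptions made for illustration, while the thresholds are those stated in the text.

```python
def minimum_required_action(pews: int) -> str:
    """Return the minimum response required for a given PEWS.

    Thresholds follow the tiers described in the text. The function name
    and the exact wording of the returned strings are illustrative
    assumptions, not part of the study's algorithm document.
    """
    if pews < 0:
        raise ValueError("PEWS cannot be negative")
    if pews <= 2:
        return "no additional intervention"
    if pews == 3:
        return "senior RN assesses patient"
    if pews == 4:
        return "bedside RN notifies pediatric resident of PEWS"
    if pews == 5:
        return "senior RN and pediatric resident assess patient"
    if pews == 6:
        return "senior RN, pediatric resident, and senior resident assess patient at bedside"
    # A score of 7 or above triggers the highest tier.
    return "bedside RN activates MET"
```

The sketch encodes only the minimum required actions; it does not capture escalation based on clinical judgment, which could occur at any score.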

It is important to note that the algorithm provided the minimum required actions. Bedside RNs, based on clinical judgment, could contact senior clinicians and activate the MET at any time regardless of the patient's PEWS. While the PEWS required senior clinicians to assess the patient, the decision about interventions to implement at the bedside and the decision about whether to transfer a patient to the PICU were made at the discretion of the clinicians evaluating the patient, independent of the PEWS.

Outcome Measure

Various objective indicators of deterioration were considered for use as the outcome variable. Cardiopulmonary arrest was considered initially, but no cardiopulmonary arrests occurred during the study time period. The need for a proxy measure of deterioration was explored. Duncan et al. (2006) found a limited number of code blue events during their study and felt this posed a challenge in the development of screening mechanisms to identify children who were at risk for deterioration. Haines et al. (2006) chose several outcome measures, including assessing whether patients required an increased level of care, including transfer to the PICU. For the present study, transfer to the PICU was chosen as an objective proxy measure of clinical deterioration.

Data Collection

A tool was developed to collect data on all of the PEWS obtained during the 1-year period. The charge nurse for each shift recorded all PEWS for the patients on the unit. In addition to the PEWS, patient age, diagnosis, length of stay, and any actions taken because of patient deterioration (e.g., a call to the medical emergency team or a PICU transfer) were recorded. These tools were completed for every shift during the entire year. Data were collected, de-identified, and entered into a secure database.

Data Analysis

Screening tools are evaluated by examining the validity of the data produced and the feasibility of the tool. Sensitivity and specificity are two measures of the validity of a tool. Sensitivity is the probability of testing positive on the screening tool when the outcome measure truly is present. In this study, sensitivity is the probability of scoring high on the PEWS scale when the patient is truly clinically deteriorating. Specificity is the probability of testing negative on the screening tool when the outcome measure truly is absent. Specificity in this study is the probability of scoring low on the PEWS scale when the patient is not deteriorating (Hennekens & Buring, 1987). In addition to calculating the sensitivity and specificity of various scores, a receiver operating characteristic (ROC) curve can be created. This graph is a plot of the sensitivity against the false-positive rate (1-specificity) for the various cutoff points of the screening test. An ROC curve visually depicts the trade-off between sensitivity and specificity for various cutoff points. The area under the curve (AUC) is a measure of the screening test accuracy. An ROC curve that lies along the left side and top of the graph suggests an accurate test and the AUC would approach 1, whereas an ROC curve that lies along the 45° diagonal of the graph suggests a less accurate test and the AUC would approach 0.5.
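As an illustration of these definitions, the sketch below computes sensitivity and specificity at a chosen cutoff and the area under the ROC curve by the trapezoidal rule. It is not the software used in the study; the function names and the eight-patient data set are hypothetical and serve only to show the calculation.

```python
from typing import List, Tuple


def sens_spec(scores: List[int], outcome: List[bool], cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score >= cutoff' counts as a positive test.

    Assumes the data contain at least one patient with and one without the outcome.
    """
    pairs = list(zip(scores, outcome))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)       # true positives
    fn = sum(1 for s, y in pairs if y and s < cutoff)        # false negatives
    tn = sum(1 for s, y in pairs if not y and s < cutoff)    # true negatives
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)   # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(scores: List[int], outcome: List[bool]) -> float:
    """Area under the ROC curve, using the trapezoidal rule over every cutoff."""
    # One cutoff above the maximum score gives the (0, 0) corner of the curve.
    cutoffs = sorted(set(scores)) + [max(scores) + 1]
    points = []
    for c in cutoffs:
        sens, spec = sens_spec(scores, outcome, c)
        points.append((1 - spec, sens))  # (false-positive rate, sensitivity)
    points.sort()
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: highest score per patient and whether the outcome occurred.
toy_scores = [0, 1, 2, 3, 4, 6, 7, 9]
toy_outcome = [False, False, False, False, True, False, True, True]
print(sens_spec(toy_scores, toy_outcome, cutoff=4))
print(roc_auc(toy_scores, toy_outcome))
```

Lowering the cutoff in this sketch raises sensitivity and lowers specificity, which is the trade-off the ROC curve depicts.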

Positive predictive value (PPV) and negative predictive value (NPV) are two measures of the feasibility of a screening tool. PPV is the probability that the outcome measure truly is present given a positive test on the screening tool. For the present study, PPV is the probability that a patient truly is clinically deteriorating given that they scored a high PEWS. NPV is the probability that the outcome measure truly is absent given a negative test on the screening tool. In the present study, NPV is the probability that a patient is not deteriorating given they scored low on the PEWS scale (Hennekens & Buring, 1987).
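In terms of a 2 × 2 table of screening result against outcome, the four measures can be written as follows. The symbols TP, FP, TN, and FN (true and false positives and negatives) are notation introduced here for illustration and do not appear in the original analysis.

```latex
\begin{aligned}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &\qquad \text{Specificity} &= \frac{TN}{TN + FP}, \\[4pt]
\text{PPV} &= \frac{TP}{TP + FP}, &\qquad \text{NPV} &= \frac{TN}{TN + FN}.
\end{aligned}
```

Sensitivity and specificity are conditioned on the true outcome, whereas PPV and NPV are conditioned on the test result, which is why only the latter pair changes with the prevalence of the outcome.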



The sample consisted of 2,979 patients ranging in age from newborn to 22 years (M = 2.28 years, SD = 3.33 years). Patients were admitted to the general care inpatient unit for a variety of diagnoses, the most common being asthma exacerbation, bronchiolitis, and pneumonia. Patients’ lengths of stay ranged from less than 24 hr to 225 days (M = 2.3 days, SD = 6.01 days). Patients were assessed and a PEWS was calculated every 4 hr during their hospitalization, resulting in over 40,000 individual scores. To avoid statistical analyses based on interdependent data, only the highest PEWS for each patient was used in the analyses.
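Reducing the repeated 4-hourly scores to one value per patient can be sketched as below. The record layout (a patient identifier paired with a score) and the function name are assumptions for illustration; the study's database structure is not described in that detail.

```python
from typing import Dict, Iterable, Tuple


def highest_score_per_patient(records: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Collapse repeated (patient_id, score) records to each patient's highest score."""
    highest: Dict[str, int] = {}
    for patient_id, score in records:
        # Keep the score only if it exceeds what has been seen for this patient.
        if patient_id not in highest or score > highest[patient_id]:
            highest[patient_id] = score
    return highest


# Hypothetical records: three assessments for patient "A", two for patient "B".
toy_records = [("A", 1), ("A", 4), ("A", 2), ("B", 0), ("B", 1)]
print(highest_score_per_patient(toy_records))
```

Using one observation per patient keeps the observations independent, at the cost of discarding information about when during the stay the highest score occurred.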


Patients’ highest PEWS ranged from 0 to 9 (M = 2.22, SD = 1.38). The majority of the patients (73.2%) scored 0–2 throughout their entire hospitalizations. Approximately 8% of patients’ highest PEWS were 3, 8% of patients’ highest PEWS were 4, 7% of patients’ highest PEWS were 5, and 1.2% of patients’ highest PEWS were a 7 or above. PEWS were unrelated to age of patient (r = .029, p = .412).

To assess interrater reliability, two RNs independently scored 55 patients (intraclass coefficient = 0.92, p < .001). The two RNs went into the patients’ rooms a few minutes apart and scored the patients. Because of the high interrater reliability found on this initial review, each patient was scored only by their bedside RN for the remainder of the study.

PICU Transfers

Of the 2,979 patients studied, 51 children were transferred to the PICU for clinical care, representing a rate of PICU transfer of 1.7%. There was a relationship between PEWS and the likelihood of PICU transfer, with higher PEWS being associated with increased likelihood of PICU transfer. Less than 1% (0.23%) of children who scored 0–2 were transferred to the PICU as compared to 80% of children who scored a 9 (see Figure 2). A logistic regression was conducted to further determine the relationship between PEWS and PICU transfer. PEWS were able to discriminate between children who required transfer to the PICU and those who did not require transfer (AUC = 0.89, 95% CI = 0.84–0.94, p < .001). A statistically significant association between PEWS and transfer to the PICU indicated that for each 1-point increase in PEWS the odds of PICU transfer increased almost threefold (odds ratio = 2.8, 95% CI = 2.36–3.35, p < .001).
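The odds ratio can be read through the usual form of a logistic regression with PEWS as a single predictor. The coefficients are written here as β0 and β1, which is notation assumed for illustration; the model's intercept is not reported, so only the slope can be recovered from the published odds ratio.

```latex
\begin{aligned}
\operatorname{logit} P(\text{transfer}) &= \beta_0 + \beta_1 \cdot \text{PEWS}, \\
\text{OR} = e^{\beta_1} = 2.8 \;&\Rightarrow\; \beta_1 = \ln 2.8 \approx 1.03, \\
\frac{\text{odds}(\text{PEWS} = s + k)}{\text{odds}(\text{PEWS} = s)} &= e^{k\beta_1} = 2.8^{k}.
\end{aligned}
```

Under this model a 2-point increase in PEWS multiplies the odds of transfer by 2.8² ≈ 7.8. Because the odds ratio describes odds rather than probabilities, it cannot be converted into an absolute risk of transfer without the intercept.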

Figure 2.

Percentage of Patients Who Transferred to Pediatric Intensive Care Unit With Given PEWS (N = 2,979)

Sensitivity, specificity, PPV, and NPV of PEWS in predicting PICU transfer were calculated at each cutoff score (see Table 1). For the purpose of this analysis, PEWS between 0 and 2 were considered collectively and each score 3 and above was analyzed separately. For a PEWS of 3, which was the lowest score requiring additional intervention according to the algorithm, sensitivity was 90.2%, specificity was 74.4%, PPV was 5.8%, and NPV was 99.8%. For a PEWS of 9, which was the highest PEWS in the sample, sensitivity was 7.8%, specificity was 99.9%, PPV was 80%, and NPV was 98.4%. The discrimination ability of PEWS was very good as demonstrated by the ROC curve in Figure 3 (AUC = 0.89, 95% CI = 0.84–0.94, p < .001). (Editor's note: See Houser, 2008, for additional information on reliability and validity.)

Table 1. Sensitivity, Specificity, Positive Predictive Value (PPV), and Negative Predictive Value (NPV) of PEWS in Predicting Pediatric Intensive Care Unit Transfer

PEWS    Sensitivity (%)    Specificity (%)    PPV (%)    NPV (%)
9             7.8               99.9           80         98.4
≥ 8          13.7               99.8           58.3       98.5
≥ 7          33.3               99.4           48.6       98.8
≥ 6          54.9               97.6           28.9       99.2
≥ 5          70.6               90.8           11.8       99.4
≥ 4          78.4               82.4            7.2       99.5
≥ 3          90.2               74.4            5.8       99.8
0–2         100                  0              1.7      100

Figure 3.

The Receiver Operating Characteristic (ROC) Curve. The Comparison of Sensitivity Versus Specificity for PEWS for 51 Patients Who Were Transferred to the Pediatric Intensive Care Unit (PICU) and 2,928 Patients Who Were Not Transferred to the PICU
Note: The area under the ROC curve was 0.89.
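As a consistency check on Table 1, the reported sensitivities can be multiplied back by the 51 transferred patients to recover the implied number of transferred patients at or above each cutoff. These counts are back-calculated here for illustration and were not reported as such; the variable names are assumptions.

```python
TRANSFERRED = 51  # patients transferred to the PICU, as stated in the text

# Sensitivity (%) at each PEWS cutoff, copied from Table 1.
sensitivity_pct = {9: 7.8, 8: 13.7, 7: 33.3, 6: 54.9, 5: 70.6, 4: 78.4, 3: 90.2}

# Sensitivity = true positives / all transferred patients, so multiplying back
# and rounding gives the implied count of transferred patients at or above each cutoff.
implied_true_positives = {
    cutoff: round(pct / 100 * TRANSFERRED)
    for cutoff, pct in sensitivity_pct.items()
}

# Transferred patients whose highest PEWS was below 3 (that is, 0-2).
false_negatives_at_3 = TRANSFERRED - implied_true_positives[3]

print(implied_true_positives)
print(false_negatives_at_3)
```

At a cutoff of 3 the implied count is 46 of 51, leaving 5 transferred patients whose highest PEWS was 0–2, which agrees with the number of false negatives reviewed individually.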

To obtain an in-depth evaluation of the scoring system, the five patients who had a PEWS of 0–2 but who required a PICU transfer were reviewed to determine the reason they were transferred. Two of these five false negatives were patients transferred to the PICU because of a hospital protocol for PICU transfer based on laboratory results; the PEWS instrument is based on bedside assessment, but not laboratory data. Two of the five false-negative patients were transferred because of clinicians’ request for increased level of monitoring because of the potential for deterioration based on neurological status or skin sloughing. The final false negative was a patient with nonsustained ventricular tachycardia who was transferred for more intense therapy for his arrhythmia. Therefore, four of the five patients who were considered false negatives (all except the patient with ventricular tachycardia) were transferred to the PICU because of concern that they might deteriorate, but these four patients did not clinically deteriorate while on the unit.


Similar to previous evaluations of pediatric early warning systems, the present study supported the use of an objective scoring tool to identify clinical deterioration in hospitalized children. The present study added to the previous literature by providing a systematic evaluation of Monaghan's PEWS used in the clinical setting (Duncan et al., 2006; Haines et al., 2006; Monaghan, 2005). The present study provides the first analysis of the sensitivity and specificity of an early warning score tool in a pediatric population. Additionally, the present study provided the first examination of interrater reliability when using a pediatric early warning score.

The present study provided a prospective evaluation of the PEWS tool. In this study, we found that the PEWS tool produced data that were reliable and valid. Additionally, high PEWS were associated with an increased likelihood of transferring to the PICU.

Reviewing the false negatives highlighted a limitation of the present study: the use of PICU transfer as the proxy measurement of clinical deterioration. Even though PICU transfer is more common than cardiopulmonary arrest, it is still rare, which limits its use as the outcome measure. Additionally, PPV and NPV are greatly influenced by the prevalence of the outcome variable. Because the proxy outcome variable had a very low prevalence, the positive predictive values were poor.
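The effect of prevalence on the predictive values can be illustrated with Bayes' rule, using the sensitivity and specificity reported for a PEWS of 3 or above. This is a sketch for illustration rather than part of the original analysis; the function name and the 20% comparison prevalence are hypothetical.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) implied by sensitivity, specificity, and outcome prevalence."""
    tp = sensitivity * prevalence                # true-positive fraction of all patients
    fp = (1 - specificity) * (1 - prevalence)    # false-positive fraction
    tn = specificity * (1 - prevalence)          # true-negative fraction
    fn = (1 - sensitivity) * prevalence          # false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Reported values for a PEWS cutoff of 3 or above, at the study's transfer rate.
study_prevalence = 51 / 2979
ppv_study, npv_study = predictive_values(0.902, 0.744, study_prevalence)

# Same sensitivity and specificity at a hypothetical 20% prevalence.
ppv_high, npv_high = predictive_values(0.902, 0.744, 0.20)

print(round(ppv_study * 100, 1), round(npv_study * 100, 1))
print(round(ppv_high * 100, 1), round(npv_high * 100, 1))
```

At the study's transfer rate the sketch reproduces the PPV of about 5.8% and NPV of about 99.8% reported for this cutoff, whereas the same sensitivity and specificity at the hypothetical 20% prevalence would give a PPV of roughly 47%.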

In addition to the low prevalence of PICU transfer in this population, it was discovered that four of the five false-negative patients did not clinically deteriorate on the unit prior to PICU transfer. These patients were included in the analyses as false negatives, thereby decreasing the sensitivity of the tool. However, since the patients were not clinically deteriorating, the low PEWS actually were accurate. This suggests that if a more discriminating outcome measure such as cardiopulmonary arrest was used, a higher sensitivity may have been calculated. Similarly, the true specificity may have been higher than reported. It is feasible that some of the patients who scored high PEWS but were not transferred to the PICU were actually clinically deteriorating, as the PEWS suggested. The high PEWS would have required assessment of the patient by the senior RN, pediatric resident, senior pediatric resident, or MET and the interventions implemented by these senior clinicians could have resulted in improvement and subsequent prevention of a PICU transfer. These cases would have been included in the analyses as false positives and contributed to a lower specificity.

Several lessons were learned by our team during the planning and implementation phases of PEWS. Having nurses at the table from the initiation of the project was instrumental in obtaining nurses’ “buy-in” for implementing PEWS. Although nurses were actively involved early in the process, after implementation several nurses did not understand the need for a score to identify deteriorating patient condition. They felt they were capable of determining patients at risk for deterioration and that PEWS would add more work to their daily routine. As a result, focus groups with front-line nurses and unit leadership took place at monthly staff meetings during the early implementation phase to obtain nurses’ feedback. In addition, during the initial testing phase, a large poster was placed in the conference room for nurses to leave written suggestions for improving the use of PEWS and the action-based algorithm. The unit director addressed concerns with unit staff as they arose to allow rapid project improvement.

An area of concern discovered by the team early in the project was that physicians had not been involved in the initial planning discussions for implementing PEWS on the unit; the only physician representation was the medical director of the unit, who was a member of the project team. After implementation of the PEWS tool, issues arose that affected the work of the physicians and that had not been considered during the early planning stages. Physicians were then added to the team for input in resolving practice issues. Physician involvement in, and understanding of, the process strengthened the collaboration between nurses and physicians when PEWS affected patient interventions.


In conclusion, the present study suggests that the PEWS tool provides highly reliable and valid clinical scoring data. High PEWS are predictive of patients who will require transfer to the PICU. The use of PICU transfer as the proxy measurement of clinical deterioration is a limitation of the study. However, even with this limitation, the tool yielded reliable and valid data. Therefore, we suggest that the PEWS tool may be even more sensitive and specific than reported. Further research using alternative outcome measures of clinical deterioration such as cardiopulmonary arrest is warranted. Furthermore, additional studies that evaluate the impact of the PEWS tool on clinical outcomes would contribute importantly to the pediatric medical and nursing literature.

How Do I Apply This Information to Nursing Practice?

Early identification of deterioration and subsequent intervention are important for pediatric patients in preventing codes outside the PICU. The PEWS was found to be a reliable tool for use by the bedside nurse for identifying early patient deterioration. Consistent with the findings of Monaghan (2005), nurses found the PEWS tool easy to use, with minimal to no time delay in calculating the score during patient assessment. Nurses reported an average time of 15 to 30 sec to calculate the score. The PEWS allowed nurses to communicate with healthcare providers using one common language. Nurses stated that when concerns regarding patient symptoms arose, there was reduced miscommunication among the healthcare team about the patient's “true” condition. Furthermore, staff felt empowered to make independent clinical decisions based on the actions outlined in the predetermined algorithm. PEWS discriminated between those patients requiring higher level care and PICU transfer, so earlier collaboration took place among experienced healthcare providers at the patient's bedside. This permitted timely and controlled interventions and/or transfer to the PICU. The bedside RNs were able to recognize positive outcomes to using the PEWS tool, including early identification of patient deterioration and improved collaboration and communication with physicians that built their faith in using the PEWS tool.


We would like to thank Alan Monaghan, the staff on A6South, Dr. Uma Kotagal, Dr. Steve Muething, Melinda Corcoran, the pediatric residents, and the pediatric attending physicians who participated in this project.