Comparison of two behavioural pain scales for the assessment of procedural pain: A systematic review

Abstract
Aim: To examine the clinical utility and measurement properties of the Critical-Care Pain Observation Tool and the Behavioural Pain Scale when used to assess pain during procedures in the intensive care unit.
Design: A systematic review was conducted, guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist.
Methods: A systematic search was conducted in CINAHL, MEDLINE, EMBASE and PsycINFO (01 October 2019). Study selection, data extraction and assessment of methodological quality were performed by a pair of authors working independently. Different psychometric properties were addressed: inter-rater reliability, internal consistency, test-retest reliability, discriminant validity and criterion validity.
Results: Eleven studies were included. Both the Critical-Care Pain Observation Tool and the Behavioural Pain Scale showed good reliability and validity and were good options for assessing pain during painful procedures with intensive care unit patients unable to self-report on pain. The Critical-Care Pain Observation Tool is to be preferred since this tool was shown to have particularly good reliability and validity in assessing pain during procedures, but the Behavioural Pain Scale is an appropriate alternative.


| INTRODUCTION
Critically ill patients experience frequent pain and discomfort throughout their stay in the intensive care unit (ICU) and pain seems to be the patients' worst memory after discharge (Gélinas, 2007; Zetterlund et al., 2012). Uncontrolled pain has significant short- and long-term psychological and physiological consequences, delaying recovery and even being life-threatening (Baron et al., 2015; Barr et al., 2013; Peng et al., 2014; Puntillo et al., 2014). Treatment in the ICU may be provided while the patient is already under stress, such as the fear of losing his/her life or the threat of not regaining well-being (Gélinas, 2016). This more affective dimension of pain was emphasised in a recent proposal to change the definition of pain, which now reinforces pain as a et al., 2014). Some of the most painful procedures experienced by ICU patients are nursing care procedures such as turning, endotracheal suctioning, tube and drain removal, wound care and arterial line insertion (Payen et al., 2007; Puntillo et al., 2014; Vázquez et al., 2011).
Systematic pain assessment with valid tools is essential for adequate pain management and acts as an indicator of good clinical practice (Skrobik et al., 2010; Wøien & Bjørk, 2013). The patient's self-report of pain is regarded as the gold standard in the assessment of pain (Breivik et al., 2008; Merskey, 2007). However, in the ICU, a number of patients are unable to self-report and verbally communicate their pain and discomfort due to critical illness, decreased level of consciousness, mechanical ventilation and sedation. This makes pain assessment and pain management even more challenging (Alderson & McKechnie, 2013; Chanques et al., 2006; Payen et al., 2009). Therefore, for an assessment of pain, observable behavioural and physiological indicators become important indices (Gélinas et al., 2006).

| BACKGROUND
There are numerous tools for assessing pain in adult ICU patients, including the Nonverbal Pain scale (NVP), the Critical-Care Pain Observation Tool (CPOT), the Behavioural Pain Scale (BPS), the Comfort scale and the Face, Legs, Activity, Cry, Consolability scale (FLACC), all of which have numeric rating scales (Rose et al., 2013). Of all these tools, the CPOT and the BPS are the most commonly used (Aïssaoui et al., 2005; Rijkenberg et al., 2015). They seem valid and sensitive for capturing changes in pain response among patients receiving sedatives or lacking the ability to communicate (Ahlers et al., 2008; Barr et al., 2013; Gélinas, 2007; Young et al., 2006).
In two systematic reviews (Gélinas et al., 2013; Pudas-Tähkä et al., 2009) that compared the psychometric properties of pain assessment scores for intensive care patients who were unable to self-report pain, the CPOT and the BPS received the best quality assessment scores. The CPOT was designed to detect pain in critically ill patients (Gélinas et al., 2006), while the BPS was developed to assess pain in unconscious mechanically ventilated patients (Payen et al., 2001). The main difference between these tools is in their evaluation of body movements and muscle tension (Severgnini et al., 2016). Improved pain management is associated with better outcomes for ICU patients (Chanques et al., 2006; Payen et al., 2009; Robinson et al., 2008; Skrobik et al., 2010).
However, pain caused by procedures in the ICU appears to remain underestimated and undertreated (Puntillo et al., 2014; Siffleet et al., 2007).
To keep the measurement error of pain assessment tools as small as possible, the tools' validity and reliability need to be determined so that the instruments can be shown to function correctly (Field, 2013). Validity refers to whether the instrument measures what it is intended to measure (Polit & Beck, 2013) and reliability is the ability of the pain assessment tool to deliver the same results under the same circumstances (Field, 2013).
Several systematic reviews have compared a number of pain assessment scales used in the ICU (Barzanji et al., 2019; Fischer et al., 2019; Grosso et al., 2019; Pudas-Tähkä et al., 2009). The aims of these reviews were, for example, to identify the most used pain assessment scales for the critically ill unconscious adult patient (Grosso et al., 2019) and instruments developed for pain assessment in unconscious or sedated intensive care patients (Pudas-Tähkä et al., 2009). Furthermore, for a pain scale to guide pain management decisions and to support efficient evaluations, it must be actionable and easy to interpret and it cannot take so many resources that it disrupts clinical care in the hectic ICU context. A feasible, useful and accurate scale is essential to ensure that the pain of ICU patients is correctly and consistently identified by procedures. However, to our knowledge, no reviews have evaluated studies that use both the CPOT and the BPS in relation to procedures in the ICU with the purpose of informing and guiding nurse decision-making. This systematic review therefore aimed to examine the measurement properties of the CPOT and BPS when used to assess pain during procedures in the ICU. It was directed by the following research questions:
• To what extent have the CPOT and the BPS been tested for validity, reliability and responsiveness during painful procedures in the intensive care setting?
• Which of these two tools is best suited to assess pain in non-verbal critically ill intubated patients during painful procedures?

| Design
This systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist (Moher et al., 2009). The protocol was not published or registered.

| Eligibility criteria
Studies were included if they met the following criteria: 1) they had a quantitative design; 2) they included ICU patients aged 18 years or older who were unable to self-report pain due to critical illness; 3) the patients received mechanical ventilation and/or sedation; and 4) they tested the validity and reliability of both the CPOT and the BPS during painful procedures. Studies were excluded if the data were published as a conference paper, abstract, doctoral thesis, letter or comment.
publication year and ended on 01 October 2019. The MEDLINE search strategy is described in Data S1.

| Search outcomes
Our primary outcomes were the validity and reliability of the CPOT and the BPS. The CPOT includes four behavioural indicators: 1) facial expression; 2) body movements; 3) muscle tension; and 4) compliance with the ventilator (for intubated patients) or verbalization (for extubated patients) (Gélinas et al., 2006). The BPS includes three behavioural indicators: 1) facial expression; 2) movements of the upper extremities; and 3) compliance with the ventilator (Payen et al., 2001).
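The item structure of the two scales can be sketched as a small data model. The item names follow the indicators listed above. The per-item score ranges (0 to 2 for each CPOT item, 1 to 4 for each BPS item) are taken from the original tool descriptions rather than from this review and should be read as assumptions; the class and field names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """Illustrative model of a behavioural pain scale (names are hypothetical)."""
    name: str
    items: tuple
    item_min: int  # assumed lowest score per item
    item_max: int  # assumed highest score per item

    @property
    def total_range(self):
        """Lowest and highest possible total score."""
        n = len(self.items)
        return (n * self.item_min, n * self.item_max)

    def total(self, ratings):
        """Sum item ratings after checking that every item is rated and in range."""
        if set(ratings) != set(self.items):
            raise ValueError(f"{self.name}: expected ratings for {self.items}")
        for item, value in ratings.items():
            if not self.item_min <= value <= self.item_max:
                raise ValueError(f"{self.name}: {item}={value} is out of range")
        return sum(ratings.values())


# Assumption: each CPOT item is scored 0-2 and each BPS item 1-4.
CPOT = Scale(
    "CPOT",
    ("facial_expression", "body_movements", "muscle_tension",
     "ventilator_compliance_or_verbalization"),
    item_min=0,
    item_max=2,
)
BPS = Scale(
    "BPS",
    ("facial_expression", "upper_limb_movements", "ventilator_compliance"),
    item_min=1,
    item_max=4,
)
```

Under these assumptions, `CPOT.total_range` and `BPS.total_range` give the span of possible totals for each tool, which makes clear that raw totals from the two scales are not directly comparable.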

| Study selection and data extraction
A pair of authors independently assessed whether titles, abstracts and full-text papers met the inclusion criteria. When there was any doubt whether a paper should be included, a third author independently assessed the paper. The data from the included papers were extracted independently by the same pair of authors using a standardized data collection form that included: author, year, location for research, aim, study design, population and results. Reasons for exclusion are presented in Figure 1.

| Quality assessment
The methodological quality of the included studies was assessed by the pair of authors independently, using the Critical Appraisal Skills Programme (CASP) checklist (Critical Appraisal Skills Programme, 2018; Nadelson & Nadelson, 2014). The quality assessment criteria for the included articles are shown in Table S1.

| Analysis
To assess the validity and reliability of the CPOT and BPS pain assessment tools, the results from the studies included were organized according to psychometric properties, such as inter-rater reliability, internal consistency, test-retest reliability, discriminant validity and criterion validity (see Table S2). Due to heterogeneity in study design, patient population, intervention, context and time of pain assessment, a quantitative synthesis was not possible. Consequently, the results are presented in a narrative form and with a table describing the validity and reliability scores and the analyses of each paper.

| RESULTS
The literature search identified 100 publications. After removal of duplicates, 51 titles and abstracts were screened. After this first screening, 32 articles were excluded as they did not meet the inclusion criteria. The full text of 19 papers was assessed, and the final sample included a total of 11 studies: nine prospective observational studies, one crossover observational study and one cross-sectional study (Figure 1). No studies were identified that employed randomized controlled trial designs. The studies were conducted in the USA (N = 1), Taiwan (N = 2), Saudi Arabia (N = 1), China (N = 1), and Italy (N = 1). The sample sizes ranged from 6 to 316 ICU patients.
The painful procedures were endotracheal suctioning (N = 8), turning (N = 5) and standardized nociceptive stimulation by pressure algometry (N = 1). The characteristics of the included studies are shown in Table 1.

| Quality assessment
A summary of the assessments of methodological quality is shown in Table 1 and in Table S1. Overall, the quality of the articles was rated as relatively high: 10 of the 11 articles received more than 10 out of a possible 14 "Yes" assessments. However, the assessments showed that the articles did not sufficiently report on whether the outcomes were accurately measured to minimize bias. Furthermore, the question "How precise are the results?" was difficult to assess, as very few of the studies provided confidence intervals for their mean values, which could have given a more precise estimate of the range in which the true values lay.

| Reliability
Four studies calculated weighted κ coefficients as a measure of inter-rater reliability (Cheng et al., 2018; Klein et al., 2018; Liu et al., 2015). Chanques et al. (2014)  0.64-1.00) but also showed relatively high reliability. Liu et al. (2015) showed that the inter-rater reliability was not significantly different in the intubated compared with the non-intubated patients (Table 2).
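For readers unfamiliar with the statistic, the weighted κ coefficient mentioned above can be sketched as follows. This is an illustration of the general formula only, not a reproduction of any included study's analysis: the choice between linear and quadratic weights is an assumption, since the weighting schemes used by the individual studies are not stated here, and the function name is hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Weighted kappa for two raters scoring the same observations.

    categories: ordered list of all possible scores (lowest to highest).
    weights: "linear" or "quadratic" disagreement weights (an assumption;
    the weighting used in any given study may differ).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of observations")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    index = {c: i for i, c in enumerate(categories)}
    power = 1 if weights == "linear" else 2
    n = len(rater_a)

    def disagreement(i, j):
        # 0 for identical scores, 1 for scores at opposite ends of the scale
        return (abs(i - j) / (k - 1)) ** power

    pairs = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marginal_a = Counter(index[a] for a in rater_a)
    marginal_b = Counter(index[b] for b in rater_b)

    # Observed weighted disagreement across the paired scores
    observed = sum(disagreement(i, j) * c / n for (i, j), c in pairs.items())
    # Weighted disagreement expected by chance from the raters' marginals
    expected = sum(
        disagreement(i, j) * marginal_a[i] * marginal_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - observed / expected
```

A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance; because partial credit is given for near-misses, the weighted form suits ordinal item scores such as those of the CPOT and the BPS.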

Pudas-Tähkä et al. (2018) used the Shrout-Fleiss ICC test during suctioning and reported that the best results following the painful procedure were slightly lower for the BPS than for the CPOT.

TABLE 2 Reliability and validity findings of BPS and CPOT during painful procedures in non-verbal critically ill intubated patients
Al Darwish et al. (2016) showed the lowest agreement in the Facial Expression subscale during suction when using the BPS (r = .77), while in the CPOT, they found weak agreement in the Muscle Tension subscale.
In this study, the values showed good internal consistency for both tools (intubated = 0.785; 0.981; non-intubated = 0.812; 0.812, respectively). Al Darwish et al. (2016) presented the best results with a Cronbach's alpha for internal consistency of 0.95 in both scales.
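Cronbach's alpha, the internal consistency statistic reported above, can be computed from item-level scores as in the following minimal sketch. It uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of totals) with sample variances; the function name and input layout are hypothetical, and the example does not use data from any included study.

```python
from statistics import variance


def cronbach_alpha(observations):
    """Cronbach's alpha for a list of observations.

    observations: one row per assessment, each row holding the k item scores
    (for example, the three BPS items or the four CPOT items).
    """
    if len(observations) < 2:
        raise ValueError("at least two observations are required")
    k = len(observations[0])
    if k < 2 or any(len(row) != k for row in observations):
        raise ValueError("each observation needs the same number (>= 2) of items")

    # Sample variance of each item across observations
    item_variances = [variance(column) for column in zip(*observations)]
    # Sample variance of the total score across observations
    total_variance = variance([sum(row) for row in observations])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are identical")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha approaches 1 when the items rise and fall together across assessments and falls towards 0 when they vary independently, which is why it is read as an index of how consistently the items of a scale reflect the same underlying construct.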

| Validity
Validity was assessed using discriminant validity in ten studies (Table 2) and three studies also reported on criterion validity by using different analyses (Cheng et al., 2018; Liu et al., 2015; Severgnini et al., 2016). Only one study did not report on any validity tests (Pudas-Tähkä & Salanterä, 2018). Regarding discriminant validity, Rijkenberg et al. (2015) and Rijkenberg et al. (2017) calculated the discriminant validity between the CPOT and the BPS using the Friedman test. In Rijkenberg et al. (2015), the median scores increased by two points from rest to painful procedure (p < .001). The median BPS scores between rest and non-painful procedure showed a significant increase of one point (p = .002), whereas the median CPOT score remained unchanged. In Rijkenberg et al. (2017), the median CPOT and BPS scores for both nurses increased significantly (p = .001) from rest to painful procedure (Table 2). Klein et al. (2018) showed a significant increase in the median scores for pain in both measures between rest and turning, using

| DISCUSSION
This systematic review aimed to examine the measurement properties of the CPOT and BPS when used to assess pain during procedures in the ICU. This review showed that the nurses assessing the pain had substantial to near-perfect agreement in their observations related to the measurement of pain in mechanically ventilated ICU patients. However, the BPS showed a significantly greater inter-rater reliability in non-intubated compared with intubated patients, which may indicate that the BPS needs further assessment in intubated patients for nurses to provide adequate pain management to this latter group of patients (Liu et al., 2015). The most likely reason for the BPS having a higher score in the non-intubated patients is the fact that the BPS requires assessing ventilator waveform and asynchrony, which could be difficult while simultaneously observing a patient's face and body. Listening to ventilator alarms, as used by the CPOT, could be a useful alternative, and the CPOT may therefore be a more accurate tool for assessing pain in intubated patients (Liu et al., 2015). However, systematic assessment of pain in mechanically ventilated ICU patients at rest and 30 min after any procedure resulted in smaller doses of sedation being required, a three-day reduction in duration on the respirator and a five-day reduction in ICU stay (Payen et al., 2009).
Most studies included measured internal consistency by estimat-
The results for discriminant validity suggest that both pain assessment tools were well suited to measure the presence of pain when moving from rest to a painful procedure. However, there were some concerns about the BPS as it also showed a significant increase in scores during non-painful oral care, while the CPOT score remained unchanged (Rijkenberg et al., 2015, 2017). These studies reported that most of the increase in BPS score during oral care was the result of changes in facial expression and movements of the upper limbs. The increase might have been due to reflexes to touch rather than response to pain. Coughing and straining might also be reflexes due to movement of the endotracheal tube during oral care (Rijkenberg et al., 2017 (Severgnini et al., 2016). Scores may differ due to the "muscle tension" item of the CPOT, an item not included in the BPS. For patients with high muscle tension related to pain, the CPOT would be a more effective assessment tool (Liu et al., 2015). Facial expression and ventilator compliance are recorded in both scales, although using different individual scores. Severgnini et al. (2016) showed that facial expression was the most important parameter related to pain assessment. It is important to note that facial expression is also easier to score at the bedside. A limitation of the study by Severgnini et al. (2016) is that discriminant validity should be assessed during both painful and non-painful procedures in the same population. If the values calculated through the tools are increased by both painful and non-painful procedures, the validity and reliability are questionable.
The results suggest that both the CPOT and the BPS are reliable and valid pain assessment tools. However, the CPOT seems to be the preferred option for assessing pain during painful procedures due to its discriminant validity, meaning that the CPOT can better detect pain whenever the patient is believed to be in pain. It may also be an important tool for distinguishing between discomfort and pain to provide the best treatment (Ashkenazy & Ganz, 2019). On the other hand, the BPS is rated as a little easier to remember in clinical practice than the CPOT, as the BPS has only three domains for observation rather than the four included in the CPOT.

| Limitations
There are limitations to our systematic review that need to be addressed. The systematic literature search was limited to the English and Scandinavian languages and publication types such as conference papers, abstracts, doctoral theses, letters and comments were excluded. Consequently, the results may be affected by publication bias. However, we searched multiple databases and collaborated with a librarian to ensure that the search was extensive. Furthermore, owing to the pre-experimental, pre-test-post-test nature of the designs, several threats to validity are potentially present, involving selection bias, lack of blinding, the order in which the instruments were tested and cultural competence. For example, in the study by Rijkenberg et al. (2015), the nursing staff were not blinded and when pain assessments were performed, the assessors were aware of which procedures were to be performed. This may have led them to perceive more behavioural changes during events, leading to higher scores during painful procedures. Additionally, the BPS was always completed first. An essential consideration is that no gold standard has been established for pain assessment in patients who are unable to give self-reports.

| CONCLUSION
Both of the pain assessment tools addressed in this review have a systematic approach to evaluating pain. The CPOT especially has been shown to have good reliability and validity for assessing pain during painful procedures in ICU patients unable to self-report their pain. The BPS is an appropriate alternative, but because of its discriminant validity, the CPOT is to be preferred.

ACKNOWLEDGEMENT
We acknowledge research librarian Kari Mariussen for helping us to build the search strategy.

CONFLICTS OF INTEREST
We have no conflict of interest.

AUTHOR CONTRIBUTIONS
HCB and MTS: Manuscript drafting, conception and design. HB, MHL, SAS and MTS: Acquisition of data, analysis and interpretation.
All authors: Manuscript revision and final approval of manuscript to publish in Nursing Open.

ETHICAL APPROVAL
No approval from the university college or the data protection officer was needed to conduct the review since we investigated already published data.

DATA AVAILABILITY STATEMENT
Data availability is not relevant, since all data are available in original articles.