Predictive validity of the Braden Scale for pressure injury risk assessment in adults: A systematic review and meta‐analysis

Abstract
Aim: Pressure injuries are common adverse events in clinical practice, affecting the well‐being of patients and causing considerable financial burden to healthcare systems. It is therefore essential to use reliable assessment tools to identify patients at risk of pressure injuries for early prevention. The Braden Scale is a widely used tool to assess pressure injury risk, but its accuracy has not been clearly established in the literature. This study aimed to evaluate the accuracy of the Braden Scale in assessing pressure injury risk.
Design: Systematic review and meta‐analysis.
Methods: Articles published between 1973 and 2020 in periodicals indexed in PubMed, EMBASE, CINAHL, Web of Science and the Cochrane Library were selected. Two reviewers independently selected the relevant studies for inclusion. Data were analysed using STATA 15.0 and RevMan 5.3.
Results: In total, 60 studies involving 49,326 individuals were eligible for this meta‐analysis. The pooled SEN, SPE, PLR, NLR, DOR and AUC were 0.78 (95% CI: 0.74 to 0.82), 0.72 (95% CI: 0.66 to 0.78), 2.80 (95% CI: 2.30 to 3.50), 0.30 (95% CI: 0.26 to 0.35), 9.00 (95% CI: 7.00 to 13.00) and 0.82 (95% CI: 0.79 to 0.85), respectively. Subgroup analyses indicated that the AUC was higher for prospective design (0.84, 95% CI: 0.81 to 0.87), mean age <60 years (0.87, 95% CI: 0.84 to 0.90), hospital setting (0.82, 95% CI: 0.79 to 0.86) and Caucasian population (0.86, 95% CI: 0.82 to 0.88). In addition, a score of 18 was found to be the optimal cut‐off value.
Conclusion: The evidence indicated that the Braden Scale had a moderate predictive validity. It was more suitable for populations with a mean age <60 years, hospitalized patients and the Caucasian population, and the cut‐off value of 18 might be used for the risk assessment of pressure injuries in clinical practice. However, due to the different cut‐off values used among included studies, the results showed significant heterogeneity. Future studies should explore the optimal cut‐off value in the same clinical environment.


| INTRODUCTION
Pressure injuries (PIs), also known as decubitus ulcers, ischaemic ulcers, bedsores, pressure sores and pressure ulcers, are localized damage to the skin and underlying soft tissue usually over a bony prominence or related to a medical or other device (NPUAP, 2016).
Individuals who are at high risk are those characterized by multiple risk factors that affect both the mechanical boundary conditions and the susceptibility and tolerance of the individual (National Pressure Ulcer Advisory Panel and Alliance, 2014). However, most PIs can be prevented if effective measures including systematic skin examination, risk assessment, bed and chair support surfaces, repositioning and mobilization, and nutritional support are implemented (Bredesen et al., 2015). Risk assessment is a central component of PI prevention (Coleman et al., 2013; National Pressure Ulcer Advisory Panel and Alliance, 2014), so it is important to use a valid and reliable assessment tool to identify high-risk patients and implement appropriate interventions for the prevention of PIs.
Since the early 1960s, a variety of risk assessment tools have been developed, and more than 50 scales are currently available to determine the risk of PIs, such as the Norton Scale, the Waterlow Scale and the Braden Scale (Shi et al., 2019). The Braden Scale is the most commonly used worldwide because it is easy to use and incorporates a wider range of risk factors (e.g. moisture and sensory perception) than other scales (National Pressure Ulcer Advisory Panel and Alliance, 2014).
However, it has been used in different populations and clinical settings, with varying re-verification results. In order to take appropriate measures and prevent PI development early, practitioners must ascertain whether the Braden Scale can accurately identify the risk of PIs.

| BACKGROUND
PIs are one of the most frequently occurring adverse events in hospitalized patients worldwide (Li et al., 2020; National Pressure Ulcer Advisory Panel and Alliance, 2014), which prolong hospital stay, increase medical expenses, decrease quality of life and result in increased nosocomial infection, disability, morbidity and mortality (Al Mutairi & Hendrie, 2018; Aloweni et al., 2019; Amir et al., 2017; Coleman et al., 2013; Ferris et al., 2019; Jackson et al., 2019; Mallow et al., 2013). The prevalence of PIs remains unacceptably high, ranging from 1.1% to 26.7% in the hospital setting and from 6% to 29% in the community setting (Graves & Zheng, 2014). It has been estimated that the annual cost of treating PIs is $26.8 billion in the United States (Padula & Delarmente, 2019), €334.86 million to €2.59 billion in Europe (Severens et al., 2002) and A$983 million in Australia (Nguyen et al., 2015). A recent study noted that PI prevention was more cost-effective than PI treatment across all clinical settings (Demarré et al., 2015). For these reasons, PI prevention is of great importance. An essential component of preventive strategies is the risk assessment of PI development in the individual.
Risk assessment tools, such as the Norton Scale, the Waterlow Scale and the Braden Scale, are generally used to assess the risk of developing PIs. The ideal risk assessment tool must accurately identify individuals at risk, as well as those not at risk.
The Norton Scale was the first structured risk assessment tool for predicting PIs, but it lacks a friction and shear component, and friction and shear may contribute to the occurrence of PIs (National Pressure Ulcer Advisory Panel and Alliance, 2014). Although it was also developed to assess elderly patients at risk of developing PIs, the Waterlow Scale cannot accurately identify those individuals who are not at risk, with a specificity of 32.9% (Serpa et al., 2009). The Braden Scale is based on six common risk factors: sensory perception, moisture, activity, mobility, nutrition, and friction and shear. A summative score reveals the level of risk, where lower values are indicative of higher risk (Kelechi et al., 2013). Due to the ease of use and interpretation of the point system, the Braden Scale has quickly gained popularity among practitioners. However, in order to reflect the population characteristics and the medical culture of each country, the Braden Scale has been re-verified by different researchers over the past 30 years. Its sensitivity and specificity varied widely, from 50% to 100%, depending on the research subjects or conditions (Chou et al., 2013), and the cut-off point differed as well (Cowan et al., 2012). Some studies (Chen et al., 2016; Pancorbo-Hidalgo et al., 2010) found that the Braden Scale offered the best balance between sensitivity and specificity. However, a systematic review (Wei et al., 2012) revealed that the Braden Scale could not be used alone in assessing PI risk in surgical patients. As a result, there is no consensus on the predictive validity of the Braden Scale among different studies.
Given the importance of risk assessment for PI prevention, practitioners have used the Braden Scale in different populations and clinical settings. However, it is unclear whether the Braden Scale can accurately identify the risk of PIs in practice. The purpose of this study was to determine the predictive validity of the Braden Scale and to explore the suitable population and optimal cut-off value through a diagnostic-method-oriented meta-analysis. Understanding the predictive validity, applicable population and optimal cut-off value helps practitioners to identify the risk of PIs and take preventive measures early.

| DESIGN
We conducted a systematic review and meta-analysis. The study was performed in accordance with the guidelines from the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy from the Cochrane Collaboration (Macaskill et al., 2010) and Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) (Moher et al., 2009). Our study protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) (CRD42020142181).

| Search strategy
The electronic databases PubMed, EMBASE, Web of Science, the Cochrane Library and the Cumulative Index to Nursing and Allied Health Literature (CINAHL) were searched from the inception of each database to July 2020.
In addition, we explored the bibliographies of relevant reviews in order to identify other potentially eligible studies. The literature search terms and strategies used are available in supplementary appendix 1.

| Inclusion and exclusion criteria
The eligible studies must meet the following criteria: (a) patients were

| Study selection
Two reviewers independently screened titles and abstracts for eligibility after completing a pilot literature selection to ensure consistency. The full text was read if eligibility could not be determined from the title and abstract. In case of disagreement, a third reviewer resolved the conflict.

| Data extraction
Two reviewers extracted data into a spreadsheet independently and resolved any discrepancies through discussion to reach a consensus.
For each study included, the following information was extracted: first author, publication year, country, study design, age, gender, sample size, cut-off value, reference standard, TP, FP, TN and FN.
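The extracted TP, FP, TN and FN counts determine each study's sensitivity, specificity, likelihood ratios and diagnostic odds ratio. The following is a minimal Python sketch of those per-study calculations, not the bivariate pooling performed in STATA; the names `TwoByTwo` and `accuracy_measures` and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2x2 counts for one study (hypothetical container, not from the paper)."""
    tp: int  # at-risk by the scale, developed a PI
    fp: int  # at-risk by the scale, no PI
    tn: int  # not at-risk by the scale, no PI
    fn: int  # not at-risk by the scale, developed a PI


def accuracy_measures(table: TwoByTwo) -> dict[str, float]:
    """Return standard single-study diagnostic accuracy measures."""
    if table.tp + table.fn == 0 or table.tn + table.fp == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    sen = table.tp / (table.tp + table.fn)   # sensitivity (true-positive rate)
    spe = table.tn / (table.tn + table.fp)   # specificity (true-negative rate)
    if spe == 1.0 or sen == 1.0 or spe == 0.0:
        # Ratios are undefined here; no continuity correction is applied
        # in this sketch.
        raise ValueError("likelihood ratios undefined for these counts")
    plr = sen / (1.0 - spe)                  # positive likelihood ratio
    nlr = (1.0 - sen) / spe                  # negative likelihood ratio
    dor = plr / nlr                          # diagnostic odds ratio
    return {"sensitivity": sen, "specificity": spe,
            "plr": plr, "nlr": nlr, "dor": dor}


# Illustrative (made-up) counts: 100 subjects with a PI, 100 without.
example = accuracy_measures(TwoByTwo(tp=78, fp=28, tn=72, fn=22))
```

With these made-up counts the sensitivity and specificity equal 0.78 and 0.72, and the derived ratios come out close to the pooled PLR, NLR and DOR reported in the abstract, which shows how the measures relate to one another.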

| Quality assessment
The Quality Assessment of Diagnostic Accuracy Studies Ⅱ (QUADAS-Ⅱ) (Whiting et al., 2011) was used to assess the quality of each of the included studies. It contains four domains: patient selection, index test, reference standard, and flow and timing, classifying the methodological quality as having a low, high or unclear risk of bias. Two reviewers independently rated the applicability and risk of bias, and any conflict was resolved by a third reviewer.

| Statistical analysis
All statistical analyses were performed using STATA 15.0 (Stata, College Station, TX, USA) and Review Manager 5.3 software (Cochrane Collaboration, Oxford, UK). The bivariate meta-analysis model was selected to calculate the pooled sensitivity (SEN), specificity (SPE), positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR) and their corresponding 95% confidence intervals (95% CIs) (Reitsma et al., 2005). Furthermore, the summary receiver operating characteristic (SROC) curve was constructed and the area under the curve (AUC) was calculated to quantify the diagnostic power (Jones & Athanasiou, 2005).
With respect to the AUC, a value of 0.5 was deemed non-informative, 0.5 < AUC ≤ 0.7 was considered less accurate, 0.7 < AUC ≤ 0.9 was thought to be moderate, 0.9 < AUC < 1 was deemed very accurate, and AUC = 1 was considered a perfect test (Greiner et al., 2000).
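The AUC bands above can be written as a small lookup. This is a sketch only: the function name `interpret_auc` is hypothetical, and an AUC of exactly 0.5 is treated as non-informative (a test no better than chance).

```python
def interpret_auc(auc: float) -> str:
    """Map an AUC value to the accuracy band used in this review."""
    if not 0.5 <= auc <= 1.0:
        raise ValueError("AUC is expected to lie between 0.5 and 1")
    if auc == 0.5:
        return "non-informative"
    if auc <= 0.7:
        return "less accurate"
    if auc <= 0.9:
        return "moderate"
    if auc < 1.0:
        return "very accurate"
    return "perfect"


# The pooled AUC of 0.82 reported in the abstract falls in the moderate band.
pooled_band = interpret_auc(0.82)
```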
Heterogeneity was analysed by the I² statistic. I² ≤ 25%, 25% < I² ≤ 75% and I² > 75% indicated low, moderate and high heterogeneity between studies, respectively (Higgins et al., 2003). Subgroup analysis and sensitivity analysis were used to identify the sources of heterogeneity [...] and (e) reference standard (authoritative vs. non-authoritative). In addition, we used Deeks' funnel plot to assess any potential publication bias (Deeks et al., 2005).
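For orientation, the I² statistic of Higgins et al. is derived from Cochran's Q as I² = max(0, (Q − df) / Q) × 100%, where df is the number of studies minus one. The sketch below assumes a simple inverse-variance setting with made-up effect estimates and variances; it is not the output of the bivariate model in STATA, and all function names are hypothetical.

```python
def cochran_q(effects: list[float], variances: list[float]) -> float:
    """Cochran's Q using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q: float, df: int) -> float:
    """I² in percent; negative values are truncated to zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_level(i2: float) -> str:
    """Bands used in this review: <=25% low, 25-75% moderate, >75% high."""
    if i2 <= 25.0:
        return "low"
    if i2 <= 75.0:
        return "moderate"
    return "high"


# Made-up example: three study estimates with equal variances.
effects = [0.2, 0.5, 0.8]
variances = [0.01, 0.01, 0.01]
q = cochran_q(effects, variances)
i2 = i_squared(q, df=len(effects) - 1)
level = heterogeneity_level(i2)
```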

| Search results
A total of 6,441 publications were identified in our initial search. After removing duplicates, 4,215 studies remained. After screening titles and abstracts, 71 studies were identified for further examination. After reviewing the full text of these articles, 11 studies with insufficient data or no relevance to the diagnosis were excluded. Finally, a total of 60 studies were included in this review (Table 1). The detailed screening process is presented in Figure 1.

| Study characteristics
The baseline characteristics of these included studies are shown in

| Results of risk of bias
The risk of bias and applicability were assessed. In the risk of bias, a low risk of patient selection was shown in 11 (18%) studies, and 39

| Threshold effect
Visual inspection of forest plots and SROC curves, as well as Spearman's correlation of 0.334 (p =.009), suggested the presence of a threshold effect to some extent. The pooled results of different cut-off points are shown in Table 2.
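A threshold effect is commonly checked by correlating, across studies, the logit of sensitivity with the logit of (1 − specificity); a positive Spearman coefficient suggests that sensitivity and the false-positive rate rise together as the cut-off changes. The sketch below assumes that formulation (the exact computation used in this review is not stated) and uses made-up study values; all function names are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up studies: sensitivity rises as specificity falls.
sensitivities = [0.60, 0.70, 0.80, 0.90]
specificities = [0.90, 0.80, 0.70, 0.60]
rho = spearman([logit(s) for s in sensitivities],
               [logit(1.0 - s) for s in specificities])
```

In this constructed example the two sequences are perfectly monotone, so the coefficient is 1; real data, such as the 0.334 reported above, show a much weaker association.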

| Subgroup analyses
In order to explore possible sources of heterogeneity, we performed subgroup analyses based on study design (prospective vs. retrospective), mean age (<60 years vs. ≥60 years) (Matsumoto et al., 2018), setting (hospital vs. long-term care facility [LTCF]) and ethnicity (Asian population vs. Caucasian population). The pooled diagnostic parameters for subgroup analyses are summarized in Table 2.

| Sensitivity analysis and publication bias
We carried out a sensitivity analysis to assess the reliability of the results (Figure 5). The goodness of fit and bivariate normality showed that the included studies had only minimal influence on the overall estimates.
Influence analysis and outlier detection identified eight outlier studies. After excluding these outlier studies, the SEN increased from 0.78 to 0.79, the SPE dropped from 0.72 to 0.70, the PLR decreased from 2.80 to 2.60, the NLR remained at 0.30, the DOR decreased from 9.00 to 8.00, and the AUC remained at 0.82, which suggested that the random-effects bivariate model was robust for the calculation of the pooled estimates. Finally, Deeks' funnel plot asymmetry test was used to assess the potential publication bias. The funnel plot (Figure 6) was not fully symmetrical, suggesting publication bias may exist in this meta-analysis (p < .05). The reasons are shown as follows: (a) a good assessment tool was high in both SEN (true-positive rate) and SPE (true-negative rate), which was generally unavailable in clinical settings (Park et al., 2015).

FIGURE 3 Sensitivity and specificity of included studies

| DISCUSSION
PI risk assessment was a screening procedure, for which a more sensitive tool was preferred over a more specific one. When the AUC was the same, the higher SEN was better in identifying the risk of PIs, which was beneficial to taking PI preventive interventions in time; moreover, a risk assessment tool for PIs was not a diagnostic tool for the incidence of PIs but instead a screening tool assessing the risk of PIs. The cut-off value of 18 had a higher SEN than that of 16.
However, in view of the characteristics in the specific clinical setting, whether the value of 18 can be treated as the optimal cut-off was unknown. Future studies could explore this issue among different populations, such as medical, surgical, critical and elderly patients.
In addition, it is necessary to conduct multi-centre, large-sample studies in order to verify the effectiveness of the cut-off values of 16 and 18 in PI risk assessment.
Based on the subgroup analyses, we found a higher level of accuracy among prospective studies (AUC: 0.84) than among retrospective studies (AUC: 0.78), which may be attributed to the more rigorous design of the prospective studies. Although there was no significant difference in the AUC (0.87 vs. 0.81) between the young and middle-aged population and the elderly, the pooled SEN and SPE of the young and middle-aged population were 0.83 and 0.78, while those of the elderly were 0.77 and 0.71. Based on these, we found that the Braden Scale was more accurate in the young and middle-aged population than in the elderly. The possible reason was that older people developed chronic diseases due to their declined physiological reserve (Jaul, 2010), which was not considered in the Braden Scale. Moreover, oxygenation and perfusion, which are not covered by the Braden Scale, may also affect PI development in elderly people (Iranmanesh et al., 2012). An additional finding was that the Braden Scale had a higher diagnostic accuracy in the hospital than in the LTCF (AUC, 0.82 vs. 0.77). This result was consistent with some published studies (Park et al., 2015; Wei et al., 2020), but differed from another study (Wei et al., 2012). Taking cultural difference into account, the Braden Scale, which was developed in the United States, might be more suitable for the Caucasian population.

FIGURE 4 Summary receiver operating characteristic curve

| Strengths and limitations
The strengths of this meta-analysis included the large number of patients retained in the quantitative synthesis. Furthermore, this is the first meta-analysis on the overall accuracy of the Braden Scale for identifying PI risk. In addition, we performed threshold analyses and cut-off-stratified analyses, and identified the optimal cut-off value, which played an important role in determining the risk of

| Implication for practice
The Braden Scale is more suitable for identifying the risk of PIs in populations with a mean age <60 years, hospitalized patients and the Caucasian population. It appears that 18 is the optimal cut-off value in clinical practice.

| CONCLUSION
The Braden Scale has a moderate predictive validity for PI risk assessment, and it is more suitable for populations with a mean age <60 years, hospitalized patients and the Caucasian population than for populations with a mean age ≥60 years, long-term care facility residents and the Asian population. Meanwhile, the cut-off value of 18 gave the best SEN and AUC, and could be recommended for use in clinical practice. Future studies should explore the optimal cut-off in specific environments.

ACKNOWLEDGEMENTS
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. This study was funded by the National Natural Science Foundation of China (Grant

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The data used to support the findings of this study are available from the corresponding author on reasonable request.