Keywords:

  • acute care;
  • falls;
  • nursing;
  • reliability;
  • risk assessment

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Methods
  5. Results
  6. Discussion
  7. Conclusion
  8. Acknowledgement
  9. References

A prospective, descriptive study was conducted in an acute care hospital in Singapore to determine the inter-rater reliability of the modified Morse Fall Scale by evaluating the degree of agreement between the ‘gold standard’ assessor and the facility assessors on the ratings of the individual items and the overall score. One hundred and forty-two subjects were recruited during the 1.5-month data collection period. The simple and weighted κ-values were all > 0.8 except for the item ‘effects of medications’ (κ and κw = 0.63), and the correlation coefficient was high (rs = 0.89) and statistically significant (P < 0.001). The modified Morse Fall Scale was shown to be a reliable fall risk assessment tool with relatively high inter-rater reliability for the overall score and individual items. This study provides evidence-based psychometric support for the clinical application of this tool.


Introduction

A fall is defined as ‘an unexpected, involuntary loss of balance by a person before coming to a rest at a lower or ground level’.[1] Falls are the most common adverse events reported in a hospital setting[2] and have been reported to be the most frequent cause of trauma-associated deaths among people aged ≥ 55 years.[3] In Singapore, fall rates range from 0.68 to 1.44 per 1000 patient bed days.[4] Patient falls lead to major morbidity, reduced functioning and increased health-care expenditure.[3]

A programme of multifaceted interventions, typically including fall assessment, is the most common approach to fall prevention,[5] and Singapore is no exception. An objective and systematic fall assessment commonly involves a fall risk assessment tool.[6] The majority of falls among older patients are multifactorial in aetiology, caused by intrinsic and extrinsic factors, and the risk of falling increases with the number of risk factors.[3]

The Morse Fall Scale was developed through rigorous research design and has been validated in various settings.[7] As tested by Morse[8] in 1997, this scale has high inter-rater reliability (r = 0.96), a relatively high sensitivity of 78% and a specificity of 83%. In a literature review by Myers and Nikoletti,[6] the Morse Fall Scale was the only tool that had been tested by other authors in clinical settings different from that of the development population.[7, 9, 10] It has been used in a wide spectrum of patient populations.[11-13]

In order to identify patients at higher risk of falling, the Morse Fall Scale was modified by an experienced quality nurse manager, drawing on focus group discussions with allied health-care professionals in a local hospital. Although the risk factors in the Morse Fall Scale are similar to those in the fall risk assessment tool used in the study hospital, the adjusted scores and additional components have not been tested for their reliability. Therefore, this research aimed to determine the inter-rater reliability of the modified Morse Fall Scale in an acute care hospital by examining the degree of agreement between the ‘gold standard’ assessor and the facility assessors on the ratings of the individual items and the overall scores. More specifically, the two null hypotheses were as follows:

  1. There would be no agreement (κ and κw = 0) on the ‘gold standard’ assessor's and the facility assessors' ratings for the individual items in the fall risk assessment tool.
  2. There would be no correlation (rs = 0) in the overall scores of the fall risk assessment tool between the ‘gold standard’ assessor and the facility assessors.

Methods

Research design and setting

A prospective, descriptive study was conducted in a 550-bed acute care hospital in Singapore. Two geriatric, two surgical and three medical wards were selected for the study.

Sample size

In a previous similar local study,[14] the sensitivity of the Morse Fall Scale at the optimal cut-off point was found to be 68%. With the null value set at 0.68 (68%) and the alternative expected to be 0.778 (77.8%), a 14.5% increase from the null, a convenience sample of 142 observations was needed at 5% alpha and 80% power (nQuery Advisor 7.0, 2009).

Study sample

Patients

Patients aged ≥ 55 years who had been admitted to the selected wards were recruited within the first 24 h of admission. Those who had fallen in hospital before undergoing fall risk assessment were excluded from the study.

The instrument

The original version of the Morse Fall Scale consists of six items: ‘history of falling’ carries 25 risk points, ‘secondary diagnosis’ 15, ‘ambulatory aid’ 15 or 30, ‘intravenous/heparin lock’ 20, ‘gait/transferring’ 10 or 20 and ‘mental status’ 15.[7] The maximum total is 125 risk points. Morse Fall Scale scores of 0–24, 25–50 and ≥ 51 are categorized as ‘no risk’, ‘low risk’ and ‘high risk’, respectively.

The modified version of Morse Fall Scale in this study has an additional component (‘risk-taking behaviour’). The maximum total score is adjusted to 255. Patients with 0–24 risk scores are categorized into the low risk group, 25–44 scores into medium risk group and ≥ 45 scores into high risk group.
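
The scoring rules described above can be sketched in a few lines of Python. This is only an illustration of the published item weights and risk bands; the constant and function names are assumptions introduced here and are not part of either instrument.

```python
# Maximum risk points per item of the original Morse Fall Scale, as listed above.
ORIGINAL_MAX_POINTS = {
    "history of falling": 25,
    "secondary diagnosis": 15,
    "ambulatory aid": 30,
    "intravenous/heparin lock": 20,
    "gait/transferring": 20,
    "mental status": 15,
}

# Maximum total score of the modified version used in this study.
MODIFIED_MAX_SCORE = 255


def categorize_original(score):
    """Risk band on the original scale: 0-24, 25-50 and >= 51."""
    if not 0 <= score <= sum(ORIGINAL_MAX_POINTS.values()):
        raise ValueError("score must lie between 0 and 125")
    if score < 25:
        return "no risk"
    if score < 51:
        return "low risk"
    return "high risk"


def categorize_modified(score):
    """Risk band on the modified scale: 0-24, 25-44 and >= 45."""
    if not 0 <= score <= MODIFIED_MAX_SCORE:
        raise ValueError("score must lie between 0 and 255")
    if score < 25:
        return "low risk"
    if score < 45:
        return "medium risk"
    return "high risk"


if __name__ == "__main__":
    print(sum(ORIGINAL_MAX_POINTS.values()))  # 125
    print(categorize_modified(30))            # medium risk
```

Note that the same numerical score can fall into different bands on the two versions: a score of 45, for example, is ‘low risk’ on the original scale but ‘high risk’ on the modified one.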

The history of falls is assessed by asking the patients or their family whether the patient has had a fall within the past 6 months, excluding recreational and industrial incidents. The category ‘mental state’ includes agitation and confusion, and the category ‘medication’ includes the effect of medication and post-general anaesthesia (GA) or regional anaesthesia (RA) status, to be reviewed after 24 h. Mobility is assessed by monitoring ambulation or the patient's verbalization of feeling weak and by asking the patient about the use of any assistive devices such as a wheelchair, umbrella, crutches, cane or walker. Regarding medical condition, the patient or family is asked whether the patient has any other past medical conditions (e.g. hypertension, diabetes and stroke) besides the reason for admission. Patients who display risk-taking behaviours or do not comply with fall prevention instructions are considered under ‘safety awareness: risk-taking behaviour’. The category ‘continuous intravenous therapy’ is scored if the patient is receiving intravenous therapy.

Data collection procedures

Data were collected over a period of 1.5 months. All the nurses and the researcher were required to undergo 1 day of training on the use of the fall risk assessment tool. A minimum score of 80 was needed to pass the training test.

During data collection, permission to approach the patients and nurses was sought from the nurse managers. The participating nurses were all registered nurses who had undergone training on fall risk assessment and were working on the day shift. The trained researcher read through the case sheets to identify potential participants. Before giving informed consent, all selected patients were given participant information sheets with verbal explanations of the details of the study.

The researcher and the nurse then assessed the same patient independently. The nurse's assessment of the patient's fall risk was part of the patient's admission procedure. The researcher rated the patients on a separate occasion within the first 24 h of admission, using a separate data collection form. After rating, the researcher transcribed the admission nurse's rating scores for the same patient from the hospital's copy into the second column of the data collection form under the ‘nurse's assessment’ heading. As the consistent assessor on all occasions of fall risk assessment, the researcher provided the ‘gold standard’ rating against which the ratings of the ward nurses were compared. To minimize transcription errors, the staff nurses on duty checked that all the data were transcribed correctly.

Ethical consideration

Ethical approval was granted by the Institutional Review Board of the study hospital. Anonymous data collection forms bearing no names or identification numbers were used to prevent breaches of confidentiality.

Data analysis

SPSS software version 18.0 (IBM, Chicago, IL, USA) and MedCalc software version 11.5.1 (MedCalc, USA, 2010) were used for analysis. Descriptive statistics were used to describe the participants' demographic characteristics. The Kolmogorov–Smirnov (K-S) test was used to assess normality. Because the sample responses were not normally distributed, non-parametric statistics (the kappa statistic and Spearman's rank correlation coefficient) were used for data analysis.

Results

Demographic data

One hundred and forty-two patients were assessed (mean age = 72.3 years, standard deviation (SD) = 10.98) (Table 1). Participants aged ≥ 75 years formed the largest age group (n = 69, 48.6%). There were equal proportions of men (50%) and women (50%). Medical patients made up the largest proportion (n = 80, 56.3%) of participants recruited, followed by geriatric medicine (n = 23, 16.2%), surgical (n = 20, 14.1%), cardiology (n = 10, 7.1%), orthopaedic (n = 8, 5.6%) and, lastly, urology (n = 1, 0.7%). One hundred and one recruited patients (71.1%) were Chinese, 31 (21.8%) were Malay and the remaining 10 (7.1%) were Indian. One hundred and forty participants (98.6%) were married, one (0.7%) was single and the remaining one (0.7%) was widowed. The majority of the participants (n = 119, 83.8%) had an education level of primary school or below.

Table 1. Demographic data of study sample (total n = 142)

Demographic characteristic        n      (%)
Age (years)
  55–64                           41     (28.9)
  65–74                           32     (22.5)
  ≥ 75                            69     (48.6)
Gender
  Male                            71     (50.0)
  Female                          71     (50.0)
Marital status
  Single                          1      (0.7)
  Married                         140    (98.6)
  Widowed                         1      (0.7)
Race
  Chinese                         101    (71.1)
  Malay                           31     (21.8)
  Indian                          10     (7.1)
Education level
  Primary school and below        119    (83.8)
  Secondary school                20     (14.1)
  Tertiary school and above       3      (2.1)
Discipline
  Medical                         80     (56.3)
  Geriatric medicine              23     (16.2)
  Surgical                        20     (14.1)
  Cardiology                      10     (7.1)
  Orthopaedic                     8      (5.6)
  Urology                         1      (0.7)

The simple and weighted κ-values were all > 0.8 except for the item ‘effects of medications’ (κ and κw = 0.630) (Table 2). Among all the items, the simple and weighted κ-values (κ and κw = 1.000) were the highest for the items ‘post-GA/RA’ and ‘risk-taking behaviour’.

Table 2. Simple and weighted kappa statistic values for each item assessed

Item                                   κ (95% CI)           κw (95% CI)
1. Fall(s) over the past 6 months      0.968 (0.92–1.00)    0.968 (0.92–1.00)
2. Mental status                       0.881 (0.72–1.00)    0.881 (0.72–1.00)
3. Effects of medication               0.630 (0.51–0.75)    0.630 (0.51–0.75)
4. Post-GA/RA                          1.000 (1.00–1.00)    1.000 (1.00–1.00)
5. Unsteady gait                       0.955 (0.90–1.00)    0.958 (0.91–1.00)
6. Use of assistive devices            0.838 (0.76–0.92)    0.835 (0.75–0.92)
7. Secondary diagnosis                 0.902 (0.77–1.00)    0.902 (0.77–1.00)
8. Risk-taking behaviour               1.000 (1.00–1.00)    1.000 (1.00–1.00)
9. Continuous intravenous therapy      0.913 (0.85–0.98)    0.913 (0.85–0.98)

κ, kappa; κw, kappa with linear weights; CI, confidence interval; GA, general anaesthesia; RA, regional anaesthesia.

Items where the weighted κ-value is different from the simple κ-value are ordinal response measures. The simple κ-value (0.955) was less than the weighted κ-value (0.958) for item ‘unsteady gait’, whereas the simple κ-value (0.838) was slightly greater than the weighted κ-value (0.835) for item ‘use of assistive devices’. Nevertheless, for these two ordinal response measures, a high rate of absolute agreement (κw-values > 0.8) was shown.

The median score of the ‘gold standard’ assessor was 80.0 and that of the facility assessors was 77.5 (Table 3), a difference of 2.5 between the median sum scores. Despite this difference, the correlation coefficient was high (rs = 0.89) and statistically significant (P < 0.001).

Table 3. Difference and correlation in sum scores between the assessors

Number of assessed patients                   142
Median score (‘gold standard’ assessor)       80.0
Median score (facility assessors)             77.5
Difference of sum score                       2.5
Correlation (Spearman)                        0.89†

†Correlation is significant at < 0.001 level (two-tailed).
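
For readers unfamiliar with the statistic reported in Table 3, the following is a minimal sketch of Spearman's rank correlation, computed as the Pearson correlation of the ranks with tied values given their average rank. The function names and the five pairs of scores are hypothetical and are used only to show the calculation; they are not study data, and the study itself used SPSS and MedCalc.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rs: Pearson correlation of the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rs is undefined when one rater gives a constant score")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical sum scores for five patients (illustration only).
    gold_standard = [80, 45, 20, 95, 60]
    facility = [75, 50, 25, 60, 95]
    print(round(spearman_rho(gold_standard, facility), 2))
```

Because rs depends only on rank order, two raters can have a high rs even when their absolute scores differ, which is why the median difference is reported alongside the correlation.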

Discussion

Agreement on the assessors' ratings for the individual items in the fall risk assessment tool

The results showed that the simple and weighted κ-values were high and statistically significant, indicating a high degree of agreement between the ‘gold standard’ assessor's and the facility assessors' ratings for the individual items in the fall risk assessment tool. The exception was the item ‘effects of medications’, whose simple and weighted κ-values (κ and κw = 0.630) indicated substantial agreement but fell below the threshold of 0.8 for almost perfect agreement.[15]
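
The interpretation above follows the agreement bands proposed by Landis and Koch.[15] As a small sketch (the function name is an assumption introduced here for illustration), the bands can be expressed as:

```python
def landis_koch_label(kappa):
    """Strength-of-agreement label for a kappa value (Landis and Koch bands)."""
    if kappa > 1:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    print(landis_koch_label(0.630))  # substantial
    print(landis_koch_label(0.968))  # almost perfect
```

These labels are conventional benchmarks rather than statistical tests, so they describe the size of κ but not its precision; the confidence intervals in Table 2 should be read alongside them.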

The range of the weighted κ-values was relatively large (0.630–1.000). The lower degrees of inter-rater reliability for some items were possibly because of ambiguous operational definitions. For instance, the work protocol stated that 15 points are scored if the patient verbalizes that he/she feels weak. However, there was no definition or elaboration of the meaning of ‘weak’. This supports further improvement of the item definitions to enable accurate identification of high-risk patients.

For the item ‘use of assistive devices’, the weighted κ-value (κw = 0.835) was slightly lower than the simple κ-value (κ = 0.838). For the purpose of illustration, the contingency table of the item ‘use of assistive devices’ is presented (Table 4). The raters were in exact agreement in 129 of 142 ratings. Three ratings deviated from exact agreement by one category. However, maximum disagreement occurred in nine cases in which the ‘gold standard’ rater rated ‘crutches, cane, walker’, whereas the facility raters rated ‘no device’. These strong disagreements decreased the weighted κ-value. This is perhaps because of the assessors' different perceptions of the individual patients' need for an assistive device. Training with evaluation of performance on real patients, and the use of photographs to illustrate the items, is recommended to reinforce the learners' recall.

Table 4. Contingency table of item ‘use of assistive devices’ (rows: facility rater; columns: ‘gold standard’ rater)

Facility rater                                   No device    Furniture, support    Crutches, cane, walker    Total
No device                                        52           0                     9                         61
Furniture, support (e.g. walls, wheelchair)      0            10                    3                         13
Crutches, cane, walker                           1            0                     67                        68
Total                                            53           10                    79                        142
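
The simple and linearly weighted κ-values for this item can be recomputed from the counts in Table 4. The following is a minimal sketch, assuming the three categories are ordered as shown and that linear weights of the form 1 − |i − j|/(k − 1) are used; the function name is hypothetical, and this is not the routine used by SPSS or MedCalc.

```python
def kappa(table, linear_weights=False):
    """Cohen's kappa for a square contingency table of counts.

    With linear_weights=True, agreement between categories i and j is
    weighted by 1 - |i - j| / (k - 1); otherwise only exact agreement counts.
    """
    k = len(table)
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("table must be square with at least two categories")
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if linear_weights:
            return 1.0 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    # Observed and chance-expected (weighted) proportions of agreement.
    observed = sum(weight(i, j) * table[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(weight(i, j) * row_totals[i] * col_totals[j]
                   for i in range(k) for j in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Counts from Table 4 (rows: facility rater; columns: 'gold standard' rater).
ASSISTIVE_DEVICES = [
    [52, 0, 9],
    [0, 10, 3],
    [1, 0, 67],
]

if __name__ == "__main__":
    print(round(kappa(ASSISTIVE_DEVICES), 3))                       # 0.838
    print(round(kappa(ASSISTIVE_DEVICES, linear_weights=True), 3))  # 0.835
```

The weighted value is lower here because most of the disagreements are two categories apart and therefore receive no partial credit, while the weighting raises the agreement expected by chance.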

In contrast to the item ‘use of assistive devices’, for which the weighted κ-value (κw = 0.835) was slightly lower than the simple κ-value (κ = 0.838), the item ‘unsteady gait’ had a weighted κ-value (κw = 0.958) that was slightly higher than the simple κ-value (κ = 0.955). This indicates that the relative agreement on the item ‘unsteady gait’ was slightly higher than the exact agreement. Because the difference was minimal, the raters' disagreement is unlikely to have had a large negative impact. However, in terms of mobility, there should be absolute agreement between the raters to ensure the implementation of appropriate fall prevention strategies. It might be that the explanations of the terms were not specific enough. A more distinct measure of mobility should be used; one example would be the ‘get up and go’ test.[16]

The items ‘fall(s) over the past 6 months’ and ‘continuous intravenous therapy’ would be expected to be assessed by all staff members in the same manner. Although the simple and weighted kappa values were relatively high for ‘fall(s) over the past 6 months’ (κ and κw = 0.968) and for ‘continuous intravenous therapy’ (κ and κw = 0.913), neither achieved a perfect value of 1. The difficulty in rating the item ‘fall(s) over the past 6 months’ correctly was likely attributable to the patients' ambiguous answers and their limited ability to recall their fall histories because of their old age. For instance, the ratings of the assessors might differ if the patients recalled differently on separate occasions. Recall bias was therefore introduced. For easy reference by the ward staff, any fall over the past 6 months should be documented in the case sheets on the patient's admission to the hospital.

Continuous intravenous therapy was assessed by direct observation of the patients' present status. However, there was a time gap between the ratings by the ‘gold standard’ assessor and the facility assessors, leading to discrepancies in the assessments of the patients' conditions. As stated in the study hospital's work protocol, the patient's fall risk status should be assessed on admission, when there is a change in the patient's condition or treatment, when the patient is transferred from another department, or after a fall. It was observed that the majority of the ward staff only reassessed the patients' fall risk status while writing their nursing reports within 2 hours before the report passing time. Prompt reassessments could be compromised by their busy work routines. Nevertheless, the importance of accurate and prompt fall risk assessments should be reinforced among the ward staff so as to facilitate the implementation of effective fall prevention interventions.

When only weighted κ-values are taken into consideration, the highest agreement between the raters was on the items ‘post-GA/RA’ and ‘risk-taking behaviour’, with a perfect weighted κ-value of 1. This shows that the ‘gold standard’ rater agreed with the facility raters on all the assessment occasions. Conversely, the results revealed that the raters disagreed most when rating the item ‘effects of medications’ (κw = 0.63). It was observed that not all the patient case folders or inpatient medication records had a list of ‘fall risk’ medications inserted for reference and that the majority of the staff did not refer to the patient's medication records during the fall risk assessment. Similarly, there was no absolute agreement on the item ‘secondary diagnosis’ (κw = 0.902). This item could be assessed easily and accurately by reading through the patients' medical history in the case notes, but some of the ward staff did not do so. Assessment based on the nurses' prior knowledge of the patients' conditions could cause inaccuracy in the documentation, directly leading to miscategorization of the patient's fall risk status. Counterchecking against reliable sources should be reinforced.

Correlation in the overall scores of the fall risk assessment tool between the assessors

The rho coefficient (rs) was 0.89, and the P-value was very small (P < 0.001), below the preset 5% (0.05) level of significance. This showed that there was a high and statistically significant correlation in the overall scores of the fall risk assessment tool between the ‘gold standard’ assessor and the facility assessors.

Inter-rater reliability is an important quality of any assessment tool in health-care settings, and the overall agreement was relatively good, with a high and statistically significant correlation coefficient of 0.89. Although there was a difference between the median scores of the ‘gold standard’ assessor (80.0) and the facility assessors (77.5), the resulting difference (2.5) was relatively small, and both scores were still categorized as ‘high fall risk’.

The original version of the Morse Fall Scale has been tested rigorously in other settings. Its inter-rater reliability coefficient varied across study contexts: high inter-rater reliability (r = 0.96) was reported by Morse in 1997,[8] relatively low inter-rater reliability of 0.68 by McCollam in 1995[17] and moderate reliability (κ = 0.80) by Ang et al. in 2007.[14] The inter-rater reliability (rs = 0.89) in this study seems to be relatively high compared with the previous studies. However, caution is warranted in directly comparing inter-rater reliability across the studies, as they took place in different settings with different populations, study designs and data analyses. Coefficients of different types (such as r, rs and κ) cannot be compared directly.

The variability in the overall scores between the assessors may have been because of different levels of knowledge regarding the patients being assessed. On admission, the facility assessor had the opportunity to observe the patient's condition and carry out a more detailed interview with the patient when filling in the nursing assessment record. The facility assessors' knowledge of the patient's background may therefore have been greater than that of the ‘gold standard’ rater, who focused only on the fall risk assessment tool without more in-depth knowledge of the patients' backgrounds.

Limitations

This was a single-centre study with purposive selection of the study wards; the generalizability of the results to the wider population is therefore limited. Secondly, for ethical reasons, it was not possible to withhold fall prevention measures from patients while their fall risk was being assessed. Predicting the criterion variable while simultaneously preventing its occurrence makes the study design imperfect, as the accuracy of the assessment tool was affected. Thirdly, a ‘gold standard’ and paired observer method is used when resources are insufficient to allow all the subjects to be assessed simultaneously by two raters.[18] However, the time gap separating the ratings of the ‘gold standard’ rater and the facility raters might have decreased the inter-rater reliability coefficient because of unpredictable changes within the time lapse. Fourthly, designating the researcher as the ‘gold standard’ is debatable. Compared with the ward nurses, the researcher was in a less advantageous position, with fewer observation periods and less prior knowledge of the patients' conditions. Lastly, there is a possibility that the ‘gold standard’ assessor might have been more careful in her fall risk assessment because the data were collected as part of a research study. This ‘Hawthorne effect’ can artificially alter the inter-rater reliability.[19]

Conclusion

The modified Morse Fall Scale was shown to be a reliable fall risk assessment tool with relatively high inter-rater reliability for the overall score and individual items. Through quantitative assessment of inter-rater reliability, this study provides evidence-based psychometric support for the clinical application of this tool. The findings provide insight into the ability of nurses to reliably assess patients' levels of fall risk, and any disagreements in the rating of patients' fall risk levels may imply the need for improvement in the assessment tool and/or nurses' training. In the future, inter-rater reliability may be further strengthened if the assessors receive better training and feedback and if explicit definitions of the terms are introduced.

Acknowledgement

The study was funded by the Alice Lee Centre for Nursing Studies (National University of Singapore). Special thanks to Dr. Chow Yeow Leng, Dr. Serena Koh Siew Lin, Dr. Chan Moon Fai, Ms. Velusamy Poomkothammal and Ms. Jesbindar Kaur for their valuable contributions to the research study.

References

  1. Commodore DIB. Falls in the elderly population: A look at incidence, risks, healthcare costs, and preventive strategies. Rehabilitation Nursing 1995; 20: 84–89.
  2. Eldridge C. Evidence-Based Falls Prevention: A Study Guide for Nurses. Marblehead, MA, USA: HCPro, 2004.
  3. Tinetti ME, Baker DI, Garrett PA, Gottschalk M, Koch ML, Horwitz RI. Yale FICSIT: Risk factor abatement strategy for fall prevention. Journal of the American Geriatrics Society 1993; 41: 315–320.
  4. Koh SSL, Manias E, Hutchinson AM, Johnston L. Fall incidence and fall prevention practices at acute care hospitals in Singapore: A retrospective audit. Journal of Evaluation in Clinical Practice 2007; 13: 722–727.
  5. Evans D, Hodgkinson B, Lambert L, Wood J, Kowanko I. Falls in acute hospitals: A systematic review. The Joanna Briggs Institute for Evidence Based Nursing and Midwifery 1998; 1: 753.
  6. Myers H, Nikoletti S. Fall risk assessment: A prospective investigation of nurses' clinical judgement and risk assessment tools in predicting patient falls. International Journal of Nursing Practice 2003; 9: 158–165.
  7. O'Connell B, Myers H. Research in brief: The sensitivity and specificity of the Morse Fall Scale in an acute care setting. Journal of Clinical Nursing 2002; 11: 134–136.
  8. Morse JM. Preventing Patient Falls. Thousand Oaks, CA, USA: Sage, 1997.
  9. Eagle DJ, Salama S, Whitman D, Evans LA, Ho E, Olde J. Comparison of three instruments in predicting accidental falls in selected inpatients in a general teaching hospital. Journal of Gerontological Nursing 1999; 25: 40–45.
  10. McFarlane-Kolb H. Falls risk assessment, multitargeted interventions and the impact on hospital falls. International Journal of Nursing Practice 2004; 10: 199–206.
  11. Camicioli R, Licis L. Motor impairment predicts falls in specialized Alzheimer care units. Alzheimer Disease and Associated Disorders 2004; 18: 214–218.
  12. Lai KCKYS, Wong KST. Validation of the Cantonese version of the Morse Fall Scale. In: 11th Annual Congress of the Hong Kong Association of Gerontology, Hong Kong; 2003.
  13. Ledsham RBJ, Beardsall A. Implementing a fall risk assessment strategy for older people: Issues and outcomes. Clinical Government Bulletin 2002; 3: 24.
  14. Ang NKE, Mordiffi SZ, Wong HB, Devi K, Evans D. Evaluation of three fall-risk assessment tools in an acute care setting. Journal of Advanced Nursing 2007; 60: 427–435.
  15. Landis JR, Koch G. The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159–174.
  16. Mathias S, Nayak US, Isaacs B. Balance in elderly patients: The ‘get-up and go’ test. Archives of Physical Medicine and Rehabilitation 1986; 67: 387–389.
  17. McCollam ME. Evaluation and implementation of a research-based falls assessment innovation. The Nursing Clinics of North America 1995; 30: 507–514.
  18. Zenk SN, Schulz AJ, Mentz G et al. Inter-rater and test-retest reliability: Methods and results for the neighbourhood observational checklist. Health and Place 2006; 13: 452–465.
  19. Forsyth DR. Group Dynamics. Belmont, CA, USA: Wadsworth, 1999.