Early prediction of hospital admission of emergency department patients

The early prediction of hospital admission is important to ED patient management. Using available electronic data, we aimed to develop a predictive model for hospital admission.


Introduction
EDs are under high levels of pressure globally. They need to assess, treat and dispose of patients as rapidly as possible to avoid long waiting times, ambulance diversion, lengthy offload times, patient discomfort and ED overcrowding. 1 These problems have a profoundly unfavourable impact on patient outcomes. 2 In response to these challenges, one option has been to introduce ED length of stay (EDLOS) targets to generate a 'whole of hospital' response to ED overcrowding. Following the example of the UK, Australia introduced a target known as the National Emergency Access Target (NEAT) in 2011, aiming to have 90% of all patients (admitted or discharged) depart the ED within the 4-h timeframe, whereas New Zealand adopted a 6-h target. [3][4][5][6] In the last quarter of 2022, Victorian health services and public hospitals achieved a 4-h performance of only 52%. 7 As most Australian and New Zealand EDs admit 30-50% of attendances to a multiday or short stay (24 h observational) bed, 4 early prediction of admission has the potential to reduce EDLOS, improve 4-h performance, assist clinical decision-making and enhance the streaming of patients in the ED to improve efficiency and patient flow.
The early prediction of hospital admission, however, is challenging as multiple factors contribute to the final decision to admit a patient. Nonetheless, such multiplicity of factors lends itself to machine learning (ML). ML can use data routinely collected and available in electronic medical records (EMRs) to predict admission early after ED presentation. 8 Accordingly, we hypothesise that, using available electronic data, an ML model could be developed that, within the first 4 h of presentation, would achieve good performance for the prediction of hospital admission.

Study design and population
We analysed all adult and paediatric presentations to the ED at the Austin Hospital, a tertiary referral centre in Melbourne, Australia, from 1 January 2015 until 30 June 2022. The aim was to develop ML models to predict whether a patient presenting to the ED would receive an inpatient admission. 9 We based our models on data collected by the EMR at different stages of the patients' presentation to the hospital. We chose the following specific time points: baseline (presentation) and 30, 60, 120, 180 and 240 min from arrival in the ED; the maximum of 240 min was chosen to reflect the Australian NEAT target.

Data collection
We used routinely available EMR data (see Table 1 for variable list and Table 2 for variable explanation), which include demographics, previous medical conditions (if available), sociodemographic data (e.g. socioeconomic index for suburb of residence, 10 use of community outreach and support programmes, place of residence), ED demographics (e.g. source of presentation, triage category), vital signs and common pathology results. We also included comorbidity indices such as the Charlson Comorbidity Index (CCI), 11 the Elixhauser Comorbidity Index (ECI) 12 and the Multi-purpose Australian Comorbidity Scoring System (MACSS) 13 alongside data on past presentations to the hospital in ED or outpatient clinics or inpatient admission. We further included the presence of certain words which might indicate a fall ('falls', 'fall', 'multiple falls', 'hip', 'fracture', 'broken' and 'poor mobility') to indicate the presence or risk of fall-induced injury (which suggests a patient is unlikely to be discharged from the ED and is likely to require hospital admission). For the same reason, we also included neuro-psychological related terms which might reflect neurocognitive impairment ('aggression', 'aggressive', 'confused', 'confusion', 'olanzapine', 'agitated', 'dementia', 'Parkinson', 'claustrophobic', 'disorient', 'screaming', 'midazolam', 'droperidol', 'distressed', 'delirium' and 'delirious') in the triage information or presenting complaint free text as categorical features in the model. The medications listed above and included in the model reflected their preferential and common use in the ED of the study hospital to treat delirium and agitation.
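The free-text screening described above can be sketched as a simple substring search that emits binary features. This is a minimal illustration, not the study's implementation; the function name and feature names are hypothetical, and the term lists are those quoted in the text.

```python
# Hypothetical sketch of deriving categorical keyword features from triage free text.
FALL_TERMS = ["falls", "fall", "multiple falls", "hip", "fracture", "broken", "poor mobility"]
NEURO_TERMS = ["aggression", "aggressive", "confused", "confusion", "olanzapine",
               "agitated", "dementia", "parkinson", "claustrophobic", "disorient",
               "screaming", "midazolam", "droperidol", "distressed", "delirium", "delirious"]

def keyword_flags(triage_text: str) -> dict:
    """Return binary features indicating whether any fall-related or
    neurocognitive-related term appears in the triage free text."""
    text = triage_text.lower()
    return {
        "fall_terms_present": int(any(term in text for term in FALL_TERMS)),
        "neuro_terms_present": int(any(term in text for term in NEURO_TERMS)),
    }
```

A case-insensitive substring match of this kind would, for example, flag 'disoriented' via the stem 'disorient', which may be why stems rather than full words appear in the term list.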

Outcomes
The primary outcome was an inpatient (IP) admission following an ED presentation. This was defined as a conversion of care type from an ED episode to an IP episode. The secondary outcomes were: (i) admission to the short stay unit (SSU), an area administratively classified as IP but managed by the ED, or (ii) admission to a ward managed by specialities other than ED, each assessed separately.

Statistical analysis
The model was trained on presentations from 1 January 2015 to 31 March 2020 and was evaluated on a held-out data set of presentations from 31 December 2020 to 30 June 2022. This was done to evaluate the performance of the classification model prior to the commencement of local COVID-19 pandemic lockdowns. For missing data, continuous features were imputed (single imputation) with the mean (if normally distributed) or the median (if non-normally distributed), and categorical features were imputed with the most frequent class. All the models discussed in this paper are ensembles known as Extreme Gradient Boosting (XGBoost). 13,14 These models were chosen over traditional regression models because of: (i) performance closer to the state of the art in the industry; (ii) handling of multicollinearity; and (iii) model explainability.
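The single-imputation rule described above can be sketched as follows. This is an illustrative standard-library sketch, not the study's code; the function name and the `kind` labels are hypothetical, and in practice normality would be assessed on the training data before choosing mean versus median.

```python
from statistics import mean, median, mode

def impute_column(values, kind):
    """Single imputation as described in the text: mean for normally
    distributed continuous features, median for non-normally distributed
    continuous features, most frequent class for categorical features.
    `kind` is one of 'normal', 'skewed', 'categorical' (hypothetical labels).
    Missing entries are represented as None."""
    observed = [v for v in values if v is not None]
    if kind == "normal":
        fill = mean(observed)
    elif kind == "skewed":
        fill = median(observed)
    else:  # categorical: impute with the most frequent class
        fill = mode(observed)
    return [fill if v is None else v for v in values]
```

Imputation statistics would be computed on the training set only and reused on the held-out set, to avoid leaking information from the evaluation period.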
For model explanation, we utilised the Shapley Additive Explanations (SHAP) algorithm. 15 Originating from game theory, a SHAP summary plot provides a density scatter plot of SHAP values for each feature to identify how much impact a given individual feature has on the model output for individuals in the validation data set. This approach provides a visual representation of the relative contribution of an individual variable to the model, and what values of a feature contribute toward a positive or negative outcome. SHAP plots were created for all time-point models.
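The game-theoretic idea behind SHAP is that a feature's attribution is its average marginal contribution to the model output across all feature subsets. A minimal exact computation on a toy "value function" illustrates this; in practice the SHAP library uses efficient approximations (e.g. TreeSHAP for gradient-boosted trees), and the function and feature names below are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values for a small feature set: each feature's
    weighted average marginal contribution to value_fn over all subsets
    of the remaining features (the game-theory basis of SHAP)."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = value_fn(set(subset) | {f})
                without_f = value_fn(set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi
```

For an additive value function the Shapley value of each feature recovers its additive weight exactly, which is the sanity check usually used for such implementations.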
Patient data such as vital signs, demographics and comorbidity scores were summarised using medians and interquartile ranges when the data followed a non-normal distribution and with means and standard deviations when following a Gaussian distribution. Patient characteristics for the two populations were compared using the Mann-Whitney U test as appropriate for continuous variables. For categorical variables, the chi-squared test or Fisher's exact test was applied as appropriate. JupyterLab 16 with a Python (v3.8.5) 17 backend was used for modelling and analysis. Discrimination was assessed by the area under the receiver operating characteristic curve (AUROC). An AUROC of >0.70 but <0.75 was classified as fair, a value >0.75 but <0.80 as fair to good, a value >0.80 but <0.85 as good, a value >0.85 but <0.90 as very good and a value >0.90 as excellent. A P value <0.05 was considered statistically significant.

Figure 1. Diagram of data split for model development, validation and testing.
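The AUROC has a direct probabilistic reading: it is the probability that a randomly chosen admitted patient receives a higher predicted score than a randomly chosen non-admitted patient. A minimal sketch of this rank-based computation, together with the qualitative bands used above, is shown below; the function names are hypothetical and boundary values between bands follow the text's '>' thresholds.

```python
def auroc(labels, scores):
    """AUROC via the rank interpretation: the probability that a random
    positive (label 1) outranks a random negative (label 0), ties counting
    as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def discrimination_band(auc):
    """Map an AUROC to the qualitative bands used in the analysis."""
    if auc > 0.90:
        return "excellent"
    if auc > 0.85:
        return "very good"
    if auc > 0.80:
        return "good"
    if auc > 0.75:
        return "fair to good"
    if auc > 0.70:
        return "fair"
    return "poor"
```

Under this reading, the reported 30-min AUROC of 0.94 means the model ranks an admitted patient above a non-admitted one roughly 94% of the time.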

Ethics
The project was approved by the Austin Hospital Human Research and Ethics Committee (HREC/91113/Austin-2022).

Demographics
We studied 599 015 ED presentations.

ML models
The training data set comprised 424 354 data points (70.84% of total data) and the validation data set included 53 403 data points, 8.92% of total data (Fig. 1).
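The chronological split described in the statistical analysis can be sketched as a simple date-based assignment. This is an illustrative sketch only; the function name is hypothetical, the cut-off dates are those stated in the text, and how presentations falling between the training cut-off and the start of the held-out period were handled is an assumption (labelled 'excluded' here).

```python
from datetime import date

def assign_split(presentation_date,
                 train_end=date(2020, 3, 31),
                 test_start=date(2020, 12, 31)):
    """Chronological split: presentations up to train_end form the training
    pool; presentations from test_start onward form the held-out set.
    Handling of the gap between the two periods is an assumption."""
    if presentation_date <= train_end:
        return "train"
    if presentation_date >= test_start:
        return "test"
    return "excluded"
```

Splitting by date rather than at random avoids temporal leakage, so the held-out evaluation mimics deploying the model on future, unseen presentations.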
We were able to build a binary classifier that predicted any hospital admission from the ED with 87% accuracy using data readily available at patient registration, and with 86% accuracy within the first 30 min using historical and current in-hospital clinical data. This level of accuracy remained consistent over all time points used for analysis and increased slightly with time spent in the ED (Table 4).
The ability to discriminate admissions from non-admissions was expressed as the AUROC, which was 0.94 at 30 min and remained consistently high at all intervals. The performance of the ML model, however, decreased when attempting to separate SSU admissions from general ward admissions (Table 4).

SHAP analysis
Explanatory SHAP analysis for the weight of variables used in the ML model and its interpretation for data obtained at 30 min are presented in Figure 2 for all admissions, Figure 3 for hospital ward admission and Figure 4 for SSU admissions.
In relation to the model applied in the present study (IP-Admit-30 min), SHAP analysis showed that higher values of age contributed to an increasing chance of admission. Similarly, ambulance arrival as a categorical variable contributed positively toward the probability of admission. A higher number of ED admissions in the last 360 days and a higher number of pathology tests done after ED arrival contributed toward a greater risk of hospitalisation. Peripheral IV insertion in the first 30 min also contributed to the prediction of a patient's admission. As suspected and observed in SHAP analysis, higher comorbidity scores (CCI, ECI and MACSS) contributed positively toward a patient's risk of being admitted.

Key findings
Using available electronic data, we developed an ML-based model for the early prediction of hospital admission among patients presenting to the ED. We found that such a model could be developed and validated. Our model delivered excellent performance in predicting hospital admission already within the first 30 min of presentation. Moreover, although the performance of the model decreased when trying to predict SSU or general ward admission separately, it remained very good. The lower value for SSU admissions is not surprising. SSU admissions are tailored for patients with a predicted LOS of 24 h or less. In times of peak demand, these beds are often repurposed or admit patients with a higher likelihood of a multiday admission. Our analysis did not look at 'intention to treat'. Finally, we found that, using the SHAP algorithm, we were able to identify major drivers of model performance, which carried clinical plausibility, thus further validating the credibility and logic of the models.

Relevance to practice
No Australian state or territory has maintained the 90% NEAT. During the current COVID wave of Winter 2022, 4-h EDLOS performance declined to 52%. 7 However, there is ample evidence that reduction in EDLOS has a beneficial impact on standardised hospital mortality 18,19 as well as improvement in the mortality of discharged patients. 20 In addition, reduction of EDLOS has not demonstrated any adverse impact on the quality and safety of clinical care. 21 It appears logical that the ability to predict which patients will be admitted based upon demographic and clinical criteria and/or indicators of ED 'busyness' should enhance clinical decision-making, reduce EDLOS, achieve better patient outcomes, and assist health services in reaching a higher 4-h discharge target. 9,22 New ML technology may assist in achieving such goals as demonstrated by the present study.
The ED is uniquely placed to benefit from the application of data-based ML software-derived prediction tools because of their potential value in enhancing clinical decision-making. Patients are assessed in the ED with limited information, and physicians often find themselves balancing probabilities to inform decision-making and manage clinical risk. Indeed, a meta-analysis of triage accuracy demonstrated variable performance and moderate accuracy. 23 Emergency physicians determine the need for admission based upon a complex interaction of clinical and psychosocial factors, response to therapy and judgement 22 while often multitasking and working in a challenging environment. 24 Cognitive errors and poor decision-making can occur 25 and the use of predictive algorithms may reduce the potential for error while improving patient safety. 26 Cameron et al. 27 demonstrated that nurses performed worse than ML methods in predicting patient admission. Thus, computerised prediction tools that enhance ED flow metrics should support clinical decision-making. 28 Different models of ML have been used to predict the need for admission. These include Decision Tree, Support Vector Machine, Random Forest, Naïve Bayes classifier, gradient boosting and deep neural networks, providing an accuracy ranging from 84% to 88%. 26 Kirubarajan et al. 29 undertook a scoping review of reported use of artificial intelligence (AI) in ED practice. Stewart et al. 30 acknowledge that, while there are concerns and challenges surrounding 'algorithm opacity, trust and patient data security', AI technologies will be increasingly integrated into emergency medicine practice in the coming years.

Implications of study findings
Our findings imply that, using available electronic routinely collected data, it is possible to develop an ML model that predicts hospital admission within 4 h of ED presentation. Moreover, they imply that such a model can achieve excellent performance within 30 min of presentation and continues to improve its ability to discriminate up to 180 min after presentation. Finally, they imply that such a model is driven by variables that carry clinical validity as likely predictors of admission and are generic in nature rather than institution specific. This suggests that such a model could be deployed as a way of facilitating ED patient flow in other institutions.

Strengths and limitations
Our study has several strengths. It is based on thousands of ED presentations with a wide variety of characteristics and diagnoses. The model was validated in a held-out cohort and found to be robust. The information used is typically available in the EMR of essentially all modern hospitals in developed countries. The model was developed with data from a typical teaching hospital in a large metropolitan setting that is representative of such institutions in Australia, thus carrying a degree of external validity. The variables that appeared to drive the model on SHAP analysis were clinically credible and supported the plausibility and logic of the model and its relevance.
We acknowledge several limitations. This is not a randomised controlled trial. Therefore, no inferences can be made on the utility of the model in informing and delivering more efficient patient flow. The validity of the model was tested with a patient sample within the same institution; thus its robustness, performance and validity outside the hospital where it was developed all need to be tested. This is a single-centre study and needs to be replicated in a multi-centre setting. The data available in our EMR may not be collected in the same way and to the same degree in other institutions.

Conclusions
In a study involving a large cohort of patients presenting to the ED of an Australian teaching hospital, we developed an ML-based predictive model, which could be applied using data from the hospital EMR obtained within the first 4 h of ED presentation. We found that this model had excellent discriminative performance already at the time of patient registration and within the first 30 min of presentation. We also found that such a model was driven by clinically credible variables. These findings need to be confirmed or refuted in studies conducted in other hospitals and may be used to inform trials of clinical decision support systems aimed at increasing the efficiency of ED patient management. In this retrospective study, we could not evaluate the impact of knowledge of admission probability on the behaviour of the ED or IP clinician. We plan to prospectively test the algorithm in real time to determine its utility in flagging admission to reduce EDLOS, improve 4-h performance and enhance the referral process. If the model is proven to be robust, early disposition decision-making should trigger earlier bed management processes.