Machine learning for early prediction of in‐hospital cardiac arrest in patients with acute coronary syndromes

Abstract Background Previous studies have used machine learning to predict clinical deterioration and improve outcome prediction. However, no study has used machine learning to predict cardiac arrest in patients with acute coronary syndrome (ACS). Algorithms are required to generate high‐performance models for predicting cardiac arrest in ACS patients with multivariate features. Hypothesis Machine learning algorithms will significantly improve outcome prediction of cardiac arrest in ACS patients. Methods This retrospective cohort study reviewed 166 ACS patients who had in‐hospital cardiac arrest. Eight machine learning algorithms were trained using multivariate clinical features obtained 24 h prior to the onset of cardiac arrest. All machine learning models were compared to each other and to existing risk prediction scores (Global Registry of Acute Coronary Events, National Early Warning Score, and Modified Early Warning Score) using the area under the receiver operating characteristic curve (AUC). Results The XGBoost model provided the best performance with regard to AUC (0.958 [95% CI: 0.938–0.978]), accuracy (88.9%), sensitivity (73%), negative predictive value (89%), and F1 score (80%) compared with other machine learning models. The K‐nearest neighbor model generated the best specificity (99.3%) and positive predictive value (93.8%), but had unacceptably low sensitivity and AUC. Most, but not all, machine learning models outperformed the existing risk prediction scores. Conclusions The XGBoost model, which was generated based on a machine learning algorithm, has high potential to be used to predict cardiac arrest in ACS patients. This proposed model significantly improves outcome prediction compared to existing risk prediction scores.

KEYWORDS cardiac arrest, machine learning, prediction, XGBoost

| INTRODUCTION
Cardiac arrest is a life-threatening event and a leading cause of mortality globally. 1 Accurate identification of high-risk patients, adequate preparation, and prompt initiation of clinical management are paramount steps for successful cardiac arrest resuscitation. 2 Among these steps, accurately identifying patients who are at high risk of cardiac arrest is a primary strategy, and various studies have been conducted to predict this risk. [3][4][5] Traditional studies commonly use standard statistical methods, such as regression-based stepwise analysis, to identify group-level differences, and often include a limited number of variables. [3][4][5] In contrast, machine learning begins with observations on an individual level, automatically searches multivariate data, extracts reliable outcome predictions, and ultimately generates reliable models. 6 Machine learning has been regarded as an indispensable method for handling complex problems in science, especially in biomedical and astronomical research. 7,8 Recently, machine learning has also emerged as a promising tool in the field of medicine. With advances in algorithm technology, it is now possible to identify highly relevant features and discover new ways to utilize medical signals to improve the accuracy and functionality of prediction models for medical problems. Compared with prediction models of cardiac arrest generated using traditional methods such as regression analysis or expert opinion, machine learning can achieve better performance in many cases. [9][10][11][12][13] In addition, current risk scores generated using traditional methods have limitations in clinical use due to their poor performance, low sensitivity, and/or high false-alarm rate. 14 Despite the potential benefit of machine learning algorithms, several factors need to be taken into account when building a feasible algorithm for predicting cardiac arrest.
First, in recent years, some studies extracted clinical features based only on a patient's vital signs to generate an early warning system to predict cardiac arrest. 12,14 However, many of these attributes were neither valuable nor sufficient for stratifying the onset of cardiac arrest. 15 Second, several studies have used machine learning to predict cardiac arrest in pediatric, 13 septic, 9 and ward patients; 12 however, no previous research has used machine learning to predict cardiac arrest in acute coronary syndrome (ACS) patients. Finally, although some models derived from machine learning algorithms can accurately predict cardiac arrest, most studies failed to generate a visualized risk score.
XGBoost is an ensemble algorithm based on gradient boosted trees that has a strong reputation for solving numerous machine learning challenges, but it has seldom been used for predicting cardiac arrest.
In the present study, we aimed to extract multivariate clinical features of ACS patients recorded in a database registry and to use various machine learning algorithms to develop several models with appreciable performance for predicting cardiac arrest in ACS patients. We also endeavored to visualize the proposed machine learning model in order to provide face validity for clinicians and researchers who are interested in implementing this technique. Additionally, we compared the predictive performance of machine learning with well-known existing risk prediction models for ACS patients, such as Global Registry of Acute Coronary Events (GRACE), 16 National Early Warning Score (NEWS), 17

| Populations
A total of 21 337 ACS patients documented in the registry between January 2012 and December 2016 were initially screened. In-hospital cardiac arrest was defined as a loss of pulse due to pulseless ventricular tachycardia or ventricular fibrillation, pulseless electrical activity, or asystole. In this study, we defined cardiac arrest as the start of cardiopulmonary resuscitation and/or defibrillation. All cardiac arrest events were verified by manual chart review to ensure data quality. Patients who met the following criteria were included in the case group: (1) age ≥ 18 years, and (2) a diagnosis of unstable angina, acute ST-segment elevation myocardial infarction, or acute non-ST-segment elevation myocardial infarction. Patients who had one of the following were excluded from this study: (1) a do-not-resuscitate order; (2) prior out-of-hospital cardiac arrest and ongoing resuscitation at admission; (3) cardiac arrest that had occurred within 24 h after admission or during an operation; (4) secondary multiple organ dysfunction syndrome; and (5) missing data. For patients with more than one cardiac arrest during the same period of hospitalization, only the first cardiac arrest was included in this study. The control group included patients admitted with ACS who did not experience a cardiac arrest during the study period. Patients in the control group were randomly selected from the database, and the control group was roughly three times larger than the case group in order to satisfy modeling algorithm assumptions. Inclusion criteria for the control group were the same as for the case group except that control patients did not have cardiac arrest during hospitalization. Control patients were excluded if they had been discharged "against advice" or had missing data.
After application of the inclusion and exclusion criteria, a total of 166 patients with cardiac arrest were included in the case group, and a total of 521 patients without cardiac arrest were included in the control group.

| Candidate features
Two groups of features, which were used as potential predictor variables, were obtained from the electronic health record. One group of features, including age, gender, history of smoking, history of drinking, ACS type, culprit artery, and comorbidities, was registered at the time of admission and did not change during the hospitalization. The other group of features, including laboratory features, Killip classification, vital signs, mental status, the number of days prior to the occurrence of cardiac arrest, and imaging and electrocardiogram examinations, was recorded in the 24 h preceding cardiac arrest (for patients who did not experience cardiac arrest, a random 24 h period was selected to collect data). In total, 45 features were selected as candidate features.

| Data preparation
The flow chart of the probability analysis is shown in Figure 1. We used imputation and discretization to clean the data and to deal with noise, missing values, and outliers. We discarded variables with 50% or more missing values. Some machine learning algorithms lose accuracy on imbalanced data, 19 as we observed in our study, so for model training we matched positive samples (event group) to randomly selected negative samples (non-event group).
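The preparation steps described above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the study code: the median imputation rule, the 1:1 matching ratio, the helper names (`drop_sparse_columns`, `impute_median`, `balance`), and the toy records are all assumptions made for illustration, since the text does not specify them.

```python
import random
from statistics import median

def drop_sparse_columns(rows, max_missing=0.5):
    """Remove features whose missing fraction is 50% or more."""
    n = len(rows)
    keep = [c for c in rows[0]
            if sum(r[c] is None for r in rows) / n < max_missing]
    return [{c: r[c] for c in keep} for r in rows]

def impute_median(rows, label="arrest"):
    """Fill remaining gaps with the column median (an assumed strategy)."""
    for c in rows[0]:
        if c == label:
            continue
        fill = median(r[c] for r in rows if r[c] is not None)
        for r in rows:
            if r[c] is None:
                r[c] = fill
    return rows

def balance(rows, label="arrest", seed=0):
    """Pair every event with one randomly drawn non-event (assumed 1:1)."""
    rng = random.Random(seed)
    events = [r for r in rows if r[label] == 1]
    controls = [r for r in rows if r[label] == 0]
    return events + rng.sample(controls, len(events))

# Toy records with hypothetical values; None marks a missing entry.
records = [
    {"heart_rate": 110, "hematocrit": None, "lactate": None, "arrest": 1},
    {"heart_rate": None, "hematocrit": 0.41, "lactate": None, "arrest": 0},
    {"heart_rate": 72, "hematocrit": 0.39, "lactate": None, "arrest": 0},
    {"heart_rate": 80, "hematocrit": 0.44, "lactate": 1.2, "arrest": 0},
]
clean = impute_median(drop_sparse_columns(records))
balanced = balance(clean)
```

In this toy example the hypothetical `lactate` column is missing in three of four records and is therefore dropped, while the single gaps in the other two columns are filled before the classes are balanced.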

| Feature engineering
We normalized the data and target values to the range 0-1.
A total of 45 candidate features describing the risk of cardiac arrest were collected in this study. Using all of these features would increase the computational burden and make the calculation very difficult. Therefore, correlation analyses were applied for feature selection to reduce the number of features.
All features with p < .01 and a correlation coefficient > 0 were considered to be associated with cardiac arrest. The XGBoost algorithm provided the importance score of each feature.
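As a rough illustration of this selection rule, the sketch below rescales each feature to 0-1, computes its Pearson correlation with the outcome, and keeps it when r > 0 and p < .01. The type of correlation and the way the p value was obtained are not stated in the text, so the Pearson coefficient and the Fisher z approximation used here are assumptions, as are the function names and the toy values.

```python
from math import atanh, erfc, sqrt

def min_max(values):
    """Rescale a feature to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def approx_p_value(r, n):
    """Two-sided p value from the Fisher z approximation (assumes |r| < 1)."""
    z = atanh(r) * sqrt(n - 3)
    return erfc(abs(z) / sqrt(2))

def select_features(features, target, alpha=0.01):
    """Keep features with r > 0 and p < alpha against the outcome."""
    selected = {}
    for name, values in features.items():
        r = pearson(min_max(values), target)
        if r > 0 and approx_p_value(r, len(target)) < alpha:
            selected[name] = r
    return selected

# Hypothetical toy data: 10 controls followed by 10 cases.
target = [0] * 10 + [1] * 10
features = {
    "troponin_i": [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.1, 0.2, 0.3, 0.2,
                   1.5, 2.0, 0.9, 1.8, 2.5, 1.1, 1.7, 2.2, 1.4, 1.9],
    "noise": list(range(1, 11)) * 2,
    "systolic_bp": [130, 125, 140, 135, 128, 132, 138, 127, 133, 129,
                    95, 100, 90, 105, 85, 98, 92, 101, 88, 96],
}
selected = select_features(features, target)
```

Because a Pearson coefficient is unchanged by min-max scaling, the rescaling step does not affect which features pass the filter; it is shown only because the text lists it as part of feature engineering. Note also that the r > 0 criterion, taken literally, discards inversely associated features such as the hypothetical `systolic_bp` column.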

| Model development
The dataset was randomly split into two sets: the training set (70% of participants) and the testing set (30% of participants). Eight machine learning algorithms were employed to develop cardiac arrest prediction models. XGBoost is a machine learning algorithm that assembles weak prediction models (typically decision trees) to yield satisfactory predictive results. 20 In a classification tree, the internal nodes represent tests on an attribute and the leaf nodes, which carry scores, represent a decision. Seven other state-of-the-art algorithms, including C4.5, random forest, logistic regression, support vector machine (SVM), back propagation (BP) neural network, Bayes, and K-nearest neighbor (KNN), were also used to construct models for early prediction of cardiac arrest.
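A simple random 70/30 split can be sketched as follows. The seed, the function name, and the choice to round the cut point down are assumptions; the participant identifiers are placeholders standing in for the 166 case and 521 control patients.

```python
import random

def split_70_30(ids, seed=42):
    """Shuffle participant ids and cut at 70%, rounding down."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Placeholder ids for 166 case patients plus 521 control patients.
participants = range(166 + 521)
train_ids, test_ids = split_70_30(participants)
```

This split is not stratified by outcome, so the proportion of events in each set can vary slightly from one seed to another.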

| Model comparisons
Model discrimination was assessed using the area under the receiver operating characteristic curve (AUC). Six other performance metrics, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score, were calculated to evaluate the performance of each model on the testing set. To evaluate the prediction capability of the machine learning models, we compared them with three existing risk scoring systems (GRACE, NEWS, and MEWS) using the same patient group.
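For reference, the seven metrics named above can be computed from predicted scores as in the following sketch. The 0.5 decision threshold, the function names, and the toy labels and scores are assumptions for illustration; the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by tracing the curve.

```python
def confusion_metrics(y_true, y_pred):
    """Threshold-dependent metrics from binary labels and predictions."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(y_true),
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }

def auc(y_true, scores):
    """Probability that a random event outscores a random non-event."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted risks for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]
y_pred = [int(s >= 0.5) for s in scores]   # assumed 0.5 cut-off
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Unlike the other six metrics, the AUC does not depend on the chosen threshold, which is one reason it is commonly used as the primary measure when comparing models.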

| Statistical analysis
Continuous variables are expressed as mean ± standard deviation or median with interquartile range, and categorical variables are expressed as frequency and percentage. Patient characteristics were compared using t-tests, Wilcoxon rank-sum tests, and χ² tests where appropriate. All p values were two-tailed, and p < .05 was considered statistically significant. Python 3.7 was used for all statistical analyses.
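As one example of these comparisons, a χ² test for a 2 × 2 table can be written with the standard library alone. This sketch uses the Pearson statistic without continuity correction, which is an assumption (the text does not say whether a correction was applied), and the counts are hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # With one degree of freedom the upper tail equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, e.g. a binary characteristic in two patient groups.
statistic, p_value = chi_square_2x2(10, 20, 30, 40)
```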

| Patient characteristics
This study consisted of 166 patients with cardiac arrest in the case group, and 521 patients without cardiac arrest in the control group.
We randomly assigned 480 of these participants (70%) to the training set, and the remaining 207 participants (30%) to the testing set (Figure 2). Patient characteristics in the training and testing sets are listed in Table 1, and there were no significant differences between

| Feature selection
After feature selection using correlation analyses, we reduced the number of features from 45 to 20. Machine learning models were then developed based on different combinations of these 20 features. Among these 20 features, we found that the number of days prior to the occurrence of cardiac arrest, cardiac troponin I, heart rate, and hematocrit were the four most important predictor features (Figure 3).

| Evaluation of machine learning models
Our model discrimination analysis showed that the XGBoost model   0.751-0.658]) (Figure S1A). These algorithms produced a specificity greater than 89.6%, a sensitivity ranging from 23.8% to 73%, and an F1 score ranging from 38% to 80%. In general, the XGBoost algorithm provided the best overall performance regarding the AUC, accuracy, sensitivity, NPV, and F1 score compared with the other algorithms, and the KNN algorithm generated the best performance for the specificity and PPV metrics (Table 2). However, the KNN algorithm produced unacceptably low sensitivity and AUC (Table 2).
After considering these scores, especially the AUC value, we chose XGBoost as the final prediction model. XGBoost is a boosted tree method in which each decision tree can be drawn, as shown in Figure S2.

| Comparison with existing risk prediction models
We next compared our prediction models with three commonly cited
The XGBoost model has been extensively used in a variety of data-mining fields for regression and classification due to its impressive accuracy and usability, [20][21][22] although little literature currently describes the use of XGBoost for predicting cardiac arrest.
In our study, the XGBoost algorithm showed promising performance and had better predictive power than the other machine learning models, with an AUC value of 0.958, a specificity of 95.8%, and a sensitivity of 73%. The reasons for the high performance of the XGBoost model are as follows: (1) during training, the XGBoost algorithm generated a series of decision trees in a gradient boosting manner, and produced the next decision tree based on the current one to better predict the outcome; (2) after training, a risk prediction system composed of a series of decision trees was achieved; and (3) during application, the output predicted risk was the cumulative score of each decision tree, which indicates the likelihood of the predicted outcome. Therefore, the XGBoost model we generated can effectively identify ACS patients at high risk of cardiac arrest and assist clinicians with making appropriate treatment decisions.
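The three steps listed above can be made concrete with a deliberately small sketch of gradient boosting for a binary outcome. It is a teaching example, not the study model and not the XGBoost library: each tree is reduced to a one-split stump on a single feature, and the number of rounds, the learning rate `eta`, the regularization term `lam`, and the heart rate values are assumptions chosen for illustration. Leaf scores follow the second-order (Newton) update that XGBoost-style boosting uses for log loss.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def fit_stump(x, grad, hess, lam=1.0):
    """Pick the split on x with the largest second-order gain."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        sides = ([i for i, v in enumerate(x) if v <= threshold],
                 [i for i, v in enumerate(x) if v > threshold])
        gain, leaves = 0.0, []
        for idx in sides:
            g = sum(grad[i] for i in idx)
            h = sum(hess[i] for i in idx)
            leaves.append(-g / (h + lam))   # Newton step for the leaf score
            gain += g * g / (h + lam)
        if best is None or gain > best[0]:
            best = (gain, threshold, leaves[0], leaves[1])
    return best[1], best[2], best[3]

def boost(x, y, rounds=20, eta=0.3):
    """Step 1: add stumps one at a time, each fitted to the current errors."""
    margin = [0.0] * len(x)
    trees = []
    for _ in range(rounds):
        prob = [sigmoid(m) for m in margin]
        grad = [p - t for p, t in zip(prob, y)]   # gradient of log loss
        hess = [p * (1 - p) for p in prob]        # second derivative
        threshold, left, right = fit_stump(x, grad, hess)
        trees.append((threshold, eta * left, eta * right))
        margin = [m + (eta * left if v <= threshold else eta * right)
                  for m, v in zip(margin, x)]
    return trees   # Step 2: the model is simply the list of trees

def predict_risk(trees, value):
    """Step 3: sum the leaf scores of all trees and map to a probability."""
    total = sum(left if value <= t else right for t, left, right in trees)
    return sigmoid(total)

# Hypothetical heart rates (beats/min) and outcomes for ten patients.
heart_rate = [62, 70, 75, 81, 88, 96, 104, 112, 120, 131]
arrest = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
model = boost(heart_rate, arrest)
low_risk = predict_risk(model, 62)
high_risk = predict_risk(model, 131)
```

Because every round fits the gradient of the loss under the current predictions, each new stump concentrates on the patients that the existing ensemble still misclassifies, which is the sense in which the next tree is produced "based on the current one".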
This prediction model will allow a monitoring alert system and lifesaving strategy to be implemented shortly before the occurrence of an adverse event.
In arrest. The GRACE risk score was developed using a logistic regression approach and was intended to predict short- and long-term mortality in ACS patients. 24,25 The most recent guidelines by international societies recommend that the GRACE risk score be used in practice as a risk stratification tool. 26,27 In the present study, we found that the performance of GRACE was superior to that of the SVM, BP neural network, and KNN algorithms. Thus, it appears that not every machine learning algorithm is superior to traditional scores, and that no single algorithm is the most accurate in every scenario. Comparisons of algorithms in different research areas and datasets may yield different results.
This study has limitations that need to be addressed in future studies. First, a common criticism of machine learning algorithms is that they are black boxes. Although we derived an XGBoost node graph, it still cannot be applied in a straightforward manner, which in turn may make clinicians wary of its clinical application. Second, the prediction model generated in this study was established based on limited data obtained from a Chinese population, and no external validation was performed. Therefore, the XGBoost model should be further evaluated using more data from other ethnic groups and regions in future studies.

| CONCLUSION
In this study, we developed and evaluated several machine learning algorithms for predicting cardiac arrest in ACS patients. We found that most of the algorithms, especially the XGBoost algorithm, showed promising performance and had better predictive power than the existing prediction systems, such as GRACE, NEWS, and MEWS. We suggest that the XGBoost model can be used as a complementary tool in medical decision-making for early intervention and prevention of cardiac arrest in ACS patients.

ACKNOWLEDGMENTS
This study was supported by grants from the Young and Middle-aged