Machine learning is a valid method for predicting prehospital delay after acute ischemic stroke

Abstract Objectives This study aimed to identify the influencing factors associated with long onset‐to‐door time and establish predictive models that could help to assess the probability of prehospital delay in populations with a high risk for stroke. Materials and Methods Patients who were diagnosed with acute ischemic stroke (AIS) and hospitalized between 1 November 2018 and 31 July 2019 were interviewed, and their medical records were extracted for data analysis. Two machine learning algorithms (support vector machine and Bayesian network) were applied in this study, and their predictive performance was compared with that of the classical logistic regression models after using several variable selection methods. Timely admission (onset‐to‐door time < 3 hr) and prehospital delay (onset‐to‐door time ≥ 3 hr) were the outcome variables. We computed the area under curve (AUC) and the difference in the mean AUC values between the models. Results A total of 450 patients with AIS were enrolled; 57 (12.7%) with timely admission and 393 (87.3%) patients with prehospital delay. All models, both those constructed by logistic regression and those by machine learning, performed well in predicting prehospital delay (range mean AUC: 0.800–0.846). The difference in the mean AUC values between the best performing machine learning model and the best performing logistic regression model was negligible (0.014; 95% CI: 0.013–0.015). Conclusions Machine learning algorithms were not inferior to logistic regression models for prediction of prehospital delay after stroke. All models provided good discrimination, thereby creating valuable diagnostic programs for prehospital delay prediction.


| INTRODUCTION
The Global Burden of Diseases (GBD) 2017 study listed stroke as one of the leading causes of death and adult disability worldwide; in 2017, stroke accounted for 11.02% of deaths (ranked second) and 5.29% of disability-adjusted life years (ranked third) (GBD 2017 Causes of Death Collaborators, 2018). A survey by China's Ministry of Health showed that ischemic stroke accounted for 77.8% of strokes (Wang et al., 2017). It has been proven that intravenous thrombolysis with recombinant tissue plasminogen activator (rt-PA) is a useful method for preventing death, reducing irreversible brain damage, and improving the long-term prognosis in patients with acute ischemic stroke (AIS; Wardlaw, Murray, Berge, & del Zoppo, 2014). However, it is only effective when given within a limited time after stroke onset (Emberson et al., 2014; Lees et al., 2016). Schwamm et al. (2013) reported that only 7.0% of patients with AIS in America were treated with rt-PA. In addition, a study in Australia reported that only 14.7% of the patients who arrived early received thrombolytic therapy (Ashraf, Ines, Christopher, Rabsima, & Beata, 2013). According to the Chinese National Stroke Registry, only 1.6% of the patients with acute strokes received rt-PA (Wang et al., 2011).
Previous systematic reviews reported that prehospital delay (from the onset of symptoms or the last known time without symptoms to the arrival at the hospital) accounted for the majority of treatment delay (Miriam, Robin, & Vincent, 2009; Pulvers & Watson, 2017). Many factors have been shown to influence the prehospital time, including personal demographic factors, such as age, gender, education, income, and place of residence (Pulvers & Watson, 2017; Song et al., 2015; Zhou et al., 2016); clinical factors, such as history of stroke or cardiovascular disease, patient health characteristics, stroke symptomatology, stroke etiology, the vascular area involved in the stroke, and stroke severity (Miriam et al., 2009; Pulvers & Watson, 2017; Sobral et al., 2019; Sommer et al., 2017); cognitive and behavioral factors, such as lack of attention to symptoms, stroke treatment awareness, and patient and bystander behavior (Pulvers & Watson, 2017; Zhou et al., 2016); and other factors, such as mode of transportation to the hospital, first visiting primary care facilities or a general practitioner, and referral from another hospital (Oostema, Konen, Chassee, Nasiri, & Reeves, 2015; Pulvers & Watson, 2017; Zhou et al., 2016). At present, no model is available for predicting the risk of prehospital delay after AIS. Such a model would have the potential to significantly reduce onset-to-door (OTD) time and improve the outcome for patients with AIS. Considerable effort and expertise are required in the multidimensional analysis of prehospital delay, and more complex methods need to be developed to support this complicated, preferably automated analysis (Wang, Wen, Lu, Yao, & Zhao, 2016).
Machine learning (ML) learns from observed data using a variety of artificial intelligence and statistical models to establish rational generalizations, discover patterns, classify unknown data, or predict new directions (Hosseinzadeh, Kayvanjoo, Ebrahimi, & Goliaei, 2013). ML methodologies, such as the Bayesian network (BN) and support vector machine (SVM), are being rapidly adopted in the medical field because they enhance the practicability of classification and prediction. Prediction models are used for various diagnosis and prognosis tasks in the medical field. Their use may help to reduce drug costs, improve clinical research, and promote better evaluation by physicians (Wang et al., 2016).
ML has been applied to predict the risk of death and functional outcomes after stroke (Park, Chang, & Nam, 2018). However, no published research has used ML to predict prehospital delay after stroke. This study aimed to identify the factors influencing the OTD time; to compare the performance of logistic regression (LR)-, BN-, and SVM-based models for predicting prehospital delay after AIS; and to develop a precise and effective model for predicting the risk of prehospital delay in high-risk populations that require intervention. Such a model could help to shorten the time to medical treatment in these patients, thereby reducing the delay and enabling them to receive timely and effective treatment.

| Study design and population
A cross-sectional survey was conducted with a convenience sample of patients from a tertiary hospital. Patients diagnosed with AIS, aged 18 years or older, who were hospitalized between 1 November 2018 and 31 July 2019 and who had undergone at least one brain scan by computed tomography or magnetic resonance imaging were included in the study. Patients diagnosed with transient ischemic attack or cerebral hemorrhage, those with symptom onset at nursing homes or hospitals, patients with cognitive impairment or inability to answer questions, and those who could not define the time of each interval were excluded from this study.

| Data collection
Following the study protocol, the hospital provided training for researchers. We reviewed the medical records of patients diagnosed with AIS. For the patients who met the inclusion criteria, the data on demographics, health conditions, medical history, laboratory analyses, neuroimaging results, and therapies were extracted from the medical records. Stroke subtypes were recognized based on the Oxfordshire Community Stroke Project classification. Subsequently, the patients were interviewed by two well-trained investigators in the hospital wards.
To determine whether there was prehospital delay, we asked each patient to state the time when he/she or someone else first noticed the symptoms, the time he/she set out for the hospital, the time he/she arrived at the hospital, and the time he/she was started on treatment, and we computed the delay time. If the time we calculated based on the patient's report differed from that listed in the medical records, the former was selected, because the attending physicians may not be as cautious as the investigators in inquiring about and recording the times. The last normal time was regarded as the onset time for patients who were uncertain of the time of symptom onset or had a wake-up stroke.

| Measurements
The questionnaire was designed to include known sociodemographic, clinical, cognitive, and behavioral factors. The questions were revised after a pilot test with 30 questionnaires.
In addition, the Stroke Premonitory Symptoms Alert Questionnaire developed by Zhang et al. (2015) was used to measure how the patients judged and decided whether to seek medical treatment when premonitory symptoms of stroke occurred. This questionnaire consists of nine items. Using a two-level scoring method, 1 point is assigned for the correct answer, and 0 points for the wrong answer. The higher the score of the questionnaire, the more alert the patients are to the premonitory symptoms of stroke.
The Stroke Knowledge Questionnaire designed by Yan and Yang (2017) was used to evaluate the patients' knowledge about stroke.
The questionnaire includes 6 dimensions of stroke symptoms, first aid treatment, risk factors, safe medication, healthy behavior style, and rehabilitation knowledge, with a total of 40 items. Each item is scored by a two-point method, that is, correct answers are given 1 point and incorrect or not known answers are given 0 points. The total score ranges from 0 to 40 points.

| Definition of prehospital delay
We defined prehospital delay as an onset-to-door time of 3 or more hours. National guidelines and consensus statements strongly recommend that intravenous rt-PA thrombolysis be given to patients with AIS within 4.5 hr of onset, with a door-to-needle time within 60 min (Kobayashi et al., 2017; Powers et al., 2018). A study in China reported that the mean in-hospital delay is more than 60 min (Wang et al., 2011). Thus, we considered that a prehospital delay of 3 hr or more would cause patients with AIS to miss the optimal time for thrombolytic therapy. In fact, most studies on prehospital delay have used 3 hr as a threshold (Nepal et al., 2019; Zhou et al., 2016).
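The outcome definition above can be illustrated with a minimal Python sketch. The function and constant names are hypothetical and are not taken from the study, which derived the times from patient interviews and medical records; only the 3 hr cutoff comes from the text.

```python
from datetime import datetime

# Cutoff from the text: OTD >= 3 hr is prehospital delay, OTD < 3 hr is timely admission.
THRESHOLD_HOURS = 3.0


def onset_to_door_hours(onset: datetime, arrival: datetime) -> float:
    """Return the onset-to-door (OTD) time in hours."""
    if arrival < onset:
        raise ValueError("arrival time precedes onset time")
    return (arrival - onset).total_seconds() / 3600.0


def is_prehospital_delay(onset: datetime, arrival: datetime) -> bool:
    """Classify one admission as delayed (True) or timely (False)."""
    return onset_to_door_hours(onset, arrival) >= THRESHOLD_HOURS
```

For a patient with an uncertain onset time or a wake-up stroke, the last normal time would be passed as `onset`, in line with the data collection rule described earlier.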

| Data analysis
The data extracted from the medical records and completed questionnaires were double entered into EpiData (version 3.1). SPSS Statistics 25.0 and SPSS Modeler 18.2.1 (IBM, Armonk, NY, USA) were used for statistical analysis. Patients were divided into timely admission group (OTD < 3 hr) and delayed admission group (OTD ≥ 3 hr). Quantitative variables were expressed as mean and standard deviation and comparative analysis was conducted using Student's t test. Qualitative variables were expressed as numbers and percentages, and comparative analysis was performed using the chi-square test or Fisher's exact test.

| Construction of the prediction models
In this study, the LR, BN, and SVM models were constructed using SPSS Modeler, and the default values in Modeler were applied for all unspecified parameter values. In addition, we used 10-fold cross-validation to validate each prediction model: the dataset was divided into 10 equal-sized parts, and the model was trained with nine parts and tested with the remaining part. We repeated the process until all data had been tested. Moreover, the 10-fold cross-validation was repeated 10 times to reduce the influence of random partitioning, and we used the mean as the final result (Cui, Wang, Wang, Yu, & Jin, 2018).
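The validation scheme described above can be sketched in plain Python as follows. This is an illustration only: the study used SPSS Modeler, whose partitioning details are not given in the text, and the function name, the seed, and the unstratified shuffling are assumptions.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        # Round-robin assignment keeps fold sizes within one sample of each other.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx


# 450 patients, 10 folds, 10 repeats: 100 train/test splits in total.
splits = list(repeated_kfold(450))
```

Each split trains on nine folds and tests on the tenth, so every patient is tested exactly once per repeat, and the 100 resulting performance values can be averaged.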

| LR
Since the outcome of the risk prediction model in this study has two discrete categories (timely/delayed), the option "Binomial" was selected in the logistic regression "Procedure" of the SPSS Modeler.

| BN
BN was designed for prediction and classification based on the Bayes theorem (Friedman, Dan, & Goldszmidt, 1997). BN can intuitively encapsulate the causal relationship between factors stored in the medical data; therefore, it is widely used for medical decision support (Park et al., 2018). In addition, due to the characteristics of conditional probabilities and logical inherence in decision support, BNs can provide interpretable classifiers (Letham, Rudin, McCormick, & Madigan, 2015). Moreover, any given node could be queried in BNs, which are more clinically practical than classifiers built based on specific outcome variables (Park et al., 2018).
There are two possible selections of structure for constructing BNs: Markov Blanket and Tree Augmented Naive Bayes model (TAN).
In order to find the most appropriate structure for predicting prehospital delay, we considered and compared the performance of BN models with different structures. Additionally, "Bayes adjustment for small cell counts" was chosen for parameter learning method.

| SVM
SVM is an ML algorithm with a good regularization attribute that is based on the structural risk minimization principle of statistical learning (Xiang et al., 2019). The SVM optimization process maximizes prediction accuracy and reduces over-fitting of training data. To improve the performance of the SVM model, we set the parameters as follows: the kernel function was the radial basis function with parameters C = 5 and γ = 0.161 (Xiang et al., 2019). The weights between generalization error and empirical error were represented by parameter C, and the shape of the separating hyperplane was controlled by parameter γ (Wang et al., 2016).
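To make the role of γ concrete, the sketch below evaluates the standard radial basis function kernel, K(x, y) = exp(−γ‖x − y‖²), in Python. It assumes that SPSS Modeler uses this common parameterization, which the text does not state; the function name is hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.161):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

Under this form, a larger γ makes the similarity between two patients fall off faster with distance, which produces a more flexible decision boundary, whereas C (not shown here) sets the penalty on training errors.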

| Prediction process
The whole process of establishing a model to predict prehospital delay for patients with AIS is shown in Figure 1. We extracted 64 variables from the dataset and implemented a data preparation process to filter the records that met the exclusion criteria or lacked outcome variables. Considering the practicability of the model (predicting the possibility of prehospital delay for high-risk groups), variables that are not suitable for prediction were not included in the model construction.
Considering that the performance of the model is largely dependent on choosing the appropriate variables, four different variable selection methods were tested for model construction in this study.
We first selected 27 variables based on expert opinions and previous studies (called "Prior knowledge"), a classical method that is still widely used (Van Os et al., 2018). In addition, three more variable selection methods were considered: (a) inclusion of variables with a p-value ≤ .05; (b) feature selection, which identifies the most important variables for prehospital delay by ranking importance based on the likelihood-ratio chi-square and selecting variables with importance greater than 0.95, implemented with the "Feature Selection Node" in SPSS Modeler; and (c) selection based on logistic regression forwards stepwise.
Next, we used LR, BN, and SVM to develop models for discriminating between the timely admission and delayed admission groups.
Each prediction model was validated by 10 × 10-fold cross-validation. The discrimination of all LR, BN, and SVM models was compared. We adopted the area under the receiver operating characteristic curve (AUC) to assess the models' performance (Hosmer & Lemeshow, 2013). Because of the 10 × 10-fold cross-validation, we obtained 100 results for each model. We calculated the mean AUC for all models. In addition, we compared the performance of the optimal LR model with that of the optimal ML model by calculating the difference of the mean AUCs, including the relevant 95% confidence interval (CI). The study purpose statement was read by all study subjects and each provided written informed consent. The patients' identity information was kept confidential.
FIGURE 1 Process of constructing models for predicting prehospital delay after stroke. AUC, the area under the curve of the receiver operating characteristic; BN_M, Bayesian networks built by Markov Blanket structure; BN_TAN, Bayesian networks built by Tree Augmented Naive Bayes model structure; LR, logistic regression; OTD, onset-to-door time; SVM, support vector machine
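The AUC used to assess discrimination can be read as the probability that a randomly chosen delayed patient receives a higher predicted score than a randomly chosen timely patient. The Python sketch below computes it directly from that definition; it is an illustration with a hypothetical function name, not the SPSS Modeler routine used in the study.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]  # e.g., delayed admission
    neg = [s for l, s in zip(labels, scores) if l == 0]  # e.g., timely admission
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups. Applying the function to each of the 100 test folds and averaging would give a mean AUC of the kind reported here.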

| Statistical characteristics
Our study included 450 patients with AIS. Overall, only 12.7% of them presented to the hospital within 3 hr from onset. Table 1 shows the delay rates according to the patients' characteristics and Table 2 shows the scores of the Stroke Premonitory Symptoms Alert Questionnaire and the Stroke Knowledge Questionnaire in the two groups. Notably, lower rates of delay were found in patients who lived in urban areas; had higher incomes; had commercial medical insurance; underwent physical examination more than once a year; had a previous stroke; had symptom onset in a public place or in a car; had sudden symptoms; had difficulty speaking or understanding speech, unilateral facial numbness or weakness, left arm weakness or numbness, or unconsciousness or fainting; were aware of stroke and regarded the symptoms as serious; knew the time window for intravenous thrombolysis for stroke; called emergency medical services instead of doing nothing; were accompanied by someone at the time of stroke onset; had bystanders who identified the stroke and suggested going to hospital; had symptom onset less than 5 km from the investigating hospital; had used an ambulance before and on this occasion; had anterior circulation strokes; or were more alert to the premonitory symptoms of stroke.

| Variable selection results
According to the results of the variable selection (Table 3), the logistic regression forwards stepwise filter method yielded better model performance while selecting the fewest variables. Therefore, the seven variables selected by the forwards stepwise filter method were used as the optimal variable subset for model construction. The selected variables were ranked according to importance: patient's knowledge about the time window of intravenous thrombolysis for stroke (0.30), patient's response when symptoms first appeared (0.26), referral from another hospital (0.15), place of residence (0.13), knowing someone who had a stroke (0.10), age (0.06), and number of children (3.30E-8).

| Comparison among prediction models
The predictive performance of the four models is shown in Table 3.
All models composed of the seven variables selected by forwards stepwise provide excellent discrimination, with mean AUCs ranging from 0.800 to 0.846. In addition, our results indicated that BN model built by TAN structure (BN_TAN) (mean AUC: 0.832) had a better diagnostic capability than BN model built by Markov Blanket structure (BN_M) (mean AUC: 0.800). The optimal ML model (BN_TAN, mean AUC: 0.832) and the optimal LR model (mean AUC: 0.846) had a similar discriminative power in predicting prehospital delay (difference of mean AUCs: 0.014; 95% CI: 0.013-0.015).
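The text does not state how the 95% CI for the difference of mean AUCs was computed. One simple possibility, shown here purely as an assumption, is a normal-approximation interval on the paired per-fold differences between two models; the function name is hypothetical.

```python
import math
import statistics


def mean_diff_ci(auc_a, auc_b, z=1.96):
    """Mean paired AUC difference with a normal-approximation 95% CI (assumed method)."""
    if len(auc_a) != len(auc_b) or len(auc_a) < 2:
        raise ValueError("need two equally long lists with at least two values")
    diffs = [a - b for a, b in zip(auc_a, auc_b)]
    mean = statistics.fmean(diffs)
    # Standard error of the mean difference across the paired folds.
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, mean - z * se, mean + z * se
```

Here `auc_a` and `auc_b` would each hold the 100 fold-level AUC values of one model, paired by fold. Because folds from repeated cross-validation share training data, the values are not independent, so an interval computed this way tends to be narrower than the true uncertainty.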

| DISCUSSION
It has been proven that intravenous thrombolysis with rt-PA is highly effective in reducing irreversible brain damage, preventing death, and improving the long-term prognosis (Wardlaw et al., 2014).
Controlled multicenter studies showed that alteplase is best administered within 3 hr of onset and that it is also useful for patients with AIS treated within 4.5 hr (Emberson et al., 2014; Lees et al., 2016). Time is of the utmost importance, and this may be the reason why the patient's knowledge about the time window of intravenous thrombolysis for stroke was the strongest predictor of prehospital delay. In fact, previous studies reported that knowledge about thrombolysis was independently associated with a lower rate of prehospital delay (Pulvers & Watson, 2017; Yanagida, Fujimoto, Inoue, & Suzuki, 2014). In addition, our results revealed that in 39.6% of the patients, the initial reaction was doing nothing and waiting for the symptoms to disappear, and the OTD time of these patients was longer than that of patients whose first reaction was to make emergency calls, go to hospital directly, or seek help from other people. Faiz, Sundseth, Thommessen, and Rønning (2014) and Zhou et al. (2016) also found that patients who held a wait-and-see attitude and waited for their symptoms to subside were prone to arrive late. Prior studies have indicated that referral from another hospital was one of the top three factors related to prehospital delay (Pulvers & Watson, 2017). In this study, we also found that referral from another hospital contributed to a long OTD time. Moreover, Yang et al. (2014) and Zhou et al. (2016) indicated that the place of residence was the major factor influencing prehospital delay, which was also found in our research. One study found that patients with relatives or friends who had suffered a stroke may be more concerned about stroke symptoms and better understand the importance of admission immediately after onset (Zhou et al., 2016). Consistent with the findings of previous studies (Jin et al., 2012; Song et al., 2015), we found that advanced age was related to shorter prehospital delays.
This may be due to younger patients not having a sense of urgency, while older patients are more likely to interpret symptoms as stroke and treat them as emergencies (Pulvers & Watson, 2017). It has also been reported that the lack of company when stroke symptoms occur for the first time may increase the prehospital delay (Jin et al., 2012; Zhou et al., 2016). In addition, we found that patients with fewer than one child are at higher risk of being alone at the time of stroke onset. Therefore, the patient's response when symptoms first appeared, referral from another hospital, place of residence, knowing someone who had a stroke, age, and number of children were selected as predictors of prehospital delay after stroke. Furthermore, the results may indicate that, in addition to the demographic variables in the electronic database, healthcare facilities should also consider collecting social and behavioral variables that are easily available in daily work, which could further improve the practicality and generalizability of prediction models.
In this study, we constructed models for prehospital delay prediction after stroke based on routinely available medical records and survey data. To enable the model to be used in high-risk populations, the variables we used are all available before stroke onset. We found that the optimal ML model and the optimal LR model performed similarly in predicting prehospital delay after stroke. For prediction of OTD ≥ 3 hr using variables available in high-risk people, all models showed good discrimination. This may reveal that prehospital delay of stroke depends on the presence of features in the variables selected by logistic regression forwards stepwise, such as whether patients know of the time window for intravenous thrombolysis after stroke, their response when symptoms first appear, and whether they are referred from another hospital.
It was anticipated at first that the ML (BN and SVM) would outperform the LR models because they can evaluate a large number
Table notes: Bold values indicate a p-value ≤ .05, indicating that the difference is statistically significant. Abbreviations: NIHSS, National Institutes of Health Stroke Scale; OCSP, Oxfordshire Community Stroke Project. a, Manual includes those engaged in construction, farming/forestry/fishing and related, installation and related, manufacture and production, transportation and driver occupations; Nonmanual includes management, service, professional, commercial, and administration. b, Defined as illiterate or having only finished primary education.
And we speculate that the strengths of applying ML algorithms to the prediction of prehospital delay may be more fully verified in a larger population since ML, when compared to traditional statistical methods, has advantages in handling large-scale and high-dimensional datasets.
The strengths of this study include the standardized collection of patient data and variable selection methods. In many studies, ML algorithms were compared with LR only using variables selected by prior experience, and the performance of ML was better than that of LR (Decruyenaere et al., 2015;Kop et al., 2016

TABLE 3 Results of variable selection
predictive performance; to compensate for this, we used 10 × 10-fold cross-validation, which is regarded as an effective method (Krstajic, Buturovic, Leahy, & Thomas, 2014). Moreover, the cross-validation yielded 100 performance estimates for each model, and we compared the models on the mean of these estimates, which makes our results more reliable.
The reasons behind a prediction need to be understandable by physicians and patients; thus, interpretability is a core requirement for ML models in the medical field (Park et al., 2018).
Due to the high incidence, high mortality, high disability, and low thrombolysis rates of stroke, studying prehospital delay in patients with stroke is critical for both policy development and clinical care. Therefore, the prediction models of prehospital delay need to meet the requirements of high specificity and interpretable results. In addition, according to the identified influencing factors, this study could also help healthcare personnel to provide guidance to patients to reduce the OTD time, thereby helping patients to receive timely and effective treatment.

| LIMITATIONS AND FUTURE DIRECTION
There are several limitations in the present study; prospective studies in predictive modeling need to be improved.
First, convenience samples may be biased because individuals who choose to participate in the study may not fully represent the population from which the sample has been drawn. Nevertheless, this choice was justified as it provided a representative sample of AIS. A low response rate is a limitation. Declining response rates have been recognized among patients. Poor rates cause nonresponse bias, which may seriously affect the validity of the study in terms of generalizability and applicability of the findings. However, this study well exceeded the required number of responses estimated by power analysis, providing representative samples of AIS patients.
Second, we only examined the effects of individual variables but we did not study the relation between variables and the nature of direct or indirect influencing factors. In the future, it is necessary to study how variables affect predictability through detailed univariate analysis and identification of the meaning.
Third, in our research, we used the same data as the training data and test data used for cross-validation. In the future, in case of a larger sample size, we will ensure that the training data differ from the test data in advance to obtain more exact results.
How to implement these models in clinical practice is an important question we need to solve in the future. In order to promote prehospital delay prevention, we can develop a user-friendly, foolproof, web-based clinical support system based on optimal ML algorithms to achieve "real-time individualized feedback," which can be accessed by means of mobile devices or personal computers. This universal design could facilitate and promote use in busy clinical settings including visits to out-patient clinics, in-patient consultation, or quick assessment by nonphysician users. For healthcare professionals, identification of patients who are prone to prehospital delay after stroke has the potential to enhance disease control and management by allowing for tailored interventions that significantly improve the allocation of social-related resources. From the standpoint of policy makers, the system provides a less expensive method for conducting a beneficial evaluation.

| CONCLUSION
In this study, we identified the important factors that affect the early admission of patients with stroke and we evaluated the performance of LR, BN, and SVM models in predicting prehospital delay after stroke.
We found that ML algorithms were not inferior to conventional LR in recognizing the key variables, thus creating a valuable diagnostic procedure for prehospital delay prediction in high-risk groups for stroke.
For these models to be used in daily routine, some work still needs to be done; nevertheless, this work opens new lines of research.

ACKNOWLEDGMENTS
The authors would like to thank all study participants and interdisciplinary healthcare team members from the Second Affiliated Hospital of Harbin Medical University in Harbin city for this survey research.

CONFLICT OF INTEREST
The authors have no conflicts of interest to report.

AUTHOR CONTRIBUTION
LY, QL, and QZ contributed to the conception and design of the study. QL organized the database and performed the statistical analysis. LY wrote the manuscript. LY, QL, QZ, XZ, and LW contributed to data interpretation and revising the manuscript.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/brb3.1794.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.