Time‐domain heart rate variability features for automatic congestive heart failure prediction

Abstract Aims Heart failure is a serious condition that often goes undiagnosed in primary care due to the lack of reliable diagnostic tools and the similarity of its symptoms to those of other diseases. Non-invasive monitoring of heart rate variability (HRV), which reflects the activity of the autonomic nervous system, could offer a novel and accurate way to detect and manage heart failure patients. This study aimed to assess the feasibility of using machine learning techniques on HRV data as a non-invasive biomarker to classify healthy adults and those with heart failure. Methods and results We used digitized electrocardiogram recordings from 54 adults with normal sinus rhythm and 44 adults suffering from congestive heart failure, categorized into New York Heart Association classes 1, 2, and 3. All recordings were sourced from the PhysioNet database. Following data pre-processing, we performed time-domain HRV analysis on all individual recordings, including the root mean square of the successive differences between adjacent RR intervals (RRi) (RMSSD), the standard deviation of RRi (SDNN, where NN stands for natural or sinus intervals), the standard deviation of the successive differences between adjacent RRi (SDSD), the number or percentage of successive RRi differing by more than 50 ms (NN50 and pNN50), and the average value of RRi [mean RR interval (mRRi)]. In our experimental classification performance evaluation, we optimized hyperparameters and performed five-fold cross-validation on the computed HRV parameters using four machine learning classification algorithms: support vector machine, k-nearest neighbour (KNN), naïve Bayes, and decision tree (DT). We evaluated the prediction accuracy of these models using precision, recall, specificity, F1 score, and overall accuracy. For added insight, we also presented receiver operating characteristic (ROC) plots and area under the ROC curve (AUC) values.
The overall best accuracy of 77% was achieved when KNN and DT were trained on computed HRV parameters with a 5 min time window. KNN obtained an AUC of 0.77, while DT attained 0.78. Additionally, in the classification of severe congestive heart failure, KNN and DT had the best accuracy of 91%, with KNN achieving an AUC of 0.88 and DT obtaining 0.92. Conclusions The results show that HRV can accurately predict severe congestive heart failure. The findings of this study could inform the use of machine learning approaches on non-invasive HRV to screen individuals for congestive heart failure in primary care.


Introduction
Congestive heart failure is a progressive clinical syndrome characterized by exercise intolerance and/or signs of congestion in the presence of a cardiac condition. It is associated with high morbidity and mortality rates. 1,2 The underlying causes of congestive heart failure include structural abnormalities of the heart, functional impairments, and various contributing factors, such as hypertension, valvular heart disease, uncontrolled arrhythmias, myocarditis, and congenital heart disease. 3 Additionally, a significant issue associated with the severity of congestive heart failure is end-stage heart failure. 4 Globally, there has been a rapid increase in the prevalence of congestive heart failure and its associated health burden, particularly among older individuals and people living in low- to middle-income countries. 5 Currently, there are approximately 64.34 million cases of congestive heart failure worldwide, equivalent to 8.52 per 1000 inhabitants, contributing to a loss of 9.91 million years due to disability. 5 Furthermore, congestive heart failure is known to present a range of complications, including arrhythmias (such as atrial fibrillation, ventricular arrhythmias, and bradyarrhythmia), thromboembolism (leading to conditions like stroke, peripheral embolism, deep venous thrombosis, and pulmonary embolism), gastrointestinal issues (including hepatic congestion, hepatic dysfunction, and malabsorption), musculoskeletal problems (such as muscle wasting), and respiratory challenges (involving pulmonary congestion, respiratory muscle weakness, and pulmonary hypertension). 6 Heart failure, a prognostically severe syndrome, may remain undetected in over half of the cases, with up to 76% of these undiagnosed cases involving patients with preserved ejection fraction. 7
Traditionally, congestive heart failure is diagnosed by physicians upon the onset of symptoms through a combination of physical examination, a review of the patient's medical history, and various diagnostic tests. These tests include, but are not limited to, a complete blood count, urinalysis, a complete metabolic profile assessing serum electrolyte levels (including calcium and magnesium), blood urea nitrogen, serum creatinine, glucose levels, fasting lipid profile, liver function tests, and thyroid-stimulating hormone evaluation to detect abnormal left ventricle and/or heart valve function. 4,8 The process of diagnosing congestive heart failure is often time-consuming, requires specialized skills, and is associated with high costs. 4 Detecting heart failure at an earlier stage could enable timelier interventions, help address disparities, and reduce disease progression, ultimately leading to decreased morbidity. 9 However, although symptoms may persist for several months, many initial heart failure diagnoses occur in acute care settings. 9 Furthermore, non-invasive monitoring of symptoms could facilitate the early detection of heart failure and the development of efficient patient management strategies. 10
Heart rate variability (HRV) is an indicator of the autonomic nervous system's activity. 11,12 It serves as a measurable marker for cardiovascular disease 13 and other chronic diseases, including diabetes, inflammation, obesity, and psychiatric disorders. 12,14 HRV refers to the variation in time intervals between successive heartbeats, termed RR intervals (RRi). These intervals represent the time elapsed between two consecutive R-waves of the QRS complex on the electrocardiogram (ECG). 11 Congestive heart failure is often associated with autonomic dysfunction, which can be quantified through HRV measurements. 15 Therefore, HRV has the potential to be an effective non-invasive technique for detecting heart failure.
Machine learning models offer the potential to make significant contributions to early diagnosis by creating models capable of quantifying the complex physiological interactions between HRV and health risks. 16 Machine learning methods have been applied to various heart failure-related tasks, such as detection of heart failure from patient datasets, prediction of hospital readmissions, mortality prediction, and the classification and clustering of heart failure cohorts into subgroups with distinctive features and responses to heart failure treatments. 17 However, there is a scarcity of studies examining the effectiveness of machine learning applied to HRV for the detection of individuals with heart failure. Therefore, this study aims to address this gap by evaluating the application of machine learning algorithms in distinguishing between healthy adults and individuals with congestive heart failure using HRV data derived from ECG signals.

Methods
In this study, datasets consisting of ECG recordings from adult subjects with normal sinus rhythm and congestive heart failure were used for the experimental evaluation.

Dataset
All the datasets used in this study are standard datasets available on the PhysioNet portal, which can be freely accessed for research purposes. 18 Specific details regarding the datasets, including the normal sinus rhythm RRi database, 18 the congestive heart failure RRi database, 18 and the Beth Israel Deaconess Medical Centre (BIDMC) congestive heart failure database, 19 are provided on the PhysioNet portal alongside the datasets.
The normal sinus rhythm RRi database includes beat annotation files for 54 adults (men: 30 and women: 24, age range: 28.5-76 years). 18 The original ECG recordings were digitized at a rate of 128 samples per second, and the beat annotations were obtained through automated analysis with manual review and correction. 18 The congestive heart failure RRi database includes the beat annotation files for 29 long-term ECG recordings of adults (men: 8, women: 2, and gender unknown: 21, age range: 34-79 years) with congestive heart failure [New York Heart Association (NYHA) classes 1, 2, and 3]. 18 The original ECG recordings were digitized at 128 samples per second, and the beat annotations were obtained through automated analysis with manual review and correction. 18 Finally, the BIDMC congestive heart failure database includes long-term ECG recordings from 15 subjects (men: 11 and women: 4, age range: 22-71 years) with severe congestive heart failure (NYHA classes 3 and 4). 19 The original ECG signals were digitized at 250 samples per second, with 12 bit resolution over a range of ±10 mV. The beat annotations were prepared using an automated detector and have not been corrected manually. 19
Table 1 provides the details of the participants' demographics. The normal sinus rhythm group comprises 54 adults (men: 30 and women: 24, age: 61.36 ± 11.52 years). The congestive heart failure ECG recordings were combined from the congestive heart failure RRi database 18 and the BIDMC congestive heart failure database. 19 The congestive heart failure group comprises 44 adults (men: 19, women: 6, and gender unknown: 19, age: 55.51 ± 11.3 years). Participants are identified as 'nsr0XX' in the normal sinus rhythm RRi database, 'chf2XX' in the congestive heart failure RRi database, and 'chfXX' in the BIDMC congestive heart failure database, where 'XX' denotes the participant's sequence number, ranging from the first participant to the last participant in each specific dataset.
The beat-annotated ECG compressed files were downloaded from the open-access database. 18,19 The WFDB software package, which is part of the freely available PhysioToolkit software, was used to perform operations on the extracted ECG annotated files. 18 Initially, the compatibility of the ECG annotated files was checked. Subsequently, essential information such as age, gender, NYHA classification, and recording details was extracted. Each annotation file was read, and the corresponding RRi files were generated and grouped into healthy and congestive heart failure data for analysis.

Data pre-processing
Two common reasons for noise in ECG recordings are lack of contact between the sensor and the participant's skin and rapid movement by the participant, which can cause the sensor to either produce additional spikes or miss some spikes. 20 To mitigate these issues and ensure data quality, the initial and final 10 min of each recording were discarded to allow for set-up time. Also, ectopic beats, that is, RRi shorter than 300 ms (i.e. 200 b.p.m.) or longer than 1300 ms (i.e. 46 b.p.m.), were eliminated from the recordings. 21 A moving average with a window size of 10 neighbouring data points was chosen for smoothing. This window size strikes a balance between reducing amplitude and variance, effectively smoothing the captured signals. It helps mitigate the effects of missing data points, abnormal values, and misidentified R peaks. 20
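The two filtering steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `drop_out_of_range` and `moving_average` are hypothetical, the interval bounds are treated as inclusive, and the smoothing is assumed to be a simple trailing mean that emits one value per full window of 10 points (the source does not state how window edges were handled).

```python
def drop_out_of_range(rri_ms, low=300.0, high=1300.0):
    """Keep RR intervals (ms) within [low, high]; shorter or longer ones are dropped."""
    return [rr for rr in rri_ms if low <= rr <= high]


def moving_average(values, window=10):
    """Simple moving average over `window` neighbouring points.

    Assumption: only full windows are averaged, so the output has
    len(values) - window + 1 elements (empty if the input is too short).
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    if len(values) < window:
        return []
    running = sum(values[:window])
    smoothed = [running / window]
    for i in range(window, len(values)):
        # Slide the window by one point: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        smoothed.append(running / window)
    return smoothed


if __name__ == "__main__":
    raw = [812, 790, 250, 805, 1450, 798, 801, 795, 810, 803, 799, 807, 802]
    cleaned = drop_out_of_range(raw)
    print(len(raw), len(cleaned), moving_average(cleaned, window=10))
```

Filtering before smoothing keeps the implausible intervals from leaking into neighbouring averaged values.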

Signal processing and feature extraction
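This study employed time-domain HRV analysis, which is considered sufficient when compared with frequency-domain analysis. 12 Moreover, as standard HRV analysis is typically performed on 5 min RRi series, 22 the filtered RRi data were segmented into 5 min series for each individual recording. To conduct the HRV analysis, the study utilized the 'hrv' module, a Python package designed for HRV analysis. 23-25 RMSSD reflects the beat-to-beat variance in heart rate; it is the primary time-domain measure used to assess changes reflected in HRV and represents the short-term variability between RRi:
$$\mathrm{RMSSD} = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N-1}\left(\mathrm{RRi}_{j+1}-\mathrm{RRi}_{j}\right)^{2}} \tag{1}$$
where $N$ is the count of RRi values and $\mathrm{RRi}_{j}$ is the $j$th RRi value.
The SDNN provides information on short- and long-term variability of the signal and could predict both morbidity and mortality:
$$\mathrm{SDNN} = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N}\left(\mathrm{RRi}_{j}-\overline{\mathrm{RRi}}\right)^{2}} \tag{2}$$
where $N$ is the count of RRi values, $\mathrm{RRi}_{j}$ is the $j$th RRi value, and $\overline{\mathrm{RRi}}$ is the average value of the RRi series.
The SDSD is the standard deviation of the successive differences between adjacent RRi:
$$\mathrm{SDSD} = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n-1}\left(D_{i}-\overline{D}\right)^{2}} \tag{3}$$
where $i$ is the interval index, $n$ is the number of total intervals, $D_{i} = \mathrm{RRi}_{i+1}-\mathrm{RRi}_{i}$ is the successive difference between RRi, and $\overline{D}$ is the mean of the successive differences.
The pNN50 quantifies the percentage of successive intervals differing by more than 50 ms ($\mathrm{nRRi}_{50}$) relative to the total number of RRi ($\mathrm{nRRi}$):
$$\mathrm{pNN50} = \frac{\mathrm{nRRi}_{50}}{\mathrm{nRRi}} \times 100 \tag{4}$$
The mean value of the RRi after pre-processing is computed as
$$\mathrm{mRRi} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{RRi}_{i} \tag{5}$$
where $N$ is the number of elements in the RRi series and $\mathrm{RRi}_{i}$ is the $i$th element in the RR time series.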
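For readers who want to check the definitions numerically, the following standard-library sketch computes the time-domain parameters for one RRi segment. It is an illustration under stated assumptions, not the 'hrv' package's implementation: the function name `time_domain_features` is hypothetical, the standard deviations use the sample form (divisor one less than the number of values), and pNN50 is normalised by the total number of RRi as in the text (some libraries divide by the number of successive differences instead).

```python
import math
from statistics import mean, stdev


def time_domain_features(rri_ms):
    """Time-domain HRV parameters for a list of RR intervals in milliseconds."""
    if len(rri_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Successive differences between adjacent RR intervals.
    diffs = [b - a for a, b in zip(rri_ms, rri_ms[1:])]
    # NN50: count of successive differences whose magnitude exceeds 50 ms.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "rmssd": math.sqrt(mean([d * d for d in diffs])),
        "sdnn": stdev(rri_ms),          # sample standard deviation (assumption)
        "sdsd": stdev(diffs),           # sample standard deviation (assumption)
        "nn50": nn50,
        "pnn50": 100.0 * nn50 / len(rri_ms),  # denominator as stated in the text
        "mrri": mean(rri_ms),
    }


if __name__ == "__main__":
    segment = [800, 860, 800, 860, 800]
    for name, value in time_domain_features(segment).items():
        print(f"{name}: {value:.2f}")
```

In the study's workflow, a function of this kind would be applied to each 5 min segment, yielding one feature vector per segment.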

Machine learning approaches
This study used supervised classification machine learning algorithms to classify healthy and congestive heart failure participants. Additionally, utilizing robust off-the-shelf software, including machine learning models, allows manufacturers to focus on developing the application software necessary to run device-specific functions in a medical device. 26 While deep learning has yielded promising results, it still faces several unresolved challenges in clinical healthcare applications, including issues related to data volume, data quality, the varying nature of disease, domain complexity, and interpretability. 27 Therefore, given the potential benefits of the findings in developing a home-based medical device for congestive heart failure screening, this study used off-the-shelf machine learning models with high interpretability suitable for small and medium datasets.
The classification problem was addressed by training algorithms on the HRV data, with all programming implemented using the Python Scikit-learn library. 28 The classification algorithms utilized in this study include support vector machine (SVM), k-nearest neighbour (KNN), naïve Bayes (NB), and decision tree (DT). 25 SVM creates a decision hyperplane to separate different classes. 25 In Scikit-learn, SVM was implemented using C-support vector classification. 29,30 The KNN method selects the most common class among the k 'neighbours' of the object, the Gaussian NB applies Bayes' theorem, and the DT builds a model around a sequence of Boolean queries. 30 To assess the effectiveness of machine learning models, cross-validation is performed. Cross-validation verifies how well the model can predict unseen data to determine whether the model is underfitting, overfitting, or well generalized. 31 A common cross-validation technique is k-fold validation, where the parameter k indicates the number of folds or sections that a given dataset is split into. During each fold, the machine learning model is trained using k − 1 folds and validated using the remaining fold, resulting in k scores (accuracy). 31 This study used five-fold cross-validation to evaluate the classification models. The folds were stratified based on the computed 5 min HRV parameters; that is, HRV parameters computed for an individual's 5 min duration are completely in the training set or completely in the test set. 32 The five-fold cross-validation experiments involved random splits into five folds. However, to address the challenge associated with severely imbalanced datasets, which could cause some folds to lack elements from all classes, the stratified cross-validation method is used. This method preserves the percentage of samples from majority and minority classes when splitting the dataset into k folds. 33
The stratified five-fold cross-validation ensures that the proportion of instances (healthy and heart failure recordings) is preserved in each partition. Consequently, a model is expected to accurately predict previously unseen HRV parameters as healthy or congestive heart failure during the testing phase.
The hyperparameters for each classifier were optimized through random search with repeated five-fold cross-validation. Furthermore, to assess the prediction accuracy, the cross-validation process was repeated five times for each model on the dataset. This repetition aimed to obtain reliable performance results, and the mean of each metric is reported. 32
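To make the stratification idea concrete, here is a library-free sketch of a stratified k-fold split. It is only an illustration: the function name `stratified_k_fold` and the `seed` parameter are hypothetical, and in a Scikit-learn workflow the same behaviour would normally come from `StratifiedKFold` (or `RepeatedStratifiedKFold` for the repeated variant) rather than hand-written code.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k stratified folds.

    Indices of each class are shuffled and dealt round-robin into the k
    folds, so every fold keeps roughly the class proportions of the whole set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


if __name__ == "__main__":
    # Toy labels: 0 = healthy segment, 1 = congestive heart failure segment.
    labels = [0] * 60 + [1] * 40
    for train, test in stratified_k_fold(labels, k=5):
        positives = sum(labels[i] for i in test)
        print(len(train), len(test), positives)
```

Repeating the procedure with different seeds and averaging the per-fold scores corresponds to the repeated cross-validation described above.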

Evaluation
The prediction accuracy of classification models was determined using various performance criteria, including precision (equivalent to positive predictive value; Equation 6), recall (equivalent to sensitivity; Equation 7), specificity (Equation 8), F1 score (the harmonic mean of precision and recall; Equation 9), and overall accuracy (Equation 10). 34,35 To compute these performance parameters, weighted average measures were considered. Precision measures the relationship between the true positive (TP) predicted values and the false positive (FP) predicted values 35 and is identical to the positive predictive value, indicating the classifier's confidence when it identifies a person with a disease. 36
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{6}$$
Recall is the ratio of the total number of correctly classified positive instances to the total number of positive instances; that is, it is the number of class x cases correctly classified as class x divided by the total number of class x cases. 35 In two-class settings, as in this scenario, recall is equivalent to sensitivity, that is, the number of correctly classified positive cases divided by the total number of positive cases. 36
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{7}$$
where FN is a false negative. Specificity is the number of correctly classified negative cases divided by the total number of negative cases:
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{8}$$
where TN is a true negative. The F1 score allows for the comparison of two models, whether they have low precision and high recall or vice versa, by utilizing the harmonic mean: 35
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{9}$$
Accuracy is the rate of correctly classified instances: 35
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$
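As a rough illustration of how these five criteria follow from the confusion-matrix counts, the sketch below computes them for a single positive class. The function name `binary_metrics` and the example counts are hypothetical; the study reports weighted averages across both classes, which would require evaluating each class as the positive class in turn and weighting by class size.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Precision, recall, specificity, F1 and accuracy from confusion counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)            # sensitivity
    specificity = _ratio(tn, tn + fp)
    f1 = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the study.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```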
These performance metrics were chosen to evaluate binary classifiers on imbalanced datasets because they provide more informative and less misleading results compared with specificity and receiver operating characteristic (ROC) plots. 37 ROC plots are visual tools for assessing the performance of binary classification models, especially when evaluating their sensitivity and specificity across different decision thresholds. The area under the ROC curve (AUC) is a single metric that summarizes the performance of the model over all possible thresholds. An AUC value closer to 1 indicates better overall classification performance, while an AUC value close to 0.5 suggests a model that performs no better than random chance. Therefore, to provide additional insights, we have presented the ROC plots and AUC values for evaluation and comparison of classification performance among different models.
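The interpretation of AUC can be illustrated with its rank-based form: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below uses that pairwise definition; the function name `roc_auc` is hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    y_true holds 1 for the positive class and 0 for the negative class;
    scores are the classifier's outputs, where higher means more positive.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A classifier that assigns the same score to every case obtains 0.5 under this definition, matching the chance-level line in a ROC plot, while an AUC below 0.5 means the scores rank negatives above positives more often than not.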

Results
For each participant, the recordings underwent pre-processing, which involved removing the initial and final 10 min (referred to as set-up time). Subsequently, any recording segments containing ectopic beats were eliminated. Following this pre-processing, approximately 1173 h of recording data was retained for healthy subjects, while approximately 793 h of recording data was retained for participants with congestive heart failure. Among the congestive heart failure patients, around 525 h of recording data came from the congestive heart failure RRi database, and approximately 268 h was sourced from the BIDMC congestive heart failure database. In total, these recording durations were divided into approximately 14 076 five-minute segments for healthy participants and roughly 9516 five-minute segments for participants with congestive heart failure.
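The segment counts follow directly from the retained hours, since each hour holds twelve 5 min windows. The sketch below checks that arithmetic and shows one plausible way to cut an RRi series into 5 min windows by accumulated duration. Both function names are hypothetical, and the windowing rule (close a window once its accumulated RRi reach 5 min, discard a trailing incomplete window) is an assumption; the source does not describe how segment boundaries were drawn.

```python
def count_segments(hours, segment_min=5):
    """Number of complete segments of `segment_min` minutes in `hours` hours."""
    return int(hours * 60) // segment_min


def split_into_windows(rri_ms, window_ms=5 * 60 * 1000):
    """Group consecutive RR intervals (ms) into windows of about `window_ms`.

    A window is closed as soon as its accumulated duration reaches
    `window_ms`; a trailing incomplete window is discarded (assumption).
    """
    windows, current, elapsed = [], [], 0.0
    for rr in rri_ms:
        current.append(rr)
        elapsed += rr
        if elapsed >= window_ms:
            windows.append(current)
            current, elapsed = [], 0.0
    return windows


if __name__ == "__main__":
    healthy = count_segments(1173)
    chf = count_segments(525 + 268)
    print(healthy, chf, round(healthy / (healthy + chf), 3))
    print(len(split_into_windows([1000] * 650)))
```

The resulting split of roughly 60% healthy to 40% congestive heart failure segments is the class imbalance that motivates the stratified folds.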
Table 2 presents the computed statistical time-domain HRV parameters. The feature set considered for our analysis includes statistical time-domain HRV parameters, specifically RMSSD, SDNN, SDSD, pNN50, and mRRi.
We applied four machine learning classification algorithms to the obtained HRV parameters. Table 3 presents the classification accuracy of machine learning approaches when considering all congestive heart failure patients. The overall best performance was achieved by KNN and DT trained on HRV data with a time window length of 5 min, achieving an accuracy of 77%. The optimal hyperparameters for DT were the 'best' splitter, a max_depth of 9, and the gini criterion. This model exhibited a precision of 0.78, recall of 0.77, specificity of 0.79, and an F1 score of 0.77. Similarly, the optimal hyperparameters for KNN were distance-based weights, 20 neighbours, and the Manhattan metric. This KNN model demonstrated a precision of 0.77, recall of 0.77, specificity of 0.77, and an F1 score of 0.76.
SVM achieved an accuracy of 74% using the optimal hyperparameters of an rbf kernel, gamma set to 'scale', and a C value of 200. This SVM model exhibited a precision of 0.74, recall of 0.74, specificity of 0.74, and an F1 score of 0.73. On the other hand, NB achieved an accuracy of 47% with var_smoothing set to 1e-09. The NB model showed a precision of 0.69, recall of 0.47, specificity of 0.43, and an F1 score of 0.38.
Figure 1 presents the ROC curves for the various machine learning approaches when considering all congestive heart failure patients. SVM achieved an AUC of 0.74, indicating moderate discriminative ability in distinguishing between healthy and congestive heart failure patients. KNN performed slightly better with an AUC of 0.77, suggesting a relatively higher ability to classify the data correctly. In contrast, NB exhibited a poor AUC of only 0.45, suggesting that it struggled to effectively discriminate between the classes. DT, on the other hand, outperformed the other models, with the highest AUC of 0.78.
Early recognition of heart failure has the potential to reduce morbidity. Unfortunately, many cases of heart failure are diagnosed in acute care settings, after patients have already become clinically ill. 9 The NYHA classification is a widely used method for assessing heart failure severity, but it has faced criticism due to its subjective nature. 38 Therefore, in this study, the same machine learning classification algorithms are applied to classify participants as either healthy or suffering from congestive heart failure, based on NYHA classification. This approach is discussed in what follows.
In this experiment, we considered the congestive heart failure RRi database, which included patients from NYHA classes 1, 2, and 3, as the congestive heart failure patients. The congestive heart failure RRi database comprised 4 NYHA class 1, 8 NYHA class 2, and 17 NYHA class 3 patients. We also utilized the normal sinus rhythm RRi database for healthy adults. We applied the previously used machine learning classification algorithms to classify participants as either healthy or suffering from congestive heart failure. Table 4 summarizes the precision, recall, F1 score, and accuracy of the machine learning approaches for classifying healthy and congestive heart failure participants.
KNN and DT achieved the highest accuracy of 78%. KNN exhibited a precision of 0.77, recall of 0.78, specificity of 0.70, and an F1 score of 0.76. Similarly, DT had a precision of 0.78, recall of 0.78, specificity of 0.73, and an F1 score of 0.77. SVM achieved an accuracy of 75% with a precision of 0.74, recall of 0.74, specificity of 0.73, and an F1 score of 0.70. On the other hand, NB had a lower accuracy of 37%, a precision of 0.69, recall of 0.37, specificity of 0.32, and an F1 score of 0.27.
Figure 2 displays the ROC curves for various machine learning approaches when considering patients from NYHA classes 1, 2, and 3. We observed that SVM achieved an AUC of 0.73, suggesting reasonable performance in distinguishing between the classes. KNN demonstrated a slightly improved AUC of 0.74, indicating a marginally better ability to classify data points. In contrast, NB exhibited a poor AUC of 0.35, suggesting severe limitations in its ability to effectively differentiate between the two classes. DT outperformed the other models with the highest AUC of 0.75.
In the subsequent experiment, we considered the BIDMC congestive heart failure database, which included 15 NYHA classes 3-4 patients as the congestive heart failure patients. Additionally, we utilized the normal sinus rhythm RRi database for healthy adults. We applied the same machine learning classification algorithms as before to classify participants as either healthy or suffering from congestive heart failure.
Table 5 provides an overview of the precision, recall, F1 score, and accuracy of the machine learning classification algorithms in the classification of healthy and congestive heart failure participants. KNN and DT achieved the highest accuracy of 91%. KNN demonstrated a precision of 0.91, recall of 0.91, specificity of 0.85, and an F1 score of 0.90. Similarly, DT exhibited a precision of 0.90, recall of 0.91, specificity of 0.92, and an F1 score of 0.90. SVM achieved an accuracy of 90% with a precision of 0.90, recall of 0.90, specificity of 0.82, and an F1 score of 0.89. Finally, NB had an accuracy of 82% with a precision of 0.80, recall of 0.82, specificity of 0.53, and an F1 score of 0.80.
Figure 3 depicts the ROC curves for different machine learning approaches when focusing on patients from NYHA classes 3 and 4. SVM achieved an AUC of 0.86, indicating its commendable ability to classify data accurately. Similarly, the KNN model demonstrated a strong AUC of 0.88, suggesting a robust capability to differentiate between classes. Conversely, NB exhibited a poor AUC of 0.68, signifying limitations in its effectiveness for classifying data points. In contrast, DT outperformed all other models, with the highest AUC of 0.92.

Discussion
In this study, machine learning was applied to assess the potential of utilizing HRV computed from ECG signals for classification of healthy adults and individuals with congestive heart failure. The results indicate that off-the-shelf machine learning classification algorithms could classify healthy and congestive heart failure participants with an accuracy of 77% using the HRV parameters. The early detection of heart failure has the potential to enhance patients' quality of life through lifestyle modifications and necessary pharmacologic interventions that may slow disease progression. 39 Furthermore, recognizing heart failure at an early stage could lead to a reduction in both morbidity and mortality. 9 However, heart failure typically has an insidious onset, progressing slowly for many years without evident symptoms, with symptoms becoming apparent only in the later stages of the disease. 40 Therefore, there is a critical need for heart failure screening, identification of individuals at risk, and implementation of preventive measures to detect the condition at its initial stages. 40 The findings from this study may contribute to the development of non-invasive tools for the early detection of heart failure among at-risk populations.
Several studies have attempted to employ HRV in conjunction with machine learning for the purpose of heart failure detection. In a study involving 72 healthy participants (from the MIT-BIH normal sinus rhythm database) and 44 congestive heart failure patients (from the BIDMC database), an automated system to analyse HRV signals by extracting multimodal features to capture temporal, spectral, and complex dynamics was proposed. 41 The study evaluated congestive heart failure detection performance based on single and hybrid features comprising time-domain, frequency-domain, and non-linear measures. It obtained its highest DT performance with sensitivity (82%), specificity (82%), and accuracy (81.9%), and its highest SVM performance with a linear SVM, with sensitivity (96%), specificity (89%), and accuracy (93.1%). 41 A review observed that studies showed greater consensus concerning time-domain measures compared with frequency-domain measures. 42 Moreover, our study benefits from the simple methodology and explainability of the features utilized. Our study obtained the highest performance using DT with sensitivity (77%), specificity (79%), and accuracy (77%). Likewise, when using SVM, the highest detection performance was attained with sensitivity, specificity, and accuracy, all registering at 77%. Furthermore, considering NYHA classes 3-4 participants exclusively, our study obtained the highest performance using DT with sensitivity (91%), specificity (92%), and accuracy (91%). Similarly, using SVM led to the highest detection performance with sensitivity (91%), specificity (92%), and accuracy (91%). The findings suggest that HRV holds promise as a valuable non-invasive predictor for the detection of severe congestive heart failure.
The application of machine learning on HRV has the potential to assist in categorizing patients based on the NYHA classification system. A prior study demonstrated that a combination of HRV indices and machine learning algorithms could accurately classify patients into NYHA functional classes 1, 2, and 3; however, the evaluation was conducted with a relatively small sample size of 29 participants. 43 Our study further shows that classification performance is higher for NYHA classes 3-4 participants than for all congestive heart failure patients, including NYHA classes 1 and 2. These findings suggest that with further research and evaluation, HRV may hold the potential to effectively screen severe cases of heart failure.
Our study has limitations, and the results should be interpreted with caution. The healthy dataset consisted of 30 men and 24 women, while the congestive heart failure dataset had 19 men, 6 women, and 19 participants with missing gender details. The limited available data prevented us from conducting sex-specific evaluations, which are essential for a more comprehensive understanding of the observed differences in prognosis among heart failure patients. This study could not explore these aspects due to participants' medical conditions being reported as unknown. Furthermore, our study utilized a combined dataset of 98 participants, with 55% being healthy and 45% having congestive heart failure. This sample size is relatively small, and it is important to recognize that larger datasets and randomized controlled trials would be beneficial for validating the machine learning algorithms for real-world clinical utility, as suggested in previous research. 17 Future studies with access to more extensive and diverse datasets can help overcome these limitations and provide more robust insights into the potential applications of machine learning in heart failure detection and classification.
Current approaches to heart failure screening and treatment primarily rely on symptom-based assessments, which often result in underdiagnosis and undertreatment of heart failure in various healthcare settings. 44 Several factors contribute to the challenges in recognizing heart failure, including its misclassification as chronic obstructive pulmonary disease, deconditioning, age-related symptoms, or obesity due to overlapping clinical presentations. Additionally, the unavailability of echocardiography in primary care further complicates timely diagnosis. 7 These challenges underscore the need for more objective and rigorous management strategies that cover the entire spectrum of heart failure severity, ranging from mild to severe. 44
Furthermore, there is growing interest in leveraging smart home technologies to monitor and manage heart failure patients within their home environments. 45 Detecting heart failure at its earliest stages and harnessing the potential of current technology to monitor patients at home present opportunities for the development of home-based screening devices. The results of this study demonstrate the potential of machine learning in classifying healthy adults and patients with congestive heart failure based on HRV analysis. This objective measure can help categorize patients according to the severity of their condition. The ease of capturing HRV data and the application of machine learning algorithms for classification and insights suggest that this research could have significant clinical utility. As HRV data can be readily collected, there is potential for this study to contribute to the improvement of congestive heart failure screening, monitoring, and management practices.

Conclusions
In this study, using machine learning classification algorithms, we explored the potential of non-invasive HRV to detect and manage heart failure patients. Our investigation involved 54 individuals with normal sinus rhythm and 44 congestive heart failure patients categorized under NYHA classes 1, 2, and 3. The study demonstrated that KNN and DT, when trained on HRV parameters with a 5 min time window, achieved the highest overall accuracy of 77%, with KNN achieving an AUC of 0.77 and DT attaining 0.78. For the classification of severe congestive heart failure, KNN and DT reached an accuracy of 91%, with KNN achieving an AUC of 0.88 and DT obtaining an AUC of 0.92. These findings highlight the potential of HRV data combined with machine learning techniques as a non-invasive biomarker for heart failure classification and as a valuable tool for early detection and improved management in primary care settings. Also, as HRV data can be easily collected, there is potential for this study to contribute to the improvement of congestive heart failure screening, monitoring, and management practices.

Figure 1 (A-D) Receiver operating characteristic (ROC) curves of machine learning approaches considering all congestive heart failure patients. The blue dashed line represents the central line at 0.5. AUC, area under the ROC curve; DT, decision tree; KNN, k-nearest neighbour; NB, naïve Bayes; SVM, support vector machine.

Figure 2 (A-D) Receiver operating characteristic (ROC) curves of machine learning approaches considering the congestive heart failure RR interval database. The blue dashed line represents the central line at 0.5. AUC, area under the ROC curve; DT, decision tree; KNN, k-nearest neighbour; NB, naïve Bayes; SVM, support vector machine.

Figure 3 (A-D) Receiver operating characteristic (ROC) curves of machine learning approaches considering the Beth Israel Deaconess Medical Centre congestive heart failure database. The blue dashed line represents the central line at 0.5. AUC, area under the ROC curve; DT, decision tree; KNN, k-nearest neighbour; NB, naïve Bayes; SVM, support vector machine.

Table 2
Summary of variables. CHF, congestive heart failure; ECG, electrocardiogram; HRV, heart rate variability; MHR, mean heart rate; mRRi, mean RR interval; NN50, number of successive RR intervals differing by more than 50 ms; pNN50, percentage of successive RR intervals differing by more than 50 ms; RMSSD, root mean square of the successive differences between adjacent RR intervals; SDNN, standard deviation of RR intervals; SDSD, standard deviation of the successive differences between adjacent RR intervals.

Table 3
Classification accuracy of machine learning approaches considering all congestive heart failure patients. DT, decision tree; KNN, k-nearest neighbour; NB, naïve Bayes; SVM, support vector machine.
ESC Heart Failure 2024; 11: 378-389 DOI: 10.1002/ehf2.14593

Table 5