Multitask Learning for Automated Sleep Staging and Wearable Technology Integration

Scoring sleep stages is an essential procedure for the diagnosis of sleep disorders. Conventional sleep staging is a laborious and costly procedure requiring multimodal biological signals and an expert for the assessment. There has always been a demand for approaches that remove the need to undergo diagnostic procedures in specialized facilities and enable automated sleep staging. Herein, a high‐performance multitask learning model enabling high‐accuracy sleep staging using heart rate data is reported. The proposed algorithm exhibits superior performance with reduced computational resources in comparison with competing machine and deep learning algorithms when trained and evaluated using electrocardiography and photoplethysmogram (PPG) data. The reported model uses ≈7.5 times fewer parameters and ≈75% less input data than previously reported models and yields better or comparable performance (mean per night accuracy of 77.5% and Cohen's kappa of 0.643). To demonstrate its potential for wearable electronics, the reported algorithm is implemented in a fully integrated watch. The integrated watch is a stand‐alone, fully functional platform that automatically captures PPG data from the subject's wrist, predicts sleep stages, and displays the result on a screen as well as in an associated smartphone application.


Introduction
Sleep disorders are common worldwide; for example, one-third of the population in the United States suffers from them.[5] The most commonly used physiological method for clinical sleep diagnosis is polysomnography (PSG), which requires multiple resources to perform a comprehensive diagnosis. A standard PSG involves acquisition of electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), airflow, and other signals. The acquired data must then be analyzed by an expert, which demands substantial time and human effort. This makes PSG studies expensive and hard to manage, as they need to be carried out in specialized examination centers with trained technicians and limited-capacity data collection facilities.[6,7] Some PSG studies can be performed using portable kits, e.g., the home sleep apnea test used for obstructive sleep apnea diagnosis,[8] but such devices are mostly disease-specific and thus cannot be used for a comprehensive study of sleep disorders. Although PSG studies are considered the universally accepted gold standard for the study of sleep disorders, they are still susceptible to other shortcomings such as human error and probable scoring inaccuracies.[9] Moreover, disruption caused by the laboratory environment makes it difficult to capture a patient's normal sleep patterns, so the overall accuracy of the diagnosis carries a certain degree of uncertainty.[10] The acquisition of data required for sleep scoring can only be accomplished in specialized diagnostic centers with limited capacity, and the subjects need to stay under observation for a substantial period of time. These hindrances drive the demand for tools that avoid tedious procedures and facilitate on-spot automated sleep staging. An electronic device capable of probing the required biosignals and processing them to instantaneously estimate sleep stages appears to be a desirable solution to the problems under discussion.
The algorithms used for sleep scoring are trained using portions of the data utilized in PSG studies, aiming to achieve the same accuracy with fewer computational and expert resources. However, the requirement for various diagnostic data with minimal disruption remains a challenge.[15][18][19][20] Use of a suitable machine learning algorithm in combination with easy-to-acquire data, such as ECG/PPG, helps obtain the full sleep architecture of a patient with better efficiency.
This issue is critical in medical applications including sleep staging, as labeling patient data demands additional labor and time. Though sleep staging using deep learning (DL) and heart rate (HR) data has been demonstrated successfully, the performance of the models employed for this purpose is generally limited by issues such as data scarcity and computational complexity. The existing DL approaches generally require a large number of labeled data instances for accurate learning and are hence not suitable in scenarios where the available data are insufficient. These circumstances call for machine learning approaches that save computational resources and require reduced amounts of training and/or input data to enable sleep staging in situations involving data insufficiency.
Multitask learning (MTL) has been found helpful in scenarios where the amount of labeled data in a single task is insufficient to accurately train a model, as it aggregates the labeled data from multiple tasks to achieve accurate learning.[32][33][34][35][36] Combining MTL as a training method with neural networks and other model architectures is a popular approach to solving a diverse set of problems in the field of DL.[37][38][39][40] Sleep staging-related tasks have rarely been solved using MTL-enriched models, and there is room to boost the performance of sleep staging algorithms with MTL, specifically those employing ECG/PPG data.
This work reports an MTL-enhanced model enabling accurate sleep staging with a significantly reduced amount of input data, computational cost, and parameter size. To demonstrate its capability, the model presented in this work is fed with high-frequency ECG and low-frequency PPG data as input to predict sleep stages with the help of extracted parameters such as HR and HR variability (HRV). The model is fed with HR as a direct input and HRV as an auxiliary input. The mean per night accuracy of the prediction is as high as 73%, with a maximum Cohen's kappa value of 0.521, in the case of ECG data. The number of parameters and the computing resources used by our model are an order of magnitude lower than those of previously reported works using DL approaches. In the case of the PPG signal with a lower sampling frequency of f = 25 Hz, the mean per night accuracy and Cohen's kappa reach 69.76% and 0.545 for the cyclic alternating pattern (CAP) PPG dataset, and 77.5% and 0.643 for the Shin Kong Wu Ho-Su Memorial Hospital (SKH) database, respectively.
Wearable electronic devices are considered ideal solutions for on-spot monitoring and processing of biosignals. However, the limited amount of data collected directly from the human body often poses a challenge to the use of machine learning models in such devices. The reported MTL model makes accurate predictions with significantly reduced input data and hence can help overcome this limitation when used with wearable platforms. To demonstrate this capability, the reported MTL model has been implemented in an integrated watch, which probes HR data from the subject's wrist through an embedded PPG sensor and provides entirely automated remote sleep staging. In the subsequent text, we establish the competitiveness of the proposed model by evaluating its performance and describe the design and working of the reported integrated watch.

Model Description
We utilize a model architecture combining a recurrent neural network (RNN) with a 1D convolutional neural network, which takes one night of instantaneous HR (IHR) data as input to predict the sleep stages. The model comprises two distinct parts, namely, the HR encoder and the stage decoder. The HR encoder consists of multiple 1D convolutional layers, which are capable of learning the local temporal correlation of the IHR data. The second part of the proposed model, i.e., the stage decoder involving a two-layer bidirectional long short-term memory (Bi-LSTM), can properly learn the long-term correlation of the IHR data and predict the sleep stages. In addition, an HRV decoder is proposed, which exploits MTL to further improve the model accuracy. The HRV decoder is a two-layer fully connected network, which predicts the HRV data from the HR embedding output by the HR encoder.
The HRV sequences, usually thought to be a useful set of features for sleep staging, require extra code size for extraction, additional computational cost in the inference phase, and extensive implementation effort. Moreover, an improper HRV extraction step may discard some key information, leading to poor model performance. Using a model that directly takes IHR as input can significantly reduce the productization difficulty and avoid the loss of key information, as the IHR is simply a resampled peak-to-peak interval/R-wave-to-R-wave interval (PPI/RRI) sequence that carries more information than the HRV sequence. On the other hand, because IHR is a raw feature, the model requires more effort to learn the extraction of features and to model the correlation among them. As the effectiveness of HRV data in sleep staging problems is already established, we suggest that the HR encoder can be pretrained using HRV information to overcome this training difficulty. The working of the proposed model is devised with these considerations in view (details in Supporting Information).
Instead of training the HR encoder, HRV decoder, and stage decoder simultaneously, the training phase is split into two steps, referred to as pretraining and main training. In the pretraining step, only the HR encoder and the HRV decoder are trained, and HRV information is predicted using IHR data. After the pretraining step, the HRV decoder is dropped, and the parameters obtained by the HR encoder are kept for the next training step. The parameters obtained from the pretrained HR encoder are utilized in the main training step instead of those from a randomly initialized HR encoder. The model connects the pretrained HR encoder with the stage decoder, and the training is carried out using IHR data and stage labels. In the end, the model predicts the sleep stage to be in one of four categories, namely, wake, light, deep, and rapid eye movement (REM).
The presented model adopts a coordinated learning mechanism, as it uses information from multiple correlated tasks running in parallel. The purpose of introducing the MTL approach is to achieve efficient learning with a smaller number of parameters, reduced training data, and fewer computational resources. A schematic representing the working of the proposed MTL model is shown in Figure 1, along with that of the single-task learning (STL) model used to address the same problem. Figure 3a shows the confusion matrix associated with the sleep stages predicted by the proposed MTL model. The class-wise proportion of correct predictions is 70.5%, 71.8%, 80.1%, and 42.7% for wake, REM, light, and deep sleep, respectively. The model achieves a mean per night accuracy of 73% and a Cohen's kappa of 0.521. The standard deviation (STD) of the mean per night accuracy and Cohen's kappa is 10.26% and 0.171, respectively.
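The metrics quoted above (per night accuracy, Cohen's kappa, and their spread across nights) can be computed from confusion matrices as in the following minimal sketch. The confusion matrices are hypothetical placeholders rather than data from this work, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def accuracy(cm):
    """Fraction of epochs on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond chance between labels (rows) and predictions (columns)."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    p_observed = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if p_expected == 1.0:
        return 0.0  # degenerate case: only one class present on both sides
    return (p_observed - p_expected) / (1.0 - p_expected)

def per_night_summary(night_cms):
    """Mean and sample STD (assumed estimator) of the per night metrics."""
    accs = [accuracy(cm) for cm in night_cms]
    kappas = [cohen_kappa(cm) for cm in night_cms]
    return {
        "accuracy_mean": mean(accs), "accuracy_std": stdev(accs),
        "kappa_mean": mean(kappas), "kappa_std": stdev(kappas),
    }

# Hypothetical 4-class matrices (wake, REM, light, deep) for two nights.
night_a = [[50, 2, 8, 0], [3, 60, 15, 0], [10, 12, 300, 20], [0, 0, 30, 40]]
night_b = [[40, 5, 15, 0], [6, 45, 20, 0], [12, 18, 280, 25], [0, 0, 35, 30]]
summary = per_night_summary([night_a, night_b])
print(summary)
```

Averaging per night, rather than pooling all epochs, gives each recording equal weight regardless of its length.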

Model Performance
The MTL model presented in our work achieves the same level of accuracy and Cohen's kappa as the usual DL models while utilizing less training data and fewer computational resources. In Table 1, we compare the performance of our model with three recently reported works on sleep staging that employ DL models on the same dataset (CinC).[20,41,42] The proposed MTL model achieves an accuracy of 73% with significantly fewer parameters than the previous works. The reported model uses ≈7.6 and ≈8.8 times fewer parameters than refs. [20] and [41], respectively, while yielding similar accuracies. In terms of computational resources, the proposed MTL model runs 600 M floating point operations (FLOPs), an order of magnitude fewer than the previous works.[20,41] The numbers of parameters and FLOPs considered for comparison come from the testing phase. The proposed model effectively addresses data scarcity-related issues by reducing the required amount of input data by ≈75% (Figure 3b, inset) and achieves comparable or better accuracies than the previously reported models for sleep staging.
To demonstrate the HR model's effectiveness, we replace the HR encoder with an HRV encoder featuring 81 input HRVs in a two-layer fully connected network.However, the model utilizing the HRV encoder achieves only 71% accuracy and a kappa value of 0.49 (refer to Table 1), highlighting the challenge of surpassing the HR model's performance through manual feature engineering.

Integration of MTL with Wearable Technology
Wearable technologies have been in high demand for the past few decades for continuous acquisition of data reflecting the everyday activities, athletic performance, and health status of an individual.[56] PPG is one of the most commonly used HR tracking methods due to the ease of data acquisition, integrating light sources and sensors to detect the HR-driven volumetric changes in the human cardiovascular system. Compared to the ECG data (f = 200 Hz), the PPG signal can be acquired from wearable electronics with a lower sampling frequency (f = 25 Hz). Being data-efficient and compatible with PPG data, the MTL model introduced in this work can be implemented in wearable devices to provide accurate remote prediction of sleep stages. To demonstrate this potential, we present a fully integrated watch for on-spot sleep staging featuring the proposed MTL model. The integrated watch presented here probes the PPG signal using a commercially available sensor module (Goodix GH3020) embedded in its body. For the SKH database, the model produces an accuracy and Cohen's kappa of 77.5% (STD: 8.63%) and 0.643 (STD: 0.143), respectively. Figure 4g shows the confusion matrix associated with the sleep stages predicted by the proposed MTL model using PPG test data. The class-wise proportion of correct predictions is 77%, 75%, 76%, and 69% for wake, REM, light, and deep sleep, respectively. It is to be noted that the HR extracted from finger PPG and from wrist PPG is essentially identical, as the two signals are highly correlated. The statistical correlation between the finger PPG and wrist PPG data as well as the resultant HR patterns is shown in Figure 5a,b. The performance of the proposed model evaluated using finger PPG data therefore corresponds to a realistic estimate of the watch's performance.

Table 1. Comparison of FLOPs, number of parameters, and the size of input data for different sleep staging models utilizing HR-related data. Columns: This work (MTL), Ref. [20], Ref. [41], Ref. [42].

Conclusion
We report here a high-performance MTL model to perform automated sleep staging and demonstrate its potential for wearable electronic devices by implementing it in a fully integrated watch. The architecture of the implemented model adopts a coordinated learning approach and exhibits comparable performance to the competing approaches with significantly reduced amounts of input data, computational resources, and parameter size. With a 200 Hz ECG signal fed as input, the model outputs a mean per night accuracy of 73% and a Cohen's kappa of 0.521, with about an order of magnitude fewer parameters and FLOPs than DL models producing similar accuracies. Moreover, the reported model requires ≈75% less input data to achieve the same performance in comparison with its competing DL counterparts. Using 25 Hz PPG data as input, the proposed model produces a mean per night accuracy of 69.76% and a Cohen's kappa value of 0.545 for the CAP PPG dataset, and 77.5% and 0.643 for the SKH dataset. The potential of the reported algorithm for wearable technology integration is established by embedding it into a fully integrated watch. The integrated watch collects PPG data with the help of a PPG sensor embedded in the watchband, performs automated sleep staging through the proposed MTL model, and displays results on its screen as well as in an associated smartphone application. The data collected by wearable sensing platforms are usually limited; hence the model proposed in this work, being data-efficient, can be an effective solution for such devices.

Experimental Section
Algorithm: For each epoch, the local features are extracted from the 90 s HR signal by the HR encoder, which consists of seven convolutional blocks, two average pooling layers, a linear layer, and a rectified linear unit (ReLU) layer. Each convolutional block is a stack of a single 1D-convolution layer, a 1D-BatchNorm layer, and a ReLU layer. In the HR encoder, the kernel size and the padding are set to 1 and "same," respectively, for the first convolutional block and to 5 and "same" for all other convolutional blocks. Two average pooling layers downsample the input by factors of 4 and 3, making the output shape 15 × 16 (length × channel), which can be flattened to a 240-node vector. Finally, a linear layer transforms the flattened vector into a 64-node HR embedding.
The stage decoder employed to model the long-term temporal correlation between epochs comprises a two-layer, 64-node Bi-LSTM. For each epoch, the Bi-LSTM generates a 128-node vector, which is fed as input to a two-layer fully connected network with 32 and 4 nodes. The final output size for each epoch is 4, which can be interpreted as the predicted probability of the wake/REM/light/deep stages after being passed through a SoftMax layer.
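The tensor sizes quoted for the encoder and the stage decoder can be checked with simple bookkeeping, as in the sketch below. It assumes that each 90 s epoch is sampled at 2 Hz (the resampling rate given for the IHR series), so that an epoch holds 180 samples, and that the final encoder block has 16 channels. The function names and the example logits are hypothetical.

```python
import math

EPOCH_SECONDS = 90      # epoch length seen by the HR encoder
SAMPLE_RATE_HZ = 2      # assumed: IHR resampled to 2 Hz
POOL_FACTORS = (4, 3)   # two average pooling layers
CHANNELS = 16           # channels at the encoder output (from the 15 x 16 shape)
EMBEDDING_NODES = 64    # size of the HR embedding
LSTM_HIDDEN = 64        # nodes per direction in the Bi-LSTM
STAGES = ("wake", "REM", "light", "deep")

def encoder_shapes():
    """Return (length after pooling, flattened size, embedding size)."""
    length = EPOCH_SECONDS * SAMPLE_RATE_HZ  # 180 samples per epoch
    for factor in POOL_FACTORS:
        length //= factor  # "same"-padded convolutions keep the length unchanged
    return length, length * CHANNELS, EMBEDDING_NODES

def bilstm_output_size():
    """A bidirectional LSTM concatenates forward and backward hidden states."""
    return 2 * LSTM_HIDDEN

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_stage(logits):
    """Map the 4 decoder outputs to a stage name and its probabilities."""
    probs = softmax(logits)
    return STAGES[probs.index(max(probs))], probs

print(encoder_shapes())        # sizes after pooling, flattening, and the linear layer
print(bilstm_output_size())    # size of the per-epoch Bi-LSTM output
print(predict_stage([0.1, 2.0, 0.3, -1.0]))  # hypothetical decoder logits
```

Under these assumptions the bookkeeping reproduces the 15 × 16 output, the 240-node flattened vector, and the 128-node Bi-LSTM output stated in the text.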
Training: The overall training process of the presented model is split into a pretraining step and a main training step. For the pretraining step, the model adopts the mean squared error loss function and the Adam optimizer.
Two ECG datasets are used to train and validate the reported model. Data for 2056 nights from the Multi-Ethnic Study of Atherosclerosis (MESA) were used as the training set, whereas the Physionet Computing in Cardiology (CinC) dataset was utilized as the validation set. The experiment is repeated with low-sampling-frequency PPG data: 3050 nights from MESA and CinC are used to train the model, while data from a private sleep center at SKH are used to test it. An exemplary set of the input ECG-derived HR is shown in Figure 2a, while the labeled and predicted hypnograms are displayed in Figure 2b,c, respectively. The architecture of the neural network used for sleep staging in this work is shown in Figure 2d (details in Experimental Section).

Figure 1. Schematics showing the working mechanisms of the MTL (left) and STL (right) models used for sleep staging.

Figure 2. a) An example of the input ECG-derived HR signal; b) labeled and c) predicted hypnograms.d) Detailed architecture of the MTL-enhanced neural network used for sleep staging.

Figure 3. Performance of the proposed MTL model. a) Confusion matrix associated with the proposed MTL model when using the CinC dataset for testing. b) Accuracy of the MTL and STL models versus the required amount of training data with the SKH dataset. The inset displays the reduction in the required training dataset when using MTL in comparison with STL on the SKH dataset.
Figure 4a shows the full configuration of the watch, indicating the binding straps, PPG sensor module, power source, circuit board, and display panel. The integrated watch estimates and displays sleep stages when tied to the wrist. Exemplary readings of the watch in unpaired and paired situations are displayed in Figure 4b,c. An associated smartphone application is also developed, which can be used to monitor the estimated sleep stages. Figure 4d schematically depicts the acquisition, flow, and processing of data between the integrated watch and the smartphone application. The required data are collected by the inertial measurement unit and PPG sensors and processed by the embedded circuit, and the results are communicated to the smartphone with the help of a Bluetooth low energy module. An image of a subject wearing the reported integrated watch while sleeping is shown in Figure 4e. To estimate the performance of the proposed MTL model, high-frequency finger PPG data downsampled to 25 Hz are utilized for training. To highlight the discrepancy between high-sampling-frequency ECG and low-sampling-frequency PPG, the root-mean-squared successive differences (RMSSD) in HRV derived from both datasets are shown in Figure 4f. The RMSSD values were acquired in one night from one subject, and their locations on the respective axes correspond to the results from the ECG and PPG data. The RMSSD plot shows that low-frequency PPG data are as effective for predicting sleep stages as high-frequency ECG data. The MTL model fed with a 25 Hz PPG signal outputs an accuracy and Cohen's kappa of 69.76% (STD: 13.26%) and 0.545 (STD: 0.198), respectively, for the CAP PPG dataset.
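For reference, RMSSD is the square root of the mean of the squared differences between successive interbeat intervals. The sketch below computes it and illustrates how the sampling grid limits beat timing resolution. Snapping beat times to the nearest sample is a simplifying assumption made here for illustration, not the peak detection used in this work, and the beat times are hypothetical.

```python
import math

def rmssd(ibi_ms):
    """Root-mean-squared successive differences of interbeat intervals (ms)."""
    if len(ibi_ms) < 2:
        raise ValueError("RMSSD needs at least two interbeat intervals")
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def timing_resolution_ms(sampling_hz):
    """Spacing between consecutive samples, i.e., the beat timing granularity."""
    return 1000.0 / sampling_hz

def quantized_ibis_ms(beat_times_s, sampling_hz):
    """Snap beat times to the sampling grid (assumed model), then take differences."""
    snapped = [round(t * sampling_hz) / sampling_hz for t in beat_times_s]
    return [(b - a) * 1000.0 for a, b in zip(snapped, snapped[1:])]

# Hypothetical beat times in seconds.
beats = [0.000, 0.812, 1.640, 2.431, 3.259, 4.070, 4.905]
for fs in (200, 25):
    value = rmssd(quantized_ibis_ms(beats, fs))
    print(f"{fs} Hz: resolution {timing_resolution_ms(fs)} ms, RMSSD {value:.1f} ms")
```

The coarser 25 Hz grid adds quantization noise to individual intervals, which is why comparing RMSSD from the two sources, as in Figure 4f, is a useful check.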

Figure 4. a) Configuration of the reported integrated watch for sleep staging, indicating individual constituent components. b) Unpaired integrated watch. c) The integrated watch paired to the wrist, displaying predicted sleep stages. d) Schematic flow diagram depicting the acquisition, channeling, and processing of data for sleep staging in the reported integrated watch. e) A subject wearing the integrated watch while sleeping. f) The RMSSD in HRV derived from 200 Hz ECG data compared to that derived from 25 Hz PPG data. g) Confusion matrix associated with MTL-assisted sleep staging using PPG data. The PPG data can be acquired from PPG-probing wearable electronics, which can be worn at any location used to monitor the HR signal without interrupting sleep.

Datasets: The data from MESA used for training were accessed through the online sleep data portal (www.sleepdata.org). The datasets by Physionet CinC and the CAP Sleep Database (CAP) were accessed through https://physionet.org/content/capslpdb/1.0.0/ and utilized as the validation set. The PPG dataset was developed on the basis of appropriately deidentified information for 416 adult patients who underwent their first overnight PSG study between January 2009 and December 2016 at the Center of Sleep at SKH (https://www.skh.org.tw/skh/index.html). The continuous PPG signal was extracted from the PSG recordings in which the sleep time of the patient was over 4 h. All methods were carried out following relevant guidelines and regulations. The study protocol was approved by the Research Ethics Committee of SKH (No. 20211106 R), and the need for consent from participants was waived by the Research Ethics Committee.
Input Features: The HR in this work was extracted from ECG and PPG signals. For the ECG signal, we use an open-source neurophysiological signal processing toolbox, NeuroKit (https://github.com/neuropsychology/NeuroKit), to detect R-peaks and obtain the interbeat interval (IBI) time series. For the PPG signal, a self-developed algorithm is used to obtain the IBI time series. The IBI time series is filtered by discarding anomalous values beyond 3 standard deviations, which occur due to erroneous peak detection. The IHR values are obtained by simply taking the reciprocal of the IBI values. The IHR time series is normalized for each night by subtracting the mean and dividing by the standard deviation of the night, and finally resampled to 2 Hz with linear interpolation.
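The input feature steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the 3 standard deviation rule is applied around the nightly mean IBI, that the population standard deviation is used, and that each IBI is time-stamped at the beat that ends it. All function names are hypothetical.

```python
from statistics import mean, pstdev

def filter_ibi(beat_times, ibis, n_sd=3.0):
    """Drop IBIs farther than n_sd standard deviations from the mean (assumed rule)."""
    mu, sd = mean(ibis), pstdev(ibis)
    kept = [(t, x) for t, x in zip(beat_times, ibis)
            if sd == 0 or abs(x - mu) <= n_sd * sd]
    return [t for t, _ in kept], [x for _, x in kept]

def zscore(values):
    """Per night normalization: subtract the mean, divide by the standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]

def resample_linear(times, values, rate_hz=2.0):
    """Linearly interpolate an irregular series onto a uniform grid.

    Requires at least two samples with strictly increasing times.
    """
    step = 1.0 / rate_hz
    n_out = int((times[-1] - times[0]) * rate_hz) + 1
    out, j = [], 0
    for k in range(n_out):
        t = times[0] + k * step
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j + 1 < len(times) - 1 and times[j + 1] < t:
            j += 1
        weight = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + weight * (values[j + 1] - values[j]))
    return out

def preprocess_night(beat_times, ibis, rate_hz=2.0):
    """IBI (s) -> outlier filter -> IHR = 1 / IBI -> z-score -> 2 Hz series."""
    times, clean = filter_ibi(beat_times, ibis)
    ihr = zscore([1.0 / x for x in clean])
    return resample_linear(times, ihr, rate_hz)
```

Because the series is z-scored per night, the unit of the reciprocal (beats per second or beats per minute) does not affect the model input.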

Figure 5. a) Statistical correlation between the wrist and finger PPG data (RR and ms stand for respiration rate and mean square, respectively. The Spearman rank statistics are listed on top of the figure). b) Statistical correlation between the HRs derived from the wrist and finger PPG data.