Effect of dual-wavelength (visible and near-infrared) light sources on non-contact heart rate detection

Image sensors can achieve non-contact detection of heart rate to predict the physiological status of the driver in an automotive driver monitoring system. However, the performance of such methods depends on the intensity of the light source. In this study, the effects of visible (VIS) and near-infrared (NIR) light sources on heart rate measurement are investigated. The custom-built setup employs Complementary Metal Oxide Semiconductor (CMOS) image sensors for visible and dual visible and near-infrared spectra, in addition to controllable light sources with visible and near-infrared wavelengths. As a reference heart rate, a photoplethysmogram signal from a heart rate sensor is employed. Upon image acquisition, heart rate is estimated based on the facial images with varying intensities of visible and near-infrared light sources under dim light conditions (10–50 lx). Compared to the values obtained using the visible light source alone, the signal-to-noise ratio of the extracted signal increases and the root mean square error of the estimated heart rate decreases when the dual visible and near-infrared light is applied. This study demonstrates that the use of dual visible and near-infrared light sources can enhance the performance of non-contact heart rate measurements, which could be applied to monitor the driver's status under dim light conditions.

Introduction: Heart rate (HR) is a vital health parameter, which indicates the medical status of a subject. HR monitoring is conveniently employed in clinics to determine the patient's condition. Traditional methods of acquiring the HR entail contact devices that obtain an electrocardiogram [1] using multiple electrodes or detect the blood volume pulse signals using optical sensors. An example of the latter method is the photoplethysmogram (PPG) [2]. Owing to the recent developments in non-contact HR detection, the scope of HR detection is expanding to non-medical applications, such as the estimation of exercise intensity and the driver's condition in an automated driving system [3].
Recently, driver state monitoring (DSM) has gained substantial attention for both manual driving systems and highly automated driving systems. Although various physiological measures can be regarded as strong indicators of wakefulness, sleepiness, and distraction levels [3], they must be collected effectively while the subject controls the vehicle. Therefore, relevant studies on camera-based devices focus on non-invasive and non-contact measurements of physiological measures [4].
Digital cameras have enabled the use of image sensors for non-contact HR detection using various wavelengths including visible (VIS) and near-infrared (NIR) wavelengths [5]. The heartbeat generates periodic fluctuations in the oxyhaemoglobin present in the microvasculature of the skin, which can be measured by estimating the difference in the light reflection/absorption using an image sensor [6]. The facial region is known to possess a higher capillary density than the lower body parts [7]. Visible light has become the preferred choice in recently developed reflection-mode non-contact HR detection techniques [8]. Owing to the absorption difference between oxyhaemoglobin and deoxyhaemoglobin, reflective detection of the periodic change in the blood volume in the microvasculature can be enhanced [6]. A previous study revealed the effect of the illumination intensity of visible light on remote HR detection using images captured by NIR image sensors. Furthermore, it showed that the HR can be estimated under dim light conditions [4] and that an increase in the intensity of visible light can improve the HR estimation accuracy. In contrast, NIR light sources have been widely employed owing to their lower absorption (deeper penetration in the human tissue). Another study compared remote HR detection in the transmission mode using VIS (559 nm wavelength) light for VIS sensors and NIR (800 nm wavelength) light for NIR sensors and showed that the NIR wavelength is more suitable for non-contact HR detection [9]. However, the effect of dual VIS and NIR light sources detected using dual VIS and NIR image sensors on remote HR detection in the reflection mode remains unclear.
Monitoring the driver's status is critical, especially during nighttime, when visible light sources are limited. According to the National Optical Astronomy Observatory, the recommended light intensity for a highway is approximately 10-14 lx [10]. Further, according to the United States Federal General Services Administration, the illumination in a general parking lot should be approximately 50 lx [11]. Thus, in this study, we performed the experiments under dim light conditions, wherein the visible light intensity was less than 50 lx. An increase in the visible light intensity may help reduce the HR estimation error [4]. However, additional visible light can disturb the driver's vision and can be dangerous at night. Therefore, the present study investigates non-contact HR estimation using dual VIS and NIR light sources under dim light conditions (visible light intensity 10-50 lx). To evaluate the HR detection performance with respect to the various image sensors and light conditions, two image sensors (VIS and dual VIS and NIR spectra sensors) were employed under two different lighting conditions (VIS and dual VIS and NIR light sources) with various light intensities. Then, the HRs were estimated from the acquired images. Finally, the signal-to-noise ratio (SNR) and the root mean square error (RMSE) values of the estimated HRs were evaluated under various conditions.
Experimental setup: Figure 1 shows an overview of the experimental setup. The light source employed was a light-emitting diode (LED) array (Figure 1a) consisting of green (530 nm) and NIR (850 nm) LEDs. The LEDs were covered with a dispersion lens to provide uniform illumination to the subject and were powered using current regulators (AMC7135, ADDtek). The light intensity of the LEDs was adjusted using pulse-width modulation signals controlled by an Arduino board operating at 500 Hz, which was connected to a Raspberry Pi board (Figure 1c). The intensities of the light sources were measured using a light-to-digital converter (TSL2561, Texas Advanced Optoelectronic Solutions) consisting of two photodiodes for broadband and infrared wavelengths, respectively. The light sources were co-located and faced the subject.
Two Raspberry Pi camera modules (Figure 1b, OV5647) were used, each connected to its own Raspberry Pi board (Figure 1c). The two image sensors (VIS and dual VIS and NIR spectra) in the camera modules received a selective range of wavelengths through different attached filters, which spanned the VIS (360-750 nm) and dual VIS and NIR spectra (360-950 nm). The resolution of the acquired images from the red, green, and blue channels was 800 × 600, and the frame rate was 30 fps. All post-processing functions of the cameras were turned off. The cameras were positioned in parallel on a single acrylic board to align their fields of view. The reference signal for measuring the HR was a PPG signal acquired using an HR monitor module (Figure 1d, MAX30102, Maxim Integrated), which was also connected to the Raspberry Pi board. Prior studies have shown that the green light image sensor achieves higher performance than the sensors of other visible wavelengths [4,9]. Thus, a green light source was selected as the VIS light source, and the green channel image data from the image sensor were used. Images were acquired using the image sensors in the VIS and dual VIS and NIR spectra under various light source intensities. The overall process was controlled via the MATLAB interface. It included the initialization of the Raspberry Pi boards corresponding to the light controller to set the desired lighting condition, acquisition of images using the image sensor in each camera, and acquisition of PPG signals from the HR sensor.
Data acquisition: Facial images of seven subjects were collected from the image sensors with VIS and dual VIS and NIR spectra under various light intensities of the VIS and NIR LEDs. Each subject was positioned 50 cm from the board holding the light source and cameras. For each condition, images were acquired for a duration of 120 s. For each subject, two to three data sets were collected. To avoid exposure to ambient light, we performed all the experiments in a dark room. First, an experiment was performed with the visible light intensity set to 13, 24, 35, and 48 lx and without any NIR light source. Then, another experiment was performed with the dual VIS and NIR light source by setting the NIR light intensity to 1.9, 3.3, 5.5, and 8.2 mW, with the VIS intensity fixed at 24 lx. Owing to the dim light conditions, the subject was only vaguely recognizable in all the acquired images.
HR estimation: Figure 2a shows the block diagram of the HR estimation process. Using the Viola-Jones algorithm [12], facial regions were detected from the images. Once the facial regions were confirmed, the average pixel intensity was computed over the region of interest (ROI) and normalized by the mean and standard deviation [4]. Once the signal was extracted in the time domain, a median filter was applied to remove unexpected spikes. The empirical-mode decomposition (EMD) approach was applied to further reduce the high-frequency noise [4]. The processed signals were band-pass filtered using an FIR band-pass filter with a 128-point Hamming window and 0.8-3.2 Hz cut-off frequencies corresponding to the target HR range (48-192 bpm). To estimate the HR from the processed signal, the short-time Fourier transform (STFT) approach was employed [13]. The window size of the STFT was set as 1024 points, corresponding to 30 s, with 3 s of window overlapping. To eliminate abrupt changes in the estimated HR, a peak-hopping filter was applied. The process for acquiring the ground truth HR is shown in Figure 2b. First, the PPG signals were band-pass filtered to reduce noise. The reference HR was then acquired via the STFT approach. As the sampling rate of the PPG was 25 Hz, the reference HR was interpolated to match the sampling rate of the estimated HR at 30 Hz.
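A simplified sketch of the signal-processing chain described above is given below in plain Python. It is not the authors' MATLAB implementation: it omits face detection, EMD, and the peak-hopping filter, and it replaces the sliding STFT with a single spectral peak search over one window. The function names, the median kernel size (5), and the 0.01 Hz frequency grid are assumptions made for illustration; the 30 fps rate, the 128-point Hamming window, and the 0.8-3.2 Hz pass band come from the text.

```python
import math
import statistics

FS = 30.0  # camera frame rate in Hz (from the text)


def normalize(x):
    """Zero-mean, unit-variance normalization of the ROI intensity trace."""
    mu = statistics.fmean(x)
    sd = statistics.pstdev(x)
    return [(v - mu) / sd for v in x]


def median_filter(x, k=5):
    """Sliding median with odd kernel size k (assumed); edges use a shorter window."""
    h = k // 2
    return [statistics.median(x[max(0, i - h): i + h + 1]) for i in range(len(x))]


def hamming_bandpass(num_taps=128, lo=0.8, hi=3.2, fs=FS):
    """Windowed-sinc FIR band-pass coefficients shaped by a Hamming window."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2.0 * (hi - lo) / fs
        else:
            ideal = (math.sin(2 * math.pi * hi * t / fs)
                     - math.sin(2 * math.pi * lo * t / fs)) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def fir_filter(x, taps):
    """Causal FIR filtering by direct convolution with zero initial state."""
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, b in enumerate(taps):
            if i - j < 0:
                break
            acc += b * x[i - j]
        out.append(acc)
    return out


def dominant_hr_bpm(x, fs=FS, lo=0.8, hi=3.2, step=0.01):
    """Frequency (in bpm) of the largest spectral power within [lo, hi] Hz."""
    best_f, best_p = lo, -1.0
    for k in range(int(round((hi - lo) / step)) + 1):
        f = lo + k * step
        re = sum(v * math.cos(2 * math.pi * f * i / fs) for i, v in enumerate(x))
        im = sum(v * math.sin(2 * math.pi * f * i / fs) for i, v in enumerate(x))
        p = re * re + im * im
        if p > best_p:
            best_f, best_p = f, p
    return 60.0 * best_f


def estimate_hr(trace, fs=FS):
    """Normalize, despike, band-pass, then pick the dominant in-band frequency."""
    x = median_filter(normalize(trace), k=5)
    x = fir_filter(x, hamming_bandpass(fs=fs))
    return dominant_hr_bpm(x, fs=fs)
```

Calling `estimate_hr` on a 30 s list of mean green-channel ROI intensities returns one HR value in bpm; repeating the call on overlapping windows would approximate the STFT-based tracking described in the text.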
Evaluation: The signal quality was evaluated by calculating the SNR of the extracted signal [14]. The SNR was calculated as the ratio of the spectral energy around the fundamental frequency and its second harmonic to that of the remaining spectrum, given by

\[ \mathrm{SNR} = 10 \log_{10} \left( \frac{\sum_{f} \left( U_t(f)\, S(f) \right)^2}{\sum_{f} \left( \left( 1 - U_t(f) \right) S(f) \right)^2} \right) \qquad (1) \]

where S(f) denotes the normalized spectrum of the pulse signal at frequency f, and U_t(f) is the window for the maximum fundamental frequency and its second harmonic. In addition, to assess the difference between the estimated and reference HRs, the RMSE was calculated from the mean values of the estimated and reference HRs in each temporal segment (10 s). To evaluate and compare the performances with respect to the various light sources and image sensors, the mean and standard deviation of the RMSE values were calculated.
Results: Figure 3 shows the extracted signals and estimated HRs using the VIS, dual VIS and NIR, and PPG sensors. The extracted signals (output from the signal extraction block in Figure 2a,b) from the facial ROI regions and the PPG sensor are shown in Figure 3a. Figure 3b shows the estimated HRs corresponding to the VIS, dual VIS and NIR, and PPG sensors. Although different sensors were used to measure the reflected signals, the estimated HRs were comparable. Figure 4 shows the performance evaluations using the image sensors in the VIS and dual VIS and NIR spectra under VIS light alone. The visible green light intensity was set to 13, 24, 35, and 48 lx. The HRs were estimated using the images acquired from the VIS and dual VIS and NIR image sensors. Figure 4a,b shows the SNR values of the extracted signals and the RMSE values of the estimated HRs, respectively. Overall, at intensities of 24, 35, and 48 lx, the SNR values of the dual VIS and NIR sensor were 1.1, 0.1, and 0.9 dB lower, and its RMSE values were 32%, 45%, and 26% higher, than those of the VIS sensor, respectively. Nevertheless, the results obtained for both the VIS and dual VIS and NIR sensors show that the errors decrease significantly as the visible light intensity increases. For example, when the VIS intensity was increased from 24 to 35 lx, the dual VIS and NIR sensor showed a 43% reduction in the RMSE and an improvement of 2.4 dB in the SNR.

Fig. 4 Performance evaluations for VIS and dual VIS and NIR image sensors under various visible light intensities (13, 24, 35, and 48 lx). (a) SNRs of extracted signals. (b) RMSEs of estimated HRs

Figure 5 shows the performance evaluations using the image sensors in the VIS and dual VIS and NIR spectra under dual VIS and NIR light conditions. The visible green light intensity was fixed at 24 lx and the NIR intensity was set to 1.9, 3.3, 5.5, and 8.2 mW. The HRs were estimated using the images acquired from the VIS and dual VIS and NIR sensors. Figure 5a,b shows the SNR values of the extracted signals and the RMSE values of the HRs estimated under the dual VIS and NIR light, respectively. Notably, Figure 5a,b indicates that increasing the NIR light intensity has no relevant effect on the SNR and RMSE for the VIS image sensor; however, significant improvements were observed when the dual VIS and NIR image sensor was used under the same scenario. When dual VIS (24 lx) and NIR (8.2 mW) light was applied, the RMSE for the dual VIS and NIR sensor was reduced by 69% and the SNR was improved by 3.2 dB, as compared to those with no NIR light (zero NIR intensity). This result demonstrates that the dual VIS and NIR light is capable of enhancing the HR estimation accuracy when a dual VIS and NIR image sensor is used under dim light conditions.
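The two evaluation metrics described in the Evaluation paragraph can be sketched as follows in plain Python. This is an illustrative approximation rather than the authors' code. The window half-width around the fundamental and its second harmonic (0.1 Hz), the analysed band (0.5-4 Hz), and the 0.02 Hz frequency grid are assumptions, as are all function names. The 10 s segment length and the 25 Hz to 30 Hz interpolation of the reference HR come from the text.

```python
import math


def power_spectrum(x, fs, freqs):
    """Power of x at each frequency in freqs (Hz), computed by a direct DFT."""
    out = []
    for f in freqs:
        re = sum(v * math.cos(2 * math.pi * f * i / fs) for i, v in enumerate(x))
        im = sum(v * math.sin(2 * math.pi * f * i / fs) for i, v in enumerate(x))
        out.append(re * re + im * im)
    return out


def snr_db(x, fs, hr_hz, half_width=0.1, f_lo=0.5, f_hi=4.0, step=0.02):
    """SNR in dB: power near hr_hz and 2*hr_hz over the remaining in-band power.

    half_width, f_lo, f_hi, and step are assumed values, not from the paper.
    """
    n = int(round((f_hi - f_lo) / step))
    freqs = [f_lo + k * step for k in range(n + 1)]
    signal = noise = 0.0
    for f, p in zip(freqs, power_spectrum(x, fs, freqs)):
        in_window = (abs(f - hr_hz) <= half_width
                     or abs(f - 2 * hr_hz) <= half_width)
        if in_window:
            signal += p
        else:
            noise += p
    return 10.0 * math.log10(signal / noise)


def resample_linear(values, src_hz, dst_hz):
    """Linearly interpolate a uniformly sampled series from src_hz to dst_hz."""
    duration = (len(values) - 1) / src_hz
    n_out = int(math.floor(duration * dst_hz)) + 1
    out = []
    for k in range(n_out):
        pos = k / dst_hz * src_hz          # fractional index in the source series
        i = min(int(pos), len(values) - 2)
        frac = pos - i
        out.append(values[i] * (1 - frac) + values[i + 1] * frac)
    return out


def segment_means(values, rate_hz, segment_s=10.0):
    """Mean of each complete, non-overlapping segment of segment_s seconds."""
    n = int(round(rate_hz * segment_s))
    return [sum(values[i:i + n]) / n for i in range(0, len(values) - n + 1, n)]


def rmse(estimated, reference):
    """Root mean square error between paired estimated and reference values."""
    pairs = list(zip(estimated, reference))
    return math.sqrt(sum((e - r) ** 2 for e, r in pairs) / len(pairs))
```

A typical use would resample the reference HR with `resample_linear(ref_hr, 25, 30)`, reduce both HR series with `segment_means(..., 30)`, and pass the two lists of segment means to `rmse`.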
Discussion: To simulate the monitoring of the driver's status during nighttime, we performed non-contact HR estimation under dim light conditions. Once the dual VIS and NIR light sources were applied (Figure 5), the dual VIS and NIR sensor quickly outperformed the VIS sensor in the HR estimation. Although Figure 5 shows only the dim light condition with a VIS intensity of 24 lx, we also observed up to a 64% reduction in the RMSE for dual VIS and NIR light with a VIS intensity of 35 lx. Under dim visible light conditions, the NIR sensor did not show any significant improvement in the SNR and RMSE even with an increased NIR light intensity. Thus, we only included the results from the image sensors with VIS and dual VIS and NIR spectra. This study suggests that dual VIS and NIR light, which utilizes invisible NIR light sources to avoid visual disturbances to the driver, can be employed to reduce the HR estimation error and improve the SNR of the signals.
The scope of this study is limited to the investigation of the effect of light sources and image sensors on HR detection accuracy; however, the performance of the estimation can be further improved by adopting advanced motion-resistant methods. Although the low-cost CMOS image sensors have a relatively low SNR, the results in this study clearly indicate the positive effect of the dual-wavelength light source on non-contact HR detection. A combined implementation using higher-quality image sensors and light sources can be utilized instead of custom-built cameras and light sources, to decrease the noise arising from illumination fluctuations as well as to avoid mismatch owing to the positioning of the sensors and light sources [4]. More quantitative analysis can be performed with high-quality image sensors in the future. In addition, the signals extracted in the present study can be extended to estimate other physiological status indicators, such as the breathing rate.
Conclusion: The application of dual VIS and NIR light sources improved the performance of non-contact HR detection using image sensors under dim light conditions. The dual light source approach can be applied to monitor the driver's status at night without visual disturbances. Further quantitative analysis is required with a higher-quality implementation. An in-depth mathematical analysis is also left for future work.