Performance of a novel reusable pediatric pulse oximeter probe

Abstract
Objective: To assess the performance of reusable pulse oximeter probe and microprocessor box combinations, of varying price-points, in the context of a low-income pediatric setting.
Methods: A prospective, randomized cross-over study comparing time to biologically plausible oxygen saturation (SpO2) between: (1) Lifebox LB-01 probe with Masimo Rad-87 box (L + M) and (2) a weight-appropriate reusable Masimo probe with Masimo Rad-87 box (M + M). A post hoc secondary comparison with historical usability testing data from the Lifebox LB-01 probe and Lifebox V1.5 box (L + L) was also conducted. Participants, children aged 0 to 35 months, were recruited from pediatric wards and outpatient clinics in the central region of Malawi. The primary outcome was time taken to achieve a biologically plausible SpO2 measurement, compared using t tests for equivalence.
Results: We recruited 572 children. Plausible SpO2 measurements were obtained in less than 1 minute for 71%, 70%, and 63% of measurements with the M + M, L + M, and L + L combinations, respectively. A similar pattern was seen at less than 2 minutes; however, this difference disappeared at less than 5 minutes, with 96%, 96%, and 95% plausible measurements. Using a ±10 second threshold for equivalence, we found L + M and M + M to be equivalent, but were under-powered to assess equivalence for L + L.
Conclusions: The novel reusable pediatric Lifebox probe can achieve a quality SpO2 measurement within a pragmatic time range of weight-appropriate Masimo equivalent probes. Further research, which considers the cost of the devices, is needed to assess the added value of sophisticated motion tolerance software.


| INTRODUCTION
Hypoxemia, an oxygen saturation (SpO2) less than 90%, is a considerable risk factor for child pneumonia mortality in low- and middle-income countries (LMIC). 1 Pulse oximetry allows for accurate and noninvasive diagnosis of hypoxemia, but in the absence of oximetry, health providers rely on clinical observations to diagnose severe pneumonia and determine the need for oxygen therapy. 2 Clinical signs lack accuracy in predicting hypoxemia, with pulse oximetry identifying 20% to 30% more hypoxemic cases than clinical signs alone. [3][4][5] Additionally, identifying clinical signs of severe pneumonia, often by nonphysician clinicians or community health workers, remains inconsistent and unreliable. [6][7][8][9] Given oxygen availability, universal implementation of pulse oximetry in the 15 highest pneumonia burden countries could avert 148 000 deaths annually. 10 Despite this, evidence on the uptake of pulse oximetry in LMIC is limited. Available estimates suggest it remains low, ranging from less than 30% to more than 70% across different LMIC settings. 11,12 There are examples of pulse oximeter implementation being feasible in LMIC settings, including Malawi and Nigeria, and resulting in improved referral decision-making. 13,14 Barriers to wider implementation include cost, lack of training and supervision, and lack of robust pulse oximeters and probes. In pediatric populations, additional barriers include the lack of a high-quality, reusable, low-cost probe that fits all ages of children and is tolerant to movement. 15 To facilitate routine pulse oximeter implementation and scale-up, evidence of low-cost but high-quality devices being usable in busy clinical settings, typical of many LMIC settings, is needed. In response to this call, the Lifebox Foundation led a project to develop a universal pediatric probe in 2016. 16
Using a human-centered design approach to probe development with end-user usability testing in the United Kingdom, Bangladesh, and Malawi, Lifebox developed a novel probe. 17 Usability testing found that among 1307 SpO2 results, 81% biologically plausible measurements were achieved in less than 2 minutes. 17 This study builds on this work, assessing how the redesigned Lifebox probe functions when paired with a market-leading oximeter microprocessor from Masimo that includes motion and low-perfusion tolerance software. We aimed to compare this performance with the same Masimo microprocessor and its weight-appropriate Masimo probe on the same child, to give a direct Lifebox vs Masimo probe comparison.
As a secondary objective, we also sought to compare these measurements to historical results that used the redesigned probe with the standard Lifebox V1.5 oximeter microprocessor, which is not enhanced with motion tolerance or low-perfusion software.

Box 1 Testing protocol for different age and weight categories

| MATERIALS AND METHODS
We conducted a prospective, randomized cross-over study comparing (1) the novel Lifebox LB-01 probe paired with a Masimo Rad-87 oximeter box (L + M) and (2) a weight-appropriate reusable Masimo probe paired with the Masimo Rad-87 oximeter box (M + M; Box 1).
The LB-01 probe used with the Masimo Rad-87 box was specifically adapted to be compatible for the purposes of this study and is not standard for devices available in the market. Data collection was conducted in May 2018.
We conducted a post hoc secondary comparison with existing data on the LB-01 probe paired with the Lifebox V1.5 box (L + L), to explore the added value of motion tolerance capacity. The methods for this study have been reported previously, and data collection was done in February to July 2017. 17

| Recruitment
Children were purposefully recruited using convenience sampling from inpatient and outpatient settings. All children recruited during the cross-over equivalence study contributed data to the analyses; Figure S1 shows participant inclusion from the historical data.
Patients were eligible if they were 0 to 35 months of age excluding those: receiving oxygen therapy; with a nasogastric tube; with a congenital limb malformation; and simultaneously receiving care from a healthcare worker.

| Sample size
For the cross-over study, the sample size was calculated to determine equivalence. This required 340 patients to be tested with both L + M and M + M for 80% power to determine equivalence in time to successful measurement within a ±10 second range, with a standard deviation of 40 seconds and a mean time to measurement of 51 seconds. The sample for the L + L testing was based on those meeting eligibility criteria within the existing data set; we did not conduct an a priori power calculation for this analysis.

| Data collection
All measurements were conducted by physicians with expertise in pediatric pulse oximetry (TM, EDM, KS, BZ, and NB), following training in the study protocol. For the cross-over study, two pulse oximetry readings were conducted per child, separated by a 5-minute washout period, allowing the child to settle and reducing potential measurement bias by the tester. The order in which the probes were used was randomly assigned using a random number generator at the point of testing, within the ODK software used for data collection. 18 The measurement procedure was the same for L + L and the cross-over study. The tester placed the probe on the foot, toe, or finger of the child, depending on age and weight (Box 1 and Figure 1). An independent observer, a researcher who had received training in the study protocol and was experienced in pulse oximetry but not necessarily clinically qualified, recorded the time from completion of probe placement to a biologically plausible reading, which the tester announced by stating "stop." Biologically plausible was defined as having an age-appropriate pulse rate above the approximate 10th centile for age, 19 and a consistent waveform or quality signal, depending on the oximeter box.
The observer noted the condition of the child, location of probe placement, number of adjustments, and any issues during the measurement. Neither the tester, the observer, nor the participants were blinded, due to the nature of the measurement; however, randomization was done at the point of testing after a participant was recruited, and the tester could not see the timer during measurements.
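The point-of-testing randomization of probe order described above can be illustrated with a short sketch. This is not the ODK form logic used in the study; the function name `assign_order`, the labels, and the seed are hypothetical and chosen only for illustration.

```python
import random

# Labels for the two probe and box combinations compared in the cross-over study.
COMBINATIONS = ("L+M", "M+M")


def assign_order(rng):
    """Return a random testing order for one child (hypothetical helper).

    The order is drawn only when the child is about to be tested,
    mirroring randomization at the point of testing rather than in advance.
    """
    order = list(COMBINATIONS)
    rng.shuffle(order)  # each of the two possible orders is equally likely
    return tuple(order)


# Illustrative use: a seeded generator makes the sketch reproducible.
rng = random.Random(2018)
first, second = assign_order(rng)
print(f"Test {first} first, wait 5 minutes, then test {second}")
```

Drawing the order only after recruitment means that nobody can know which probe comes first when deciding whether to enroll a child.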

| Analysis
The primary outcome was the difference in time to a plausible SpO2 reading between L + M and M + M, based on a cross-over design. The primary analysis approach was testing equivalence, defined as ±10 seconds in the mean time to successful measurement (ie, a measurement between 50 and 70 seconds is equivalent to 60 seconds). We chose equivalence, rather than noninferiority, as we did not hypothesize that M + M would necessarily outperform the L + M combination. We deemed ±10 seconds to be a pragmatic range that would not significantly impact routine care in a busy LMIC pediatric setting. We evaluated this through two one-sided t tests, using the tostt command in Stata 14. 20 The comparison between L + M and M + M took into account the paired nature of the data.
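For readers unfamiliar with the two one-sided tests (TOST) procedure, the following is a minimal Python sketch of a paired equivalence test with a ±10 second margin. It is not the Stata tostt implementation used in the analysis: it substitutes a normal approximation for the t distribution (reasonable only for large samples), and the function name `paired_tost` and the example times are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_tost(times_a, times_b, margin=10.0, alpha=0.05):
    """Paired TOST for equivalence, using a normal approximation.

    Equivalence is concluded when BOTH one-sided null hypotheses are rejected:
    mean difference <= -margin, and mean difference >= +margin.
    """
    if len(times_a) != len(times_b) or len(times_a) < 2:
        raise ValueError("need two paired samples of equal length (n >= 2)")
    diffs = [a - b for a, b in zip(times_a, times_b)]  # within-child differences
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean difference
    if se == 0:
        raise ValueError("differences have zero variance; test is undefined")
    z = NormalDist()
    p_lower = 1.0 - z.cdf((d_bar + margin) / se)  # H0: difference <= -margin
    p_upper = z.cdf((d_bar - margin) / se)        # H0: difference >= +margin
    p_value = max(p_lower, p_upper)               # both tests must reject
    return {"mean_diff": d_bar, "p_value": p_value, "equivalent": p_value < alpha}


# Illustrative (made-up) times in seconds for the same children on two devices.
lm = [48, 62, 55, 41, 70, 53, 66, 45, 58, 60]
mm = [50, 59, 57, 44, 68, 51, 69, 43, 61, 57]
print(paired_tost(lm, mm))
```

Working on within-child differences is what accounts for the paired design: between-child variation in time to measurement cancels out before the standard error is computed.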
A post hoc secondary analysis was conducted to compare the M + M measurements to historical L + L measurements, using the same definition of equivalence. Additionally, we described the median time to SpO2, and proportion of SpO2 readings within less than 1, 2, and 5 minutes, and conducted a multivariable analysis to examine factors associated with a successful SpO2 in less than 1 and 2 minutes, with robust standard errors to account for clustering at the participant level. These models included probe and box combination, order of the measurement, child's condition, age, and weight. Other potential confounders were investigated for an association between testing rounds and were included if there was a difference. All analyses were conducted using Stata 14.
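The descriptive part of this analysis (median time and the proportion of readings achieved in less than 1, 2, and 5 minutes) can be sketched as follows. This is an illustrative Python equivalent, not the Stata code used in the study; the function name `summarize_times` and the example times are hypothetical.

```python
from statistics import median


def summarize_times(times_s, thresholds_s=(60, 120, 300)):
    """Summarize time-to-reading data, given in seconds.

    Returns the number of readings, the median time, and for each threshold
    the share of readings achieved in strictly less than that many seconds.
    """
    if not times_s:
        raise ValueError("no measurements supplied")
    n = len(times_s)
    share_under = {t: sum(x < t for x in times_s) / n for t in thresholds_s}
    return {"n": n, "median_s": median(times_s), "share_under": share_under}


# Illustrative (made-up) times in seconds for one probe and box combination.
example = [30, 45, 59, 61, 90, 130, 200, 310]
summary = summarize_times(example)
print(summary["median_s"], summary["share_under"])
```

Running the same summary separately for each probe and box combination, and within each age and weight group, gives the kind of stratified description reported alongside the equivalence tests.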

| Testing procedures
Overall, 174 of the 340 (51%) cross-over study measurements were randomized to L + M first, and we did not observe any differences in patient characteristics based on randomization order (Table 1). Seventy-five percent of SpO2 measurements were on the child's toe, followed by 23% on the child's foot. As recorded by the observer, issues during measurement were similar between devices and included: poor quality signals, slow or no presentation of SpO2 results, and implausibly low pulse rates.

| Time to successful SpO2 measurement
Notably, neither of these differences was statistically significant  a key challenge that has been highlighted by healthcare providers in small children. 15 Crucial to innovation in this field is ensuring lower-cost oximeter boxes designed for LMIC settings are not poor quality. Masimo announced the development of the Rad-G device, designed specifically for spot-checking for the LMIC market. 22 It will be crucial to subject this new device to pragmatic testing in the field, to ensure it maintains the current low-perfusion and motion-tolerant software, which we believe to be important. Other initiatives for low-cost devices for LMIC settings include smartphone-based oximetry. 23,24 With multiple initiatives, generating clear evidence for policy makers and procurement agencies will be crucial to support implementation and scale-up. Our testing approach could be expanded to benchmark usability, as continuing to evaluate the added value of novel devices in real-world settings is as important as laboratory-based accuracy testing.
An important finding was the difference in device performance according to age. The M + M combination was more successful than the L + M combination in children less than 2 months of age within less than 1 minute, although this effect disappeared within less than 2 minutes. Additionally, as the majority of pneumonia burden is seen in children less than 24 months old, 25 a universal design may favor a clip.
We had three key limitations. First, the tester in the cross-over study could not be blinded. The tester may have had an inherent preference for one probe or box over another, either through prior personal experience or experience during the testing. This could have influenced their decision on when to accept a measurement as biologically plausible. We were aware of this potential bias during study design, and decided to randomize at the point of measurement rather than in advance to reduce the potential for selective recruitment of participants. In addition, the independent observer served both a pragmatic and quality control role, to reduce nonstandardized testing. The second limitation was the potential for … Again, we were aware of this in the design stage and included the 5-minute washout period between measurements, and included standardized locations where the first placement of the probe should be according to age and weight in the protocol. We included the order of measurement in the adjusted analyses, and found that it was not significantly associated with successful measurements, suggesting these biases were not present. Finally, we lacked sufficient power from the historical L + L testing to conduct an equivalence analysis, limiting our ability to make a conclusion for this comparison. A prospective three-way cross-over could have overcome this limitation; however, it was beyond the scope of testing at the time.

Table 2 Description of time to reading, comparing the three different probe and device combinations, stratified by age and weight groups
We found the novel universal reusable pediatric Lifebox clip probe can achieve a quality SpO2 measurement within a time pragmatically equivalent to that of the Masimo reusable Y-wrap sensor on children less than 10 kg and the reusable Masimo pediatric clip probe on children more than 10 kg. As cost and sustainability are frequently cited as key barriers to pulse oximeter implementation and scale-up in LMICs, this is an exciting finding: Lifebox probes are typically available at a fraction of the cost of market-leading reusable probes, and a single probe can be used for all children rather than multiple specialty designs.
Further work is needed to improve motion tolerance in low-cost oximeter boxes to fully realize the potential of pulse oximetry as a routine point-of-care diagnostic for pediatric hypoxemia in low-resource settings. Additionally, as new devices are released, including multimodal devices with multiple integrated functions, it will be crucial to continue benchmarking these devices not only on cost and laboratory accuracy, but on real-world applicability.

ACKNOWLEDGMENTS
We would like to thank all the caregivers and children who took part in the testing for their time and co-operation, along with the staff at