Accuracy of the Apple watch for detection of AF: A multicenter experience

The Apple watch (AW) irregular rhythm notification (IRN) feature uses photoplethysmography to identify prolonged episodes of irregular rhythm suggestive of atrial fibrillation (AF). IRN is FDA cleared for those with no previous history of AF; however, these devices are increasingly being used for AF management.

as currently cleared, but increased sensitivity and wear times would be necessary for disease management.

KEYWORDS
atrial fibrillation, smartwatch, wearable

| INTRODUCTION
Atrial fibrillation (AF) is the most common sustained arrhythmia in adults, has a lifetime risk of 25%−33%, and is associated with heart failure, stroke, dementia, and death. 1,2 Episodes of AF can be asymptomatic, and therefore modalities for continuous monitoring may be used to quantify the occurrence of AF over a prolonged timeframe. The Apple watch (AW) irregular rhythm notification (IRN) feature uses photoplethysmography to identify prolonged episodes of irregular rhythm suggestive of AF (Figure 1). IRN is FDA cleared for those with no previous history of AF; however, these devices are increasingly being used for AF management. The objective of the present study was to determine the accuracy of the IRN for AF detection in subjects with a previous diagnosis of nonpermanent AF.

| Study design and enrollment
Thirty patients were enrolled in this study from three hospital systems (Northwestern Medicine; St. Luke's Health System; and Boston Medical Center) between January 2020 and October 2021.
Inclusion criteria consisted of a previously implanted insertable cardiac monitor (ICM) or cardiac implantable electronic device (CIED), AF lasting ≥1 h on a device tracing within 90 days before enrollment, and possession of an iPhone compatible with an AW Series 5 or later (iPhone 5s or later with iOS 12 or later). Exclusion criteria included permanent AF, >5% ventricular pacing in those with CIEDs, a tattoo on the wrist where the watch was to be worn, and prior surgery on the ipsilateral radial artery.
Demographic and baseline characteristics were collected for each patient by way of medical record review and a participant interview.
Participants were fitted with an AW Series 5 and asked to wear the watch during waking hours for a minimum of 14 h per day. Participants were permitted to keep the AW at the conclusion of the study.
The study was compliant with the Declaration of Helsinki, and all patients provided informed consent. The study was approved by the Institutional Review Boards of all three participating centers. Apple was not involved in any aspect of the study.

| Data collection and analysis
The criteria used by the AW to generate an IRN have been previously published by the device manufacturer. 3 Briefly, the AW records a tachogram, a measurement of the intervals of time between heartbeats, every 2−4 h at baseline. If the rhythm is classified as irregular by the device algorithm, then the frequency of tachogram collection is increased to 15-min intervals. If five out of six sequential tachograms are classified as irregular within a 48-h period, then an IRN is generated. If two consecutive tachograms are classified as regular, then tachogram collection is reset to the baseline frequency of every 2−4 h. The AW collects tachograms only if the user is still enough to obtain a recording. 3 Participants provided screenshots of all IRNs received during the study period, and these episodes were compared to downloads from the patient's ICMs/CIEDs. AF events detected by ICM/CIED without an associated IRN were further investigated using a screenshot of the participant's "Activity" page for the corresponding day. If the participant's "Move" bar graph was at zero during the time that the ICM/CIED detected an event, the participant was deemed not to be wearing their AW during this time.
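The notification rule summarized above can be sketched as a small simulation. This is a minimal illustration of the rule as described in this section only, not the manufacturer's implementation. The function name `simulate_irn`, the input format (a time in hours plus a boolean irregularity flag per tachogram), and the choice to clear the window after a notification are assumptions made for the example. The sampling cadence is assumed to be reflected in the timestamps supplied by the caller.

```python
from collections import deque
from typing import Iterable, List, Tuple

WINDOW = 6          # sequential tachograms considered together
NEEDED = 5          # irregular tachograms required within the window
MAX_SPAN_H = 48.0   # the window must fit inside a 48 h period


def simulate_irn(tachograms: Iterable[Tuple[float, bool]]) -> Tuple[List[float], bool]:
    """Apply the 5-of-6-within-48-h rule to already classified tachograms.

    Each item is (time_in_hours, is_irregular). Returns the times at which a
    notification would be generated and whether collection ends in the
    elevated (15 min) mode. Hypothetical sketch, not the device algorithm.
    """
    recent: deque = deque(maxlen=WINDOW)  # keeps only the last six tachograms
    regular_streak = 0
    elevated = False
    notifications: List[float] = []

    for time_h, irregular in tachograms:
        recent.append((time_h, irregular))
        if irregular:
            elevated = True          # irregular rhythm: collect every 15 min
            regular_streak = 0
        else:
            regular_streak += 1
            if regular_streak >= 2:  # two consecutive regular tachograms
                elevated = False     # back to the 2-4 h baseline cadence

        n_irregular = sum(1 for _, flag in recent if flag)
        span_h = recent[-1][0] - recent[0][0]
        if len(recent) == WINDOW and n_irregular >= NEEDED and span_h <= MAX_SPAN_H:
            notifications.append(time_h)
            recent.clear()  # assumption: start a fresh window after notifying

    return notifications, elevated


if __name__ == "__main__":
    # Hypothetical run: one regular tachogram among six collected 15 min apart.
    run = [(0.0, True), (0.25, True), (0.5, False),
           (0.75, True), (1.0, True), (1.25, True)]
    print(simulate_irn(run))
```

Because the rule needs six tachograms and each requires the wearer to be still, a sketch like this also makes clear why short or interrupted episodes may not produce a notification.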
The primary endpoints were sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the IRN per subject for detection of AF lasting 1 h or longer while the AW was worn. Subjects with at least one true positive AF detection during AW wear time were considered a true positive subject.
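For reference, the four per-subject measures follow from a standard 2×2 table of IRN result against ICM/CIED result. The sketch below uses the usual textbook definitions. The `Confusion` class, the `diagnostic_metrics` function, and the counts in the usage example are hypothetical and are not the study's data.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Confusion:
    """Per-subject counts against the ICM/CIED reference (hypothetical container)."""
    tp: int  # AF on ICM/CIED during wear, with at least one matching IRN
    fp: int  # IRN received, no AF >= 1 h on ICM/CIED during wear
    fn: int  # AF on ICM/CIED during wear, no matching IRN
    tn: int  # no AF on ICM/CIED during wear, no IRN


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: Confusion) -> Dict[str, float]:
    """Compute sensitivity, specificity, PPV, and NPV from a 2x2 table."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


if __name__ == "__main__":
    # Illustrative counts only; they do not come from the study.
    example = Confusion(tp=8, fp=1, fn=2, tn=9)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Specificity and NPV need true negatives, which are well defined per subject but not per episode. This is consistent with the episode-based analysis reporting only sensitivity and PPV.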
The secondary endpoints were sensitivity and PPV of the IRN by AF episode lasting 1 h or longer while the AW was worn. In this episode-based analysis, all episodes were counted equally, irrespective of the subject who had the episode. If continuous AF developed during the study, this was considered to be a single episode. Only those ICM/CIED events that occurred while the smartwatch was simultaneously being worn were included in the primary analysis. Any

FIGURE 1 Example of an irregular rhythm notification.

| DISCUSSION
In the present study, the AW IRN feature was associated with a low rate of false positive detections but only modest sensitivity for detection of AF in a population with a previously documented history of the condition.
Previous research on the IRN has focused on accuracy for AF detection in a screening population. The Apple Heart Study recruited over 419 000 subjects without a patient-reported history of AF to be monitored using the IRN. 4 There were 2161 subjects (0.52%) who received a notification. An ECG patch was mailed to these subjects and used as a benchmark to assess the accuracy of subsequent IRNs during concurrent patch monitoring. Four hundred and fifty subjects returned their ECG patch with analyzable data, and of the 86 participants who received an IRN during concurrent use of the ECG patch, 72 had AF detected with the ECG patch, resulting in a PPV of 0.84. A similar large-scale study conducted on the Fitbit irregular heart rhythm detection feature resulted in a PPV of 98.2%, 5 and the Huawei Heart Study reported a PPV of 91.6%. 6 While these data support the FDA-labeled indication of the IRN in a screening population without known AF, IRN is increasingly being used for management of patients with an existing history of AF. A prior study by our group assessed the accuracy of a different AF sensing watch compared to simultaneous recordings from an ICM. 7 The AF sensing watch consisted of an AW Series 2, a watch band with ECG sensor (KardiaBand; AliveCor), and an experimental app with a deep learning algorithm. The findings were 95.7% sensitivity for AF episodes and 100% sensitivity for subjects with AF lasting 1 h or more while the watch was worn, as well as 39.9% PPV for AF episodes. The algorithmic differences that would explain the lower sensitivity and higher PPV in the present study are proprietary, but may, in part, relate to tachogram collection by the AW, which is limited to 15 min

TABLE 1 Characteristics of patients enrolled.

| CONCLUSIONS
In a population with known AF, the AW IRN had a low rate of false positive detections and high specificity for subjects with AF.
Sensitivity for detection by subject and by AF episode was lower.
The current IRN algorithm appears accurate for AF screening as currently indicated, but increased sensitivity and wear times may be necessary for disease management.

ACKNOWLEDGMENTS
This work is supported by AHA grant 18SFRN34250013.

DATA AVAILABILITY STATEMENT
Data supporting the results in the paper will be provided upon request.

ETHICS STATEMENT
This study was approved by IRBs at each institution involved with enrollment. All patients provided informed consent for participation in the study.
FIGURE 2 Accuracy of the IRN by episode and by patient. IRN, irregular rhythm notification; NPV, negative predictive value; PPV, positive predictive value.
FIGURE 3 Summary of AF episodes by detection. AF, atrial fibrillation.