Efficacy of digital cognitive behavioral therapy for moderate‐to‐severe symptoms of generalized anxiety disorder: A randomized controlled trial

Cognitive behavioral therapy (CBT) is an efficacious intervention for generalized anxiety disorder (GAD). Digital CBT may provide a scalable means of delivering CBT at a population level. We investigated the efficacy of a novel digital CBT program in those with GAD for outcomes of anxiety, worry, depressive symptoms, sleep difficulty, wellbeing, and participant‐specific quality of life.


| INTRODUCTION
Generalized anxiety disorder (GAD) is characterized by excessive anxiety and worry about a variety of events or activities that is difficult to control (American Psychiatric Association, 2013). GAD affects ~4% of the population (Kessler, Petukhova, Sampson, Zaslavsky, & Wittchen, 2012). It impacts wellbeing, life satisfaction, and health status, and leads to increased healthcare utilization and decreased work productivity (Revicki et al., 2012). Pharmacotherapy and cognitive behavioral therapy (CBT) are considered first-line interventions for GAD (Anxiety and Depression Association of America, 2020; Canadian Psychiatric Association, 2006; Locke, Kirst, & Shultz, 2015; National Institute for Health and Clinical Excellence, 2011). Meta-analyses show CBT has large intervention effects relative to control conditions (Carpenter et al., 2018) and is more efficacious than pharmacotherapy.
Barriers limit access to CBT, including insufficient numbers of trained therapists, intervention costs, waiting lists, incompatible scheduling, distance from services, and perceived stigma (Comer & Barlow, 2014; Gunter & Whittal, 2010). Digital CBT may help overcome such barriers because digital devices (e.g., computers, tablets, and smartphones) are widespread. Smartphone-based applications or "apps" are optimized and designed differently from computer-based digital CBT interventions and offer a promising delivery platform (Mohr et al., 2017). This is because smartphones are owned by 81% of Americans (Pew Research Center, 2019) and permit real-time access to reliable, evidence-based interventions at any time or place (Stolz et al., 2018). Smartphone-delivered interventions for anxiety were evaluated in a meta-analysis of 29 randomized controlled trials (RCTs). Results showed significant reductions in GAD symptoms, with a small effect size compared with waitlist and/or inactive controls (Linardon, Cuijpers, Carlbring, Messer, & Fuller-Tyszkiewicz, 2019). Given that effect sizes for in-person CBT for GAD are typically large (Cuijpers et al., 2014), there is room for improvement in the efficacy of smartphone-delivered digital interventions.
The present intervention, Daylight™ (https://www.bighealth.com/daylight), is a novel, smartphone-based and fully automated digital CBT intervention designed to enhance efficient learning and real-time application of CBT concepts and skills. To deliver engaging and digestible CBT techniques in a flexible, personalized manner, Daylight was developed in collaboration with designers, filmmakers, podcast producers, and animators. The structure, content, and design were all carefully considered to facilitate quick comprehension, frequent practice, and easy integration of CBT skills into the user's daily life.
Daylight was evaluated in a randomized, multiple-baseline single-case experimental design study of 21 participants (Miller et al., in review).
Findings supported the feasibility, safety, and preliminary efficacy of Daylight at 6 weeks, though an RCT is necessary to test efficacy at the next level of rigor. We present a parallel-group superiority RCT examining digital CBT compared with a waitlist control in individuals with GAD. Our primary hypothesis was that digital CBT would be superior to waitlist control for reducing symptoms of GAD at post-intervention (6 weeks from randomization). Secondary aims evaluated effects for secondary outcomes, including worry, depressive symptoms, sleep difficulty, wellbeing, and participant-specific quality of life at post-intervention and evaluated effects for both primary and secondary outcomes at follow-up (10 weeks). Further analyses adjusted for Daylight program use, examined whether effects on GAD symptoms were mediated by reductions in worry, depressive symptoms, or sleep difficulty and if baseline variables moderated effects.

| Design
A phase-II, randomized, partially blind parallel-group superiority trial with a primary endpoint at 6 weeks from randomization was conducted online and evaluated digital CBT for GAD (digital therapy for anxiety) compared with waitlist control. The trial design and participant flow are summarized in Figure 1. The trial was prospectively registered (ISRCTN12765810: http://www.isrctn.com/ISRCTN12765810) and ethical approval was obtained from the University of Oxford Medical Sciences Interdivisional Ethics Committee (MS-IDREC-C1-R61262/RE002). Apart from telephone calls, all aspects of the study, including screening, consent, assessment, allocation, and delivery of the intervention, were conducted online. The protocol has been published separately (Gu et al., 2020).

| Participants
Participants were 18 years or older with a diagnosis of GAD, as assessed by a score of ≥10 on the GAD-7 (Spitzer, Kroenke, Williams, & Löwe, 2006), combined with a positive screen for a GAD diagnosis on a digital version of the Mini-International Neuropsychiatric Interview (MINI) version-7 for DSM-5 (Sheehan et al., 2014). This self-report digital version included the same questions used in the standard MINI for GAD and did not include any additional follow-up questions. Individuals on prescription medication for anxiety, depressive symptoms, or poor sleep were included in the trial if they were on a stable dose for ≥4 weeks before baseline and were not allowed to have received CBT for anxiety in the last 12 months. Exclusion criteria included: self-reported diagnosis of schizophrenia, psychosis, bipolar disorder, seizure disorder, substance use disorder; recent trauma to the head or brain damage; severe cognitive impairment; serious physical health concerns necessitating surgery or with a prognosis of less than 6 months; or pregnancy. After initial online screening and before consent, participants underwent a telephone call with a research assistant (previously trained in the use of the MINI and supervised by a clinical psychologist who was available to help with any queries during the study) to explain study procedures, answer any questions, and verify eligibility relating to GAD diagnosis with the [...]

FIGURE 1 Participant flow through the study. CBT, cognitive behavioral therapy

| Intervention
[...] Table ST1 for a description of each module). Techniques were selected and ordered based on (a) their effectiveness for GAD symptoms, (b) their effectiveness for symptoms often comorbid with GAD (e.g., sleep, low mood), and (c) their potential for translation into an engaging digital format. The program is designed to be self-paced. The app encourages daily use (e.g., practicing techniques in the app) and real-world implementation (e.g., practicing techniques in their daily lives, outside of the app).
Daylight users receive reminders and encouragement to use the program in the form of emails, push notifications, and text messages. Users can opt in to receive emails automatically upon sign-up and are asked whether they would like to opt in to receive additional communications in the form of text messages and/or push notifications. Modules can be repeated or shorter practice exercises (~5 min) can be accessed.
Users are asked to complete weekly in-app brief assessments of anxiety, depressive symptoms, and sleep. Both the initial and weekly assessments and each exercise include elements of personalization.
During assessments, users are provided with personalized feedback based on their self-reported anxiety, mood, and sleep, as well as their progress (e.g., users experiencing problems with sleep could receive a suggestion to practice a relaxation exercise before bedtime). The feedback provided in each exercise (e.g., providing additional instructions, troubleshooting, and/or guidance for future practice of the exercise) is tailored based on user inputs during the exercise (e.g., changes in their anxiety level during the exercise, whether they experienced any difficulties doing the exercise). Waitlist control participants did not receive any program during the study and were provided access to Daylight after the final follow-up assessment.
Access to Daylight was not withdrawn, and all participants were permitted to continue any other pharmacological or nonpharmacological interventions for GAD, the use of which was tracked.

| Outcomes
Detailed protocol information is available elsewhere (Gu et al., 2020).
Briefly, assessments were captured online using Qualtrics (Qualtrics, 2019) and measured at baseline (Week 0), mid-intervention (Week 3), post-intervention (Week 6), and follow-up (Week 10). Only those randomized to the intervention arm were invited to complete a longer-term uncontrolled assessment at Week 26 from randomization, without compensation. The primary outcome was anxiety symptoms, measured by the GAD-7 (Spitzer et al., 2006) [...] Ruta, Garratt, Leng, Russel, & MacDonald, 1994), which asks participants to rate their top three problem areas (e.g., relationships, work) associated with GAD. For anxiety (GAD-7), remission was assessed from baseline to post-intervention and from baseline to follow-up. Remission was defined as a score of <10 (Spitzer et al., 2006), and reliable remission as a score of <10 together with a change score of ≥5, which is greater than the known unreliability of the measure (Richards & Borglin, 2011). Participants were classified as showing reliable deterioration if their GAD-7 scores increased by the reliable change score (≥5). Concomitant prescription or over-the-counter medication and use of psychological therapies were documented at all assessment points.
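The remission and deterioration rules above can be written out as a short sketch. The function and class names below are hypothetical and are not taken from the trial's analysis code; only the cut-off (<10) and the reliable change threshold (≥5) come from the text.

```python
from dataclasses import dataclass

GAD7_CUTOFF = 10      # scores below this count as remission (Spitzer et al., 2006)
RELIABLE_CHANGE = 5   # change of >=5 points is treated as reliable (Richards & Borglin, 2011)


@dataclass
class Gad7Status:
    remission: bool
    reliable_remission: bool
    reliable_deterioration: bool


def classify_gad7(baseline: int, later: int) -> Gad7Status:
    """Classify one participant from a baseline and a later GAD-7 score.

    Hypothetical helper: it applies the definitions in the text, nothing more.
    """
    improvement = baseline - later  # positive when symptoms decreased
    remission = later < GAD7_CUTOFF
    return Gad7Status(
        remission=remission,
        reliable_remission=remission and improvement >= RELIABLE_CHANGE,
        reliable_deterioration=(later - baseline) >= RELIABLE_CHANGE,
    )
```

For example, under these rules a participant moving from 11 to 9 is remitted but not reliably remitted, because the change of 2 points is below the reliable change threshold.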
Participants' perceptions of intervention credibility (Devilly & Borkovec, 2000) were recorded before randomization after being informed about the study conditions by telephone. Safety was assessed by the occurrence of any adverse events throughout the study period, reported spontaneously or in response to open-ended questions from study consent until the follow-up assessment. Using a modified Symptom Checklist (Kyle, Morgan, Spiegelhalder, & Espie, 2011) at post-intervention, participants were asked to rate the occurrence of pre-specified unwanted symptoms that may have occurred (e.g., low mood, headache, fatigue, etc.) over the last 6 weeks.

| Sample size
A sample size of 242 participants (121 per group) was required to detect a between-group effect size of 0.5 with 90% power and a significance level of p = .05, accounting for 30% attrition (Gu et al., 2020).
Due to a preponderance of females, recruitment was increased to include more people who identify as male and allow for a more representative sample of the GAD population (McLean, Asnaani, Litz, & Hofmann, 2011). In total, 256 participants were randomized.
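As an illustration of how a target of 121 per group can arise, the sketch below uses the standard normal-approximation formula for a two-group comparison of means. It assumes a two-sided test and assumes that attrition is handled by dividing the completer sample size by (1 − attrition). Neither detail is stated here, and the protocol (Gu et al., 2020) may have used a different method.

```python
import math
from statistics import NormalDist


def per_group_n(d: float, power: float, alpha: float, attrition: float) -> int:
    """Per-group sample size for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    then inflated for expected attrition (an assumed inflation rule).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    n_completers = 2 * (z_alpha + z_power) ** 2 / d ** 2
    return math.ceil(n_completers / (1 - attrition))


n = per_group_n(d=0.5, power=0.90, alpha=0.05, attrition=0.30)
print(n, 2 * n)  # per-group and total sample size
```

Under these assumptions the calculation gives 121 per group and 242 in total, consistent with the figures reported above.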

| Randomization and blinding
Participants were randomized (1:1 allocation ratio) using a blocked randomization sequence online from Qualtrics survey software (Qualtrics, 2019). The study coordinator enrolled participants after the telephone call. Randomization and allocation were automated and carried out independently from the study team by Qualtrics. Participants and the trial coordinator were not blind to group allocation (digital CBT or waitlist control). The coordinator monitored uptake (download) of the intervention. All other members of the research team were blind to allocation. An external statistician (RE) was blind during the study and subgroup-unblind (groups labeled as "A" and "B") during the statistical analysis.
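For readers unfamiliar with blocked randomization, a minimal sketch of a permuted-block sequence with a 1:1 allocation ratio follows. The block size of 4, the seed, and the function name are assumptions made for illustration. The trial's sequence was generated within Qualtrics, and its block size is not reported here.

```python
import random


def blocked_sequence(n_participants, block_size=4,
                     arms=("digital CBT", "waitlist"), seed=None):
    """Return a permuted-block allocation sequence with equal arms per block.

    block_size is an assumed value, not the one used in the trial.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * per_arm  # equal numbers of each arm
        rng.shuffle(block)            # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]
```

Because every complete block contains the same number of allocations to each arm, group sizes stay balanced throughout recruitment rather than only at the end.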

| Statistical methods
Analyses were intention-to-treat using Stata Version 16 (StataCorp., 2017). The primary endpoint was analyzed using a linear mixed-effects model fitted to data at all post-randomization time points. The baseline outcome measure, group assignment, time, and time-by-group interactions were all included as fixed effects, and participants were included as random intercepts to account for repeated measures.
Missing outcome data were assumed to be missing at random. Secondary hypotheses were tested using analogous analyses.
Analyses of binary outcomes used logistic models on participants with outcome data to examine remission from initial assessment to post-intervention and initial assessment to follow-up, adjusting for baseline outcome measures. Sensitivity analyses tested the primary GAD-7 hypothesis using data from participants who downloaded the app, completed at least three modules, and completed all four modules of digital CBT. Separate sets of analyses were conducted for each of these three categories of intervention completion. For magnitude of change, we report Cohen's d standardized effect sizes (ES), estimated by dividing the adjusted mean difference by the pooled standard deviation (SD) of the corresponding outcome at baseline. A paired t test evaluated uncontrolled effects for outcomes for the Daylight group only between baseline and Week 26 follow-up. Exploratory analyses examined potential mediators underlying the effect of digital CBT compared with waitlist control on GAD symptom severity in single mediator models, and moderation analyses examined whether the between-group effect on GAD symptom severity was moderated by baseline variables (see Supporting Information for details). An unplanned interim analysis for efficacy was performed for post-intervention after collection of all participant data and partially unblinded the statistician for the follow-up assessment only.
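The effect size definition above (adjusted mean difference divided by the pooled baseline SD) can be sketched as follows. The function names and the example values in the usage note are hypothetical. In the trial the adjusted mean difference comes from the mixed-effects model, which is not reproduced here.

```python
import math
from statistics import variance


def pooled_sd(group_a, group_b):
    """Pooled standard deviation of two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return math.sqrt(pooled_var)


def cohens_d(adjusted_mean_difference, baseline_a, baseline_b):
    """Standardize a model-adjusted mean difference by the pooled baseline SD."""
    return adjusted_mean_difference / pooled_sd(baseline_a, baseline_b)
```

With made-up baseline scores of 14, 16, 18 in one group and 13, 15, 17 in the other, the pooled SD is 2, so an adjusted mean difference of −3 points corresponds to d = −1.5.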
Most participants were from the United States (n = 193), and 63 were from the UK. Table 1 provides an overview of demographics and baseline scores for both primary and secondary outcomes by group.
All 256 participants were recruited between August 2 and November 7, 2019, and the final controlled (Week 10) and uncontrolled (Week 26) follow-up assessments were completed on January 15 and May 11, 2020, respectively. No participants withdrew from the study.

| Intervention effects on secondary outcomes
Digital CBT led to significant improvements compared with waitlist control for worry (PSWQ), depressive symptoms (PHQ-9), [...] Table ST4). Significant improvements were found for digital CBT participants' first most important area for concern on the Patient-Generated Index with small-to-moderate effect sizes at post-intervention (d = .34) and follow-up (d = .41). No significant differences were present for the second most important concern at post-intervention; however, significant differences were observed at follow-up with a moderate effect (d = .42). Significant differences were present for participants' third most important concern at both post-intervention (d = .29) and follow-up (d = .67) with small and moderate effect sizes, respectively. The denominators are lower than the full sample due to missing data at post-intervention and follow-up. Due to the assumption that data are missing at random, we believe these percentages to be largely representative of the full intention-to-treat sample. Table ST5 reports effects for anxiety, depressive symptoms, and sleep difficulty. In terms of reliable deterioration (increase in GAD-7 scores of ≥5) in the digital CBT group, three participants experienced this at post-intervention and two at follow-up. In the control group, four experienced reliable deterioration at post-intervention and seven at follow-up. [...] concentrating and focusing on things, reduced motivation and/or energy, blurred vision, dizziness, and feeling irritable (see Table ST6).

| Exploratory mediation and moderation analyses
Post-intervention effects on the GAD-7 were mediated by mid- [...] (see Table ST7). No baseline variables moderated the effects of the intervention on the GAD-7 (see Table ST8).

| DISCUSSION
This trial is the first between-group efficacy and safety evaluation of a novel smartphone-delivered digital CBT intervention, Daylight, in participants with GAD. Daylight was designed to enhance efficient learning and implementation of CBT concepts and skills with engaging and brief techniques, personalized content and guidance, and flexible pacing. Results supported the primary hypothesis, that this fully automated intervention leads to large improvements in anxiety symptoms compared with a waitlist control at post-intervention (6 weeks; d = 1.08) and follow-up (10 weeks; d = 1.43). Results are noteworthy given that average baseline GAD-7 scores for both groups were in the severe range (mean > 15; Spitzer et al., 2006).
Improvements were clinically meaningful as 52% of participants in the digital CBT group experienced reliable remission in anxiety symptoms at post-intervention compared with 27% in the waitlist control. This increased to 71% versus 27%, respectively, in participants with outcome data at follow-up. The odds ratio for a reliable remission with the digital CBT intervention (compared with waitlist control) was 3.05 at post-intervention and increased to 5.84 at follow-up.
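To make the link between remission percentages and odds ratios concrete, the sketch below computes an unadjusted odds ratio from two proportions. This is illustrative arithmetic only. The reported odds ratios come from logistic models adjusted for baseline scores, so values computed from the rounded percentages will not match them exactly.

```python
def odds_ratio(p_treatment: float, p_control: float) -> float:
    """Unadjusted odds ratio from two event proportions (each strictly between 0 and 1)."""
    odds_treatment = p_treatment / (1 - p_treatment)
    odds_control = p_control / (1 - p_control)
    return odds_treatment / odds_control


# Rounded reliable-remission percentages taken from the text above
post_intervention = odds_ratio(0.52, 0.27)
follow_up = odds_ratio(0.71, 0.27)
print(round(post_intervention, 2), round(follow_up, 2))
```

The unadjusted values are roughly 2.9 and 6.6, in the same direction and of similar magnitude to the adjusted estimates of 3.05 and 5.84.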
We found improvements in secondary outcomes, underscoring the potential benefits of digital CBT on broader aspects of mental health and functioning (Figure 3) compared with a waitlist control.
Significant improvements were observed in worry, depressive symptoms, sleep difficulty, wellbeing, and participant-specific quality of life at both post-intervention (d = 0.34-0.73) and follow-up (d = 0.43-1.11). Despite moderate-to-large between-group differences for worry at post-intervention, the raw change within the digital CBT group was relatively small. The PSWQ was designed to evaluate trait worry and may be less sensitive to change (Meyer et al., 1990;Verkuil, Brosschot, & Thayer, 2007). Depressive symptom comorbidity was common in participants as expected in GAD (Kessler et al., 2005), and there were moderate-to-large effects on depressive symptoms at both post-intervention and follow-up.
Longer-term (uncontrolled) improvements were maintained for the [...] possible that the novel design of Daylight (i.e., intended for flexible daily use; optimized for engagement and accelerated learning of effective techniques) led to these larger effects.
The study has limitations. First, due to our interest in the scalable potential of smartphone-based digital CBT for GAD, both our outcome measures and inclusion criteria relied on self-reported symptoms. We did, however, verify baseline GAD diagnosis with a digital version of the MINI for DSM-5 (Sheehan et al., 2014) and the Structured Clinical Interview for DSM-5 (First et al., 2015) in a telephone call. Second, findings may not be generalizable to more diverse populations, and future studies should evaluate this, explore moderator effects in larger samples, and consider durability of effects over the longer term. Further, for this study, we selected a waitlist as the suitable control condition for digital CBT; this control group provides an index of the efficacy of digital CBT as would be offered by a provider, insurance company, or employer. This does not, however, control for potential expectation, demand, and attention effects. This study did not examine whether specific elements of digital CBT offered differential efficacy. Remission of GAD was predefined using a cut-off point of <10 on the GAD-7 and participants may have been classified as remitted but still experience subthreshold symptoms. Strengths of this study include the high rate of outcome assessment completion (≥89% across all controlled time points), good participant uptake, and subsequent digital CBT program use. The sample was also clinically relevant because participants reported previous use of pharmacologic or psychosocial treatment.

| SUMMARY
The results of this trial indicate that a novel smartphone-delivered, fully automated digital CBT intervention, Daylight, is efficacious and safe for improving anxiety symptoms in adults with GAD compared with waitlist control. Anxiety symptoms improved over time, and other pertinent areas of mental health and functioning including worry, depressive symptoms, sleep difficulty, wellbeing, and participant-specific quality of life also showed significant gains. Our findings suggest that digital CBT can be efficacious for individuals with GAD and may prove to be a scalable solution that avoids some of the burdens and limitations associated with in-person CBT.

FIGURE 3 Between-group effect sizes (d) for primary and secondary outcomes at mid-treatment (Week 3), post-intervention (Week 6), and follow-up (Week 10). CBT, cognitive behavioral therapy; CI, confidence interval; GAD-7, 7-item Generalized Anxiety Disorder scale; PHQ-9, 9-item Patient Health Questionnaire; PSWQ, Penn State Worry Questionnaire; SCI-8, 8-item Sleep Condition Indicator; WEMWBS, Warwick-Edinburgh Mental Wellbeing Scale