Clinical and cost-effectiveness of two ways of delivering guided self-help for people with an eating disorder: A multi-arm randomized controlled trial

Objective: Increasing the availability and accessibility of evidence-based treatments for eating disorders is an important goal. This study investigated the effectiveness and cost-effectiveness of guided self-help via face-to-face meetings (fGSH) and a more scalable method, providing support via email (eGSH). Method: A pragmatic, randomized controlled trial was conducted at three sites. Adults with binge-eating disorders were randomized to fGSH, eGSH, or a waiting list (WL) condition, each lasting 12 weeks. The primary outcome variable for clinical effectiveness was overall severity of eating psychopathology and, for cost-effectiveness, binge-free days, with explorative analyses using symptom abstinence. Costs were estimated from both a partial societal and healthcare provider perspective. Results: Sixty participants were included in each condition. Both forms of GSH were superior to the control condition in reducing eating psychopathology (IRR = −1.32 [95% CI −1.77, −0.87], p < .0001; IRR = −1.62 [95% CI −2.25, −1.00], p < .0001) and binge eating. Attrition was higher in eGSH. Probabilities that fGSH and eGSH were cost-effective compared with WL were 93% (99%) and 51% (79%), respectively, for a willingness to pay of £100 (£150) per additional binge-free day. Discussion: Both forms of GSH were associated with clinical improvement and were likely to be cost-effective compared with a waiting list condition. Provision of support via email is likely to be more convenient for many patients although the risk of non-completion is greater.

Cognitive behavior therapy (CBT) is a leading treatment for binge eating in adults who are not underweight and there is a "guided self-help" form of this treatment (GSH) that is briefer and can be used both in non-specialist settings (Wilson & Zandberg, 2012; e.g., see Carter & Fairburn, 1998) and as part of a "stepped care" model (Mitchell et al., 2011; NICE, 2017). Recent clinical guidelines have recommended GSH as the first-line treatment for non-underweight EDs characterized by recurrent binge eating, namely binge-eating disorder (BED) and bulimia nervosa (BN) (Beintner, Jacobi, & Schmidt, 2014; Hay et al., 2014; NICE, 2017). When implemented well, GSH has demonstrated "clear evidence" of superiority compared with waiting list or no-treatment control conditions (Yim & Schmidt, 2019, p. 234) and has been shown to be cost-effective (NICE, 2017; see also Le, Hay, & Mihalopoulos, 2018). Traditionally, the "guidance" in GSH involves short face-to-face sessions with a trained practitioner, referred to here as a facilitator (Carter & Fairburn, 1998), and requires the patient to attend a clinic.
GSH can be adapted to make it more scalable, or for when face-to-face treatment might be impossible. This has become a particularly salient issue of late (see Waller et al., 2020), with digital treatments becoming more prominent following the COVID-19 pandemic. Providing support via email may be more convenient for patients and cheaper to provide than conventional face-to-face treatment, although barriers to implementation include both patient and clinician acceptance. In a proof-of-concept study, Ljotsson et al. (2007) showed that providing guidance via email was acceptable and associated with large reductions in ED behaviors compared to a control condition. Digital delivery of treatment based on GSH principles has also shown promise on a larger scale, demonstrating significant reductions in ED psychopathology that were comparable to other digital interventions (Fitzsimmons-Craft et al., 2020), suggesting that significant gains can be made by participants in email-assisted GSH (see also Beintner et al., 2014).
Despite the promise of GSH for the treatment of binge eating, limited data exist regarding the cost-effectiveness of ED interventions (Le et al., 2018). Methods for estimating cost-effectiveness have varied and methodological problems are common, with interventions often focused on single disorders, such as BED and BN (Le et al., 2018). Looking at a broader sample of adults with regular binge eating, Lynch et al. (2010) conducted a secondary cost-effectiveness analysis from a randomized controlled trial and found that the addition of GSH to treatment-as-usual (TAU) resulted in incremental cost savings of $20.23 per binge-free day (relative to TAU alone). The authors concluded that GSH "is likely to be at least as cost-effective as many accepted depression treatments" (p. 329).
Using binge-free days as a cost-effectiveness outcome is a useful way to assess the value for money of new treatments when preference-based outcome measures are not available, and it complements cost-utility results when such measures are available (König et al., 2018). Furthermore, binge-free days reflect one of the key symptoms of interest in studies of recurrent binge eating and "transdiagnostic" samples (Lynch et al., 2010).
No studies have yet provided data on the cost-effectiveness of email-facilitated GSH for binge eating and there have been no estimates of its effects compared to conventional face-to-face GSH, which is necessary to evaluate moves from more "traditional" modes of treatment delivery to those which can be conducted online, for example. Inclusion of a control condition in addition to "active" treatments permits comparison to both the "natural course" of symptoms (Mohr et al., 2009, p. 276) and previous work (Ljotsson et al., 2007).
In the current study, outcomes from face-to-face GSH and email-delivered GSH were compared to those of a Waiting List control condition to examine whether these treatments are effective in naturalistic environments and, further, to investigate the differential effects of these interventions on several outcomes, including drop-out, symptom outcomes, and cost-effectiveness. As a secondary analysis (stated as an aim in the protocol; Jenkins, Luck, Burrows, & Boughton, 2014), the relative effectiveness of the two GSH treatments was explored.

| Study design and participants
This study was a pragmatic, parallel, three-arm randomized controlled trial, delivered in a routine clinical setting to provide a balance between internal and external validity. The study protocol has been published (Jenkins et al., 2014). Three conditions were evaluated in the treatment of recurrent, broadly-defined binge eating (i.e., subjective and objective episodes): face-to-face GSH (fGSH); email-based GSH (eGSH); and a waiting list comparison condition (WL). All three conditions lasted 12 weeks, with assessments at pretreatment and post-treatment.
Participants were recruited from consecutive referrals to National Health Service (NHS) ED centers serving a large population in central England. Following assessment with their local service, those eligible for the study were approached soon after by NHS clinicians and invited to participate in the trial. Three NHS sites were used but one was discontinued after 13 months due to inadequate recruitment. The study protocol was approved by the South Central-Oxford B Research Ethics Committee (13/SC/0217) and the trial was registered with ClinicalTrials.gov (NCT01832792).
Patients were eligible for inclusion in the study if they were aged over 17.5 years, and on clinical assessment had an ED characterized by recurrent objective or subjective binge eating. Exclusion criteria were recent rapid weight loss, being underweight (BMI < 18.5 kg/m²), and current and excessive substance misuse. Those who consented were randomly allocated to one of three conditions.
To begin treatment, patients allocated to either form of GSH met with a facilitator shortly after randomization. Outcome measures were completed at the start and end of the 12 weeks in all three conditions. Data were held securely: personal identifiers were removed and password-protected randomization details held separately.

| Randomization and masking
Participants were randomly assigned (1:1:1) to fGSH, eGSH, or WL. To maximize recruitment and efficiency of the trial (Kahan, 2016), participants allocated to the WL condition were offered randomization to one of the two treatment conditions after the 12 weeks had elapsed (see Jenkins et al., 2014). Sixty-six participants entered active treatment immediately following initial randomization, and 54 were re-randomized having completed a waiting period (thus participating in both the WL condition and one intervention). Simple randomization was carried out on an individual basis using a computer-generated code set out in advance (Jenkins et al., 2014). Participants and investigators were not blind to treatment assignment due to the nature of the interventions, although steps were taken to conceal allocation until the latest possible stage.
Data analysis was carried out according to the specified protocol by a statistician (CR) blind to treatment condition throughout the analyses.

| Interventions
The two GSH conditions involved participants following the cognitive behavioral self-help program Overcoming Binge Eating (Fairburn, 2013). Each participant received a printed copy of the programme in the initial face-to-face meeting with their facilitator. Participants in the fGSH condition then received up to 9 further face-to-face sessions (i.e., 10 in total), each lasting 20-25 min and occurring weekly at first. Those in the eGSH condition were asked to email their facilitator at least once a week (in lieu of attending in person) regarding their progress following the programme and then received written asynchronous feedback up to twice a week (Ljotsson et al., 2007). The role of facilitators in the eGSH condition was similar to that in fGSH, including provision of support and encouragement, instilling hope, and maintaining a focus on changing eating behavior (Fairburn, 1998). Eleven facilitators supported GSH; two were clinical psychologists (with doctoral-level training), three were qualified nurses with mental health experience (one of whom had advanced training in CBT), and the remainder were "paraprofessionals" (i.e., individuals with no specific professional background and no formal CBT-specific training).
The mean number of patients allocated to each facilitator was 10.6 although there was significant variability (range = 1-33 patients) as not all facilitators were employed for the duration of the trial.
Nine facilitators provided both treatments and two saw patients within eGSH only. The facilitators were provided with training by the first author, supplemented with a written manual detailing their role (Fairburn, 1998). In addition to face-to-face briefings on the nature of the treatment, facilitators received weekly individual supervision from PEJ, during which they discussed the content of their sessions and their adherence to the manual.

| Assessment of clinical outcomes
The Eating Disorder Examination Questionnaire (EDE-Q; Fairburn & Beglin, 2008) assesses ED features over the past 28 days. Twenty-two items can be combined to create a Global score, which provides an index of overall ED severity (Friborg, Reas, Rosenvinge, & Rø, 2013). In addition, self-reported frequencies of ED behaviors (objective binge eating [OBE], self-induced vomiting, laxative use) are generated. Cronbach's α for the Global score at baseline was 0.89. The sample size calculation described in the protocol paper was based on change in Global EDE-Q score. Objective binge eating was included as a secondary outcome, in addition to self-induced vomiting and laxative use (Linardon & Wade, 2018).
The Clinical Impairment Assessment (CIA; Bohn & Fairburn, 2008) assesses psychosocial impairment secondary to ED features, asking participants to rate the extent to which eating habits have affected several life domains, such as concentration and social relationships. Sixteen items are scored from 0 to 3, with higher scores indicating greater impairment. Cronbach's α was 0.91.

The 34-item Clinical Outcomes in Routine Evaluation-Outcome Measure (CORE-OM; Barkham et al., 2001) was designed for use in evaluating the effectiveness of psychological therapies and included as a measure of psychological distress. The measure assesses symptoms experienced over the previous week and items are scored from 0 to 4. The Total score is calculated as a mean of all items, multiplied by ten. Cronbach's α was 0.94. The Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1989) is a 10-item measure, where higher scores indicate better self-esteem.
Where items from questionnaires were missing, pro-rating (substituting a missing item with a mean of the scale [CIA, RSES] or subscale [CORE-OM]) was used at the data input stage. Treatment completion was also recorded, using attendance at all planned fGSH sessions as indicative of completion. The number of email contacts was recorded for those in the eGSH condition.
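The pro-rating rule and the CORE-OM Total score described above can be sketched in Python. This is a minimal illustration rather than the authors' data-entry procedure: the function names are hypothetical, missing items are represented as `None`, and pro-rating is applied over whatever list of items is passed in (a whole scale for the CIA and RSES, a single subscale for the CORE-OM).

```python
from statistics import mean
from typing import List, Optional, Sequence


def prorate(items: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing item (None) with the mean of the answered items.

    Hypothetical helper: pass a whole scale (CIA, RSES) or one subscale (CORE-OM).
    """
    answered = [x for x in items if x is not None]
    if not answered:
        raise ValueError("cannot pro-rate when no items were answered")
    fill = mean(answered)
    return [fill if x is None else x for x in items]


def core_om_total(items: Sequence[float]) -> float:
    """CORE-OM Total score: mean of all items (each scored 0-4), multiplied by ten."""
    return mean(items) * 10


# Example: one missing item on a three-item subscale is filled with the
# mean of the two answered items.
completed = prorate([1, None, 3])
```

Pro-rating before scoring keeps the denominator of the Total score fixed, so participants with one or two missing items remain on the same scale as those with complete questionnaires.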

| Economic outcomes
Binge-free days were derived from the EDE-Q and used as a primary outcome, with additional explorative analyses using abstinence as an outcome. Information on resource use was collected retrospectively at the end of the treatment (with reference to the previous 3 months) using a questionnaire designed for the current study (for details, see Jenkins et al., 2014). In the base case analysis, resource use and costs were estimated from a partial societal perspective, which included (NHS) healthcare use and costs, patients' opportunity costs (e.g., cost of attendance time), patients' travel time and costs, as well as wider societal costs (i.e., productivity losses operationalized as absenteeism and presenteeism), whenever applicable. The societal perspective did not include carers'/partners' cost, hence we use the term "partial".
Presenteeism, defined as days of reduced productivity, was captured in our questionnaire by participant report. Valuation of presenteeism (i.e., cost), was based on the findings of a sample of individuals with BED (Pawaskar et al., 2017) who reported around 30% of time lost due to impaired productivity. Thus, if a participant in our study reported 10 days of reduced productivity, this was "costed" as the equivalent of 3 days of lost work, using the Human Capital Method.
This involved taking the number of days missed in the last 3 months and multiplying this by the equivalent mean wage (£16.65 per hour; £124.88 per day [USD: $23.35 and $175.14, respectively]). The wage estimate was based on that of a female in her late 20s (ONS, 2017) given the median age and gender distribution of this sample. Those who were unemployed or retired were recorded as zero. In secondary analyses, a healthcare provider (NHS) perspective was adopted, including only NHS resource use and costs. No individuals were admitted as inpatients in the 3 months prior to study entry.
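The costing rule for productivity losses can be illustrated with a short Python sketch. The wage and the 30% presenteeism weight are taken from the text; the function name, its arguments, and the choice to add absenteeism and weighted presenteeism days together before valuation are assumptions made for illustration only.

```python
DAILY_WAGE_GBP = 124.88       # mean daily wage reported in the text
PRESENTEEISM_FRACTION = 0.30  # share of working time lost on a reduced-productivity day


def productivity_cost(days_absent: float, days_reduced: float,
                      employed: bool = True) -> float:
    """Value lost productivity over the 3-month recall period (hypothetical helper).

    Days of reduced productivity are converted to equivalent lost days
    (x 0.30) and added to days missed; the sum is valued at the daily wage.
    Unemployed or retired participants are recorded as zero.
    """
    if not employed:
        return 0.0
    equivalent_days_lost = days_absent + PRESENTEEISM_FRACTION * days_reduced
    return round(equivalent_days_lost * DAILY_WAGE_GBP, 2)


# The worked example from the text: 10 days of reduced productivity are
# costed as 3 lost days, i.e. 3 x 124.88 GBP.
example_cost = productivity_cost(days_absent=0, days_reduced=10)
```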

| Sample size
A power calculation based on Global EDE-Q change indicated that 17 individuals per group were required to detect a large-sized effect (with the contingency of wide confidence intervals around prior estimates of effect size; Jenkins et al., 2014). The trial was terminated in advance of the planned completion date as recruitment targets were met more quickly than anticipated.

| Statistical analysis
The primary analysis was by intention-to-treat (ITT) with missing data imputed using multiple imputation by chained equations; 20 datasets were generated for each outcome. Individual analyses on each imputed dataset were combined (Rubin, 1987). The imputation model contained predictors of the missing data mechanism (treatment arm, age, weight, height, ED diagnosis) and all variables contained in the substantive model. Imputation was performed using predictive mean matching regression for count outcomes and truncated regression for continuous outcomes. An available case analysis was performed as a secondary analysis. Data for seven individuals did not contain enough information for imputation and were excluded from subsequent analysis. We therefore refer to this as a modified ITT (mITT) design in the remainder of this paper.
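The step in which analyses of the imputed datasets are combined (Rubin, 1987) can be sketched as follows. This is a generic illustration of Rubin's combination rules, not the Stata routine used in the trial; the function name is hypothetical, and the inputs are assumed to be one point estimate and one squared standard error per imputed dataset.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Combine per-imputation results with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled_estimate = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance
```

With 20 imputed datasets, as in this trial, the between-imputation variance is inflated by a factor of 1 + 1/20 before being added to the within-imputation variance.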
Analysis was by mixed effects model to account for the potential clustering of individual outcomes within facilitators. Treatment arm and baseline value of the outcome were included as fixed effects with a random effect for facilitator. Negative binomial mixed effects regression was used to deal with overdispersion for count outcomes (e.g., OBE frequency) and mixed effects logistic regression for binary outcomes (e.g., treatment completion). All other outcomes were analyzed using mixed effects linear regression. The primary comparisons were between each treatment arm and control at post-treatment in order to investigate treatment effectiveness relative to a control and provide effect estimates that are generalizable to existing studies. The comparison between the two treatment arms is presented, although this is characterized as a secondary aim given limited discussion of non-inferiority design in the protocol. A subgroup analysis based upon baseline diagnosis (BN, BED, other specified feeding and eating disorder [OSFED]) was conducted for the primary outcome by adding a diagnosis-by-treatment interaction term to the model. Analyses were performed using Stata Version 13.1. Statistical significance was assessed at the two-tailed 5% level.
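One way to write down the model structure described above is given below. The notation is an assumption introduced here for illustration (the paper does not give a formula): $y_{ij}$ is the post-treatment outcome for participant $i$ supported by facilitator $j$, $y^{\text{base}}_{ij}$ is the baseline value of the same outcome, and $u_j$ is the facilitator random effect.

```latex
% Linear mixed model for continuous outcomes (illustrative notation)
y_{ij} = \beta_0 + \beta_1\,\mathrm{fGSH}_{ij} + \beta_2\,\mathrm{eGSH}_{ij}
       + \beta_3\, y^{\text{base}}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)

% Count outcomes (negative binomial, log link): same linear predictor
\log \mathrm{E}\!\left[\, y_{ij} \mid u_j \,\right]
     = \beta_0 + \beta_1\,\mathrm{fGSH}_{ij} + \beta_2\,\mathrm{eGSH}_{ij}
       + \beta_3\, y^{\text{base}}_{ij} + u_j
```

Under the log link, $\exp(\beta_1)$ and $\exp(\beta_2)$ are interpretable as incidence rate ratios for each treatment arm relative to the waiting list.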

| Economic analyses
Current guidelines for conducting and reporting economic evaluations alongside trials (Husereau et al., 2013) were followed in order to enhance transparency and completeness of outcome reporting. A partial societal perspective was adopted for the base case analysis to assess cost-effectiveness of the treatments compared to WL. In addition, a healthcare provider perspective was reported in secondary analyses to inform decision-making about use of eGSH in the treat-  Table S1). Incremental cost-effectiveness ratios were estimated and reported. Uncertainty in the cost-effectiveness results was analyzed using cost-effectiveness acceptability curves (CEACs), derived using Fieller's theorem (Chaudhary & Stearns, 1996;Gray, Clarke, Wolstenholme, & Wordsworth, 2011;Polsky, Glick, Willke, & Schulman, 1997;Willan & O'Brien, 1996) over a range of potential threshold values (Fenwick, Marshall, Levy, & Nichol, 2006) that the NHS and wider society might be willing to pay for an additional binge-free day or per abstinent patient (see also Le et al., 2018).
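The quantities named above, an incremental cost-effectiveness ratio and the probability of cost-effectiveness at a given willingness-to-pay threshold, can be illustrated with a short Python sketch. It uses the incremental net benefit with a normal approximation, which is a swapped-in simplification and not the Fieller-based derivation cited in the paper; all function names and inputs are hypothetical.

```python
from math import erf, sqrt


def icer(delta_cost: float, delta_effect: float) -> float:
    """Incremental cost per additional unit of effect (e.g., per binge-free day)."""
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_effect


def prob_cost_effective(delta_cost: float, delta_effect: float,
                        var_cost: float, var_effect: float,
                        cov_cost_effect: float, wtp: float) -> float:
    """P(incremental net benefit > 0) at willingness to pay `wtp`.

    Net benefit is wtp * delta_effect - delta_cost; its sampling distribution
    is approximated as normal (an assumption of this sketch).
    """
    inb = wtp * delta_effect - delta_cost
    var_inb = wtp ** 2 * var_effect + var_cost - 2 * wtp * cov_cost_effect
    if var_inb <= 0:
        raise ValueError("variance of net benefit must be positive")
    z = inb / sqrt(var_inb)
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF


def ceac(thresholds, **kwargs):
    """One point on the acceptability curve per willingness-to-pay threshold."""
    return [(wtp, prob_cost_effective(wtp=wtp, **kwargs)) for wtp in thresholds]
```

Evaluating `ceac` over a grid of thresholds traces out an acceptability curve; when the threshold equals the ICER, the net benefit is zero and the probability under this approximation is 0.5.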

| Data sharing
We did not seek consent to share data in an online repository.

| Participant flow and recruitment
Recruitment ran from August 1, 2013 until June 1, 2016. All individuals eligible to receive GSH (N = 168) were invited to participate in the trial. One quarter declined and were offered TAU outside the trial.
The CONSORT flow diagram (Figure 1) shows the flow of participants through the study, including the subgroup (n = 54) who were allocated to treatment following a waiting period. Table 1 shows the demographic characteristics of the participants in the three study conditions (see also Table S2).

| Treatment completion
The odds of completing fGSH (41/60 = 68.3%) were 3.73 times higher than those of completing eGSH (22/60 = 36.7%) (95% CIs: 1.75-7.94). The distribution of the reasons given for non-completion did not differ between the two forms of GSH (see Table S3). Mean (SD) number of contacts was 7.60 (3.40) in the fGSH condition and 8.98 (7.46) in the eGSH condition.

FIGURE 1 Participant flow through trial
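The completion odds ratio in this section can be checked from the reported counts with a crude (unadjusted) calculation. The sketch below uses a standard Wald interval on the log odds ratio; the function name is hypothetical, and this is not the mixed effects logistic regression described in the analysis plan, although the unadjusted figures agree with the reported values to two decimal places.

```python
from math import exp, log, sqrt


def odds_ratio_wald(a_events: int, a_total: int,
                    b_events: int, b_total: int, z: float = 1.96):
    """Unadjusted odds ratio (group A vs. group B) with a Wald 95% CI."""
    a_non = a_total - a_events
    b_non = b_total - b_events
    odds_ratio = (a_events / a_non) / (b_events / b_non)
    # Standard error of the log odds ratio from the four cell counts
    se_log_or = sqrt(1 / a_events + 1 / a_non + 1 / b_events + 1 / b_non)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Completers: 41 of 60 in fGSH, 22 of 60 in eGSH
or_, lo, hi = odds_ratio_wald(41, 60, 22, 60)
```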

| Secondary outcomes
Objective binge eating was reported in 173 cases at pre-treatment, and this subset was used to calculate cessation from binge eating (i.e., mITT). At the end of the 12 weeks, nine fGSH patients (16.1% of 56) had ceased binge eating compared with 10 (17.9% of 56) in the eGSH condition and one (1.9% of 54) in the Waiting List condition. As shown in Table 2, frequency of OBEs at post-treatment was significantly lower in the two GSH conditions than in the waiting list condition (effects were smaller in mITT analyses compared to available case analyses; see Table 3). Improvements were also seen in CIA total

| Comparison of email versus face-to-face
No statistically significant differences were seen between the two active treatments (fGSH, eGSH) on any of the primary or secondary outcome variables.

| Diagnostic differences
The effect of treatment on Global EDE-Q scores did not differ by baseline diagnosis (all ps > .88).

| Adverse events
Four adverse events occurred (Table S4). One concerned an eGSH patient who was unable to contact her facilitator consistently due to the intermittent blocking of her emails. The patient felt that this affected her motivation to continue in treatment. The other adverse events concerned deterioration in three participants' mental health, each deemed unrelated to treatment, and so these patients remained in the study.

| Economic analysis
... (eGSH); see Table 5. Differences in the societal perspective between the two active interventions were largely accounted for by higher rates of presenteeism in the eGSH condition. Taking sampling uncertainty into consideration, the CEACs for the two primary base-case analyses showed that, in view of the joint distribution of incremental mean costs and effects, the probabilities that fGSH and eGSH were cost-effective compared with WL were 93% (99%) (Table 4 and Figure S1) and 51% (79%) (Table 4 and Figure S2), respectively, for a willingness to pay of £100 (£150) per additional binge-free day.
Corresponding results for the additional explorative analyses are reported in Table 5 and Figures S5 and S6.

| DISCUSSION
Guided self-help is recommended for the treatment of both BN and BED in clinical guidelines such as those from NICE (2017). However, the optimal means of delivering guidance is uncertain. To address this, the current study compared two independent methods of delivery against a waiting list condition. The findings indicated that both methods were superior to a waiting list control condition and that treatment completion was higher when the guidance was provided face-to-face (Beintner et al., 2014). The findings suggest that both methods are preferable to a waiting period, although the risk of attrition from treatment is higher when guidance is provided via e-mail.
be as acceptable as fGSH (see also Beintner et al., 2014; Linardon, Messer, Lee, & Rosato, 2020) although either is likely to be preferable to more limited treatment access (Watson et al., 2018). Whether this difference is due to its asynchronous nature or the absence of face-to-face contact is not clear. The finding highlights the need to explore why levels of attrition differ between these treatments, and if this has any effect on outcome in the longer-term (see Hildebrandt et al., 2017). It is possible that participants did not open their e-mails, although the digital nature of the treatment was made clear early on. In the current study, three participants did not respond to e-mail contact following their assessment and it was unclear whether this reflected impaired access or a desire to discontinue treatment.
The fact that a greater number of patients in the eGSH group were previously in the WL condition may have affected outcomes, although this was not reflected in the proportions of those completing treatment. Nonetheless, future studies should consider counterbalancing given the association between long waiting times and clinical outcomes (Carter et al., 2012) although it is possible that no such effect is seen with shorter waiting times (Pellizzer, Waller, & Wade, 2019). Similarly, although contact with other healthcare professionals was monitored for the economic analyses, it is possible that such uncontrolled factors may have affected adherence.
In terms of cost-effectiveness, from a (partial) societal perspective GSH delivered either face-to-face or via email was found to be cost-effective compared to a waiting list, with the costs of treatment being

TABLE 4 Cost-effectiveness analyses with binge-free days as the measure of outcome

The thresholds of £2,000, £4,000, and £8,000 have been provided for illustrative purposes only because the maximum threshold value that the healthcare provider and society are willing and able to pay for an additional abstinent day is unknown.
The current study, which should be considered explorative given its shortcomings, adds to the growing literature on cost-effectiveness evaluations of ED treatments, being, to the best of our knowledge, the first cost-effectiveness study alongside a RCT including individuals with OSFED. However, as noted by other authors (König et al., 2018; Le et al., 2018), the amount wider society or healthcare providers are willing to pay for an additional binge-free day or additional abstinent patient is unknown. One study (Lynch et al., 2010) found that a GSH-based intervention was cost-saving (from the societal perspective) compared to TAU, and another estimated that conventional CBT was associated with an additional cost of €63 per binge-free day compared to an online GSH program (König et al., 2018 Watson et al., 2018). However, such differences may be due to the fact that our study does not include follow-up data, and this is an important limitation. Further work should expand on these preliminary findings, which were based on relatively small numbers and should be considered exploratory, perhaps considering a noninferiority trial with a pre-specified margin to directly compare these treatments in terms of both clinical effectiveness and cost-effectiveness. It is also recommended that additional measures used to estimate health (e.g., preference-based outcomes) are included to inform further cost-effectiveness and cost-utility studies, and that the possible effect of comorbidities is afforded appropriate consideration (Kessler, Ormel, Demler, & Stang, 2003).
Levels of symptom change were in line with the results of a systematic review by Linardon and Wade (2018), which noted binge eating cessation rates in CBT-based self-help treatments of around 14%. In the present study, at post-treatment, 16.1% of individuals who began fGSH and 17.9% of individuals in eGSH were abstinent from binge eating. These numbers are lower than those found in studies of CBT in naturalistic settings (with around 35% of intent-to-treat samples achieving abstinence; Linardon, Messer, & Fuller-Tyszkiewicz, 2018) although GSH is highly likely to be cost-effective when provided as part of a stepped care model (NICE, 2017). Another noteworthy finding was that, unlike fGSH, eGSH was not significantly more effective in reducing self-induced vomiting than the waiting list condition. This result is consistent with the observation that CBT-based self-help is less effective in reducing purging than binge eating (Linardon & Wade, 2018).
The sample was recruited from a specialist ED service, and thus degree of morbidity was high. The severity of psychopathology and impairment was in line with similar samples (Jenkins, 2013), with over 90% having a clinically significant level of psychosocial impairment at baseline. Similarly, almost 90% scored above the clinical cut-off for the CORE-OM, a measure of general psychological distress. At post-treatment, significant improvements were seen on eating psychopathology following both fGSH and eGSH, with large differences compared to the waiting list control condition.
Strengths of the study include the fact that the GSH programme used has been extensively studied and widely translated. In addition, the sample was recruited from NHS clinics and is likely to be representative of patients referred to many specialist treatment centers, especially given the inclusion of individuals with broadly-defined binge eating. Measures with established psychometric properties were used in the study and the analyses followed best-practice guidelines for randomized and health economics studies, including cost estimates from both healthcare provider and societal perspectives. The findings also add to the literature on cost-effectiveness for outpatient treatment of EDs (Watson et al., 2018), and provide detailed information on a variety of resource use and costs relevant to patients with EDs, which could be used as parameters to inform future modeling studies. Inclusion of an estimate of productivity costs was also a strength given that these have often been overlooked in previous studies (Le et al., 2018).
The pragmatic nature of the study also resulted in some limitations. First, the waiting list control condition was included to control threats to internal validity but should not be considered as equivalent to no treatment and may inflate between-group effects (Mohr et al., 2009). Second, as noted above, the time horizon for the economic analysis did not include follow-up, and is likely to underestimate the cost impact of interventions to the patient, society, and healthcare provider. A more nuanced assessment of productivity costs would have been desirable (see Brouwer, Koopmanschap, & Rutten, 1998), so economic analyses are considered as mainly explorative. Although the re-randomization design has been evaluated as an efficient alternative to parallel designs, this is a relatively new idea and recommendations regarding best practice are needed (Kahan, 2016).
This trial is the first to compare the provision of GSH in person, GSH via email, and a control condition. The findings indicate that both forms of GSH were clinically effective and likely to be cost-effective compared to a waiting list condition and thus are valuable brief interventions for the treatment of EDs.

ACKNOWLEDGMENTS
We are grateful for the support of all staff and patients who were involved in the study. C.G. Health NHS Foundation Trust. The research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CONFLICT OF INTEREST
C.G.F. developed the treatment manual and contributed to the design of the trial. All other authors have nothing to declare.

ETHICS STATEMENT
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. Informed consent was obtained from all participants.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request. The data are not publicly available due to privacy or ethical restrictions (participants were not informed that this would be the case).