Therapist adherence in the strong without anorexia nervosa (SWAN) study: A randomized controlled trial of three treatments for adults with anorexia nervosa

Objective To develop a psychotherapy rating scale to measure therapist adherence in the Strong Without Anorexia Nervosa (SWAN) study, a multi-center randomized controlled trial comparing three different psychological treatments for adults with anorexia nervosa. The three treatments under investigation were Enhanced Cognitive Behavioural Therapy (CBT-E), the Maudsley Anorexia Nervosa Treatment for Adults (MANTRA), and Specialist Supportive Clinical Management (SSCM). Method The SWAN Psychotherapy Rating Scale (SWAN-PRS) was developed, after consultation with the developers of the treatments, and refined. Using the SWAN-PRS, two independent raters initially rated 48 audiotapes of treatment sessions to yield inter-rater reliability data. One rater proceeded to rate a total of 98 audiotapes from 64 trial participants. Results The SWAN-PRS demonstrated sound psychometric properties, and was considered a reliable measure of therapist adherence. The three treatments were highly distinguishable by independent raters, with therapists demonstrating significantly more behaviors consistent with the actual allocated treatment compared to the other two treatment modalities. There were no significant site differences in therapist adherence observed. Discussion The findings provide support for the internal validity of the SWAN study. The SWAN-PRS was deemed suitable for use in other trials involving CBT-E, MANTRA, or SSCM. © 2015 The Authors. International Journal of Eating Disorders Published by Wiley Periodicals, Inc. (Int J Eat Disord 2015; 48:1170–1175)


Introduction
In clinical trials, measuring the extent to which therapists implement treatments in accordance with their respective protocols is essential if conclusions regarding treatment efficacy are to be drawn with confidence. 1,2 Assessing therapist adherence also indicates whether the treatments under investigation can be differentiated. 2,3 Therapist adherence is commonly measured by reviewing recordings of therapy sessions and rating whether core treatment components are observed, using a suitable rating scale. These rating scales also tend to include items that measure nonspecific factors, such as therapist empathy. 1,4,5 A well-known adherence scale is the Collaborative Study Psychotherapy Rating Scale (CSPRS), which was designed to assess whether therapists involved in the National Institute of Mental Health Treatment of Depression Collaborative Research Program adhered to the four treatments being compared (Unpublished manuscript). 6,7 Because there are no "gold standard" treatments for adults with anorexia nervosa (AN), it is particularly important to ensure that treatments involved in randomized clinical trials (RCTs) are implemented in line with their specifications. Only one study has examined therapist adherence in an RCT of treatments for AN. 8,9 McIntosh et al. 9 used a modified version of the CSPRS (CSPRS-AN) to investigate adherence to, and differentiation between, Cognitive Behavioural Therapy, Interpersonal Psychotherapy, and Specialist Supportive Clinical Management (SSCM). Results indicated that the 90-item CSPRS-AN was able to differentiate treatments reliably, and no differences in therapist adherence were found.
The current study examined therapist adherence in the Strong Without Anorexia Nervosa (SWAN) study, a multi-center RCT comparing three different psychological treatments for adults with AN 10: Enhanced Cognitive Behavioural Therapy (CBT-E) 11; Maudsley Model of Anorexia Nervosa Treatment for Adults (MANTRA) 12,13; and SSCM. 14 The aims were to (1) develop and test a measure of therapist adherence for use in the SWAN study, (2) assess therapist adherence to the treatments under investigation, and (3) examine inter-site differences in therapist adherence.

Method
Participants Participants in the SWAN study were 120 individuals (97.5% female) recruited at three Australian sites: Perth (n = 80), Adelaide (n = 21), and Sydney (n = 19). Participants in the current study were 64 females drawn from this broader participant pool (Age: M = 26.72, SD = 10.21). Ethics approval was obtained and all participants provided informed consent.
Inclusion criteria for the SWAN study were: body mass index (BMI) between 14.0 and 18.5; aged 17 years and over; and meeting diagnostic criteria A and B for AN in DSM-IV-TR. 15 Exclusion criteria were severe medical or suicidal risk, inability to complete the full treatment course, and current use of olanzapine or other active psychotherapy. Participants were randomized to one of the three treatments. The number of treatment sessions allocated was titrated according to BMI (40 sessions for BMI < 16; 30 sessions for BMI 16-17.5; 25 sessions for BMI 17.5-18.5).
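The titration rule above can be written as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and because the reported bands share the boundary value 17.5, the code assumes that a BMI of exactly 17.5 falls in the 30-session band. It does not check the trial's inclusion range.

```python
def sessions_for_bmi(bmi: float) -> int:
    """Return the number of allocated sessions for a given BMI.

    Bands follow the text: <16 -> 40, 16-17.5 -> 30, 17.5-18.5 -> 25.
    Assumption (not stated in the source): exactly 17.5 maps to 30 sessions.
    """
    if bmi < 16.0:
        return 40
    if bmi <= 17.5:  # assumed handling of the shared boundary
        return 30
    return 25


if __name__ == "__main__":
    for example_bmi in (15.2, 16.8, 18.0):
        print(example_bmi, sessions_for_bmi(example_bmi))
```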
Therapists were psychologists (n = 8) with at least two years of experience in delivering specialized psychological treatments for eating disorders. Therapists delivered all three treatments and received training by the treatment developers prior to study commencement. Therapists attended 2 h of supervision weekly with chief investigators (SB, TW, PH).
Raters were two female postgraduate clinical psychology students (ET, LA). Raters received 15 h of training in the use of the SWAN Psychotherapy Rating Scale to ensure consistent interpretation of scale items.

Measures
Audiotapes. Ninety-eight audiotapes of full therapy sessions were randomly selected by the trial co-ordinator (KA; Perth = 72 audiotapes; Adelaide = 9; Sydney = 17). Audiotapes were selected from the early-mid treatment phase and the mid-late treatment phase. One audiotape was excluded due to crisis management circumstances. Audiotapes were selected to ensure that all treatments were comparably represented (CBT-E = 30 audiotapes; MANTRA = 32; SSCM = 35).

SWAN Psychotherapy Rating Scale (SWAN-PRS).
The SWAN-PRS was developed by adapting the CSPRS-AN 9 to form a 52-item measure with 15 CBT-E specific items; 10 MANTRA specific items; 4 SSCM specific items; 11 overlap items; and 12 Non-Specific items. Items are rated on a Likert-type scale ranging from 1 (not at all) to 7 (extensively). Higher scores indicate greater adherence to the specified therapist behavior.

Procedure
Both raters co-rated 48 audiotapes to provide a measure of inter-rater reliability before Rater 1 proceeded to rate the remaining 50 audiotapes independently. Raters were blind to treatment type.

Factor Analysis
Principal Axis Factor Analysis with oblique rotation was used to examine the underlying factor structure of the SWAN-PRS to allow robust, treatment-specific subscales to be determined. The analysis yielded a final SWAN-PRS that included 8 CBT-E items, 9 MANTRA items, 4 SSCM items, and 12 Non-Specific Factor items. See Appendices A and B for further details pertaining to factor analyses and inter-rater reliability.

Agreement Between Treatment Allocation and Treatment Classification
Mean subscale scores for CBT-E, MANTRA, and SSCM were calculated for all audiotapes. The highest subscale score for each audiotape was used to determine treatment classification (e.g., audiotapes were classified as SSCM if the SSCM subscale score was greater than the MANTRA or CBT-E subscale scores). Eighty-six percent (N = 83/97) of total audiotapes were correctly classified as the treatment delivered. For CBT-E (n = 30), 90.0% of tapes were classified accurately, while 6.7% were misclassified as MANTRA, and 3.3% as SSCM. For MANTRA (n = 32), 81.2% were correctly classified, while 6.3% were misclassified as CBT-E and 12.5% as SSCM. For SSCM (n = 35), 85.7% were accurately classified, while 8.6% were misclassified as MANTRA and 5.7% as CBT-E. There were no significant differences between treatments with regard to agreement between treatment allocation and treatment classification, χ²(df = 2) = 0.961, p = .619.
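The classification rule and the agreement statistics above can be illustrated with a short sketch. It is not the authors' analysis code: the function names are hypothetical, the per-treatment counts are back-calculated from the reported percentages (for example, 90.0% of 30 CBT-E tapes is 27), and the test is assumed to be a Pearson chi-square on a treatment by correct/incorrect table.

```python
import math

SUBSCALES = ("CBT-E", "MANTRA", "SSCM")


def classify(scores):
    """Classify one audiotape as the subscale with the highest mean score."""
    return max(SUBSCALES, key=lambda name: scores[name])


# (correct, misclassified) counts per allocated treatment, derived from
# the percentages reported in the text; these are reconstructed, not raw data.
COUNTS = {"CBT-E": (27, 3), "MANTRA": (26, 6), "SSCM": (30, 5)}


def agreement(counts):
    """Return (correct, total, proportion correct) across all treatments."""
    correct = sum(c for c, _ in counts.values())
    total = sum(c + w for c, w in counts.values())
    return correct, total, correct / total


def chi_square(counts):
    """Pearson chi-square for a treatments x (correct, incorrect) table."""
    rows = list(counts.values())
    n = sum(sum(row) for row in rows)
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for j in range(2):
            expected = row_total * col_totals[j] / n
            stat += (row[j] - expected) ** 2 / expected
    df = len(rows) - 1
    # For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
    p = math.exp(-stat / 2) if df == 2 else None
    return stat, df, p


if __name__ == "__main__":
    # Hypothetical subscale means for a single tape.
    print(classify({"CBT-E": 2.1, "MANTRA": 1.4, "SSCM": 4.8}))
    print(agreement(COUNTS))
    print(chi_square(COUNTS))
```

Under these assumptions the reconstructed counts give 83 of 97 tapes correctly classified, and the chi-square statistic and p value agree with the reported values to three decimal places.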
One-way ANOVA was used to compare subscale scores for each treatment category. There was a significant overall difference found between allocated treatment groups on the CBT-E (F(2) = 97.68, p < .001), MANTRA (F(2) = 41.50, p = .001), SSCM (F(2) = 57.68, p < .001), and Non-specific (F(2) = 7.58, p < .05) subscales. Post hoc analyses (Games-Howell tests) indicated that (i) the CBT-E subscale score was significantly higher for CBT-E than for MANTRA or SSCM (ps < .001), with no significant difference between MANTRA and SSCM (p = .754); (ii) the MANTRA subscale score was significantly higher for MANTRA than for CBT-E and SSCM (ps < .001), with no significant difference between CBT-E and SSCM (p = .99); (iii) the SSCM subscale score was significantly higher for SSCM than for CBT-E and MANTRA (ps < .001) and the MANTRA score was significantly higher than the CBT-E score (p < .05); (iv) the Non-Specifics subscale score was significantly higher for CBT-E (M = 5.14, SD = 0.44) than for SSCM (p < .01) and MANTRA (p < .05) but there was no significant difference between MANTRA and SSCM (p = .76). These results are illustrated in Figure 1.

Site Differences
Fisher's Exact tests indicated that there were no significant differences in overall adherence between sites (p = .52), nor for CBT-E (p = .99), MANTRA (p = .36), or SSCM (p = .61) when considered separately. Kruskal-Wallis one-way ANOVA was used to compare subscale scores for each treatment according to site. There were no significant differences between sites in: CBT-E subscale scores for CBT-E therapy sessions, F(2) = 0.37, p = .691; MANTRA subscale scores for MANTRA sessions, F(2) = 0.36, p = .70; or SSCM subscale scores for SSCM sessions, F(2) = 0.37, p = .69. Subscale scores were appropriately higher for the allocated treatment compared to the other treatments. There were also no significant differences between sites in Non-Specifics subscale scores, F(2) = 0.26, p = .08.

Discussion
The CSPRS-AN 9 was successfully adapted to form a therapist adherence measure that can reliably distinguish between, and measure therapist adherence to, CBT-E, MANTRA, and SSCM. Using the SWAN-PRS, we were able to demonstrate very high agreement between actual treatment allocation and treatment classification in the SWAN study, with 85.6% of audiotapes being correctly classified by independent raters. Further, SWAN-PRS ratings demonstrated significantly higher mean therapy-specific subscale scores appropriate to the actual allocated treatment modality. We were also able to demonstrate that there were no inter-site differences in therapist adherence with regard to either treatment classification or mean subscale ratings for CBT-E, MANTRA, SSCM, or the Non-Specifics subscale. These findings confirm that therapists adhered strongly to treatment protocols; that the three treatment modalities could be reliably distinguished; and that therapist adherence was not influenced by treatment site. This not only provides evidence regarding the relative ease with which treatments can be disseminated across sites, but also provides evidence for the internal validity of the study.
Limitations of the present study included the use of only two raters to establish inter-rater reliability, the use of data generated by only one rater for the remaining analyses, and the uneven distribution of audiotapes available for each treatment site. In addition, whilst the SWAN-PRS was developed to measure the extent to which therapists demonstrated behaviors consistent with treatment protocols, it did not measure therapist competence, or the extent to which therapists delivered treatments according to an acceptable standard. 3,11 Therapist competence is another important aspect of treatment integrity in outcome research and should form the focus of future research.
In summary, measuring therapist adherence in the SWAN study was essential to provide evidence of the scientific quality of the trial. The rigorous development and piloting process involved in the development of the SWAN-PRS resulted in a reliable measure of therapist adherence to the three treatment modalities under investigation. The SWAN-PRS can now be used to demonstrate therapist adherence in other contexts involving these treatments. It could also generate useful information to assist with therapist training.
The item inter-correlation matrix was examined and items that correlated weakly (<0.40) with other items on the same subscale were removed. The remaining items were entered into a Principal Axis Factor analysis with oblique rotation, with three factors requested. The three factors explained 60.1% of the variance, although the third factor was weak (eigenvalues 6.98, 4.94, and 1.23). The first two factors were clearly representative of the CBT-E and MANTRA subscales. As expected, the four SSCM items tended to cross-load across the first two factors rather than loading strongly on a third factor. Examination of the item inter-correlation matrix showed that while the four SSCM items were all highly and positively correlated (average correlation = .56), they were negatively and weakly correlated with the CBT-E and MANTRA items (all < .36). Therefore it was decided to remove the SSCM items and re-run the factor analysis with a two-factor solution requested. Fifty-six percent of the variance was explained, with eight items loading strongly (>.5) onto the CBT-E factor (eigenvalue 3.28), and nine items loading (>.5) onto the MANTRA factor (eigenvalue 3.28). Table B1 displays the items, factor loadings, and communalities for the rotated factors. A reliability analysis was used to assess internal consistency, and results revealed an acceptable Cronbach's alpha for the CBT-E (α = 0.89) and MANTRA (α = 0.91) factors. An acceptable Cronbach's alpha 18 was also found for the four-item SSCM subscale (α = 0.76). The SSCM and Non-Specific factor subscale items are listed in Table B2.

Table B1. Items, factor loadings, and communalities for the rotated factors (only two CBT-E items and one row of values, -0.01, 0.85, 0.71, are present here)
CBT-E
Did the therapist ask the client to report specific thoughts or beliefs (e.g., dietary rules, thoughts regarding weight or shape) that the client experienced either in the session or in a situation that occurred prior to the session?
Did the therapist collaboratively develop a formulation with the client, making reference to the over-evaluation of eating, weight and shape, strict dietary restraint, mood intolerance, and self-maintaining feedback loops, which allowed for a shared understanding of the factors maintaining the eating disorder and specific targets for treatment, OR explicitly refer to such a formulation as previously developed?

Table B2. SSCM and Non-Specific Factor subscale items
SSCM
To what extent did the therapist follow the client's lead in generating issues for discussion?
To what extent did the therapist collaboratively generate a list of target symptoms with the client, OR refer back to the target symptom checklist and review functioning in relation to this list?
To what extent did the therapist give specific advice or suggestions regarding eating or other issues?
To what extent did the therapist deal with a problem without use of specific Cognitive Behavioural/Cognitive-Interpersonal/Motivational Interviewing techniques?
Non-specific Factors
Was the therapist empathetic towards the client (i.e., did they convey an intimate understanding of and sensitivity to the client's experiences and feelings)?
Level of Verbal Activity: How much did the therapist talk?
How much did the therapist direct or guide the session in a subtle way?
How much rapport was there between the therapist and client (i.e., how well did the therapist and client get along)?
Did the therapist convey warmth?
How involved (e.g., demonstrating interest, encouraging etc.) was the therapist?
Did the therapist appear to allow silence to continue (or use minimal encouragement such as "uh huh," "mmhmm," "okay") as a means of encouraging the client to talk?
Was the therapist supportive of the client by acknowledging the client's gains during therapy OR by reassuring the client that gains will be forthcoming?
Did the therapist actively attempt to engage the client in working together to explore therapeutic issues?
Summarizing: Did the therapist summarize OR encourage the client to summarize session content of a previous or the current session?
Did the therapist convey that she understood the client's problems and is able to help the client?
How much did the therapist direct or guide the session in an explicit way?
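For readers who wish to check internal consistency on their own SWAN-PRS ratings, Cronbach's alpha for a subscale can be computed from a tapes-by-items table of scores. The sketch below uses the standard formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores). The function name and the example ratings are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for one subscale.

    ratings: list of rows, one per audiotape, each a list of item scores
    (e.g., 1-7 Likert ratings) for the same k items.
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two audiotapes")
    # Variance of each item across tapes, and of the summed subscale score.
    item_variances = [variance([row[i] for row in ratings]) for i in range(k)]
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical ratings for a four-item subscale on five tapes.
    example = [
        [6, 5, 6, 5],
        [2, 3, 2, 2],
        [4, 4, 5, 3],
        [7, 6, 6, 6],
        [1, 2, 1, 2],
    ]
    print(round(cronbach_alpha(example), 2))
```

Values closer to 1 indicate that the items of a subscale vary together across tapes, which is the property summarized by the alpha coefficients reported above.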