Sustainability of cognitive behavioural interventions for chronic back pain: A long‐term follow‐up

There is a significant research gap with respect to the long‐term sustainability of psychological treatment effects in chronic pain patients. This study aimed to investigate long‐term treatment effects of two psychological treatments: cognitive behavioural therapy (CBT) as a broad‐spectrum approach and exposure as a specific intervention for fear‐avoidant pain patients.


| INTRODUCTION
Chronic pain is inherently persistent and has low recovery rates. A general population study with a 4-year follow-up investigated the prevalence and distribution of chronic pain in more than 1600 patients. The results showed that 79% of those with chronic pain at baseline still had it at the follow-up, and the overall prevalence of the pain further increased (Elliott et al., 2002).
Among possible treatments, cognitive behavioural therapy (CBT) has the most established evidence base for reducing pain-related disability, emotional distress and pain intensity (Hofmann et al., 2012; Williams et al., 2020). CBT has a problem-solving orientation and helps patients learn adaptive strategies to better cope with their pain. Long-term maintenance, which means that newly learned behaviours can be transferred to novel situations and will be sustained over time, is assumed to depend on self-management and self-efficacy (Turk, 2003).
Fear-avoidant patients with pain who received in vivo exposure, a tailored CBT-based intervention, showed short-term (post-treatment) and medium-term (6-month follow-up) improvements in functioning, pain intensity, fear avoidance and catastrophising (Glombiewski et al., 2018; Leeuw et al., 2008).
There is initial evidence that specific treatment effects are stable across different situations (Riecke et al., 2020). A question of critical importance that remains is the stability of treatment gains over longer time periods (Borsook et al., 2019; Turner, 2013).
The absence of long-term follow-up data in CBT interventions is evident in Williams et al.'s (2020) review, with a follow-up survey conducted in fewer than half of the studies and with most studies being of only moderate to low quality. Furthermore, time periods showed a maximum length of 6-12 months after treatment. Effect sizes for CBT (disability and pain intensity) at follow-up were only small or very small and, compared with an active control group, no evidence of benefit was found. Studies with longer follow-up periods have mainly investigated interdisciplinary rehabilitation programmes based on psychological models in inpatient settings (Bergström et al., 2010; Boonstra et al., 2021; Groot et al., 2019; Morley et al., 2008; Volker et al., 2017). Furthermore, Vowles et al. (2011) performed a 3-year follow-up of an interdisciplinary acceptance and commitment therapy. Monodisciplinary studies have mainly focused on different kinds of psychologically informed physiotherapy (Emilson et al., 2017; Rasmussen-Barr et al., 2009; Vibe Fersum et al., 2019). Two recent randomized controlled trials (RCTs) examined the long-term effects of internet-delivered CBT (Murray et al., 2020) and acceptance and commitment therapy (Rickardsson et al., 2021) interventions, demonstrating that improvements in disability could be maintained up to 1 year after treatment. To our knowledge, no RCT has yet investigated the long-term effectiveness of monodisciplinary, purely psychological CBT approaches.
To address this research gap, in the current study we investigated the long-term treatment gains of CBT-based interventions in persons with chronic low back pain (CLBP) in an outpatient psychological setting.
We extend the results of our previous RCT (Glombiewski et al., 2018) by providing long-term data for primary and secondary outcomes up to 8 years after treatment. We also examined outcomes of treatment dropouts, investigated reliable and clinically significant changes and performed responder analyses.

| Participants and settings
The present study is a long-term follow-up of Glombiewski et al.'s (2018) RCT on the effectiveness of in vivo exposure in individuals suffering from CLBP with elevated levels of fear and avoidance. In the original study, the authors compared in vivo exposure as a specific intervention with CBT representing a broad-spectrum approach. Furthermore, we were interested in dose effects and compared a short (10 sessions) and a long version (15 sessions) of exposure treatment. Participants were assessed across three time points: pre-treatment, post-treatment and at a 6-month follow-up (6MFU). Further information about the design and conduct of the main study is provided in the study protocol (Riecke et al., 2013) and is consistent with CONSORT guidelines (see Figure 1).
Overall, we found medium to large effect sizes in reducing pain-related disability, pain intensity and emotional distress from pre- to post-treatment and from pre-treatment to 6MFU both for exposure and CBT. These results provided promising support for the short- and medium-term benefits of two CBT-based psychological treatments in fear-avoidant back pain sufferers.
In the present study, we wanted to extend these results with long-term data up to 8 years after treatment. We were primarily interested in general treatment effects rather than specific treatment or dose effects. Our former results showed short-term benefits for the short version of exposure treatment, which declined with time. Accordingly, we assumed that dose effects continue to lose their influence after even longer follow-up periods, and we merged the short and long versions. We further conducted t tests for all primary outcomes at the long-term follow-up, which revealed no differences between the short and long exposures.
We contacted all 67 participants who had completed treatment as well as all 21 participants who dropped out of treatment. The inclusion criteria for the RCT were disabling CLBP and elevated levels of fear avoidance (for more details, see Riecke et al., 2013).
The exclusion criteria were acute back pain, back surgeries during the past 6 months or planned surgeries (red flags). The included participants mostly met the criteria for the new diagnosis of chronic primary pain according to the International Classification of Diseases (World Health Organization; see also Nicholas et al., 2019). Our participants were moderately disabled, with a mean pain duration of 15 years. Fear avoidance scores were also moderate (mean Tampa Scale of Kinesiophobia score = 40). More than half of the sample either received a disability pension or were on sick leave. Most of the participants also presented depressive symptoms, which were not the primary diagnosis. Because in Germany psychotherapy for chronic pain patients is still not very common, we mainly informed primary care units about our programme, and they referred their patients to us, or patients learned about the study via local newspapers. In Germany, chronic pain patients are mainly treated in inpatient settings, either rehabilitation or interdisciplinary pain management programmes. Accordingly, more than 60% of our study sample was already receiving at least one form of rehabilitation. The study was approved by the local ethics committee of the psychology department of the University of Marburg (2019-12k). The Clinical Trials Identifier of the original RCT is NCT01484418. The long-term follow-up was conducted between April and June 2019.

| Treatments
The two interventions, in vivo exposure (EXP) and CBT, are fully described in our study protocol and so are described only briefly here. The EXP condition consisted of two versions: a short version (10 sessions) and a long version (15 sessions). The four exposure elements were (a) psychoeducational information about chronic pain, (b) an individualized fear avoidance model, (c) a fear hierarchy and (d) in vivo exposure (short: 5 sessions, long: 10 sessions). The CBT group consisted of 15 weekly sessions.
The intervention comprised educational information about chronic pain and cognitive, relaxation (progressive muscle relaxation) and behavioural (graded activity) techniques.
The PDI is a sensitive and responsive questionnaire; the standard error of the mean was 1.2 (Beemster et al., 2018; Tait et al., 1990). To provide a more behaviour-specific measure of disability, we also administered the QBPDS, which assesses 20 basic daily activities rated on a 6-point Likert scale indicating the level of difficulty of each activity (0 = not difficult at all, 5 = unable to do). The German version has strong psychometric properties (Riecke et al., 2016). The scale is able to detect relatively small changes over time (Kopec et al., 1996).
As recommended by the Initiative on Methods, Measurement and Pain Assessment in Clinical Trials (IMMPACT), we assessed pain intensity using an 11-point numeric rating scale that covers average pain intensity during the previous 4 weeks (Dworkin et al., 2005). The standard error of measurement was 1.02, and 2 points is recommended as the minimum detectable and clinically important change score (Childs et al., 2005; Ostelo et al., 2008).

| Secondary outcome measures
We assessed pain catastrophizing with the Pain Catastrophizing Scale (PCS; Sullivan et al., 1995), a 13-item scale that measures three dimensions of catastrophizing (helplessness, magnification and rumination). Answers are scored on a 5-point rating scale (0 = not at all, 4 = all the time). The German version has demonstrated good psychometric properties (Meyer et al., 2008).
For the assessment of pain-related anxiety, we used the Pain Anxiety Symptom Scale (McCracken et al., 1992; Quint, 2007). The scale consists of 20 items intended to measure four components of pain-related anxiety: (a) fearful appraisal of pain, (b) cognitive anxiety, (c) physiological anxiety and (d) escape and avoidance behaviour.
Respondents are asked to rate each item on a 6-point Likert scale that ranges from 0 (never) to 5 (always). The scale has very good validity and reliability (Roelofs et al., 2004).
Psychological flexibility and avoidance were measured using the Psychological Inflexibility in Pain Scale (Barke et al., 2015; Wicksell et al., 2010). This scale consists of 12 items rated on a 7-point scale that ranges from 1 (never true) to 7 (always true). The German version has good psychometric properties (Barke et al., 2015).
We also examined coping strategies with the well-established German self-rating questionnaire FESV-BW (Fragebogen zur Erfassung von Schmerzverarbeitung; Geissner, 1999) because teaching active coping strategies is a core element of CBT treatment. We used the behavioural pain coping scale of the questionnaire. This consists of three subscales: (a) Mental Distraction (e.g. 'When I feel pain, I distract myself by listening to pleasant music'), (b) Counter-Activities (e.g. 'When I feel pain, I conceal them by just continuing with my work') and (c) Relaxation Techniques (e.g. 'When I feel pain, I apply a relaxation technique, e.g. autogenic training, progressive muscle relaxation'). Answers are specified on a 6-point Likert scale that ranges from 1 (not at all) to 6 (absolutely). Homogeneity of the scales (Cronbach's alpha ranged from 0.71 to 0.81) and test-retest reliability were good.
We assessed depression and anxiety with the Hospital Anxiety and Depression Scale (Hermann et al., 1995; Zigmond & Snaith, 1983). This 14-item instrument measures depressive and anxiety symptoms in the past week on a 4-point scale with item-specific response categories.

| Statistical analyses
All statistical analyses were performed using IBM SPSS (Version 27). We used descriptive statistics to characterize patients in both treatment groups (CBT and EXP) at long-term follow-up. We first performed a series of t tests and, for categorical variables, chi-squared tests to examine potential baseline differences between participants who provided long-term follow-up data and those who did not. To investigate possible differences between dropouts (participants who dropped out of treatment) and completers, we calculated a series of t tests for all primary outcomes at all four time points.

| Effect sizes
We calculated effect sizes within groups for all primary and secondary outcomes by dividing the mean difference between two measurements by the standard deviation of change. We used bias-corrected Hedges's g values instead of uncorrected Cohen's d values with degrees of freedom for paired samples.
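The calculation described above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: it assumes the common small-sample correction factor J = 1 - 3 / (4 df - 1) with df = n - 1 for paired samples, and the function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def hedges_g_paired(first, second):
    """Within-group effect size for two paired measurements.

    Mean change divided by the standard deviation of change, multiplied by
    the (assumed) small-sample correction J = 1 - 3 / (4 * df - 1), df = n - 1.
    """
    if len(first) != len(second) or len(first) < 3:
        raise ValueError("need two paired lists with at least 3 values each")
    diffs = [a - b for a, b in zip(first, second)]
    sd_change = stdev(diffs)
    if sd_change == 0:
        raise ValueError("standard deviation of change is zero")
    d = mean(diffs) / sd_change          # uncorrected standardized mean change
    df = len(diffs) - 1                  # degrees of freedom for paired samples
    correction = 1 - 3 / (4 * df - 1)    # shrinks d slightly for small samples
    return d * correction


# Hypothetical disability scores for six patients (not study data).
pre = [30, 28, 35, 40, 25, 33]
follow_up = [20, 22, 30, 28, 21, 24]
g = hedges_g_paired(pre, follow_up)
print(f"Hedges's g = {g:.2f}")
```

Because the correction factor is always below 1, the corrected g is somewhat smaller in magnitude than the uncorrected value, and the difference shrinks as the sample grows.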

| Clinically meaningful change
Estimates of clinically important change, which are complementary to statistical significance, evaluate individual change and treatment response in terms of clinically meaningful change (Jacobson & Truax, 1991). The Jacobson and Truax (JT) method yields a change score that is based on intra-individual pre- and post-treatment comparisons (McGlinchey et al., 2002). Clinically significant or meaningful change is defined as a return to normal functioning combined with a statistically reliable improvement as a result of treatment (Jacobson et al., 1999). Jacobson and colleagues proposed three mathematical criteria for demonstrating that participants had moved from the dysfunctional to the functional range during the course of treatment. The proposed cut-off points were (a) the participant's score moved at least 2 standard deviations (SDs) away from the mean of the dysfunctional population; (b) the participant's score fell within 2 SDs of the mean for the normal population; and (c) the participant's post-test score moved closer to the mean of the functional population than to the mean of the dysfunctional population (Jacobson et al., 1999). Criterion (c) was a weighted midpoint between the means of a functional and dysfunctional distribution (McGlinchey et al., 2002).
Second, we calculated the reliable change index to ensure that changes are not due to measurement error (Christensen & Mendoza, 1986). On the basis of these two criteria, participants were classified into improved (passed both cut-offs), deteriorated (passed cut-offs in a worsening direction) or unchanged.
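As a rough illustration of the reliable change index, the sketch below uses the standard Jacobson and Truax formulation, in which the change score is divided by the standard error of the difference, S_diff = sqrt(2 * SEM^2), and an absolute value of at least 1.96 is treated as reliable. The function names, the 1.96 threshold and the example scores are assumptions for illustration; only the standard error of 1.02 for the pain rating scale is taken from the text.

```python
import math


def reliable_change_index(score_before, score_after, sem):
    """Change score divided by the standard error of the difference."""
    s_diff = math.sqrt(2 * sem ** 2)
    return (score_after - score_before) / s_diff


def label_change(rc, lower_is_better=True, critical=1.96):
    """Label a reliable change index; 1.96 is the assumed two-sided 5% threshold."""
    if abs(rc) < critical:
        return "no reliable change"
    improved = rc < 0 if lower_is_better else rc > 0
    return "reliably improved" if improved else "reliably worsened"


# Hypothetical pain intensity ratings (0-10), using the reported SEM of 1.02.
SEM_PAIN = 1.02
for before, after in [(7, 4), (6, 5), (4, 7)]:
    rc = reliable_change_index(before, after, SEM_PAIN)
    print(before, after, round(rc, 2), label_change(rc))
```

This sketch covers only the reliability criterion; whether a reliable change also counts as clinically meaningful depends on the separate cut-off criteria.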
As no normative data were available for the QBPDS, we used a change criterion of ≥14.20 for clinically and statistically meaningful change in movement-related disability (QBPDS) based on the population mean (M = 45.6, SD = 15.66) of the original validation study by Kopec et al. (1996). The clinical change criterion (a) for functional disability (PDI) was 10.51 (based on the dysfunctional population mean, M = 33.69, SD = 11.59) and (c) 20.13 (based on both the clinical distribution and normative data; M = 6.8, SD = 11.4). For the numerical pain intensity scale we followed the recommended change score of more than 2 points (or 20%) on the numerical rating scale (Childs et al., 2005; Ostelo et al., 2008).
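The PDI cut-offs reported above can be reproduced with the standard Jacobson and Truax formulas. The sketch below assumes that lower scores indicate better functioning, so criterion (a) lies 2 SDs below the dysfunctional mean, and that criterion (c) is the SD-weighted midpoint between the two population means; the function names are hypothetical.

```python
def cutoff_a(mean_dysfunctional, sd_dysfunctional):
    """Criterion (a): 2 SDs below the dysfunctional mean (lower = better)."""
    return mean_dysfunctional - 2 * sd_dysfunctional


def cutoff_c(mean_dysfunctional, sd_dysfunctional, mean_functional, sd_functional):
    """Criterion (c): midpoint between both means, weighted by the SDs."""
    return (sd_functional * mean_dysfunctional + sd_dysfunctional * mean_functional) / (
        sd_functional + sd_dysfunctional
    )


# PDI values as reported in the text.
pdi_a = cutoff_a(33.69, 11.59)
pdi_c = cutoff_c(33.69, 11.59, 6.8, 11.4)
print(f"PDI criterion (a): {pdi_a:.2f}")
print(f"PDI criterion (c): {pdi_c:.2f}")
```

With the reported means and SDs, both formulas return the cut-offs stated for the PDI (10.51 and 20.13).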

| Multilevel analyses
For each outcome, we fitted a longitudinal multilevel model with measurement occasion (Level 1) nested within patients (Level 2). Therapy version (EXP vs. CBT; Level 2) and time of measurement (pre-treatment, post-treatment, 6MFU and long-term follow-up; Level 1) were included as dummy-coded predictors. Regarding the therapy version, we coded exposure therapy (0) as the reference therapy that was contrasted with CBT (1). With respect to time, we specified three coding variables, with the long-term follow-up as the reference category that was contrasted with pre-treatment (1 in dummy time 1), post-treatment (1 in dummy time 2) and 6MFU (1 in dummy time 3). All models also included all cross-level interactions to model differences between exposure therapy and CBT changes across time.
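The coding scheme described above can be made concrete with a short sketch. It covers only the fixed part of the model (no random effects and no estimation), and the variable names and coefficient values are hypothetical; they are chosen solely to show how each coefficient would be read under this coding.

```python
TIMES = ("pre", "post", "6MFU", "LTFU")   # LTFU = long-term follow-up (reference)
THERAPIES = ("EXP", "CBT")                # EXP is the reference therapy


def code_row(therapy, time):
    """Return the dummy codes and cross-level interaction terms for one observation."""
    if therapy not in THERAPIES or time not in TIMES:
        raise ValueError("unknown therapy or time point")
    row = {
        "cbt": int(therapy == "CBT"),
        "time1": int(time == "pre"),    # pre-treatment vs. long-term follow-up
        "time2": int(time == "post"),   # post-treatment vs. long-term follow-up
        "time3": int(time == "6MFU"),   # 6MFU vs. long-term follow-up
    }
    for name in ("time1", "time2", "time3"):
        row["cbt_x_" + name] = row["cbt"] * row[name]
    return row


def fixed_part(coefs, therapy, time):
    """Predicted mean from the fixed effects only."""
    row = code_row(therapy, time)
    return coefs["intercept"] + sum(coefs[name] * value for name, value in row.items())


# Hypothetical coefficients (not study results).
coefs = {
    "intercept": 20.0,   # EXP mean at long-term follow-up
    "cbt": -2.0,         # CBT minus EXP at long-term follow-up
    "time1": 10.0,       # EXP: pre-treatment minus long-term follow-up
    "time2": 1.0,        # EXP: post-treatment minus long-term follow-up
    "time3": 0.5,        # EXP: 6MFU minus long-term follow-up
    "cbt_x_time1": 3.0,  # extra pre-treatment difference in CBT
    "cbt_x_time2": 0.0,
    "cbt_x_time3": 0.0,
}
print(fixed_part(coefs, "EXP", "LTFU"))
print(fixed_part(coefs, "CBT", "pre"))
```

Under this coding, a positive slope for dummy time 1 means that scores were higher before treatment than at the long-term follow-up in the exposure group, and the interaction terms indicate how much that difference deviates in the CBT group.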

| RESULTS
In total, 64 participants completed the questionnaires, yielding an overall response rate of 73%. We included 48 out of 67 completers (71%) and 16 out of 21 dropouts (76%) who filled out only the primary outcome measures (see Figure 1).

| characteristics
The demographic characteristics of both groups are listed in Table 1. Time from treatment end to long-term follow-up ranged from 5 to 8 years (Mdn = 6). After treatment termination, 8% of the participants indicated that they used pain medication, and 29% said they were receiving physiotherapy. Chi-squared tests for categorical variables (e.g. sex, occupational status, further treatments) and t tests for continuous measures (e.g. age, pain duration, time from pre-treatment to long-term follow-up) revealed no significant differences.
3.2 | Baseline differences (pre-treatment group differences)
We calculated chi-squared and t tests for all relevant baseline measures to ascertain whether there are differences between the participants who completed the long-term follow-up and those who did not. The groups did not differ in regard to any pre-treatment clinical characteristic. An overview of means, standard deviations and effect sizes of all outcome variables over time is provided in Table 2. In regard to demographic variables, we found no differences except for age, with older participants in the group of long-term follow-up completers.
3.3 | Outcomes from pre-treatment to long-term follow-up
Findings for the fitted multilevel regression models are presented in Table 3 for primary outcomes and secondary outcomes. Most important, the effects for exposure showed that all primary and secondary outcomes had significantly better values at long-term follow-up than before treatment. Slopes for dummy time 1 showed that values significantly increased from the reference time (long-term follow-up) to pre-treatment, with the exception of the inverse-coded secondary outcome, FESV-BW. Furthermore, the comparison of post-treatment and long-term follow-up revealed no significant changes, meaning that treatment effects stayed stable. Only the PCS scores showed a significant increase because participants showed the lowest scores in catastrophizing directly after treatment (post-treatment). In addition, the results comparing outcomes at 6MFU and long-term follow-up further support the stability of treatment effects given that they revealed no significant changes. Only for disability (measured with the QBPDS) was a significant increase of disability over time found. Rates of catastrophizing decreased from 6MFU to long-term follow-up.
In addition, there were significant cross-level interactions. For movement-related disability (QBPDS), differences between time points are in favour of the CBT condition, which had lower disability levels compared with the EXP group. Figure 2 shows the course over time for all primary outcomes. Furthermore, regarding PCS, the increase from the reference time (long-term follow-up) to pre-treatment is significantly stronger for CBT than for EXP. Finally, for the FESV-BW, differences between the long-term follow-up and post-test, as well as between the long-term follow-up and the 6MFU, were significantly worse for the CBT group than for the EXP group. Nevertheless, values for participants in the CBT group were still better compared with the EXP group.

| Clinically meaningful and reliable change
An overview of the proportions of reliable and clinically important improvement for both treatment conditions is given in Table 4. The overall response rate was 77%, defined as clinical and reliable change in at least one out of three outcomes. Both the EXP and the CBT groups showed meaningful rates of reliable and clinically meaningful change at the long-term follow-up for all primary outcomes. The mean proportions of cases meeting reliable and clinically improved criteria (in all primary outcomes) were 44% (range: 30%-59%) in the EXP group and 59% (range: 38%-in the group. A chi-squared analysis revealed that these differences between groups were not significant. Clinical and reliable deterioration rates were 7% for the EXP group (range: 4%-11%) and 5% for the CBT group.

| Dropout evaluation
A series of t tests revealed that dropouts (n = 16) showed no differences in primary outcomes from participants who completed treatment (n = 48) at baseline, post-treatment and the 6MFU, except for at the long-term follow-up. At this measure point, significant differences were found for pain intensity, t(62) = 3.84, p < 0.001; movement-related disability (QBPDS), t(62) = 3.76, p < 0.001; and functional disability (PDI), t(62) = 2.63, p = 0.011, with higher mean scores for the dropout group, suggesting that dropouts' pain worsened over time. These results were complemented by chi-squared analyses that compared improvement and deterioration rates between dropouts and completers. Significant results emerged for improvement rates from pre-treatment to the long-term follow-up in QBPDS (χ² = 7.31, df = 2, p = 0.026) and pain intensity (χ² = 7.5, df = 2, p = 0.024). Significant differences in deterioration rates were noted in regard to pain intensity and QBPDS both for the difference between post-treatment and long-term follow-up and between the 6MFU and the long-term follow-up. Overall, the results revealed that dropouts showed a more clinical and reliable decline over time compared with participants who completed treatment.

| Identification of potential predictors of treatment responses
None of the outcomes measured at baseline (pre-treatment) predicted the long-term outcomes. Furthermore, we regressed the long-term follow-up values of the primary outcomes onto their difference at pre-test minus post-test, separately for exposure and CBT. Regarding pain intensity, the slope of the regression was neither significant for EXP, b = 0.25, SE = 0.22, t( 25

| DISCUSSION
The primary purpose of the present analyses was to determine long-term outcomes of two CBT-based treatments (CBT and in vivo exposure) for patients suffering from CLBP. In an outpatient setting, we investigated whether treatment gains remained stable after a long-term follow-up of up to 8 years after treatment. First and foremost, the results indicated that participants were still better than before treatment in all primary and secondary outcome parameters. Furthermore, short-term treatment gains (post-treatment to long-term follow-up) of all primary outcomes (disability and pain intensity) seemed to remain stable over time. Results for long-term treatment effects (6MFU to long-term follow-up) also indicated stability for pain intensity and functional disability (measured with the PDI). Group differences in movement-related disability (measured with QBPDS) emerged for the period between the 6MFU and long-term follow-up, with an increase in the EXP group. For secondary outcomes (emotional distress, psychological flexibility, catastrophizing and pain-related fear), both approaches suggested stability over time.
Although our results indicate sustained treatment effects up to 8 years after treatment, improvements in movement-related disability were not maintained in participants who underwent exposure treatment. One explanation for this finding might be that the exposure treatment comprised both a short and a long version. In our main study we found that short exposure (10 sessions) was superior to long exposure and CBT directly after treatment, but at the 6MFU these superior effects had disappeared (Glombiewski et al., 2018). It might be possible that 10 weeks of treatment are too short to create a sustainable treatment effect over time. Accordingly, it would be interesting to investigate whether booster sessions could maintain these short-term treatment gains. Furthermore, exposure seems to be effective over time in reducing pain-related fear given that fear levels remained stable over time. Because chronic pain conditions are inherently complex, there are more issues other than fear avoidance, such as coping with pain (Turk, 2003), stress reduction skills and emotion regulation (Boersma et al., 2019), that may be important for sustained treatment effects. Our CBT intervention, representing a broad-spectrum approach, showed maintained treatment effects in all primary outcomes. These results are in line with a single case study that found delayed treatment effects for a CBT intervention (Schemer et al., 2018). Moreover, the results for catastrophizing suggest a superiority for CBT, as improvements from pre-treatment to the long-term follow-up were greater in the CBT group. A further CBT-specific effect was found for coping behaviours. Whereas in the exposure group baseline values stayed almost the same, there was a significant improvement in the CBT group. There was a decrease at the long-term follow-up, but coping levels were still higher compared with baseline. This can be explained by the distinct focus of our CBT intervention on the development of different coping mechanisms (e.g. cognitive restructuring, attention shifting, relaxation techniques, graded activity) in the face of pain. Thus, our CBT intervention was successful in improving pain self-management given that patients were able to use different coping strategies after treatment, which might explain the promising long-term effects of reduced pain-related disability.
On the one hand, these results confirm specific treatment effects that Glombiewski et al. (2018) reported, and on the other hand, they indicate that specific treatment effects can be maintained over long time periods. Overall, the response rate of 77% with clinical and reliable change in at least one out of three outcomes was quite high. Between 40% and 68% of the participants experienced an improvement, whereas 4%-13% showed deterioration. For such a long time period, change rates are still high, also when compared with previous studies that found improvement rates between 30% and 40% with shorter time intervals (Nicholas et al., 2017; Vowles et al., 2011). Dropout analyses might further support the stability of treatment effects, as dropouts showed significantly higher deterioration rates over time and differed significantly from both treatment groups at the long-term follow-up.
The results of this study should be interpreted with a consideration of its limitations. Like most longitudinal evaluations, this study was subject to a loss of participants over the follow-up. Our sample size was small but still sufficient for accurate estimations in longitudinal multilevel regression analysis. Studies that have used simulation showed that a sample size of 10 is enough if fixed effects are of interest, as was the case in our study (Maas & Hox, 2004). A main limitation of our study is the lack of a control group, which would allow sound conclusions with regard to functioning over time in the absence of treatment. Admittedly, our sample of dropouts cannot replace a regular control group, but it provided some interesting data showing that, without appropriate treatment, pain intensity and pain-related disability might worsen over time. Data from longitudinal population-based studies have shown that chronic pain persists over time and that most individuals are stable in their pain level (Elliott et al., 2002; Glette et al., 2020). Landmark et al. (2018) found that the probability of recovery from chronic pain was about 8% during the first year and further declined over time; thus, it might be possible that our dropouts differed in their symptomatology. One can speculate that they had more comorbidities, which might have influenced them in dropping out and worsening over time. An alternative explanation might be that the intervention type did not match the person's preferences and characteristics, thus leading to earlier termination and insufficient results. One characteristic might be less social support given that lower levels of social support seem to characterize a specific subgroup of patients with pain (Bergström et al., 2001). Furthermore, the generalisability of our study results may be limited by a possible selection bias due to dropouts. A selective sample might have included only those persons who were satisfied and motivated. Dropout rates for exposure treatment were already higher in our original RCT. Reasons were a lower treatment alliance at pre-treatment and avoidance. Still, we were able to also include 76% of the former dropouts, and the overall response rate was acceptable and satisfied the cut-off of 70% proposed by Peat et al. (2001).
It can be discussed whether our interventions, which comprised 15 sessions over 4 months, were powerful enough to result in long-lasting effects. During CBT-based treatment, patients learn to change profound cognitive and behavioural patterns, which should positively influence their everyday quality of life over the long term. Changing the interpretation of pain so that it is not a sign of harm, and adapting behaviour accordingly, is a substantial change, which can influence a person's life. Pain is not expected to be eliminated, but the treatment is designed to help patients to better cope with their pain and to live a goal-oriented, active life. Further goals might be reduced use of pain medication and a return to work or household activities (Turk, 2003). Thus, one shortcoming of the present study is the narrow assessment of long-term changes, with only self-rating questionnaires being used. Future studies might include qualitative methods and assess in more detail return to work, hospitalization rates and medication use, to better display these long-term changes and processes of adaptation. Prospective studies should also consider evaluating minimal important change (MIC) with a more patient-focused approach such as anchor-based methods.
The main strengths of this study are that it is one of very few RCTs to include a long-term follow-up over several years and that it also considers the participants who dropped out of treatment. Overall, this study provides preliminary evidence that the effects of psychological treatments are stable over several years. Thus, these treatments might offer a promising and sustainable long-term perspective for patients with persistent back pain. Treatment gains for exposure as stand-alone treatment seem to be of shorter duration than those of a general CBT intervention. Booster sessions or health apps for aftercare might be an interesting option for future studies. Furthermore, a combination of exposure and additional CBT elements, such as goal setting, emotion regulation or stress management, might be beneficial and particularly sustainable.
TABLE 3 Clinical meaningful change (from pre-treatment to long-term follow-up).
TABLE 4 Abbreviations: PDI, Pain Disability Index; QBPDS, Quebec Back Pain Disability Scale. a Dysfunctional population mean. b Dysfunctional mean + functional mean.