The selfBACK artificial intelligence‐based smartphone app can improve low back pain outcome even in patients with high levels of depression or stress

selfBACK provides individually tailored self‐management support for low back pain (LBP) via an artificial intelligence‐based smartphone app. We explore whether those with depressive/stress symptoms can benefit from this technology.


| INTRODUCTION
Low back pain (LBP) is the greatest cause of global disability (Hoy et al., 2014), and is the condition most commonly associated with other chronic conditions (Schäfer et al., 2014). Psychological factors are frequently associated with recurrent episodes of LBP, but the mechanisms are complex (Hartvigsen et al., 2018; Pinheiro et al., 2016; Stubbs et al., 2016). Neurobiological underpinnings, including neurocognitive processing, have been shown to play a role in the connection between chronic pain and poor mood and distress (Rusu et al., 2019), contributing to the complexity of chronic pain and the unique experience lived by each individual (Clauw et al., 2019).
A systematic review, including 13 longitudinal studies, found that depression was associated with increased disability in LBP (Pinheiro et al., 2016). Depression also appears to increase the incidence of future chronic LBP (Currie & Wang, 2005).
Perceived stress has also been associated with LBP development and duration (Hartvigsen et al., 2018; Power et al., 2001). A review of 16 observational studies reported that catastrophizing was associated with delayed recovery from LBP (Wertli et al., 2014). The aligned, but distinct, concept of feeling overwhelmed by long-term nonspecific stressors has also been associated with LBP intensity and pain-related disability (Puschmann et al., 2020).
Self-management strategies are recommended in LBP guidelines, where the patient is encouraged to learn about and manage their condition (Bernstein et al., 2017; Foster et al., 2018). Digital interventions (websites, mobile applications (apps), wearable technology) are seen as engaging and scalable ways to deliver this information (Dennison et al., 2013; Zhao et al., 2016). However, self-management requires motivation and confidence, attributes that may be difficult for those with depressive symptoms or high levels of perceived stress.
The selfBACK digital intervention is an artificial intelligence (AI)-based mobile app designed to generate evidence-based, individually tailored self-management support for non-specific LBP (Mork & Bach, 2018). The selfBACK randomized controlled trial (RCT) found that, as an adjunct to usual care, selfBACK resulted in lower LBP-related disability at 3 months compared to usual care alone, and its benefits were sustained throughout the 9-month follow-up period (Sandal et al., 2021).
In this secondary analysis, we aim to investigate whether depressive symptoms and perceived stress influence outcomes for people with LBP in the selfBACK RCT. The specific research questions were as follows:

Q1 Is there a difference at baseline in LBP-related disability and self-efficacy between those with high versus low levels of depressive or perceived stress symptoms?
Q2 What is the trajectory of LBP-related disability by baseline depressive symptom score or level of perceived stress over a 9-month period?
Q3 Is the effect of the selfBACK app on LBP-related disability, global perceived effect and self-efficacy at 3 and 9 months modified by the level of baseline depressive symptoms or level of perceived stress?
Q4 Are people with high baseline depressive symptoms or perceived stress less likely to be satisfied with the selfBACK app than those without?
Q5 Are people with high baseline depressive symptoms or perceived stress less likely to engage with the selfBACK app than those without?

| METHODS

| Study design, setting and participants
This was a secondary analysis of the randomized, assessor-blinded international multicentre trial of the selfBACK app for patients with non-specific LBP (Trial Registration: NCT03798288. Date of registration: 9 January 2019; https://clinicaltrials.gov/ct2/show/NCT03798288).
We explored the influence of depressive symptoms and perceived stress at baseline on selected outcomes of the main RCT. The full protocol and results of the RCT are published elsewhere (Mork & Bach, 2018; Sandal et al., 2019, 2021). In summary, patients who had attended their primary care provider or an outpatient spine clinic with LBP within the previous 8 weeks were invited to complete a web-based questionnaire. A total of 461 patients from primary care (Denmark and Norway) and an outpatient spine clinic (Denmark) were randomized to usual care (n = 229) or usual care plus selfBACK app (n = 232). LBP-related disability, assessed by the Roland-Morris Disability Questionnaire (RMDQ) (Roland & Fairbank, 2000), was the primary outcome; a score of 6 or above was required to be eligible for the trial. Potential participants were 18 years or older, required email, computer and smartphone access, could speak, read or understand Danish or Norwegian and had no cognitive impairment, learning disability or conditions limiting participation, contraindication to exercise or physical activity, fibromyalgia, pregnancy, previous back surgery or ongoing participation in other LBP management trials.

Significance: We have demonstrated that an app supporting the self-management of LBP is helpful, even in those with higher levels of baseline depression and stress symptoms. selfBACK offers an opportunity to support people with LBP and provides clinicians with an additional tool for their patients, even those with depression or high levels of stress. This highlights the potential for digital health interventions for chronic pain.
Those randomized to the selfBACK intervention received weekly, individually tailored, recommendations for physical activity, strength and flexibility exercises and daily educational messages via the app. User data from symptom progression, step count, exercise completion and questionnaire information fed back to the case-based management system to tailor recommendations based on what had successfully worked in cases with similar characteristics and symptoms (Bach et al., 2016). Usual care meant managing LBP as per the advice or treatment from their care provider.

| Depressive and stress symptom measures
Sociodemographic information, depressive symptoms and perceived stress were collected at baseline. Baseline depressive symptoms were measured by the Patient Health Questionnaire (PHQ-8). The PHQ-8 is a widely accepted and validated instrument for assessing depressive symptoms in epidemiological studies (Arias-de la Torre et al., 2021; Wu et al., 2020). Its eight items relate to the DSM-IV diagnostic criteria for depression over the previous 2-week period. Responses are given on a scale from 0 (not at all) to 3 (nearly every day) and summed to give a total PHQ-8 score of 0-24; higher scores indicate greater depressive symptoms.
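The PHQ-8 scoring rule described above can be sketched in a few lines. This is an illustrative sketch only, not the trial's own scoring code; the function name `phq8_total` and the example responses are hypothetical.

```python
from typing import Sequence


def phq8_total(responses: Sequence[int]) -> int:
    """Sum eight PHQ-8 item responses (each 0-3) into a 0-24 total."""
    if len(responses) != 8:
        raise ValueError("PHQ-8 requires exactly 8 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    return sum(responses)


# Hypothetical respondent, not trial data.
example = [1, 0, 2, 1, 0, 3, 1, 0]
print(phq8_total(example))  # 8
```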
Perceived stress was measured with the 10-item Perceived Stress Scale (PSS), in which participants rated how upsetting, uncontrollable, unpredictable and overwhelming their lives had been during the previous month (Cohen et al., 1983; Cohen & Williamson, 1988; Lee, 2012). The scale was developed for use in community-based samples and not as a diagnostic instrument (Cohen et al., 1983); it has been shown to have good reliability and to correlate with other measures of stress (Cohen & Williamson, 1988). Responses were rated on a 5-point scale from 0 (never) to 4 (very often). The four positively stated items were reverse scored, and the 10 responses were then summed to give a total PSS score of 0-40, with higher scores indicating greater perceived stress (Cohen & Williamson, 1988).
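The reverse-scoring step can be illustrated as follows. This is a minimal sketch, not the authors' code. It assumes that the four positively stated items sit at positions 4, 5, 7 and 8, as in the standard PSS-10; the text above does not list them. The function name and example responses are hypothetical.

```python
from typing import Sequence

# Assumption: 1-based positions of the positively stated items in the
# standard PSS-10. The source text does not say which items are reversed.
REVERSED_ITEMS = frozenset({4, 5, 7, 8})


def pss10_total(responses: Sequence[int]) -> int:
    """Reverse score the positively stated items, then sum to a 0-40 total."""
    if len(responses) != 10:
        raise ValueError("PSS-10 requires exactly 10 item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    total = 0
    for position, response in enumerate(responses, start=1):
        # A reversed item maps 0->4, 1->3, 2->2, 3->1, 4->0.
        total += (4 - response) if position in REVERSED_ITEMS else response
    return total


# Hypothetical respondent, not trial data.
print(pss10_total([2, 1, 3, 3, 2, 1, 4, 3, 2, 1]))  # 14
```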

| Outcomes
RMDQ was measured at baseline, 6 weeks, 3, 6 and 9 months using a web-based questionnaire. Higher scores (0-24) indicated higher LBP-related disability. The mean difference in RMDQ between the intervention and control group at 3 months was the primary outcome for the RCT. Although there is some debate in the literature, a clinically meaningful improvement in RMDQ score was regarded as a decrease of ≥4 points (Ostelo & de Vet, 2005).
Secondary outcomes were (1) the 10-item Pain Self-Efficacy Questionnaire (PSEQ: range 0-60), which has shown strong psychometric properties in LBP populations and asks participants to rate their confidence in taking part in activities despite their pain (Nicholas, 2007), with higher scores indicating greater confidence in the ability to cope despite pain; and (2) the Global Perceived Effect (GPE: range −5 to 5) scale, an estimate of overall improvement in which positive scores indicate LBP improvement and negative scores a deterioration (Kamper et al., 2009). PSEQ was recorded at baseline; both PSEQ and GPE were recorded at 6 weeks, 3, 6 and 9 months.
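As an illustration of the ≥4-point threshold, the sketch below flags whether a participant's RMDQ score fell by at least 4 points between baseline and follow-up. The function name and example scores are hypothetical and are not taken from the trial data.

```python
MCID_POINTS = 4  # decrease regarded as clinically meaningful in this paper


def rmdq_improved(baseline: int, follow_up: int,
                  threshold: int = MCID_POINTS) -> bool:
    """Return True if RMDQ decreased by at least `threshold` points."""
    for score in (baseline, follow_up):
        if not 0 <= score <= 24:
            raise ValueError("RMDQ scores must lie between 0 and 24")
    # Lower RMDQ means less disability, so improvement is a decrease.
    return (baseline - follow_up) >= threshold


print(rmdq_improved(12, 8))  # True: a decrease of exactly 4 points
print(rmdq_improved(12, 9))  # False: a decrease of only 3 points
```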
Satisfaction with the app was measured at a single point at 4 months on a 5-point Likert scale (1 lowest and 5 highest satisfaction) by asking the participants: 'how do you rate the selfBACK app?' For analysis, 'satisfied' participants were those scoring 4 or 5.
App engagement was measured by the number of weekly self-management plans participants created on the app. 'Engaged' participants were defined as those creating 6 or more self-management plans during the first 3 months post-randomization.
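The two binary definitions above (satisfaction rating of 4 or 5; 6 or more self-management plans in the first 3 months) could be coded as below. This is a sketch under those stated definitions; the function names are hypothetical.

```python
def is_satisfied(rating: int) -> bool:
    """'Satisfied' means a rating of 4 or 5 on the 5-point Likert scale."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return rating >= 4


def is_engaged(plans_created: int) -> bool:
    """'Engaged' means 6 or more weekly plans created in the first 3 months."""
    if plans_created < 0:
        raise ValueError("plans_created cannot be negative")
    return plans_created >= 6


print(is_satisfied(4), is_engaged(5))  # True False
```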
PSS respondents were categorized for all analyses into two groups based on the score's clinical categories (Cohen et al., 1983) and the number of trial participants: nil and low stress (0-13 score, n = 191) and moderate and high stress (14-40 score, n = 270).
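The two-group PSS split can be expressed as a simple threshold rule. The sketch below uses the cut-offs stated above (0-13 and 14-40); the function name and category labels are hypothetical.

```python
def pss_category(total_score: int) -> str:
    """Map a PSS total (0-40) to the two analysis groups used in the paper."""
    if not 0 <= total_score <= 40:
        raise ValueError("PSS total must lie between 0 and 40")
    return "nil/low" if total_score <= 13 else "moderate/high"


print(pss_category(13))  # nil/low
print(pss_category(14))  # moderate/high
```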
To address Q1, we described the cohort's baseline RMDQ and PSEQ scores stratified by baseline PHQ-8 or PSS level. The crude and adjusted differences from the no depression or nil/low stress reference categories are presented.
For Q2, we presented the trajectory of RMDQ at baseline, 3 and 9 months and calculated the crude and adjusted mean change in RMDQ from baseline stratified by PHQ-8 or PSS category using a linear mixed model. We also repeated this analysis to consider PHQ-8 and PSS scores as continuous variables. Further, we calculated the relative risk for an improvement of ≥4 points in RMDQ at 3 and 9 months, compared to the no depressive symptoms or nil/low perceived stress groups, using a Poisson generalized estimating equation (GEE) model.
For Q3, we estimated the effect of the intervention using a constrained longitudinal data analysis approach, as described in the primary outcome paper (Sandal et al., 2021), and presented RMDQ, GPE and PSEQ stratified by PHQ-8 or PSS category. This implies using a common baseline category for the intervention and control groups, and modelling group differences at each follow-up time point in a linear mixed model. Measures of effect modification by depression or stress were then obtained from a post-estimation command that calculated the difference between the strata-specific effects at 3 and 9 months with associated p-values. Throughout, statistical significance was defined as p-value <0.05.
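To make the relative risk concrete, the sketch below computes a crude (unadjusted) relative risk with a log-scale 95% confidence interval from simple counts. It is not the Poisson GEE model used in the analysis, which additionally adjusts for covariates and repeated measures. The function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def crude_relative_risk(events_exposed: int, n_exposed: int,
                        events_ref: int, n_ref: int) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) using the standard log-scale Wald interval."""
    risk_exposed = events_exposed / n_exposed
    risk_ref = events_ref / n_ref
    rr = risk_exposed / risk_ref
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_exposed - 1 / n_exposed
                          + 1 / events_ref - 1 / n_ref)
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 40/100 improved in a symptom group, 50/100 in the reference.
rr, lower, upper = crude_relative_risk(40, 100, 50, 100)
print(f"RR = {rr:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```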
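The idea of testing effect modification as a difference between strata-specific effects can be sketched with a simple Wald test. This simplification assumes that the two stratum estimates are independent, whereas a post-estimation command run on the fitted mixed model would also use their covariance. The function name and inputs are hypothetical.

```python
import math
from typing import Tuple


def effect_difference(effect_a: float, se_a: float,
                      effect_b: float, se_b: float) -> Tuple[float, float, float]:
    """Return (difference, SE, two-sided p) for two independent estimates."""
    diff = effect_a - effect_b
    # Variances add when the two estimates are independent (an assumption here).
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, p_value


# Hypothetical intervention effects on RMDQ in two strata.
diff, se_diff, p_value = effect_difference(-1.5, 0.8, -0.5, 0.6)
print(f"difference = {diff:.1f}, SE = {se_diff:.1f}, p = {p_value:.2f}")
```

A p-value below 0.05 from such a comparison would count as evidence of effect modification under the significance threshold stated above.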
Q4 analysed data from the intervention group using logistic regression to estimate the adjusted odds ratio with 95% CI of being 'satisfied' with the app (defined above) in participants with depression or stress, compared to reference groups with no depressive symptoms or nil/low perceived stress.
Q5 was analysed as per Q4 but calculated the adjusted odds ratio with 95% CI of being 'engaged' with the app (defined above).
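For orientation, a crude odds ratio with a 95% confidence interval can be computed from a 2x2 table, as sketched below. This is not the adjusted logistic regression used for Q4 and Q5, and the counts are hypothetical. An odds ratio below 1 means the outcome (being 'satisfied' or 'engaged') was less likely in the symptom group than in the reference group.

```python
import math
from typing import Tuple


def crude_odds_ratio(events_exposed: int, nonevents_exposed: int,
                     events_ref: int, nonevents_ref: int) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) using Woolf's log-scale interval."""
    odds_exposed = events_exposed / nonevents_exposed
    odds_ref = events_ref / nonevents_ref
    odds_ratio = odds_exposed / odds_ref
    # Woolf's standard error of log(OR): square root of summed reciprocal cells.
    se_log_or = math.sqrt(1 / events_exposed + 1 / nonevents_exposed
                          + 1 / events_ref + 1 / nonevents_ref)
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 satisfied / 30 not in a symptom group,
# 40 satisfied / 20 not in the reference group.
or_, lower, upper = crude_odds_ratio(30, 30, 40, 20)
print(f"OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```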
The main trial was powered at 90% to detect a difference in RMDQ of ≥2 points at 3 months' follow-up with a planned sample size of 350 participants (Sandal et al., 2021). As in the main trial, analyses were adjusted for country of recruitment, recruiting clinician, education (<10, 10-12, >12 years), duration of current pain episode (<1, 1-4, 5-12, >12 weeks), average pain intensity in the preceding week (continuous, range 0-10 scale), sex (male, female) and age (years). When analysing the trajectory of the whole cohort (research question 2), we also adjusted for baseline body mass index, work ability (11-point self-rated scale from 0: unable to work to 10: fully able to work [Ahlstrom et al., 2010]) and physical activity level (self-reported time per week performing leisure activities with a revised version of the Saltin-Grimby score, from level 1: sedentary to level 4: regular vigorous activity [Grimby et al., 2015]).
All analyses were performed using Stata version 16.1.

| Research question 1
Analysing the intervention and control groups as a whole cohort, when stratified by baseline PHQ-8 or PSS categories, participants with higher depressive symptoms or higher perceived stress scores reported higher adjusted baseline RMDQ scores (Table 2). Groups with higher levels of depressive or stress symptoms also had less confidence in their ability to cope with pain: adjusted difference in PSEQ between moderate to severe and no depressive symptoms: −8.1 (95% CI: −10.7 to −5.4), and between moderate to high stress and nil to low stress symptoms: −4.6 (95% CI: −6.5 to −2.7). Extending this analysis to consider PHQ-8 and PSS as continuous variables (Table 2) showed that for each unit increase in PHQ-8 and PSS scores, there was an increase of 0.3 (95% CI: 0.2-0.4) and 0.1 (95% CI: 0.1-0.2) of the RMDQ score respectively; and a reduction in the PSEQ score (i.e., poorer pain self-efficacy) of −0.8 (95% CI: −1.0 to −0.5) and −0.4 (95% CI: −0.6 to −0.3) respectively.

| Research question 2
Across the whole cohort, there was an improvement in RMDQ score for those with all levels of baseline depressive symptoms and perceived stress scores at 9 months' follow-up, although all confidence intervals were wide and crossed zero (Tables 3 and 4; Tables S1 and S2).
Level of baseline depression or perceived stress was inversely associated with the probability of a clinically significant improvement in LBP-related disability (RMDQ change ≥4 points) at 3 months. Compared to people with no depressive symptoms, those who reported mild, or moderate-to-severe symptoms had a RR of 0.8 (95% CI: 0.6-0.9) and 0.7 (95% CI: 0.5-1.0) respectively. Compared to those with nil or low stress, those reporting moderate or high stress had a RR of 0.8 (95% CI: 0.7-1.0). These trends were maintained for 9 months.

| Research question 3
Regardless of the level of depressive symptoms or perceived stress, those receiving the selfBACK intervention had better outcomes in RMDQ, GPE and PSEQ than those receiving usual care alone at 3 and 9 months; however, this was not statistically significant (Table 5). The exception was the RMDQ score at 9 months in those with moderate-severe depression randomized to selfBACK, which was similar to the RMDQ score of those with moderate-severe depression randomized to usual care at this timepoint (8.8 vs. 8.7, respectively). The overall trend suggested that those with more severe depressive or perceived stress symptoms in the selfBACK groups had marginally greater improvements in RMDQ, GPE and PSEQ scores at 3 months compared to those in the control groups or those with less severe symptoms. At 9 months, this trend was maintained for GPE in the depressive symptom groups, but reversed in all other outcomes, such that those with less severe symptoms fared better.

| Research question 4
Analysing the intervention group only, participants with mild or moderate-to-severe depressive symptoms were less likely to be satisfied with the app compared to those with no depressive symptoms (adjusted OR 0.5, 95% CI: 0.4-0.7 and OR 0.7, 95% CI: 0.4-1.1 respectively). The same was found for perceived stress, where those with moderate and high stress were less likely to be satisfied with the app compared to those with nil and low perceived stress (adjusted OR 0.5, 95% CI: 0.4-0.7).

| Research question 5
Analysing the intervention group only, those with mild or moderate-to-severe depressive symptoms were less likely to engage with the app compared to a reference group with no depressive symptoms (both groups adjusted OR 0.6, 95% CI: 0.5-0.8). Similarly, those with moderate and high perceived stress were less likely to engage with the app compared to those with nil and low perceived stress (adjusted OR 0.8, 95% CI: 0.6-0.9).

| Summary of findings
People with higher levels of depressive symptoms or perceived stress reported higher levels of LBP-related disability (RMDQ) and lower pain self-efficacy (PSEQ) at baseline. Over 9 months' follow-up, during which the cohort either received usual care or usual care plus selfBACK, LBP-related disability tended to improve. However, the probability of improvement decreased with increasing levels of baseline depressive symptoms or perceived stress.
Those in the intervention group with any level of elevated baseline depressive or stress symptoms tended towards better outcomes in RMDQ, GPE and PSEQ than those receiving usual care alone. There was no evidence that different baseline levels of depressive or perceived stress symptoms were associated with different RMDQ, GPE or PSEQ outcomes. Those with moderate to severe depressive symptoms or high perceived stress at baseline were less likely to be satisfied or engage with the app than those with no depressive symptoms or nil/low perceived stress.

TABLE 2 LBP-related disability (RMDQ) and self-efficacy (PSEQ) at baseline for all participants stratified by baseline depression symptoms (PHQ) and perceived stress level (PSS). a Adjusted for country, recruiting clinician, education, pain duration at baseline, pain intensity at baseline, sex, age, body mass index, work ability score at baseline and physical activity at baseline.

| Comparison with existing literature
To our knowledge, this is the first study to explore how depressive symptoms or perceived stress may influence the acceptability and effectiveness of a digital intervention to promote self-management of LBP. In line with existing literature, participants with higher levels of depressive symptoms and perceived stress report higher levels of LBP-related disability and lower self-efficacy (Bair et al., 2009; Jackson et al., 2014; Puschmann et al., 2020). Different trajectories of non-specific LBP have been noted and attempts have been made to categorize patients into subgroups based on clinical characteristics (Axén & Leboeuf-Yde, 2013). For example, there is a growing literature suggesting an association of depression with greater chronicity (Bair et al., 2009; Currie & Wang, 2005) and sustained higher levels of pain-related disability and functional disability (Andersen et al., 2022), although cause and effect may be debated. Previous research has suggested that psychologically augmented physiotherapy is effective for those with high levels of pain catastrophizing or fear avoidance (Hill et al., 2011). Unfortunately, post-COVID-19 pandemic access to rapid and appropriate physiotherapy is suboptimal in many countries (Equipsme, 2022). Digital health interventions have the potential to promote more rapid access to appropriate self-management support for LBP. Importantly for primary care, where depression, stress and non-specific LBP are highly co-prevalent (Pincus & McCracken, 2013), we have demonstrated that those with severe depressive or stress symptoms can benefit from digital self-management support for LBP.
Our work has also shown that non-specific LBP tends to improve over the medium term (9 months), resonating with work by Axén et al. (2011) that looked at the 6-month course of LBP, but differing somewhat from a previous study by Dunn et al. (2006), which suggested that those with poor psychological status were more likely to experience severe, chronic pain and disability over the course of 12 months. However, these results should be interpreted with caution since psychological symptoms in people with pain may appear qualitatively different from symptoms in other groups (Rusu & Hallner, 2018; Rusu & Pincus, 2017).
As hypothesized, engagement and satisfaction were lower in those with greater levels of depression or stress. Maintaining engagement with app-delivered interventions is particularly challenging in those with underlying mental health problems (Linardon & Fuller-Tyszkiewicz, 2020), but selfBACK's ability to personalize content to each user meets best-practice recommendations (Schubart et al., 2011), suggesting that this may be a useful additional tool for primary care practitioners managing those with LBP.

| Strengths and limitations
The main trial was robustly conducted, with effective randomization, a person-centred intervention and multicentred, international recruitment from a variety of clinical settings (Sandal et al., 2021). Despite the stated risks of loss to follow-up, especially in the groups with higher levels of depressive symptoms or perceived stress, there was little attrition (Sandal et al., 2021). This may reflect the usability and perceived utility of the intervention. However, this was a secondary analysis of a trial not originally powered to detect differences by depressive symptom score or levels of perceived stress. PHQ-8 and PSS were both measured on approximately continuous scales. The variables were categorized to allow for possible non-linear associations and to address effect modification in stratified analyses. The number of categories and their cut-offs were selected as a necessary compromise between clinical relevance and meaningful statistical power. The small numbers limit the precision of the estimates, as reflected by the wide confidence intervals throughout, suggesting that results must be interpreted with caution. To increase statistical power, we also analysed PHQ-8 and PSS scores as continuous variables and observed weak positive associations with RMDQ and weak inverse associations with PSEQ score. Although this suggests a dose-response relationship of PHQ-8 and PSS with both RMDQ and PSEQ, these are exploratory analyses and do not necessarily imply causality. The causal mechanisms underpinning the complex relationships between pain and mood require further elucidation. It may be that people with high levels of depressive symptoms or perceived stress need further support to help with their mental health before tackling self-management of LBP using selfBACK.
There is some debate about the best measure for LBP-related disability (Chiarotto et al., 2016; Kent et al., 2015). The RMDQ has been shown to have adequate measurement properties in a range of domains as well as good test-retest reliability (Jenks et al., 2022; Lauridsen et al., 2006); it is particularly suited to studies in primary care populations (Lauridsen et al., 2006) and is one of the recommended core outcome measures for use in non-specific LBP RCTs (Chiarotto et al., 2018). Measurement error is at random (Jenks et al., 2022) and therefore any bias is likely to result in an underestimate of pain-related disability, so we are confident that the relationships reported in this study are not exaggerated.
Although there were participants with severe depressive symptom scores and high perceived stress scores, it is possible that the very fact they participated in the trial may mean that they represent a more motivated cohort than the general LBP population with depression and stress (Axén & Leboeuf-Yde, 2013). The requirement for access to the internet and a smartphone also potentially limits the findings to those of higher socioeconomic status.

| Implications for practice, policy, education and future research
Given the high prevalence of depressive and stress symptoms, future digital interventions for LBP should record these factors at baseline and follow up and explore the effects, if any, of antidepressant/anxiolytic use. Self-efficacy should also be monitored, to better establish its interplay with LBP and depression/stress (Puschmann et al., 2020). For primary care clinicians, this work suggests that even patients with depressive symptoms or high levels of stress should be considered for digital interventions for LBP as they are still likely to benefit. The study highlights the potential for AI-based apps to tailor self-management support for chronic illness, which may be particularly useful for people with combined physical and mental health problems.

| CONCLUSION
App-based interventions may improve outcomes even in those with high levels of depressive symptoms and perceived stress and could be recommended for this population of patients with LBP.

AUTHOR CONTRIBUTIONS
BIN, FSM, KW, JH, PJM and TILN were involved in the conception of this study and contributed to the study design. Statistical analyses were carried out by TILN. GR drafted the manuscript. All authors interpreted the results and commented on the manuscript before approving the final version.