Do psychological characteristics predict response to exercise and advice for subacute low back pain?


  • Australian Clinical Trials Registry Number 12605000039684.



To determine whether psychological characteristics predict outcome and/or response to physiotherapist-directed exercise- or advice-based treatment of subacute low back pain.


We conducted a secondary analysis of a factorial, placebo-controlled trial (n = 259). The psychological characteristics were catastrophizing, coping, pain self-efficacy, fear of injury/movement, depression, anxiety, and stress. We used mixed models to predict pain and function outcomes (both scored on a 0–10 scale). The models included a term for treatment group, a term for the psychological characteristic (which tested prediction of outcome), and an interaction term between the treatment group and psychological characteristic (which tested treatment effect modification). To aid the interpretation of the magnitude of the effect modification, we calculated the change in outcome for a 1 SD increase of the baseline score of the putative effect modifier. A ≥1.5-point change of the outcome of interest per 1 SD change of putative effect modifier was regarded as clinically important.


All of the psychological characteristics except coping predicted outcome, but none appeared to be important treatment effect modifiers. Only 5 of the 56 tests of treatment modification were statistically significant, and none of the 95% confidence intervals (95% CIs) for the interactions included clinically important effects. For example, a 1 SD higher baseline level of anxiety was associated with a 0.62 (95% CI 0.10, 1.15) additional effect of exercise on function at 52 weeks.


Most of the psychological characteristics we tested predicted outcome, but none predicted response to physiotherapist-guided exercise and/or advice.


Despite the general opinion that acute low back pain is mostly self-limiting, recent studies demonstrate that the prognosis is not as good as is often thought; recovery from pain and disability is slow, and even at 1 year many patients still have ongoing pain and disability (1, 2). Therefore, effective treatments in the acute and subacute stages are needed. However, trials evaluating low back pain treatments typically demonstrate small treatment effects. For example, a recent systematic review of 76 placebo-controlled trials found that only 15% of the trials had effects >20 points for pain reduction where pain was measured on a 0–100 scale (3). In 47% of the trials, effects were <10 points. One potential solution to this problem is to direct treatment to the subgroups of patients who gain the most from that specific intervention rather than applying treatment in a generic fashion (4). A prerequisite to this approach is the identification of clinical features that predict response to treatment, also called effect modifiers.

There are 2 related issues that are sometimes confused in the analysis of clinical trials: predictors of outcome and treatment effect modifiers. Baseline characteristics that predict outcome in both the treated and control groups are said to be predictors of outcome (5, 6). In contrast, treatment effect modifiers are baseline characteristics that predict treatment effects (5). Predictors of outcome and effect modifiers are both useful in clinical practice, but they have different purposes. Effect modifiers can help clinicians select the best treatment for an individual patient, whereas predictors of outcome can be used by a clinician to provide patient-specific information on prognosis. It is important to note that predictors of outcome may not necessarily be effect modifiers.
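The distinction between a predictor of outcome and an effect modifier can be illustrated with a small numerical sketch. The cell means below are invented for illustration and are not trial data; the function names are likewise hypothetical. With a two-level baseline characteristic, the interaction is simply the difference between the treatment effects at the two levels.

```python
# Hypothetical mean improvement in pain (0-10 scale) by treatment arm and by
# baseline level of a psychological characteristic. Values are invented.

def treatment_effect(means, level):
    """Treated-minus-control difference in mean improvement at one baseline level."""
    return means[("treated", level)] - means[("control", level)]


def interaction(means):
    """Difference in treatment effect between high and low baseline levels."""
    return treatment_effect(means, "high") - treatment_effect(means, "low")


# Scenario A: a pure predictor of outcome. High scorers improve less in both
# arms, but the treatment effect is the same (1.0 point) at both levels.
prognostic_only = {
    ("control", "low"): 2.0, ("treated", "low"): 3.0,
    ("control", "high"): 1.0, ("treated", "high"): 2.0,
}

# Scenario B: an effect modifier. The treatment effect is larger for high scorers.
effect_modifier = {
    ("control", "low"): 2.0, ("treated", "low"): 2.5,
    ("control", "high"): 1.0, ("treated", "high"): 3.0,
}

print(interaction(prognostic_only))  # 0.0: predicts outcome, not response
print(interaction(effect_modifier))  # 1.5: treatment effect differs by level
```

In Scenario A the characteristic would be useful for prognosis only; in Scenario B it would also help in choosing treatment.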

A third issue is mediation, providing knowledge regarding the working mechanisms of treatments. This type of analysis requires the collection of putative mediators and outcome measures at least before and after the treatment (6, 7). However, mediation was not within the scope of this study.

The available literature suggests that psychological variables are important predictors of clinical outcome. Numerous studies of patients with a wide variety of chronic pain problems have shown that patients' beliefs about their pain and the strategies they use to cope with their pain are associated with pain intensity and psychosocial functioning (8). In a systematic review of the role of psychosocial factors that included 9 prospective cohorts, Pincus et al showed that distress (stress, anxiety), depressive mood, and somatization at baseline all negatively influenced outcome (9). There was weak support for an association between catastrophizing and outcome. The authors argued there was a need for further clarification of the role of fear-avoidance beliefs, although more robust evidence for the role of fear-avoidance beliefs in chronicity (persisting pain and disability) has been reported recently (10–13). Other studies have identified anxiety (10) and pain self-efficacy as prognostic factors (14, 15). During a recent workshop on priorities in primary care research on low back pain, the aforementioned psychological factors were nominated as potential effect modifiers (16). The importance of assessing these potential effect modifiers was stressed (16).

Until now, only one study, the UK Back Pain Exercise and Manipulation (UK-BEAM) trial, has evaluated effect modification in patients with subacute low back pain (17). The outcome of interest was disability. The authors concluded that a participant's baseline pain, disability, quality of life, and pain beliefs were not associated with the effects of manipulation, exercise, a combination of these treatments, or usual care (17). One potential reason for the negative result is that the authors merged several psychological variables into a new composite variable. If some but not all of the psychological variables predicted response to treatment, this approach might have disguised effect modification.

Two other interesting studies have investigated effect modification. In these studies, the patients were on sick leave for 4–12 weeks and most already had chronic low back pain. In both studies, a brief program using cognitive–behavioral principles was compared with usual care. Both studies used a multivariate regression model and included as many as 13 to 30 putative effect modifiers. Karjalainen et al (18) showed that the brief program was most effective in reducing pain and disability in those who perceived that they had the greatest risk of not recovering. Hagen et al (19) reported that this brief program produced a greater rate of return to work for those who perceived that they had a reduced ability to work, that their work was straining their back, and that their work caused their low back pain. Unfortunately, these studies did not examine whether any of the aforementioned key psychological risk factors (catastrophizing, coping, anxiety, stress, depression, pain self-efficacy, and fear-avoidance beliefs) were effect modifiers.

Apart from these 3 studies, the only information about effect modification comes from studies using subgroup-specific analyses. However, subgroup-specific analyses are statistically inferior to testing effect modification with interaction terms (20). Because only one subgroup is included in a subgroup-specific analysis, the power is far less. Also, because a test is performed within each subgroup, the risk of Type I error is inflated. Consequently, the results of subgroup-specific analyses should be regarded with caution (21). Nevertheless, these studies are often used to search for subgroups based on psychological factors. For example, Jellema et al (22) used a subgroup-specific analysis and found that people with acute and subacute low back pain who had fewer somatizing symptoms, perceived lower risk for chronicity, and reported higher fear-avoidance beliefs showed a favorable outcome when provided a brief cognitive–behavioral treatment. Acute low back pain patients with high levels of fear-avoidance beliefs appeared to benefit more from an educational booklet (23), a cognitive–behavioral-based exercise program (24), or fear-avoidance-based physiotherapy compared with those with low scores (25). However, among patients on sick leave due to low back pain and treated with graded activity, those with high levels of fear-avoidance beliefs returned to work more slowly (26).
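The inflation of Type I error noted above can be made concrete with a short calculation. This is a minimal sketch that assumes the subgroup tests are independent and each is run at a significance level of 0.05; neither assumption comes from the cited studies, and the function name is illustrative.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false-positive result across n_tests
    independent tests, each performed at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# One test per subgroup: the chance of a spurious "significant" subgroup
# grows quickly with the number of subgroups examined.
for n_subgroups in (1, 2, 4, 8):
    print(n_subgroups, round(familywise_error_rate(n_subgroups), 3))
```

A single interaction test, by contrast, keeps the error rate at the nominal level for that comparison.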

Taken together, these studies, most of which relied on subgroup-specific analyses, provide only limited evidence that the key psychological factors modify the effect of any type of treatment in patients with subacute low back pain.

In the current study we evaluated whether psychological variables predict outcome and/or response to treatment by conducting a secondary analysis of a trial that evaluated exercise and/or advice for subacute low back pain (27). The putative psychological predictors were catastrophizing, coping, pain self-efficacy, depression, anxiety, stress, and fear of injury/movement. These variables were selected a priori on the basis of clinical experience and current literature regarding the more general predictors of outcome. The aim of this study was to determine whether these psychological variables separately predicted outcome and/or response to exercise- and/or advice-based treatment in patients with subacute low back pain.

Because both advice and exercise used elements of cognitive–behavioral techniques to encourage patients to resume normal activity levels and change any unhelpful beliefs and avoidance behavior, both treatments were expected to be more effective in patients with less healthy psychological features. We did not expect a strong interaction but, if present, thought it might be found for exercise because those patients were given more specific and structured input on how to upgrade their activity.


This study was a secondary analysis of data of a factorial, randomized, placebo-controlled trial in patients with subacute low back pain conducted in 7 physiotherapy clinics in Australia and New Zealand. The results of the primary analyses have been reported in detail elsewhere (27), so only a brief summary of the patients and methods will be presented here. The psychological predictors of outcome were chosen a priori. The study protocol received ethical approval from the Institutional Review Boards of the University of Sydney and the relevant area health services of each clinic.


We recruited 259 people between the ages of 18 and 80 years with nonspecific low back pain lasting for ≥6 weeks but ≤12 weeks. Recruitment was by direct referral, invitations to participants on hospital waiting lists for physiotherapy treatment of low back pain, and newspaper advertisements. Participants were asked not to use other treatments for low back pain during the 6-week treatment phase.

Randomization and interventions.

After completing the baseline assessment, participants were randomly allocated to 1 of 4 intervention groups: exercise and advice, exercise and sham advice, sham exercise and advice, or sham exercise and sham advice. Participants received 12 physiotherapist-directed exercise or sham exercise sessions and 3 physiotherapist-directed advice or sham advice sessions over 6 weeks. The exercise program included individualized, progressive, submaximal training designed to improve the abilities of participants to complete functional activities that they specified as being difficult to perform due to low back pain. The program consisted of aerobic exercise, stretches, functional activities, activities to build speed, endurance, and coordination, and trunk- and limb-strengthening while using cognitive–behavioral therapy principles (setting goals of progressively increasing difficulty, encouraging self-monitoring of progress and continuation of home program after intervention, and promoting self-reinforcement). Sham exercise consisted of sham pulsed ultrasound and sham pulsed short-wave diathermy. Advice aimed to encourage a graded return to normal activities. The physiotherapist explained the benign nature of low back pain, addressed any unhelpful beliefs about low back pain, and emphasized that being overly careful and avoiding light activity would delay recovery. During sham advice, participants were given the opportunity to talk about their low back pain and any other problem, while the therapist responded in an empathic manner without giving advice about the low back pain.

Data collection.

At baseline, demographic, historic, and social characteristics were assessed, as well as the putative psychological predictors of response to treatment and outcome measures. Data were collected by a blinded assessor.

Putative psychological predictors.

To measure catastrophizing and coping we used the Pain-Related Self-Statements Scale, which assesses situation-specific cognitions that either promote or hinder the individual's attempts to cope with pain. Individuals were asked to rate on a 6-point scale how often they think in such a way when they experience severe pain (0 = almost never, 5 = almost always). There are two 9-item subscales, catastrophizing and coping. A higher total score indicates more frequent catastrophizing or use of adaptive coping statements, respectively (28). The scales have been shown to have good psychometric properties (Cronbach's α = 0.86 and 0.76, respectively) (29).

To measure pain self-efficacy, we used the Pain Self-Efficacy Questionnaire, which is a 10-item inventory that measures both the strength and generality of individuals' beliefs about their ability to accomplish a range of activities despite their pain. Scores range from 0–60, with higher scores indicating stronger self-efficacy beliefs. The psychometric properties are sound (Cronbach's α = 0.92) (15, 29).

To measure fear of injury and movement, we used the Tampa Scale for Kinesiophobia (TSK), which consists of 17 items. Individuals rate the extent to which they agree with statements such as “pain always means that I injured my body” on a 4-point rating scale, where 1 = strongly disagree and 4 = strongly agree. Four items are reverse-scored. The total score ranges from 17–68, with a higher score indicating more fear. The scores are considered reliable and valid in chronic low back pain (Cronbach's α = 0.83) (29, 30).
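The scoring rule described above can be sketched as follows. The text does not state which 4 items are reverse-scored, so the positions used here (items 4, 8, 12, and 16) are an assumption and are exposed as a parameter; the function name is illustrative.

```python
# Assumed 1-based positions of the four reverse-scored items. The text does
# not identify them, so treat this default as a placeholder.
ASSUMED_REVERSED_ITEMS = (4, 8, 12, 16)


def score_tsk(responses, reversed_items=ASSUMED_REVERSED_ITEMS):
    """Total score for 17 items rated 1 (strongly disagree) to 4 (strongly agree).

    Reverse-scored items contribute 5 - rating, so the total stays in 17-68
    and a higher total always indicates more fear.
    """
    if len(responses) != 17:
        raise ValueError("expected 17 item responses")
    total = 0
    for position, rating in enumerate(responses, start=1):
        if rating not in (1, 2, 3, 4):
            raise ValueError(f"item {position}: rating must be 1-4")
        total += (5 - rating) if position in reversed_items else rating
    return total


# Highest possible fear: agree strongly with every regular item and
# disagree strongly with every reverse-scored item.
highest = [1 if i in ASSUMED_REVERSED_ITEMS else 4 for i in range(1, 18)]
print(score_tsk(highest))  # 68
```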

The 21-item Depression Anxiety and Stress Scales (DASS-21) consist of 3 subscales, depression, anxiety, and stress. Each subscale includes 7 statements about negative emotional symptoms. Subjects rated the extent to which they experienced each symptom over the past week on a 4-point severity/frequency scale. Scores of each subscale were multiplied by 2, yielding a maximum score of 42 per subscale (31). The depression scale measures dysphoric mood, including inertia and hopelessness, and has no somatic items. Its validity with chronic pain patients has been reported to be high (32). The DASS-21 has strong psychometric properties (for depression Cronbach's α = 0.96; for anxiety α = 0.89; and for stress α = 0.95) (29, 31).
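A minimal sketch of the subscale scoring described above. It assumes each of the 7 items is coded 0–3, which is the coding implied by a 4-point scale and a doubled maximum of 42; the function name is illustrative.

```python
def score_dass21_subscale(item_ratings):
    """Score one 7-item subscale (depression, anxiety, or stress).

    Items are assumed to be coded 0-3. The item sum is multiplied by 2,
    giving a subscale range of 0-42.
    """
    if len(item_ratings) != 7:
        raise ValueError("each subscale has 7 items")
    if any(rating not in (0, 1, 2, 3) for rating in item_ratings):
        raise ValueError("ratings must be 0-3")
    return 2 * sum(item_ratings)


print(score_dass21_subscale([3] * 7))                # 42, the maximum
print(score_dass21_subscale([0, 1, 0, 2, 0, 0, 1]))  # 8
```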

Outcome measures.

In the primary analysis of the parent clinical trial, 3 primary outcome measures were assessed at 6 and 52 weeks after randomization: pain intensity, function, and global perceived effect. For this analysis we used the 2 most commonly used and most clinically meaningful outcome measures in low back pain research, pain and function (33).

The average pain experienced over the past week was rated by each individual on an 11-point scale, where 0 = no pain at all and 10 = the worst pain possible (34).

To measure function, the Patient-Specific Functional Scale was used. At baseline, each individual selected 3 activities that he/she was unable to perform or had difficulty with as a result of his/her low back pain. The ability to perform these activities was rated on a 0–10 scale (where 0 = cannot perform the activity and 10 = can perform the activity at a preinjury level) and the overall mean was used (35).

Statistical analysis.

In our data set we included, a priori, 7 putative predictors. The dependent variables used in the analyses were first tested for normality. Descriptive statistics were calculated for demographic and self-report measures using the appropriate statistical procedures.

In the primary analysis, a linear mixed model in which time was defined as a repeated factor was used to estimate effects of exercise, advice, and exercise plus advice compared with placebo treatment for the 2 primary outcomes. Analyses were conducted by using all measurements (including baseline measurements) as outcome variables. Random intercepts accounted for correlation over time within participants and clinics. We constructed models with dummy variables representing the exercise and advice interventions, each time point, the intervention-by-time interaction, and potential confounders (current pain medication use, being a current smoker, current exercise activity, low back pain treatment in the previous 6 weeks, and previous surgery for low back pain). Next, each of the 7 putative predictors was entered separately into the model to determine whether the variable was a statistically significant predictor of outcome (that is, whether the predictor had a main effect regardless of the treatment provided). Subsequently, we evaluated the interaction between treatment allocation and each potential predictor (i.e., effect modification) separately at 6 and 52 weeks, using the same linear mixed model. We separately assessed the interactions with exercise and advice but not with both treatments combined.
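One possible way to write the model described above is sketched here. The notation is ours and is illustrative only: the trial report does not give an equation, and the exact parameterization used (for example, which additional time interactions were included) is not stated. Here i indexes participants, c clinics, and t assessment time points; D_t is an indicator for time point t with baseline as reference; E_i and A_i are the exercise and advice indicators; x_i is the vector of confounders; and M_i is the baseline score of one putative predictor.

```latex
% Sketch only: symbols are illustrative and not taken from the trial report.
\begin{aligned}
y_{ict} ={}& \beta_0 + \sum_{t} \tau_t D_t
            + \beta_E E_i + \beta_A A_i
            + \sum_{t} \left( \delta^{E}_{t} E_i D_t + \delta^{A}_{t} A_i D_t \right) \\
          & + \boldsymbol{\gamma}^{\top} \mathbf{x}_i
            + \theta M_i
            + \sum_{t \in \{6,\,52\}} \phi_t \, E_i M_i D_t
            + u_c + v_i + \varepsilon_{ict}
\end{aligned}
```

In this sketch, u_c and v_i are the random intercepts for clinic and participant, θ corresponds to prediction of outcome, and φ_t corresponds to modification of the exercise effect at 6 or 52 weeks. The analogous model for advice would replace E_i with A_i in the final interaction term.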

To aid the interpretation of the magnitude of effect modification, we calculated the change in effect of intervention associated with a 1 SD increase of the baseline score of each putative effect modifier. Based on the American Pain Society/American College of Physicians clinical practice guideline, which defines a mean difference between treatment groups of 1–2 points as moderate, effect modification associated with a 1 SD change of ≥1.5 points was regarded as clinically relevant (36). As all putative effect modifiers were chosen a priori, no correction for multiple comparisons was performed.
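The rescaling and the threshold rule can be sketched in a few lines. The per-unit coefficient and interval below are hypothetical values, back-calculated so that they roughly reproduce the anxiety example reported in the abstract (0.62; 95% CI 0.10, 1.15; SD 7.1); they are not estimates taken from the trial. Function names are illustrative.

```python
CLINICALLY_IMPORTANT = 1.5  # points on the 0-10 outcome scale per 1 SD


def per_sd(value_per_unit, baseline_sd):
    """Rescale a per-unit interaction coefficient (or CI bound) to a 1 SD change."""
    return value_per_unit * baseline_sd


def ci_includes_important_effect(lower, upper, threshold=CLINICALLY_IMPORTANT):
    """True if the interval contains any value of magnitude >= threshold."""
    return upper >= threshold or lower <= -threshold


# Hypothetical per-unit values for anxiety (baseline SD 7.1).
anxiety_sd = 7.1
estimate = per_sd(0.087, anxiety_sd)
lower = per_sd(0.014, anxiety_sd)
upper = per_sd(0.162, anxiety_sd)

print(round(estimate, 2), round(lower, 2), round(upper, 2))
print(ci_includes_important_effect(lower, upper))  # False
```

Under this rule an interaction can be statistically significant (interval excludes 0) and still be judged not clinically important (interval stays inside ±1.5).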

Where more than 1 variable was shown to be a significant effect modifier, all significant effect modifiers were entered in a multivariate analysis to evaluate the independent effect modification.

Linear mixed models analyses were conducted using the xtmixed command in Stata, version 9.0 (StataCorp, College Station, TX). Estimates were obtained using maximum likelihood or, when the maximum likelihood would not converge, with restricted maximum likelihood.


Participant recruitment, followup, and baseline characteristics.

Of the 259 participants, 231 (89%) attended the 6-week followup, 236 (91%) attended the 12-week followup, and 231 (89%) attended the 52-week followup. The baseline characteristics of all participants and of the participants per treatment are reported in Table 1. The groups were similar at baseline.

Table 1. Baseline characteristics of the study population*

Characteristic | Exercise + advice (n = 63) | Sham exercise + advice (n = 63) | Exercise + sham advice (n = 65) | Sham exercise + sham advice (n = 68)
Age, years | 50.1 ± 15.4 | 51.2 ± 16.1 | 48.0 ± 16.1 | 50.0 ± 15.6
Female sex, % | 46 | 44 | 46 | 54
Working prior to LBP, % | 61 | 54 | 58 | 53
Working now, % | 56 | 49 | 52 | 47
Smoker, % | 17 | 14 | 12 | 21
Currently undertakes regular exercise, % | 52 | 59 | 62 | 46
Roland Morris disability questionnaire† | 9.0 ± 4.7 | 8.2 ± 4.4 | 8.3 ± 5.0 | 8.1 ± 5.6
History of LBP | | | |
 Previous episodes of LBP, % | 71 | 69 | 60 | 65
 Previous sick leave for LBP, % | 20 | 18 | 16 | 21
 Previous surgery for LBP, % | 0 | 6 | 3 | 0
Duration of current episode of LBP, % | | | |
 6–8 weeks | 48 | 51 | 45 | 47
 9–11 weeks | 34 | 41 | 38 | 37
 >11 weeks | 18 | 8 | 17 | 16
 Pain referred to the leg | 29 | 38 | 31 | 29
 Other pain areas than back or leg | 28 | 30 | 26 | 19
Taking painkillers, % | 37 | 33 | 35 | 38
Predictors/effect modifiers at baseline | | | |
 TSK | 39.0 ± 7.9 | 38.9 ± 7.8 | 39.5 ± 8.6 | 38.1 ± 8.2
 Pain self-efficacy‡ | 44.4 ± 12.8 | 46.3 ± 11.0 | 44.3 ± 11.3 | 43.7 ± 13.4
 PRSS coping§ | 30.4 ± 6.8 | 30.1 ± 8.4 | 30.2 ± 7.3 | 30.5 ± 6.3
 PRSS catastrophizing§ | 17.3 ± 9.1 | 18.0 ± 10.5 | 17.9 ± 8.6 | 18.0 ± 7.9
 DASS-21 depression¶ | 7.0 ± 8.8 | 7.5 ± 7.7 | 7.1 ± 7.9 | 7.1 ± 7.6
 DASS-21 anxiety¶ | 4.7 ± 6.7 | 5.2 ± 7.4 | 6.2 ± 7.6 | 5.4 ± 6.9
 DASS-21 stress¶ | 10.1 ± 9.0 | 11.6 ± 8.5 | 12.7 ± 9.1 | 11.7 ± 10.0
Outcome measures at baseline | | | |
 Pain# | 5.4 ± 2.2 | 5.6 ± 2.0 | 5.4 ± 1.9 | 5.3 ± 1.8
 Patient-Specific Functional Scale** | 3.8 ± 1.9 | 3.9 ± 1.9 | 3.8 ± 2.1 | 4.0 ± 1.8

  • * Values are the mean ± SD unless otherwise indicated. LBP = low back pain; TSK = Tampa Scale for Kinesiophobia (range 17 [low fear of movement] to 68 [high fear of movement]); PRSS = Pain-Related Self-Statements Scale; DASS-21 = 21-item Depression Anxiety and Stress Scales.
  • † Range 0 (no disability) to 24 (high disability).
  • ‡ Range 0 (low self-efficacy) to 60 (high self-efficacy).
  • § Coping range 0 (poor coping strategies) to 45 (strong coping strategies); catastrophizing range 0 (low catastrophizing) to 45 (high catastrophizing).
  • ¶ Depression range 0 (no depression) to 42 (high depression); anxiety range 0 (no anxiety) to 42 (high anxiety); stress range 0 (no stress) to 42 (high stress).
  • # Range 0 (no pain) to 10 (worst pain possible).
  • ** Range 0 (unable to perform activity) to 10 (able to perform activity at preinjury level).

Prediction of outcome.

In univariate analyses, all psychological variables except coping were significantly associated with outcome (Table 2). However, the mean effects were not large. For example, a 1 SD higher score in depression was associated with 0.36 points (95% confidence interval [95% CI] 0.17, 0.55) less reduction in pain and 0.32 points (95% CI 0.13, 0.51) less improvement in function.

Table 2. Prediction of pain and function outcomes associated with a 1 SD increase of a putative predictor*

Predictor of outcome (range) | SD baseline | Change in pain (95% CI)† | P | Change in PSFS (95% CI)‡ | P
PSEQ (0–60)§ | 12.2 | −0.49 (−0.69, −0.28) | < 0.001 | 0.42 (0.22, 0.61) | < 0.001
TSK (17–68) | 8.1 | 0.27 (0.08, 0.47) | 0.005 | −0.36 (−0.54, −0.18) | < 0.001
PRSS catastrophizing (0–45) | 9.0 | 0.54 (0.35, 0.73) | < 0.001 | −0.31 (−0.50, −0.12) | 0.001
PRSS coping (0–45) | 7.2 | 0.08 (−0.11, 0.27) | 0.409 | −0.003 (−0.19, 0.19) | 0.967
DASS-21 depression (0–42) | 8.0 | 0.36 (0.17, 0.55) | < 0.001 | −0.32 (−0.51, −0.13) | 0.001
DASS-21 anxiety (0–42) | 7.1 | 0.37 (0.18, 0.56) | < 0.001 | −0.30 (−0.48, −0.12) | 0.001
DASS-21 stress (0–42) | 9.2 | 0.39 (0.20, 0.58) | < 0.001 | −0.32 (−0.50, −0.13) | 0.001

  • * 95% CI = 95% confidence interval; PSFS = Patient-Specific Functional Scale; PSEQ = Pain Self-Efficacy Questionnaire; TSK = Tampa Scale for Kinesiophobia; PRSS = Pain-Related Self-Statements Scale; DASS-21 = 21-item Depression Anxiety and Stress Scales.
  • † Positive change in score indicates an increase in pain, and negative change in score indicates less pain.
  • ‡ Positive change indicates better functioning, and negative change indicates deterioration.
  • § Higher score indicates that these patients had a higher level of self-efficacy and experienced more reduction of pain and better function than those with lower scores.

Prediction of response to treatment (effect modification).

The results for effect modification of exercise treatment are presented in Table 3. Of the 28 interactions, only 2 were statistically significant. A 1 SD higher score in depression (range 0–42, SD 8) was associated with 0.59 points (95% CI 0.03, 1.16; P = 0.02) greater effect of exercise on pain at 52 weeks. A 1 SD higher score of anxiety (range 0–42, SD 7.1) was associated with 0.62 points (95% CI 0.10, 1.15; P = 0.041) greater effect of exercise on function at 52 weeks.

Table 3. Response to exercise for pain and function (Patient-Specific Functional Scale) of a 1 SD increase of a putative predictor*

Predictor of response to exercise (range) | Change in pain (95% CI)†, 6 weeks | Change in pain (95% CI)†, 52 weeks | Change in function (95% CI)‡, 6 weeks | Change in function (95% CI)‡, 52 weeks
PSEQ (0–60)§ | −0.09 (−0.65, 0.47) | 0.21 (−0.34, 0.76) | −0.22 (−0.77, 0.32) | −0.08 (−0.61, 0.46)
TSK (17–68) | 0.17 (−0.39, 0.73) | −0.28 (−0.84, 0.28) | 0.17 (−0.36, 0.71) | −0.06 (−0.60, 0.48)
PRSS catastrophizing (0–45) | −0.26 (−0.80, 0.28) | −0.07 (−0.62, 0.48) | 0.19 (−0.34, 0.72) | −0.26 (−0.79, 0.27)
PRSS coping (0–45) | −0.45 (−1.00, 0.10) | −0.45 (−1.01, 0.10) | 0.07 (−0.47, 0.61) | 0.25 (−0.29, 0.79)
DASS-21 depression (0–42) | −0.04 (−0.59, 0.51) | −0.59 (−1.16, −0.03) | 0.52 (−0.24, 1.05) | 0.34 (−0.20, 0.89)
DASS-21 anxiety (0–42) | −0.04 (−0.58, 0.49) | −0.41 (−0.95, 0.14) | 0.26 (−0.26, 0.78) | 0.62 (0.10, 1.15)
DASS-21 stress (0–42) | 0.02 (−0.52, 0.57) | −0.13 (−0.70, 0.43) | 0.26 (−0.33, 0.74) | 0.07 (−0.47, 0.62)

  • * See Table 2 for definitions.
  • † Positive change in score indicates an increase in pain, and negative change in score indicates less pain.
  • ‡ Positive change indicates better functioning, and negative change indicates deterioration.
  • § Restricted maximum likelihood was used.

The results for modification of the effect of advice treatment are provided in Table 4. Only 3 of the 28 comparisons were statistically significant. A 1 SD higher score at baseline on the TSK (range 17–68, SD 8.1) was associated with 0.66 points (95% CI 0.10, 1.21; P = 0.021) less reduction of pain and 0.65 points less improvement in function (95% CI 0.12, 1.19; P = 0.016) at 6 weeks. At 52 weeks, only coping was a significant predictor of the response to advice: a 1 SD higher score for coping at baseline (range 0–45, SD 7.2) was associated with 0.68 points greater reduction in pain (95% CI 0.12, 1.24; P = 0.017).

Table 4. Response to advice for pain and function (Patient-Specific Functional Scale) of a 1 SD increase of a putative predictor*

Predictor of response to advice (range) | Change in pain (95% CI)†, 6 weeks | Change in pain (95% CI)†, 52 weeks | Change in function (95% CI)‡, 6 weeks | Change in function (95% CI)‡, 52 weeks
PSEQ (0–60)§ | −0.39 (−0.95, 0.17) | −0.01 (−0.57, 0.55) | 0.20 (−0.35, 0.75) | 0.03 (−0.52, 0.57)
TSK (17–68) | 0.66 (0.10, 1.21) | 0.02 (−0.54, 0.58) | −0.65 (−1.19, −0.12) | −0.02 (−0.56, 0.51)
PRSS catastrophizing (0–45) | 0.11 (−0.43, 0.66) | −0.24 (−0.80, 0.31) | −0.28 (−0.81, 0.26) | 0.14 (−0.40, 0.68)
PRSS coping (0–45) | 0.16 (−0.40, 0.71) | 0.68 (0.12, 1.24) | −0.36 (−0.90, 0.18) | −0.16 (−0.70, 0.38)
DASS-21 depression (0–42) | 0.08 (−0.47, 0.63) | −0.05 (−0.62, 0.51) | −0.46 (−0.99, 0.08) | 0.25 (−0.30, 0.79)
DASS-21 anxiety (0–42) | 0.01 (−0.53, 0.54) | 0.22 (−0.32, 0.77) | −0.15 (−0.67, 0.37) | 0.19 (−0.34, 0.72)
DASS-21 stress (0–42) | 0.26 (−0.30, 0.82) | 0.11 (−0.46, 0.68) | −0.34 (−0.89, 0.20) | −0.00 (−0.56, 0.55)

  • * See Table 2 for definitions.
  • † Positive change in score indicates an increase in pain, and negative change in score indicates less pain.
  • ‡ Positive change indicates better functioning, and negative change indicates deterioration.
  • § Restricted maximum likelihood was used.

There were no significant interactions between the other putative effect modifiers and treatment allocation at any time point for either pain or function. None of the point estimates or even the upper 95% CIs of the interactions between treatment allocation and the other nonsignificant predictors included clinically important effects. Because in all of the analyses no more than 1 effect modifier at a time was found to be significant, we did not perform any multivariate analyses.
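The statements about the interaction intervals can be checked directly against the published tables. The sketch below uses the 95% CI bounds transcribed from Tables 3 and 4 (4 intervals per predictor: pain at 6 and 52 weeks, then function at 6 and 52 weeks). The variable names and the decision rule (an interval "includes a clinically important effect" if either bound reaches 1.5 points in magnitude) are our reading of the text, not code from the trial.

```python
THRESHOLD = 1.5  # prespecified clinically important change per 1 SD

# (lower, upper) bounds: pain 6 wk, pain 52 wk, function 6 wk, function 52 wk
EXERCISE = {
    "PSEQ":           [(-0.65, 0.47), (-0.34, 0.76), (-0.77, 0.32), (-0.61, 0.46)],
    "TSK":            [(-0.39, 0.73), (-0.84, 0.28), (-0.36, 0.71), (-0.60, 0.48)],
    "Catastrophizing": [(-0.80, 0.28), (-0.62, 0.48), (-0.34, 0.72), (-0.79, 0.27)],
    "Coping":         [(-1.00, 0.10), (-1.01, 0.10), (-0.47, 0.61), (-0.29, 0.79)],
    "Depression":     [(-0.59, 0.51), (-1.16, -0.03), (-0.24, 1.05), (-0.20, 0.89)],
    "Anxiety":        [(-0.58, 0.49), (-0.95, 0.14), (-0.26, 0.78), (0.10, 1.15)],
    "Stress":         [(-0.52, 0.57), (-0.70, 0.43), (-0.33, 0.74), (-0.47, 0.62)],
}
ADVICE = {
    "PSEQ":           [(-0.95, 0.17), (-0.57, 0.55), (-0.35, 0.75), (-0.52, 0.57)],
    "TSK":            [(0.10, 1.21), (-0.54, 0.58), (-1.19, -0.12), (-0.56, 0.51)],
    "Catastrophizing": [(-0.43, 0.66), (-0.80, 0.31), (-0.81, 0.26), (-0.40, 0.68)],
    "Coping":         [(-0.40, 0.71), (0.12, 1.24), (-0.90, 0.18), (-0.70, 0.38)],
    "Depression":     [(-0.47, 0.63), (-0.62, 0.51), (-0.99, 0.08), (-0.30, 0.79)],
    "Anxiety":        [(-0.53, 0.54), (-0.32, 0.77), (-0.67, 0.37), (-0.34, 0.72)],
    "Stress":         [(-0.30, 0.82), (-0.46, 0.68), (-0.89, 0.20), (-0.56, 0.55)],
}

intervals = [ci for table in (EXERCISE, ADVICE) for row in table.values() for ci in row]

# An interval that excludes zero corresponds to a statistically significant interaction.
n_significant = sum(1 for lo, hi in intervals if lo > 0 or hi < 0)
# Largest magnitude reached by any confidence bound.
largest_bound = max(max(abs(lo), abs(hi)) for lo, hi in intervals)
# Intervals that reach the clinically important threshold in either direction.
n_important = sum(1 for lo, hi in intervals if hi >= THRESHOLD or lo <= -THRESHOLD)

print(len(intervals), n_significant, largest_bound, n_important)
```

With the tabulated values, this yields 56 intervals, 5 of which exclude zero, and none of which reaches 1.5 points in either direction.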


In this reanalysis of a previously published trial (27), we found that although all psychological factors except coping were overall predictors of outcome, none were clinically important predictors of the effects of exercise or advice for people with subacute low back pain. Although we tested several putative psychological effect modifiers for physiotherapy-guided exercise or advice, only 5 of the 56 interactions between these psychological variables and the treatment allocation were statistically significant. Importantly, none of the 95% CIs for the interactions included clinically important effects (≥1.5 points improvement compared with baseline).

Our study had a number of strengths. The parent trial was registered and of high methodologic quality, and the psychological predictors and analysis plan were prespecified. We also evaluated effect modification using the preferred approach of including an interaction term of treatment and each putative effect modifier in the statistical model. One potential limitation of our study was that the parent trial was not specifically designed to be sufficiently powered to test for effect modification (although it was powered to detect an interaction between the 2 intervention factors). Consequently the negative result may be a Type II error (20). We think this is not of concern because none of the upper estimates of the 95% CIs included our prespecified clinically important effect.

Although we used the same mixed-model longitudinal analysis as was used in the primary analysis and added the interaction terms of interest, this is still a secondary analysis and our findings should be interpreted with caution. Due to the multiple testing (56 comparisons in addition to the primary analyses) and the expectation that as many as 5% of tests will have false-positive findings (20), our few statistically significant findings are most likely attributable to chance. If we had performed a Bonferroni correction and set the critical P value at 0.01, none of the 56 interactions would have been statistically significant.
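The multiplicity argument can be illustrated with the 5 P values reported in the Results. This is a sketch only: the labels are ours, and the strict Bonferroni threshold of 0.05/56 is shown alongside the critical value of 0.01 used in the text for comparison.

```python
N_TESTS = 56
ALPHA = 0.05

# P values reported in the text for the statistically significant interactions.
reported_p = {
    "depression x exercise, pain, 52 weeks": 0.02,
    "anxiety x exercise, function, 52 weeks": 0.041,
    "TSK x advice, pain, 6 weeks": 0.021,
    "TSK x advice, function, 6 weeks": 0.016,
    "coping x advice, pain, 52 weeks": 0.017,
}

# Number of false positives expected by chance if no interaction is real.
expected_false_positives = N_TESTS * ALPHA

# Critical value used in the text, and the strict Bonferroni value for 56 tests.
critical_in_text = 0.01
strict_bonferroni = ALPHA / N_TESTS

surviving_text = [k for k, p in reported_p.items() if p < critical_in_text]
surviving_strict = [k for k, p in reported_p.items() if p < strict_bonferroni]

print(round(expected_false_positives, 1))  # about 2.8 expected by chance
print(round(strict_bonferroni, 5))
print(surviving_text, surviving_strict)    # both lists are empty
```

Under either threshold none of the 5 interactions remains statistically significant, and the observed count of 5 is of the same order as the roughly 3 false positives expected by chance.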

The cutoff score for clinically important effect modification is somewhat arbitrary and might look high. However, the magnitude presented was the mean interaction effect multiplied by the SD of the putative effect modifier at baseline and as such was rather small. Even dividing the cutoff score in half would mean that none of the 95% CIs included a clinically important interaction effect. As a result, we conclude that we cannot use these psychological variables to select patients with subacute low back pain for physiotherapy-guided exercise or advice.

Why did we find so few significant interactions between treatment allocation and the putative psychological variables? It might be that there is no effect modification at all. In this respect our results concur with the results of the UK-BEAM study. Although the UK-BEAM study used a composite factor consisting of fear-avoidance beliefs, back beliefs, distress, and depression, making it unclear which psychological construct they were actually measuring, they found no effect modification in patients with subacute low back pain treated with manipulation, exercise, or both compared with usual care (17). Unfortunately, the trial report did not provide data on the magnitude and 95% CIs of the nonsignificant interactions between this composite psychological factor and each active treatment, making a comparison with our results impossible.

The findings of the current study have possible implications for clinicians and policymakers. If a predictor of good outcome were highly accurate, it could be argued that people scoring high on this predictor need only reassurance and no further treatment. However, because the mean effects were not large, we cannot advise using these predictors in this way.

Despite the fact that many researchers are attempting to identify subgroups of patients with subacute low back pain who are likely to gain the most from different interventions, there is little substantive evidence that such subgroups exist at all. Most studies that claim to have identified subgroups have reported on subgroup-specific findings that should be considered exploratory in nature (21). The best way to identify subgroups is by examining interaction effects in randomized controlled trials that prespecify a small number of putative effect modifiers in trial registers. However, as far as we know, such studies are not available.

Our study identified only a few predefined psychological characteristics of patients that predict greater benefits from advice or therapist-guided exercise for subacute low back pain. However, the effects were small, not clinically relevant, and could plausibly have been due to chance. Overall it can be concluded that these psychological characteristics do not predict response to physiotherapist-guided exercise or advice for patients with subacute low back pain who report low distress and low pain-related disability.

At this stage it would be premature to dismiss the potential for treatment effect modification. It may be that psychological constructs different from the ones we measured predict response to exercise- or advice-based treatment, or that the constructs we measured predict response to other treatments for subacute low back pain. Equally, as the study population appeared to be rather psychologically healthy compared with a chronic pain population (29), it may be that a higher threshold level of these constructs is necessary for their effects to be apparent. We recommend that further exploration of effect modification be conducted. Given the required power to test for effect modification, there is a pressing need to analyze a limited number of effect modifiers, based on clear a priori hypotheses.


All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be submitted for publication. Dr. Smeets had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Smeets, Maher, Nicholas, Refshauge, Herbert.

Acquisition of data. Maher, Nicholas, Refshauge, Herbert.

Analysis and interpretation of data. Smeets, Maher, Nicholas, Herbert.