Appendix 1. Previous plain language summary
Dance therapy for schizophrenia
Schizophrenia can be a long-term disabling illness. The most common treatments for people with this condition are medication (antipsychotics) and talking therapies, especially cognitive behavioural therapy and family therapy. These treatments work well for people with 'positive' symptoms (hearing voices and other alterations of the senses) and delusions (distortions in the way the world is seen). However, people experiencing 'negative' symptoms (such as flattening of mood, poverty of speech, lack of drive, loss of feeling, social withdrawal and decreased spontaneous movement) do not respond as well.
Dance therapy (also called dance movement therapy) uses dance and movement to explore a person’s emotions in a non-verbal way. The therapist helps the individual to interpret their movement as a link to personal feelings. This review aims to assess how successful this therapy is as a treatment for schizophrenia when compared to standard care or other interventions.
Six studies were identified, but five were excluded because they contained no reliable data, because they evaluated a therapy other than dance, or because they were not properly randomised. The included study compared 10 weeks of group dance therapy plus standard care with group supportive counselling plus standard care for the same length of time. It was a community-based project involving 45 people, and both groups were followed up after four months.
Of the outcomes measured (mental state, satisfaction with care, leaving the study early, quality of life and adverse effects), the majority showed no difference between the two groups. However, when negative symptoms were specifically measured after 10 weeks of treatment, there was a significant improvement in the mental state of the dance therapy group. At the four-month follow-up more than 40% of the participants had been lost from both groups, making it impossible to draw any valid conclusions from the outcomes measured.
Overall, because of the relatively small number of people, the data from this trial were inconclusive. However, a larger randomised trial measuring outcomes such as relapse, admission to hospital, quality of life, leaving the study early, cost of care and satisfaction with treatment would help to clarify whether dance therapy is an effective treatment for schizophrenia, especially for negative symptoms that do not respond so well to medication and talking therapies.
(Plain language summary prepared for this review by Janey Antoniou of RETHINK, UK, www.rethink.org)
Appendix 2. Previous search strategy
1. Cochrane Schizophrenia Group Trials Register (July 2007)
We searched the Cochrane Schizophrenia Group Trials Register (July 2007) using the phrase:
[(* danc* in title, abstract, index terms of REFERENCE) or (danc* in interventions of STUDY)]
This register is compiled by systematic searches of major databases, hand searches and conference proceedings (see Group Module).
Appendix 3. Previous data collection and analyses
JX and TG independently extracted data from included studies. JX carried out a separate re-extraction of data to ensure reliability. Again, when disputes arose, we attempted to resolve these by discussion and where further clarification was needed we contacted the authors of trials to provide us with the missing data. While waiting for further information, trials were added to the list of those awaiting assessment.
Where possible, we entered data in such a way that the area to the left of the line of no effect indicated a favourable outcome for dance therapy.
Data were extracted onto standard, simple forms.
3. Scale-derived data
We included continuous data from rating scales only if the measuring instrument had been described in a peer-reviewed journal (Marshall 2000) and was either a self-report measure or was completed by an independent rater or relative (not the therapist).
Assessment of risk of bias in included studies
Again working independently, JX and TG assessed risk of bias using the tool described in The Cochrane Collaboration Handbook (Higgins 2005). This tool encourages consideration of how the sequence was generated, how allocation was concealed, the integrity of blinding at outcome, the completeness of outcome data, selective reporting and other biases. We would not have included studies where sequence generation was at high risk of bias or where allocation was clearly not concealed.
If disputes arose as to which category a trial should be allocated, resolution was again made by discussion between the authors.
Measures of treatment effect
1. Binary data
For binary outcomes we calculated a standard estimation of the fixed-effect risk ratio (RR) and its 95% confidence interval (CI). For statistically significant results we calculated the number needed to treat/harm (NNT/NNH) and its 95% CI using Visual Rx, taking account of the event rate in the control group.
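As a hedged illustration only (the review itself used Visual Rx, not this code), the fixed-effect RR, its 95% CI and the NNT for a single two-arm trial can be sketched as follows; the event counts are hypothetical:

```python
import math

def risk_ratio_ci(events_tx, n_tx, events_ctl, n_ctl):
    """Risk ratio with a 95% CI via the usual log-normal approximation."""
    rr = (events_tx / n_tx) / (events_ctl / n_ctl)
    # Standard error of ln(RR) for two independent proportions
    se = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctl - 1 / n_ctl)
    half = 1.96 * se
    return rr, math.exp(math.log(rr) - half), math.exp(math.log(rr) + half)

def nnt(rr, control_event_rate):
    """NNT from the RR and the control event rate:
    absolute risk reduction = CER * (1 - RR), NNT = 1 / ARR."""
    return 1 / (control_event_rate * (1 - rr))

rr, lo, hi = risk_ratio_ci(10, 50, 20, 50)       # hypothetical counts
print(round(rr, 2), round(nnt(rr, 20 / 50), 1))  # → 0.5 5.0
```

Taking the event rate in the control group into account matters because the same RR implies very different NNTs at different baseline risks.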
2. Continuous data
2.1 Summary statistic
For continuous outcomes we estimated a fixed-effect weighted mean difference (WMD) between groups. We did not calculate effect size measures.
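A minimal sketch of the inverse-variance fixed-effect WMD described above, with hypothetical summary statistics (for a single study the pooled estimate reduces to the raw mean difference):

```python
import math

def wmd_fixed(studies):
    """Inverse-variance fixed-effect weighted mean difference.
    Each study is (mean_tx, sd_tx, n_tx, mean_ctl, sd_ctl, n_ctl)."""
    num = den = 0.0
    for m1, s1, n1, m2, s2, n2 in studies:
        var = s1 ** 2 / n1 + s2 ** 2 / n2  # variance of the mean difference
        weight = 1.0 / var                 # fixed-effect weight
        num += weight * (m1 - m2)
        den += weight
    pooled = num / den
    se = math.sqrt(1.0 / den)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se

# One hypothetical study: treatment mean 10 (SD 2, n 25) vs control 12 (SD 2, n 25)
print(round(wmd_fixed([(10, 2, 25, 12, 2, 25)])[0], 2))  # → -2.0
```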
2.2 Endpoint versus change data
We preferred to use scale endpoint data, which typically cannot have negative values and are easier to interpret from a clinical point of view. Change data are often not ordinal and are very problematic to interpret. If endpoint data were unavailable, we used change data.
2.3 Skewed data
Continuous data on clinical and social outcomes are often not normally distributed. To avoid the pitfall of applying parametric tests to non-parametric data, we aimed to apply the following standards to all data before inclusion: (a) standard deviations and means are reported in the paper or obtainable from the authors; (b) when a scale starts from the finite number zero, the standard deviation, when multiplied by two, is less than the mean (as otherwise the mean is unlikely to be an appropriate measure of the centre of the distribution (Altman 1996)); (c) if a scale starts from a positive value (such as PANSS, which can have values from 30 to 210), the calculation described above is modified to take the scale starting point into account. In these cases skew is present if 2SD > (S - Smin), where S is the mean score and Smin is the minimum score. Endpoint scores on scales often have a finite start and end point, so these rules can be applied. When continuous data are presented on a scale that includes the possibility of negative values (such as change data), it is difficult to tell whether data are skewed or not. Skewed data from studies of fewer than 200 participants were entered as 'other data' rather than into an analysis; skewed data pose less of a problem when looking at means if the sample size is large, and were then entered into syntheses.
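The rule in (b) and (c) can be expressed as a single check. This helper is purely illustrative (not part of the review's methods) and uses made-up PANSS-style numbers:

```python
def exclude_as_skewed(mean, sd, scale_min, n_participants):
    """Rule of thumb described above: endpoint data on a scale starting at
    `scale_min` are treated as skewed if 2 * SD > (mean - scale_min);
    a scale starting at zero is the special case scale_min = 0.
    Only small studies (< 200 participants) are excluded on this basis."""
    skewed = 2 * sd > (mean - scale_min)
    return skewed and n_participants < 200

# PANSS runs from 30 to 210, so scale_min = 30
print(exclude_as_skewed(mean=60, sd=20, scale_min=30, n_participants=45))   # → True
print(exclude_as_skewed(mean=100, sd=20, scale_min=30, n_participants=45))  # → False
```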
Unit of analysis issues
1. Cluster trials
Studies increasingly employ 'cluster randomisation' (such as randomisation by clinician or practice) but analysis and pooling of clustered data poses problems. Firstly, authors often fail to account for intraclass correlation in clustered studies, leading to a 'unit of analysis' error (Divine 1992) whereby P values are spuriously low, confidence intervals unduly narrow and statistical significance overestimated. This causes type I errors (Bland 1997; Gulliford 1999).
Where clustering was not accounted for in primary studies, we planned to present data in a table, with a (*) symbol to indicate the presence of a probable unit of analysis error. In subsequent versions of this review we will seek to contact first authors of studies to obtain the intraclass correlation coefficient (ICC) of their clustered data and to adjust for this by using accepted methods (Gulliford 1999). If clustering had been incorporated into the analysis of primary studies, we planned to present these data as if from a non-cluster randomised study, but adjusted for the clustering effect.
We have sought statistical advice and have been advised that the binary data as presented in a report should be divided by a 'design effect'. This is calculated using the mean number of participants per cluster (m) and the ICC [Design effect = 1+(m-1)*ICC] (Donner 2002). If the ICC had not been reported, we would have assumed it to be 0.1 (Ukoumunne 1999).
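The design-effect adjustment described above can be sketched as follows (a minimal illustration, assuming the stated default ICC of 0.1 when none is reported; the cluster sizes are hypothetical):

```python
def design_effect(mean_cluster_size, icc=0.1):
    """Donner 2002: design effect = 1 + (m - 1) * ICC, where m is the
    mean number of participants per cluster. ICC defaults to 0.1
    when unreported (Ukoumunne 1999)."""
    return 1 + (mean_cluster_size - 1) * icc

def adjusted_n(n, mean_cluster_size, icc=0.1):
    """Divide a raw count by the design effect to get an effective size."""
    return n / design_effect(mean_cluster_size, icc)

print(design_effect(10))          # → 1.9
print(round(adjusted_n(95, 10)))  # → 50
```

Dividing both the event counts and the denominators by the design effect shrinks the effective sample size, widening confidence intervals to reflect the within-cluster correlation.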
If we had included cluster studies and they had been appropriately analysed taking into account the ICC and relevant data documented in the report, synthesis with other studies would have been possible using the generic inverse variance technique.
2. Cross-over trials
A major concern of cross-over trials is the carry-over effect. It occurs if an effect (e.g. pharmacological, physiological or psychological) of the treatment in the first phase is carried over to the second phase. As a consequence, on entry to the second phase the participants can differ systematically from their initial state despite a wash-out phase. For the same reason, cross-over trials are not appropriate if the condition of interest is unstable (Elbourne 2002). As both effects are very likely in schizophrenia, had we included cross-over trials, we planned to use only data from the first phase.
3. Studies with multiple treatment groups
Where a study involved more than two treatment arms, the relevant additional treatment arms were presented in comparisons. Where the additional treatment arms were not relevant, these data would not have been reproduced.
Dealing with missing data
1. Overall loss of credibility
At some degree of loss to follow-up, data must lose credibility. We were forced to make a judgement about where this point lies for the very short-term trials likely to be included in this review. Should more than 30% of data have been unaccounted for by 24 hours, we did not reproduce these data or use them within analyses.
In the case where attrition for a binary outcome is between 0% and 30% and outcomes of these people are described, we included these data as reported. Where these data were not clearly described, we assumed the worst primary outcome, and rates of adverse effects similar to those who did continue to have their data recorded.
In the case where attrition for a continuous outcome is between 0% and 30% and completer-only data were reported, we have reproduced these.
Assessment of heterogeneity
As only one study was included we could not examine heterogeneity.
1. Clinical heterogeneity
We planned to consider all included studies without any comparison to judge clinical heterogeneity.
2. Statistical heterogeneity
2.1 Visual inspection
We planned to visually inspect graphs to investigate the possibility of statistical heterogeneity.
2.2 Employing the I-squared statistic
This provides an estimate of the percentage of variability across studies thought to be due to heterogeneity rather than chance. An I-squared estimate greater than or equal to 40% would have been interpreted as evidence of high levels of heterogeneity (Higgins 2003).
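I-squared is derived from Cochran's Q statistic and its degrees of freedom (k - 1 for k studies). An illustrative computation, with a made-up Q value:

```python
def i_squared(q, num_studies):
    """I² = max(0, (Q - df) / Q) * 100, with df = k - 1 (Higgins 2003).
    Negative values are truncated to zero."""
    df = num_studies - 1
    if q <= df:
        return 0.0
    return (q - df) / q * 100.0

# Hypothetical: Q = 10 across 5 studies (df = 4)
print(i_squared(10.0, 5))  # → 60.0
```

By the review's threshold, an I² of 60% would have been read as evidence of high heterogeneity.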
Assessment of reporting biases
Reporting biases arise when the dissemination of research findings is influenced by the nature and direction of results (Egger 1997). These are described in section 10.1 of the Cochrane Handbook (Higgins 2005). We are aware that funnel plots may be useful in investigating reporting biases but are of limited power to detect small-study effects. As we only included one study, we did not use funnel plots to investigate the likelihood of overt publication bias.
Data synthesis
Where possible we employed a fixed-effect model for analyses. We understand that there is no closed argument for preferring fixed- or random-effects models. The random-effects method incorporates an assumption that the different studies are estimating different, yet related, intervention effects. This does seem true to us; however, random-effects models put added weight onto the smaller studies, those trials that are most vulnerable to bias. For this reason we favour fixed-effect models, employing random-effects only when investigating heterogeneity.
Subgroup analysis and investigation of heterogeneity
There were no subgroup analyses planned.
Sensitivity analysis
Had there been more trials, we would have analysed the effect of excluding studies with high attrition rates in a sensitivity analysis. We would also have compared primary outcomes for trials where randomisation was implied, rather than described, with those where allocation was clearly at random.