Summary of findings
Description of the condition
Schizophrenia is a severe form of mental illness characterised by positive symptoms such as delusions and hallucinations; negative symptoms such as social withdrawal, loss of motivation and an inability to experience pleasure; and abnormalities in cognition. These symptoms lead to significant social and occupational dysfunction and the majority of sufferers also experience a lack of insight, believing that they do not have a disorder that warrants treatment (Tandon 2009). People with schizophrenia also have a significantly reduced life expectancy compared with the general population (Tiihonen 2009), due to unnatural deaths (i.e. suicide and accidents) and physical disorders (i.e. heart disease, endocrine diseases, respiratory disease and infectious diseases) (Saha 2007). The onset of schizophrenia tends to be in adolescence or early adulthood. One meta-analysis reported the median (10% to 90% quantile) incidence of schizophrenia to be 15.2/100,000 population with a greater distribution in males compared with females; the male/female rate ratio median (10% to 90% quantile) was 1.40 (95% CI 0.9 to 2.4) (McGrath 2004). The lifetime prevalence of schizophrenia is relatively high - at 7.2/1000 population - because it usually starts in early adult life and is a chronic disease (Saha 2005). In the Global Burden of Disease Study 2010, mental illness and behavioural disorders accounted for 7.4% of all DALYs (Disability Adjusted Life Years), and were attributable to more than 15 million DALYs each; schizophrenia ranked as fifth under this category (accounting for 0.6%) after depression, anxiety, drug-use and alcohol-use disorders (Murray 2012). Medication is the mainstay of treatment for schizophrenia; however, 5% to 15% of people continue to experience symptoms in spite of medication and may also develop undesirable adverse effects (Johnstone 1998) - improved quality of life is then particularly important for these individuals. 
On a related topic, Faulkner 2005 concludes that exercise may alleviate secondary symptoms of schizophrenia such as depression, low self-esteem and social withdrawal.
Description of the intervention
The terms horticultural therapy (HT) and therapeutic horticulture (TH) are often used interchangeably. However, some clinicians and researchers distinguish between them (Gonzalez 2009). Horticultural therapy is defined by the American Horticultural Therapy Association as the engagement of a person in gardening-related activities, facilitated by a trained therapist, to achieve specific treatment goals (AHTA 2011). Therapeutic horticulture is defined as an open programme, "a process that uses plant-related activities through which participants strive to improve their well-being through active and passive involvement", which can be easily implemented and performed by a variety of healthcare providers (Gonzalez 2009). Flournoy broadly describes horticultural therapy as the process of utilising fruits, vegetables, flowers and plants (Flournoy 1975). We use Flournoy's definition, with its emphasis on active participation, to define horticultural therapy, whether it is delivered by a trained therapist or by another healthcare provider. Whether described as horticultural therapy or therapeutic horticulture, both can be used as vehicles for therapy or rehabilitation programmes for cognitive, physical, social, emotional, and recreational benefits (Wichrowski 2005), thus improving the person's body, mind and spirit.
How the intervention might work
Horticultural therapy may work in a number of ways, but certain elements will be present. The setting would be a hospital healing garden or therapeutic garden; a nature-orientated space of sanctuary in which the basic urge for contact with nature can be met. The garden design should facilitate security and physical comfort and should be calm and quiet to create therapeutic and rehabilitative potential (Soderback 2004). Many participators in these projects have said that the setting is important, and they talked about the enjoyment of "being outside" or "being in the fresh air" (Sempik 2011a). This type of garden may help reduce stress in patients suffering from severe mental disorders. Since stress has been demonstrated to play a role in both the development and outcome of mental disorders, a reduction of stress associated with working in a therapeutic garden may improve outcomes in patients with schizophrenia. Many people with schizophrenia suffer from social isolation and a lack of social support. Working in a therapeutic garden may provide opportunities for social interaction and network building, improving an individual's overall social well-being. The activities will be led by a trained therapist or healthcare provider who will teach horticultural tasks such as soil preparation, sowing seeds, setting out plants, pruning etc. They may help participants record their tasks by writing or drawing, develop confidence and self-esteem through their work, and assist individuals to improve their social and practical skills. Throughout the programme they will be observing individuals to monitor their progress, and assessing whether they have met predetermined targets/outcomes. Finally, people with schizophrenia may be more sedentary and less physically active than the general population (Gorczynski 2010). Low levels of physical activity are associated with poor physical and mental health and premature mortality. 
Therapeutic gardening is a source of physical activity and may, therefore, benefit people with schizophrenia (Flournoy 1975).
Why it is important to do this review
Horticultural therapy has been offered to a number of groups, including people with depression (Gonzalez 2009), people with aphasia (Sarno 1997) and people in cardiac rehabilitation (Wichrowski 2005). It has a long clinical tradition, but other than O'Reilly and Handorth (O'Reilly 1955), there are few published studies of its use as a therapy for psychiatric patients (Sempik 2011b) and little research evidence to support its claims of effectiveness (Sempik 2011a).
To evaluate the effects of horticultural therapy for people with schizophrenia or schizophrenia-like illnesses compared with standard care or other additional psychosocial interventions.
Criteria for considering studies for this review
Types of studies
All relevant randomised controlled trials. If a trial had been described as 'double blind' in a way that implied randomisation, we would have included it in a sensitivity analysis. We intended that if there was no substantive difference within primary outcomes (see Types of outcome measures) when implied randomised trials were added, then we would have included these trials in the final analysis. If there was a substantial difference, we would have only analysed clearly randomised trials and described the results of the sensitivity analysis in the text. We excluded quasi-randomised studies, such as those where allocation was undertaken by alternate days of the week. Where people were given additional treatments within horticultural therapy, we only included the data if the adjunct treatment was evenly distributed between groups and it was only the horticultural therapy that was randomised.
Types of participants
Adults, however defined, with schizophrenia or related disorders, including schizophreniform disorder, schizoaffective disorder and delusional disorder, again, by any means of diagnosis.
Types of interventions
1. Horticultural therapy
The engagement of a person in horticultural activities, which is facilitated by a trained therapist or healthcare provider.
2. Standard Care
Standard care or care as usual is defined as standard psychiatric care for participants in the trial. It was assumed that both intervention and control participants would be receiving standard care, which would normally include: medication, medication management, case management, and supportive psychotherapy.
Types of outcome measures
We planned to divide all outcomes into short term (less than six months), medium term (seven to 12 months) and long term (over one year).
1. Well-being and quality of life measures
1.1 No clinically important change in quality of life
1.2 Average endpoint quality of life score
1.3 Average change in quality of life scores
1.4 No clinically important change in specific aspects of quality of life
1.5 Average endpoint specific aspects of quality of life
1.6 Average change in specific aspects of quality of life
1. Clinical global response
1.1 Global state - not improved
1.3 Average change or endpoint score in global state
1.4 Leaving the study early
1.5 Compliance with medication
2. General functioning
2.1 No clinically important change in general functioning
2.2 Average endpoint general functioning score
2.3 Average change in general functioning scores
2.4 No clinically important change in specific aspects of functioning, such as social or life skills
2.5 Average endpoint specific aspects of functioning, such as social or life skills
2.6 Average change in specific aspects of functioning, such as social or life skills
3. Physical fitness
3.1 No clinically important change in physical fitness
3.2 Average endpoint physical fitness score
3.3 Average change in physical fitness scores
3.4 No clinically important change in specific aspects of physical fitness
3.5 Average endpoint specific aspects of physical fitness
3.6 Average change in specific aspects of physical fitness
4. Satisfaction with treatment
4.1 Leaving the studies early
4.2 Recipient of care not satisfied with treatment
4.3 Recipient of care average satisfaction score
4.4 Recipient of care average change in satisfaction scores
4.5 Carer not satisfied with treatment
4.6 Carer average satisfaction score
4.7 Carer average change in satisfaction scores
5. Mental state and behaviour
5.1 Positive symptoms (delusions, hallucinations, disordered thinking)
5.2 Negative symptoms (avolition, poor self-care, blunted affect)
5.3 No clinically important change in specific symptoms
5.4 Average change or endpoint score
6. Adverse effects
6.1 No clinically important general adverse effects
6.2 Not any general adverse effects
6.3 Average change or endpoint general adverse effect scores
6.4 No clinically important change in specific adverse effect
6.5 Not any change in specific adverse effects
6.6 Average change or endpoint specific adverse effects
6.7 Death - suicide and all causes of mortality
7. 'Summary of findings' table
We used the GRADE approach to interpret findings (Schünemann 2008) and used GRADE Profiler to import data from RevMan 5 to create 'Summary of findings' tables. These tables provide outcome-specific information concerning the overall quality of evidence from the included study in the comparison, the magnitude of effect of the interventions examined, and the sum of available data on all outcomes we rated as important to patient-care and decision making. We selected the following main outcomes for inclusion in the 'Summary of findings' table.
- Well-being and quality of life measures - no clinically important change in quality of life; average endpoint or change in quality of life scores
- Clinical global response - global state - not improved
- General functioning - no clinically important change in general functioning
- Physical fitness - no clinically important change in physical fitness
- Satisfaction with treatment - leaving the studies early
- Adverse effect - no clinically important general adverse effects
- Mental state and behaviour - average change or endpoint score
Search methods for identification of studies
Cochrane Schizophrenia Group Trials Register (9th January 2013)
We searched the Cochrane Schizophrenia Group Trials Register using the phrase:
[*garden* OR *horticult* OR *cultivat* OR * plant* OR *floriculture* OR *topiary* in title, abstract and index terms of REFERENCE]
The Cochrane Schizophrenia Group’s Registry of Trials is compiled by systematic searches of major resources (including AMED, BIOSIS, CINAHL, EMBASE, MEDLINE, PsycINFO, PubMed, and registries of Clinical Trials) and their monthly updates, handsearches, grey literature, and conference proceedings. The searches do not have language limitation (for details of the search term for schizophrenia and databases searched please also see their Group Module).
Searching other resources
1. Reference searching
We inspected references of all identified studies for further relevant studies.
2. Personal contact
We contacted the first author of the one included study for information regarding unpublished trials.
Data collection and analysis
Selection of studies
Review authors YL and LB independently inspected citations identified in the search. Where disputes arose, we acquired the full report for a more detailed scrutiny. YL and LB obtained and inspected the full reports of the abstracts meeting the review criteria. Where it was impossible to resolve disagreement by discussion, we attempted to contact the authors of the study for clarification.
Data extraction and management
Review authors YL and LB extracted data from the included study. Where possible, we extracted data presented only in graphs and figures, but included these data only if both review authors obtained similar results. When further information was necessary, we contacted the authors of the study in order to obtain missing data or for clarification.
We extracted data onto standard, simple forms.
2.2 Scale-derived data
We included continuous data from rating scales only if:
a. the psychometric properties of the measuring instrument have been described in a peer-reviewed journal (Marshall 2000); and
b. the measuring instrument has not been written or modified by one of the trialists for that particular trial.
Ideally, the measuring instrument should either be i. a self-report or ii. completed by an independent rater or relative (not the therapist). We realise that this is not often reported clearly; we have noted whether or not this is the case in Description of studies.
2.3 Endpoint versus change data
There are advantages of both endpoint and change data. Change data can remove a component of between-person variability from the analysis. On the other hand, calculation of change needs two assessments (baseline and endpoint), which can be difficult in unstable and difficult to measure conditions such as schizophrenia. We decided primarily to use endpoint data, and only use change data if the former were not available. Only one trial was included; therefore, we were unable to combine data. If, in future updates of this review, we identify more trials, we will combine endpoint and change data in the analysis using mean differences (MD) rather than standardised mean differences (SMD) throughout (Higgins 2011).
2.4 Skewed data
Continuous data on clinical and social outcomes are often not normally distributed. To avoid the pitfall of applying parametric tests to non-parametric data, we aimed to apply the following standards to all data before inclusion:
a) standard deviations (SDs) and means are reported in the paper or obtainable from the authors;
b) when a scale starts from the finite number zero, the SD, when multiplied by two, is less than the mean (as otherwise the mean is unlikely to be an appropriate measure of the centre of the distribution), (Altman 1996);
c) if a scale started from a positive value (such as the Positive and Negative Syndrome Scale (PANSS, Kay 1986), which can have values from 30 to 210), we modified the calculation described above to take the scale starting point into account. In these cases skew is present if 2 SD > (S-S min), where S is the mean score and S min is the minimum score.
Endpoint scores on scales often have a finite start and end point and these rules can be applied. We would have entered skewed endpoint data from studies of fewer than 200 participants as 'other data' within the data and analyses section rather than into a statistical analysis. Skewed data pose less of a problem when looking at the mean if the sample size is large; we would have entered such endpoint data into syntheses. In the one included study, continuous data were presented on a scale that included negative values (change data). With such data it is difficult to tell whether they are skewed or not, so we entered skewed change data into analyses regardless of the size of the study.
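The skew rule in standards b) and c) can be expressed as a short check. The following is a minimal sketch only; the function name and the example values are hypothetical and are not taken from the included study.

```python
def looks_skewed(mean, sd, scale_min=0.0):
    """Apply the rule of thumb: skew is suspected when 2 * SD > (mean - scale minimum).

    scale_min is 0 for scales that start at zero, or the lowest possible
    score for scales that start at a positive value (e.g. 30 for the PANSS).
    """
    return 2 * sd > (mean - scale_min)


# Hypothetical values for illustration only.
print(looks_skewed(mean=12.0, sd=7.0))                   # scale starting at zero
print(looks_skewed(mean=75.0, sd=15.0, scale_min=30.0))  # scale starting at 30
```

The check applies only to endpoint data on scales with a finite minimum; it is not meaningful for change scores, which can be negative.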
2.5 Common measure
To facilitate comparison between trials, we intended to convert variables that can be reported in different metrics, such as days in hospital (mean days per year, per week or per month) to a common metric (e.g. mean days per month).
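As an illustration of such a conversion, the sketch below rescales days in hospital to mean days per month. The function name is hypothetical, and the factors of 12 months and 52 weeks per year are simplifying assumptions made for this example.

```python
MONTHS_PER_YEAR = 12
WEEKS_PER_YEAR = 52  # approximation assumed for illustration


def to_days_per_month(value, unit):
    """Convert a mean number of days reported per year, month or week to days per month."""
    if unit == "per_year":
        return value / MONTHS_PER_YEAR
    if unit == "per_month":
        return float(value)
    if unit == "per_week":
        return value * WEEKS_PER_YEAR / MONTHS_PER_YEAR
    raise ValueError(f"unknown unit: {unit}")


# Hypothetical trial reports, for illustration only.
print(to_days_per_month(24, "per_year"))
print(to_days_per_month(3, "per_week"))
```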
2.6 Conversion of continuous to binary
Where possible, we would have made efforts to convert outcome measures to dichotomous data. This can be done by identifying cut-off points on rating scales and dividing participants accordingly into 'clinically improved' or 'not clinically improved'. It is generally assumed that if there is a 50% reduction in a scale-derived score such as the Brief Psychiatric Rating Scale (BPRS, Overall 1962) or the PANSS (Kay 1986), this could be considered as a clinically significant response (Leucht 2005; Leucht 2005a). If data based on these thresholds were not available, we would have used the primary cut-off presented by the original authors. However, in the one included study, these data were not available.
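A minimal sketch of the 50% reduction cut-off follows. The function name is hypothetical, and the optional `scale_min` argument (to subtract the lowest possible score before computing the percentage) is an assumption added for illustration rather than a step specified in this review.

```python
def clinically_improved(baseline, endpoint, threshold=0.5, scale_min=0.0):
    """Return True if the score fell by at least `threshold` (as a proportion) from baseline.

    scale_min is an illustrative assumption: set it to the scale's lowest
    possible score if the reduction should be measured from that floor.
    """
    span = baseline - scale_min
    if span <= 0:
        raise ValueError("baseline must be above the scale minimum")
    reduction = (baseline - endpoint) / span
    return reduction >= threshold


# Hypothetical participant scores, for illustration only.
print(clinically_improved(baseline=40, endpoint=20))
print(clinically_improved(baseline=40, endpoint=30))
```

Counting the participants classed as improved or not improved in each arm then yields the two-by-two table needed for a binary analysis.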
2.7 Direction of graphs
Where possible, we entered data in such a way that the area to the left of the line of no effect indicated a favourable outcome for horticultural therapy. If we had to report data where the left of the line indicated an unfavourable outcome, this would have been noted in the relevant graphs.
Assessment of risk of bias in included studies
Again, YL and LB worked independently using criteria described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) to assess trial quality. This set of criteria is based on evidence of associations between overestimation of effect and high risk of bias in domains such as sequence generation, allocation concealment, blinding, incomplete outcome data and selective reporting.
If the raters disagreed, we made the final rating by consensus, with the involvement of another member of the review group. If inadequate details of randomisation and other characteristics of the included trial had been provided, we would have contacted the authors of the study in order to obtain additional information. We would have reported non-concurrence in quality assessment, but if disputes arose as to which category a trial was to be allocated, again, we would have resolved these by discussion.
We noted the level of risk of bias in both the text of the review and in the Summary of findings for the main comparison.
Measures of treatment effect
1. Binary data
For binary outcomes, we calculated a standard estimation of the risk ratio (RR) and its 95% confidence interval (CI). It has been shown that RR is more intuitive (Boissel 1999) than odds ratios and that odds ratios tend to be interpreted as RR by clinicians (Deeks 2000). For statistically significant results, we had planned to calculate the number needed to treat to provide benefit/to induce harm statistic (NNTB/H), and its 95% CI using Visual Rx (http://www.nntonline.net/) taking account of the event rate in the control group. However, we found no statistically significant results.
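For readers unfamiliar with the calculation, the sketch below computes a risk ratio and an approximate 95% CI on the log scale. It is an illustration under stated assumptions (normal approximation, at least one event in each group, no zero-cell correction); the function name and the counts are hypothetical, and this is not a reproduction of the RevMan implementation.

```python
import math


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio with a normal-approximation confidence interval on the log scale.

    Assumes at least one event in each group (no zero-cell correction is applied).
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, for illustration only: 10/50 events versus 20/50 events.
rr, lower, upper = risk_ratio(10, 50, 20, 50)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```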
2. Continuous data
For continuous outcomes, we estimated MD between groups. We preferred not to calculate effect size measures (SMD). However, if, in future versions of this review, scales of very considerable similarity are used, we will presume there is a small difference in measurement, and calculate the effect size and transform the effect back to the units of one or more of the specific instruments.
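The mean difference for a single two-arm comparison can be sketched as follows. The normal-approximation CI, the function name and the example values are assumptions made for illustration only.

```python
import math


def mean_difference(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl, z=1.96):
    """Mean difference between two groups with a normal-approximation CI."""
    md = mean_trt - mean_ctl
    se = math.sqrt(sd_trt ** 2 / n_trt + sd_ctl ** 2 / n_ctl)
    return md, md - z * se, md + z * se


# Hypothetical summary statistics, for illustration only.
md, lower, upper = mean_difference(10.0, 4.0, 16, 8.0, 3.0, 9)
print(f"MD {md:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```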
Unit of analysis issues
1. Cluster trials
Studies increasingly employ 'cluster randomisation' (such as randomisation by clinician or practice), but analysis and pooling of clustered data poses problems. Firstly, authors often fail to account for intra-class correlation in clustered studies, leading to a 'unit of analysis' error (Divine 1992) whereby P values are spuriously low, confidence intervals (CIs) unduly narrow and statistical significance overestimated. This causes type I errors (Bland 1997; Gulliford 1999).
If clustering had not been accounted for in primary studies, we planned to present data in a table, with a (*) symbol to indicate the presence of a probable unit of analysis error. We identified no cluster trials in this review; however, in subsequent versions of this review where cluster trials are identified, we will seek to contact first authors of such studies to obtain intra-class correlation coefficients (ICCs) for their clustered data and to adjust for this by using accepted methods (Gulliford 1999). Where clustering has been incorporated into the analysis of primary studies, we will present these data as if from a non-cluster randomised study, but adjust for the clustering effect.
We have sought statistical advice and have been advised that the binary data as presented in a report should be divided by a 'design effect'. This is calculated using the mean number of participants per cluster (m) and the ICC [Design effect = 1+(m-1)*ICC] (Donner 2002). If the ICC is not reported it will be assumed to be 0.1 (Ukoumunne 1999).
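The design effect adjustment described above can be sketched as follows. The function names and the example counts are hypothetical; the default ICC of 0.1 is the value this review assumes when none is reported.

```python
def design_effect(mean_cluster_size, icc=0.1):
    """Design effect = 1 + (m - 1) * ICC, where m is the mean number of participants per cluster."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_counts(events, total, mean_cluster_size, icc=0.1):
    """Divide the reported events and sample size by the design effect."""
    deff = design_effect(mean_cluster_size, icc)
    return events / deff, total / deff


# Hypothetical cluster trial arm, for illustration only:
# 30 events among 120 participants, with a mean of 21 participants per cluster.
print(design_effect(21))
print(effective_counts(30, 120, 21))
```

The adjusted counts are generally not whole numbers; how they are rounded before entry into an analysis is a separate decision not covered by this sketch.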
If cluster studies have been appropriately analysed taking into account ICCs and relevant data documented in the report, synthesis with other studies would be possible using the generic inverse variance technique.
2. Cross-over trials
A major concern of cross-over trials is the carry-over effect. It occurs if an effect (e.g. pharmacological, physiological or psychological) of the treatment in the first phase is carried over to the second phase. As a consequence, on entry to the second phase the participants can differ systematically from their initial state despite a wash-out phase. For the same reason cross-over trials are not appropriate if the condition of interest is unstable (Elbourne 2002). As both effects are very likely in severe mental illness, if we had included cross-over trials, we would only have used the data of the first phase of cross-over studies.
3. Studies with multiple treatment groups
No studies with multiple treatment groups were included. In future updates of this review, if an included study involves more than two treatment arms, we will present the additional treatment arms in comparisons where they are relevant. If data are binary, we will simply add and combine them within the two-by-two table. If data are continuous, we will combine data following the formula in section 7.7.3.8 (Combining groups) of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). Where the additional treatment arms are not relevant, we will not reproduce these data.
Dealing with missing data
1. Overall loss of credibility
At some degree of loss of follow-up, data must lose credibility (Xia 2009). We chose that, for any particular outcome, should more than 50% of data be unaccounted for, we would not reproduce these data or use them within analyses, except for the outcome of 'leaving the study early'. In future updates of this review, if more than 50% of those in one arm of a study are lost, but the total loss is less than 50%, we will address this within the 'Summary of findings' table by down-rating the quality.
In future updates of this review, where attrition for a binary outcome is between 0% and 50% and where these data are not clearly described, we will present data on a 'once-randomised-always-analyse' basis (an intention-to-treat (ITT) analysis). Those leaving the study early are all assumed to have the same rates of negative outcome as those who complete, with the exception of the outcome of death and adverse effects. For these outcomes the rate of those who stay in the study - in that particular arm of the trial - are used for those who do not. We will undertake a sensitivity analysis to test how prone the primary outcomes are to change when 'completer' data only are compared with the ITT analysis using the above assumptions.
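The assumption that those leaving early have the same rate of negative outcome as completers can be sketched as below. The function name and the counts are hypothetical, and the sketch covers only the general rule, not any outcome-specific exceptions.

```python
def itt_events(events_completers, n_completers, n_randomised):
    """Estimate events among all randomised participants in one arm.

    Those who left early are assumed to have the same event rate as completers.
    """
    if n_completers <= 0 or n_completers > n_randomised:
        raise ValueError("need 0 < n_completers <= n_randomised")
    dropouts = n_randomised - n_completers
    completer_rate = events_completers / n_completers
    return events_completers + dropouts * completer_rate


# Hypothetical arm, for illustration only: 50 randomised, 40 completed, 10 events among completers.
print(itt_events(10, 40, 50))
```

Comparing the result with the completer-only count corresponds to the sensitivity analysis described above.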
In future updates of this review, where attrition for a continuous outcome is between 0% and 50% and only completer data are reported, we will reproduce these.
3.2 Standard deviations
If standard deviations (SDs) were not reported, we first tried to obtain the missing values from the authors. If not available, where there are missing measures of variance for continuous data, but an exact standard error (SE) and CIs available for group means, and either P value or 't' value available for differences in mean, we can calculate them according to the rules described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011): When only the SE is reported, SDs are calculated by the formula SD = SE x square root (n). Chapters 7.7.3 and 16.1.3 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) present detailed formulae for estimating SDs from P values, t or F values, CIs, ranges or other statistics. If these formulae do not apply, we can calculate the SDs according to a validated imputation method which is based on the SDs of the other included studies (Furukawa 2006). Although some of these imputation strategies can introduce error, the alternative will be to exclude a given study's outcome and thus to lose information. We nevertheless will examine the validity of the imputations in a sensitivity analysis excluding imputed values.
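Two of the simpler conversions can be sketched as follows. The first is the SE formula given above; the second, which recovers an SD from a 95% CI around a group mean, uses the large-sample divisor of 3.92 and is included here as an assumption for illustration. Function names and values are hypothetical.

```python
import math


def sd_from_se(se, n):
    """SD = SE x square root (n)."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, width_in_se=3.92):
    """SD from a 95% CI around a single group mean.

    Assumes a large sample, so the CI spans 2 * 1.96 = 3.92 standard errors.
    """
    return math.sqrt(n) * (upper - lower) / width_in_se


# Hypothetical values, for illustration only.
print(sd_from_se(0.5, 16))
print(sd_from_ci(8.04, 11.96, 16))
```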
3.3 Last observation carried forward
We anticipated that in some studies the method of last observation carried forward (LOCF) would be employed within the study report. As with all methods of imputation to deal with missing data, LOCF introduces uncertainty about the reliability of the results (Leucht 2007). Therefore, in future versions of this review, where LOCF data have been used in the trial, if less than 50% of the data have been assumed, we plan to reproduce these data and indicate that they are the product of LOCF assumptions.
Assessment of heterogeneity
We only included one study in this version of the review. In future updates of this review, if we include more studies, we will use the following methods to assess heterogeneity.
1. Clinical heterogeneity
We will consider all included studies initially, without seeing comparison data, to judge clinical heterogeneity. We will simply inspect all studies for clearly outlying people or situations which we had not predicted would arise. When such situations or participant groups arise, we will discuss these fully.
2. Methodological heterogeneity
We will consider all included studies initially, without seeing comparison data, to judge methodological heterogeneity. We will simply inspect all studies for clearly outlying methods which we had not predicted would arise. When such methodological outliers arise, we will discuss these fully.
3. Statistical heterogeneity
3.1 Visual inspection
We will visually inspect graphs to investigate the possibility of statistical heterogeneity.
3.2 Employing the I² statistic
We will investigate heterogeneity between studies by considering the I² statistic.
Assessment of reporting biases
Reporting biases arise when the dissemination of research findings is influenced by the nature and direction of results (Egger 1997). These are described in Section 10 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We are aware that funnel plots may be useful in investigating reporting biases but are of limited power to detect small-study effects. We were unable to use funnel plots for outcomes as there was only one included study. In future updates of this review, we will not use funnel plots for outcomes where there are 10 or fewer studies, or where all studies are of similar sizes. In other cases, where funnel plots are possible, we will seek statistical advice in their interpretation.
We understand that there is no closed argument for preference for use of fixed-effect or random-effects models. The random-effects method incorporates an assumption that the different studies are estimating different, yet related, intervention effects. This often seems to be true to us and the random-effects model takes into account differences between studies even if there is no statistically significant heterogeneity. There is, however, a disadvantage to the random-effects model. It puts added weight onto small studies, which are often the most biased ones. Depending on the direction of effect, these studies can either inflate or deflate the effect size. In this version of the review we were unable to combine data. In future updates, if there are more trials, we will use a random-effects model for all analyses. The reader will, however, be able to choose to inspect the data using the fixed-effect model.
Subgroup analysis and investigation of heterogeneity
1. Subgroup analyses - only primary outcomes
We do not anticipate that we will have the data for subgroup analyses.
2. Investigation of heterogeneity
There is only one included study. In future updates - if we include more studies - if inconsistency is high, we will report this. First, we will investigate whether data have been entered correctly. Second, if data are correct, we will visually inspect the graph and successively remove outlying studies to see if homogeneity is restored. For this review, we decided that should this occur with data contributing to the summary finding of no more than around 10% of the total weighting, we will present data. If not, then we will not pool data and we will discuss these issues. We know of no supporting research for this 10% cut-off but we use prediction intervals as an alternative to this unsatisfactory state.
When unanticipated clinical or methodological heterogeneity is obvious, we will simply state hypotheses regarding these for future reviews or versions of this review. We do not anticipate undertaking analyses relating to these.
We would have applied all sensitivity analyses to the primary outcomes of this review.
1. Implication of randomisation
We planned to include trials in a sensitivity analysis if they were described in some way so as to imply randomisation. For the primary outcomes, we would have included these studies and if there was no substantive difference when the implied randomised studies were added to those with better description of randomisation, then we would have entered all data from these studies.
2. Assumptions for lost binary data
In future updates, where assumptions have to be made regarding people lost to follow-up (see Dealing with missing data), we will compare the findings of the primary outcomes when we use our assumption compared with completer data only. If there is a substantial difference, we will report results and discuss them but continue to employ our assumption.
If assumptions have to be made regarding missing SDs data (see Dealing with missing data), we will compare the findings of primary outcomes when we use our assumption compared with complete data only. A sensitivity analysis will be undertaken to test how prone results are to change when 'completer' data only are compared with the imputed data using the above assumption. If there is a substantial difference, we will report results and discuss them but continue to employ our assumption.
3. Risk of bias
We planned to analyse the effects of excluding trials that were judged to be at high risk of bias across one or more of the domains of randomisation (implied as randomised with no further details available), allocation concealment, blinding and outcome reporting for the meta-analysis of the primary outcome. If the exclusion of trials at high risk of bias did not substantially alter the direction of effect or the precision of the effect estimates, then we would have included data from these trials in the analysis.
4. Imputed values
In future updates, we will also undertake a sensitivity analysis to assess the effects of including data from trials where we use imputed values for ICC in calculating the design effect in cluster randomised trials. If we note substantial differences in the direction or precision of effect estimates in any of the sensitivity analyses listed above, we will not pool data from the excluded trials with the other trials contributing to the outcome, but present them separately.
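As an illustration of how an imputed ICC feeds into the design effect, the sketch below uses the commonly cited formula design effect = 1 + (m - 1) x ICC, where m is the average cluster size. The cluster size, sample size and ICC values are hypothetical and chosen only for illustration; none of them come from this review.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster randomised trial: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Sample size after dividing by the design effect."""
    return n / design_effect(mean_cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical trial: 190 participants in clusters of 10 (assumed values).
    for icc in (0.05, 0.10, 0.20):
        deff = design_effect(10, icc)
        n_eff = effective_sample_size(190, 10, icc)
        print(f"ICC={icc:.2f}  design effect={deff:.2f}  effective n={n_eff:.1f}")
```

Running the loop over several plausible ICC values is one simple way to see how sensitive the effective sample size is to the imputed value.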
5. Fixed-effect and random-effects
We used a random-effects model to calculate data from the one included study.
Description of studies
For a more detailed description of each study, please see the Characteristics of included studies, Characteristics of excluded studies and Characteristics of studies awaiting classification.
Results of the search
The original search identified 19 potentially eligible studies. After checking the titles and abstracts, five full text papers were considered as potentially relevant and they were retrieved for further assessment; one was excluded (Galderisi 2010) and three were assigned to awaiting classification (Ban 2001a; Ban 2001b; Ban 2001c), leaving only one to include in this review (Kam 2010). The reference list of the one study was checked, through which we identified another three potentially relevant studies (Perrins-Margalis 2000; Son 2004a; Son 2004b). Again, full text papers were obtained and after closer inspection they were excluded because they were not randomised. Please see Figure 1.
|Figure 1. Study flow diagram to show trial selection|
We included only one study that compared horticultural therapy with conventional workshop training (Kam 2010). The characteristics of this study are described below (See: Characteristics of included studies).
1. Length of trial
The treatment period of the included study (Kam 2010) was completed within two weeks and consisted of 10 consecutive days, with no longer-term follow-up. The single study examined the short-term (less than six months) effects of horticultural therapy versus conventional workshop training before and after the intervention.
Twenty-four participants were included in the study, n = 17 men and n = 7 women with a mean age of 44.3 (SD = 11.6). Twenty-two participants had primary diagnoses of schizophrenia and two experienced other psychiatric illnesses (schizoaffective disorder, schizophreniform disorder, psychosis not otherwise specified, bipolar disorder, or major depression). More than half (58.3%) of the participants had received secondary education or higher.
This trial took place in the New Life Farm (NLF) in Hong Kong—a sheltered workshop that uses horticulture activities as rehabilitation media.
4. Study size
Study size was small with only 24 people randomised (see: Characteristics of included studies).
5.1 Intervention group
All participants in the intervention group maintained their standard care and in addition received horticultural therapy. The therapy was a one-hour horticultural activity session conducted on 10 consecutive days within two weeks (see Description of the intervention and Flournoy 1975). Each session had a specific theme and objectives:
- orientation: introduction to the programme, garden tour;
- organic tips: give an introduction to organic farming, review life story and successes in coping with life events;
- cultivation and growth: teach and practise watering and fertilising plants, improve understanding of the importance of protective factors in coping with stress;
- small steps toward great success: teach and practise weed removal and loosening soil, share experiences related to coping strategies;
- the great day: teach harvesting skills and how to examine and taste vegetables, share past interests and successful events;
- herbs for relaxation: introduction to herbs, draw and identify different herbs, share experiences related to personal interests;
- be tough as a scarecrow: make a scarecrow, share experiences related to the handicraft project and coping with stress;
- taste the herbs: make herb tea bags, share strategies related to self-management of diet;
- bringing it to life: teach the procedures of potting plants, share hopes, wishes, and plans for the future;
- grow with support: visit and introduction to the greenhouse, share experiences of the activity group.
A registered occupational therapist was responsible for implementing the horticultural programme.
5.2 Control group
The control group received standard care only. However, this standard care also had an element of horticultural therapy. The latter was not as long or focused but, nevertheless, was present. Standard care was one hour per day of horticultural activity, whereas the intervention group received much more intensive training as well as the activity.
A number of outcomes were reported. Personal Wellbeing Index (PWI-C) and Depression Anxiety Stress Scale (DASS21) were measured before and at the end of treatment; only the change scores from baseline to post intervention are listed in the paper. General functioning was reported using Work Behavior Assessment (WBA). We were, however, unable to use this outcome because the scale has not been peer-reviewed, and the author of the study appeared to have been involved in the development of this instrument. Satisfaction with treatment was also reported using Qualitative Evaluation but there were no usable data. Leaving the study early was reported. However, physical fitness and adverse effects were not reported in the included study.
6.1 Well-being and quality of life measures
Personal Wellbeing Index (PWI-C) (Lau 2005)
The PWI consisted of seven domains, measured on an 11-point end-defined Likert scale, with numerical ratings ranging from zero (extremely dissatisfied) to 10 (extremely satisfied). The PWI-C is a subjective quality of life measure which has been translated and validated for wide academic use in Hong Kong (Kam 2010). This scale is a generic measure of subjective well-being and asks how satisfied people are with seven life domains: standard of living, personal health, achievement in life, personal relationships, personal safety, community-connectedness and future security (Lau 2005). A higher score indicates a better quality of life.
6.2 Mental state and behaviour
Depression Anxiety Stress Scale (DASS21) (Lovibond 1995)
This 21-item scale, with a scoring system of zero to three for each item, measures current ("over the past week") symptoms of depression, anxiety and stress. It has been translated into Chinese and has been reported to be sensitive to the cultural and linguistic issues, and could significantly discriminate between the negative emotional syndromes of depression, anxiety and stress in Chinese populations (Lovibond 1995). A low score indicates lesser severity.
For a more detailed description of each study, please refer to the Characteristics of excluded studies. We excluded four studies from this review. Three of the excluded studies were not randomised (Son 2004a; Son 2004b; Perrins-Margalis 2000). Galderisi 2010 was excluded as gardening was included as part of the structured leisure activities at one of the sites only, and not as an intervention alone. Three studies are awaiting classification (Ban 2001a; Ban 2001b; Ban 2001c); all three studies were written by the same person. The participants and interventions were similar, but there were irregularities in the outcome data. We are awaiting clarification from the author.
1. Ongoing Studies
No studies were classified as ongoing.
2. Awaiting classification
Three studies are awaiting assessment because of some irregularities with the data (see Characteristics of studies awaiting classification).
Risk of bias in included studies
|Figure 2. 'Risk of bias' graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.|
|Figure 3. 'Risk of bias' summary: review authors' judgements about each risk of bias item for each included study.|
Kam 2010 described that the participants were "randomly allocated to the experimental or control groups by an independent research assistant who was blinded to the hypothesis and the intervention programme". The review authors therefore elected to consider this a low risk of bias.
A blinded independent assessor conducted all the outcome measures. Therefore, the included study was single blind (assessor blind).
Incomplete outcome data
Two people who left early appeared in the CONSORT diagram of the paper, with reasons unknown, but they were omitted from the comparison of change scores. The study does not reassure us on this point and we considered this to be a high risk of bias.
There is no selective reporting. All measurements as stated in the 'methods' section of the paper are reported.
Other potential sources of bias
No other potential biases were found.
Effects of interventions
The review contains one included study (Kam 2010). The included study allows for one comparison: horticultural therapy plus standard care (the 'intervention treatment' referred to below) versus standard care, where 'standard care' refers to conventional workshop training. Outcomes are presented below in the order in which they are listed in the Types of outcome measures section. Outcomes were measured at short term (less than six months), after 10 consecutive days of treatment within two weeks, with no longer follow-up.
Comparison 1: HORTICULTURAL THERAPY + STANDARD CARE versus STANDARD CARE
The one included study contained one comparison: horticultural therapy plus standard care versus standard care.
1. Well-being and quality of life measures
Short-term well-being and quality of life change scores from the PWI-C were presented. The average change scores were equivocal (1 RCT, n = 22, mean difference (MD) -0.90, 95% confidence interval (CI) -10.35 to 8.55, Analysis 1.1).
2. Clinical global response
The only included study did not report this outcome.
3. General functioning
The WBA scale was used to present data for general functioning. We were, however, unable to use this outcome because the scale has not been peer-reviewed, and the author of the study appeared to have been involved in the development of this instrument (see Data extraction and management).
4. Physical fitness
The only included study did not report this outcome.
5. Satisfaction with treatment
Qualitative Evaluation (Kam 2010) was used for measuring satisfaction with treatment, but the scale is not referenced and there were no usable data.
For leaving the study early, there was no clear difference between groups at the end of treatment (1 RCT, n = 24, risk ratio (RR) 5.00, 95% CI 0.27 to 94.34, Analysis 1.2).
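To show how a risk ratio and its confidence interval of this kind can arise, the sketch below computes an RR with a normal-approximation interval on the log scale, adding 0.5 to every cell when a cell is zero. The per-arm counts (2 of 12 versus 0 of 12) are an assumption made for illustration; this section reports only that 24 people were randomised and two left early, and the review's actual calculation may differ.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(events_a, total_a, events_b, total_b, level=0.95):
    """Risk ratio (arm A versus arm B) with a log-scale normal CI.

    If any cell of the 2x2 table is zero, 0.5 is added to every cell,
    which raises each arm total by 1.
    """
    cells = (events_a, total_a - events_a, events_b, total_b - events_b)
    if 0 in cells:
        events_a, events_b = events_a + 0.5, events_b + 0.5
        total_a, total_b = total_a + 1, total_b + 1
    rr = (events_a / total_a) / (events_b / total_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Assumed counts: 2/12 left early in one arm, 0/12 in the other.
    rr, lower, upper = risk_ratio_ci(2, 12, 0, 12)
    print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumed counts the result is close to the reported estimate, and the very wide interval reflects how little information two events in 24 participants can provide.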
6. Mental state and behaviour
The DASS21 scale was used to report change scores for mental state and behaviour.
6.1 Total change scores
Here a difference, favouring horticultural therapy, was found. Those in the horticultural therapy group had larger change scores (1 RCT, n = 22, MD -23.70, CI -35.37 to -12.03, Analysis 1.3).
6.2 Depression change scores
A difference, favouring horticultural therapy, was found again (1 RCT, n = 22, MD -8.03, CI -15.40 to -0.66).
6.3 Anxiety change scores
There was a difference between the two groups, suggesting horticultural therapy was more effective (1 RCT, n = 22, MD -9.67, CI -15.87 to -3.47).
6.4 Stress change scores
Again, a difference between groups was apparent, favouring horticultural therapy (1 RCT, n = 22, MD -5.50, CI -10.57 to -0.43).
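The reported mean differences and intervals can be read more closely by back-calculating the standard error from each interval and checking whether the interval excludes zero. The sketch below does this for the change scores reported in this section, assuming the intervals are 95% intervals based on a normal approximation (an assumption; the exact method used for the analyses is not stated here). The function and variable names are illustrative.

```python
from statistics import NormalDist

# (MD, lower limit, upper limit) as reported in the text of this review.
REPORTED = {
    "PWI-C": (-0.90, -10.35, 8.55),
    "DASS21 total": (-23.70, -35.37, -12.03),
    "DASS21 depression": (-8.03, -15.40, -0.66),
    "DASS21 anxiety": (-9.67, -15.87, -3.47),
    "DASS21 stress": (-5.50, -10.57, -0.43),
}


def summarise(md, lower, upper, level=0.95):
    """Recover SE, z and a two-sided p value from an MD and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z_crit)
    z = md / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"se": se, "z": z, "p": p, "excludes_zero": lower > 0 or upper < 0}


if __name__ == "__main__":
    for name, (md, lower, upper) in REPORTED.items():
        s = summarise(md, lower, upper)
        print(f"{name}: SE={s['se']:.2f}, z={s['z']:.2f}, "
              f"p={s['p']:.3f}, excludes zero={s['excludes_zero']}")
```

Intervals that exclude zero correspond to the DASS21 outcomes described as favouring horticultural therapy, while the PWI-C interval spans zero, matching the description of that result as equivocal.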
7. Adverse effects
The only included study did not report this outcome.
Summary of main results
HORTICULTURAL THERAPY + STANDARD CARE versus STANDARD CARE
1. Well-being and quality of life measures
Limited data from this one included study found no clear evidence of a difference between groups for well-being and quality of life measures after treatment.
2. Clinical global response
It is unfortunate that no data are available for service utilisation.
3. Satisfaction with treatment-leaving the study early
We did not find clear evidence of a difference between groups in the number of people leaving the study early at the end of treatment. It shows a relatively good acceptance of the experimental treatment, which may be considered as a positive effect of horticultural therapy itself and facilitate its use in practice.
4. Mental state and behaviour
We found a difference between treatment groups on the DASS21 change score (total, depression, anxiety and stress respectively). This means horticultural therapy plus standard care may, in the short term, be more effective than standard care in relieving depression, anxiety and stress symptoms. However, the results came from one small study (n = 24) with a short duration.
Overall completeness and applicability of evidence
Evidence from the included study is certainly relevant to the review question, but it is not sufficient to address all of the objectives of this review. We do not really have any good data on service utilisation or other crude but useful outcomes such as global response. The very short duration and follow-up in this included study might influence the directness of evidence, considering the chronic nature of schizophrenia.
Quality of the evidence
Overall, the evidence in this review is poor, as we were only able to include one small trial and the quality of this trial is unclear (Figure 3). Randomisation was undertaken by an independent research assistant who was blinded to the hypothesis and the intervention programme. A blinded independent assessor conducted all the outcome measures. Although randomisation was adequate, single blinding and the incomplete outcome data downgraded the included trial's quality.
Potential biases in the review process
Although the search for studies was thorough and the review protocol was strictly followed throughout, potential biases may still exist. We found only one randomised controlled trial that met our inclusion criteria, which suggests publication bias may exist. Some trials may not have been published in journals that could be identified through our search. In addition, there is a lag between the search date and the publication date, during which a number of eligible studies might have been completed; their impact on the results of the review is uncertain.
Agreements and disagreements with other studies or reviews
We have not identified any reviews on horticultural therapy for schizophrenia.
Implications for practice
1. For people with schizophrenia
Some of the data favoured horticultural therapy when combined with standard care, but the results came from one small study (n = 24) with short duration. There is currently insufficient evidence for people with schizophrenia to determine whether horticultural therapy is beneficial or not.
2. For clinicians
Based on the limited evidence currently available, clinicians cannot draw any conclusions on the benefits or harms of horticultural therapy for people with schizophrenia. More and larger randomised trials are needed before clinicians can confidently determine the efficacy of horticultural therapy in the treatment of schizophrenia.
3. For policy makers
There is not sufficient evidence in this review to support a change in policy.
Implications for research
Following the guidance of the CONSORT statement more closely would make future studies more informative for clinicians and people with schizophrenia. Clear description of randomisation, allocation concealment and blinding would have helped to assure that bias had been minimised.
We had to exclude Galderisi 2010 because it was not directly relevant. However, it could be included in a review evaluating leisure activity for schizophrenia.
This small study could not supply sufficient evidence to determine the effectiveness of horticultural therapy for schizophrenia. More and larger randomised trials with a simple, straightforward design are needed for future research (see Table 1 for suggested design of a future study).
The Cochrane Schizophrenia Group Editorial Base in Nottingham produces and maintains standard text for use in the Methods section of their reviews. We have used this text as the basis of what appears here and adapted it as required.
The authors would like to thank Farhad Shokraneh for running the trials search and acknowledge Claire Irving for her constant help and support. Finally, we would like to acknowledge Andrew J Bradley for his help in developing the protocol and review.
An external peer review for this review was completed by Mohammad Nejad Sigaroudi; we would like to acknowledge and thank him for his helpful comments.
Data and analyses
Last assessed as up-to-date: 3 September 2013.
Contributions of authors
Yan Liu - study selection, data extraction, input and analysis, writing up the review.
Stephanie Sampson - study selection, data extraction, writing up the review.
Li Bo - study selection, data extraction, input and analysis, writing up the review.
Guoyou Zhang - study selection, data extraction, input and analysis, writing up the review.
Samantha Roberts - completion of protocol, initial screening of search results.
Declarations of interest
Yan Liu - no known conflict of interest.
Stephanie Sampson - no known conflict of interest.
Li Bo - no known conflict of interest.
Guoyou Zhang - no known conflict of interest.
Samantha Roberts - no known conflict of interest (author of protocol).
Andrew J Bradley - no known conflict of interest (author of the protocol).
Sources of support
- University of Nottingham, UK.
- Eli Lilly, Basingstoke, UK.
- NSFC (No: 81303151), China. Project "The exploratory study of evidence-based medical record about doctor-patient building through joint efforts of integration of traditional and western for digestive system diseases" (No: 81303151), supported by the National Natural Science Foundation of China.
- No sources of support supplied
Differences between protocol and review
Due to the search retrieving low numbers of citations, it was more appropriate for both review authors to inspect all the citations retrieved, and to data extract all included studies rather than a percentage of each.
'Summary of findings' table: we added the seventh outcome 'mental state and behaviour'.