Illness management and recovery programme for people with severe mental illness

  • Protocol
  • Intervention

Authors

  • Lisa Korsbek,

    Corresponding author
    1. Mental Health Services Centre Ballerup, The Capital Region of Denmark, Competence Center for Rehabilitation and Recovery, Ballerup, Denmark
    • Lisa Korsbek, Competence Center for Rehabilitation and Recovery, Mental Health Services Centre Ballerup, The Capital Region of Denmark, Maglevænget 2, Building 24, Ballerup, 2750, Denmark. lisa.korsbek@regionh.dk.

  • Helle S Dalum,

    1. The Mental Health Services, Psychiatric Centre Ballerup, Competence Center for Rehabilitation and Recovery, Ballerup, Denmark
  • Jane Lindschou,

    1. Department 7812, Rigshospitalet, Copenhagen University Hospital, Copenhagen Trial Unit, Centre for Clinical Intervention Research, Copenhagen, Denmark
  • Lene Falgaard Eplov

    1. Mental Health Centre Copenhagen, Research Unit, Copenhagen NV, Denmark

Abstract

This is the protocol for a review and there is no abstract. The objectives are as follows:

To investigate the benefits and harms of the curriculum-based intervention IMR for people with schizophrenia or schizophrenia-like psychoses (schizophreniform and schizoaffective disorders).

Background

Description of the condition

The term severe mental illness (SMI) is generally used to describe a condition or group of conditions that causes people to have a severe and often long-term impairment in psychological and social functioning. Although there is no internationally agreed definition of SMI, there is wide consensus regarding that of the National Institute of Mental Health (NIMH) (Schinnar 1990; Dieterich 2010). This definition is based on three criteria: i. diagnosis of non-organic psychosis or personality disorder; ii. duration, characterised as involving ’prolonged illness’ and ’long term treatment’, and operationalised as a two-year or longer history of mental illness or treatment; and iii. disability, moderate impairment in work and non-work activities, and mild impairment in basic needs (National Institute of Mental Health 1987; Ruggeri 2000). A survey conducted in Europe in 1999 put the total population-based annual prevalence of SMI at approximately two per thousand (Ruggeri 2000).

Schizophrenia is an illness heavily contributing to the numbers of people considered severely mentally ill. The median lifetime morbid risk for schizophrenia is 7.2 per 1000 persons and, generally, the prevalence of schizophrenia ranges from four to seven per 1000 persons, depending on the type of prevalence estimate used (Warner 1995; Saha 2005; McGrath 2008). Schizophrenia is responsible for 1.1% of the disability-adjusted life-years worldwide (Picchioni 2007). Although the clinical presentation of schizophrenia varies widely among affected individuals and even within the same individual at different phases of the illness, schizophrenia as per the definition of the World Health Organization (WHO) is characterised by disruptions in thinking, affecting language, perception and the sense of self. It often includes psychotic experiences and can impair functioning (WHO 2013).

Description of the intervention

Pharmacological interventions, which are used to manage symptoms, comprise the main treatment modality for people with schizophrenia and other SMI. In recent years, however, it has become clear that medication alone is not sufficient as it tends to produce only limited improvement in social functioning and quality of life (Drake 2009; Kern 2009; Dixon 2010). Psychosocial treatments that enable people with schizophrenia and other SMI to cope with the disabling aspects of their illnesses are therefore a necessary complementary intervention.

Over the past decade several systematic efforts have been made to identify evidence-based psychosocial interventions for people with schizophrenia (Drake 2009; Dixon 2010). Examples of these interventions include systematic approaches to medication management, assertive community treatment, relapse prevention programmes and supported employment (Drake 2009).

The aims of psychosocial interventions are numerous. Often they may be intended to improve one or more of the following outcomes: to decrease the person’s mental vulnerability; reduce the impact of stressful events and situations; decrease distress and disability; minimise symptoms; improve quality of life; reduce relapses; improve coping skills; and involve the person's relatives (NICE 2009).

As treatment for people with schizophrenia and SMI in general requires the integration of different levels of care and different types of intervention to support independence, quality of life, personal well-being and social participation (Drake 2001; Freese 2001; Davidson 2006), most psychosocial interventions can be classified as multimodal. A multimodal intervention is defined as one that comprises two or more specific psychological or psychosocial interventions in a systematic and programmed way, with the aim of producing a benefit over and above that which might be achieved by a single intervention (NICE 2009). They are also often subsumed under the general term 'rehabilitation' (Almerie 2011).

Illness management and recovery (IMR) is a multimodal curriculum-based standardised programme that evolved out of this growing knowledge that psychosocial treatments which enable persons with schizophrenia and other SMI to cope with the disabling aspects of their illness and achieve personal goals are a necessary complement to pharmacological interventions. The impetus for developing the IMR programme initially arose at a Robert Wood Johnson Foundation Consensus Conference of US National Institutes of Mental Health staff, services researchers, advocates and the schizophrenia Patient Outcomes Research Team in Baltimore, USA, in 1997, where it was suggested that the various psychosocial interventions available for helping people manage their symptoms and prevent relapses needed to be consolidated into a single standardised programme for study and dissemination (Mueser 2006). To meet this need, the IMR programme was developed as part of the National Implementing Evidence-Based Practices Project, between 2000 and 2002, as an evidence-based practice based on the principles of recovery to help people with SMI. By collecting the evidence of different empirically supported practices, including psychoeducation, relapse prevention, behaviour training to improve medication adherence, coping skills training and social training, IMR was developed as a full-ranged rehabilitation programme and consolidated into a single standardised programme for study and dissemination (Mueser 2006; Dalum 2011). By relating to the concept of recovery, the care tradition of IMR is an orientation to psychiatric illness which holds that individuals are more than the sum of their symptoms and that a mental illness is only one aspect of a multidimensional sense of self, capable of identifying, choosing and pursuing personally meaningful goals and aspirations (Davidson 2006). 
Hence, IMR was developed in order to help people with schizophrenia and other SMI learn how to manage their illness more effectively in the context of pursuing their personal goals, and to acquire the knowledge and skills to manage their disorder independently (Mueser 2006).

There are currently three versions of the manual for IMR, which reflects a steady development of the intervention. A draft from 2003 was first developed into the August 2006 version (Gingerich 2006), and in 2010 an additional optional module on healthy lifestyles was incorporated into the second revised edition of the manual (Gingerich 2010). The latest version is from November 2011 and includes several key improvements, while existing modules have been expanded (Gingerich 2011).

The target groups for IMR are: people with schizophrenia, schizoaffective disorder, bipolar disorder and severe depression. The intervention is organised into 11 curriculum topic areas: recovery strategies, practical facts about mental illness, the stress-vulnerability model, building social support, using medication effectively, drug and alcohol use, reducing relapses, healthy lifestyle, coping with stress, coping with problems and symptoms, and getting your needs met in the mental health system.

The curriculum topic areas are taught using a combination of educational, motivational and cognitive-behavioural teaching strategies. In order to help people apply the information and skills that they learn in the sessions to their day-to-day lives, the participant and the practitioner collaborate to develop homework assignments at the end of each session. Every session follows the same routine, and sessions generally last between 45 and 60 minutes; however, it is possible to conduct more frequent, brief sessions, such as meeting for 20 to 30 minutes two or three times a week. The whole programme follows a structured pattern, with educational handouts and practitioners’ guidelines for each topic area. In each session of the programme, practitioners should follow up on the progress of participants towards their personal recovery goals.

IMR can be provided in an individual or group format, and generally lasts between 8 and 11 months, involving a series of weekly sessions where mental health practitioners help the participants to develop personalised strategies for managing their mental illness and moving forward in their lives. With participants’ consent, significant others (e.g., family, friends) are encouraged to be involved in supporting participants while they learn self-management strategies and pursue their personal goals. The participants can share their educational handouts with their significant others, practice specific learned skills and invite their significant others to participate in some of the sessions.

How the intervention might work

The IMR programme integrates specific, empirically supported strategies for teaching illness self-management into a cohesive treatment package based on two theoretical models: the transtheoretical model and the stress-vulnerability model (Mueser 2006; Dalum 2011). The transtheoretical model assumes that human change develops over a series of stages that can be motivated (Derisley 2002; Norcross 2011). By motivating people through the structured course of the IMR programme, the assumption is that people can succeed more easily in achieving their personal recovery goals. The stress-vulnerability model builds on the assumption that the course of SMI is determined by an interaction between biological vulnerability, stress and coping. The aim, therefore, of IMR is to empower people and teach them skills to interrupt the circle of stress and vulnerability that can lead to poor functioning and relapse (Das 2001; Goh 2010).

Why it is important to do this review

Research increasingly suggests that the IMR programme has beneficial effects on outcomes such as illness self-management, global functioning, knowledge of illness, distress related to symptoms and goal orientation (Mueser 2006; Hassan-Ohayon 2007; Lewitt 2009; Fujita 2010), and that it can be implemented in a routine mental health treatment setting with a high fidelity to the programme curriculum (Salyers 2009a; Salyers 2009b). A growing number of countries have ongoing studies on IMR, and the implementation process is well developed in some parts of the USA. Sweden and Denmark have also started the implementation process in parts of their community mental health services (Dalum 2011; Färdig 2011). Presently, however, there are no systematic reviews evaluating the benefits and harms of IMR for people with SMI.

Objectives

To investigate the benefits and harms of the curriculum-based intervention IMR for people with schizophrenia or schizophrenia-like psychoses (schizophreniform and schizoaffective disorders).

Methods

Criteria for considering studies for this review

Types of studies

We will include all relevant randomised clinical trials, irrespective of language, publication types (e.g., peer-reviewed publications, non-published studies, conference abstracts, posters), publication year or publication status. We will exclude quasirandomised studies, such as those allocating participants by alternate days of the week. If it is unclear whether and how a trial is randomised, we will contact the authors for further information.

Types of participants

We will include adults (aged 16 or older) diagnosed with schizophrenia or schizophrenia-like psychoses (schizophreniform and schizoaffective disorders). Participants must have been diagnosed according to standardised criteria, such as the International Classification of Diseases (ICD)-10 or the Diagnostic and Statistical Manual of Mental Disorders (DSM)-III, or DSM-IV; if these operational criteria have not been used, then participants should have been diagnosed by a psychiatrist.

If participants with other diagnoses are included in the trial, we will contact the authors for separate data on the diagnoses included in this review. If it is not possible to obtain separate data on these diagnoses, we will include the trial only if the majority of participants (more than 50%) in the trial meet our target diagnoses. We will not exclude comorbidity with other mental or somatic diagnoses or substance abuse.

Types of interventions

We will include trials where participants are randomised between IMR and other intervention group(s), irrespective of whether IMR is categorised as the experimental intervention group or the control intervention group by trialists. In all included trials, we will consider the experimental intervention group the group that receives IMR. Where people are given additional treatments or interventions within an IMR trial, we will include data only if the adjunctive treatment or intervention is evenly distributed between groups, and it is only the IMR programme that is randomised.

1. Experimental intervention

The experimental intervention is defined as IMR, the structured psychosocial intervention, whether group or individual, that follows the curriculum-based standardised programme of the IMR intervention, lasting between 8 and 11 months and organised into 9 to 11 topic areas, with a series of weekly sessions following a structured pattern based on the principles of personal recovery in the rehabilitation of people with SMI (Gingerich 2006; Mueser 2006). The revisions of the IMR manual have not altered the basic structure or the theoretical underpinning of the curriculum-based IMR programme, but the version of the manual used in the trial will be defined by reference to manual 2006, 2010 or 2011 (see 'Description of the intervention').

2. Control interventions

The control intervention group is defined as the group that receives one of the following:

  • no intervention (i.e., the participants in the experimental intervention group receive IMR in the setting of the trial; the participants in the control intervention group receive nothing comparable with this). This does not mean that the participants in the control (and experimental) intervention groups cannot receive other types of treatments and interventions; any such treatments, whether medical or other therapeutic procedures, simply have to be equal in both groups;

  • placebo (i.e., simulated or otherwise ineffectual treatment whether medically or through measures that are known to have no detectable effect);

  • standard care (i.e., the normal level of psychiatric care provided in the area where the trial is being carried out);

  • other rehabilitation strategies (i.e., social skills training, cognitive behaviour therapy and cognitive remediation, and other complex rehabilitation programmes such as Collaborative Recovery and Comprehensive Approach of Rehabilitation).

3. Main comparisons

We will conduct two main comparisons:

  • IMR versus 'no intervention', placebo or standard care;

  • IMR versus other rehabilitation strategies.

We will further explore if there are any differences between the different types of control interventions in subgroup analyses (see 'Subgroup analysis and investigation of heterogeneity').

Types of outcome measures

We will estimate all outcomes at three time points:

  • 'end of intervention' (often after 9 to 12 months); the trial’s choice of 'end of intervention' will be used. This is the most important outcome measure time point in this review;

  • at short follow up (up to, and including, two years after the ‘end of intervention’);

  • at long follow up (more than two years after the ‘end of intervention’).

The primary outcomes have to be measured with relevant and validated scales, as defined by each study. To assess secondary outcomes, we will use the scales and subscales as defined by trialists, but only when the scales used have been validated.

Primary outcomes
1. Global functioning

1.1 Clinically important change in global functioning (as defined by the trialists)
1.2 Mean endpoint score in global functioning (as defined by the trialists)

2. Global personal recovery

2.1 Clinically important change in global personal recovery (as defined by the trialists)
2.2 Mean endpoint score in global personal recovery (as defined by the trialists)

3. Suicides

3.1 Proportion of participants who commit suicide
3.2 Proportion of participants who attempt suicide

Secondary outcomes

1. Mental state

1.1 Clinically important change in global mental state (as defined by the trialists)
1.2 Mean endpoint score in global mental state (as defined by the trialists)
1.3 Clinically important change in specific aspects of mental state (as defined by the trialists)
1.4 Mean endpoint score in specific aspects of mental state (as defined by the trialists)

2. Specific aspects of personal recovery

2.1 Clinically important change in specific aspects of personal recovery (as defined by the trialists)
2.2 Mean endpoint score in specific aspects of personal recovery (as defined by the trialists)

3. Specific aspects of functioning

3.1 Clinically important change in specific aspects of functioning (as defined by the trialists)
3.2 Mean endpoint score in specific aspects of functioning (as defined by the trialists)

4. Adverse events and suicide ideation

4.1 Proportion of participants experiencing serious adverse events, defined as any event that results in death or significant disability/incapacity, is life-threatening or requires hospitalisation or prolongation of existing hospitalisation
4.2 Clinically important change in suicide ideation (as defined by the trialists)
4.3 Mean endpoint score in suicide ideation (as defined by the trialists)

5. Health-related quality of life

5.1 Clinically important change in health-related quality of life (as defined by the trialists)
5.2 Mean endpoint score in health-related quality of life (as defined by the trialists)

6. Substance abuse

6.1 Proportion of participants with drug or alcohol abuse (as defined by the trialists)
6.2 Mean endpoint score in drug or alcohol abuse (as defined by the trialists)

7. Costs of interventions

7.1 Mean costs of the interventions (as defined by the trialists)

'Summary of findings' table

We will use the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach to interpret findings (Schünemann 2008) and use the GRADE profiler (GRADE PRO) to import data from Review Manager 5.1 (RevMan 2008) to create two 'summary of findings' tables. We will create one 'summary of findings' table for each main comparison (see 'Main comparisons'). These tables will provide outcome-specific information concerning the overall quality of evidence from each included trial in the comparison, the magnitude of effect of the interventions examined and the sum of available data on all outcomes we rated as important to participant care and decision making.

For inclusion in the 'summary of findings' tables, we therefore aim to select the following outcomes, all assessed at ‘end of intervention’:

  1. clinically important change in global functioning (as defined by the trialists);

  2. clinically important change in global personal recovery (as defined by the trialists);

  3. clinically important change in global mental state (as defined by the trialists);

  4. clinically important change in health-related quality of life (as defined by the trialists);

  5. proportion of participants with drug or alcohol abuse (as defined by the trialists);

  6. suicides: proportion of participants who commit suicide (as defined by the trialists);

  7. suicides: proportion of participants who attempt suicide (as defined by the trialists).

Search methods for identification of studies

We will search the Cochrane Schizophrenia Group's register using the phrase: [(*imr* OR *illness management*) in title, abstract and index terms of REFERENCE OR (imr* OR *illness management*) in interventions of STUDY]

This register is compiled by systematic searches of major databases, handsearches and conference proceedings (for details of databases searched by the Cochrane Schizophrenia Group see http://onlinelibrary.wiley.com/o/cochrane/clabout/articles/SCHIZ/frame.html).

Searching other resources

1. Reference searching

We will inspect the references of all identified studies for further relevant studies.

2. Personal contact

We will contact the first author of each included study for information regarding unpublished trials.

Data collection and analysis

Selection of studies

Two authors (LK and HSD) will inspect all the abstracts of studies identified as above and the identified potentially relevant reports. In addition, to ensure reliability, JL or LFE will inspect a random sample of these abstracts, comprising 10% of the total. Where disagreement occurs, this will be resolved by discussion or, where there is still doubt, the full article will be acquired for further inspection. The full articles of relevant reports will be acquired for reassessment and will be carefully inspected for a final decision on inclusion. Once the full articles are obtained, LK and HSD will inspect all full reports and independently decide whether they meet the inclusion criteria. LK and HSD will not be blinded to the names of the authors, institutions or journal of publication. In the event of disagreement, the publication in question will be reinspected by a third author (JL or LFE) in order to ensure reliable selection. If it is impossible to decide, we will contact the authors of the study for clarification. We will list studies excluded from the second round and state the reason for each exclusion.

Data extraction and management

1. Extraction

Two authors (LK and HSD) will independently extract data from all included studies. Any disagreement will be discussed and the decisions documented; where problems remain, a third author (JL or LFE) will help clarify issues, and these final decisions will also be documented. Data presented only in graphs and figures will be extracted whenever possible, but will be included only if two reviewers independently come to the same result. Where possible, we will extract data relevant to each component centre of multicentre studies separately. Whenever necessary, we will contact the authors of studies via an open-ended request in order to obtain missing information or clarification.

2. Management
2.1 Forms

Data will be extracted into standard, simple forms.

2.2 Scale-derived data

We will include continuous data from rating scales only if:
a. the psychometric properties of the assessment instrument have been described in a peer-reviewed journal (Marshall 2000);
b. the assessment instrument has not been written or modified by one of the trialists from that particular trial.

2.3 Endpoint versus change data

We will primarily use endpoint data. If endpoint data are not available, we will contact the authors, and we will use change data only if endpoint data cannot be obtained. Trials for which change data only are available will be combined with trials for which endpoint data are available in the analysis, as we will use weighted mean differences (MDs) rather than standardised mean differences (SMDs) throughout (Higgins 2011).

2.4 Skewed data

Continuous data on clinical and social outcomes are often not normally distributed. To avoid the pitfall of applying parametric tests to data that are not normally distributed, we aim to apply the following standards to all data before inclusion: a) use of standard deviations (SDs) and means reported in the paper or obtained from the authors; b) when a scale starts from the finite number zero, the SD, when multiplied by two, should be less than the mean (as otherwise the mean is unlikely to be an appropriate measure of the centre of the distribution (Altman 1996)); c) if a scale starts from a positive value (such as the Positive and Negative Syndrome Scale (PANSS), which can have values from 30 to 210), the calculation described above will be modified to take the scale starting point into account. In these cases, skew will be considered to be present if 2 SD > (S - S min), where S is the mean score and S min is the minimum score. Follow-up scores on scales often have a finite start and endpoint, and these rules can be applied. When continuous data are presented on a scale that includes a possibility of negative values (such as change data), it is difficult to tell whether data are skewed or not. Skewed data from studies of fewer than 200 participants will be entered in additional tables rather than into an analysis. Skewed data pose less of a problem when looking at means if the sample size is large, and such data will be entered into syntheses.
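The skew rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `is_skewed` and `enter_into_synthesis` are hypothetical, and the strict inequality follows the formula 2 SD > (S - S min), so a value exactly on the boundary is treated here as not skewed.

```python
def is_skewed(mean: float, sd: float, scale_min: float = 0.0) -> bool:
    """Flag skew when 2 * SD exceeds the mean minus the scale minimum.

    For a scale starting at zero, scale_min stays 0 and the rule
    reduces to 2 * SD > mean.
    """
    return 2 * sd > (mean - scale_min)


def enter_into_synthesis(mean: float, sd: float, n_participants: int,
                         scale_min: float = 0.0) -> bool:
    """Return True if the data would go into an analysis.

    Skewed data from studies of fewer than 200 participants are kept
    out of the analysis (they go into additional tables instead).
    """
    if not is_skewed(mean, sd, scale_min):
        return True
    return n_participants >= 200


# Illustrative values only (not taken from any trial):
# PANSS has a minimum score of 30.
print(is_skewed(mean=60, sd=20, scale_min=30))   # 40 > 30
print(is_skewed(mean=75, sd=20, scale_min=30))   # 40 > 45 is false
```

The rule does not apply to change data, which can take negative values, so a caller would need to skip this check for such data.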

2.5 Common measure

To facilitate comparison between trials, we intend to convert variables that can be reported in different metrics, such as days in hospital (mean days per year, per week or per month), to a common metric (e.g. mean days per month).
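As an illustration of such a conversion, the following Python sketch rescales mean days in hospital to mean days per month. The function name `to_days_per_month`, the unit labels and the conversion constants (12 months and 52 weeks per year) are assumptions made for this example; the protocol does not specify them.

```python
MONTHS_PER_YEAR = 12   # assumption for this sketch
WEEKS_PER_YEAR = 52    # assumption for this sketch


def to_days_per_month(mean_days: float, unit: str) -> float:
    """Convert mean days in hospital to mean days per month.

    unit is one of the hypothetical labels "per_month", "per_year"
    or "per_week".
    """
    if unit == "per_month":
        return mean_days
    if unit == "per_year":
        return mean_days / MONTHS_PER_YEAR
    if unit == "per_week":
        return mean_days * WEEKS_PER_YEAR / MONTHS_PER_YEAR
    raise ValueError(f"unknown unit: {unit}")


# Illustrative values only: 24 days per year is 2 days per month.
print(to_days_per_month(24, "per_year"))
```

When means are rescaled in this way, the corresponding SDs would need to be rescaled by the same factor.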

2.7 Direction of graphs

Where possible, we will enter data in such a way that the area to the left of the line of no effect indicates a favourable outcome for IMR. Where keeping to this convention makes it impossible to avoid outcome titles with clumsy double-negatives (e.g. 'not improved'), we will report data where the left of the line indicates an unfavourable outcome. This will be noted in the relevant graphs.

Assessment of risk of bias in included studies

Two authors (LK and HSD) will work independently to assess the risk of bias in included studies using criteria described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). According to empirical evidence (Schulz 1995; Moher 1998; Kjaergard 2001; Wood 2008), we will assess the risk of bias according to sequence generation, allocation concealment, blinded outcome assessment, incomplete outcome data, selective outcome reporting and vested interest bias.

Sequence generation

  • Low risk of bias: if the sequence generation was achieved using computer random number generation or a random number table. Drawing lots, tossing a coin, shuffling cards and throwing dice are adequate if performed by an independent adjudicator.

  • Unclear bias: the trial is described as randomised, but its method of randomisation is not specified.

  • High risk of bias: the method of sequence generation is not, or may not be, random. Quasirandomised studies, and those using dates, names or admittance numbers in order to allocate participants, are considered inadequate.

Allocation concealment

  • Low risk of bias: if the allocation is controlled by a central and independent randomisation unit, opaque and sealed envelopes or similar, so that intervention allocations could not have been foreseen in advance of, or during, enrolment.

  • Unclear bias: the trial was described as randomised but the method used to conceal the allocation was not described, so that intervention allocations may have been foreseen in advance of, or during, enrolment.

  • High risk of bias: the allocation sequence is known to the investigators who allocated participants or if the trial is quasirandomised.

Blinding of participants and personnel

Due to the nature of the IMR intervention, neither participants nor staff can be blinded to allocation. We will therefore assess only the blinding of outcome assessment, focusing on whether the blinding of raters and researchers is described (see below).

Blinded outcome assessment

  • Low risk of bias: if the outcome assessment was blinded for all relevant outcomes, and the method of blinding was described in detail, so that knowledge of allocation was adequately prevented.

  • Unclear bias: the outcome assessment is described as blinded, but the method is not further described, so that knowledge of allocation was possible.

  • High risk of bias: outcome assessment is not blinded, so that the allocation was known to outcome assessors.

Incomplete outcome data

  • Low risk of bias: if it was specified that there were no dropouts or withdrawals, or if the numbers and reason for dropouts and withdrawals in both intervention and control groups are well described and judged to be equal, and sufficient methods, such as multiple imputation, have been used to handle missing data.

  • Unclear bias: the trial gives the impression that there are no dropouts or withdrawals without specifically stating it, or the number or reasons for dropouts and withdrawals are not described.

  • High risk of bias: if the pattern and reasons for dropout can be described as being different in the two intervention groups.

Selective outcome reporting

  • Low risk of bias: all predefined or clinically relevant and reasonably expected outcomes are reported.

  • Unclear bias: not all predefined or clinically relevant and reasonably expected outcomes are reported or are not reported fully, and it is unclear whether data on these outcomes were recorded or not.

  • High risk of bias: one or more clinically and reasonably expected outcomes is not reported, and data on these outcomes were likely to have been recorded.

Vested interest bias

Vested interest bias can be academic or financial. We will assess the risk of both.

  • Low risk of bias: the trial's sources of funding did not come from any parties that might have conflicting interests, be they of academic, professional, financial or other benefit to the persons responsible for the trial, and the persons responsible are independent of the direction or statistical significance of the trial results.

  • Unclear bias: the sources of funding are not clear, or it is unclear if the persons responsible for the trial benefit according to the direction or statistical significance of the trial results.

  • High risk of bias: the trial's sources of funding have a conflict of interest, or the academic, professional, financial or other benefits to the persons responsible for the trial are dependent on the direction or statistical significance of the trial results.

A trial will be classified as at ‘overall low risk of bias’ only if all of the bias components described in the above paragraphs are classified as ‘low risk of bias’. If one or more of the bias components are classified as ‘unclear’ or ‘high risk’ of bias, the trial will be classified as at ‘overall high risk of bias’. The overall risk of bias will be noted in both the text of the review and in the 'Summary of findings' table.
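The overall classification rule can be expressed as a short Python sketch. The domain keys and the function name `overall_risk_of_bias` are hypothetical labels chosen for this example; only the decision rule itself (all domains low, otherwise high) comes from the text above.

```python
from typing import Mapping

# The six bias components assessed in this protocol (labels are assumptions).
DOMAINS = (
    "sequence_generation",
    "allocation_concealment",
    "blinded_outcome_assessment",
    "incomplete_outcome_data",
    "selective_outcome_reporting",
    "vested_interest",
)


def overall_risk_of_bias(ratings: Mapping[str, str]) -> str:
    """Return "low" only if every domain is rated "low", else "high".

    Each rating is expected to be "low", "unclear" or "high".
    """
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    if all(ratings[d] == "low" for d in DOMAINS):
        return "low"
    return "high"


# Illustrative ratings only: one unclear domain makes the trial high risk.
example = {d: "low" for d in DOMAINS}
example["allocation_concealment"] = "unclear"
print(overall_risk_of_bias(example))
```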

If LK and HSD disagree on the assessment of any of these bias components, the final rating will be made through discussion between all four raters (LK, HSD, JL and LFE). Where inadequate details of randomisation and other characteristics of trials are provided, we will contact the authors of the trials in order to obtain further information. We will report non-concurrence in quality assessment, but if disputes arise as to which category a trial is to be allocated, again, resolution will be made by discussion.

Measures of treatment effect

1. Binary data

For binary outcomes we will calculate a standard estimation of the risk ratio (RR) and its 95% confidence interval (CI). It has been shown that RRs are more intuitive (Boissel 1999) than odds ratios, and that odds ratios tend to be interpreted as RRs by clinicians (Deeks 2000). The number needed to treat for an additional beneficial outcome/harmful outcome (NNTB/H) statistic with its CIs is intuitively attractive to clinicians, but is problematic both in its accurate calculation in meta-analyses and in its interpretation (Hutton 2009). For binary data presented in the 'Summary of findings' table/s, where possible, we will calculate illustrative comparative risks.
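As an illustration of the binary effect measure described above, the following Python sketch computes an RR and its 95% CI using the conventional log-scale normal approximation. The function name `risk_ratio` and the example counts are hypothetical and are not taken from any trial; trials with zero events would need additional handling (for example a continuity correction), which is not shown.

```python
import math

def risk_ratio(events_exp, n_exp, events_ctl, n_ctl, z=1.96):
    """Return (RR, lower, upper) for a single two-by-two table.

    The CI is built on the log scale, where the standard error of log(RR)
    is sqrt(1/a - 1/n1 + 1/c - 1/n2). Assumes no cell has zero events.
    """
    rr = (events_exp / n_exp) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(
        1 / events_exp - 1 / n_exp + 1 / events_ctl - 1 / n_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical example: 15/100 events with the intervention, 30/100 with control
rr, lower, upper = risk_ratio(15, 100, 30, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```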

2. Continuous data

For continuous outcomes we will estimate MDs between groups. We prefer not to calculate effect size measures (SMDs). However, if scales of very considerable similarity are used, we will presume there is a small difference in measurement, and we will calculate effect size and transform the effect back to the units of one or more of the specific instruments.

Unit of analysis issues

1. Cluster trials

Studies increasingly employ 'cluster randomisation' (such as randomisation by clinician or practice), but the analysis and pooling of clustered data pose problems. Authors often fail to account for the intraclass correlation coefficient (ICC) in clustered studies, leading to a 'unit of analysis' error (Divine 1992), whereby P values are spuriously low, CIs unduly narrow and statistical significance overestimated. This causes type I errors (Bland 1997; Gulliford 1999).

For cluster-randomised trials, we will calculate the ‘design effect’. This is calculated using the mean number of participants per cluster (M) and the ICC [design effect = 1 + (M - 1) * ICC] (Donner 2002). Where clustering is not accounted for in primary studies, we will seek to contact the first authors of studies to obtain ICCs for their clustered data and to adjust for this by using accepted methods (Gulliford 1999). If the ICC is not reported it will be assumed to be 0.1 (Ukoumunne 1999).

We will then calculate the 'effective sample size'. For dichotomous outcomes, both the number of participants and the number experiencing the event will be divided by the design effect. For continuous outcomes, only the sample size needs to be divided by the design effect (Higgins 2011). After this, the numbers can be entered into RevMan.
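The cluster adjustment described above can be sketched in a few lines of Python. The function names and the example numbers are hypothetical; the default ICC of 0.1 is the value the protocol assumes when no ICC is reported.

```python
def design_effect(mean_cluster_size, icc=0.1):
    """Design effect = 1 + (M - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc

def effective_binary(events, n, deff):
    """Dichotomous outcomes: divide both events and sample size."""
    return events / deff, n / deff

def effective_continuous(n, deff):
    """Continuous outcomes: divide the sample size only (mean and SD unchanged)."""
    return n / deff

# Hypothetical trial arm: 120 participants, 30 events, mean cluster size 21
deff = design_effect(21, icc=0.1)
events_eff, n_eff = effective_binary(30, 120, deff)
print(deff, events_eff, n_eff)
```

The effective numbers are usually not whole numbers; whether and how to round them before entry is a choice this sketch leaves open.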

2. Cross-over trials

A major concern of cross-over trials is the carry-over effect. This occurs if an effect (e.g., pharmacological, physiological or psychological) of the treatment in the first phase is carried over to the second phase. As a consequence, on entry to the second phase the participants can differ systematically from their initial state despite a wash-out phase. For the same reason cross-over trials are not appropriate if the condition of interest is unstable (Elbourne 2002). As both effects are very likely in SMI, we will use data only from the first phase of cross-over studies.

3. Studies with multiple intervention groups

Where a study involves more than two intervention arms, if relevant, the additional intervention arms will be presented in comparisons. If data are binary, these will be simply added and combined within the two-by-two table. If data are continuous, we will combine data following the formula in Section 7.7.3.8 (Combining groups) of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). Where the additional treatment arms are not relevant, these data will not be reproduced.
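A minimal Python sketch of combining two arms into one group follows. For continuous data it uses the standard formula for pooling two groups' sample sizes, means and SDs, which we understand to be the one given in the Handbook section cited above; the function names and example values are hypothetical.

```python
import math

def combine_binary(events1, n1, events2, n2):
    """Binary data: events and totals are simply added."""
    return events1 + events2, n1 + n2

def combine_continuous(n1, mean1, sd1, n2, mean2, sd2):
    """Combine two groups into one (N, mean, SD).

    The between-group term (n1*n2/N) * (mean1 - mean2)**2 accounts for
    the difference between the two group means.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    variance = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (mean1 - mean2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)

# Hypothetical example: two active arms merged into one intervention group
print(combine_binary(12, 50, 9, 48))
print(combine_continuous(50, 24.0, 6.0, 48, 26.5, 5.5))
```

For more than two arms the function can be applied repeatedly, combining the result with each further arm in turn.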

Dealing with missing data

1. Overall loss of credibility

If a large proportion of outcome data is missing, the data lose credibility (Xia 2009). For any particular outcome, should more than 50% of data be unaccounted for, we will not reproduce these data or use them within analyses.

2. Dichotomous data

For dichotomous outcomes, we will analyse data according to intention-to-treat principles, whereby all participants randomised in each trial are included in the analyses. Participants with missing data will initially be considered as having the same proportion of the outcome as the participants with complete data. We will conduct sensitivity analyses to test the robustness of this assumption (see 'Sensitivity analysis').

3. Continuous data
3.1 Attrition

For continuous outcomes, we will import means and SDs from the maximum number of participants with data available. All trials will be included in the analyses, regardless of the proportion of missing data; however, trials with incomplete outcome data will be assessed as at ‘high risk of bias’ (see 'Assessment of risk of bias in included studies'). 

3.2 SDs

If SDs are not reported, we will first try to obtain the missing values from the authors. If these cannot be obtained, but an exact standard error (SE) and CIs are available for group means, and either a P value or a t value is available for differences in means, we will calculate the SDs according to the rules described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). When only the SE is reported, SDs can be calculated using the formula SD = SE * square root (n), where n is the number of participants in the group. Sections 7.7.3 and 16.1.3 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) present detailed formulae for estimating SDs from P values, t or F values, CIs, ranges or other statistics. If these formulae do not apply, we will calculate the SDs according to a validated imputation method that is based on the SDs of the other included studies (Furukawa 2006). Although some of these imputation strategies can introduce error, the alternative would be to exclude a given study’s outcome and thus to lose information. We will nevertheless examine the validity of the imputations in a sensitivity analysis excluding imputed values.
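The two simplest conversions mentioned above can be sketched as follows. The SE conversion is the formula stated in the text; the CI conversion assumes a 95% CI around a single group mean and a large sample, so that the normal value 1.96 applies (small samples would call for a t-distribution value instead). Function names and numbers are hypothetical.

```python
import math

def sd_from_se(se, n):
    """SD = SE * sqrt(n) for a single group mean."""
    return se * math.sqrt(n)

def sd_from_ci(lower, upper, n, z=1.96):
    """Recover the SD from a CI around a single group mean.

    Assumes a symmetric CI of the form mean +/- z * SE (z = 1.96 for a
    95% CI with a large sample).
    """
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)

# Hypothetical group of 100 participants
print(sd_from_se(0.5, 100))
print(sd_from_ci(9.02, 10.98, 100))
```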

Assessment of heterogeneity

1. Clinical heterogeneity

We will consider all included studies initially, without seeing comparison data, to judge clinical heterogeneity. We will simply inspect all studies for clearly outlying people or situations that we had not predicted would arise. When such situations or participant groups arise, these will be fully discussed.

2. Methodological heterogeneity

We will consider all included studies initially, without seeing comparison data, to judge methodological heterogeneity. We will simply inspect all studies for clearly outlying methods that we had not predicted would arise. When such methodological outliers arise, these will be fully discussed.

3. Statistical heterogeneity
3.1 Visual inspection

We will visually inspect graphs to investigate the possibility of statistical heterogeneity.

3.2 Employing the I² statistic

Heterogeneity between studies will be investigated by considering the I² statistic alongside the Chi² P value. The I² statistic provides an estimate of the percentage of the variability in effect estimates that is due to heterogeneity rather than chance (Higgins 2011). The importance of the observed value of the I² statistic depends on: i. magnitude and direction of effects; and ii. strength of evidence for heterogeneity (e.g., the P value from the Chi² test or a CI for the I² statistic). An I² estimate greater than or equal to around 50%, accompanied by a statistically significant Chi² statistic, will be interpreted as evidence of substantial levels of heterogeneity (see Section 9.5.2 in Higgins 2011). When substantial levels of heterogeneity are found in the primary outcome, we will explore reasons for this heterogeneity.
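For illustration, the statistic can be computed from Cochran's Q (the Chi² heterogeneity statistic) and its degrees of freedom as I² = 100% × (Q − df)/Q, truncated at zero. The sketch below uses inverse-variance weights; the function names and inputs are hypothetical, and the Chi² P value is left out because it needs a chi-squared distribution function that the Python standard library does not provide.

```python
def cochran_q(effects, ses):
    """Return (Q, df) using inverse-variance weights.

    effects: per-trial estimates (e.g. MDs or log RRs)
    ses:     their standard errors
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1

def i_squared(q, df):
    """I-squared as a percentage, truncated at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical log RRs and standard errors from four trials
q, df = cochran_q([-0.40, -0.10, -0.65, 0.05], [0.20, 0.15, 0.25, 0.18])
print(round(q, 2), df, round(i_squared(q, df), 1))
```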

Assessment of reporting biases

Different types of reporting biases (e.g., publication bias, time lag bias, outcome reporting bias) will be handled following the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). For all types of outcomes, we will test for funnel plot asymmetry when at least 10 trials are included in the meta-analysis (Higgins 2011). For continuous outcomes with intervention effects measured as MDs, the test proposed by Egger 1997 will be used to test for funnel plot asymmetry. However, asymmetric funnel plots are not necessarily caused by publication bias, and publication bias does not necessarily cause asymmetry in a funnel plot (Egger 1997). In other cases, where funnel plots are possible, we will seek statistical advice in their interpretation.
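A rough sketch of the regression behind the Egger test follows: each trial's standardised effect (effect divided by its SE) is regressed on its precision (1/SE), and an intercept far from zero suggests funnel plot asymmetry. The function name and the data are hypothetical, and the sketch stops at the intercept and its standard error; a P value would additionally require a t distribution with n − 2 degrees of freedom.

```python
import math

def egger_intercept(effects, ses):
    """Ordinary least squares of effect/SE on 1/SE.

    Returns (intercept, SE of intercept, degrees of freedom).
    Requires at least three trials with differing standard errors.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three trials")
    x = [1.0 / se for se in ses]                  # precision
    y = [e / se for e, se in zip(effects, ses)]   # standardised effect
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = sse / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, se_intercept, n - 2

# Hypothetical MDs; a real application would need at least 10 trials
effects = [0.30, 0.45, 0.20, 0.60, 0.35]
ses = [0.10, 0.20, 0.08, 0.30, 0.15]
print(egger_intercept(effects, ses))
```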

Data synthesis

1. Statistical methods

We will use the statistical software RevMan 5 (RevMan 2008) to analyse data. We will calculate RRs with 95% CIs for dichotomous variables and MDs with 95% CIs for continuous variables, using both the random-effects model (DerSimonian 1986) and the fixed-effect model (Demets 1987). Statistical significance will be evaluated at the level of P value ≤ 0.05. In case of discrepancies between the results of the two models, we will report the results of both models; otherwise we will report the results from only the random-effects model. We will explore the presence of statistical heterogeneity using the Chi² test with significance at a P value ≤ 0.1, and measure the quantity of heterogeneity using the I² statistic (Higgins 2011). Where data are available from only one trial, we will use Fisher's exact test (Fisher 1922) for dichotomous data and Student's t-test (Student 1908) for continuous data.
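To make the difference between the two models concrete, the sketch below pools trial estimates with generic inverse-variance weights, once with no between-trial variance (fixed-effect) and once with the DerSimonian-Laird estimate of the between-trial variance tau² (random-effects). This is only an illustration under those assumptions and is not a reproduction of RevMan's calculations, which for dichotomous data may use other weighting methods (such as Mantel-Haenszel). Function names and data are hypothetical; RRs would be pooled on the log scale.

```python
import math

def inverse_variance_pool(effects, ses, tau2=0.0):
    """Pooled estimate and its SE; tau2 = 0 gives the fixed-effect result."""
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

def dersimonian_laird_tau2(effects, ses):
    """Method-of-moments estimate of the between-trial variance."""
    df = len(effects) - 1
    if df < 1:
        return 0.0  # a single trial carries no information on tau2
    weights = [1.0 / se ** 2 for se in ses]
    fixed, _ = inverse_variance_pool(effects, ses)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - df) / c)

def ci95(estimate, se):
    """Normal-approximation 95% CI."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical MDs and SEs from three trials
effects = [-2.0, -0.5, -3.5]
ses = [0.8, 0.6, 1.1]
fixed_est, fixed_se = inverse_variance_pool(effects, ses)
tau2 = dersimonian_laird_tau2(effects, ses)
random_est, random_se = inverse_variance_pool(effects, ses, tau2)
print("fixed :", round(fixed_est, 2), ci95(fixed_est, fixed_se))
print("random:", round(random_est, 2), ci95(random_est, random_se))
```

When tau² is estimated as zero the two models coincide; otherwise the random-effects CI is wider and the weights are more evenly spread across trials.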

2. Trial sequential analysis

Trial sequential analysis (TSA) is a tool for quantifying the statistical reliability of data in cumulative meta-analysis, adjusting P values for repetitive testing on accumulating data (Brok 2008; Thorlund 2008; Wetterslev 2008). TSA is a methodology that combines an information size calculation (cumulated sample sizes of included trials) with the threshold of statistical significance. A more detailed description of TSA can be found at http://www.ctu.dk/tsa/.

We will perform TSA on the data from the trials with low risk of bias (Brok 2008; Wetterslev 2008). The outcome measures analysed using TSA will be the primary outcomes. For binary outcomes, we will estimate the required information size based on the proportion of participants with an outcome in the control group, a relative risk reduction of 20% (or the reduction suggested by the trials with low risk of bias), a type I error (α) of 5%, a type II error (β) of 20% and diversity of 30% and 60%. For continuous outcomes, we will estimate the required information size based on the SD observed in the control group of trials with low risk of bias and a minimal relevant difference of 50% of this SD, a type I error (α) of 5%, a type II error (β) of 20% and diversity of 30% and 60%.
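As a rough illustration of how these inputs translate into a required information size, the sketch below uses the conventional two-group sample size formula, N = 4 × (z for α/2 + z for β)² × variance / difference², inflated by 1/(1 − diversity). Several points are assumptions rather than statements from this protocol: the 20% is read as a relative risk reduction, the binary variance is taken at the average of the two group proportions, and the result need not match the output of the TSA software. All names and example values are hypothetical.

```python
from statistics import NormalDist

def _z_sum(alpha, beta):
    """z for a two-sided type I error plus z for the type II error."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(1 - beta)

def _adjust(n, diversity):
    """Inflate for heterogeneity: divide by (1 - diversity)."""
    if not 0 <= diversity < 1:
        raise ValueError("diversity must be in [0, 1)")
    return n / (1 - diversity)

def info_size_binary(p_control, rrr=0.20, alpha=0.05, beta=0.20, diversity=0.0):
    """Total participants needed to detect the given relative risk reduction."""
    p_exp = p_control * (1 - rrr)
    delta = p_control - p_exp
    p_avg = (p_control + p_exp) / 2
    n = 4 * _z_sum(alpha, beta) ** 2 * p_avg * (1 - p_avg) / delta ** 2
    return _adjust(n, diversity)

def info_size_continuous(sd, min_diff, alpha=0.05, beta=0.20, diversity=0.0):
    """Total participants needed to detect a mean difference of min_diff."""
    n = 4 * _z_sum(alpha, beta) ** 2 * sd ** 2 / min_diff ** 2
    return _adjust(n, diversity)

# Hypothetical inputs: 40% control-group risk; minimal difference of 0.5 SD
for d in (0.30, 0.60):
    print(d, round(info_size_binary(0.40, diversity=d)),
          round(info_size_continuous(sd=1.0, min_diff=0.5, diversity=d)))
```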

Subgroup analysis and investigation of heterogeneity

We will perform the following two subgroup analyses on data for the primary outcome measures:

  • on bias risk: trials with an overall low risk of bias compared with trials with an overall high risk of bias;

  • on control interventions: 1) trials with a control group that received no intervention compared with 2) trials with a control group that received placebo compared with 3) trials with a control group that received standard care.

We will also perform subgroup analysis on those outcome measures that show significant heterogeneity (i.e., P value ≤ 0.1) in the main meta-analysis.

We will use the test for interaction (Altman 2003) to assess whether the intervention effects in the subgroup analyses differ statistically significantly from each other. We will consider P values ≤ 0.05 to indicate a statistically significant interaction, or difference, of the intervention effects between the subgroups.
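A minimal sketch of a test for interaction between two subgroups is given below: the difference between the two subgroup estimates is divided by the standard error of that difference and referred to the standard normal distribution. Ratio measures such as the RR are compared on the log scale. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def interaction_test(est1, se1, est2, se2):
    """Compare two independent subgroup estimates.

    Returns (difference, z, two-sided P value). For RRs, pass log(RR)
    and the SE of log(RR) for each subgroup.
    """
    diff = est1 - est2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p

# Hypothetical subgroups: RR 0.50 (low risk of bias) vs RR 1.00 (high risk of bias)
diff, z, p = interaction_test(math.log(0.5), 0.2, math.log(1.0), 0.2)
print(f"ratio of RRs = {math.exp(diff):.2f}, z = {z:.2f}, P = {p:.3f}")
```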

Sensitivity analysis

1. Assumptions for lost binary data 

Where assumptions have to be made regarding people with missing data for outcomes (see 'Dealing with missing data'), we will conduct two sensitivity analyses.

  • ‘Best-worst-case’ scenario: it will be assumed that all participants lost to follow-up in the experimental group have a positive outcome (e.g., not committed suicide); and all those with missing outcomes in the control group have a negative outcome (e.g., committed suicide).

  • ‘Worst-best-case’ scenario: it will be assumed that all participants lost to follow-up in the experimental group have a negative outcome (e.g., committed suicide); and all those with missing outcomes in the control group have a positive outcome (e.g., not committed suicide).
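The two scenarios, together with the primary assumption that participants with missing data have the same event proportion as those with complete data, can be sketched as follows. Here an 'event' means the negative outcome; the function name, the assumption labels and the counts are hypothetical.

```python
def impute_events(events, observed, randomised, assumption):
    """Return the event count among all randomised participants.

    assumption: 'same_proportion' (primary analysis),
                'no_events'  (all missing have the positive outcome), or
                'all_events' (all missing have the negative outcome).
    """
    missing = randomised - observed
    if assumption == "same_proportion":
        return events + missing * events / observed
    if assumption == "no_events":
        return events
    if assumption == "all_events":
        return events + missing
    raise ValueError(f"unknown assumption: {assumption}")

# Hypothetical trial: 100 randomised per arm
exp = dict(events=10, observed=80, randomised=100)
ctl = dict(events=20, observed=90, randomised=100)

scenarios = {
    "primary": ("same_proportion", "same_proportion"),
    "best-worst": ("no_events", "all_events"),
    "worst-best": ("all_events", "no_events"),
}
for name, (exp_rule, ctl_rule) in scenarios.items():
    e = impute_events(assumption=exp_rule, **exp)
    c = impute_events(assumption=ctl_rule, **ctl)
    rr = (e / exp["randomised"]) / (c / ctl["randomised"])
    print(f"{name}: RR = {rr:.2f}")
```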

2. Assumptions for lost continuous data

Where assumptions have to be made regarding missing SDs (see 'Dealing with missing data'), sensitivity analysis will be undertaken testing how prone the results are to change when 'completer' data only are compared with the imputed data using the above assumption. If there is a substantial difference, we will report results and discuss them but continue to employ our assumption.

3. Imputed values

We will also undertake a sensitivity analysis to assess the effects of including data where we used imputed values for ICCs in calculating the design effect in cluster randomised trials.

If substantial differences are noted in the direction or precision of effect estimates in any of the sensitivity analyses listed above, we will not pool data from the excluded trials with those of the other trials contributing to the outcome, but will present them separately.

4. Fixed and random effects

All data will be synthesised using a random-effects model; however, we will also synthesise data for the primary outcome using a fixed-effect model to evaluate whether the greater weights assigned to larger trials with greater event rates alter the significance of the results compared with the more evenly distributed weights in the random-effects model.

Acknowledgements

The Cochrane Schizophrenia Group Editorial Base in Nottingham produces and maintains standard text for use in the Methods section of their reviews. We have used this text as the basis of this protocol but adapted the standard text in accordance with the objective of the review and the intervention investigated.

We would like to thank Michael Dixon for peer reviewing this protocol.

Contributions of authors

All authors participated in the conception and design of the protocol. All authors have read, revised critically and approved the final protocol.

Declarations of interest

The authors are currently conducting an investigator-initiated randomised trial of IMR in Denmark (Dalum 2011).

Sources of support

Internal sources

  • The Mental Health Services in the Capital Region (Psychiatric Centre Ballerup and The Mental Health Centre Copenhagen), Denmark.

  • Copenhagen Trial Unit, Copenhagen University Hospital, Denmark.

External sources

  • None, Other.
