Physical exercise for oppositional defiant disorder and conduct disorder in children and adolescents




This is the protocol for a review and there is no abstract. The objectives are as follows:

To determine the effects of physical exercise interventions on behavioural and cognitive symptoms, and on family functioning, in children and adolescents diagnosed with ODD or CD.


Description of the condition

Oppositional defiant disorder (ODD) and conduct disorder (CD) are the two most common juvenile disorders seen in mental health and community clinics. Both involve conduct problem behaviours that are of great concern because of the high degree of distress they cause for communities, families, and the children and youths themselves (Kazdin 1995; Frick 1998; Meltzer 2000; Essau 2011).

In both the International Classification of Diseases (ICD-10, F91.0, F91.3) (WHO 2010) and the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition-Text Revision (DSM-IV-TR) (APA 2000), ODD and CD are defined as two separate conditions. DSM-IV-TR replaced previous distinctions between socialised and non-socialised aggression with subtypes based on whether the onset of symptoms occurred before or after 10 years of age. In ICD-10, ODD is also classified as a "type of conduct disorder, usually occurring in younger children..." (WHO 2010). The essential feature of ODD is a recurrent pattern of negativistic, defiant, disobedient and hostile behaviour towards authority figures that persists for at least six months. CD is defined as a repetitive and persistent pattern of behaviour in which the basic rights of others and major age-appropriate societal norms or rules are violated (APA 2000). CD might be seen as a further development from ODD, located along the same continuum (Loeber 2000), and thus as a more severe condition (Hazell 2010). ODD is also a strong predictor for the onset and course of CD over time (Essau 2011). The line that is conceptualised as separating ODD from CD is, primarily, one of 'safety'. While ODD is characterised by "markedly defiant, disobedient, provocative behaviour and by the absence of more severe dissocial or aggressive acts that violate the law or the rights of others" (WHO 2010), CD raises considerably more concern for the safety of the child or adolescent, of those around him or her, and of their possessions (behaviours such as fire-setting and vandalism are common in CD).

Prevalence estimates for ODD and CD among children and adolescents in community settings have been reported to range from 1% to 16% and from 1% to 9%, respectively (Hamilton 2008; Canino 2010). The variability in prevalence seems to be related mostly to methodological differences across studies (Canino 2010). ODD and CD are reported to be more common in boys than in girls; however, data are inconsistent (Hamilton 2008). The problem behaviour tends to persist: stability rates of CD are reported to range from approximately 50% to nearly 90% (Essau 2011). In a cohort of New Zealand children, only 14% showed remission of behavioural problems within a two-year period (Fergusson 1995). In addition, children who display early-onset CD are at greater risk of persistent difficulties. Children with an onset of CD before 10 years of age are about nine times more likely to show aggressive symptoms than youths who are diagnosed with CD at a later age (Lahey 1998).

Both children and adolescents with ODD and those with CD are of great concern because of their social and educational impairments and poor prognosis. These children and adolescents have great difficulty following rules and behaving in a socially acceptable way. Typically, the child's negative behaviour causes a negative reaction from others, which makes the problem worse. The association with school failure and negative family and social experiences is strong (Biederman 1991; APA 2000). Affected children are more likely to have troubled peer relationships and academic difficulties (Lahey 1998). In addition, childhood conduct problems contribute substantially to the development of criminality in adulthood (Robins 1993), with concomitant social costs (Scott 2001). 

Both ODD and CD tend to co-occur with a number of emotional, neurodevelopmental and other disorders of childhood, particularly attention deficit hyperactivity disorder (ADHD) (Boylan 2007). In one Norwegian study, 45% of children with ODD also had a diagnosis of ADHD (Mørk 2004). Other comorbid conditions included depression (about 30%), anxiety disorders (30%) and learning disabilities (30% to 40%) (Angold 1999).

Risk factors that may predict behavioural problem disorders include impulsiveness, low IQ and low school achievement, poor parental supervision, punitive or erratic parental discipline, cold parental attitude, parental conflict, disrupted families, antisocial parents, large family size, low family income, antisocial peers, high delinquency rate schools and high crime neighbourhoods (Murray 2010). It is widely accepted that the best predictors of future recurrent delinquency are early conduct problems (Sheldrick 1995; AACAP 1997). The cost of CD is high, both in terms of the quality of life of those who have the disorder (and of the people they affect) and in terms of the resources necessary to counteract it. Early interventions for CD are, therefore, critical and it is important that treatments are both effective and cost-effective.

Treatment programmes that have been shown to be effective for children and adolescents with ODD and CD involve various behavioural management strategies, parenting interventions, programmes using cognitive-behavioural approaches to build social and problem-solving skills, multisystemic therapy and medication (Hamilton 2008; Kimonis 2010). Physical activity has positive mental health outcomes in young people (Biddle 2011), and may have beneficial effects in the treatment of behavioural disorders (Tantillo 2002; Larun 2009).

Description of the intervention

There is strong evidence that regular physical activity has a positive effect on both physical health (such as reducing the risk for coronary heart disease, diabetes, certain cancers, obesity, hypertension and all-cause mortality) and mental health outcomes (such as self esteem and cognitive function) (Blair 1992; Bouchard 1994; Pate 1995; Erikssen 1998; Ekeland 2004; Biddle 2011).

Physical activity comprises bodily movements produced by the skeletal musculature resulting in an increase in energy expenditure above the resting level. Physical exercise is physical activity and bodily movements that are planned, structured and repeated regularly to improve or maintain physical fitness and overall health. Exercise interventions are normally characterised in terms of: the type of activity (exercises such as walking, running, aerobics or muscle strengthening), the intensity of the activity (low, moderate or high), the frequency (how often the activity is performed) and the duration (the time spent performing the activity) (Bouchard 1994).

How the intervention might work

There has been an increasing body of research on psychological health outcomes of physical activity, and the results indicate positive effects for adults (Bouchard 1994), and a similar positive effect of physical activity on depression, anxiety and behavioural problems in children and adolescents (Biddle 1993; Calfas 1994; Mutrie 1998; Gapin 2011). People who are physically active seem less likely to suffer from mental health problems and may have enhanced cognitive functioning (Biddle 2011).

As the essential feature of ODD and CD is negativistic behaviour and violation of norms or rules, putative mechanisms that might explain the effect of exercise are related to prevention and mitigation of these symptoms. In children and adolescents with conduct problems, exercise interventions have demonstrated increased social competency, better academic achievement and cognitive function, improved self esteem, and reduced depression and anxiety (Ekeland 2004; Larun 2009; Biddle 2011; Kang 2011). In addition, regular physical exercise has shown effects on neuroendocrine factors, suggesting a relationship between gonadal androgens and ODD, CD and physical aggression (see Essau 2011). Regular exercise programmes also lead to increased levels of brain-derived neurotrophic factor, an essential element in normal brain development that promotes health-associated behaviours (Archer 2012).

Why it is important to do this review

There are several benefits associated with the use of exercise. Physical activity is believed to be inherently good for young people, and it is inexpensive and has few negative side effects.

Given the substantial costs of behavioural disorders diagnosed in childhood and adolescence to both the individual and society, and the need for a broad treatment strategy, a review of the effectiveness of physical exercise interventions in prevention and treatment of disorders involving conduct problems is needed.


To determine the effects of physical exercise interventions on behavioural and cognitive symptoms, and on family functioning, in children and adolescents diagnosed with ODD or CD.


Criteria for considering studies for this review

Types of studies

Eligible study designs are relevant randomised controlled trials (RCTs) and 'quasi-randomised' trials (qRCTs), where allocation is, for example, by date of birth, alternate numbers, case number, day of the week or month of the year.

Types of participants

Children and adolescents aged five to 20 years with ODD or CD, however diagnosed or defined, with or without common comorbid conditions such as ADHD, depression, anxiety disorders or learning disabilities, in any setting.

Types of interventions

Any organised sport, exercise or play activities that aim i) to strengthen the muscles or cardiovascular system, or both; ii) to improve balance and mobility; or iii) to strengthen the muscles/cardiovascular system and improve balance and mobility; and that last a minimum of four weeks.

The control group may be children and adolescents receiving no intervention or children and adolescents on a waiting list. 'No intervention' arms could include children and adolescents engaged in ordinary physical activity (which may be statutorily required, for example, routine physical education classes), but who are not otherwise engaged in organised physical exercise sessions.

Types of outcome measures

The outcomes may be self reported or measured by parents, teachers, health professionals or other evaluators.

Timing: assessment at pre-treatment, post-treatment (four to 12 weeks), follow-ups (six to 12 months and more than 12 months) and changes from baseline.

Primary outcomes
  • Behavioural symptoms* (incidence or severity of symptoms of inattention, hyperactivity and offending behaviour).

  • Cognitive function* (such as skills of concentration and attention; intelligence, that is, the ability to reason quickly and abstractly; and academic achievement, that is, overall school grades and performance).

  • Family functioning* and parent satisfaction.

  • Adverse event* (aggression).

Secondary outcomes
  • Quality of life*.

  • Adverse events (fatigue, pain and injuries/over-use injuries).

Outcomes marked with an asterisk are those that will be used to populate a 'Summary of findings' table, if data are available.

Search methods for identification of studies

We will search the following databases. No date or language limits will be applied and we will seek translation of relevant studies where necessary.

  1. Cochrane Central Register of Controlled Trials (CENTRAL)

  2. Ovid MEDLINE



  5. PsycINFO

  6. ERIC

  7. Cochrane Database of Systematic Reviews (CDSR)

  8. Database of Abstracts of Reviews of Effects (DARE)

  9. Science Citation Index (SCI)

  10. Social Sciences Citation Index

  11. Conference Proceedings Citation Index - Science (CPCI-S)

  12. Conference Proceedings Citation Index - Social Science & Humanities (CPCI-SSH)

  13. WorldCat (limited to dissertations and theses)

  14. SPORTDiscus

  15. (

  16. World Health Organization's (WHO) ICTRP

  17. metaRegister

We will search for additional grey literature in OpenGrey and will correspond with authors of relevant studies and other experts to identify unpublished data and ongoing trials. We will handsearch the reference lists of relevant studies and systematic reviews, and will search for additional studies using Google Scholar.

Searches will be based on the following search strategy for Ovid MEDLINE, which combines search terms for the condition (ODD and CD) with search terms for the intervention (physical exercise) and uses the Cochrane Highly Sensitive Search Strategy for identifying randomised trials (Lefebvre 2011). It will be adapted for other databases using appropriate controlled vocabulary and syntax.

1 exp "Attention Deficit and Disruptive Behavior Disorders"/

2 Child Behavior Disorders/

3 Adjustment Disorders/

4 adjustment$ disorder$.tw.

5 exp Impulse Control Disorders/

6 impuls$ control$.tw.

7 conduct disorder$.tw.

8 ((disrupt$ or defiant$) adj3 (behav$ or conduct$ or disorder$)).tw.

9 (impulsiv$ adj3 (behav$ or conduct$ or disorder$)).tw.

10 ((attention$ or behav$) adj3 (defic$ or dysfunc$ or disorder$)).tw.

11 ((aggressive or hostile or negativistic) adj3 (behav$ or conduct)).tw.

12 (ADHD or ADDH or "AD/HD" or HKD).tw.

13 or/1-12

14 exp exercise/

15 Physical Fitness/

16 Physical Exertion/

17 Motor Activity/

18 exp physical endurance/

19 exp Sports/

20 (sport$ or exercis$).tw.

21 exp Leisure Activities/

22 exercise movement techniques/ or dance therapy/ or tai ji/ or yoga/ or exp exercise therapy/ or hydrotherapy/

23 (aqua$ activit$ or aqua$ exercis$ or athletic$ or badminton or bandy or baseball or basketball or Bicycling or Boxing or danc$ or football).tw.

24 (golf$ or gymnastic$ or handball or hockey or jogging or martial art$ or mountaineer$ or rowing).tw.

25 (squash or strength training or swimming or skiing or skating or tai or tennis or volleyball or walking or weight lifting or workout or yoga).tw.

26 ((aerobic$ or cardio$ or isometric$ or physical) adj (activit$ or fitness$ or train$)).tw.

27 or/14-26

28 randomized controlled trial.pt.

29 controlled clinical trial.pt.

30 randomi#ed.ab.

31 placebo$.ab.

32 drug therapy.fs.

33 randomly.ab.

34 trial.ab.

35 groups.ab.

36 or/28-35

37 exp animals/ not humans.sh.

38 36 not 37

39 13 and 27 and 38
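The way the final lines of the strategy combine the three concept blocks can be sketched with set operations. This is only an illustration: the record IDs and the helper name or_lines are hypothetical, and real Ovid result sets are not exposed in this form.

```python
# Hypothetical record IDs returned by groups of search lines (illustrative only).
condition_lines = [{1, 2}, {2, 3}, {4}]   # stand-ins for lines 1-12
intervention_lines = [{2, 3}, {5}]        # stand-ins for lines 14-26
trial_filter_lines = [{2, 3}, {4, 6}]     # stand-ins for lines 28-35
animal_only = {2}                         # stand-in for line 37


def or_lines(lines):
    """Union of result sets, mirroring Ovid's 'or/a-b' syntax."""
    combined = set()
    for hits in lines:
        combined |= hits
    return combined


line_13 = or_lines(condition_lines)       # or/1-12: any condition term
line_27 = or_lines(intervention_lines)    # or/14-26: any exercise term
line_36 = or_lines(trial_filter_lines)    # or/28-35: any trial-filter term
line_38 = line_36 - animal_only           # 36 not 37: drop animal-only records
line_39 = line_13 & line_27 & line_38     # 13 and 27 and 38: final result set
print(sorted(line_39))
```

A record is retrieved only if it matches at least one term in every block and is not excluded as an animal-only study.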

Data collection and analysis

Selection of studies

Two review authors (VS and MSF) will independently screen titles and abstracts from the literature searches and will reject any articles that are obviously not relevant to the topic, such as studies of adults, non-randomised trials and non-exercise interventions. For studies that appear eligible, we will retrieve full-text papers, which VS and MSF will judge independently against the inclusion criteria. In case of uncertainty or disagreement, we will reach consensus through discussion or by consulting a third assessor (EE or GJ).

Data extraction and management

For eligible studies, two review authors (VS and MSF) will extract the data using data extraction forms. We will extract data concerning population, age, baseline characteristics, intervention characteristics, compliance and outcome measures. In studies of children with disruptive behaviour, factors such as socioeconomic disadvantage, single-parent family status, younger maternal age, higher parenting stress and depression, and greater severity of conduct problems have been found to be associated with attrition (Werba 2006). We will therefore record attrition rates. Attrition may also introduce bias, in the form of systematic differences between groups in withdrawals from a study, and withdrawals lead to incomplete outcome data.

We will resolve discrepancies through discussion or, if required, we will consult a third person (EE or GJ). We will enter data into Review Manager software (RevMan 2011) and check for accuracy. When information regarding any of the above is unclear, we will attempt to contact the corresponding authors of the original reports to obtain further details. Trials published only as abstracts will be considered, and their eligibility for inclusion in the review will be assessed on the basis of the information available. We will present the data in the 'Characteristics of included studies' table.

Assessment of risk of bias in included studies

Two review authors (VS and MSF) will independently assess risk of bias for each included study using the criteria outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We will resolve any disagreement by discussion or by involving a third assessor (GJ or EE).

(1) Sequence generation (checking for possible selection bias)

For each included trial, we will describe the methods used to generate the allocation sequence in sufficient detail to allow an assessment of whether it should produce comparable groups. To the question "Was the allocation sequence adequately generated?", we will rate the risk of bias as:

  • low (e.g. random number table; computer random number generator);

  • high (odd or even date of birth; hospital or clinic record number);

  • unclear (when there is insufficient information to permit judgement of 'low risk' or 'high risk').

(2) Allocation concealment (checking for possible selection bias)

For each included trial, we will describe the method used to conceal the allocation sequence in sufficient detail and determine whether intervention allocation could have been foreseen in advance of, or during, recruitment, or changed after assignment. To the question "Was allocation adequately concealed?", we will rate the risk of bias as:

  • low (e.g. telephone or central randomisation; consecutively numbered sealed opaque envelopes);

  • high (open random allocation; unsealed or non-opaque envelopes; alternation; date of birth);

  • unclear (when there is insufficient information to permit judgement of 'low risk' or 'high risk').

(3) Blinding

Blinding may reduce the risk that knowledge of which intervention was received, rather than the intervention itself, affects outcomes and assessments of outcomes. We will describe for each included trial the methods used, if any, to blind trial participants, personnel and outcome assessors from knowledge of which intervention a participant received.

(i) Blinding of participants and personnel (checking for possible performance bias)

We will describe for each included trial the methods used, if any, to blind trial participants and personnel from knowledge of which intervention a participant received. We will assess blinding separately for different outcomes or classes of outcomes. To the question "Was knowledge of allocated intervention adequately prevented during the trial?", we will rate the risk of bias as:

  • low, high or unclear for participants;

  • low, high or unclear for personnel.

(ii) Blinding of outcome assessment (checking for possible detection bias)

We will describe for each included trial the methods used, if any, to blind outcome assessors from knowledge of which intervention a participant received. We will assess blinding separately for different outcomes or classes of outcomes.

To the question "Was knowledge of allocated intervention adequately prevented during the trial?", we will rate the risk of bias as:

  • low, high or unclear for outcome assessors.

Here, 'low' indicates that there was blinding, or that we judged the outcome unlikely to have been influenced by lack of blinding.

(4) Incomplete outcome data (checking for possible attrition bias through withdrawals, dropouts, protocol deviations)

We will describe for each included trial the completeness of outcome data for each main outcome, including attrition and exclusions from the analysis. We will state whether attrition and exclusions are reported, the numbers included in the analysis at each stage (compared with the total number of randomised participants), reasons for attrition or exclusion where reported, and any re-inclusions in analyses that we undertake. To the question "Were incomplete outcome data adequately addressed?", we will rate the risk of bias as:

  • low (e.g. where there are no missing data or where reasons for missing data are balanced across groups);

  • high (e.g. where missing data are likely to be related to outcomes or are not balanced across groups, or where high levels of missing data are likely to introduce serious bias or make the interpretation of results difficult);

  • unclear (when there is insufficient information to permit judgement of 'low risk' or 'high risk').

(5) Selective reporting bias

Selective reporting bias refers to systematic differences between reported and unreported findings (e.g. incomplete reporting of study findings, differential reporting of outcomes, potential for bias in reporting through source of funding). We will examine selective reporting by comparing the study report with the protocol (if available) or with the outcomes prespecified in the methods section. We will also consider the possibility of selective reporting when both final values and change scores are used but authors report only one of the two (the one with exaggerated results). We will rate the risk of bias as:

  • low (where it is clear that all of the trial's prespecified outcomes and all expected outcomes of interest to the review have been reported);

  • high (where not all the trial's prespecified outcomes have been reported; one or more reported primary outcomes were not prespecified; outcomes of interest are reported incompletely and so cannot be used; trial failed to include results of a key outcome that would have been expected to have been reported);

  • unclear (when there is insufficient information to permit judgement of 'low risk' or 'high risk').

(6) Other sources of bias

We will describe for each included trial any important concerns we have about other possible sources of bias. For example, was there a potential source of bias related to the specific study design? Was the trial stopped early due to some data-dependent process? Was there extreme baseline imbalance? Has the trial been claimed to be fraudulent? To the question "Was the trial apparently free of other problems that could put it at a high risk of bias?", we will rate the risk of bias as:

  • low;

  • high;

  • unclear.

Trials will be grouped as those with a low risk of bias (five or six criteria met), those with a moderate risk of bias (three or four criteria met), and those with a high risk of bias (two or fewer criteria met). Other methodological issues will be documented in the 'Characteristics of included studies' table.
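A minimal sketch of this grouping, assuming that 'criteria met' counts how many of the six domains are rated low risk and that the boundaries are read as five or six (low), three or four (moderate) and two or fewer (high). The function name is hypothetical.

```python
def risk_of_bias_group(criteria_met: int, total_criteria: int = 6) -> str:
    """Group a trial by the number of risk-of-bias criteria it meets.

    Assumed cut-offs: 5-6 -> low, 3-4 -> moderate, 0-2 -> high.
    """
    if not 0 <= criteria_met <= total_criteria:
        raise ValueError("criteria_met must lie between 0 and total_criteria")
    if criteria_met >= 5:
        return "low"
    if criteria_met >= 3:
        return "moderate"
    return "high"


for met in (6, 4, 2):
    print(met, risk_of_bias_group(met))
```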

Measures of treatment effect

We will aim to use the final values, that is, the values at post-treatment and follow-ups, in calculating treatment effects. However, because changes from baseline can in some circumstances be more efficient and powerful than comparisons of final values, we will use change scores if these have a less skewed distribution than the final values, or when this is the method of analysis used in the study reports. Change-from-baseline and final value scores will be combined in meta-analyses only when using the (unstandardised) mean difference (MD). Since the mean values and standard deviations for the two types of outcome may differ substantially, they will be placed in separate subgroups, to avoid confusion for the reader, but the results of the subgroups will be pooled together. Final values and change scores will not be combined when using standardised mean differences, since the difference in standard deviation then reflects differences in the reliability of the measurements and not differences in measurement scale.

Continuous (including scale) data

We will analyse continuous outcomes using the MD with 95% confidence intervals (CIs). MDs and 95% CIs will be calculated by comparing and pooling the mean score differences from baseline to the end of treatment for each group.
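As an illustration of the calculation for a single trial, the sketch below computes an MD and its 95% CI from group means, standard deviations and sample sizes. It assumes a large-sample normal approximation (z = 1.96) and unpooled group variances; the function name and example values are hypothetical.

```python
import math


def mean_difference(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Return (MD, lower, upper) for intervention (1) versus control (2)."""
    md = m1 - m2
    # Standard error of a difference between two independent means.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, md - z * se, md + z * se


# Hypothetical symptom scores: intervention mean 10, control mean 12.
md, lower, upper = mean_difference(10, 2, 50, 12, 2, 50)
print(round(md, 3), round(lower, 3), round(upper, 3))
```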

Dichotomous (binary) data

We will analyse dichotomous outcomes by calculating the risk ratio (RR) for each trial, with the uncertainty in each result expressed by its 95% CI.
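A minimal sketch of the RR calculation for one trial, assuming the usual log-scale standard error and a normal approximation for the CI. It does not handle zero cells; the function name and example counts are hypothetical.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c, z=1.96):
    """Return (RR, lower, upper) for treatment versus control."""
    if events_t == 0 or events_c == 0:
        raise ValueError("zero cells need a continuity correction (not shown)")
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR).
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical trial: 10/100 events with exercise, 20/100 in the control arm.
rr, lower, upper = risk_ratio(10, 100, 20, 100)
print(round(rr, 3), round(lower, 3), round(upper, 3))
```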

Multiple measure strategies

Investigators may choose different instruments to measure outcomes, either because they use different definitions of a particular outcome or because they choose different instruments to measure the same outcome. For example, an investigator may choose to use a generic instrument to measure functional status or a different disease-specific instrument to measure functional status. The definition of the outcome may or may not differ. We will aim to use the standardised mean difference to combine trials that assess the same outcome with different measures or instruments. Since both final values and changes from baseline are likely to be reported, we will use final values as the outcome variable and account for baseline measurements of the outcome variable as a covariate in a regression model or analysis of covariance (ANCOVA). If there is evidence of skewed data, this will be reported. Where the standard deviation of the change is missing, we will impute it using the standard deviation at the end of treatment for each group.
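One common form of the standardised mean difference is Hedges' adjusted g, sketched below. Using this particular estimator is an assumption, since the protocol does not name one; the function name and example values are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference with a small-sample correction."""
    df = n1 + n2 - 2
    # Pooled within-group standard deviation.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    # Small-sample correction factor.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Two hypothetical trials measuring the same outcome on different scales.
print(round(hedges_g(10, 2, 20, 12, 2, 20), 4))
print(round(hedges_g(50, 10, 20, 60, 10, 20), 4))
```

Both hypothetical trials give the same standardised effect although their raw scales differ, which is what allows them to be pooled.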

Unit of analysis issues

1. Cluster trials

Trials often employ 'cluster randomisation' (such as randomisation by clinician or practice), but analysis and pooling of clustered data may pose problems.

We will include cluster-randomised trials in the analyses along with individually randomised trials. We will adjust their sample sizes using the methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011), using an estimate of the intracluster correlation coefficient (ICC) derived from the trial (if possible), from a similar trial or from a trial of a similar population. If we use ICCs from other sources, we will report this and conduct sensitivity analyses to investigate the effect of variation in the ICC. A statistician will be involved in this part of the analysis.
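The sample-size adjustment can be sketched with the design effect, 1 + (m - 1) x ICC, where m is the mean cluster size. The ICC and cluster size below are hypothetical values chosen for illustration, and the function names are not from the protocol.

```python
def design_effect(mean_cluster_size, icc):
    """Inflation factor for the variance caused by clustering."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n, mean_cluster_size, icc):
    """Sample size after dividing by the design effect."""
    return n / design_effect(mean_cluster_size, icc)


# Hypothetical trial: 200 children in classes of 20, assumed ICC of 0.05.
print(round(design_effect(20, 0.05), 2))
print(round(effective_sample_size(200, 20, 0.05), 1))
```

Repeating the calculation over a range of plausible ICC values is one simple way to run the sensitivity analysis described above.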

2. Studies with multiple treatment groups

In the case of several interventions or control groups to be compared, we will, wherever possible, attempt to combine all relevant control intervention groups into a single group and all relevant experimental intervention groups into a single group. This might be when there are two groups undertaking similar types of exercise, for example when aiming to strengthen the muscles, strengthen the cardiovascular system, or improve balance and mobility, or when there are exercise groups of different intensities, and also when data are presented separately for boys and girls.
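Combining two arms into a single group requires a pooled sample size, mean and standard deviation. The sketch below uses the standard formula for merging two groups' summary statistics; the function name and example values are hypothetical.

```python
import math


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two groups' summary statistics into (n, mean, sd)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Within-group variation plus the variation between the two group means.
    ss = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
          + (n1 * n2 / n) * (m1 - m2) ** 2)
    sd = math.sqrt(ss / (n - 1))
    return n, mean, sd


# Hypothetical arms, e.g. a low-intensity and a high-intensity exercise group.
n, mean, sd = combine_groups(10, 5.0, 1.0, 10, 7.0, 1.0)
print(n, mean, round(sd, 4))
```

With more than two arms, the same function can be applied repeatedly, merging one further arm at a time.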

3. Repeated observations on participants

If there are any trials that have multiple observations for outcome variables, we will classify the time frames of included trials as short-term (from four weeks to three months) and long-term (over three months) follow-up, and we will conduct meta-analyses for each group of trials.

Dealing with missing data

For included trials, we will note levels of attrition and the proportions of participants for whom no outcome data were obtained and we will report it in a 'Risk of bias' table. For all outcomes, we will carry out analyses, as far as possible, on an intention-to-treat basis, that is, we will attempt to include all participants randomised to each group in the analyses, and all participants will be analysed in the group to which they were allocated, regardless of whether or not they received the allocated intervention. The denominator for each outcome in each trial will be the number randomised minus any participants whose outcomes are known to be missing.

Where data are not reported for some outcomes or groups, we will attempt to contact the trial authors for further information. If the original investigators cannot provide the missing data, we will assume these data are not missing at random and that missing cases had poor outcomes. The options for dealing with missing data depend on whether the data are dichotomous or continuous.

For dichotomous data, we will impute the missing data with replacement values, and treat these as if they were observed (e.g. imputing an assumed outcome such as assuming all were poor outcomes, imputing the mean, or imputing based on predicted values from a regression analysis).

For continuous data, we will impute outcomes for the missing participants as 'last observation carried forward', or, for change from baseline outcomes, assume that no change took place.
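The two rules for continuous data can be sketched as follows, assuming each participant's repeated measurements are available as a list with None marking a missing assessment. This layout and the function names are hypothetical.

```python
from typing import List, Optional


def locf(values: List[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward; leading gaps stay missing."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def change_assuming_no_change(change: Optional[float]) -> float:
    """For change-from-baseline outcomes, treat a missing change as zero."""
    return 0.0 if change is None else change


# Hypothetical participant assessed at four time points.
print(locf([5.0, None, 7.0, None]))
print(change_assuming_no_change(None))
```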

Where it is not possible to calculate the standard deviation from the P value or the CIs, we will impute the standard deviation as the highest standard deviation in the other trials included under that outcome, fully recognising that this form of imputation will decrease the weight of the trial in the calculation of MDs and bias the effect estimate towards no effect in the case of the standardised mean difference (Higgins 2011). We will explore the impact of including trials with high levels of missing data using a sensitivity analysis.
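A sketch of both steps, assuming the reported CI is a 95% interval around a single group mean from a reasonably large sample (so that z = 1.96 applies). The function names and example values are hypothetical.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover a group's SD from the CI around its mean."""
    standard_error = (upper - lower) / (2 * z)
    return math.sqrt(n) * standard_error


def impute_highest_sd(other_trial_sds):
    """Fallback: borrow the largest SD reported by the other trials."""
    if not other_trial_sds:
        raise ValueError("no standard deviations available to borrow from")
    return max(other_trial_sds)


print(round(sd_from_ci(8.0, 12.0, 100), 4))
print(impute_highest_sd([1.2, 3.4, 2.0]))
```

For small samples, a t-distribution multiplier would replace 1.96.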

Assessment of heterogeneity

Clinical heterogeneity refers to differences between trials in key characteristics of the participants, interventions or outcome measures. Many different clinical factors can influence the effect of intervention, including demographics (age, sex, and race/ethnicity), severity of disease, comorbidities, and co-interventions in the population. Identifying such factors is important because it may help us understand which populations will benefit the most, which the least, and who is at greatest risk of experiencing adverse outcomes. We consider the likelihood of encountering clinical heterogeneity to be high for our population since CD and ODD are conditions associated with quite high and fluctuating comorbidities. Moreover, the incidence or severity of symptoms may vary greatly (both behavioural and cognitive symptoms). Age and gender are also important characteristics that may influence the results. With a broad intervention approach, differences in the type or length/period of the intervention may also be sources of clinical heterogeneity.

Another source of heterogeneity is methodological heterogeneity and relates to differences in study design and methods. Non-randomised trials may be expected to be more heterogeneous than randomised trials, given the extra sources of methodological diversity and bias.

Where there are large differences in clinical or methodological nature between trials, the simplest question to ask is whether there is any good reason for pooling data from these trials in a meta-analysis, where heterogeneity is known to exist. We will use visual exploration of heterogeneity with forest plots and evaluate CI overlap and, if clinical heterogeneity is apparent, we will decide whether it is meaningful or not to follow up with a subgroup analysis.

Statistical heterogeneity is the variability in the observed treatment effects beyond what would be expected by random error. We will assess statistical heterogeneity using the Chi² test of heterogeneity along with visual inspection of the graph. We will also calculate the I² statistic (Higgins 2011). This measures the extent of inconsistency among the results of the trials. We will regard heterogeneity as substantial if I² is greater than 50% to 60% (I² > 40% moderate heterogeneity, I² > 80% considerable heterogeneity) or there is a low P value (< 0.10) in the Chi² test for heterogeneity. We will explore the heterogeneity by subgroup analysis. We will use a random-effects meta-analysis as an overall summary if this is considered appropriate.
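The heterogeneity statistics can be sketched from trial-level effect estimates and standard errors using inverse-variance weights. The P value for the Chi² test is omitted to keep the example free of external libraries; the function name and example values are hypothetical.

```python
def heterogeneity(effects, standard_errors):
    """Return (Q, df, I2 in percent) for a set of trial estimates."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Three hypothetical trials with equal precision.
q, df, i2 = heterogeneity([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
print(round(q, 2), df, round(i2, 1))
```

In this hypothetical example I² is 75%, which would count as substantial under the thresholds above.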

Assessment of reporting biases

We will aim to minimise publication bias by employing a sufficiently robust search strategy involving electronic databases and handsearching, grey literature including conference abstracts, registers of clinical trials, and field experts. Our review of each trial report will include an evaluation of non-reported or insufficiently reported outcomes; where within-trial reporting bias is suspected, we will obtain the protocols where possible and compare the prespecified outcomes with those reported in the published trial results, or compare study reports with prespecified outcomes in the methods.

We will be alert to possible duplication bias by cross-checking details of authors, locations, numbers of participants and dates. Funnel plots will be drawn to investigate any relationship between effect size and trial precision (closely related to sample size) if we identify a sufficient number of trials (i.e. at least 10 trials). Such a relationship could be due to publication or related biases or due to systematic differences between small and large trials. If a relationship is identified, we will aim to find possible explanations.
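
The funnel plot described above can be sketched as a set of plotting coordinates. This minimal Python example assumes the standard error is used as the measure of precision and that pseudo 95% limits are drawn around a fixed-effect pooled estimate; both are assumptions made for illustration, and the function and constant names are hypothetical.

```python
MIN_TRIALS = 10  # minimum number of trials stated in the protocol


def funnel_points(effects, std_errors):
    """Return one (effect, se, lower, upper) tuple per trial.

    Returns None when there are too few trials for a funnel plot.
    lower/upper are pseudo 95% limits around the pooled estimate at
    that trial's standard error (an assumption of this sketch).
    """
    if len(effects) < MIN_TRIALS:
        return None
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return [
        (y, se, pooled - 1.96 * se, pooled + 1.96 * se)
        for y, se in zip(effects, std_errors)
    ]
```

Trials falling asymmetrically around the pooled estimate, particularly among those with large standard errors, would prompt the search for explanations described above.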

Data synthesis

We will perform analyses according to the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). We will use aggregated data for analysis. For statistical analysis, data will be entered into Review Manager (RevMan 2011). One review author (VS) will enter data into the software and a second review author (MSF or EE) will check this for accuracy.

For both dichotomous and continuous outcomes, we will use a fixed-effect meta-analysis to combine data where it is reasonable to assume that trials are estimating the same underlying treatment effect, that is, where trials are examining the same intervention and the trials' populations and methods are judged sufficiently similar. If there is clinical heterogeneity sufficient to expect that the underlying treatment effects differ between trials, or if substantial statistical heterogeneity is detected (I² > 60%), we will use a random-effects meta-analysis to produce an overall summary, provided a mean treatment effect across trials is considered clinically meaningful. The random-effects summary will be treated as the mean range of possible treatment effects and we will discuss the clinical implications of treatment effects differing between trials. If the mean treatment effect is not clinically meaningful, we will not combine trials. If we use random-effects analyses, the results will be presented as the mean treatment effect with its 95% CI, and the estimates of Tau² and I². If data are too heterogeneous, we will explore possible explanations by means of sensitivity analyses. Otherwise, we will present the data narratively.

When some trials report an outcome as a dichotomous measure and others report it as a continuous measure, we will analyse the two sets of trials separately.

Subgroup analysis and investigation of heterogeneity

Subgroup analyses involve splitting all the participant data into subgroups, often to make comparisons between them. Subgroup analyses may be done for subsets of participants, or for subsets of studies (Higgins 2011). Subgroup analyses may be done as a means of investigating heterogeneous results, or to answer specific questions about particular patient groups, types of interventions or types of studies.

If possible, we will perform subgroup analyses and investigations of heterogeneity for each outcome meta-analysed by considering:

  • diagnosis - CD versus ODD;

  • type of exercise intervention.
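
A subgroup analysis over subsets of studies can be sketched as grouping trials by a label and pooling within each group. The example below uses a fixed-effect pooled estimate per subgroup purely for illustration; the labels, data and function name are hypothetical, and the protocol does not specify how subgroups will be compared.

```python
from collections import defaultdict


def pool_by_subgroup(studies):
    """Pool effects within each subgroup.

    studies: iterable of (subgroup_label, effect, std_error) tuples.
    Returns {label: (pooled_effect, number_of_trials)}.
    """
    groups = defaultdict(list)
    for label, effect, se in studies:
        groups[label].append((effect, se))

    results = {}
    for label, rows in groups.items():
        weights = [1.0 / se ** 2 for _, se in rows]
        pooled = sum(w * y for w, (y, _) in zip(weights, rows)) / sum(weights)
        results[label] = (pooled, len(rows))
    return results


# Illustrative data only: two ODD trials and one CD trial
by_diagnosis = pool_by_subgroup([
    ("ODD", 0.2, 0.1),
    ("ODD", 0.4, 0.1),
    ("CD", 0.6, 0.2),
])
```

The same function could be called with exercise type as the label to cover the second planned subgroup analysis.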

Sensitivity analysis

We will conduct sensitivity analyses to determine whether findings are sensitive to restricting the analyses to trials judged to be at low risk of bias. In these analyses, we will include only trials at low risk of selection bias (associated with sequence generation or allocation concealment). In addition, we will assess the sensitivity of findings to any imputed data. If we identify a sufficient number of trials, we also plan to perform sensitivity analyses excluding unpublished trials.
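
The restriction to trials at low risk of selection bias could look like the following sketch, which compares the pooled estimate from all trials with the estimate from the restricted set. It assumes that a trial counts as low risk only when both sequence generation and allocation concealment are rated "low"; that reading, the dictionary keys and the data are all assumptions made for illustration.

```python
def pooled_fixed(trials):
    """Inverse-variance fixed-effect estimate, or None for an empty list."""
    if not trials:
        return None
    weights = [1.0 / t["se"] ** 2 for t in trials]
    total = sum(w * t["effect"] for w, t in zip(weights, trials))
    return total / sum(weights)


def sensitivity_low_selection_bias(trials):
    """Return (estimate from all trials, estimate from low-risk trials)."""
    low_risk = [
        t for t in trials
        if t["sequence_generation"] == "low"
        and t["allocation_concealment"] == "low"
    ]
    return pooled_fixed(trials), pooled_fixed(low_risk)


# Illustrative data only
example_trials = [
    {"effect": 0.2, "se": 0.1,
     "sequence_generation": "low", "allocation_concealment": "low"},
    {"effect": 0.4, "se": 0.1,
     "sequence_generation": "low", "allocation_concealment": "low"},
    {"effect": 0.9, "se": 0.1,
     "sequence_generation": "high", "allocation_concealment": "unclear"},
]
all_estimate, low_risk_estimate = sensitivity_low_selection_bias(example_trials)
```

A marked difference between the two estimates would suggest that the overall finding depends on trials at higher risk of selection bias.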


Acknowledgements

We would like to acknowledge members of the Cochrane Developmental, Psychosocial and Learning Problems Group for their assistance and comments, especially Laura MacDonald for assistance with the preparation of the protocol and Professor Geraldine Macdonald, Director at the Institute of Child Care Research, Queen's University in Belfast, for her constructive feedback on the first draft of the protocol. We would also like to acknowledge Frode Heian, Gro Jamtvedt and Kåre Birger Hagen for their contributions to the original protocol (Issue 1, 2006).

Contributions of authors

VS, MSF and EE wrote the protocol. VS and MSF will select studies for the review, assess the quality of studies, extract the data and write the review. EE and GJ will adjudicate disagreements between VS and MSF. GJ and EE will contribute statistical expertise and provide comments and feedback on the manuscript draft.

Declarations of interest

Vegard Strøm - researcher at Sunnaas Rehabilitation Hospital, Nesoddtangen, Norway.
Marita Sporstøl Fønhus - none known.
Eilin Ekeland - President of the Norwegian Physiotherapist Association, Norway.
Gro Jamtvedt - none known.