Editorial: “Better by design” – why randomized controlled trials are the building blocks of evidence-based practice


This issue of JCPP features three papers that explore the question of ‘what works’ in interventions for children's difficulties. They are important in two ways: first, because they add to the evidence base on effective – and in one case depressingly ineffective – interventions in specific areas; and second, because they offer more general lessons on the importance of good research design in treatment evaluations, and the key features to be alert to in evaluating evaluation research.

The core design requirements for assessing treatment effectiveness are well established (see, for example, Evans, Thornton, Chalmers & Glasziou, 2011). The most robust approach involves random allocation of children to intervention or control conditions (or other stringent approaches to group matching if random allocation is not feasible); adequate sample sizes to ensure statistical power; a clearly defined intervention, with checks on fidelity of implementation; and outcome measures that are theoretically and clinically relevant, as well as ecologically valid, and undertaken blind to treatment group. Though these criteria may seem demanding, they have repeatedly proved crucial in sorting the wheat from the chaff in treatment evaluations. All too often, poorly designed studies have been used to support ineffective treatments (as detailed, for example, in the Strong, Torgerson, Torgerson and Hulme (2011) meta-analysis of the Fast ForWord language programme, or the Sonuga-Barke et al. (2013) meta-analyses of non-pharmacological interventions for ADHD). Adequately powered, well-conducted randomized controlled trials represent the best test of the efficacy of interventions.
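The "adequate sample size" requirement can be made concrete with a standard power calculation. As an illustrative sketch (not drawn from any of the papers discussed here), the usual normal-approximation formula for a two-group comparison gives the number of children needed per arm to detect a standardized effect size d:

```python
# Required sample size per group for a two-sample comparison,
# using the standard normal-approximation formula:
#   n = 2 * ((z_{1-alpha/2} + z_{1-beta}) / d)^2
# An illustrative sketch only; a real trial would also adjust for
# attrition, clustering (e.g. children within schools), and so on.
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Sample size per arm to detect standardized effect size d."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# A 'medium' effect (d = 0.5) at 80% power needs about 63 children
# per arm; a 'small' effect (d = 0.2) needs about 393.
print(n_per_group(0.5))  # 63
print(n_per_group(0.2))  # 393
```

The steep rise in required numbers as the expected effect shrinks is one reason underpowered evaluations so often produce the misleading results the editorial warns against.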

Fricke and colleagues provide an elegant example of the application of these design principles in their evaluation of a language intervention for young children, designed to bolster early language skills and provide a foundation for learning to read. To date, most well-supported interventions for literacy problems have focused on improving reading accuracy, via enhancing word-decoding skills. As Fricke et al. argue, however, broader language skills are also important for literacy development, and perhaps especially so for reading comprehension; as a result, improving early language skills may also be of value for children who struggle to learn to read. Their study is a model of a rigorous, real-world test of that proposition. Large samples of young children with weak oral language skills were randomly assigned to intervention or waiting control groups at the end of nursery school; a well-documented 30-week language programme was delivered by teaching assistants in schools; and the investigators took repeated measures of skills directly taught in the programme, as well as broader standardized tests of language, narrative, phoneme awareness and literacy skills.

The findings were clear-cut in showing improvements in oral language and spoken narrative skills, and also in later reading comprehension; importantly, these latter effects were independent of any concurrent effects on reading accuracy, suggesting that reading comprehension did indeed benefit from improvements in earlier language skills. Though some questions remained regarding the optimum timing, duration and intensity of elements of the programme, this study clearly illustrates how a well-designed evaluation can provide evidence on programme effectiveness that is of direct relevance to policy and practice.

The second intervention-oriented paper also underscores the importance of research design, but makes for less encouraging reading. For her 2012 Emanuel Miller Memorial Lecture, Dorothy Bishop set out to examine the contribution that recent advances in brain imaging have made to our understanding of interventions for language impairment. In the Research Review based on that lecture, Bishop details the disappointing outcome that ensued: as she discovered, few, if any, clear conclusions can be drawn at this stage because most extant studies show design flaws. As a result, her review became, in effect, both an exposition of key design principles in clinical trials and a cautionary tale on the current state of evidence in this particular field. Along the way, she draws attention to the somewhat alarming abandonment of critical faculties that many of us seem to show when faced with neuroscience findings: include images of the brain in a study report, and we are more likely to think that the study has good scientific bona fides.

In practice, as Bishop's review documents, this is far from necessarily the case. Having measures of brain function as outcomes in no way obviates the need for strong research design – and indeed may raise new design issues of its own. In addition, as with any outcome measure, a good scientific rationale is needed for the inclusion of neuroimaging findings in intervention research. Bishop argues that we may be running ahead of ourselves here, at least in relation to treatment studies in the language impairment field. She sets out a range of areas where such measures are likely to be valuable in the future; for now, however, she suggests that we would do better to concentrate on undertaking the well-designed trials needed to identify effective interventions before exploring the brain-related changes that may contribute to their effects.

The third intervention-related study in this issue focuses on a quite different outcome – youth offending – and assesses a very different type of treatment. Rather than testing a new approach to work with young offenders, Petitclerc and colleagues (2013) examine the long-term impact of the standard disposals handed down every day in the juvenile courts. In the juvenile justice field random assignment is of course often difficult to implement (though some studies have used it). Petitclerc et al. take a different approach, capitalizing creatively on data collected in the course of a long-term longitudinal study to identify and match their ‘intervention’ and ‘control’ groups. Across their teenage years, many of the young men in their study were arrested, but not all were brought to court. Using a technique known as propensity score matching, the investigators mined their extensive database to match each court-adjudicated offender with one or more others who also reported being arrested, but who were not dealt with through the courts. Most past studies of the effectiveness of juvenile court disposals have tracked offenders over relatively short follow-up periods, of around one year. Petitclerc et al. had access to offence data up to age 25, well into early adulthood, and well beyond the peak age of offending. The findings from this extended follow-up were, however, strikingly consistent with those of earlier research: young offenders processed through the court system had higher risks of re-offending than their similarly delinquent but non-adjudicated peers. As the authors acknowledge, their study cannot tell us why young people dealt with by the courts had such poor outcomes; what it underscores, however, is the key need for evidence on what does work in reducing risks of re-offending in these troubled and troublesome groups.
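The matching step at the heart of this design can be sketched in a few lines. Assuming propensity scores – each boy's estimated probability of being court-adjudicated given his background characteristics – have already been fitted (typically by logistic regression), a greedy nearest-neighbour match within a caliper pairs each adjudicated youth with his closest non-adjudicated counterpart. This is an illustrative sketch, not the authors' code, and all identifiers and scores below are invented:

```python
# Greedy 1:1 nearest-neighbour matching on pre-estimated propensity
# scores. A caliper caps how far apart matched scores may be, so
# treated cases with no comparable control simply go unmatched.

def match_on_propensity(treated, controls, caliper=0.05):
    """treated/controls: dicts mapping id -> propensity score.
    Returns (treated_id, control_id) pairs matched without replacement."""
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # closest remaining control by absolute score distance
        c_id, c_score = min(available.items(),
                            key=lambda kv: abs(kv[1] - t_score))
        if abs(c_score - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control used at most once
    return pairs

# Toy data: court-adjudicated boys vs arrested-but-diverted boys.
treated = {"A": 0.62, "B": 0.35, "C": 0.80}
controls = {"x": 0.60, "y": 0.33, "z": 0.10, "w": 0.79}
print(match_on_propensity(treated, controls))
# [('B', 'y'), ('A', 'x'), ('C', 'w')]
```

The caliper is what makes the comparison credible: control "z" (score 0.10) is never used, because no adjudicated boy looks enough like him on background risk to justify the pairing.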

In their different ways, each of these articles points to the need for rigorous treatment evaluations as the core building blocks of evidence-based practice. Undertaking such studies makes heavy demands on investigators – and also requires vigilance among those who read their results. To help with this, Duff and Clarke (2011, Table 2) recently devised a checklist for evaluating intervention research, which we reference again here. Armed with tools of this kind, and the examples provided by this issue's studies, we should all be in a stronger position to sort out what really works in intervention research.

  1. Articles in this issue (with italicized names on first mention) can be found at: http://onlinelibrary.wiley.com/doi/