Commentary: A contribution to evidence-informed education policy – reflections on Strong et al. (2011)


  • Conflict of interest statement: No conflicts declared.

On the basis of their meta-analysis, Strong, Torgerson, Torgerson, and Hulme (2011) conclude that ‘There is no evidence from this review that Fast ForWord is effective as a treatment for children’s reading or expressive or receptive vocabulary weaknesses.’ This commentary will consider whether this is an accurate and fair conclusion and, if so, the implications for education policy.

The meta-analysis is based on a systematic review of the literature that follows current guidelines on best practice for such reviews. The search, the selection of studies for inclusion, and the data extraction were based on a pre-defined protocol. This is a transparent process and it is open to replication. There can be no doubt that accurate values were obtained for the effect sizes of Fast ForWord against untreated controls or against treated controls. These effect sizes were small and not significantly different from zero for any of the four outcome measures selected. Using the methods adopted, the conclusions from the paper are accurate, but are they fair?
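The pooling behind such a conclusion can be illustrated with a standard fixed-effect (inverse-variance) calculation. The sketch below uses hypothetical effect sizes and variances chosen purely for illustration, not the values analysed by Strong et al. (2011):

```python
import math

def pooled_effect(effects, variances):
    """Fixed-effect (inverse-variance) pooling of standardised mean
    differences. Returns the pooled effect and its 95% confidence interval."""
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    d = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))              # standard error of pooled d
    return d, (d - 1.96 * se, d + 1.96 * se)

# Hypothetical effect sizes (Cohen's d) and variances for four studies;
# these are illustrative numbers, not data from the review under discussion.
d, (lo, hi) = pooled_effect([0.10, -0.05, 0.20, 0.00],
                            [0.04, 0.05, 0.06, 0.04])
print(f"pooled d = {d:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
# The interval straddles zero, so the pooled effect is not significant
```

A pooled effect whose confidence interval includes zero, as here, is exactly the pattern the review reports for all four outcome measures.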

There seem to be two potential ways that a claim for partiality could be made. The first is that there was an element of selection at the second stage of screening that reduced the possible 13 eligible papers to just 6 for the meta-analysis. Did these papers contain positive results that could have changed the outcome? The authors themselves discuss the negative results from two of these excluded studies (Bishop, Adams, & Rosen, 2006; Bishop, Adams, Lehtonen, & Rosen, 2005). The other two studies not using the commercially available Fast ForWord program also failed to show significant effects (Wren & Roulstone, 2008; Ukrainetz, Ross, & Harm, 2009). There were two studies excluded because they failed to meet the design requirement of baseline equivalence of groups. Of these, one (Troia & Whitney, 2003) showed significant gains for the Fast ForWord group on just one out of the four measures used and the other did not show any significant gains against controls over the 2 years after treatment (Hook, Macaruso, & Jones, 2001). So of the six substantive papers excluded at this second stage of screening, just one showed any evidence for significantly sustained gains for the Fast ForWord (or equivalent) intervention group. It has to be concluded that an element of selection at the second stage of screening has not introduced unfairness in the review process.

The second potential source of unfairness is publication bias. This is a pervasive problem in the evaluation of treatments based on published studies. It is clearly essential to concentrate a review on the studies appearing in peer reviewed journals. The peer review process provides a degree of assurance of a minimum quality of the published research but this is not infallible – poor quality studies do get published. Publication bias arises when the studies getting through this scrutiny present findings that are not representative of the findings from the full range of studies undertaken. Many of these unpublished studies fail to reach the minimum quality standard in terms of design and analysis. However, some do reach this standard but fail to get published because the findings are not seen to add to the field; this is particularly the case for those that produced non-significant results. Non-significant results might arise if a study is underpowered and therefore unlikely ever to produce positive results. However, others may have been adequately powered but were rejected for publication, or not even submitted for publication, because the negative result was thought to be of little interest. By this process, publication bias may result in the published papers over-estimating the potential impact of a treatment by selecting out studies with small or non-significant effects.
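The mechanism described here can be sketched in a few lines. The toy simulation below (with hypothetical parameters, not modelled on any real literature) shows how a significance filter alone inflates the apparent mean effect of a treatment:

```python
import random

def simulate_publication_bias(true_effect=0.1, n_studies=2000,
                              se=0.15, seed=1):
    """Toy model: each study estimates a small true effect with sampling
    error; only studies reaching z > 1.96 (p < .05) are 'published'.
    Returns (mean of all estimates, mean of published estimates)."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    published = [d for d in estimates if d / se > 1.96]   # significance filter
    return (sum(estimates) / len(estimates),
            sum(published) / len(published))

overall, published = simulate_publication_bias()
print(f"mean of all studies: {overall:.2f}, "
      f"mean of published studies: {published:.2f}")
# The published mean is several times the true effect of 0.1
```

In this sketch the average of all the studies conducted sits close to the small true effect, while the average of the 'published' subset is several times larger, which is precisely the over-estimation the paragraph above describes.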

How might publication bias have produced unfairness in the Strong et al. (2011) meta-analysis? Here there is a conundrum. The meta-analysis failed to show significant effect sizes and so publication bias, if present and as it is usually understood, would produce a ‘true’ effect size smaller than that obtained in the review. Unfairness from publication bias would only arise in this case if the unpublished studies showed substantially greater effect sizes. It is here that what Strong et al. term ‘findings from their privately conducted and non-peer-reviewed studies [Scientific Learning Corporation, 1999, 2003]’ might be drawn upon. It is known that systematic bias favours products that are made by the company funding the research (Lexchin, Bero, Djulbegovic, & Clark, 2003). With this in mind, until these studies have been subjected to the peer review process, it is proper to exclude them from consideration.

These considerations suggest that the findings and conclusions of the meta-analysis were both accurate and fair. This view is supported by the outcomes of the previous reviews of the effectiveness of the Fast ForWord program summarised in the paper.

The conclusions from this review matter, since failing to learn to read has long-term adverse outcomes for the individual and creates a burden for society as a whole. These burdens include the continuing cost to the education system but also those arising from the increased risk of behavioural difficulties and reduced employment capabilities that are, for some, a consequence of educational failure. There is a great need for effective intervention to help those failing in learning to read. This means not only that we should employ methods of proven value but also, equally importantly, that we should not divert resources to remedial methods that have failed to demonstrate their worth. In the case of supporting children failing to learn to read, there is excellent evidence for the efficacy of training in phoneme awareness (Ehri et al., 2001).

How can the results of such meta-analyses be incorporated into educational policy making? The emphasis being placed on evidence-based practice in health care is seen as problematic by some in other fields such as education (Oakley, 2002). Some of these concerns relate to the suspected political motives behind the evidence-based focus. Others relate to the inappropriateness of importing what is seen as a ‘positivist’ orientation to research in an area (education) where alternative social science approaches are thought to be more appropriate. Nevertheless, Oakley (2002) identifies ways in which the ‘unscientific, non-cumulative, uncollaborative and inaccessible nature of much educational research’ can be brought within a framework to produce evidence-informed policy. In addition, there is an argument that a broader range of evidence than that from randomised controlled trials (RCTs) is needed to properly appraise the influence of the full range of factors that could be beneficial, for example, to reading attainment (Pressley, Duke, & Boling, 2004). They rightly argue that factors, such as teacher effectiveness, are not readily amenable to RCT methods and need to be studied using non-experimental approaches.

However, where RCTs can be applied they remain the most robust test of causal relationships between an intervention and an outcome. For Fast ForWord there can be no doubt that the RCT is the method of choice to evaluate its efficacy. It is a circumscribed package of instruction that can be readily applied under controlled conditions. When subjected to this test, it is my view that RCTs of Fast ForWord have failed to provide evidence for its efficacy and that it is time that children with difficulties in language and/or reading acquisition were given access only to teaching methods of known effectiveness.


Correspondence to

Jim Stevenson, School of Psychology, University of Southampton, Highfield, Southampton SO17 1BJ, UK; Email: