The quality of randomised trials of tocolysis


Prof JG Thornton, Department of Obstetrics and Gynaecology, University of Nottingham, Nottingham City Hospital NHS Trust, Hucknall Road, Nottingham NG5 1PB, UK.


Tocolytic treatment of suspected preterm labour has been evaluated in at least 75 randomised controlled trials. These have been included in six Cochrane reviews. If the trials are poorly designed, such reviews may mislead or, at best, provide weaker evidence than those based on well-designed ones. The objective of this study was to compare the quality of the trials included in the Cochrane reviews of tocolytic therapy. Trial group sizes; the methods used by each trial to avoid selection, performance, attrition, and detection bias; and evidence that the statistical analysis plan was prespecified were abstracted from each Cochrane review. Except where noted, the judgement of the Cochrane reviewers was used. The number of trials graded A (sealed envelopes or third-party randomisation) for allocation concealment was as follows: beta-agonists 5/16, magnesium sulphate 9/23, oxytocin receptor antagonists 6/6, cox inhibitors 12/13, calcium channel blockers 9/12, and nitric oxide donors 5/5. The number blinding the intervention was as follows: beta-agonists 9/16, magnesium sulphate 2/23, oxytocin receptor antagonists 5/6, cox inhibitors 7/13, calcium channel blockers 0/12, and nitric oxide donors 1/5. The number reporting a sample size calculation was as follows: beta-agonists 2/16, magnesium sulphate 3/23, oxytocin receptor antagonists 6/6, cox inhibitors 4/13, calcium channel blockers 4/12, and nitric oxide donors 1/5. The mean sample size of each treatment group was as follows: beta-agonists 53, magnesium sulphate 41, oxytocin receptor antagonists 126, cox inhibitors 31, calcium channel blockers 43, and nitric oxide donors 46. Data on avoiding attrition bias (follow-up rates) are difficult to summarise because there is no agreed standard for ‘complete follow up’. Data on avoiding detection bias (blinding of outcome assessments) appeared unreliable because reviewers reported this in different ways. 
In conclusion, the trials of oxytocin antagonists and beta-agonists were of the highest quality. There remains considerable scope for bias in many of the trials included in the current Cochrane systematic reviews of tocolytics.


Evaluating treatments for spontaneous preterm labour is difficult because our understanding of the aetiology is poor. Genetic differences, cervical damage, sexual activity, and infection, as well as prostaglandins, oxytocin, and progesterone receptors, all play a role. At a practical level, preterm birth is difficult both to predict and to diagnose. We need to know not only whether a tocolytic is effective in preventing preterm birth but also whether delaying preterm birth will improve or worsen outcome. Reported studies suggest that any benefits or harms from current tocolytics are likely to be marginal. In this situation, randomised controlled trials (RCTs) provide the only reliable method to avoid bias. However, a study does not automatically provide reliable unbiased data when it uses a randomised design. Bias can still creep in if other aspects of the design are poor.

There are many ways to judge the quality of RCTs,1,2 but almost all authorities agree that there are four major sources of potential bias, which means that trial quality can be judged by the methods used to avoid them, namely (1) selection bias prevented by blinding of randomisation; (2) performance bias prevented by blinding of the intervention; (3) attrition bias prevented by complete follow up; and (4) detection bias prevented by blinding of the outcome assessment.

The Cochrane collaboration reviews all use at least a variation on this system.3 In addition to this, trials can be judged on their methods to avoid bias due to the play of chance.


No new primary research was carried out. Rather, the latest versions of all Cochrane reviews of tocolysis for the treatment or prevention of spontaneous preterm labour were retrieved.4–9 Reviews of maintenance therapy were excluded. The quality grading given by the Cochrane reviewers for methods used by each trial to avoid each type of bias was abstracted and tabulated.

The results are presented exactly as the Cochrane reviewers classified them with one exception, the correction of an apparent error in classifying some oxytocin antagonist trials. In some of these trials, the Cochrane reviewer had recorded that they were uncertain as to whether the trial had avoided detection bias despite the trials having been recorded as being double-blind.

Cochrane reviewers varied in how they classified methods for avoiding attrition bias. Sometimes semiquantitative words were used, describing the follow-up rate as ‘good’ or ‘bad’, or ‘high’ or ‘low’. At other times, they reported follow-up rates for the primary trial endpoint. Rates above 90% were classed as good and those below this as poor. If no rate was reported, the descriptive word chosen by the reviewers was used to classify this aspect of the trial.
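The classification rule described above can be sketched in Python. This is a minimal illustration rather than the procedure actually used: the function name `classify_follow_up`, the grouping of reviewer words into two sets, and the handling of a rate of exactly 90% (treated here as unclear, because the stated rule does not cover it) are all assumptions.

```python
from typing import Optional

# Reviewer descriptors mentioned in the text; grouping them into sets is an assumption.
GOOD_WORDS = {"good", "high"}
BAD_WORDS = {"bad", "low"}


def classify_follow_up(rate_percent: Optional[float] = None,
                       reviewer_word: Optional[str] = None) -> str:
    """Return 'good', 'poor', or 'unclear' for a trial's follow-up."""
    # A reported follow-up rate takes precedence over a descriptive word.
    if rate_percent is not None:
        if rate_percent > 90:
            return "good"
        if rate_percent < 90:
            return "poor"
        # Exactly 90% is not covered by the stated rule (assumption: unclear).
        return "unclear"
    # No rate reported: fall back on the reviewer's own word.
    if reviewer_word is not None:
        word = reviewer_word.strip().lower()
        if word in GOOD_WORDS:
            return "good"
        if word in BAD_WORDS:
            return "poor"
    return "unclear"
```

For example, a trial with a reported rate of 95% would be classed as good, whereas a trial described only as having ‘low’ follow up would be classed as poor.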

With regard to methods used to avoid bias due to the play of chance, trial sizes and whether the authors reported a predetermined sample size calculation were also recorded. Achieved sample sizes are presented as the mean number per trial group to avoid favouring trials with more than two arms.
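The reason for reporting size per group rather than per trial can be shown with a short Python sketch. The function names and the example trials are hypothetical, and averaging the per-trial group means across a review is an assumption about how the summary figure was derived.

```python
from statistics import mean
from typing import List, Tuple


def mean_group_size(total_participants: int, n_arms: int) -> float:
    """Mean number of participants per arm in a single trial."""
    if n_arms < 1:
        raise ValueError("a trial needs at least one arm")
    return total_participants / n_arms


def review_mean_group_size(trials: List[Tuple[int, int]]) -> float:
    """Average of the per-trial group means (assumed summary method)."""
    return mean(mean_group_size(total, arms) for total, arms in trials)


# Hypothetical trials as (total participants, number of arms);
# illustrative numbers only, not taken from any Cochrane review.
example_trials = [(100, 2), (150, 3), (240, 2)]
```

Here the three-arm trial of 150 participants has the same group size (50) as the two-arm trial of 100, so it gains no advantage from its larger total.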


Six systematic reviews were found. The review of beta-agonists included 16 trials; oxytocin receptor antagonists, six trials; cox inhibitors, 13 trials; calcium channel blockers, 12 trials; magnesium sulphate, 23 trials; and nitric oxide donors, five trials. The number of trials versus placebo or versus another tocolytic for each review is shown in Table 1.

Table 1.  The number of trials versus placebo or versus another tocolytic for each Cochrane review included in the analysis
Tocolytic                        Versus placebo   Versus other
Oxytocin receptor antagonists    2                4
Cox inhibitors                   3                10
Calcium channel blockers         0                12
Magnesium sulphate               3                20
Nitric oxide donors              1                4

The quality of randomisation is shown in Table 2, and the numbers of trials with adequate blinding of the intervention, adequate follow up, and blinding of the outcome assessment are shown in Table 3. Of note, in four of the five oxytocin receptor antagonist trials, the Cochrane reviewer recorded uncertainty as to whether the trial had avoided detection bias despite the trial having been recorded as double-blind. Since it is unlikely that the blinding would have been broken before outcome assessment, this may have been an accidental misclassification by the Cochrane reviewers.

Table 2.  The quality of randomisation for trials included in Cochrane reviews
Tocolytic   A: low risk of bias   B: unclear   C: high risk of bias
Beta-agonists5 (all pharmacy)11 
Oxytocin receptor antagonists6 (all pharmacy)
Cox inhibitors12 (mixed)1
Calcium channel blockers9 (all envelopes)3
Magnesium sulphate9 (mixed)122
Nitric oxide donors5 (all envelopes)

Table 3.  Methods used to avoid bias
Tocolytic   Performance bias (blinding of treatment group)   Attrition bias (follow-up rate)   Detection bias (blinding of outcome assessments)
            Yes / No / Unclear   Good or >90% / Bad or <90% / Unclear   Yes / No / Unclear
Oxytocin receptor antagonists660015*
Cox inhibitors7611285
Calcium channel blockers01266012
Magnesium sulphate2216413221
Nitric oxide donors1511314

* In four of these trials, the intervention had been recorded as double-blind. Since it is unlikely that the blinding would have been broken before outcome assessment, this may have been an accidental misclassification by the Cochrane reviewers.

Table 4 shows the mean sample size in each treatment group for each trial and whether or not a sample size calculation was reported.

Table 4.  Mean sample size in each group and the number of trials that reported a sample size calculation
Tocolytic                        Mean group sample size   Sample size calculation
Oxytocin receptor antagonists    126                      5/6
Cox inhibitors                   31                       4/13
Calcium channel blockers         43                       4/12
Magnesium sulphate               41                       3/23
Nitric oxide donors              46                       1/5
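
Because the reviews differ in size, the fractions in Table 4 are easier to compare as percentages. The short Python sketch below does this arithmetic using only the counts listed in the table; the variable and function names are illustrative.

```python
# (trials reporting a sample size calculation, trials in the review),
# copied from the rows of Table 4 as printed.
reported = {
    "Oxytocin receptor antagonists": (5, 6),
    "Cox inhibitors": (4, 13),
    "Calcium channel blockers": (4, 12),
    "Magnesium sulphate": (3, 23),
    "Nitric oxide donors": (1, 5),
}


def percent(n: int, total: int) -> float:
    """Percentage of trials, rounded to one decimal place."""
    return round(100 * n / total, 1)


percentages = {name: percent(n, total) for name, (n, total) in reported.items()}

for name, value in percentages.items():
    print(f"{name}: {value}%")
```

On these counts, only the oxytocin receptor antagonist review has a majority of trials reporting a sample size calculation.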


These data suggest that the trials of beta-agonists and oxytocin antagonists are generally of the best quality. Many of the trials of the other tocolytics are susceptible to several sources of bias. The only area in which the oxytocin receptor antagonist trials are not of the highest quality is the avoidance of detection bias. If we assume that four of these were misclassified, as seems likely, then the oxytocin antagonist trials are the best on all quality measures.

The strength of this review lies largely in the sources used. The Cochrane collaboration has a high reputation for the thoroughness of its search strategy and the quality of its reviews. The weakness is that there was still some scope for bias in the way the reviewers classified trial quality. None of the reviews used blinded reviewers to judge the quality, and none recorded how many reviewers they used or whether any of the reviewers had a conflict of interest. In two of the reviews, one or more of the review authors were also authors of one or more of the included trials. Nevertheless, these conclusions are probably robust because most of the quality score items are fairly objective.


The main implication of these findings is that more high-quality primary research is required into all types of tocolysis. Even the largest trials reported so far are too small to test the effect of tocolysis on important neonatal outcomes. If it is considered that the effect on neonatal outcomes is likely to vary considerably by gestational age and fetal condition, the reviews are even more severely underpowered.