Dr LD Huusom, Department of Gynecology and Obstetrics, Kettegaard Allé 30, 2650 Hvidovre, Denmark. Email firstname.lastname@example.org
Please cite this paper as: Huusom L, Secher N, Pryds O, Whitfield K, Gluud C, Brok J. Antenatal magnesium sulphate may prevent cerebral palsy in preterm infants—but are we convinced? Evaluation of an apparently conclusive meta-analysis with trial sequential analysis. BJOG 2011;118:1–5.
Cerebral palsy is a chronic, nonprogressive disability of the central nervous system that carries major personal and socioeconomic burdens. For preterm infants, the risk of cerebral palsy increases with decreasing gestational age at birth.1 Several studies have indicated that magnesium sulphate given to women at risk of preterm birth may be neuroprotective for the fetus.2 Meta-analyses of the data from randomised clinical trials (RCTs), comparing magnesium sulphate against placebo for women at risk of preterm birth, have revealed a significant preventive effect on cerebral palsy in the child without observed serious adverse drug effects.2–4 The meta-analyses found no overall difference in mortality (fetal and later deaths) between the magnesium sulphate and placebo groups.2–4
Recently, an updated Cochrane review concluded, ‘The neuroprotective role for antenatal magnesium sulphate therapy given to women at risk of preterm birth for the preterm fetus is now established’.2 The review reported a 32% reduced risk of cerebral palsy when antenatal magnesium sulphate therapy was given [relative risk (RR), 0.68; 95% confidence interval (CI), 0.54–0.87].2
To our knowledge, however, most obstetric departments have not adopted magnesium sulphate as standard treatment for women at risk of preterm birth.5,6 This discrepancy with the evidence could have several explanations: unawareness of the evidence, an effect size considered too small to be clinically relevant, lack of resources, or clinicians' disbelief in the results because of, for example, risk of bias or risk of random error. Previous risk-of-bias assessments have not reported major limitations. Thus, this surprising discrepancy between evidence and clinical practice spurred us to ask whether the results of the meta-analyses are statistically robust.
Conclusions based on too little data are at risk of random error (‘play of chance’), which may have caused a false ‘positive’ meta-analytic result. Surprisingly, the risk of random error is rarely assessed in meta-analyses, and was not assessed in the meta-analyses of the effect of magnesium sulphate on cerebral palsy.2–4 Here, we briefly outline the rationale for assessing random error risk in meta-analyses and apply a statistical method that adjusts for such risk: trial sequential analysis (TSA).7 TSA methodology determines the level of existing evidence and estimates whether additional evidence is needed before acceptance or rejection of magnesium sulphate as standard treatment for women at risk of preterm birth to prevent neonatal cerebral palsy. TSA may serve as a tool for the quantification of the reliability of cumulative data in meta-analyses, but it has not been presented previously to obstetricians.
Trial sequential analysis
Meta-analyses are commonly updated when new trials are published. At least four meta-analyses have been published on magnesium sulphate.2–4,8 Such repeated significance testing, if always performed with the conventional P value criterion (P ≤ 0.05), exacerbates the risk of random errors. The situation is comparable with interim analyses of a single trial. TSA combines an a priori calculated required information size for a meta-analysis with the adaptation of monitoring boundaries to evaluate the accumulating data (i.e. meta-analytic updates).7 The required information size calculation is similar to the sample size calculation in a single trial. It requires: (1) a realistic event proportion in the control group, (2) a minimal relevant intervention effect, which is judged to be clinically worthwhile and biologically plausible, and (3) a desired maximum risk of statistical errors [type 1 error (α) and type 2 error (β)]. Once the required information size is calculated, trial sequential monitoring boundaries can be adapted as new trials are published and meta-analyses are updated over time.7 When trials (and patients or events) are few, such monitoring boundaries demand a more conservative P value threshold, and a correspondingly adjusted 95% CI, before a result can be claimed as statistically significant.
We selected the data from the 5357 high-risk infants of less than 34 weeks of gestational age who were included in the recent Cochrane review.2 All five included RCTs compared magnesium sulphate against placebo for women at risk of preterm birth. Overall, a significant effect of magnesium sulphate compared with placebo on cerebral palsy was seen (RR, 0.70; 95% CI, 0.54–0.90) using the ‘traditional’ random-effects model meta-analysis (Figure 1).
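The pooled estimate above can be translated into the Z statistic that a cumulative Z-curve plots. The following is a minimal sketch, assuming that the confidence interval is symmetric on the log relative risk scale (the usual normal approximation); the function name `z_and_p_from_rr` is ours and is not part of any TSA software.

```python
from math import log
from statistics import NormalDist


def z_and_p_from_rr(rr, ci_low, ci_high, level=0.95):
    """Recover Z and the two-sided P value from a relative risk and its CI.

    Assumes the CI was built as log(RR) +/- z_crit * SE (normal approximation).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - (1 - level) / 2)   # about 1.96 for a 95% CI
    se_log_rr = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(rr) / se_log_rr                      # negative when RR < 1
    p = 2 * (1 - norm.cdf(abs(z)))               # two-sided P value
    return z, p


# Random-effects estimate quoted in the text: RR 0.70 (95% CI 0.54-0.90)
z, p = z_and_p_from_rr(0.70, 0.54, 0.90)
print(f"Z = {z:.2f}, two-sided P = {p:.3f}")
```

With the conventional criterion, any |Z| above about 1.96 counts as significant. A trial sequential monitoring boundary replaces that fixed threshold with a stricter one while the accumulated information remains small.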
We calculated the required information size and constructed the trial sequential monitoring boundaries for the outcome cerebral palsy. All TSAs were calculated using a control event proportion of 5% and an a priori effect size of 25% RR reduction. These variables were based on a conservative interpretation of the data obtained in the previous meta-analyses.2–4 Type 1 errors of 5 and 1% were applied and the type 2 error was set to 10%. We explored the extent to which the conclusion based on the meta-analysis remained reliable when accounting for exacerbated risk of random error caused by sparse data and repetitive testing. Further, we calculated whether additional participants and events (information size) were needed to obtain firm evidence. The exact method for the calculation of the required information size and the construction of monitoring boundaries can be found elsewhere.7,9
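To make the inputs of the information size calculation concrete, here is a minimal sketch using a textbook two-group sample size approximation for a binary outcome. It is an assumption on our part that this approximation is close to the method cited above; it ignores heterogeneity, the function name `required_information_size` is hypothetical, and the result need not match the figures produced by the exact method.

```python
from statistics import NormalDist


def required_information_size(control_rate, rrr, alpha, beta):
    """Approximate total participants (both arms) for a two-sided test.

    control_rate: event proportion in the control group
    rrr:          a priori relative risk reduction
    alpha, beta:  maximum type 1 and type 2 error risks
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_beta = norm.inv_cdf(1 - beta)
    treated_rate = control_rate * (1 - rrr)
    delta = control_rate - treated_rate          # absolute risk difference
    p_bar = (control_rate + treated_rate) / 2    # average event proportion
    variance = p_bar * (1 - p_bar)
    return 4 * (z_alpha + z_beta) ** 2 * variance / delta ** 2


# Inputs stated in the text: 5% control events, 25% RRR, beta = 10%
for alpha in (0.05, 0.01):
    n = required_information_size(0.05, 0.25, alpha, 0.10)
    print(f"alpha = {alpha:.2f}: about {n:,.0f} participants in total")
```

The sketch shows the direction of each dependency: a smaller α, a smaller β, or a smaller anticipated effect all increase the information size, and the anticipated effect enters through the squared risk difference, so it has the strongest influence.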
Two TSAs are provided (Figures 2 and 3). TSA with a type 1 error of 5% demonstrated that the required information size is 11 776 participants. The cumulative Z-curve after inclusion of 5357 infants with a gestational age of <34 weeks shows a positive intervention trend, but this effect does not reach statistical significance as it does not cross the monitoring boundary (adjusted 95% CI, 0.47–1.04) (Figure 2). The estimated number of additional participants required in an RCT to cross the monitoring boundary and obtain firm evidence is approximately 400 patients, i.e. 200 patients in each intervention arm, with an anticipated 8 events in the magnesium sulphate arm and 10 events in the placebo arm.
TSA with a type 1 error of 1% found that the required information size is 16 675 participants. The cumulative Z-curve after inclusion of 5357 infants with a gestational age of <34 weeks did not cross the monitoring boundary (adjusted 95% CI, 0.36–1.33) (Figure 3). The estimated number of additional participants required in an RCT to cross the monitoring boundary and obtain firm evidence is about 4000 patients, i.e. 2000 patients in each intervention arm, with an anticipated 75 events in the magnesium sulphate arm and 100 events in the placebo arm.
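The anticipated event counts quoted for the additional trial follow directly from the assumed control event proportion (5%) and a priori effect size (25% RR reduction). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def expected_events(patients_per_arm, control_rate=0.05, rrr=0.25):
    """Return (treated_events, placebo_events) expected in one trial arm each."""
    placebo_events = patients_per_arm * control_rate
    treated_events = placebo_events * (1 - rrr)
    return treated_events, placebo_events


for per_arm in (200, 2000):
    treated, placebo = expected_events(per_arm)
    print(f"{per_arm} per arm: {treated:.1f} events on magnesium sulphate, "
          f"{placebo:.1f} on placebo")
```

For 200 patients per arm the expectation in the magnesium sulphate arm is 7.5 events, which the text reports rounded up to 8.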
The use of TSA to assess the data for the effect of antenatal magnesium sulphate on cerebral palsy revealed that the apparently conclusive beneficial effect resulting from the meta-analyses may, in fact, be a false positive result because of a risk of random error. Such a random error may occur because of too sparse data or repetitive testing of data (multiplicity). Thus, more data may be needed before implementing this intervention as a standard therapy. Depending on the individual criteria (control events proportion, RR estimated in previous RCTs, and the desired type 1 and 2 errors), 400–4000 patients should be enrolled in RCTs before firm statistical evidence may be achieved. However, the decision to proceed with an additional RCT is ethically difficult. On the one hand, such an RCT might establish the benefits of magnesium sulphate beyond the risk of random error from a statistical point of view. On the other hand, it would postpone the introduction into clinical practice of an intervention that several meta-analyses claim to be beneficial. Thus, women giving birth to preterm infants with subsequent cerebral palsy may question why magnesium sulphate was not given to them before birth.
Most obstetric departments have still not implemented magnesium sulphate as standard treatment for women at risk of preterm birth, despite the encouraging result from the Cochrane review. One reason may be that, despite a statistically significant result (P = 0.006), the intervention might not be considered clinically relevant. However, even the worst-case scenario according to the 95% CI still provides a considerable risk reduction of 13% (number needed to treat, 167) on an important clinical outcome (cerebral palsy). Clinicians may also sense the need for more data or be unaware of the results. Other reasons for not implementing a new treatment regimen could be that it is difficult to administer, is costly or carries a considerable risk of adverse events. However, magnesium sulphate is relatively cheap and easy to administer. It has been used for decades in obstetrics for purposes other than fetal neuroprotection, e.g. prevention and treatment of eclampsia and for tocolysis. Hence, it is a well-known medication and only minor maternal adverse events have been reported in the literature. However, it is known that adverse events are usually insufficiently reported in RCTs,10 and the risks of unintentional overdosing and of long-term adverse events in the infant may not be captured in these RCTs, and may thus be underestimated. The optimal treatment regimen of magnesium sulphate remains unknown. The RCTs used different regimens, but the Cochrane review found no significant difference between different loading and maintenance subgroups.2 It is not known whether tocolytic agents interact with magnesium sulphate, and none of the RCTs have looked into this issue. A survey of the opinion of clinicians could address the reasons for not implementing the Cochrane recommendations on magnesium sulphate, and would provide interesting information.
In the calculation of the required information size, we did not consider different type 2 error risks or heterogeneity of the meta-analysis (either inconsistency or diversity).11 This was in order to keep our messages as simple as possible. The choice of the a priori effect size could be considered to be data-driven as it was based on previous meta-analyses. A more conservative effect size would have provided a more conservative result, and vice versa (Table 1). A type 2 error risk of 20% instead of 10% would, for example, have provided a borderline statistically significant result (P = 0.05) (Table 1). In contrast, heterogeneity adjustments would probably increase the need for additional participants. Therefore, before anyone plans to conduct further trials on magnesium sulphate, we recommend that such issues should be considered in estimating the required sample size of the trial.12 Crossing of the sequential boundaries has been used to stop clinical trials before reaching the planned sample size. The standards for testing statistical significance in meta-analyses should be at least equal to those of an RCT. TSA mimics such standards. The trialist and the meta-analyst share an obligation to avoid the dissemination of misleading results. The trialist has an additional obligation to terminate the trial if a substantial treatment effect that clearly outweighs harm becomes evident. The meta-analyst cannot terminate the accumulation of evidence directly and can only provide all available evidence and analyse it with the best available methods. We have provided such evidence for antenatal magnesium sulphate, but it would be interesting to apply a similar method to other obstetric meta-analyses.
Table 1. Trial sequential analysis on antenatal magnesium sulphate with random error risk adjusted to 95% CI according to different type 1 and 2 error risks and estimated effect size
CI, confidence interval; RRR, relative risk reduction (the a priori estimated effect size); α, risk of type 1 error; β, risk of type 2 error.
*Statistically significant result when adjusted for random error risk.
α = 5%, β = 10%
α = 1%, β = 10%
α = 5%, β = 20%
α = 1%, β = 20%
The potential fetal neuroprotective effect of magnesium sulphate observed so far is very promising, but we believe that it is reasonable to require solid evidence before implementing a new intervention. There is a risk that the current meta-analyses are misleading, just as the early meta-analyses of magnesium for myocardial infarction proved to be after the ISIS-4 and MAGIC RCTs.13 The use of TSA revealed a possible false positive effect of magnesium sulphate caused by random error, and a need for further RCTs to obtain firm evidence to recommend or refute magnesium sulphate.
Magnesium sulphate seems to be a promising intervention to reduce the risk of cerebral palsy in preterm infants. However, we may still risk the premature dissemination of an intervention because of false positive results in the conducted meta-analyses. Further, the risks of adverse events may be underestimated. Future consensus debates should focus on the level of additional evidence that is needed, if any, and subsequently initiate the required RCT. We need strong evidence of robust, consistent effects in RCTs that have enrolled adequate numbers of participants before mandating a new therapy for the management of all relevant patients. However, we must consider the ethical issue of spending several years on new RCTs, when magnesium sulphate has been used for decades in obstetrics. Deciding whether to treat, and whom to treat, with magnesium sulphate requires a difficult balance of benefits, harms and resources.
Disclosure of interest
None of the authors have competing interests.
Contribution to authorship
All authors conceived and designed the study. JB and LH extracted and analysed the data, and drafted the article. All authors had constructive input and revised the text. LH is guarantor.
Details of ethics approval
Funding
No funding was required.
Acknowledgements
We have no acknowledgements.
Commentary on ‘Antenatal magnesium sulphate may prevent cerebral palsy in preterm infants—but are we convinced? Evaluation of an apparently conclusive meta-analysis with trial sequential analysis’
Numerous authors have shown that repeated updating of meta-analysis, such as that commonly conducted in the Cochrane Database of Systematic Reviews (2010), can lead to inflated type I error rates. Typically, in a single trial, the tolerable type I error rate (or false positive rate) is set to be 5%. This is the false positive rate for only one test. When more than one test is conducted, the overall type I error rate will exceed 5%, corresponding to an increased probability of obtaining a false positive result. The degree to which the false positive rate is inflated depends on a number of factors; for example, if five tests of independent effects are conducted, the probability of obtaining one or more false positives is inflated to 23%. Similarly, continued updating of meta-analysis can lead to an unacceptable (>5%) risk of false positive results.
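The 23% figure for five independent tests can be checked with one line of arithmetic: the chance of at least one false positive among k independent tests, each at level α, is 1 − (1 − α)^k. A minimal sketch follows. Note that successive updates of a cumulative meta-analysis reuse the earlier data and are therefore not independent, so this formula illustrates the principle rather than the exact inflation in that setting.

```python
def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 2, 5, 10):
    print(f"{k:2d} independent tests: {familywise_error(k):.1%}")
```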
Pogue and Yusuf (Lancet 1998;351:47–52; Controlled Clin Trials 1997;18:580–93, with discussion) remarked that the strategy of updating a meta-analysis when new studies were published was similar to taking multiple repeated looks at the data from a single clinical trial. They suggested that sequential analysis approaches developed for a single trial could be adapted to the cumulative meta-analysis setting. Wetterslev et al. (J Clin Epidemiol 2008;61:64–75) extended this approach to account for heterogeneity between trials. In this issue, Huusom et al. (BJOG 2011;118:1–5) revisit the recent Cochrane review of magnesium sulphate use in women at risk of preterm birth for the prevention of cerebral palsy in offspring (Doyle et al., Cochrane Database Syst Rev 2009; DOI: 10.1002/14651858.CD004661.pub3), applying a simple case of the Wetterslev method to assess random error risk in the meta-analysis. They find that this risk is non-negligible, suggest that the meta-analysis is in fact inconclusive, and recommend that 400–4000 additional subjects (assuming 50:50 randomisation between two groups) may be needed before a conclusive result is reached. For the sake of simplicity, they did not account for heterogeneity in this analysis, and therefore their estimated range of additional subjects is probably somewhat low.
Although methods for sequential meta-analysis (Wetterslev et al., J Clin Epidemiol 2008;61:64–75; van der Tweel and Bollen, Clin Trials 2010;7:136–46) have not yet had their statistical properties comprehensively studied, they will help to control the type I error rate in this cumulative meta-analysis setting (Nüesch and Jüni, Int J Epidemiol 2009;38:298–303). Because a cumulative meta-analysis is constantly updating prior information with new data, a Bayesian approach is a natural and very attractive alternative (Spiegelhalter et al., Br Med J 1999;319:508–12).
Although these methods show considerable promise, they do not yet deal with errors arising from the inclusion of flawed studies or publication bias. Traditional meta-analytic methods, including the use of both funnel plots and stratified analysis (including tests of trial characteristic by effect estimate interactions), should accompany any meta-analysis.
What next? Given that patients may not wish to be randomised in the light of the 2009 Cochrane review, future investigators of randomised trials may wish to consider adaptive designs, such as Zelen’s famous ‘play the winner rule’ (J Am Stat Assoc 1969;64:131–46), and its many adaptations, to allow the allocation ratio between magnesium sulphate and placebo to depend on the observed success rate. (The basic idea of this strategy is to allocate more patients to the superior treatment. In a very simple case, one allocates the first patient to either treatment by randomisation. If this treatment is a success, the next patient receives it; if the treatment fails, the next patient receives the other treatment.) Regardless, as Huusom et al. show, the magnesium sulphate story may not yet be complete.
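The simple case of the play-the-winner rule described in parentheses above can be sketched as a short simulation. The arm labels and success probabilities below are purely hypothetical and are not estimates for magnesium sulphate or placebo; the function name is ours.

```python
import random


def play_the_winner(n_patients, success_prob, rng):
    """Allocate patients by the simple play-the-winner rule.

    success_prob: dict mapping exactly two arm labels to a success probability
                  (hypothetical values, used only to simulate outcomes).
    rng:          a random.Random instance, so that runs are reproducible.
    """
    arms = list(success_prob)
    current = rng.choice(arms)                # first patient is randomised
    allocations = {arm: 0 for arm in arms}
    for _ in range(n_patients):
        allocations[current] += 1
        success = rng.random() < success_prob[current]
        if not success:                       # failure: switch to the other arm
            current = arms[1] if current == arms[0] else arms[0]
    return allocations


# Hypothetical success probabilities, for illustration only
counts = play_the_winner(1000, {"A": 0.8, "B": 0.6}, random.Random(2011))
print(counts)
```

Because each arm is kept until it fails, the arm with the higher success probability tends to accumulate longer runs and therefore more patients, which is the property that makes the design attractive when patients are reluctant to be randomised 50:50.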