Regulatory and methodologic challenges to tocolytic development


  • TM Goodwin

    Corresponding author
    1. University of Southern California, Women’s and Children’s Hospital, Los Angeles, CA, USA
      Prof TM Goodwin, University of Southern California, Women’s and Children’s Hospital, 1240 North Mission Road, Room 5K-40, Los Angeles, CA 90033, USA.



The development of tocolytic medications faces challenges common to all drug development programmes, principally related to evolving understanding of the pathophysiology. There are unique impediments to drug development for pregnancy-related conditions in general and for tocolysis in particular. The purpose of this brief overview is to familiarise the obstetrician with the current challenges to drug development, focusing in particular on the problems of tocolytic development. A strategy for encouraging drug development for preterm labour and for pregnancy-related problems in general is presented.


Most clinicians in the USA and Europe are not familiar with the process by which regulatory agencies approve medications for use, or with the factors that influence the likelihood that an agent will get through the lengthy process of development and approval. The purpose of this brief overview is to familiarise the obstetrician with the current challenges to drug development, focusing in particular on the problems of tocolytic development.

Among the few drugs approved for use in pregnancy and still manufactured in the USA are the dinoprostone PGE2 insert, oxytocin and three agents used in anaesthesia—nalbuphine, oxymorphone and ropivacaine. The vast majority of medications used during pregnancy are used off-label, even in circumstances where the indications, such as tocolysis in spontaneous preterm labour, are specific for pregnancy. Pregnant women can be considered as ‘therapeutic orphans’. Why is there so little drug development for pregnancy-related problems? This is in part due to issues unique to obstetrics. There is another set of problems common to many areas of medicine, and first, I would like to outline these.

Trends in drug development

From 1993 to 2003, spending on research, as reflected by the total budget of the National Institutes of Health, increased more than 2.5-fold. In this same time period, US Pharmaceutical Research and Development spending doubled. Nevertheless, submissions to the US Food and Drug Administration (FDA) of new molecular entities and biological licence application submissions declined steadily over the same time frame. Much of the reason for this is related to the costs of bringing a new medication to the market. Estimated total costs of drug development from discovery through drug launch from 1995 to 2000 were approximately $1.1 billion. A similar analysis for the time period 2000–2002 showed that costs had risen to $1.7 billion. The largest increase in this time period was in phase II and phase III development, i.e. clinical trials in humans.1

These changes in costs create a major barrier to investment, especially with regard to innovative or higher risk drugs or therapies for uncommon diseases. Almost all conditions in pregnancy would fall into these groups. Because the cost of failure is high, there is a tendency to concentrate efforts on products that have a very high potential market return.

If one examines the path of medical product development, from basic research through the discovery phase, preclinical development, clinical development and finally FDA filing and approval, there has been asymmetric progress. The great advances in basic science research in recent years have failed to translate into benefits to participants in part because of the inefficiency of the process that would bring these basic science developments to the participant. Basic science research and discovery research may be conducted at the so-called ‘cutting edge’. But the process of clinical development (the so-called ‘critical path’) relies on tools that have not changed in a generation. There is a widening gap between the basic biomedical knowledge and the clinical application.2 The problem is illustrated by the fact that of all drugs entering phase I of clinical development in 1985, 14% were likely to achieve approval and eventually be marketed. Fifteen years later, only 8% of the drugs that entered phase I of clinical development made it to the participant. Advances in basic science research have not been complemented by advances in understanding of how to conduct clinical studies of the new entities born out of the basic research. This has been seen in obstetrics and in tocolytic development in particular. There is an urgent need to develop modelling, biomarkers and clinical trial endpoints that will make the development process more efficient and more likely to benefit participants.

Tocolytic drug development

With respect to tocolytic development in particular, there have been three programmes in the past 40 years: ritodrine, which culminated in FDA approval in 1979; hexoprenaline, approved by the FDA in 1990 (although never marketed); and atosiban, not approved by the FDA after the Advisory Committee hearing in April 1998. Even though spontaneous preterm labour is widely acknowledged to be one of the major public health problems in Western societies, there has been very little development in pharmacological approaches to address this problem. This can be attributed to at least three inter-related factors.

First is the relatively small market for tocolytic drugs. Because rising development costs discourage companies from pursuing innovative drugs, or drugs for relatively small markets, the chance of a drug being developed for tocolysis is markedly reduced. Compared with actual market sales for the so-called ‘blockbuster drugs’ in the USA for 2004, which averaged almost $5 billion, annual sales for a tocolytic agent have been estimated at well under $500 million. The exact size of the tocolytic market is unclear because the natural history of spontaneous preterm labour is not well understood. This will be discussed in more detail later. It has been estimated that there are fewer than 80 000 subjects who could be treated for spontaneous preterm labour annually in the USA. When atosiban was first proposed for development in the USA, there was an attempt to have it developed as an orphan drug because it was believed that the number of potential subjects for treatment was small enough to qualify under the orphan drug statute. Although this approach was not ultimately adopted, it does highlight the small size of the market.

The second issue is liability. Institutional review boards, as well as participants and physicians, have heightened concern in dealing with any drug that has a potential effect on the fetus and on the mother. Informed consent during labour has been questioned: can a woman in the changing environment of active labour give adequate informed consent? There does not seem to be a reasonable alternative, but it does make the process of study more difficult. The problem of liability in general adds to the more than two-fold higher cost of performing phase II and III trials in the USA compared with Europe.

A third area of difficulty is the unique complexity of obstetric studies. In contrast to other areas of medicine, there are two patients (mother and baby) who need to be considered during all aspects of the study design with regard to both efficacy and safety. In addition, there are no templates of successful study design in tocolytic studies, since the only drugs previously approved by the FDA were approved under standards that are now deemed inadequate. At the present time, no tocolytic drug that has been presented to the FDA has consistently shown neonatal benefit in controlled trials, and limited understanding of the fundamental bases of the spontaneous preterm labour syndrome still impedes progress.

The conclusions of the FDA Atosiban Hearing of April 1998 still largely control the debate about tocolytic development. It is therefore important to examine these in detail.

Conclusions and recommendations of the atosiban advisory committee of 1998

Placebo trial

One of the principal conclusions was that a placebo-controlled trial was necessary to show efficacy and to receive approval from the FDA. The problem of placebo for a study of spontaneous preterm labour is significant. An informal poll of centres conducted before a follow-up FDA meeting on atosiban in 2000 found that only 32 of 175 US centres (18%) would consider a placebo trial, and 26 of those 32 would do so only if rescue therapy was allowed. Only 6 of 175 centres believed that they could conduct a true placebo-controlled trial with no rescue therapy. The actions of individual investigators during the atosiban trial were consistent with significant reservations about conducting a placebo trial. Although investigators were only required to adhere to a 1-hour placebo versus active drug time period before allowing rescue therapy, 20% of subjects overall had results that were censored because of early administration of rescue tocolysis. Much anecdotal evidence suggests that women themselves may be reluctant to participate in a placebo-controlled trial, and this may vary widely between centres and from country to country. Some institutional review boards in the USA have expressed resistance to the idea of a placebo-controlled trial when the standard of care in most communities is to administer a tocolytic drug with the goal of allowing corticosteroid administration, an intervention that is associated with proven improved neonatal outcomes.

A resolution to the problem of conducting a pure placebo trial was proposed in the form of rescue therapy. This option was pursued in the atosiban-096 trial.3 This turned out to be a problematic alternative. Allowing rescue therapy with a specific agent or with the community standard diminishes the chance of showing benefit of the active drug and, moreover, complicates interpretation of the results. Although the atosiban-096 trial was approved by the FDA and was designed in conjunction with the FDA, the use of rescue therapy ultimately created irremediable problems with interpreting the trial results.

A second alternative to the placebo trial is that of an active control trial. Unfortunately, this is not possible in the USA at the present time, as there is no approved drug for tocolysis still marketed. An active control trial for the FDA can only be conducted against an agent approved for that indication.


A second major area in which the Atosiban Advisory Committee of 1998 set the standard for future study was the insistence that future approval of tocolytic drugs must be based on showing improved outcomes in the offspring. In other words, the prior standard of delay in delivery that had been successful at the time of the Hexoprenaline Hearing in 1990 was no longer acceptable. The various endpoints that have been proposed include perinatal mortality and a number of measures of neonatal morbidity. Perinatal mortality has fallen to such an extent that it can no longer be studied alone as an endpoint. Any single endpoint of neonatal morbidity is also difficult to study. The only outcome that occurs with sufficient frequency is respiratory distress syndrome. However, a sample size estimate using the known prevalence of respiratory distress syndrome in populations of women in spontaneous preterm labour diagnosed by uterine contractions and cervical change (12–15%) shows that such a study would have to be twice as large as the largest tocolytic trial ever performed to show a significant difference. In any case, regulatory agencies are now reluctant to accept respiratory distress syndrome as a meaningful endpoint for tocolytic efficacy. Most investigators are now considering various composite fetal and neonatal outcomes as a means of demonstrating efficacy.
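
To illustrate how quickly a trial on a single morbidity endpoint grows, the following is a minimal sketch of a standard two-proportion sample size calculation (normal approximation). It is not the estimate referred to above: the function name `n_per_arm`, the two-sided alpha of 0.05, the power of 0.80 and the 25% relative reduction in the usage lines are all assumptions chosen for illustration. Only the 12–15% prevalence range comes from the text.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_control, relative_reduction, alpha=0.05, power=0.80):
    """Subjects needed per arm to detect a relative reduction in an event rate.

    Uses the usual normal-approximation formula for comparing two
    independent proportions with a two-sided test.
    """
    if not 0.0 < p_control < 1.0:
        raise ValueError("p_control must be between 0 and 1")
    if not 0.0 < relative_reduction < 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")

    p_treat = p_control * (1.0 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)               # quantile for desired power
    p_bar = (p_control + p_treat) / 2.0                # pooled rate under the null

    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p_control * (1.0 - p_control) + p_treat * (1.0 - p_treat))
    ) ** 2
    return ceil(numerator / (p_control - p_treat) ** 2)


# Hypothetical scenario: a 25% relative reduction in respiratory distress
# syndrome at the two ends of the prevalence range quoted in the text.
n_at_15_percent = n_per_arm(0.15, 0.25)
n_at_12_percent = n_per_arm(0.12, 0.25)
```

Under these assumed inputs the required number per arm runs well into four figures, and it rises further as the baseline prevalence falls, which is the difficulty the paragraph above describes.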

Almost all of the proposed composite morbidity schemes include the following outcomes: perinatal mortality, respiratory distress syndrome, intraventricular haemorrhage, necrotising enterocolitis and sepsis. Other outcomes, which are included in some composite neonatal morbidity scales but not in others, include chronic lung disease (bronchopulmonary dysplasia), periventricular leucomalacia, patent ductus arteriosus, pulmonary hypertension, retinopathy of prematurity, bowel perforation, seizures and intubation. Minor outcomes include intensive care unit admission, jaundice, transient tachypnoea of the newborn and metabolic disturbances. To date, there has been little attempt to balance these composite endpoints. Respiratory distress, for example, is given the same weight as necrotising enterocolitis, whereas these two complications are of completely different significance in terms of short- and long-term morbidity. As shown in Figure 1, the major neonatal morbidities that would be proposed to be included in such scales occur with appreciable frequency from 24 weeks up to 30 weeks of gestation.4 After 30 weeks, and certainly by 32 weeks, the frequency of all these except respiratory distress is below 10%. A number of studies have used such composite endpoints. A recently published study of outcomes related to preterm prelabour rupture of membranes shows a composite major morbidity rate at 32 weeks of gestation of just over 20%.5 Nevertheless, almost all the major morbidity after 30 weeks of gestation was due to respiratory distress.

Figure 1.

Major neonatal morbidity by gestational age. Regional data from the University of Southern California combined with IVH data from the Epipage Study.4 RDS, respiratory distress syndrome; PDA, patent ductus arteriosus; IVH, intraventricular haemorrhage; CLD, chronic lung disease; NEC, necrotising enterocolitis.

The challenge of identifying appropriate outcomes for drug development in many other areas of medicine has been addressed by reliance on surrogate endpoints. Such endpoints are generally acceptable to regulatory agencies. One example is in the area of AIDS drug development. A predictable response of CD4 counts is considered to be an acceptable endpoint since it is predictive of subsequent survival and diminished morbidity, and its use has significantly shortened the period of drug development. Similarly, the study of antihypertensives has been expedited by use of frequent automated blood pressure measurements. Unfortunately, in obstetrics, there are few surrogate endpoints that are reliably linked to improved outcomes at the present time. This is something that is urgently needed.

Closely related to the problem of selecting the appropriate endpoint for study in tocolytic drug development is the problem of the study size. The atosiban-096 randomised, placebo-controlled trial, conducted in the mid-1990s in the USA, required more than 3 years and involved 35 centres in North and South America to enrol just over 500 women, despite using relatively broad diagnostic criteria for spontaneous preterm labour. Part of the problem in understanding exactly how many patients are needed in a study of spontaneous preterm labour is that its natural history is poorly understood. While many studies have looked at the natural history and epidemiology of preterm birth, there is very little information on the natural history of spontaneous preterm labour itself. The relationship between spontaneous preterm labour and preterm birth from the atosiban-096 study is shown in Figure 2. Eighty-five percent of the 501 women who were enrolled with spontaneous preterm labour before 34 weeks of gestation were enrolled after 28 weeks of gestation. Fifty percent of these women delivered after 37 weeks of gestation.

Figure 2.

Gestational age at admission and delivery: percent of total subjects in the atosiban-096 trial (n = 501).4 PTB, preterm birth; PTL, preterm labour.

Between 24 and 32 weeks of gestation (Figure 2), the gestational age at which composite neonatal morbidity may be common enough to demonstrate efficacy of a given intervention, only 13% (65/501) of all subjects delivered at or before 32 weeks. It has been proposed that the diagnosis of spontaneous preterm labour should be refined by using either fetal fibronectin (FFN) or transvaginal scanning of the cervix to identify a group of women who are more likely to deliver preterm. In a recent study, among women who presented with spontaneous preterm labour, 50% of those with a cervical length less than 1.5 cm delivered within 1 week.6 Similarly, of women who presented with spontaneous preterm labour, 43% were FFN-positive and 20% of these women delivered within the next 7 days. Nevertheless, 82% of women who presented with spontaneous preterm labour did not have a cervical length of less than 1.5 cm. Within such a population of women with spontaneous preterm labour with a high likelihood of delivering preterm and who could possibly show a benefit from tocolytic drug therapy, the number of women who can be enrolled is reduced dramatically.
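
The effect of such enrichment on recruitment can be made concrete with simple arithmetic. The sketch below applies the cervical length percentages quoted above to a cohort the size of the atosiban-096 trial. Combining figures from two different studies in this way is an assumption made purely for illustration, and `enriched_cohort` is a hypothetical helper name.

```python
def enriched_cohort(presenting, frac_marker_positive, frac_delivering_soon):
    """Return (marker-positive women, of whom expected to deliver soon)."""
    marker_positive = presenting * frac_marker_positive
    delivering_soon = marker_positive * frac_delivering_soon
    return marker_positive, delivering_soon


# 82% did not have a cervical length below 1.5 cm, so 18% did;
# 50% of that subgroup delivered within 1 week.
short_cervix, deliver_within_week = enriched_cohort(501, 1 - 0.82, 0.50)

print(f"Short cervix: about {short_cervix:.0f} of 501")
print(f"Delivering within 1 week: about {deliver_within_week:.0f} of 501")
```

On these assumptions, a trial restricted to women with a short cervix would draw on fewer than one in five of the women who present, so the time needed to recruit a given number of subjects would increase roughly five-fold.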

Another proposed endpoint for the study of tocolytics, and perhaps the most important, is that of long-term follow up into childhood. So far, only four trials have collected long-term follow-up data after tocolysis. The first was the Canadian Ritodrine Trial, published in 1992.7 Although this study reported on Bayley scores and other measures of developmental outcome at 2 years of age, this was a convenience sample, and no attempt was made to follow up the entire population.

Two-year follow-up data on infants exposed to atosiban and placebo in the atosiban-096 and -098 trials, conducted in the USA in the early and mid-1990s, were published in abstract form.8 Follow up at 2 years was approximately 60%. There were no observed differences in neurodevelopmental outcome between those exposed to atosiban and those exposed to placebo.

The Australian nitroglycerine trial (randomised nitric oxide tocolysis trial) reported in 2004 also had long-term follow up included in its protocol, but this information has not been published. Results of long-term follow up of infants exposed to nifedipine and ritodrine in two trials conducted in Holland in 1997 and 2000 were recently published;9 there was no difference in long-term follow up at school age. Sixty-one percent of children were able to be followed up.

Study size

A reminder of the challenges that are faced by investigators wishing to study a new tocolytic was recently presented at the Society for Gynecologic Investigation.10 Women presenting with regular uterine contractions and a Bishop score of >6 at 24 to 32 weeks of gestation, who had not received tocolysis previously, were randomised at 14 tertiary care hospitals in Canada to receive a nitroglycerine patch or placebo. The primary outcome was a composite of neonatal morbidity consisting of chronic lung disease, necrotising enterocolitis, grades II–IV intraventricular haemorrhage, periventricular leucomalacia or perinatal mortality. The 14 centres had catchment areas that were significantly larger than the average 4000 births per year in these centres.

A sample size of 600 was estimated to be needed to show a 38% reduction in composite neonatal morbidity. From June 2001 to June 2004, investigators at 15 centres were able to enrol 158 women. Many of the participating centres in this trial were part of the largest single tocolytic trial ever published, The Canadian Preterm Labor Investigators Ritodrine Trial, published in 1992.7 Factors for the inability to enrol women are still being analysed, but it is a sobering reminder of the difficulties faced by investigators wishing to study a new tocolytic in a randomised, placebo-controlled trial under the current circumstances in medical practice.
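
The scale of the shortfall can be seen by projecting the observed accrual rate forward. The sketch below uses only the figures given above (158 women enrolled over the 36 months from June 2001 to June 2004, against a target of 600) and assumes, for illustration only, that enrolment would have continued at a constant rate. `months_to_target` is a hypothetical helper name.

```python
def months_to_target(enrolled, months_elapsed, target):
    """Months needed to reach the target at the observed average accrual rate."""
    rate_per_month = enrolled / months_elapsed
    return target / rate_per_month


# June 2001 to June 2004 is 36 months.
projected_months = months_to_target(enrolled=158, months_elapsed=36, target=600)

print(f"Observed accrual: {158 / 36:.1f} women per month")
print(f"Projected time to 600 women: {projected_months / 12:.1f} years")
```

At the observed rate of between four and five women per month across all centres, completing enrolment would have taken more than a decade.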

The following questions remain critical for future drug development for tocolysis:

  1. Can a true placebo trial be performed?
  2. What are the proper endpoints?
  3. Can the study be conducted in a timely fashion?
  4. How can the FDA and other parties collaborate in such a study?

The following recommendations are offered for advancing the study of tocolytics:

  1. A comprehensive study of the natural history of spontaneous preterm labour and the association of spontaneous preterm labour and preterm birth with various surrogate markers for important outcomes is needed. Such a study of the natural history should be accompanied by an assessment of the feasibility of patient participation in a randomised, placebo-controlled trial.
  2. Development of a meaningful composite neonatal morbidity endpoint. Current composites weight all components of the score equally, which is counterintuitive and makes the score less robust.
  3. A carefully conducted poll of centres in the USA asking whether they would participate in a randomised, placebo-controlled trial should be undertaken and published. This would form the basis for discussions with regulatory agencies as to whether a placebo-controlled trial can actually be conducted.
  4. An increased cooperative effort with regulatory agencies is needed. Several of the problems facing tocolytic development cannot be solved without significant input from the FDA. It is fair to ask whether the FDA bears responsibility in part for the fact that there is no approved agent for tocolysis in the USA. Two of the three principal goals for the FDA Modernization Act of 1997 were intended to move the agency from an adversarial culture vis-à-vis industry and investigators to a cooperative one.11 The FDA should convene a summit meeting on methods for tocolytic phase III development. Such a meeting has been agreed to in principle and is urgently needed.
  5. Enactment of a ‘Best Pharmaceuticals for Pregnant Mothers Act’, similar to the Best Pharmaceuticals for Children Act. The status of pregnant women as ‘pharmaceutical orphans’ can only be partially addressed by clinician scientists. There must be recognition and a consensus that safe and efficient development of drugs for use during pregnancy, and for tocolysis in particular, is an important national goal.