Meta-analysis: phytotherapy of functional dyspepsia with the herbal drug preparation STW 5 (Iberogast)


Dr J. Melzer, Department of Internal Medicine, Complementary Medicine, University Hospital Zurich, Rämistrasse 100, CH-8091, Zurich, Switzerland.


Background : Despite the long-standing use of herbal drugs for dyspeptic symptoms, little attention has been paid to their clinical evaluation.

Aim : To assess efficacy and safety of the herbal drug preparation STW 5 (containing, e.g. Iberis, peppermint, chamomile) in the treatment of functional dyspepsia.

Methods : Searches of electronic databases and consultation of experts and of the producer identified STW 5 (Iberogast) as descriptor in six randomized-controlled trials. The raw data of three placebo-controlled studies that met the selection criteria were reanalysed and pooled for meta-analysis; one reference-controlled study supported the safety analysis (STW 5: n = 199, control: n = 198).

Results : Pooled data showed verum (n = 138) to be more effective than placebo (n = 135) with regard to the severity of the most bothersome gastrointestinal symptom (P-value: 0.001, odds ratio: 0.22, 95% CI: 0.11–0.47). A fourth randomized-controlled trial showed no significant difference between STW 5 and cisapride. As to safety, adverse events were similar with verum and placebo; no serious adverse events occurred.

Conclusions : From the point of view of efficacy and safety, the herbal medicinal product STW 5 appears to be a valid therapeutic option for patients seeking phytotherapy for their symptoms of functional dyspepsia.


Functional dyspepsia is a clinical syndrome characterized by chronic or recurrent symptoms experienced in or referred to the upper digestive tract.1 Nature and severity of the symptomatology are of little value in differentiating between organic dyspepsia and the more common functional dyspepsia (also described as ‘non-ulcer dyspepsia’ (NUD) or ‘irritable stomach’).2 Possibly, up to a quarter of the adult population suffers from dyspeptic symptoms, as a study focusing on 3-month prevalence showed, and these symptoms are responsible for about 5% of all visits to general practitioners3, 4 or use of other medical services.5 The diagnosis of functional dyspepsia is made by excluding other possible causes of the patient's symptoms. The nature of the disease is probably pleomorphic, and different potential underlying pathogenic mechanisms have been described, e.g. delayed gastric emptying, gastric hypersecretion, visceral hypersensitivity to distension or disturbed intragastric distribution of the meal. Moreover, epidemiology shows some relation between the occurrence of dyspepsia and recent events in a patient's life as well as psychosocial factors.6

Various diagnostic and therapeutic strategies have been proposed, each having its advantages and disadvantages.7, 8 Treatment of functional dyspepsia is still controversial, and probably there is no ‘fit all’ therapy. A Cochrane review9 concluded that prokinetics, H2-receptor antagonists and proton-pump inhibitors (PPI) achieved a significant relative risk reduction compared with placebo, of 48%, 22% and 14%, respectively. Prokinetics, however, in spite of having some beneficial effects, have somewhat fallen out of favour because of central nervous (e.g. metoclopramide) or cardiac adverse effects (e.g. cisapride).10 Bismuth salts, antacids and sucralfate were of limited or no interest, and Helicobacter pylori eradication therapy has a small but statistically significant effect in H. pylori-positive NUD.11

Herbal drugs have a long history of use in the treatment of dyspeptic complaints, either alone or in combination with other herbal drugs, as they contain, e.g. essential oils, which are known for their spasmolytic, carminative and local anaesthetic action. Their mechanisms of action are not completely understood. Findings suggest, however, that they modulate the activity of the smooth musculature of the digestive tract.12 In the past, little attention has been paid to an evaluation of herbal remedies in the therapy of patients with dyspeptic symptoms.13 The herbal medicinal product STW 5 (Iberogast; Steigerwald, Darmstadt, Germany) is a fixed combination of nine different herbal extracts (Table 1), each (except for Iberis amara totalis14) contained in a very low concentration compared with the dosages used in single drug treatment, as they are described, e.g. in monographs for Chelidonii herba, Liquiritiae radix, Matricariae flos, Melissae folium, Menthae piperitae folium,15 or for Cardui mariae fructus16 and Angelicae radix.17 Its clinical efficacy seems to be promising.18 Therefore, the present study analyses the respective clinical evidence in the treatment of functional dyspepsia.

Table 1.  Composition of STW 5

Drugs extracted (ethanolic 30%, DER 1:3)       Amount of extract (in 100 mL)
Angelicae radix (Garden angelica root)         10 mL
Cardui mariae fructus (Milk thistle fruits)    10 mL
Carvi fructus (Caraway fruits)                 10 mL
Chelidonii herba (Greater celandine)           10 mL
Iberis amara* (Bitter candy tuft)              15 mL
Liquiritiae radix (Liquorice root)             10 mL
Matricariae flos (Chamomile flowers)           20 mL
Melissae folium (Balm leaves)                  10 mL
Menthae piperitae folium (Peppermint leaves)   5 mL

DER, drug extract ratio.
* Ethanolic extract (50%) of fresh plant, DER 1:2.

Materials and methods

Search strategy

For meta-analysis, the following databases were searched, each from the date of its start to December 2003: TOXLINE, MEDLINE, HealthSTAR, AIDSLINE and CANCERLIT, Embase, AMED, Cochrane Collaboration. Search terms were STW 5, Iberogast, herbal, dyspepsia, dyspeptic and gastrointestinal disorders, phytotherapy. Additionally, reference lists from pertinent articles, reviews and books were scrutinized, and experts in this field as well as the producer of the herbal preparation were contacted. In case of double publications, the more recent one, or the one that had appeared in a peer-reviewed journal, was included in the study. Because of the paucity of published data, unpublished data were included as well, for instance reports submitted for registration of the product to German health authorities, e.g. to the Federal Institute for Drugs and Medical Devices (BfArM). This should also help to minimize publication bias.19 Then, to ensure that a meta-analysis of the published and unpublished data was based on reliable and comparable data, all raw data from the randomized-controlled trials (RCT) were reanalysed.

Selection criteria

Initially, all articles were considered for review that mentioned STW 5. Then, these were classified according to topic or indication, screened and weighted according to their methodological quality (methods, participants, interventions, outcome measures and results). Articles about the drug combination published before 1992 had to be discarded because they did not comply with current standards, neither with those of good clinical practice (GCP) nor with modern diagnostic criteria such as Rome II. To be included in the present study, trials had to be: double-blind, randomized, placebo-controlled, studying patients with functional dyspepsia, and complying with GCP standards or adequate statistical reporting. Studies that did not meet these criteria were excluded. In its analysis of the clinical data, the present study followed the guidelines provided by the Cochrane Collaboration Handbook for Reviews.20


When trials had been selected and respective raw data were available, the latter were reanalysed on the basis of intention-to-treat (ITT) and last observation carried forward (LOCF), applying the same criteria to all of them. Analysis of the selected studies with regard to diagnostic criteria and patients’ characteristics was performed according to the Rome II consensus criteria.21 These criteria classify functional gastroduodenal disorders into functional dyspepsia, aerophagia and functional vomiting. Functional dyspepsia is subclassified into ‘ulcer-like’, ‘dysmotility-like’ and ‘unspecified’ symptom groups. Furthermore, the analysis was performed according to the most bothersome symptom, i.e. the symptom attributed the highest score by the patient, to avoid both the generation of sum scores and the use of scales not sufficiently validated for clinical parameters. A validation trial showed that about one-third of the patients with functional dyspepsia additionally present symptoms of gastro-oesophageal reflux or of irritable bowel syndrome.22 All trials included in the present study investigated functional dyspepsia, i.e. complaints that proved negative in oesophagogastroduodenoscopy and upper abdominal sonography and did not show any relevant abnormalities in routine laboratory values either.
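The two data-handling rules described above (LOCF for missing visits, and taking the highest-scored symptom as the most bothersome one) can be sketched as follows. This is only an illustration: the numeric coding of the 5-point scale, the function names and the tie-breaking rule are assumptions, not details taken from the trials.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed numeric coding of the 5-point Likert scale (not specified in the text).
LIKERT = {"absent": 0, "mild": 1, "moderate": 2, "severe": 3, "very severe": 4}


def locf(visits: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Last observation carried forward: fill a missing visit (None)
    with the most recent observed score, if there is one."""
    filled: List[Optional[int]] = []
    last: Optional[int] = None
    for score in visits:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def most_bothersome(scores: Dict[str, int]) -> Tuple[str, int]:
    """Return the symptom with the highest score at one visit.
    Ties are resolved by taking the first symptom listed (an assumption)."""
    symptom = max(scores, key=lambda name: scores[name])
    return symptom, scores[symptom]


# Hypothetical patient: score of one symptom over four visits, visits 2 and 4 missing.
print(locf([3, None, 2, None]))
# Hypothetical symptom scores at a single visit.
print(most_bothersome({"nausea": 1, "epigastric pain": 3, "fullness": 3}))
```

Working with the maximum score per visit keeps the outcome on the original ordinal scale, which is what allows the sum scores mentioned above to be avoided.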

Respective data were tabulated, and appropriate software23 was employed for validation of results. To make them comparable, the various trials were converted into a ‘standard trial’: run-in 7 days, randomization at admission, treatment with consultations after 3 and 5 weeks. The data were summarized in tables and statistically analysed: in case of dichotomous data, odds ratio and risk difference according to Peto Mantel-Haenszel were employed. With continuous data, values were pooled as mean difference weighted by inverse variance. If results were significant, sensitivity analyses were performed. Significance values were calculated using two-sided tests, the threshold of significance being P ≤ 0.05 and of non-significance being P > 0.1; values between P > 0.05 and P ≤ 0.1 were reported as trends. A reduction of the score of the most bothersome symptom among the 10 items of the gastrointestinal symptom scale (GIS) from very severe or severe to absent or mild, on a 5-point Likert scale (very severe, severe, moderate, mild, absent), was defined as response.
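The response definition and the reporting thresholds stated above translate into two small rules. The sketch below is illustrative only; the function names are hypothetical and the code is not taken from the software used in the study.

```python
SEVERITY_LEVELS = ("absent", "mild", "moderate", "severe", "very severe")


def is_response(baseline: str, final: str) -> bool:
    """Response as defined in the text: the most bothersome symptom moves
    from 'very severe' or 'severe' to 'absent' or 'mild'."""
    for rating in (baseline, final):
        if rating not in SEVERITY_LEVELS:
            raise ValueError(f"unknown rating: {rating!r}")
    return baseline in ("severe", "very severe") and final in ("absent", "mild")


def classify_p(p: float) -> str:
    """Reporting rule from the text: P <= 0.05 significant,
    0.05 < P <= 0.1 trend, P > 0.1 non-significant."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P-value must lie between 0 and 1")
    if p <= 0.05:
        return "significant"
    if p <= 0.1:
        return "trend"
    return "non-significant"


print(is_response("severe", "mild"))      # True
print(is_response("moderate", "absent"))  # False: baseline was not severe
print(classify_p(0.013))                  # significant
print(classify_p(0.216))                  # non-significant
```

Under this definition, a patient whose most bothersome symptom was only moderate at admission cannot count as a responder, whatever the final rating.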


In the electronic databases, STW 5 and Iberogast were identified as descriptors in five papers. Two additional clinical studies were found via other channels and were duly scrutinized. The primary objective of the present meta-analysis was to assess the efficacy of STW 5 from a clinical point of view by focusing on clinically relevant end-points. Of these seven studies, four did not meet the inclusion criteria (different research formula and double publication,24, 25 observational study,26 single-blind27) but three28–30 of them were included in the meta-analysis, and a fourth31 one was used for a safety analysis. General characteristics of the patients participating in the trials did not differ significantly (Table 2).

Table 2.  General characteristics of patients participating in the trials analysed for efficacy and safety of STW 5, and diagnostic findings

                                            Buchert28        Schnitker and Schulte-Körne29   Madisch et al.30   Rösch et al.31
Recruited (n = 618)                         247              118                             60                 193
STW 5 (n = 199)                             STW 5 (83)       STW 5 (35)                      STW 5 (20)         STW 5 (61)
Control (n = 198)                           Placebo (80)     Placebo (35)                    Placebo (20)       Cisapride (63)
Other (n = 200)                             STW 5-II¹ (80)   STW 6² (38)                     STW 5-S³ (20)      STW 5-II (62)
ITT (reanalysed)                            243              108                             60                 186
Percentage of females (56%)                 51.0             67.8                            63.3               59.1
Percentage of smokers* (27%)                33.3             26.3                            26.7               20.2
Age (year, mean ± s.d.)                     45.72 ± 11.37    43.81 ± 14.42                   46.83 ± 11.43      45.51 ± 14.44
BMI (mean ± s.d.)                           24.7 ± 2.3       24.9 ± 4.2                      25.1 ± 3.2         24.3 ± 3.6
Helicobacter pylori-positive (%)            ND               28.4                            ND                 33.2
Duration of symptoms (month, mean ± s.d.)   ND               111.5 ± 222.7                   6.2 ± 3.9†         55.4 ± 80.8

BMI, body mass index; ND, not determined; ITT, intention-to-treat.
* Ex-smokers were counted as non-smokers.
† Values approximated; original classification: >12 (n = 9), 6–12 (n = 16), 3–6 (n = 18) and <3 months (n = 16).
¹ STW 5-II: Iberis amara totalis 15 mL, Carvi fructus 20 mL, Liquiritiae radix 10 mL, Matricariae flos 30 mL, Melissae folium 15 mL, Menthae piperitae folium 10 mL.
² STW 6: containing only Iberis amara totalis 15 mL.
³ STW 5-S: similar to STW 5 yet without Iberis amara totalis extract.

STW 5 placebo-controlled: individual trials

All three randomized trials with STW 5 were multicentric, double-blind and placebo-controlled. In each of them, patients were administered the medicament at a dose of 20 drops (20 drops = 1 mL), three times a day, over a period of 4 weeks, with four examinations (Table 3). Anamnestic statuses of the patients were comparable, but durations of symptoms were quite heterogeneous, which probably reflects different ways of inquiring about symptom duration. In each trial, the improvement of symptoms was analysed on the basis of ITT. Furthermore, apart from STW 5, each of the trials investigated one additional preparation, similar to STW 5 but not containing all of its nine herbal constituents. The latter were labelled STW 5-S, STW 6, STW 5-II (Table 2).

Table 3.  Dates of patients’ visits, and assumptions made to obtain comparable number of visits (V)

                                V 1: day −7   V 2: day 0   V 3: days 14–21   V 4: c. day 35
Schnitker and Schulte-Körne29   +             +            +                 +
Madisch et al.30                              +            +                 +
Rösch et al.31                  +             +            +                 +

+, visited.
† V 4: missing data replaced by data from previous V, according to last observation carried forward (LOCF).
‡ No ‘run-in’ in this trial; missing data replaced by copying the data of day 0.

1. The study published by Madisch et al.30 is a small RCT in which 60 patients were randomized and treated with either STW 5 (n = 20), STW 5-S (n = 20) or placebo (n = 20). At the final visit, 15 of 20 patients treated with STW 5 reported the most bothersome symptom as ‘mild’ or ‘absent’, compared with 0 of 20 placebo-treated patients. The researchers, too, in their global assessment of efficacy judged STW 5 as superior to placebo (P < 0.05).
2. The second RCT, conducted by Schnitker and Schulte-Körne,29 has not been published. It is based on data of 118 patients: STW 5 (n = 35), STW 6 (n = 38), placebo (n = 35); 10 patients were excluded because no information was available about whether they received any medication or not. A total of 91 patients completed all four control visits. At the final visit, 16 of 35 patients treated with STW 5 reported the most bothersome symptom as ‘mild’ or ‘absent’, whereas 7 of 35 assessed it as ‘severe’ or ‘very severe’. Both the patients’ judgements about their individual symptoms and the researchers’ global assessment failed to show a significant difference between STW 5 and placebo (visit 4: N.S.).
3. The third RCT was the subject of an abstract published by Buchert.28 A total of 247 patients were recruited and 243 were evaluated. No data were available about the four patients excluded. Patients were assigned to three treatment groups: STW 5 (n = 83), STW 5-II (n = 80), placebo (n = 80). At the final visit, 52 of 83 patients treated with STW 5 reported the most bothersome symptom as ‘mild’ or ‘absent’, compared with 14 of 80 placebo-treated patients (P < 0.01). The analysis of the individual symptoms at the final visit revealed significant differences between STW 5 and placebo in favour of verum, for all symptoms except abdominal cramps.

STW 5 placebo-controlled: pooled trials

For a meta-analysis, data of the patients participating in the three RCTs were pooled (which was possible as baseline data were comparable with respect to general characteristics, anamnestic data and intervention): STW 5 (138 patients), placebo (135 patients). The majority of the patients described predominance of acid regurgitation (n = 124), while the rest reported epigastric pain as the predominant symptom (n = 101), or predominance of dysmotility-like symptoms (n = 30), or functional vomiting (n = 18). At the final examination, 83 of 138 patients in the verum group reported the most bothersome symptom as ‘mild’ or ‘absent’, compared with 33 of 135 in the placebo group (P < 0.01; Figure 1).
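The pooled figures can be cross-checked against the counts quoted for the individual trials. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the placebo responder count for Schnitker and Schulte-Körne is not stated in the text but merely implied by subtraction from the pooled total.

```python
# Responders (most bothersome symptom 'mild' or 'absent' at the final visit)
# and group sizes, as quoted in the text for the STW 5 arms.
stw5 = {
    "Buchert": (52, 83),
    "Schnitker and Schulte-Körne": (16, 35),
    "Madisch et al.": (15, 20),
}

# Placebo responders are quoted for two of the three trials only.
placebo_reported = {
    "Buchert": (14, 80),
    "Madisch et al.": (0, 20),
}
PLACEBO_POOLED = (33, 135)  # pooled placebo figure quoted in the text

stw5_responders = sum(responders for responders, _ in stw5.values())
stw5_n = sum(size for _, size in stw5.values())

# Assumption: derived by subtraction, not reported in the text.
implied_schnitker_placebo = PLACEBO_POOLED[0] - sum(
    responders for responders, _ in placebo_reported.values()
)

stw5_rate = stw5_responders / stw5_n
placebo_rate = PLACEBO_POOLED[0] / PLACEBO_POOLED[1]

print(f"STW 5: {stw5_responders}/{stw5_n} = {stw5_rate:.1%}")
print(f"Placebo: {PLACEBO_POOLED[0]}/{PLACEBO_POOLED[1]} = {placebo_rate:.1%}")
print(f"Implied placebo responders (Schnitker and Schulte-Körne): {implied_schnitker_placebo}")
```

The STW 5 arms add up to the pooled 83 of 138, i.e. roughly 60% of verum patients against roughly 24% of placebo patients rating their most bothersome symptom as mild or absent.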

Figure 1.

Percentage of patients with rating of the most bothersome symptom (assessment ranging from absent to very severe), at each visit: STW 5 vs. placebo (pooled data from Madisch et al.30, Schnitker and Schulte-Körne29, Buchert28; at last visit: P < 0.01).

Stepwise regression analysis with the data at admission as independent variables [trial, treatment, age, sex, height, smoking, body mass index (BMI), most bothersome symptom: maximum score at admission] and outcome (most bothersome symptom: maximum score at last examination) as dependent variable showed that only treatment (P < 0.001) and score of the most bothersome symptom at admission (P = 0.013) were correlated to the outcome. At the end of the intervention, the most bothersome symptom remained ‘severe’ and ‘very severe’ in 26% of the patients in the placebo group but only in 7% of the STW 5 group. The significant difference between placebo and STW 5 amounts to 19% (P < 0.001, odds ratio 0.22, 95% CI: 0.11–0.47, Figure 2).
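As a rough plausibility check of the odds ratio and rate difference quoted above, the calculation can be repeated on a single crude 2 × 2 table. Two assumptions are made here and should be read with caution: the patient counts are back-calculated from the rounded percentages (7% of 138 and 26% of 135), and a simple logit (Woolf) confidence interval is used instead of the stratified Peto Mantel-Haenszel approach named in the methods.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio (a/b)/(c/d) with a logit (Woolf) confidence interval.
    a, b: events / non-events in the treated group; c, d: same for control."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


n_stw5, n_placebo = 138, 135

# Assumption: counts reconstructed from the rounded percentages in the text.
severe_stw5 = round(0.07 * n_stw5)        # about 10 patients
severe_placebo = round(0.26 * n_placebo)  # about 35 patients

or_, lo, hi = odds_ratio_ci(
    severe_stw5, n_stw5 - severe_stw5,
    severe_placebo, n_placebo - severe_placebo,
)
risk_difference = severe_placebo / n_placebo - severe_stw5 / n_stw5

print(f"odds ratio {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
print(f"rate difference {risk_difference:.1%}")
```

With these reconstructed counts the crude estimate lands close to the reported odds ratio of 0.22 (95% CI: 0.11–0.47) and the 19% rate difference, although agreement of a crude table with a stratified analysis is not guaranteed in general.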

Figure 2.

Rate difference between assessments of the most bothersome symptom with STW 5 and with placebo treatment and 95% CI (19% more patients with reduction from very severe/severe to mild/absent in the STW 5 group).

Regarding the individual symptom scores at the final visits, the differences in favour of verum (STW 5) were more pronounced for epigastric pain, acid regurgitation and retrosternal troubles (not rated in Schnitker and Schulte-Körne29). Stepwise regression analysis with data at admission as independent variables and an individual symptom score at the final visit as dependent variable showed that treatment was significantly related to the outcome (P < 0.001) for six of 10 symptoms (abdominal cramps, epigastric pain, nausea, acid regurgitation, retrosternal troubles, vomiting) but not for the remaining four: inappetence, fullness, retching and early satiety.

STW 5 reference-controlled

A multicentric, reference-controlled trial compared STW 5 to the prokinetic cisapride.31 A total of 193 patients with ‘functional dyspepsia of the dysmotility type’ were recruited, 186 were randomized, 183 were analysed on the basis of ITT, and 137 constituted the per-protocol-population employed to prove non-inferiority. In the present analysis, all randomized patients were reanalysed: STW 5 (n = 61), STW 5-II (n = 62), cisapride (n = 63) (total: n = 186). Patients were comparable at admission with respect to general characteristics, duration of troubles, endoscopic findings, and frequency of moderate-to-severe symptoms except retching, for which there was a trend to be predominant in the cisapride group (moderate-to-severe, P < 0.1).

As shown in Figure 3, the most bothersome symptom score shifted significantly with both STW 5 and cisapride compared with placebo, from predominantly severe/very severe to mostly absent/mild. Yet comparison of the two verum groups with respect to individual symptom ratings at the end of treatment showed significantly less inappetence (P < 0.01) and a trend for less early satiety in the STW 5 group (P < 0.1). Besides, overall assessment of efficacy and tolerability of the studied medications showed no significant differences, although in the patients’ assessments there was a trend towards better tolerability of STW 5 compared with cisapride (P < 0.1).

Figure 3.

Percentage of patients with rating of the most bothersome symptom (assessment ranging from absent to very severe), at each visit: STW 5 vs. cisapride (P = 0.216).


STW 5 seems to be generally well-tolerated; the incidence of adverse events varies largely among the trials, which might be due to a more active vs. a rather passive gathering of information on adverse events. The studies investigated report the following percentages of adverse events for STW 5 or for placebo, respectively: Buchert28 3.6% vs. 1.3%, Schnitker and Schulte-Körne29 22.9% vs. 25.7%, Madisch et al.30 5% vs. 10% and Rösch et al.31 24.6% vs. cisapride 34.9%. Classification of adverse events reported in the trials according to body system (Table 4) shows a similar pattern of incidences for STW 5, placebo or cisapride. No serious adverse events were reported in these trials for either medication, and no relevant deviations from routine biochemical values were observed. In one observational study (postmarketing surveillance), conducted by the producer of STW 5 and including 2267 patients, the only serious adverse event reported could not be related to the medication (surgery for colonic cancer 4 days after completing the trial).18, 26 Among the spontaneously reported adverse events over a period of 14 years are seven cases of exanthematous skin reactions (one with Quincke’s oedema and one with disseminated neurodermatitis), six cases of reported digestive intolerance, and one case of allergic asthma. Adverse central nervous or cardiac events occasionally reported for prokinetics have not been found with STW 5.

Table 4.  Percentage of adverse events reported, classified according to body system (multiple mentions possible)
WHO system organ class   STW 5 (n = 199) (%)   Placebo (n = 135) (%)   Cisapride (n = 63) (%)
Body as a whole – general disorders (dizziness, influenza, chest pressure, diaphoresis, BSR increased)
Central and peripheral nervous system (migraine)
Gastrointestinal system (vomiting, nausea, enteritis, cramps, diarrhoea, dyspepsia, flatulence, etc.)
Liver and biliary system (hepatic enzymes increased)
Metabolic and nutritional disorders (uricaemia, hypoglycaemia)
Musculo-skeletal system (trauma, myalgia)
Psychiatric (alcoholism, anxiety)
Reproductive, female (menstrual disorder)
Respiratory system (tracheitis/bronchitis, sinusitis, pertussis, tonsillitis)
Skin and appendages (mycotic infection, eczema)
Special senses (dysgeusia)
Urinary system (urinary infection, cystitis)
Vascular, extracardiac (thrombophlebitis)

BSR, blood sedimentation rate.


The number of patients with functional dyspepsia treated with STW 5 (Iberogast) or placebo in the pooled RCTs (n = 273) is relatively limited, but still large enough to provide a fair idea of the short-term clinical efficacy of the medicament. The analysis presented herein avoids problems frequently encountered in meta-analyses and pooling of data:32, 33 (i) all studies that met current regulations were included, (ii) study designs and schemes for rating symptoms were fairly similar across the studies and (iii) all these trials were carried out in the same socio-cultural milieu. In contrast to a previously conducted analysis of the same data,29 we avoided the use of sum scores and ensured reanalysis of all raw data on the basis of ITT and LOCF. We did so because basing the assessment on the symptom with the highest score, hence the most bothersome one, enabled us to perform an analysis with non-parametric tests and to avoid the inclusion of several clinical outcome parameters, which would otherwise have been required by the heterogeneity of symptoms and symptom fluctuation. This approach may entail some loss of sensitivity, but its results in fact confirmed the results published or reported in the individual trials using score differences of the GIS as response criterion. Although the GIS profiles used in the research of STW 5 do not differ greatly, they all share the problem of many other clinical symptom scores: they are not sufficiently validated for the evaluation of functional dyspepsia. The method of determining the score of the most bothersome symptom, in contrast, has already been used for classification purposes4 and for determining the efficacy of another drug group used to treat functional dyspepsia (e.g. PPI – omeprazole).34 Reanalysing the raw data of the published and unpublished studies was a further step to ensure comparability of the patients’ demographic and health status and to minimize publication bias. Employing the ITT approach on all data increased the robustness of the data presented here. Although inclusion of unpublished data in a meta-analysis is still a matter of debate, there are principles that support doing so, e.g. in case of paucity of published data, if the unpublished data can be subjected to the same scrutiny as published data.19

General observation has shown that STW 5 is significantly more effective than placebo in providing symptomatic relief to patients with functional dyspepsia. This seems even more evident with associated symptoms of gastro-oesophageal reflux or predominance of epigastric pain. But these findings will have to be confirmed in larger trials.

The safety profile of STW 5 appears to be satisfactory, both in view of the data derived from clinical and observational studies and considering the fact that the drug has been sold on the German market for about 40 years. Its good tolerability might be due to the low concentrations of its individual constituents. Probably there is a synergy of therapeutic effects, e.g. spasmolytic, tonicizing, carminative, anti-inflammatory and local anaesthetic, without additive toxic effects. Only Chelidonium has been linked to rare cases of cholestatic hepatitis in other formulations and in higher concentrations.35, 36 This might be a reason why no such events have been reported with STW 5.18 For other fixed combinations like peppermint and caraway oil37, 38 or peppermint oil and ginger extract,39 studies over 4 weeks have shown improvements in functional dyspepsia or NUD compared with placebo or cisapride and seem to be comparable with STW 5. These results concerning safety and efficacy are encouraging, yet further research regarding long-term safety and possible interactions of herbal preparations remains a necessity.13

In view of the efficacy and the safety profile of STW 5, the preparation has been shown to be a valid, promising approach for first-line management, at least of patients with functional dyspepsia asking for a complementary and alternative medicine (CAM) treatment. This result, however, will have to be confirmed in larger studies.


The study was partly supported by a research grant from Steigerwald GmbH, Darmstadt, Germany.