Nonsurgical treatment of stress urinary incontinence (SUI): grading of evidence in systematic reviews


Background  The guidance on SUI has not been rigorously assessed using the GRADE system.

Objective  To determine if the quality and results of existing systematic reviews on conservative treatment of stress urinary incontinence (SUI) can underpin evidence-based recommendations for practice.

Study design  Review of systematic reviews.

Data sources  Electronic search in PubMed, Medline (OVID 1966-version), CINAHL, Biomed, PsycINFO, the Cochrane library, National Library for Health, the National Research Register and hand search of reference lists.

Methods  Two reviewers independently selected systematic review articles in which a publicly available database was searched for randomised trials on conservative treatment of SUI and assessed them for quality of methods and results (OR and 95% CIs). The extracted information was used to classify strength of evidence as per the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) system.

Results  There were 13 reviews of variable quality. Quality assessment of studies included in the reviews and their findings were adequately tabulated in all but four reviews. Meta-analysis of data was carried out in six reviews. Pelvic floor muscle training (PFMT) and other physical treatments, estrogens and duloxetine were better than no treatment in SUI. Based on assessment with the GRADE system, only 2/13 (15.4%) reviews were deemed to be of high quality, 8/13 (61.5%) of moderate quality and 3/13 (23.1%) of low quality. The case for recommendation of PFMT and duloxetine was strong.

Conclusion  Systematic reviews of conservative treatments of SUI are not always suitable to generate robust recommendations for practice as they are weak in methodological quality or lack power to produce reliable results.


The International Continence Society defines urinary incontinence (UI) as the complaint of any involuntary leakage of urine, with urge and stress incontinence being the two most common types. They can exist separately or combined in the form of mixed incontinence. Stress urinary incontinence (SUI) is the involuntary leakage of urine upon effort or exertion, or on sneezing or coughing, in the absence of a rise in detrusor pressure.1 A large survey of 27 936 women aged 20 years and above in Norway showed an overall UI prevalence of 25%, of whom 50 and 36% had SUI and mixed UI, respectively.2 The prevalence of UI is estimated at approximately 16% in those older than 40 years in the UK.3 It can cause ‘bothersomeness’ and negatively impact the quality of life.4 A wide range of treatments have been used to manage SUI by conservative and pharmacological means. These have been summarised in systematic reviews, but the extent to which the underlying evidence about effectiveness is trustworthy has not been rigorously assessed.

A systematic review allows one to collect evidence in order to formulate recommendations for clinical practice. Even if there are systematic reviews available on interventions, the conclusions drawn from them can be flawed due to inherent weaknesses in the methodology of search, study selection, data abstraction, data synthesis and interpretation. Many systems for grading evidence exist,5 often leading to confusion about what is best practice. There is a need for an objective and rigorous assessment of the evidence before generating recommendations. The mission of the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) group is to help resolve the confusion among the different systems of rating evidence and recommendations.6 The GRADE system classifies quality of evidence into one of four levels (high, moderate, low and very low) and recommendations as strong or weak depending on several factors.6 This system has been developed with input from a wide range of organisations, including the World Health Organization and the Agency for Healthcare Research and Quality.

Where guidelines exist for SUI,7 the evidence has not been rigorously assessed according to the GRADE system. Our aim was to assess the methodological quality and results of systematic reviews on conservative treatments in SUI, as well as the strength of recommendation they support, according to the GRADE system.


The review was carried out with a prospective protocol, following widely recommended methodology and the approach of previous similar reviews.8,9 Our electronic searches targeted all reviews, systematic reviews and meta-analyses of studies on UI, irrespective of whether they were therapeutic, diagnostic or prognostic reviews. We searched general databases: PubMed, Medline (OVID 1966-version), CINAHL, Biomed, Cochrane and PsycINFO, the National Library for Health, the National Research Register and Guidelines (NLH guidelines finder) up to January 2007.10 Our search term combination included terms such as stress incontinence (txt) or urinary incontinence (MESH) AND therapy. We also checked reference lists of known reviews to identify cited articles not captured by electronic searches.

Our predefined selection criteria sought reviews which as a minimum looked for studies in a publicly available database. Reviews were selected in a two-stage process. Two of us (P.M.L. and R.F.) independently scrutinised the electronic searches and obtained full manuscripts of all citations that were likely to meet the predefined selection criteria. Final decisions on inclusion or exclusion were then made after we examined these manuscripts. In cases of multiple review publications on the same topic by the same group, we selected the most complete and recent version of the review. We had no language restrictions but excluded reviews which looked at nonrandomised studies and where the search was not conducted systematically.

All manuscripts meeting the selection criteria were assessed for their methodological quality. We graded the evidence using existing checklists for grading the quality of evidence and the strength of recommendations.6,11,12 Quality assessment involved extracting information from each selected review article on the framing of the question, the methods of literature search and review, data synthesis and results.13,14 The results were summarised as odds ratios and 95% CIs wherever meta-analysis was performed.
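As an illustration of the summary measure named above, the following minimal sketch computes an odds ratio and its 95% CI from a single 2x2 table using the standard log (Woolf) method. It is not the pooling method of any included review, which may have used other estimators; the function name `odds_ratio_ci` and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the log (Woolf) method.

    a: treated women with the outcome      b: treated women without it
    c: control women with the outcome      d: control women without it
    z: normal quantile for the interval (1.96 gives a 95% CI)
    """
    if min(a, b, c, d) <= 0:
        # The simple log method is undefined when any cell is zero.
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from any review in this paper.
    estimate, low, high = odds_ratio_ci(20, 10, 10, 20)
    print(f"OR {estimate:.2f}; 95% CI {low:.2f} to {high:.2f}")
```

A wide interval from such a calculation reflects small cell counts, which is one reason imprecision weighs on the grading of evidence.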

The GRADE system11 was applied to the reviews identified. The quality of evidence was graded as follows:

  • High: Further research is very unlikely to change our confidence in the estimate of effect.
  • Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
  • Low: Further research is likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
  • Very low: Any estimate of effect is very uncertain.

The factors in deciding on a strong or weak recommendation included methodological quality of the evidence supporting estimates of likely benefit and costs, magnitude and precision of treatment effect, risks associated with therapy and varying values placed by the young versus old people on outcomes.


Figure 1 summarises the process of literature identification and selection. Of 367 citations, 13 (3.5%) review articles met the selection criteria.15–27 Tables 1 and 2 show the details of each review and a summary of findings. There were five Cochrane reviews, and they all included only randomised controlled trials (RCTs).

Figure 1.  Study selection process for review of systematic reviews of nonsurgical treatments of SUI.

Table 1.  Details of systematic reviews included in the review of systematic reviews of nonsurgical treatment of SUI

| Study | Year | Language | Topic | Literature search terms described | Number of included studies | Population | Intervention | Comparison | Outcome | Number of women | Specific question or hypothesis | Quality assessment | Findings tabulated | Assessment for heterogeneity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Oelke et al.19 | 2006 | English | Safety and tolerability of duloxetine in women with SUI | Yes | Not mentioned | Women with SUI | 40 mg bd duloxetine | None | Adverse effects of duloxetine | 1913 | To review the safety and tolerability | No | Yes | No |
| Neumann et al.24 | 2006 | English | Pelvic floor muscle training (PFMT) and adjunctive therapies for SUI | No | 24 | Women with SUI | PFMT/other physical therapies | None | Perception of cure, quantification of symptoms, clinician observations, QOL | 1180 | What evidence is there for PFMT (alone or in combination with adjunctive therapies) for treatment of SUI | Yes | Yes | No |
| Mariappan et al.21 | 2005 | English | Serotonin and noradrenaline reuptake inhibitors (SNRI) for SUI in adults | Yes | 9 | Women with SUI | Duloxetine | Placebo | Subjective cure of SUI and results of pad test | 3327 | Are SNRI better than placebo in treating women with SUI | Yes | Yes | Yes, chi-square test, visual inspection of data and I-squared test |
| Herbison et al.25 | 2002 | English | Weighted vaginal cones for UI | Yes | 16 | Women with SUI | Weighted cones | None or PFMT | QOL measures, patients' symptoms and physical measures (e.g. weight of cone retained) | 1246 | Vaginal cones are more effective than no treatment, as effective as conservative treatment, vaginal cones + conservative, more effective than conservative only and vaginal cones as effective as surgery | Yes | Yes | No |
| Shaikh et al.22 | 2006 | English | Mechanical devices for UI in women | Yes | 14 | Women with SUI | Mechanical device | None | Cure, improvement | 286 | To determine the effects of mechanical devices in management of female UI | Yes | Yes | Yes, visual inspection of data, chi-square test and I-squared test |
| Hay-Smith and Dumoulin20 | 2006 | English | Pelvic floor training versus no treatment | No | 6 | Women with SUI | PFMT | None | Reduction in incontinence episodes | 714 | PFMT is better than no treatment, placebo, sham or any other form of inactive control treatment | Yes | Yes | Chi-square test, I-squared test and visual inspection of data plots |
| Holroyd-Leduc and Straus26 | 2004 | English | Management of Urinary Incontinence (UI) | No | 66 | Women with SUI | PFMT, electrostimulation, vaginal cones, bladder training | None | None | 626 | To review the most recent evidence on aetiology and management of SUI | Yes | No | No |
| Moehrer et al.23 | 2003 | English | Estrogens for female UI | Yes | 28 | Women with SUI | Estrogens | Placebo | Subjective cure of incontinence | 2926 | Estrogen therapy is better than placebo or no treatment, better than other forms of treatment, estrogens combined with other therapy better than placebo; one method of administration better than another, high doses better than another | Yes | No | Yes, visual inspection of data |
| Virseda et al.16 | 2002 | Spanish | Pelvic floor muscle training and biofeedback for SUI | Yes | 2 | Women with stress and urge incontinence | PFMT | Physical exercise without supervision or beta-adrenergics | Improvement, cure (subjective and/or pad test), urodynamics to demonstrate no SUI | Not mentioned | To determine by meta-analysis if perineal rehabilitation is effective for treatment of SUI in women | No | No | Yes |
| Bo17 | 2000 | Norwegian | Pelvic floor training versus no treatment | No | 5 | Women with stress and urge incontinence | PFMT with or without biofeedback | None | Improvement, cure (subjective and/or pad test) | Not mentioned | Review the evidence of PFMT in treatment of incontinence | No | No | No |
| Weatherall27 | 1999 | English | Biofeedback or pelvic floor muscle exercises for female SUI | Yes | 3 | Women with SUI | Biofeedback with PFMT | PFMT alone | Pad test or complete remission of symptoms | 157 | Is biofeedback no more effective than PFMT alone for treatment of SUI | No | Yes | Yes, chi-square test |
| Berghmans et al.15 | 1998 | English | Conservative treatment of SUI | Yes | 11 | Women with SUI | PFM exercises, electrostimulation, biofeedback, surgery | None | Cure, improvement | Prevention 1883/therapy 1122 | Efficacy of physical therapy for first-line treatment and prevention of SUI | Yes | Yes | Yes |
| Teunissen et al.18 | 2004 | English | Treating urinary incontinence in the elderly conservatively | Yes | 16 | Women with UI | Exercise and drug therapy | None | Number of urinary accidents, women’s perception, cystometric measurement, perineometry | 679 | To assess the effects of drug therapy and exercise on UI | Yes | Yes | No |

QOL, quality of life.

Table 2.  Results and GRADE of systematic reviews of nonsurgical treatments of SUI

| Study | Meta-analysis of data | Findings in short | Quality of evidence (high, intermediate, low or very low) | Strength of recommendation |
|---|---|---|---|---|
| Oelke et al.19 | No | Duloxetine has an acceptable safety profile | High | Strong |
| Neumann et al.24 | No | Strong evidence for the efficacy of physical therapy for the treatment of SUI | Intermediate | Strong |
| Mariappan et al.21 | Yes | Duloxetine was significantly better than placebo in terms of improving women’s QOL (weighted mean difference 5.26, 95% CI 3.84–6.68, P < 0.00001) and perception of improvement. Individual studies demonstrated a significant reduction in the Incontinence Episode Frequency by approximately 50% during treatment with duloxetine. With regard to objective cure, however, meta-analysis of stress pad test and 24-hour pad weight change failed to demonstrate a benefit for duloxetine over placebo, although data were relatively few. Subjective cure favoured duloxetine, albeit with a small effect size (3%) | High | Strong |
| Herbison et al.25 | Yes | Cones were better than no active treatment (RR for failure to cure incontinence 0.74, 95% CI 0.59–0.93). There was little evidence of difference between cones and PFMT (RR 1.09, 95% CI 0.86–1.38) or electrostimulation (RR 1, 95% CI 0.89–1.13), but the confidence intervals were wide. There was not enough evidence to show that cones plus PFMT was different to either cones alone or PFMT alone. Only three studies used a QOL measure and no study looked at economic outcomes | Low | Weak |
| Shaikh et al.22 | No | Insufficient evidence to clarify use of mechanical devices for SUI | Low | Weak |
| Hay-Smith and Dumoulin20 | Yes | Women perceived cure was more likely after PFMT than control | Intermediate | Strong |
| Holroyd-Leduc and Straus26 | No | UI in women is an important public health problem and effective treatment options exist | Intermediate | Strong |
| Moehrer et al.23 | Yes | Estrogen treatment can improve or cure incontinence, and the evidence suggests that this is more likely with urge incontinence | Intermediate | Weak |
| Virseda et al.16 | Yes | PFMT is 7.03 times more effective than placebo or no treatment | Intermediate | Strong |
| Bo17 | No | SUI can be effectively treated with PFMT and is suggested as a first-line treatment; need for RCTs on role of PFMT in prevention of incontinence | Low | Strong |
| Weatherall27 | Yes | Biofeedback was better than PFM exercises but CI > 5% | Intermediate | Strong |
| Berghmans et al.15 | No | There is strong evidence to suggest that pelvic floor exercises are effective in reducing stress incontinence symptoms | Intermediate | Strong |
| Teunissen et al.18 | No | Behavioural therapy reduced episodes of urinary accidents; the effect of drug therapy was unclear; behavioural therapy can be recommended as first-choice treatment | Intermediate | Weak |

QOL, quality of life; RR, relative risk; CI, confidence interval.

The overall quality of the existing systematic reviews was variable. As shown in the stacked bar chart (Figure 2), all 13 had a specific question and a testable hypothesis, but only 10/13 (76.9%) had a narrowly focused question. Nine reviews described the literature search terms adequately and searched more than two databases. However, 7/13 reviews used four databases, 2 searched two databases, 1 searched only Medline and another did not mention the database searched. One review restricted its search to English language articles26 and one to English, Dutch and German.15 Reference lists were used to identify studies missed in electronic searches in only 9/13 (69.2%) review articles. Assessment for risk of missing literature and contact with authors was explicitly mentioned in only one review.25 Quality assessment of the studies included in the systematic reviews and their findings were adequately tabulated in all but four reviews. Assessment for heterogeneity was unclear or inadequate in 7/13 (53.8%) reviews. Meta-analysis of data was carried out in six reviews.16,20,21,23,25,27 The five Cochrane reviews included between 13 and 28 trials.20,23 The Cochrane reviews satisfied most of the criteria for a good systematic review.
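Several of the included reviews assessed heterogeneity with a chi-square test and the I-squared statistic (Table 1). As a minimal sketch of what those statistics measure, the code below computes Cochran's Q and I-squared from study-level effect estimates using inverse-variance weights. It assumes effects are supplied on the log scale with their standard errors; the function name `heterogeneity` and the example inputs are hypothetical and do not come from any included review.

```python
def heterogeneity(log_effects, std_errors):
    """Return (Q, df, I2) where I2 is a percentage.

    log_effects: per-study effect estimates on the log scale (e.g. log OR)
    std_errors:  the matching standard errors
    """
    if len(log_effects) != len(std_errors) or len(log_effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I2: share of total variation attributable to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors for three studies.
    q, df, i2 = heterogeneity([0.2, 0.9, 0.4], [0.30, 0.35, 0.25])
    print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

Q is compared against a chi-square distribution with `df` degrees of freedom, which is the chi-square test listed in Table 1; I-squared expresses the same information as a proportion that does not depend on the number of studies.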

Figure 2.  Methodological quality of systematic reviews of nonsurgical treatments of SUI.

As shown in Figure 3, pelvic floor exercises and other physical treatments were better than no treatment in SUI (OR 16.8; 95% CI 2.37–119). There was inadequate evidence to support mechanical devices. Estrogens helped urge more than SUI symptoms (OR 1.2; 95% CI 0.6–2.3). Duloxetine was significantly better than placebo in terms of improving women’s quality of life (weighted mean difference 5.26, 95% CI 3.84–6.68) and perception of improvement. It reduced incontinence episode frequency (OR 1.24; 95% CI 1.14–1.36). About one in three women allocated duloxetine reported adverse effects (most commonly nausea) related to treatment, and about one in eight stopped treatment as a consequence.

Figure 3.  Range of effectiveness of some nonsurgical treatments for SUI for improvement/cure in women’s perception. PFMT, pelvic floor muscle training.

As shown in Table 2, based on assessment with the GRADE system, only 2/13 (15.4%) reviews were deemed to be of high quality, 8/13 (61.5%) of moderate quality and 3/13 (23.1%) of low quality. Across the 13 systematic reviews, we deemed that the case for recommending PFMT and duloxetine was strong when compared with no treatment. There was insufficient evidence, and hence a weak case, for recommending habit retraining including timed voids, biofeedback and electrical stimulation with PFMT, mechanical devices and estrogen therapy.
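The proportions reported above can be reproduced from the quality ratings in Table 2. The sketch below tallies those ratings, on the assumption that the table's "intermediate" category corresponds to "moderate" in the text; the dictionary is transcribed from Table 2, and the names `GRADE_BY_REVIEW` and `summarise` are illustrative only.

```python
from collections import Counter

# Quality-of-evidence ratings transcribed from Table 2.
# Assumption: "intermediate" in the table is the "moderate" level of the text.
GRADE_BY_REVIEW = {
    "Oelke et al.": "high",
    "Neumann et al.": "intermediate",
    "Mariappan et al.": "high",
    "Herbison et al.": "low",
    "Shaikh et al.": "low",
    "Hay-Smith and Dumoulin": "intermediate",
    "Holroyd-Leduc and Straus": "intermediate",
    "Moehrer et al.": "intermediate",
    "Virseda et al.": "intermediate",
    "Bo": "low",
    "Weatherall": "intermediate",
    "Berghmans et al.": "intermediate",
    "Teunissen et al.": "intermediate",
}


def summarise(ratings):
    """Map each quality level to (count, percentage of all reviews)."""
    counts = Counter(ratings.values())
    total = len(ratings)
    return {level: (n, round(100 * n / total, 1)) for level, n in counts.items()}


if __name__ == "__main__":
    for level, (n, pct) in summarise(GRADE_BY_REVIEW).items():
        print(f"{level}: {n}/{len(GRADE_BY_REVIEW)} ({pct}%)")
```

The tally gives 2 high, 8 intermediate and 3 low ratings out of 13, matching the 15.4%, 61.5% and 23.1% stated in the text.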


Existing systematic reviews of conservative treatments for SUI tended to be variable in reporting of methodological features. There were no systematic reviews comparing conservative with surgical treatments of SUI. There is evidence that pelvic floor exercises and other physical treatments such as cones as well as duloxetine are better than no treatment in SUI. But the effectiveness has to be weighed against costs, varying values of women and the magnitude as well as the precision of the treatment effect. In conclusion, some of the systematic reviews of conservative treatments of SUI that exist in the literature are unsuitable to generate robust inferences for practice as they are weak in methodological quality and often lack power to produce reliable results.

This is the first comprehensive review of all available systematic reviews on nonsurgical treatments of SUI that employed the GRADE system for assessment of evidence. It summarises the findings of the existing reviews and their quality. The quality of a study depends on the degree to which it employs measures to minimise bias and error in its design, conduct and analysis.13 Although about half of the reviews included a meta-analysis, the actual need for this statistical technique is uncertain. If the data of a systematic review show evidence of publication bias, the review is likely to be flawed. Yet the steps to avoid this, such as searching for missing studies, contacting experts and avoiding language restrictions, were overlooked. The grading shows that, for most of the systematic reviews, further research on this topic is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.6

Most of the National Institute for Clinical Excellence recommendations concur with GRADE. For example, they recommend that a trial of supervised PFMT of at least 3 months' duration should be offered as first-line treatment to women with stress or mixed incontinence, that perineometry or pelvic floor electromyography as biofeedback should not be used as a routine part of PFMT, and that electrical stimulation should not routinely be used in combination with PFMT. But they also recommend some good practice points that are not evidence based: for example, if PFMT is beneficial, an exercise programme should be maintained, and electrical stimulation and/or biofeedback should be considered in women who cannot actively contract pelvic floor muscles, to aid motivation and adherence to therapy.

Meta-analysis and systematic reviews are frequently used in evidence-based medicine to provide helpful summaries of evidence extracted from a number of individual studies.

We feel that since RCT data may be scarce for some interventions and comparisons, it will be useful to synthesise the evidence from observational studies28 and to undertake indirect comparisons,29 paying due attention to study quality and other methodological issues. Information on safety is scarcely available from RCTs, so observational data would have to be reviewed for this issue. Underlying disease prevalence (baseline risk) and its variation in key clinical subgroups are an essential input into cost-effectiveness models, and for this the control event rate from RCTs is not an adequate proxy, raising the need for reviewing prevalence data. Most of the existing reviews have included women with stress, urge and mixed incontinence, and these reviews will have to be re-examined to tease out the data on SUI only. American, European and Asian Nephrology and Urology organisations plan to adopt GRADE for their recommendations.6,30 Guidelines will be framed using the simple, clinically applicable GRADE system of strong and weak recommendations, thereby facilitating the practice of evidence-based medicine. Judgements about evidence and recommendations are complex. Some subjectivity, especially regarding recommendations, is unavoidable. The GRADE system appropriately balances the need for simplicity with the need for full and transparent consideration of all important issues and hence should be used for generating recommendations for clinical practice in urogynaecology.


We are grateful to Derick Yates, Clinical Librarian at Education Resource Centre of Birmingham Women’s Hospital, for conducting literature searches for this study.