Measurements of amniotic fluid volume are used for pregnancy surveillance despite a lack of evidence for their predictive ability.
To evaluate the association and predictive value of ultrasound measurements of amniotic fluid volume for adverse pregnancy outcome.
Electronic databases (inception to October 2011), reference lists, hand searching of journals, contact with experts.
Studies comparing measurements of amniotic fluid volume with adverse outcome, excluding pre-labour ruptured membranes or congenital/structural anomalies.
Data on study characteristics, design, quality. Random effects meta-analysis to estimate summary odds ratios (prognostic association) and summary sensitivity, specificity and likelihood ratios (predictive ability).
Forty-three studies (244 493 fetuses) were included demonstrating a strong association between oligohydramnios (varying definitions) and birthweight <10th centile (summary odds ratio [OR] 6.31, 95% confidence interval [95% CI] 4.15–9.58; high-risk population [author definition] n = 6 studies, 28 510 fetuses), and mortality (neonatal death any population summary OR 8.72, 95% CI 2.43–31.26; n = 6 studies, 55 735 fetuses; and perinatal mortality high-risk population summary OR 11.54, 95% CI 4.05–32.9; n = 2 studies, 27 891 fetuses). There was a strong association between polyhydramnios (maximum pool depth >8 cm or amniotic fluid index ≥25 cm) and birthweight >90th centile (OR 11.41, 95% CI 7.09–18.36; n = 1 study, 3960 fetuses). Despite strong associations, predictive accuracy for perinatal outcome was poor.
Current evidence suggests that oligohydramnios is strongly associated with being small for gestational age and mortality, and polyhydramnios with birthweight >90th centile. Despite strong associations with poor outcome, they do not accurately predict outcome risk for individuals.
The amniotic fluid is fundamental for proper fetal development and growth, and amniotic fluid volume measurements using prenatal ultrasound have become standard in fetal surveillance, especially in the evaluation of high-risk pregnancies. Alterations in amniotic fluid volume, especially decreased amniotic fluid volume (oligohydramnios), have classically been considered an indicator of adverse perinatal outcome and, therefore, have led to an almost uniform recommendation for delivery following the diagnosis of oligohydramnios, at least for patients at term. However, the number of ultrasonographic modalities applied to assess amniotic fluid volume and the various threshold points reflect the inaccuracies inherent in each of these modalities.[2-7] Moreover, the association between abnormal amniotic fluid volume and adverse perinatal outcomes came from heterogeneous studies that frequently included patients with preterm ruptured membranes or different underlying medical conditions, and/or fetuses with structural anomalies, clinical situations that may affect the amniotic fluid volume.
A previous review of randomised controlled trials (RCTs) has concluded that single deepest vertical pocket measurement is the method of choice for the assessment of amniotic fluid volume on the basis that neither method was superior but that amniotic fluid index (AFI) led to more diagnoses of oligohydramnios, more inductions of labour and more caesarean deliveries for fetal distress without improving perinatal outcome. The observed effect in these RCTs is determined by both test accuracy and the effect of the intervention that follows testing; a conclusion of this review was that a systematic review of the accuracy of AFI versus single deepest pocket was needed.
Therefore, we present a systematic review and meta-analysis of the literature to assess the prognostic association and predictive accuracy of measurements of amniotic fluid for adverse pregnancy outcome, and to compare the performance of different techniques for measuring amniotic fluid.
The following sources were searched from inception to October 2011: MEDLINE; EMBASE; Cumulative Index To Nursing And Allied Health Literature (CINAHL); The Cochrane Central Register of Systematic Reviews; The Cochrane Central Register of Controlled Trials; DARE; MEDION; SIGLE; Index of Scientific and Technical Proceedings; Web of Science; and the ClinicalTrials.gov database. The search consisted of keywords and MeSH terms relating to the tests under investigation combined with the MeSH terms ‘Prenatal Diagnosis’, ‘Ultrasonography’, ‘Amniotic Fluid’ and ‘Pregnancy Outcome’. The full search strategy is shown in Appendix S2. The reference lists of all included primary and review articles were examined to identify cited articles not captured by electronic searches. Reference Manager 12.0 was used to construct a comprehensive database of literature. No language restrictions were applied.
The database was scrutinised by two reviewers (RKM, CHM) and full articles likely to meet the selection criteria were obtained. Translations were obtained for non-English articles. Three reviewers made the final inclusion/exclusion decisions according to adherence to the following criteria.
All articles were assessed independently by a minimum of two reviewers (RKM, CHM, JT, GLM) and the data were abstracted. The following were recorded: study characteristics (authors, journal, year of publication, country, study design, objectives, type of medical centre, and period or duration of the study); characteristics of the participants (study population, method of selection, inclusion and exclusion criteria, whether consecutive cases, number of participants, number of excluded participants and reasons for exclusion, personal and medical characteristics of enrolled women, inpatients compared with outpatients, level of activity, gestation at time of test and at delivery, and test-to-delivery interval); information on how the diagnostic tests were carried out and the results; and methods for assessing the diagnostic accuracy of the tests and the results (number of true and false positives, number of true and false negatives, sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, method of agreement, receiver operating characteristic curve, area under the curve, and the threshold level(s) used). Disagreements were resolved by consensus or through arbitration by a third reviewer (RKM or KSK). For multiple and/or duplicate publication of the same data set, only the most recent or most complete study was included. All studies had to state that they excluded rupture of membranes and congenital/structural anomalies, because renal/urinary tract anomalies and karyotypic anomalies are associated with abnormalities of liquor volume.
All included manuscripts were assessed by at least one reviewer for study and reporting quality using validated tools for test accuracy studies.[15-19] Methodological quality was defined as the confidence that the study design, conduct and analysis have minimised biases in addressing the research question, thereby focusing on the internal validity (i.e. the degree to which the results of an observation are correct for the patients being studied). Items considered important for a good quality paper were prospective design with consecutive/random recruitment, full verification of the test result with an outcome measure (>90%), adequate description of the population and index test and whether the clinicians managing the patients were blinded to the results of the index test. Quality of the papers was assessed by QUADAS (see Appendix S3). Quality scores were not assigned because these have been shown to give flawed results in a diagnostic accuracy setting. Studies were rated as high quality for subgroup analysis if they satisfied at least four of the following items: adequate description of population; adequate description of the test (measurement of amniotic fluid and threshold) and outcome measure; consecutive recruitment; prospective recruitment; >90% completion of follow-up; appropriate outcome measurement; blinding of the investigators performing the outcome measure; and a statement regarding the use of intervention between the index test and outcome. As this review was nearing completion, QUADAS-2 was published, and so all included papers were re-assessed in line with the QUADAS-2 recommendations. Elements of study design that were likely to have a direct relationship to bias in a test accuracy study were assessed using the STARD checklist.
From the 2 × 2 tables in each study, odds ratios (OR) were computed with their 95% confidence intervals (CI) for each measure of amniotic fluid (at all reported thresholds) and its outcome pair. Results were pooled using a random effects meta-analysis model[21, 22] where the definition of the measure of amniotic fluid volume, the threshold used and the outcome measure were the same. Odds ratios were selected as the summary statistic, to assess prognostic ability, because they represent the effect of the test on the odds in an unbiased fashion and enable the results of case–control and cohort studies to be included. They are often used to demonstrate an epidemiological association.
A random effects meta-analysis model was chosen for each test due to the expected presence of clinical and statistical heterogeneity between studies. This approach synthesises the log OR estimates and weights each study by the inverse of the study's variance plus between-study variance to produce a summary estimate of the average prognostic effect of a test. As a test's prognostic ability may vary from this average from setting to setting, after each random-effects meta-analysis if I² > 0% we also estimated a prediction interval to reveal the potential prognostic effect if the test is applied in a single setting similar to one of the studies from our analysis. This was calculated where three or more studies were included in the meta-analysis.
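The weighting scheme and prediction interval described above can be sketched as follows. This is a minimal illustration, not the Stata routine used in the review: it assumes the DerSimonian-Laird moment estimator for the between-study variance and the usual t-based approximation for the prediction interval, neither of which is named explicitly in the text. The function names and the caller-supplied `t_crit` argument are hypothetical.

```python
import math

def random_effects_pool(log_ors, variances):
    """Pool log odds ratios with a random effects model.

    Assumes the DerSimonian-Laird estimator for tau^2 (an assumption).
    Returns (mu, se_mu, tau2): summary log OR, its standard error and
    the between-study variance estimate.
    """
    k = len(log_ors)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    mu_fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (y - mu_fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 else 0.0
    # Random-effects weights: inverse of (within-study + between-study variance)
    w_star = [1 / (v + tau2) for v in variances]
    mu = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se_mu = math.sqrt(1 / sum(w_star))
    return mu, se_mu, tau2

def prediction_interval(mu, se_mu, tau2, t_crit):
    """Approximate 95% prediction interval on the odds ratio scale.

    t_crit is the 0.975 quantile of Student's t with k - 2 degrees of
    freedom, supplied by the caller (e.g. 12.706 for k = 3 studies).
    """
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return math.exp(mu - half_width), math.exp(mu + half_width)

# Hypothetical log ORs and variances for three studies
mu, se_mu, tau2 = random_effects_pool([0.0, 1.0, 2.0], [0.25, 0.25, 0.25])
low, high = prediction_interval(mu, se_mu, tau2, t_crit=12.706)
```

Because the prediction interval adds the between-study variance to the uncertainty in the summary estimate, it is always at least as wide as the confidence interval, which is why it can include 1 even when the summary odds ratio is clearly above 1.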
We plotted summary odds ratio data in forest plots and assessed the between-study heterogeneity in prognostic effect of each test by estimating I² (the amount of variability in prognostic effects due to between-study heterogeneity) and tau-squared. Where possible we performed meta-regression or subgroup analysis as appropriate to examine the effect of potential confounding factors: singleton or multiple birth status, timing of test in relation to delivery, gestation of pregnancy at time of testing, high-risk or low-risk population as assessed by the study authors, and study quality were considered to be important factors that may influence the strength of the association between amniotic fluid measurement and adverse outcome.
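For reference, the heterogeneity statistic mentioned above is conventionally defined from Cochran's Q. The notation here is ours, not the source's: y_i is the log odds ratio of study i, v_i its variance and k the number of studies.

```latex
Q = \sum_{i=1}^{k} w_i \left( y_i - \bar{y}_w \right)^2, \qquad
w_i = \frac{1}{v_i}, \qquad
\bar{y}_w = \frac{\sum_{i} w_i y_i}{\sum_{i} w_i}, \qquad
I^2 = \max\!\left( 0,\ \frac{Q - (k - 1)}{Q} \right) \times 100\%
```

An I² of 70%, for example, would mean that roughly 70% of the observed variability in log odds ratios is attributed to between-study heterogeneity rather than to sampling error.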
In studies where there were cells in the 2 × 2 table with a value of 0, 0.5 was added to all cells to allow the calculation of log odds ratios and their variances for meta-analysis. Meta-analyses were performed where two or more studies reported the same index test and outcome measure.
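The calculation of a study-level odds ratio, including the zero-cell correction just described, might look like the sketch below. It assumes the standard large-sample (Woolf) variance for the log odds ratio, which the text does not name, and the function and argument names are hypothetical.

```python
import math

def log_odds_ratio(tp, fp, fn, tn):
    """Return (log OR, variance) for one 2 x 2 table.

    tp: test abnormal, outcome present   fp: test abnormal, outcome absent
    fn: test normal, outcome present     tn: test normal, outcome absent
    """
    cells = [tp, fp, fn, tn]
    if any(c == 0 for c in cells):
        # A zero cell makes the log OR undefined, so add 0.5 to all cells.
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf variance (assumption)
    return log_or, variance

def odds_ratio_ci(tp, fp, fn, tn, z=1.96):
    """Return (OR, lower, upper) with an approximate 95% CI."""
    log_or, variance = log_odds_ratio(tp, fp, fn, tn)
    half_width = z * math.sqrt(variance)
    return (math.exp(log_or),
            math.exp(log_or - half_width),
            math.exp(log_or + half_width))

# Hypothetical table with one empty cell, to show the correction being applied
print(odds_ratio_ci(0, 5, 10, 20))
```

The log odds ratios and variances returned here are the inputs that a random effects model would then pool.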
The primary outcomes were considered to be birthweight <10th centile and birthweight <2500 g for measurements of small for gestational age, and perinatal mortality, abnormal cord pH (<7.20) and adverse perinatal outcome for wellbeing. A composite outcome measure (adverse perinatal outcome) was employed by some included studies to maximise the number of events that could be included in the analysis and avoid the need to select a single morbidity/mortality as a primary outcome measure. However, a hazard of composite outcome measures is the assumption that the significance of the result applies to all components. To address this issue, we analysed the component outcomes as subgroups when these were reported (see Appendix S4). When the composite outcome measure was used, care was taken to ensure that each individual was only counted once in each analysis, particularly where studies reported multiple outcomes for a single population. Where multiple outcomes and test thresholds were reported, attempts were made to select the most consistent threshold and outcome across the analysis. It should be noted that for the outcomes of neonatal death and perinatal mortality, these were the outcomes as used in the included studies and there is no overlap between the studies included in each outcome.
To explore for the presence of publication bias, the Peters test (a weighted linear regression with ln odds ratio as the dependent variable and the inverse of the total sample size as the independent variable) was performed to assess funnel plot asymmetry for each meta-analysis containing ten or more studies, with a significance level of 10% used.
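The regression underlying this test can be sketched as a weighted least squares fit. This is an illustrative outline only, not the `metabias` implementation: the weights are left as an input because the text does not state the weighting scheme, the function name is hypothetical, and the P value (from a t distribution with k - 2 degrees of freedom) is not computed here.

```python
import math

def peters_regression(log_ors, totals, weights):
    """Weighted least squares of log OR on 1 / total sample size.

    Returns (slope, se_slope, t_stat). A slope far from zero suggests
    funnel plot asymmetry (small-study effects).
    """
    k = len(log_ors)
    if k < 3:
        raise ValueError("need at least three studies")
    x = [1 / n for n in totals]  # independent variable: inverse sample size
    sw = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sw
    y_bar = sum(w * yi for w, yi in zip(weights, log_ors)) / sw
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, log_ors))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Weighted residual sum of squares, used to estimate the residual variance
    rss = sum(w * (yi - intercept - slope * xi) ** 2
              for w, xi, yi in zip(weights, x, log_ors))
    se_slope = math.sqrt(rss / (k - 2) / sxx)
    return slope, se_slope, slope / se_slope

# Hypothetical data for four studies, equally weighted
slope, se_slope, t_stat = peters_regression(
    [1.0, 0.8, 0.5, 0.6], [100, 200, 400, 800], [1.0, 1.0, 1.0, 1.0])
```

In this hypothetical data set the smaller studies report larger log odds ratios, so the fitted slope is positive.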
Odds ratios significantly >1 indicate a prognostic association between amniotic fluid measurement and poor outcome; on average in the population this measure was associated with a worse outcome. We considered odds ratios between 1 and 2 to be mild associations, 2 and 5 to be moderate associations, and >5 as strong associations. In particular, although all odds ratios >1 indicate a prognostic association at the population level, we felt that only an odds ratio >5 would indicate a sufficient discrepancy between amniotic fluid volumes that may have predictive ability at the individual level. Therefore we only considered test accuracy at the individual level (in terms of sensitivity and specificity of a test) when its odds ratio was >5, the 95% CI did not cross 1 and there was statistical significance. We assessed the predictive ability of the test by calculating summary sensitivity, specificity and likelihood ratios, again using data from the 2 × 2 tables and synthesising using a bivariate random-effects meta-analysis model. Likelihood ratios indicate by how much a given test result raises or lowers the odds of having the disease and have been recommended by Evidence-based Medicine Groups[32, 33] as they show how the test result informs clinical decision making.
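For a single 2 × 2 table, the accuracy measures named above follow directly from the cell counts, as in this minimal sketch. It shows per-study calculations only; the bivariate random-effects pooling used for the summary estimates is not reproduced here. Function names and the example counts are hypothetical.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return (sensitivity, specificity, LR+, LR-) for one 2 x 2 table.

    tp/fn: outcome present with abnormal/normal test result
    fp/tn: outcome absent with abnormal/normal test result
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return sensitivity, specificity, lr_positive, lr_negative

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability using a likelihood ratio (odds form)."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Hypothetical counts: 100 fetuses with the outcome, 100 without
sens, spec, lr_pos, lr_neg = accuracy_measures(tp=40, fp=10, fn=60, tn=90)
```

A likelihood ratio multiplies the pre-test odds, so a negative likelihood ratio close to 1 leaves the probability of the outcome almost unchanged after a normal test result.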
All analyses were performed in Stata version 11.0 (StataCorp, College Station, TX, USA) using the metan, metandi and metabias commands.[34-36] Summary results were displayed in forest plots generated using StatsDirect.
Figure 1 summarises the process of literature identification and selection. Of the 6259 potential citations, 43 primary articles were included in the critical appraisal and systematic review. Appendix S4 details the individual study characteristics of the included studies and their references. There were 43 studies included overall, reporting on 244 493 fetuses. The commonest index tests reported were amniotic fluid index ≤5 cm (number of studies, n = 23) and maximum pool depth (MPD) ≤2 cm (n = 6; Figure 1). The outcome measures reported most often were birthweight <2500 g (n = 6), birthweight <10th centile (n = 13), Apgar score at 1 minute <7 (n = 12), Apgar score at 5 minutes <7 (n = 17), umbilical cord pH <7.20 (n = 5), admission to neonatal intensive care unit (n = 16), perinatal mortality (n = 9), neonatal death (n = 5) and adverse perinatal outcome (n = 9). Only four papers reported results for ponderal index, and no papers used a measure of fetal growth restriction as an outcome, e.g. fetal weight <10th centile and abnormal Dopplers.
Figure 2 shows a summary of the quality assessment of included studies. There was good compliance with appropriate population spectrum, selection criteria adequately described and appropriate reference standard. There was poor compliance with adequate description of index and reference standard (Appendix S3 for adequate criteria). Blinding of the assessors of the outcome measure to the results of the amniotic fluid measurement was also poorly reported (6/43 studies). Only seven studies reported on the use of any treatment in between the amniotic fluid measurement and delivery, or whether the results of the tests were used in determining patient management. When assessing the included papers with the QUADAS-2 recommendations the results were: patient selection, 86% low risk of bias and 9% high concerns re applicability; index test, 65% low risk of bias and low concerns re applicability (mainly due to inadequate description of index test); reference standard, only 9% had low risk of bias (due to non-blinding and poor reporting of method of reference standard) but applicability was high with only 3% of studies having concerns re applicability; for flow and timing 16% of studies had low concerns regarding possibility of bias and this was mainly due to poor reporting of any intervention between index and reference standard.
Five of the 12 meta-analyses performed for oligohydramnios (according to threshold used and outcome definition) provided a summary odds ratio and 95% CI that demonstrated a significant association between oligohydramnios and small for gestational age as measured by birthweight <2500 g or <10th centile (Figure 3). The summary odds ratio estimate was generally above 2, suggesting a reasonably large prognostic association on average. However, there was large heterogeneity for most meta-analyses, even after subgrouping studies (Figure 3), with I² often over 70%. The heterogeneity is reflected in the wide prediction intervals, which generally include an odds ratio of 1, and reveal that in individual settings the association might vary considerably from the average, and may not even be important in some situations.
For birthweight <10th centile subgroup analysis found only a significant effect from a high-risk population. On average across all measures of oligohydramnios in high-risk populations, the summary odds ratio was 6.31 (95% CI 4.15–9.58) and this was a rare situation where the prediction interval was also entirely above 1 (2.23–17.81; Figure 3). This is compared with a summary odds ratio of 2.34 (95% CI 1.76–3.09) and prediction interval (0.38–14.43) in a low-risk/unselected population. Given this, we also evaluated the summary accuracy of oligohydramnios for correctly predicting a birthweight <10th centile in a high-risk population. Across all measures of oligohydramnios, the summary sensitivity was 0.4 (0.12–0.76), summary specificity was 0.91 (0.66–0.98), summary positive likelihood ratio was 4.23 (2.38–7.52), and summary negative likelihood ratio was 0.66 (0.41–1.06).
There were insufficient papers (n = 1) for meta-analysis of birthweight <3rd and <5th centiles as outcome measures, but individual studies showed a stronger association with these more severe measures of SGA (Figure 3). Corresponding results for predictive accuracy were, for birthweight <5th centile (MPD <2 cm), sensitivity 0.43 (95% CI 0.18–0.71), specificity 0.92 (95% CI 0.86–0.96), positive likelihood ratio 5.46 (95% CI 2.38–12.5) and negative likelihood ratio 0.62 (95% CI 0.39–0.98). For birthweight <3rd centile (AFI ≤5 cm), the results were sensitivity 0.04 (95% CI 0.03–0.05), specificity 0.99 (95% CI 0.99–0.99), positive likelihood ratio 12.9 (95% CI 8.88–18.7) and negative likelihood ratio 0.97 (95% CI 0.96–0.98). For birthweight <2500 g (AFI ≤5 cm, n = 2 studies, 28 554 fetuses), the accuracy results were summary sensitivity 0.09 (95% CI 0.08–0.11), summary specificity 0.98 (95% CI 0.98–0.99), summary positive likelihood ratio 5.04 (95% CI 0.67–38.11) and summary negative likelihood ratio 0.84 (95% CI 0.63–1.11); there was significant heterogeneity.
All analyses demonstrated an association between oligohydramnios and neonatal death, and the majority also indicated an association with perinatal mortality (Figure 4). Heterogeneity was again large and prediction intervals were wide. Across all measures of oligohydramnios there was a strong association with neonatal death (summary OR 8.72, 95% CI 2.43–31.26, estimated prediction interval 0.19–401.44); this was not significantly changed when deaths possibly due to prematurity were excluded. The summary predictive accuracy was a sensitivity of 0.58 (0.19–0.89), specificity of 0.88 (0.55–0.98), positive likelihood ratio of 5.00 (1.69–14.76) and negative likelihood ratio of 0.48 (0.20–1.14). There was no difference in any of the subgroup analyses.
For perinatal mortality there was a strong association when restricting to a high-risk population (summary OR 11.54, 95% CI 4.05–32.90) compared with any population (summary OR 3.44, 95% CI 0.61–19.43). Predictive accuracy for a high-risk population was sensitivity 0.29 (0.15–0.46), specificity 0.99 (0.99–0.99), positive likelihood ratio 4.52 (0.95–21.45), negative likelihood ratio 0.49 (0.06–4.06; Figure 4).
For abnormal cord pH there was no strong association with any of the measures of oligohydramnios, and subgroup analysis showed no significant effect in particular subgroups of interest. There was also no difference in the association when looking at more acidotic cord pHs. In all of these analyses for cord pH the estimated prediction intervals crossed the line of no effect (Figure 4).
For adverse perinatal outcome there was no strong association and no difference with subgroup analysis.
Most meta-analyses gave a summary odds ratio that suggested a moderate association between oligohydramnios and fetal/neonatal morbidity (Table 1) with the odds ratio typically between 2 and 4. As above, heterogeneity was typically large, even when subgroup analyses were considered, and this led to wide prediction intervals.
|Outcome measure||No. of included studies||No. of fetuses||Odds ratio (95% CI)||Tau||I² (%)||Estimated prediction interval (EPI)||Sensitivity (95% CI)||Specificity (95% CI)||LR +ve (95% CI)||LR −ve (95% CI)|
|Resuscitation (AFI ≤5 cm, high-risk population)||1||565||12.02 (3.82–37.89)||–||–||–||0.31 (0.11–0.59)||0.96 (0.94–0.98)||8.58 (3.69–19.96)||0.71 (0.51–0.99)|
|Admission to NICU||15||43 222||2.05 (1.21–3.45)||0.8||86.1||0.29, 14.54||–||–||–||–|
|AFI ≤5 cm||12||38 202||1.64 (0.76–3.53)||1.4||89.1||0.1, 26.22||–||–||–||–|
|Fetal distress||12||12 794||2.69 (1.27–5.70)||1.4||85.8||0.17, 42.3||–||–||–||–|
|AFI ≤5 cm||9||39 839||1.86 (0.95–3.67)||0.7||79.5||0.21, 16.69||–||–||–||–|
|MPD <2 cm||2||24||1.15 (0.22–6.12)||0.6||29.4||–||–||–||–||–|
|MPD <1 cm||1||30||7.93 (3.35–18.77)||–||–||–||0.67 (0.47–0.83)||0.80 (0.72–0.86)||3.31 (2.19–5.0)||0.42 (0.25–0.70)|
|APGAR 1 minute <7||8||35 694||2.94 (1.1–7.91)||1.7||91.3||0.09, 93.17||–||–||–||–|
|AFI ≤5 cm||5||34 828||3.33 (0.9–12.27)||2||93.6||0.02, 471.07||–||–||–||–|
|MPD <2 cm||1||56||0.67 (0.12–3.65)||–||–||–||–||–||–||–|
|APGAR at 5 minutes <7||19||47 431||2.61 (1.32–5.17)||1.2||81.7||0.22, 31.23||–||–||–||–|
|Test within 24 hours||2||45 090||9.77 (4.93–19.37)||0.1||10.7||–||0.06 (0.04–0.08)||0.99 (0.99–0.99)||6 (1.47–24.57)||0.95 (0.93–0.97)|
|AFI ≤5 cm||11||45 542||2.89 (1.12–7.49)||1.7||87.9||0.12, 70.5||–||–||–||–|
|MPD <2 cm||1||56||0.8 (0.03–20.62)||–||–||–||–||–||–||–|
|Morbidity||6||11 400||1.81 (1.0–3.3)||0||0||0.78, 4.23||–||–||–||–|
|AFI ≤5 cm||2||6919||1.66 (0.36–7.68)||0||0||–||–||–||–||–|
|Preterm delivery <37 weeks||4||34 508||1.87 (0.33–10.67)||2.9||93.4||0, 7019.56||–||–||–||–|
For assisted ventilation/intubation there was an especially strong association but this was from a single study in a term population (Table 1). For fetal distress the association was generally around an odds ratio of 2 with no significant effects demonstrated in the subgroup analysis. When looking at individual measures of oligohydramnios (amniotic fluid index ≤5 cm, or maximum pool depth <2 cm, <1 cm) only maximum pool depth <1 cm (single study) showed a significant association (Table 1). For Apgar score at 5 minutes <7 there was a significant association when looking at tests performed within 24 hours of delivery.
Where a direct comparison was possible between AFI and MPD for an individual outcome within the same study then predictive accuracy results were calculated to directly compare the different measures (Table 2). There was no difference for the outcomes of Apgar scores, admission to neonatal intensive care unit, fetal distress, neonatal death or perinatal mortality. For adverse perinatal outcome and birthweight <10th centile there were improved positive likelihood ratios for MPD versus AFI with no significant change in specificity or negative likelihood ratio.
|Author||Index test||Outcome||Sensitivity||95% CI||Specificity||95% CI||LR+ve||95% CI||LR−ve||95% CI|
|Chauhan et al.[46]||AFI ≤5 cm||Birthweight <10th centile||0.14||0.04–0.33||0.97||0.92–0.99||4.2||1.20–14.68||0.89||0.76–1.04|
|Chauhan et al.[46]||MPD ≤2 cm||Birthweight <10th centile||0.04||0.00–0.18||1||0.98–1.0||15.31||0.64–366.61||0.95||0.87–1.04|
|Youssef et al.[47]||AFI ≤5 cm||Birthweight <10th centile||0.79||0.62–0.91||0.69||0.61–0.77||2.59||1.91–3.50||0.3||0.15–0.58|
|Youssef et al.[47]||MPD ≤1 cm||Birthweight <10th centile||0.56||0.38–0.73||0.79||0.71–0.85||2.61||1.69–4.03||0.56||0.38–0.83|
|Desari et al.[48]||AFI ≤5 cm||Admission to NICU||0.22||0.03–0.60||0.65||0.54–0.75||0.63||0.18–2.21||1.2||0.82–1.76|
|Desari et al.[48]||MPD ≤3 cm||Admission to NICU||0.44||0.14–0.79||0.4||0.30–0.50||0.74||0.35–1.56||1.4||0.74–2.66|
|Morris et al.[49]||AFI ≤5 cm||Admission to NICU||0.12||0.05–0.21||0.92||0.91–0.94||1.5||0.79–2.84||0.96||0.88–1.04|
|Morris et al.[49]||MPD ≤2 cm||Admission to NICU||0.01||0–0.07||0.99||0.98–0.99||0.92||0.13–6.75||1||0.98–1.03|
|Myles et al.[50]||AFI ≤5 cm||Admission to NICU||0||0–0.6||0.87||0.82–0.91||0.74||0.05–10.46||1.04||0.77–1.40|
|Myles et al.[50]||MPD ≤2.5 cm||Admission to NICU||0||0–0.6||0.86||0.81–0.90||0.68||0.05–9.63||1.05||0.78–1.42|
|Morris et al.[49]||AFI ≤5 cm||Fetal distress||0.29||0.04–0.71||0.92||0.91–0.94||3.66||1.12–11.96||0.78||0.49–1.24|
|Morris et al.[49]||MPD ≤2 cm||Fetal distress||0||0–0.41||0.99||0.98–0.99||4.38||0.29–66.21||0.95||0.80–1.14|
|Myles et al.[50]||AFI ≤5 cm||Fetal distress||0.2||0.10–0.32||0.89||0.84–0.93||1.72||0.90–3.29||0.91||0.79–1.04|
|Myles et al.[50]||MPD ≤2.5 cm||Fetal distress||0.18||0.09–0.30||0.87||0.81–0.91||1.34||0.69–2.59||0.95||0.83–1.08|
|Youssef et al.[47]||AFI ≤5 cm||Fetal distress||0.87||0.69–0.96||0.69||0.61–0.77||2.84||2.14–3.78||0.19||0.08–0.48|
|Youssef et al.[47]||MPD ≤1 cm||Fetal distress||0.67||0.47–0.83||0.8||0.72–0.86||3.31||2.19–5.00||0.42||0.25–0.70|
|Youssef et al.[47]||AFI ≤5 cm||Apgar at 5 minutes <7||0.89||0.65–0.99||0.65||0.57–0.73||2.57||1.96–3.37||0.17||0.05–0.63|
|Youssef et al.[47]||MPD ≤1 cm||Apgar at 5 minutes <7||0.72||0.47–0.90||0.77||0.70–0.83||3.13||2.09–4.69||0.36||0.17–0.76|
|Fischer et al.[51]||AFI ≤5 cm||Adverse perinatal outcome||0.29||0.13–0.51||0.89||0.84–0.93||2.67||1.26–5.69||0.8||0.61–1.03|
|Fischer et al.[51]||MPD ≤2 cm||Adverse perinatal outcome||0.25||0.10–0.47||0.96||0.92–0.98||6.21||2.28–16.95||0.78||0.62–0.99|
|Fischer et al.[51]||MPD ≤1 cm||Adverse perinatal outcome||0.13||0.03–0.32||1||0.98–1.00||49||2.61–920.79||0.86||0.74–1.01|
|Desari et al.[48]||AFI ≤5 cm||Neonatal death||0.33||0.01–0.91||0.66||0.56–0.75||0.98||0.19–4.97||1.01||0.45–2.28|
|Desari et al.[48]||MPD ≤3 cm||Neonatal death||0.33||0.01–0.91||0.4||0.30–0.51||0.56||0.11–2.79||1.67||0.72–3.83|
|Youssef et al.[47]||AFI ≤5 cm||Perinatal mortality||0.88||0.47–0.99||0.62||0.54–0.70||2.31||1.66–3.20||0.2||0.03–1.27|
|Youssef et al.[47]||MPD ≤1 cm||Perinatal mortality||0.75||0.35–0.97||0.74||0.67–0.81||2.9||1.80–4.66||0.34||0.10–1.12|
There were five papers, including 144 681 fetuses, that reported on polyhydramnios and adverse outcomes (Table 3). Thresholds included AFI ≥24 cm (three papers), AFI ≥25 cm (one paper) and in one paper there was no threshold reported.
|Outcome measure, any measure of polyhydramnios||No. of studies||No. of fetuses||Odds ratio||95% CI||Tau||I² (%)||Estimated prediction interval (EPI)||Sensitivity||95% CI||Specificity||95% CI||LR+ve||95% CI||LR−ve||95% CI|
|Birthweight <10th centile||2||5702||0.37||0.07–1.95||1.32||90.9||–||–||–||–||–||–||–||–||–|
|Birthweight <2500 g||2||45 795||2.38||0.42–13.58||1.19||68.9||–||–||–||–||–||–||–||–||–|
|Birthweight >90th centile||1||3960||11.41||7.09–18.36||–||–||–||0.26||0.18–0.36||0.97||0.96–0.98||8.71||6.01–12.62||0.76||0.68–0.86|
|Apgar <7 at 5 minutes||3||141 901||3.97||1.58–9.99||0.63||95.4||0–49 5049.69||–||–||–||–||–||–||–||–|
|Apgar <7 at 1 minute||2||48 717||3.67||0.91–14.87||0.99||97||–||–||–||–||–||–||–||–||–|
|Admission to NICU||1||44 757||4.89||3.61–6.63||–||–||–||–||–||–||–||–||–||–||–|
|Need for caesarean section||1||3960||1.57||1.09–2.24||–||–||–||–||–||–||–||–||–||–||–|
|Neonatal death||2||48 717||5.07||0.69–37.3||1.97||95.1||–||–||–||–||–||–||–||–||–|
|Perinatal mortality||2||97 144||3.2||1.97–5.20||0.1||78.9||–||–||–||–||–||–||–||–||–|
|Intrauterine death||2||48 717||4.13||1.36–12.48||0.53||83.9||–||–||–||–||–||–||–||–||–|
There was no evidence of an association between polyhydramnios and birthweight <10th centile or <2500 g, Apgar score at 1 minute <7, fetal distress or neonatal death. There was a strong positive association between polyhydramnios and birthweight >90th centile, and this corresponded to low sensitivity with high specificity. There was significant heterogeneity throughout.
The Peters test was performed on all meta-analyses that included ten or more studies (admission to neonatal intensive care unit, fetal distress, 5-minute Apgar <7, abnormal pH and birthweight <10th centile). There was no significant evidence of small study effects (P values 0.21, 0.19, 0.61, 0.35 and 0.62, respectively).
This is the first study to look at all measurements of amniotic fluid and compare their prognostic association and, where appropriate, their ability to predict adverse outcomes for individuals. The results demonstrate that there is an especially strong association between oligohydramnios and small for gestational age and mortality. Polyhydramnios was associated with birthweight >90th centile. For the measures that gave large summary odds ratios >5, the summary positive and negative likelihood ratios indicate that oligohydramnios and polyhydramnios substantially change the odds of an adverse outcome. However, the low summary sensitivity and the negative likelihood ratios close to 1 reveal that a negative test result does not discriminate accurately between those who will and those who will not have the adverse outcome. Hence, to accurately predict risk of adverse outcome, oligohydramnios and polyhydramnios should be used in conjunction with other prognostic factors as part of a prognostic model.
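A worked example may make this concrete. It uses the summary likelihood ratios reported in the Results for birthweight <10th centile in high-risk populations (positive 4.23, negative 0.66) together with a purely hypothetical pre-test probability of 10%, chosen for illustration and not taken from the included studies.

```latex
\begin{aligned}
\text{pre-test odds} &= \frac{0.10}{1 - 0.10} \approx 0.111 \\
\text{post-test odds (oligohydramnios present)} &= 0.111 \times 4.23 \approx 0.470
  \;\Rightarrow\; p \approx \frac{0.470}{1 + 0.470} \approx 0.32 \\
\text{post-test odds (oligohydramnios absent)} &= 0.111 \times 0.66 \approx 0.073
  \;\Rightarrow\; p \approx \frac{0.073}{1 + 0.073} \approx 0.07
\end{aligned}
```

Under these assumptions a positive result raises the probability from 10% to about 32%, whereas a negative result lowers it only to about 7%, so a normal amniotic fluid measurement offers limited reassurance on its own.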
The inferences for clinical practice that can be made from the results of this study are limited by the biases introduced from the designs of the included studies and, in particular, the treatment/intervention paradox, which is discussed further below.
This review provides the most up-to-date summary and meta-analysis of the association and predictive ability of abnormal liquor volume with small for gestational age and adverse fetal and neonatal wellbeing. The strengths of our review are in the methodology used complying with existing guidelines for systematic reviews of diagnostic studies and contemporary methods for meta-analysis.[10-13, 31] Our search was extensive across many databases with no language restrictions. We have rigorously assessed study quality and reporting quality looking at risk of bias and applicability and assessed for publication bias. Heterogeneity has been explored using meta-regression and subgroup analysis. A further strength to our review is the exclusion of patients with ruptured membranes and structural or chromosomal anomalies.
The limitations of our review stem from the quality of the primary research. Our quality assessment revealed concerns regarding possibility of bias through patient selection, performance of the index test and reference standard. We were unable to perform subgroup analysis for preterm versus term pregnancies, and some studies reported insufficient data to determine whether thresholds for amniotic fluid measurement were adjusted for gestation. Where possible we used the results obtained closest to delivery and performed subgroup analysis for those where the test was performed within 7 days of delivery. In particular, there was very poor reporting regarding the exact methods of the reference standards and whether any treatment was used between the performance of the index test and the reference standard. A major concern, therefore, is the number of pregnancies in which labour was induced because of the finding of oligohydramnios, which influences the results for pregnancy outcome, i.e. intervention bias. This bias can only truly be removed by performing an RCT; however, such a trial would be impossible to perform because measurements of amniotic fluid volume have become the standard in fetal surveillance and management of high-risk pregnancies, and so recruitment would be very difficult. Finally, the outcome measures used in this review were those reported by the authors of the included studies; it is recognised that many of these outcome measures are subjective (e.g. admission to neonatal intensive care unit, need for resuscitation). The only real objective measures of poor fetal outcome are paired samples of cord pH and longer-term outcomes such as cerebral palsy, which were not reported.
This study examines the strength of association between measures of amniotic fluid and adverse outcomes and, where appropriate, their predictive accuracy. To our knowledge this is the first systematic review and meta-analysis to do so. However, for a test to be recommended in clinical practice it must be reliable, accurately reflect the condition it is diagnosing, and usefully predict adverse outcome, such that when used to determine management it ultimately improves pregnancy outcome. The reliability of measures of amniotic fluid has been assessed in previous studies. These concluded that reproducibility can be affected by fetal position, transducer pressure, maternal hydration and use of colour Doppler, reflecting observer variation or variation in the fluid volume itself.[38-41] Determining which measure (AFI versus MPD) more accurately reflects true oligohydramnios requires comparison with dye dilution techniques or with volumes assessed at caesarean section. This was performed by Magann et al. in 2000, and the authors concluded that both techniques were unreliable in determining true amniotic volume. Two previous systematic reviews and meta-analyses have looked at the effect of measurements of amniotic fluid on pregnancy outcome: the first, by Magann et al., included non-randomised and randomised controlled trials; the second, by Nabhan et al., included only RCTs. Both concluded that there was no evidence that either method (AFI or MPD) was superior to the other in preventing adverse pregnancy outcome, and noted that AFI characterised more women as having oligohydramnios, leading to an increase in obstetric interventions without any improvement in pregnancy outcome.[8, 43]
Oligohydramnios is associated with small for gestational age and mortality. Polyhydramnios is associated with birthweight >90th centile. These strong associations mean that a positive test for oligohydramnios or polyhydramnios modifies the odds of an adverse outcome. However, to improve the accuracy of predicting future outcome risk for individuals, oligohydramnios and polyhydramnios need to be combined with other prognostic factors within a prognostic model.
Despite some strong associations between oligohydramnios and birthweight <10th centile and mortality, the predictive ability for individuals was poor, with generally good specificity and positive likelihood ratios but low sensitivity (<0.5) and negative likelihood ratios near 1. This can be interpreted as an increased risk (odds) of adverse outcome for those who test positive (compared with pretest risk), whereas for those who test negative there is minimal change in the risk of an adverse outcome. There was no significant difference in association or predictive accuracy between AFI and MPD, apart from a non-significant improvement in positive likelihood ratios with MPD for adverse perinatal outcome and birthweight <10th centile.
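The interpretation of likelihood ratios described above can be illustrated with a minimal sketch. It uses the standard definitions (positive likelihood ratio = sensitivity / (1 − specificity); negative likelihood ratio = (1 − sensitivity) / specificity; post-test odds = pretest odds × likelihood ratio). The function names and the numerical inputs (sensitivity 0.30, specificity 0.90, pretest risk 10%) are hypothetical values chosen only for illustration and are not estimates from this review.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity (both strictly between 0 and 1)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical illustrative inputs, NOT results from the review:
# low sensitivity, good specificity, 10% pretest risk of adverse outcome.
sensitivity, specificity, pretest = 0.30, 0.90, 0.10

lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
risk_if_positive = post_test_probability(pretest, lr_pos)
risk_if_negative = post_test_probability(pretest, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Risk after positive test: {risk_if_positive:.1%}")
print(f"Risk after negative test: {risk_if_negative:.1%}")
```

Under these assumed inputs, a positive test raises the risk well above the pretest level, whereas a negative test lowers it only slightly, because a negative likelihood ratio close to 1 leaves the pretest odds almost unchanged.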
Although not accurate for individual prediction, the evidence indicates that oligohydramnios is a prognostic factor for birthweight <10th centile and mortality. As such, it has many potential uses: for example, informing randomisation strategies in clinical trials; serving as a confounder to adjust for in observational studies and unbalanced trials; and being combined with other prognostic factors to allow more accurate predictions for individuals. However, given the limitations discussed, it would seem prudent to limit its use to high-risk pregnancies in which intervention (such as early delivery) would be considered.
Future research needs to investigate further the test accuracy of measures of amniotic fluid volume using appropriately designed test accuracy studies with suitable sample size calculations. It should also consider the value of the test within the diagnostic and management pathway and what can be done to improve the test's diagnostic and therapeutic yield. For example, the use of amniotic fluid measures within the biophysical profile assessment and in combination with umbilical artery Doppler needs to be assessed in the same rigorous manner. It is important that any future research addresses the issue of the different types of measurements, the varying thresholds used and the unexplained heterogeneity identified within this review. The present evidence demonstrates no significant difference in accuracy between MPD and AFI, and effectiveness evidence has also supported the use of MPD. As MPD is a much easier technique to perform, it should become recommended practice until more robust evidence becomes available.
We declare no conflicts of interest.
All authors were responsible for the design of the study. RKM, CHM, JT and GLM were responsible for the data extraction and RKM, GLM, RR, CHM, JT, MDK, SCR and KSK for the analysis. All authors checked the analysis and were involved in the drafting and critical revision of the manuscript and accept responsibility for the manuscript as published.
As this was a systematic review of published data, ethical approval was not required.
Dr R K Morris is funded by an NIHR Clinical Lectureship. Dr Richard Riley is supported by funding from the MRC Midlands Hub for Trials Methodology Research, at the University of Birmingham (Medical Research Council Grant ID G0800808).
We thank Dr Pradeep Jayaram, who helped with some of the data extraction.