Accuracy of first-trimester ultrasound in the diagnosis of early embryonic demise: a systematic review

Abstract

Objectives

To evaluate, by systematic review of the literature, the accuracy of first-trimester ultrasound in diagnosing early embryonic demise.

Methods

We searched MEDLINE (1951–2011), Embase (1980–2011) and the Cochrane Library (2010) for relevant citations. The reference lists of all known primary and review articles were examined. Language restrictions were not applied. Studies which evaluated the accuracy of first-trimester ultrasonography in pregnant women for the diagnosis of early embryonic demise were selected in a two-stage process and their data extracted by two reviewers. Accuracy measures including sensitivity, specificity and likelihood ratios (LRs) for abnormal and normal test results were calculated for each study and for each test threshold.

Results

Eight primary articles with four test categories (18 2 × 2 tables), involving 872 women, evaluated the accuracy of ultrasound in diagnosing early embryonic demise. The lower limit of the 95% CI for specificity was > 0.95 in only two tests. These were an empty gestational sac with mean diameter of ≥ 25 mm and absent yolk sac with a mean gestational sac diameter of ≥ 20 mm (specificity, 1.00; 95% CI, 0.96–1.00 for both).

Conclusions

There is a paucity of high-quality, prospective data on which to base guidelines for the accurate diagnosis of early pregnancy demise. The findings are limited by the small number of studies and patients, the age of the studies, inclusion of symptomatic and asymptomatic women and variable reference standards for diagnosis of early pregnancy demise. Before guidelines for the safe management of threatened miscarriage can be formulated, there is an urgent need for an appropriately powered, prospective study using current ultrasound technology and an agreed reference standard for pregnancy success or loss. Copyright © 2011 ISUOG. Published by John Wiley & Sons, Ltd.

Introduction

Ultrasound examination is the method of choice in the diagnosis of early embryonic demise1. One in three women miscarries at some time during reproductive life and the incidence of early embryonic demise is high compared with other early pregnancy complications2. The diagnosis of failed pregnancy has implications for further management, with associated emotional impact on the mother. Most of the current recommendations regarding the diagnosis of early embryonic demise arose as a result of the public enquiry report investigating the misdiagnosis of the death of embryos3.

Recommendations regarding the ultrasound criteria for the diagnosis of pregnancy failure in the first trimester vary. The American College of Radiology (ACR) recommends a diagnosis of early embryonic demise when the embryo has a crown–rump length (CRL) > 5 mm without cardiac activity4. In the UK, the joint report of the Royal College of Obstetricians and Gynaecologists (RCOG) and the Royal College of Radiologists (RCR) recommends using the following criteria to diagnose pregnancy of ‘uncertain viability’: an intrauterine gestational sac of < 20 mm in mean diameter with no obvious yolk sac, or presence of a fetus or fetal echo of < 6 mm CRL with no obvious fetal heart activity5, 6. The Society of Obstetricians and Gynaecologists of Canada (SOGC) recommends diagnosis of early embryonic demise with certainty when the mean gestational sac diameter exceeds 8 mm without a yolk sac or when the mean gestational sac diameter exceeds 16 mm without an embryo on transvaginal scan7. The evidence for all of the above recommendations came from very small studies.

We undertook a systematic review of the literature to assess the accuracy of the various ultrasound criteria in diagnosing early embryonic demise.

Methods

This review was carried out with a prospective protocol using well-accepted methodology8.

Search strategy

We searched MEDLINE (1951–2011), Embase (1980–2011) and the Cochrane Library (2010) for relevant citations. We used a combination of MeSH and text words to generate two subsets of citations, one indexing ultrasound (‘ultrasound’, ‘ultrasonography’, ‘exp ultrasound’) and the other indexing outcomes (‘miscarriage’, ‘abortion’, ‘pregnancy loss’, ‘early’ AND ‘pregnancy’ AND ‘failure’, ‘fetal OR foetal OR fetus OR foetus’ AND ‘death OR demise’). These two subsets were then combined with ‘AND’ to generate a subset of citations relevant to our research question. The reference lists of all known primary and review articles were examined to identify cited articles not captured by the electronic searches. Language restrictions were not applied. A comprehensive database of relevant articles was constructed.

Study selection

Primary studies which evaluated the accuracy of first-trimester ultrasonography in pregnant women for the diagnosis of early embryonic demise were selected in a two-stage process. We included studies that assessed women with or without symptoms of threatened miscarriage in the first trimester. First, the electronic searches were scrutinized and full manuscripts of all citations that were likely to meet the predefined selection criteria were obtained. Second, final inclusion or exclusion decisions were made by two reviewers (Y.J. and S.T.) after examination of these manuscripts. Studies which met the predefined and explicit criteria regarding population, tests, outcomes and study design were selected for inclusion in the review. Disagreements between the two reviewers were resolved by consensus. In cases of duplicate publication, the most recent and complete version was selected. Subjective ultrasound criteria for diagnosis of early embryonic demise were not included.

From each selected article we extracted information on study characteristics, quality and accuracy results. Accuracy data were used to construct 2 × 2 tables of ultrasound findings and pregnancy outcomes.

Methodological quality assessment

All manuscripts meeting the selection criteria were assessed for their methodological quality. Quality was defined as the confidence that the study design, conduct and analysis minimized bias in the estimation of test accuracy. Based on existing checklists, quality assessment involved scrutinizing the study design and relevant features of the population, test and outcomes of the study. A study was considered to be of good quality if it used a prospective design, consecutive enrolment, full verification of the test result with reference standard, and had adequate description of the test.

Data synthesis

Accuracy measures, including sensitivity, specificity and likelihood ratios (LRs) for abnormal and normal test results, were calculated for each study, separately for each test threshold. Heterogeneity of the diagnostic odds ratio was assessed graphically using forest plots and statistically using the chi-square test, to aid decisions on how to proceed with quantitative synthesis. Because, for some tests and outcomes, there was either graphical or statistically significant heterogeneity, we planned to use random-effects meta-analysis. When a quantitative approach was not appropriate because of significant clinical heterogeneity, we refrained from pooling, described the results narratively and reported the accuracy measures estimated in each study. All statistical analyses were performed using the Meta-DiSc statistical package9.
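The per-study accuracy measures described above can be sketched in a few lines of Python. This is a minimal illustration and not the Meta-DiSc implementation: the function name `accuracy_measures` and the example counts are hypothetical, and adding 0.5 to every cell when a table contains a zero is an assumed continuity correction, included so that the likelihood ratios stay finite when sensitivity or specificity equals 1.00.

```python
def accuracy_measures(tp, fn, fp, tn):
    """Return sensitivity, specificity, LR+ and LR- for one 2 x 2 table.

    tp/fn: women with early embryonic demise who tested positive/negative.
    fp/tn: women with viable pregnancies who tested positive/negative.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("each outcome group needs at least one woman")

    # Raw proportions, as they would be reported in an accuracy table.
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)

    # Assumed continuity correction: add 0.5 to all cells if any cell is zero,
    # otherwise LR+ or LR- would involve a division by zero.
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (cell + 0.5 for cell in (tp, fn, fp, tn))

    sens_c = tp / (tp + fn)
    spec_c = tn / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sens_c / (1 - spec_c),   # LR for an abnormal test result
        "lr_neg": (1 - sens_c) / spec_c,   # LR for a normal test result
    }


# Hypothetical counts, chosen only for illustration.
example = accuracy_measures(tp=6, fn=0, fp=0, tn=29)
print(example)
```

With these hypothetical counts, sensitivity and specificity are both 1.00, while the corrected LR+ is about 55.7 and the corrected LR− about 0.07.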

Results

From 720 citations, 23 were reviewed in detail. Eight primary articles with four different categories of tests (18 2 × 2 tables) involving 872 women were included in this systematic review10–17 (Figure 1).

Figure 1.

Flow chart of study selection in this systematic review of accuracy of ultrasound examination in diagnosing early embryonic demise.

Clinical characteristics of the included studies

Four primary articles included women symptomatic of threatened miscarriage10–13, three included both symptomatic and asymptomatic women14–16 and one included only asymptomatic women17 in the first trimester. It was not possible to separate the data of asymptomatic women from those who were symptomatic of threatened miscarriage. The sonographic criteria for diagnosis of early embryonic demise included varied CRL measurements with absent cardiac activity, size of empty gestational sac, absent yolk sac with varied gestational sac sizes and combined criteria (Table 1). The reference standard for pregnancy loss included diagnosis of miscarriage on further scan, clinically diagnosed miscarriage, histopathology, failure of embryo development, falling levels of beta-human chorionic gonadotropin (β-hCG) and evaluation of fetal status on second-trimester ultrasound.

Table 1. Clinical characteristics of studies included in this review of accuracy of early pregnancy ultrasound examination in diagnosing early embryonic demise

CRL, crown–rump length; GS, gestational sac; MSD, mean gestational sac diameter; Pop., population; US, ultrasound examination; YS, yolk sac.

Nyberg et al.10 (1986)
  Study type & quality: Cohort study, consecutive recruitment, retrospective, no blinding of test, blinding of reference standard, follow-up adequate, test insufficiently described, reference standard described
  Pop. (n): 168
  Inclusion criteria: Symptomatic for threatened miscarriage
  Diagnostic test, machine & operator: Machine: commercially available real-time sector and linear array systems. Operator: not described
  Diagnostic test, sonographic criteria: No embryo and MSD ≥ 25 mm; no YS and MSD ≥ 20 mm
  Reference standard for pregnancy success or loss: Clinical outcome considered normal if living fetus visualized on subsequent sonogram or if clinical record confirmed normal pregnancy progression; outcome considered abnormal if spontaneous miscarriage or absent growth in subsequent sonograms; pathologic examination of uterine curettings and fall in β-hCG considered to indicate abnormal gestation
  Outcome measures: Clinical miscarriage, US showing absent growth or confirmed normal pregnancy progression

Nyberg et al.14 (1987)
  Study type & quality: Cohort study, arbitrary recruitment, prospective, no blinding of test, no blinding of reference standard, follow-up adequate, test insufficiently described, reference standard described
  Pop. (n): 83
  Inclusion criteria: In first trimester with pelvic pain/bleeding/confirmation of pregnancy (symptomatic and asymptomatic)
  Diagnostic test, machine & operator: Machine: real-time sector and linear array systems using 3.5- or 5-MHz transducer. Operator: not described
  Diagnostic test, sonographic criteria: No embryo and MSD ≥ 25 mm; no YS and MSD ≥ 20 mm
  Reference standard for pregnancy success or loss: Clinical outcome determined by review of medical records and subsequent sonograms; clinical outcome considered abnormal if spontaneous miscarriage or follow-up sonogram demonstrated absent GS growth or absent embryonic development despite adequate (> 14 days) follow-up
  Outcome measures: Spontaneous miscarriage, US showing absent GS/no embryo development after 14 days' follow-up

Scott et al.11 (1987)
  Study type & quality: Cohort study, consecutive recruitment, prospective, no blinding of test, no blinding of reference standard, follow-up adequate, test insufficiently described, reference standard described
  Pop. (n): 102
  Inclusion criteria: Presenting as threatened miscarriage
  Diagnostic test, machine & operator: Machine: high-resolution real-time sector scanner using 3.5- or 5-MHz transducer. Operator: not described
  Diagnostic test, sonographic criteria: Empty GS diameter > 26 mm
  Reference standard for pregnancy success or loss: Viability of GS determined by follow-up US and review of clinical records showing successful outcome of pregnancy
  Outcome measures: Clinical miscarriage, US showing absent growth or normal growth

Levi et al.15 (1988)
  Study type & quality: Cohort study, consecutive recruitment, prospective, no blinding of test, no blinding of reference standard, follow-up adequate, test insufficiently described, reference standard described
  Pop. (n): 55
  Inclusion criteria: Symptomatic and asymptomatic with < 10-week pregnancy
  Diagnostic test, machine & operator: Machine: ESI 1000 or ESI 2000 (Elscint) with 6.5-MHz mechanical sector endovaginal probe. Operator: not described
  Diagnostic test, sonographic criteria: No YS and MSD ≥ 8 mm; no embryo and MSD ≥ 16 mm; no cardiac activity; no embryo or cardiac activity and MSD ≥ 16 mm
  Reference standard for pregnancy success or loss: All patients except those who opted for termination followed up at least until middle of second trimester; pregnancy considered normal if cardiac pulsation identified on subsequent US
  Outcome measures: Clinical miscarriage, US showing absent growth or normal growth at least late in second trimester

Levi et al.16 (1990)
  Study type & quality: Cohort study, consecutive recruitment, retrospective, no blinding of test, no blinding of reference standard, follow-up adequate, test insufficiently described, reference standard described
  Pop. (n): 71
  Inclusion criteria: Symptomatic and asymptomatic
  Diagnostic test, machine & operator: Machine: ESI 1000 or ESI 2000 (Elscint) with 6.5-MHz mechanical sector endovaginal probe. Operator: not described
  Diagnostic test, sonographic criteria: No cardiac activity and CRL < 5 mm; no cardiac activity and CRL < 4 mm
  Reference standard for pregnancy success or loss: All patients followed up until termination of pregnancy or at least until late second trimester
  Outcome measures: Clinical miscarriage or follow-up US showing absent growth or normal growth at least late in second trimester

Ismail and Kishk12 (1991)
  Study type & quality: Cohort study, consecutive recruitment, prospective, no blinding of test, no blinding of reference standard, follow-up adequate, test sufficiently described, reference standard described
  Pop. (n): 86
  Inclusion criteria: Symptomatic for threatened miscarriage
  Diagnostic test, machine & operator: Machine: SDV3000 abdominal real-time (Philips) with 3.5- and 5-MHz sector transducer. Operator: both gynecologist and radiologist
  Diagnostic test, sonographic criteria: Empty GS largest diameter > 20 mm
  Reference standard for pregnancy success or loss: Viability of GS determined by follow-up US after 1–2 weeks; GS considered viable if subsequent US demonstrated live fetus
  Outcome measures: Miscarriage or live fetus on subsequent US

Goldstein17 (1992)
  Study type & quality: Cohort study, arbitrary recruitment, prospective, no blinding of test, no blinding of reference standard, follow-up adequate, test sufficiently described, reference standard described
  Pop. (n): 96
  Inclusion criteria: Positive pregnancy test at first visit (asymptomatic)
  Diagnostic test, machine & operator: Machine: Aloka 633 (Corometrics) with 5-MHz vaginal probe or Siemens SL1 with 5- or 7.5-MHz vaginal probe. Operator: simultaneous observations by physician and nurse
  Diagnostic test, sonographic criteria: No cardiac activity and CRL < 4 mm; no cardiac activity and CRL < 5 mm; no cardiac activity and CRL > 5 mm; no cardiac activity and CRL > 6 mm
  Reference standard for pregnancy success or loss: All women followed up until delivery or completion of failed pregnancy
  Outcome measures: Subsequent miscarriage or delivery of healthy newborn

Tongsong et al.13 (1994)
  Study type & quality: Cohort study, consecutive, prospective, no blinding of test, no blinding of reference standard, follow-up adequate, test sufficiently described, reference standard described
  Pop. (n): 211
  Inclusion criteria: Symptomatic for threatened miscarriage
  Diagnostic test, machine & operator: Machine: real-time Aloka 650 with 5-MHz transvaginal probe. Operator: single sonographer
  Diagnostic test, sonographic criteria: Empty GS MSD ≥ 17 mm; no YS and MSD ≥ 13 mm
  Reference standard for pregnancy success or loss: Clinical outcome considered normal if living fetus visualized on subsequent US or if clinical records confirmed normal pregnancy progression; GS considered non-viable if subsequent US demonstrated absent growth or spontaneous miscarriage
  Outcome measures: Miscarriage or live fetus in subsequent US or in medical records

Study quality

Six of the eight (75%) primary articles were prospective cohort studies which recruited consecutive women (Figure 2). In no study was the operator blinded to the test and in only one study was the operator blinded to ascertainment of reference standard. The test was described in sufficient detail in three of the eight (37%) primary studies. The follow-up was adequate in all studies.

Figure 2.

Quality of primary studies (n = 8) included in this review evaluating accuracy of ultrasound examination in diagnosing early embryonic demise.

Accuracy of ultrasound

The sensitivity of the included studies for early pregnancy demise ranged from 14% to 100% (Table 2). The highest sensitivity (1.00; 95% CI, 0.54–1.00) was observed with the sonographic criterion of absent embryo or cardiac activity in a gestational sac with a mean diameter ≥ 16 mm15.

Table 2. Accuracy of early pregnancy sonographic features in diagnosing early embryonic demise

CRL, crown–rump length; FPR, false-positive rate; GS, gestational sac; LR+, positive likelihood ratio; LR−, negative likelihood ratio; MSD, mean gestational sac diameter.

| Sonographic criteria | Study (year) | Sensitivity (95% CI) | Specificity (95% CI) | LR+ (95% CI) | LR− (95% CI) | FPR |
|---|---|---|---|---|---|---|
| No cardiac activity | | | | | | |
| With CRL < 4 mm | Goldstein17 (1992) | 0.50 (0.01–0.99) | 0.59 (0.41–0.75) | 1.21 (0.28–5.14) | 0.85 (0.21–3.50) | 0.41 |
| With CRL < 4 mm | Levi et al.16 (1990) | 0.64 (0.41–0.83) | 0.82 (0.63–0.94) | 3.56 (1.52–8.38) | 0.44 (0.25–0.79) | 0.18 |
| With CRL < 5 mm | Goldstein17 (1992) | 0.50 (0.12–0.88) | 0.65 (0.48–0.79) | 1.43 (0.58–3.53) | 0.77 (0.34–1.77) | 0.35 |
| With CRL < 5 mm | Levi et al.16 (1990) | 0.65 (0.45–0.81) | 0.88 (0.73–0.96) | 5.16 (2.19–12.20) | 0.41 (0.25–0.66) | 0.12 |
| With CRL > 5 mm | Goldstein17 (1992) | 0.50 (0.12–0.88) | 1.00 (0.90–1.00) | 36.00 (2.08–622.64) | 0.51 (0.24–1.07) | 0 |
| With CRL > 6 mm | Goldstein17 (1992) | 0.50 (0.07–0.93) | 1.00 (0.87–1.00) | 27.00 (1.51–482.2) | 0.51 (0.21–1.23) | 0 |
| Empty gestational sac (GS) | | | | | | |
| Diameter > 26 mm | Scott et al.11 (1987) | 0.46 (0.35–0.56) | 1.00 (0.69–1.00) | 10.05 (0.66–152.18) | 0.57 (0.45–0.71) | 0 |
| MSD ≥ 25 mm | Nyberg et al.14 (1987) | 0.20 (0.08–0.39) | 1.00 (0.93–1.00) | 22.65 (1.32–388.50) | 0.80 (0.66–0.96) | 0 |
| MSD ≥ 25 mm | Nyberg et al.10 (1986) | 0.29 (0.19–0.40) | 1.00 (0.96–1.00) | 50.17 (3.10–811.70) | 0.71 (0.62–0.82) | 0 |
| Largest diameter > 20 mm | Ismail and Kishk12 (1991) | 0.81 (0.70–0.89) | 0.57 (0.29–0.82) | 1.88 (1.02–3.48) | 0.34 (0.18–0.65) | 0.43 |
| MSD ≥ 17 mm | Tongsong et al.13 (1994) | 0.50 (0.43–0.58) | 1.00 (0.88–1.00) | 31.17 (1.99–489.13) | 0.51 (0.43–0.59) | 0 |
| MSD ≥ 16 mm | Levi et al.15 (1988) | 0.50 (0.12–0.88) | 1.00 (0.88–1.00) | 30.00 (1.74–516.92) | 0.51 (0.24–1.07) | 0 |
| Absent yolk sac | | | | | | |
| With MSD ≥ 20 mm | Nyberg et al.14 (1987) | 0.14 (0.04–0.33) | 1.00 (0.93–1.00) | 16.76 (0.93–300.55) | 0.85 (0.73–1.00) | 0 |
| With MSD ≥ 20 mm | Nyberg et al.10 (1986) | 0.41 (0.30–0.52) | 1.00 (0.96–1.00) | 70.64 (4.40–1133.7) | 0.59 (0.50–0.71) | 0 |
| With MSD ≥ 13 mm | Tongsong et al.13 (1994) | 0.96 (0.92–0.99) | 1.00 (0.69–1.00) | 21.12 (1.41–316.93) | 0.04 (0.02–0.10) | 0 |
| With MSD ≥ 8 mm | Levi et al.15 (1988) | 0.67 (0.38–0.88) | 1.00 (0.92–1.00) | 59.06 (3.67–951.17) | 0.35 (0.18–0.69) | 0 |
| Combined criteria | | | | | | |
| No cardiac activity in GS MSD ≥ 16 mm | Levi et al.15 (1988) | 1.00 (0.54–1.00) | 1.00 (0.88–1.00) | 55.71 (3.54–877.02) | 0.07 (0.01–1.05) | 0 |
| No embryo or cardiac activity in GS MSD ≥ 16 mm | Levi et al.15 (1988) | 1.00 (0.54–1.00) | 1.00 (0.88–1.00) | 55.71 (3.54–877.02) | 0.07 (0.01–1.05) | 0 |

Fourteen of the 18 evaluations had specificity > 90% in diagnosing early pregnancy demise (Figure 3, Table 2). The lower limit of the 95% CI for specificity was > 0.95 for only two sonographic criteria: an empty gestational sac with mean diameter ≥ 25 mm (specificity, 1.00; 95% CI, 0.96–1.00) and absent yolk sac with mean gestational sac diameter ≥ 20 mm (specificity, 1.00; 95% CI, 0.96–1.00).

Figure 3.

Sensitivity and specificity of first-trimester ultrasound criteria in diagnosing early embryonic demise. CRL, crown–rump length; MSD, mean sac diameter.

The positive LR for the sonographic diagnosis of early embryonic demise was > 10 in 13/18 (72.2%) of the evaluations and the negative LR was < 0.1 in 3/18 (16.7%). Table 2 provides the accuracy estimates (sensitivity, specificity, positive and negative LRs and false-positive rates) for the various cut-off levels of the ultrasound features.
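The two proportions quoted above can be checked directly against the point estimates in Table 2. The short Python sketch below simply transcribes the 18 LR+ and LR− values from that table, in row order, and counts how many cross the stated thresholds; the variable names are illustrative only.

```python
# Point estimates transcribed from Table 2, in row order (18 evaluations).
lr_pos = [1.21, 3.56, 1.43, 5.16, 36.00, 27.00,          # no cardiac activity
          10.05, 22.65, 50.17, 1.88, 31.17, 30.00,       # empty gestational sac
          16.76, 70.64, 21.12, 59.06,                    # absent yolk sac
          55.71, 55.71]                                  # combined criteria
lr_neg = [0.85, 0.44, 0.77, 0.41, 0.51, 0.51,
          0.57, 0.80, 0.71, 0.34, 0.51, 0.51,
          0.85, 0.59, 0.04, 0.35,
          0.07, 0.07]

n_evaluations = len(lr_pos)
n_lr_pos_gt_10 = sum(value > 10 for value in lr_pos)
n_lr_neg_lt_01 = sum(value < 0.1 for value in lr_neg)

print(f"LR+ > 10:  {n_lr_pos_gt_10}/{n_evaluations} "
      f"({100 * n_lr_pos_gt_10 / n_evaluations:.1f}%)")
print(f"LR- < 0.1: {n_lr_neg_lt_01}/{n_evaluations} "
      f"({100 * n_lr_neg_lt_01 / n_evaluations:.1f}%)")
```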

Discussion

Various ultrasound criteria are used for the diagnosis of early embryonic demise, and these generally have relatively high specificity but poor sensitivity. The ultrasound features of an empty gestational sac with mean diameter ≥ 25 mm and absent yolk sac with mean gestational sac diameter ≥ 20 mm were the thresholds with the highest and most precise estimates of specificity for diagnosing early embryonic demise.

Specificity versus sensitivity

Most pregnancy screening tests, such as Down syndrome or gestational diabetes screening, strive for optimal sensitivity whilst tolerating a low false-positive rate. With threatened early pregnancy loss, it is imperative to have a highly specific test with a zero false-positive rate, as the diagnosis of early embryonic demise leads to evacuation of the uterus. While it would be ideal to have both a highly sensitive and highly specific test for early pregnancy loss, it is critical to realize that a false-positive diagnosis of early embryonic demise is likely to result in inadvertent termination of pregnancy. Positive results from highly specific tests would rule in a diagnosis of early pregnancy demise (Specific, Positive, In = SpPIn)18. The application of this rule and test performance in the diagnosis of early pregnancy demise will be affected by the following: spectrum bias of different stages of pregnancy demise, differential verification bias due to varied application of different reference standards for confirmation of the diagnosis and the small number of included studies18. The power to rule in a diagnosis of early pregnancy demise will depend on both specificity and sensitivity.
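How a positive result shifts the probability of early embryonic demise can be illustrated with the standard odds form of Bayes' theorem, in which post-test odds equal pre-test odds multiplied by the likelihood ratio. The Python sketch below is an illustration only: the function name is hypothetical and the pre-test probability of 0.5 is an assumed value, not a figure from the included studies. The two LR+ values are point estimates taken from Table 2.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Convert a pre-test probability into a post-test probability via odds."""
    if not 0 < pre_test_prob < 1:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumed pre-test probability of demise (hypothetical, for illustration).
pre_test = 0.5

# LR+ point estimates from Table 2.
high_lr = post_test_probability(pre_test, 50.17)  # empty GS, MSD >= 25 mm (ref. 10)
low_lr = post_test_probability(pre_test, 1.88)    # empty GS, largest diameter > 20 mm (ref. 12)

print(f"LR+ 50.17: post-test probability {high_lr:.2f}")
print(f"LR+ 1.88:  post-test probability {low_lr:.2f}")
```

Under this assumed pre-test probability, a positive result with an LR+ of about 50 raises the probability of demise to roughly 0.98, whereas an LR+ below 2 raises it only to about 0.65, which is why a low-specificity criterion cannot safely rule in the diagnosis.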

Ultrasound criteria for diagnosis of early pregnancy demise

This review has identified a number of studies and varied criteria for the identification of inevitable early pregnancy demise. An empty gestational sac of ≥ 25 mm diameter and an absent yolk sac with a gestational sac diameter of ≥ 20 mm appear to be the most accurate thresholds for the diagnosis of early embryonic demise, with an estimated specificity of 1.00. However, it should be noted that both thresholds had a 95% CI of 0.96–1.00, so a false-positive rate of up to 4% cannot be excluded; that is, as many as four in every 100 viable pregnancies could be wrongly diagnosed as early embryonic demise. Although other criteria had equally high reported specificities, all of those studies involved very few patients.
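The dependence of the lower confidence limit on the number of patients can be illustrated with a short calculation. The sketch below assumes an exact (Clopper–Pearson) binomial interval, which the review does not state was the method used. Under that assumption, when no false positives are observed among n women with viable pregnancies, the lower 95% limit for specificity reduces to 0.025^(1/n). The function names and sample sizes are illustrative only.

```python
import math


def lower_limit_zero_false_positives(n_viable, alpha=0.05):
    """Exact lower confidence limit for specificity when 0 of n are false positive.

    For x = n successes out of n, the Clopper-Pearson lower limit
    simplifies to (alpha / 2) ** (1 / n).
    """
    if n_viable < 1:
        raise ValueError("n_viable must be at least 1")
    return (alpha / 2) ** (1 / n_viable)


def min_viable_for_lower_limit(target, alpha=0.05):
    """Smallest n (with zero false positives) whose lower limit reaches target."""
    if not 0 < target < 1:
        raise ValueError("target must lie strictly between 0 and 1")
    return math.ceil(math.log(alpha / 2) / math.log(target))


for n in (10, 29, 90):  # illustrative sample sizes
    print(n, round(lower_limit_zero_false_positives(n), 2))

print(min_viable_for_lower_limit(0.95))
```

Under this assumption, 10 viable pregnancies with no false-positive result give a lower limit of only 0.69 and 29 give 0.88, whereas about 90 are needed for a lower limit of 0.96, and at least 72 for the lower limit to exceed 0.95.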

There was a fair amount of inconsistency among the reported specificities. For instance, empty gestational sacs ≥ 16 mm and ≥ 17 mm were reported to have a high specificity in two studies13, 15, but the use of a more conservative threshold (> 20 mm) gave very poor specificity for the diagnosis of early embryonic demise12. None of the studies evaluated the reproducibility and repeatability of these early pregnancy measurements. Furthermore, only half had access to an endovaginal probe. These two limitations are relevant given the very small measurements on which the diagnosis of early pregnancy demise rests. In clinical practice, operator error is common and it is not unusual for small embryos to be missed by inexperienced examiners. The reviewed studies did not state whether a policy of two independent examiners confirming the diagnosis was followed routinely.

Reference standards

The main problem with these studies is the use of varied reference standards. The only conclusive criterion to diagnose miscarriage is documentation of spontaneous expulsion of histologically confirmed pregnancy tissue or the finding of retained products in the uterus in a woman with previous evidence of intrauterine gestational sac on ultrasound examination. The patient follow-up in such studies needs to be of sufficient length to allow a conclusive diagnosis of early pregnancy demise to be made. This will depend on many factors, such as presumed gestational age, growth of gestational sac, presence of intragestational sac structures and presence of an embryo. It is possible that some abnormal pregnancies will develop much more slowly than do normal pregnancies and that the heartbeat will eventually appear at a much later gestation than normal. Finally, a biochemical threshold, such as a decline in serum hCG on follow-up measurements, may be a powerful predictor of miscarriage, but does not preclude laboratory error or a physiological drop in hCG late in the first trimester. Most studies did not use rigorous standards for the diagnosis of early pregnancy demise, relying rather on medical chart reviews or subsequent ultrasound thresholds to ‘confirm’ early pregnancy loss, thereby potentially biasing the results.

Conclusion

This review is the first to comprehensively collate evidence of the role of ultrasound in diagnosing early embryonic demise. The review is strengthened by its broad search strategy without language restrictions and assessment of quality of the included studies. The findings were limited by the small number and poor quality of the studies, small number of patients evaluated and heterogeneity in the tests and outcome assessment. Most studies were also conducted some two decades ago and changes in ultrasound technology and the introduction of transvaginal probes would have affected the accuracy and generalizability of the findings. An appropriately powered study using current ultrasound technology, a transvaginal approach and the appropriate reference standard for pregnancy success or loss is urgently required before setting future standards for the accurate diagnosis of early embryonic demise. In order for these studies to be successful, however, a consensus about an appropriate methodological approach should be reached before embarking on further projects.
