In the last decade, ultrasound has become accepted as the method of choice for the initial assessment of women with suspected early pregnancy complications. With the liberal use and easy access to ultrasound in early pregnancy, missed miscarriage has become the most common diagnosis in women suffering early pregnancy loss1, 2. Despite its frequent occurrence, there is no general consensus on the most reliable criteria to diagnose missed miscarriage. In cases with a visible embryonic pole, the absence of cardiac activity enables a conclusive diagnosis to be reached at the initial visit3. However, when the embryo cannot be seen on the ultrasound scan, the differential diagnosis includes normal early intrauterine pregnancy less than 6 weeks' gestation and early fetal demise, sometimes referred to as anembryonic pregnancy or empty sac. Erroneous diagnoses of early fetal demise can occur in this situation, which may lead to termination of a wanted healthy pregnancy. Reports of such diagnostic errors prompted a national enquiry in the UK a few years ago, which was led by the Board of the Faculty of Clinical Radiology of the Royal College of Radiologists and the Council of the Royal College of Obstetricians and Gynaecologists4 (RCOG). The diagnostic guidelines, which were issued as a result, state that no ultrasound diagnosis of early fetal demise should be made at the initial visit. At least one additional examination after a minimum of 7 days has to be performed in order to establish the final diagnosis. However, in cases of an empty sac < 15 mm in diameter, the guidelines require that a follow-up scan is organized 2 weeks later. Although this policy minimizes the risk of diagnostic errors, the need for one or two weekly follow-up visits may cause prolonged anxiety for the woman and her partner and lead to repeated hospital visits, which increase the costs and workload in early pregnancy units.
The aim of this study was to establish whether by combining clinical information, ultrasound findings and serum biochemistry, the differentiation between early viable pregnancies and early embryonic demise could be improved to allow a conclusive diagnosis to be reached at the initial visit.
This was a prospective observational study of 208 consecutive women with an ultrasound finding of a gestational sac < 20 mm mean sac diameter, which did not contain an embryo. This cut-off was chosen in line with the Royal College of Radiologists and the RCOG guidelines, which state that the diagnosis of early embryonic demise can be made when an embryo is not visible in a gestational sac measuring > 20 mm in size4. All women were referred for assessment by their general practitioners or hospital consultants. The indications for referral were: suspected early pregnancy complications 163 (78%), dating scan 35 (17%) and past history of ectopic pregnancy 10 (5%). The additional inclusion criteria were: spontaneous conception, single intrauterine gestational sac and no history of exogenous progestogen use in current pregnancy. The study protocol was approved by the Research Ethics Committee for King's College Hospital, London, UK, and all women gave their informed consent.
Our dedicated Early Pregnancy Assessment Unit serves a racially mixed, inner city population with a high level of socio-economic deprivation. All women had a positive urine pregnancy test (Clearview HCG II™, Unipath, Bedford, UK). The test is a monoclonal-based antibody test which, according to the manufacturer's specifications, has a sensitivity of 99% at a urine beta-human chorionic gonadotropin (β-hCG) level greater than 25 IU/L.
A full history was documented and clinical examination carried out by the attending physician. A transvaginal ultrasound scan was then performed using a 5-MHz probe (Aloka SSD-5000, Aloka Co. Ltd, Tokyo, Japan). All women had a blood sample taken to measure serum β-hCG (β-hCG, World Health Organization, Third International Reference 75/537) and progesterone levels using an automated radioimmunoassay technique (Immuno 1™, Bayer Diagnostics, Basingstoke, UK). The results of these investigations were available within 2 to 6 h.
All women were invited for a follow-up scan in 1 to 2 weeks. A diagnosis of miscarriage was made if the gestational sac did not increase in size on follow-up scans or if the embryo failed to develop. Miscarriage was also diagnosed in women with a history of bleeding when the previously detected gestational sac was no longer visible on the subsequent scan. A diagnosis of viable intrauterine pregnancy was made only when an embryo with clearly visible cardiac activity was seen on subsequent scans.
A database was established and the data recorded included maternal age, date of last menstrual period, the presence or absence of vaginal bleeding (expressed as bleeding score 1 or 0), mean gestational sac diameter (calculated from measurements taken in three orthogonal planes) and the serum levels of progesterone and β-hCG. All statistical analyses were carried out using SPSS Version 10 (SPSS Inc., Chicago, IL, USA). The outcomes were dichotomized into viable and non-viable pregnancy categories. Comparison of means of continuous variables was performed using Mann–Whitney U-tests or Student's t-tests depending on data distribution. Proportions were compared using the Yates-corrected χ2 test. A value of P < 0.05 was considered statistically significant.
Multivariate logistic regression analysis was performed with pregnancy viability as the dependent variable. Six independent variables were used for model construction: maternal age, gestational age, pregnancy sac diameter, serum progesterone level, serum β-hCG level, and presence or absence of bleeding (coded 1 and 0, respectively). All variables except the last one were continuous. The objective of the model-building process was to obtain a ‘good fit’ for the training data with the fewest independent variables. The regression equation was derived by forward stepwise selection of variables, using the likelihood ratio test to determine which variables to include in the model (the thresholds for inclusion and exclusion were P < 0.05 and P > 0.10, respectively). Using these criteria, three variables were found to be independent with statistically significant coefficients. However, one of these variables (serum progesterone level) was found not to conform to a linear gradient after inspection of interactions using the Box–Tidwell transformation. Further analysis revealed that conformity to the linear gradient could be achieved by using the natural logarithm of the progesterone level in the model-building process instead of the progesterone level itself. Interactions between each of the three independent variables were sought and found to be absent. The goodness of fit of the model was tested using the Hosmer and Lemeshow test. A non-significant P-value (0.978) suggested a favorable goodness of fit.
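The goodness-of-fit check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure used in the study: the grouping into ten equal-sized risk groups is assumed (the text does not say how cases were grouped), and the function names `hosmer_lemeshow_statistic` and `chi_square_upper_tail_even_df` are hypothetical.

```python
import math


def hosmer_lemeshow_statistic(probabilities, outcomes, groups=10):
    """Return (statistic, degrees_of_freedom) for equal-sized risk groups.

    probabilities: fitted probabilities of viability from the model.
    outcomes: 1 for a viable pregnancy, 0 for a non-viable one.
    The number of groups (10) is an assumption, not taken from the paper.
    """
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        if size == 0:
            continue
        observed = sum(y for _, y in chunk)   # observed viable cases
        expected = sum(p for p, _ in chunk)   # expected viable cases
        variance = expected * (1.0 - expected / size)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic, groups - 2


def chi_square_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form used here needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

With ten groups the statistic is compared against a chi-square distribution with 8 degrees of freedom. A small statistic gives a large P-value, which is the direction of the result reported in the text (a non-significant P suggests no evidence of poor fit).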
Substitution into the regression model with actual values for each case allowed calculation of the probability of viability for each individual. Receiver-operating characteristics (ROC) curves were then constructed to describe the relationship between the sensitivity and false-positive rate for different values of these probabilities and also for the raw value of serum progesterone. The ideal cut-off for predicting viability was derived from the ROC curve.
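How a ROC curve and a cut-off are obtained from per-case probabilities can be illustrated with a short Python sketch. It is a generic construction, not the software routine used in the study. The function names are hypothetical, and the rule "highest cut-off that keeps sensitivity at 100%" is only one possible reading of an ideal cut-off, chosen here because missing a viable pregnancy is the error to avoid.

```python
def roc_points(scores, is_viable):
    """Return (false_positive_rate, sensitivity, cutoff) per distinct cutoff.

    A case is called viable when its score is >= cutoff. Points are ordered
    from the strictest cutoff to the most lenient one.
    """
    n_pos = sum(1 for v in is_viable if v)
    n_neg = len(is_viable) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one viable and one non-viable case")
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, v in zip(scores, is_viable) if v and s >= cutoff)
        fp = sum(1 for s, v in zip(scores, is_viable) if not v and s >= cutoff)
        points.append((fp / n_neg, tp / n_pos, cutoff))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule, starting at (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for fpr, tpr, _ in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


def highest_cutoff_with_full_sensitivity(points):
    """Largest cutoff at which every viable case is still called viable."""
    return max(cutoff for _, tpr, cutoff in points if tpr == 1.0)


# Made-up scores for six cases, for illustration only (not study data).
example_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.05]
example_viable = [True, True, False, True, False, False]
example_points = roc_points(example_scores, example_viable)
print(auc_trapezoid(example_points))
print(highest_cutoff_with_full_sensitivity(example_points))
```

The same functions work for the raw serum progesterone value in place of the model probability, since only the ordering of the scores matters for the curve.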
Two hundred and eight women with an empty intrauterine gestational sac < 20 mm in size were identified on ultrasound scan. Data sets were incomplete in eight cases, which were excluded from further analysis. Of the remaining 200 women, 118 (59%) had a normal intrauterine pregnancy and 82 (41%) had a miscarriage on follow-up scans. The average length of follow-up was 14 (range, 1–33) days until the diagnosis was reached. In women with the final diagnosis of miscarriage at follow-up visits, 23 women (28%) had a spontaneous complete miscarriage, 23 (28%) had an incomplete miscarriage and 36 (44%) had a missed miscarriage.
There were significant differences in maternal age, gestational age, incidence of bleeding, gestational sac size and serum progesterone levels between women with viable and non-viable pregnancies at the initial visit (Table 1). The regression equation was then derived using forward stepwise selection of variables. Maternal age, gestational sac diameter and serum progesterone were found to be independent with statistically significant coefficients, and were therefore included in the logistic regression model. The natural logarithm of the serum progesterone level was used to achieve conformity to a linear gradient. The probability of pregnancy being viable was then calculated using the formula:
probability of viability = 1 / (1 + e^(−z)),
where z = (6.091 × ln(progesterone)) − (0.159 × sac diameter) − (0.164 × maternal age) − 17.435.
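The calculation above is easy to script. The following Python sketch uses the coefficients of z exactly as reported and assumes the standard logistic function for converting z into a probability. The units (progesterone in nmol/L, sac diameter in mm, maternal age in years) are taken from Table 1 and are an assumption about how the model was fitted. The function name is hypothetical.

```python
import math

# Coefficients of z as reported in the text.
COEF_LN_PROGESTERONE = 6.091
COEF_SAC_DIAMETER = -0.159
COEF_MATERNAL_AGE = -0.164
INTERCEPT = -17.435


def viability_probability(progesterone_nmol_l, sac_diameter_mm, maternal_age_years):
    """Return the modelled probability (0 to 1) that the pregnancy is viable.

    Assumes the standard logistic link: p = 1 / (1 + exp(-z)).
    """
    if progesterone_nmol_l <= 0:
        raise ValueError("progesterone must be positive to take its logarithm")
    z = (COEF_LN_PROGESTERONE * math.log(progesterone_nmol_l)
         + COEF_SAC_DIAMETER * sac_diameter_mm
         + COEF_MATERNAL_AGE * maternal_age_years
         + INTERCEPT)
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative inputs only: central values from Table 1, not real patients.
p_viable_like = viability_probability(84, 6.8, 29.3)
p_nonviable_like = viability_probability(31, 10.7, 32.3)
print(f"{p_viable_like:.3f}  {p_nonviable_like:.3f}")
```

Because progesterone enters through its logarithm with a large positive coefficient, it dominates the result, while larger sac diameter and greater maternal age each lower the probability.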
Table 1. Comparison of measured variables in viable and non-viable pregnancies
| Variable | Viable pregnancies (n = 118) | Non-viable pregnancies (n = 82) | P |
|---|---|---|---|
| Maternal age (years)* | 29.3 (6.2) | 32.3 (7.4) | < 0.01 |
| Gestational age (days)* | 42.8 (9.8) | 59.8 (16.2) | < 0.01 |
| Vaginal bleeding (%)† | 34.7 | 76.8 | < 0.01 |
| Gestational sac diameter (mm)‡ | 6.8 (4.2–8.3) | 10.7 (6.0–15.8) | < 0.01 |
| β-hCG (IU/L)‡ | 3974 (1661–8638) | 3556 (1000–11083) | > 0.05 |
| Progesterone (nmol/L)‡ | 84 (62–109) | 31 (19–41) | < 0.01 |
With this model, at a cut-off value of 10% probability, the diagnosis of viable pregnancy was made with a sensitivity of 99.2% (95% CI, 95.8–99.97) and a specificity of 70.7% (95% CI, 61.3–78.9). A comparison of ROC curves showed that the logistic regression model performed significantly better than every individual parameter except serum progesterone (Figure 1, Table 2). To ensure that no viable pregnancy was wrongly classified as non-viable, the cut-off value of probability had to be decreased to 1%. At this level the sensitivity reached 100% (95% CI, 97.5–100.0), but the specificity fell to 43.9% (95% CI, 34.6–53.6). An almost identical result could be achieved by using serum progesterone alone at a cut-off level of 25 nmol/L, with which viable pregnancies could be diagnosed with a sensitivity of 100% (95% CI, 96.8–100) and a specificity of 40.2% (95% CI, 31.1–50.0).
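The percentages above can be related to case counts with a short Python sketch. The counts used here (117 of 118, 58 of 82, 36 of 82 and 33 of 82) are not stated in the text. They are back-calculated from the reported percentages and the group sizes, so they should be read as an assumption. The Wilson interval is likewise only one common choice: the text does not say which method produced the reported confidence intervals, so the limits printed here need not match them.

```python
import math


def proportion_with_wilson_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Counts back-calculated from the reported percentages (assumption).
scenarios = {
    "model, 10% cut-off: sensitivity": (117, 118),
    "model, 10% cut-off: specificity": (58, 82),
    "model, 1% cut-off: specificity": (36, 82),
    "progesterone 25 nmol/L: specificity": (33, 82),
}
for label, (hits, total) in scenarios.items():
    p, low, high = proportion_with_wilson_ci(hits, total)
    print(f"{label}: {100 * p:.1f}% ({100 * low:.1f} to {100 * high:.1f})")
```

Read this way, lowering the cut-off from 10% to 1% recovers the single viable pregnancy that would otherwise be missed, at the cost of roughly 22 fewer non-viable pregnancies being identified at the first visit.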
Figure 1. Receiver-operating characteristics (ROC) curves demonstrating the logistic regression model (----), serum progesterone (), gestational age (······), sac diameter (–··–··–), maternal age (––––) and serum human chorionic gonadotropin (hCG) (–·–·–) in their ability to predict pregnancy viability
Table 2. Comparisons of the diagnostic accuracy of the logistic regression model and individual diagnostic variables for the prediction of early pregnancy viability
| Variable | Area under the curve | Standard error | P |
|---|---|---|---|
| Gestational age | 0.83 | 0.0316 | < 0.01 |
| Gestational sac diameter | 0.7032 | 0.04 | < 0.01 |
| Maternal age | 0.6283 | 0.0408 | < 0.01 |
This study shows significant differences in demographic, clinical, biochemical and ultrasound parameters between early viable and non-viable pregnancies in women with an undetectable embryo on ultrasound scan. As expected, women who suffered miscarriage were older than those with normal pregnancies5. In addition, vaginal bleeding was more common and gestational age at the time of first visit was greater in non-viable pregnancies. However, almost half of all pregnancies complicated by bleeding proceeded normally, whilst the gestational age is notoriously difficult to ascertain in women with early pregnancy complications6. It is therefore not surprising that neither parameter was found to contribute significantly to the logistic regression model.
The gestational sac diameter, however, was found to be of diagnostic value. The measurement of the gestational sac is reproducible and its size correlates with the length of gestation7. Although a large empty gestational sac is one of the main criteria for the diagnosis of miscarriage, in the present study, which was limited to sacs < 20 mm in diameter, there was considerable overlap between viable and non-viable pregnancies. It was therefore impossible to establish a particular cut-off to discriminate reliably between normal pregnancies and miscarriages. Nevertheless, the size of the sac contributed significantly to the accuracy of the logistic model. In this study we did not evaluate the validity of using a sac diameter > 20 mm to diagnose miscarriage4. This cut-off level has been widely used and it is believed to include a high safety margin. However, ultrasound is an operator-dependent method and it is conceivable that an inexperienced operator may fail to detect an embryo in a relatively large sac because of poor examination technique. Previous studies have suggested that a mean diameter > 18 mm in a woman with an empty sac on the scan may be sufficient to diagnose miscarriage8. The present results do not support this suggestion, as two women with viable pregnancies presented initially with an empty sac measuring > 18 mm.
The measurement of serum hCG and progesterone has been used in the past in an attempt to discriminate between various pregnancy complications9, 10. Although there is some evidence that measuring serum hCG is helpful in the diagnosis of an ectopic pregnancy11, 12, most studies have concluded that it cannot discriminate between viable and non-viable intrauterine pregnancies13. This is not surprising bearing in mind the wide range of hCG levels recorded in normal pregnancy14 and the long half-life of hCG in blood following pregnancy demise. In addition, miscarriages may occur at any time during the first trimester, which further contributes to the variability of hCG measurements. Even in the present group of patients, who were defined by the small size of the gestational sac, hCG levels were not significantly different between viable and non-viable pregnancies.
Early studies on the use of progesterone in early pregnancy complications focused on its potential value in the diagnosis of ectopic pregnancy. However, the results were not encouraging15, 16. This may be explained by the wide spectrum of ectopic pregnancy presentations, which range from cases resembling healthy intrauterine pregnancies to spontaneous tubal miscarriages17. It has been suggested that in early pregnancy the production of progesterone by the corpus luteum is dependent on the slope of hCG increase in serum10. If this theory were correct, serum progesterone should be a measure of pregnancy viability, rather than an indication of its location. A number of recently published studies have shown that serum progesterone levels can be used to assess pregnancy viability18, 19. However, the studies differ significantly in their design, study populations and inclusion criteria. Most importantly, a number of studies included patients who conceived following stimulation of ovulation or who were receiving luteal support, both of which may affect serum progesterone levels. Only a few studies have assessed the value of progesterone in very early gestations. Riss et al.20 examined 71 women with a positive urine pregnancy test who were found to have an empty sac on ultrasound scan. The lowest recorded serum progesterone among the 23 women with normal pregnancies was 12 ng/mL (38.4 nmol/L). Using a threshold of > 15 ng/mL (48.0 nmol/L) they could diagnose normal pregnancy with a sensitivity of 87% and specificity of 83%. Hahlin et al.21 examined a group of women with very early pregnancies, which could not be detected on ultrasound scan. In 73 pregnancies which progressed normally the lowest progesterone level was 28.8 nmol/L. Previous studies of a similar patient population in our center2, 22 found that in 57 normal early pregnancies the lowest serum progesterone was 28 nmol/L.
These results are in agreement with the findings in the present study, which showed no viable pregnancies with progesterone < 25 nmol/L. The only viable pregnancy which presented with progesterone < 30 nmol/L was found to be chromosomally abnormal and the patient opted for termination of pregnancy. If the present results could be confirmed in a prospective study, low serum progesterone may be used to establish the conclusive diagnosis of an abnormal pregnancy in women with an empty sac. This would identify approximately 40% of non-viable pregnancies at the initial visit and thus significantly reduce the need for follow-up. However, caution is needed before this approach is adopted in clinical practice, as there are isolated reports in the literature15 of apparently normal pregnancies with progesterone levels < 15.9 nmol/L. Most of these cases come from retrospective studies, which included a wide range of gestations, and it is uncertain whether these results are applicable to the population of women with anembryonic sacs.
With the use of the logistic regression model, the diagnosis is based on the examination of multiple variables. Although serum progesterone is the single most powerful predictor of pregnancy outcome, the gestational sac diameter and maternal age both contribute to the accuracy of the logistic model. As a result the logistic model discriminates slightly better than does serum progesterone alone between normal and abnormal pregnancies. In addition, the model gives a numerical probability of a pregnancy being viable, which improves counseling of women. Medical professionals and patients alike are often under the impression that diagnosis of most medical conditions can be established with absolute accuracy, which is rarely the case. The use of logistic regression expresses the degree of diagnostic uncertainty and gives the opportunity to patients and physicians to decide at what level of probability the treatment for the presumed medical condition may be initiated.
The model has two limitations. First, when it is applied in clinical practice, the population under observation must be similar to that used in the present study. Second, the model is more complex to use than traditional tests and necessitates the use of an electronic calculator or computer. Computers are, however, becoming increasingly available in outpatient clinics and can easily be programmed to perform the necessary calculations. To allow for differences between ultrasound operators and biochemistry laboratories, the initial implementation of the model should be carefully audited. This would enable the necessary adjustments to be made to define the optimal cut-offs for each individual unit.