The ability to accurately determine the volumes of the gestational sac, yolk sac and the embryo enables the creation of gestational age-related centiles that may be used as the basis for predicting adverse pregnancy outcome. The use of three-dimensional (3D) ultrasonography has facilitated accurate volume estimation that has been confirmed in many organ systems, either in vitro or in vivo1, 2, and it has also been found to be superior to two-dimensional (2D) ultrasound volumetry3. However, some researchers still use incorrect methodology to estimate gestational age-specific reference intervals for embryo measurements4. It is recommended that data from each pregnancy are included once only, as cross-sectional data, in the development of reference intervals for fetal size5. This has not been the case with published reference intervals for first-trimester volumetric measurements6–8. Longitudinal studies may be used to produce reference intervals for fetal size and fetal growth9; however, appropriate methodology has not always been employed8. Figueras et al.10 stated that they used appropriate methodology in constructing centile curves for yolk sac volume (YSV) and gestational sac volume (GSV), but these centiles were not presented in their paper.
We therefore aimed to produce valid reference intervals for first-trimester GSV, YSV and embryo volume (EV) using 3D transvaginal ultrasonography. In addition, centiles of gestational sac diameter (GSD) and crown–rump length (CRL) were constructed for the same reference population. The centiles of CRL were compared with previously published centiles11, and the relationships between CRL and EV, and CRL and GSV were analyzed.
This cross-sectional observational study was performed in the Early Pregnancy Assessment Unit at St Mary's Hospital, London, UK. The local research ethics committee approved the study. Letters were sent to local general practitioners inviting women with a positive pregnancy test to participate in this study in the first trimester of pregnancy. The women were informed that the study involved an ultrasound examination for confirmation of their pregnancy and 3D ultrasound to record the GSV, YSV and EV. Inclusion criteria were: healthy women without any medical disorders, non-smokers, a singleton pregnancy, regular menses without hormonal contraception for at least three cycles before conception and a precise date of their last menstrual period or known date of embryo transfer in assisted reproduction. The gestational age was calculated by the modified Naegele's rule. Last menstrual period-derived gestational age was compared with ultrasound-derived gestational age using CRL12 and if there was a marked discrepancy of 2 weeks or more then the woman was excluded from the study. In addition, those women with uncertain dates or early pregnancy loss were also excluded from the study. Informed consent was obtained from all participants.
The study was performed with a 7.5-MHz transvaginal probe using a Combison 530D ultrasound machine (Kretztechnik AG, Zipf, Austria). Initially, using conventional 2D transvaginal ultrasound imaging, the CRL and GSD were each recorded. The CRL was recorded as an average of three measurements, each obtained from a separate printed image so that the examiner could not see the result of the previous measurement, thereby avoiding bias in subsequent measurements. The 2D measurements of the gestational sac included the maximum transverse diameter (D1) in the transverse plane and the maximum anteroposterior and longitudinal diameters (D2 and D3) in the sagittal plane. The average of the three measurements was recorded as the GSD.
The gestational sac was visualized again in the sagittal plane and the region of interest was selected using the volume box. The patient was then asked to hold her breath and, with the vaginal probe held stationary, the volume data were generated by the automatic rotation of the transducer crystal through 180° for 5–20 s. The scanned region was displayed on the screen in the three orthogonal planes (transverse, sagittal and coronal) after volume acquisition, and the examiner confirmed that the entire gestational sac was contained in the acquired volume scan. All scans were performed by a single examiner (J. S. B.). The volume scans were stored on 540 MB–1.3 GB Philips or Sony hard discs with an integrated magneto-optical drive for later measurement and analysis. Determination of GSV, YSV and EV was performed by sequentially viewing and tracing each structure in one of the three orthogonal planes at 1–2 mm intervals using the contour mode. During this procedure the gestational sac was magnified on the screen as much as possible to minimize measurement error. The computer software automatically calculated the volume from the measured circumferences and distances between them.
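The contour-based volume calculation described above can be illustrated with a minimal sketch. The source does not specify the algorithm used by the scanner software, so the approach here (shoelace formula for each traced contour, trapezoidal integration between slices) is an assumption, and the function names and square test contours are hypothetical.

```python
def polygon_area(points):
    """Shoelace formula: area enclosed by a closed contour given as (x, y) vertices in mm."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the contour
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def volume_from_slices(areas_mm2, positions_mm):
    """Trapezoidal integration of cross-sectional area along the slice axis (result in mm3)."""
    volume = 0.0
    for i in range(1, len(areas_mm2)):
        spacing = positions_mm[i] - positions_mm[i - 1]
        volume += 0.5 * (areas_mm2[i] + areas_mm2[i - 1]) * spacing
    return volume


def square(side):
    """Hypothetical test contour: a square of the given side length centred on the origin."""
    h = side / 2.0
    return [(-h, -h), (h, -h), (h, h), (-h, h)]


# Illustrative contours traced at 1-mm intervals (not study data).
sides = [0.0, 2.0, 4.0, 2.0, 0.0]
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
areas = [polygon_area(square(s)) for s in sides]
volume_mm3 = volume_from_slices(areas, positions)
```

Closer slice spacing reduces the integration error, which is consistent with tracing at 1–2 mm intervals on a magnified image.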
Intra- and interobserver reliability of measurement of GSV, YSV and EV was tested on a random selection of 15 pregnancies of between 6 and 12 weeks' gestation. The observers (J. S. B. and V. K.) each performed two measurements of GSV, YSV and EV on separate occasions using the stored volumes and were unaware of each other's results until completion of the study.
All the women were referred to the St Mary's antenatal clinic and the antenatal and labor data were recorded postdelivery.
Statistical analysis was performed using STATA version 9 (StataCorp, College Station, TX, USA). The gestational age-related reference intervals were obtained using the recommendations of Royston and Wright4. Least-squares regression analysis was used to determine the mean curves as polynomial functions of exact gestational age measured in days. A number of different models were explored for each measurement. The selection of the final model depended on the appearance of the curve and its goodness of fit, particularly at the tails of the distribution. Another important criterion was the simplicity of the model. As recommended by Royston and Wright4, a cubic polynomial was initially fitted to the data. If the cubic coefficient was not significantly different from zero, a quadratic polynomial was used and the quadratic coefficient assessed. This process was repeated until all the coefficients in the model were significantly different from zero. Very small coefficients that contributed little to the model were dropped in favor of simpler models. Fitted values from the most appropriate polynomial regression curve of the desired measurement were used to predict the mean for each gestational age. Similarly, in determining the curves for the SD, a polynomial or linear model was selected depending on the most appropriate fit for the scaled absolute residuals plotted against gestational age. The appearance of the model with its mean and SD curves was checked by examining the scatter patterns of points (SD scores) relative to ± 1.645. The normality of the SD score was assessed using the Shapiro–Wilk W-test and a normal plot. Once the final model had been determined, the 5th, 50th and 95th centiles were calculated by substituting the expressions for the mean and SD into the equation:
centile = mean + K × SD,
where K = −1.645, 0 and +1.645 for the 5th, 50th and 95th centiles, respectively.
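As a rough illustration of the modeling steps described above, the sketch below scales absolute residuals by √(π/2) (the usual scaling under which their mean estimates the SD for normally distributed residuals), fits a straight line by least squares, and evaluates centiles assuming the usual form centile = mean + K × SD. All coefficients are hypothetical placeholders, not the fitted values from this study, and the function names are ours.

```python
import math

K_VALUES = {5: -1.645, 50: 0.0, 95: 1.645}


def scaled_abs_residuals(observed, fitted):
    """Absolute residuals multiplied by sqrt(pi/2), used as the response when modeling the SD."""
    scale = math.sqrt(math.pi / 2.0)
    return [abs(o - f) * scale for o, f in zip(observed, fitted)]


def fit_line(x, y):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def polyval(coeffs, x):
    """Evaluate a polynomial whose coefficients are ordered from the constant term upward."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def centiles(ga_days, mean_coeffs, sd_coeffs):
    """5th, 50th and 95th centiles at a given gestational age in days."""
    mean = polyval(mean_coeffs, ga_days)
    sd = polyval(sd_coeffs, ga_days)
    return {p: mean + k * sd for p, k in K_VALUES.items()}


# Hypothetical coefficients for illustration only (NOT from the paper).
MEAN_COEFFS = [-30.0, 1.0]  # mean = -30 + 1.0 * GA
SD_COEFFS = [1.0, 0.02]     # SD = 1 + 0.02 * GA
reference = centiles(70, MEAN_COEFFS, SD_COEFFS)
```

In practice the SD coefficients would come from `fit_line` applied to the gestational ages and the scaled absolute residuals of the mean model.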
Tables were then prepared for the 5th, 50th and 95th centiles and a scatter plot with reference intervals was generated. Where negative centile values were obtained for early gestational age, the model-fitting process was repeated using a cubic polynomial of a logarithmic transformation of the measurement. If none of the models met the required criteria, the same process was repeated using a modified (shifted) logarithmic transformation of the measurement, log(x + m). The Shapiro–Wilk W-test was used to determine the most appropriate value of m. When all refinements of the model failed to produce centiles meeting the required criteria, quantile regression was used to determine more appropriate centiles13. The quantile regression model was assessed using three criteria13: that there were no negative values; that approximately 10% of observed values lay above the 90th centile and approximately 10% below the 10th centile; and that these values were scattered randomly across the gestational age range. The quantile regression equations were used to calculate the 5th, 50th and 95th centiles and a scatter plot with reference intervals was constructed.
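Two of the checks described above lend themselves to a short sketch: back-transforming a centile from a shifted log scale, and counting the proportion of observations outside the 10th and 90th centiles. The natural logarithm is assumed here because the source does not state the base; the function names and example numbers are illustrative only.

```python
import math


def back_transform(log_value, shift):
    """Invert y = log(x + shift), natural log assumed: x = exp(y) - shift."""
    return math.exp(log_value) - shift


def tail_fractions(observed, lower_centile, upper_centile):
    """Fractions of observations below the lower and above the upper centile curve."""
    n = len(observed)
    below = sum(1 for o, lo in zip(observed, lower_centile) if o < lo)
    above = sum(1 for o, up in zip(observed, upper_centile) if o > up)
    return below / n, above / n


# Any centile on the transformed scale that falls below log(shift)
# back-transforms to a negative value. Illustrative numbers only.
negative_centile = back_transform(math.log(8.4), 9.0)

# Toy data: 10 observations against constant 10th/90th centile lines.
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
frac_below, frac_above = tail_fractions(obs, [1.5] * 10, [9.5] * 10)
```

The first result shows why a shifted log model can still yield negative centiles at early gestational ages, which is the situation in which the text falls back on quantile regression.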
Intraobserver variation was calculated as the difference between the first and second measurements by one observer. Interobserver variation was calculated using the mean measurement for each observer. The mean difference and SD are reported. The intra- and interobserver variation were also expressed as the intraclass correlation coefficient (ICC). A random effects model was used to estimate the ICC for consistency. Inter- and intraobserver agreement were assessed following the methods described by Bland and Altman14.
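A minimal sketch of the agreement statistics named above is given below. It computes the mean difference, its SD and 95% limits of agreement in the Bland–Altman style, plus a single-measure ICC for consistency from two-way ANOVA mean squares. The exact ICC variant used in the study is not stated, so this form is an assumption, and the paired values are invented for illustration.

```python
import statistics


def bland_altman(first, second):
    """Mean difference, SD of differences and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    return mean_diff, sd_diff, limits


def icc_consistency(first, second):
    """Single-measure ICC for consistency with two sets of measurements."""
    n, k = len(first), 2
    grand = (sum(first) + sum(second)) / (n * k)
    subject_means = [(a + b) / k for a, b in zip(first, second)]
    rater_means = [sum(first) / n, sum(second) / n]
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((v - grand) ** 2 for v in list(first) + list(second))
    ss_error = ss_total - ss_subjects - ss_raters
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Invented paired measurements: the second set is offset by a constant 0.5.
observer_a = [1.0, 2.0, 3.0, 4.0]
observer_b = [1.5, 2.5, 3.5, 4.5]
mean_diff, sd_diff, limits = bland_altman(observer_a, observer_b)
icc = icc_consistency(observer_a, observer_b)
```

A constant offset between observers leaves the consistency ICC at 1 while the Bland–Altman mean difference is non-zero, which is why the two statistics are reported together.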
One hundred and seventy-five healthy pregnant women at between 6 and 12 weeks' gestation responded to our letter of invitation for a first-trimester ultrasound scan at the Early Pregnancy Assessment Unit. Nine women were excluded from the study following their ultrasound examination: two had an anembryonic pregnancy, two had early embryonic demise and five women had incorrect dates. The menstrual dates (ultrasound CRL dates in brackets) of these five women were: 7 + 1 (9 + 0), 8 + 4 (12 + 1), 11 + 1 (14 + 1), 11 + 2 (13 + 2) and 12 + 2 (14 + 5) weeks.
One hundred and sixty-six women at between 6 and 12 weeks' gestation who met the eligibility criteria were enrolled and completed the study. None of these women sustained a miscarriage or stillbirth and no infants had any congenital abnormalities. The mean age (± SD) of the women was 29.4 (± 5) years, mean gestational age at delivery was 39.3 (± 1.4) weeks and the mean birth weight was 3.3 (± 0.4) kg. Ninety-nine women (59.6%) were nulliparous and 67 women (40.4%) were parous (range, para 1 to para 4).
Details of the reference equations derived are given below. Medians and 5th and 95th centiles for each measurement by weeks of gestational age are shown in Table 1. Scatter plots of each measurement against gestational age, with the modeled centiles, are presented in Figure 1.
Figure 1. Scatter plots with 5th, 50th and 95th centiles of crown–rump length (CRL) (a), gestational sac diameter (GSD) (b), gestational sac volume (GSV) (c), yolk sac volume (YSV) (d) and embryo volume (EV) (e) against gestational age.
Table 1. Calculated 5th, 50th and 95th centiles for crown–rump length (CRL), gestational sac diameter (GSD), gestational sac volume (GSV), yolk sac volume (YSV) and embryo volume (EV) according to gestational age (GA)
| ||CRL (mm)||GSD (mm)||GSV (mm3)||YSV (mm3)||EV (mm3)|
Crown–rump length
A least squares cubic model was shown to be the best model for the reference intervals of CRL in relation to gestational age (GA):
For the SD, a linear model showed the most appropriate fit:
A normal probability plot of the Z-scores showed the scores lying close to a straight line. The Shapiro–Wilk W-test was not significant (P = 0.06), thus the assumption of normality could not be rejected. In addition, the Z-scores were randomly scattered around zero. The numbers observed above the 90th centile, 13 (7.8%), and below the 10th centile, 15 (9.0%), were close to the expected value of 10%.
Centiles by days of gestation are provided in Table S1. Figure 2 and Table S2 present a comparison of our reference curve for median CRL with that published by Robinson and Fleming11. The mean difference in CRL across the gestational age range studied was −0.14 mm (range, −1.24 to 3.84 mm).
Figure 2. Comparison of the reference curve for median crown–rump length (CRL) against gestational age obtained in the present study with that published by Robinson and Fleming11.
Gestational sac diameter
A linear model provided the best fit to the GSD data in relation to GA:
For the SD, a linear model also showed the most appropriate fit:
A normal probability plot of the Z-scores showed the scores lying close to a straight line. The Shapiro–Wilk W-test was not significant (P = 0.05), thus the assumption of normality could not be rejected. In addition, the Z-scores were randomly scattered around zero. In this case 13 points (7.8%) were below the 10th centile and 15 points (9.0%) were above the 90th centile.
Gestational sac volume
The best model for GSV was a modified logarithmic transformation of the form log (GSV + 9). A linear function provided a good fit to the transformed values:
When the predicted values were back-transformed to calculate the centiles, we obtained a negative value of −0.6 for the 5th centile at 6 weeks. Quantile regression was then used to estimate more appropriate centiles. The quantile regression models that best fit the data are:
The model fitted satisfactorily as 17 observations (10.2%) fell below the 10th centile and 18 (10.8%) fell above the 90th centile. The data points lying outside the reference interval were spread throughout the range.
Yolk sac volume
Measurements were available in 145 of the 166 pregnancies. A modified log transformation, log (YSV + 0.05), was used to model the data. A quadratic model provided the best fit:
For the SD, a linear model showed the most appropriate fit:
A normal probability plot of the Z-scores showed them lying close to a straight line. The Shapiro–Wilk W-test was not significant (P = 0.14), thus the assumption of normality could not be rejected. In addition, the Z-scores were randomly scattered around zero. The model fitted satisfactorily as 18 observations (12.4%) fell below the 10th centile and 15 (10.3%) lay above the 90th centile. The data points lying outside the reference interval were spread throughout the range.
Embryo volume
The best model for EV was a modified logarithmic transformation of the form log (EV + 0.15). A linear function provided a good fit to the transformed values:
When the predicted values were back-transformed to calculate the centiles, we obtained a negative value of −0.049 for the 5th centile at 6 weeks. Quantile regression was then used to estimate more appropriate centiles:
The model fitted satisfactorily as 15 observations (9.0%) were below the 10th centile and 15 (9.0%) were above the 90th centile. The data points lying outside the reference interval were spread throughout the range.
Table 2 presents the intra- and interobserver variation of the 3D volume measurements. The volume measurements of GSV, YSV and EV showed high levels of intra- and interobserver agreement.
Table 2. Intra- and interobserver variation of three-dimensional volume measurements of gestational sac volume (GSV), yolk sac volume (YSV) and embryo volume (EV)
| GSV (mm3)||0.325||0.8||(−2.6 to 0.63)||0.99|
| YSV (mm3)||0.004||0.01||(−0.009 to 0.03)||0.97|
| EV (mm3)||0.075||0.23||(−0.12 to 0.80)||0.99|
| GSV (mm3)||0.285||1.05||(−3.6 to 1.17)||0.99|
| YSV (mm3)||0.002||0.01||(−0.024 to 0.015)||0.98|
| EV (mm3)||0.021||0.13||(−0.12 to 0.36)||0.99|
Table 3 shows the CRL and EV in relation to fetal gender. There was no difference in gestational age between the males and females (P = 0.4); overall measurement means and SDs are therefore reported. There were no statistically significant differences between males and females in CRL (P = 0.2) or in EV (P = 0.2).
Table 3. Crown–rump length (CRL) and embryo volume (EV) according to fetal gender
| ||Male||Female|| |
There was a very strong correlation (r = 0.94) between CRL and EV (Figure 3a). For CRL measurements between 20 and 70 mm the relationship was linear. There was also a very strong correlation (r = 0.95) between CRL and GSV (Figure 3b).
To the best of our knowledge, this is the first study in the literature that has used 3D transvaginal ultrasound to derive reference intervals of first-trimester GSV, YSV and EV using accepted methodology; the currently available reference intervals for these parameters are based on incorrect methodology6–8. The use of appropriate methodology is crucial, as inaccurate centiles may lead to incorrect decisions regarding embryonic/fetal development, resulting in substandard clinical care15.
In the collection of data specifically for the purpose of developing centiles for size, Altman and Chitty5 recommend that each fetus be included once only in the study. This was not the case with recently published reference intervals of first-trimester volumetric measurements6–8. Babinski et al.6 created nomograms using 73 measurements obtained from 49 pregnancies between the 25th and 65th days post-ovulation. Although they recorded values from 4 + 2 weeks of gestational age, they developed their reference intervals from 5 + 4 weeks to 9 + 2 weeks6.
Gadelha et al.7 studied only 25 fetuses in a longitudinal prospective study and measured each fetus on four occasions, during the 8th, 9th, 10th and 11th weeks of pregnancy. In our study, the gestational age was recorded precisely to the day, rather than being rounded to the number of completed weeks. Rounding affects the creation of reference intervals, because the mean and SD then do not change smoothly with gestational age, as one would expect on a biological basis4. In our study the mean and SD were modeled using the exact gestational age, resulting in smooth reference interval curves. Gadelha et al.7, using 3D ultrasound, and Weissman et al.16, using 2D ultrasound, did not take this into account and presented their data rounded to each gestational week in the development of their centile charts.
In the construction of reference intervals requiring a 90% range between the 5th and 95th centiles of the distribution, a sample size of 20 per week is recommended17. The volumetric assessment by Aviram et al.8 is limited as they studied between 10 and 14 fetuses per week from 6 to 11 weeks and only three fetuses at 12 weeks. In addition, their reference intervals were not obtained from a pregnant population with normal fetal outcome, as all 72 women recruited to their study underwent a termination of pregnancy. In the present study, we were able to recruit an adequate number of women between the 6th and 12th gestational weeks. This was not possible for the fifth week of pregnancy, however, possibly because many women take a home pregnancy test when their menses are delayed by a week and in most instances see their doctor only after the 5th week, or call the practice nurse during their 5th week of amenorrhea. Only then would their general practitioner or practice nurse have offered them the opportunity to contact one of the authors (J. S. B.) for a pregnancy confirmation scan and measurement of the volume of the early pregnancy structures.
Prior to the construction of the volume reference intervals, we developed reference intervals for CRL for this group of healthy women with a normal pregnancy outcome. We averaged the CRL obtained from three different satisfactory measurements because a single measurement may estimate gestational age with an SD of ± 4.7 days, whereas an average of three different measurements may reduce the SD to ± 2.7 days11. Our values for CRL fitted a standard Gaussian distribution, confirmed by the plot of Z-scores against gestational age, and were similar to the reported normal values11. Bland and Altman have argued that averaging repeated measurements for each subject may lead to narrower centiles at any given gestational age than had they been constructed from single measurements18. Reference intervals derived from single CRL measurements may give widened centiles because they include a greater amount of measurement error. However, the impact is probably minimal in clinical practice. It has been reported previously that a sex difference in CRL was demonstrated from 8 weeks onward, with male embryos having an average measurement 2 mm greater than female embryos at the same gestational age19. However, we found no difference in CRL or in EV between male and female embryos.
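The reduction in SD from ± 4.7 to ± 2.7 days quoted above is consistent with what one would expect if the three CRL measurements were independent, in which case the SD of their mean is the single-measurement SD divided by √3. The short check below makes that arithmetic explicit; independence is our assumption, not a statement from the source.

```python
import math


def sd_of_mean(sd_single, n_measurements):
    """SD of the mean of n independent measurements that share the same SD."""
    return sd_single / math.sqrt(n_measurements)


# 4.7 days for a single CRL measurement, averaged over three measurements.
sd_three = sd_of_mean(4.7, 3)
```

Rounded to one decimal place, `sd_three` is 2.7 days.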
Our formula for median GSV gives values that increase from 4.4 mm3 at 6 weeks to 114.9 mm3 at 12 weeks. Our estimation of the GSV differs from the first reported reference intervals in the literature20. Robinson20 performed measurements using 2D ultrasound imaging and used the mathematical formula of a sphere to calculate volumes. He also used transabdominal ultrasound with a full bladder, in contrast to our measurements, which were performed transvaginally using a 7.5-MHz probe. Robinson20 found that the mean GSV increased exponentially from 1 mm3 at 6 weeks to 31 mm3 at 10 weeks and then in a more linear manner to 100 mm3 at 13 weeks. He also found that the two-SDs limits increased considerably with gestational age and concluded that volume measurements would allow a prediction of gestational age of no better than ± 9 days, and were therefore of lesser value than CRL in measuring gestational age. In our study GSV increased in an exponential manner between 6 and 12 weeks' gestation.
In estimating GSV, we included the amniotic fluid, the extraembryonic celom and the fetus, as did Robinson20. We found that the embryo occupied 0.9% of the GSV at 6 weeks and 16.8% at 12 weeks. In 30 pregnancies in the first trimester Weissman et al.16, using 2D ultrasound measurements and the formula of an ellipsoid (V = (4/3)π r1r2r3), found that the embryo occupied between 5 and 16% of the sac volume, and in their subsequent analysis of 95 pregnancies they therefore ignored the estimated EV when calculating the amniotic fluid volume.
We obtained YSV in 145 of the 166 pregnancies in our study. The yolk sac is seen in all pregnancies from 5 weeks' gestation onwards, when the gestational sac exceeds 11 mm in diameter21. Its diameter increases in size up to 11 weeks and then decreases22. The decreased vascularity of the yolk sac at the time of its maximum volume is proposed as the cause of its degeneration and disappearance23. In our study, the reference intervals for first-trimester YSV increased in a linear fashion up to 10 weeks, then maintained a plateau until 11 weeks and decreased thereafter, similar to the findings of Kupesic et al.23.
In conclusion, we have presented new reference intervals for the volumes of the gestational sac, yolk sac and embryo in the first trimester of pregnancy using 3D ultrasound. Our approach has followed a rigorous methodology as prescribed by previous authors4, 5, 15.
SUPPORTING INFORMATION ON THE INTERNET
The following supporting information may be found in the online version of this article:
Table S1 Mean and standard deviation of crown–rump length for each day of menstrual age from 6–12 weeks
Table S2 Comparison of our formula for mean crown–rump length against that published by Robinson and Fleming11