To determine whether the use of a sex-specific sonographic model improves the accuracy of fetal weight estimation.
New regression models (sex-independent and sex-specific) were developed, based on 1708 sonographic weight estimations performed within 3 days prior to delivery. The accuracy of these models was compared to that of several published models including two of the original Hadlock models (which incorporate the biometric indices abdominal circumference (AC), biparietal diameter (BPD), femur diaphysis length (FL) and head circumference (HC) as follows: AC-FL-BPD and AC-FL-HC, designated here as Hadlock I and Hadlock II, respectively), modified versions of the Hadlock I and II models for which coefficients were adjusted to our local cohort, sex-specific versions of the Hadlock I and II models and Schild's model (a previously published sex-specific model).
The unadjusted models of Hadlock and Schild were associated with the highest systematic error (1.6–4.9%; P < 0.001) which was significantly higher for females (2.3–4.9%) compared to males (1.6–2.0%; P < 0.001). Adjustment of model coefficients to the local population decreased the systematic error (−1.4% to 1.5%) and resulted in a systematic error that was of similar magnitude (P = 0.3) but opposite in direction for male and female fetuses. The sex-specific models (adjusted or newly developed) were associated with the lowest systematic error (−0.4% to 0.5%) and were the only models for which the systematic error was similar for male and female fetuses. There were no differences in the systematic error between adjusted sex-specific versions of the Hadlock I and II models and the newly developed sex-specific models (0.0% to 0.4% vs. −0.4% to 0.5%; P = 0.4). The random error was similar for all models and, for most of the models, was unrelated to fetal sex.
The use of sex-specific models appears to improve the accuracy of fetal weight estimation, principally because the optimal set of model coefficients differs for male and female fetuses. The improved accuracy is mainly the result of a decrease in systematic error, as the random error was not affected by the use of such sex-specific models. Copyright © 2012 ISUOG. Published by John Wiley & Sons, Ltd.
It has been suggested previously that differences exist in the accuracy of sonographic weight estimation between male and female fetuses1, and that these differences may be the result of sex-specific intrauterine growth patterns2–8, including sex-related differences in body composition and percent of body fat6, and in ratios among various biometric indices8. Indeed, we have recently reported that sonographic weight estimation was consistently more accurate for male than female fetuses, independent of the model used9.
It may be reasonable to hypothesize that the use of two distinct sex-specific models, optimized for male and female fetuses, may overcome this limitation. However, the impact of such sex-specific models on the accuracy of fetal weight estimation is as yet unclear. Surprisingly, none of the widely accepted sonographic models for fetal weight estimation10–15 includes fetal sex in the equation, and the results of only one study indicate that the use of such a sex-specific model may result in more accurate estimation than several widely used models7, 16, 17. Moreover, it is unknown whether such sex-related model optimization merely reflects a different set of sex-specific model coefficients or whether these sex-specific models also differ in the combination of biometric indices incorporated into the model.
Thus, the aim of the present study was to determine whether the use of a sex-specific sonographic model improves the accuracy or precision of fetal weight estimation, and to provide a better understanding of the reasons for sex-related differences in the accuracy of fetal weight estimation.
A retrospective cohort study design was used. Data were collected from a comprehensive database of sonographic examinations in a single tertiary center. The same database was utilized and described in several previous studies from our group that addressed other aspects of the accuracy of fetal weight estimation, including comparison of the accuracy of different sonographic models18, evaluation of the effect of the sonographic model and threshold on the diagnosis of macrosomia19, the accuracy of sonographic measurement of head circumference compared with postnatal measurements20, the effect of fetal presentation on the accuracy of weight estimation21 and the accuracy of fetal weight estimation as a function of fetal sex9. The findings of the latter study formed the basis for the current study, as mentioned in the Introduction. Routine sonographic evaluations include standard fetal biometry measurements (abdominal circumference (AC), femur diaphysis length (FL), biparietal diameter (BPD) and head circumference (HC)); the findings are saved directly to the database. Antenatal data, gestational age at delivery, fetal sex and actual birth weight (BW) were obtained from the hospital's perinatal database. The study protocol was approved by the local Institutional Review Board.
The database was searched for all sonographic fetal weight estimations performed within 3 days prior to delivery between the years 2002 and 2008. Inclusion criteria for the study were livebirth singleton pregnancy, birth weight > 500 g, gestational age > 24 weeks and absence of fetal malformations or hydrops. Pregnancies complicated by diabetes, active labor or ruptured membranes, and cases in which one or more of the four biometric indices had not been recorded, were excluded.
The study population was divided randomly into two distinct groups of similar size. The first (model-generation group) was used to generate the new sonographic models and the second (model-evaluation group) was used to evaluate the accuracy of the new models and to compare them with that of several widely accepted models.
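The random division described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the record layout (a dictionary with a `sex` key), the function name `split_by_sex` and the fixed seed are all assumptions made for the example. Splitting within each sex is one simple way to obtain two groups of similar size with the same proportion of male and female fetuses.

```python
import random


def split_by_sex(records, seed=0):
    """Randomly split records into two halves, separately within each sex.

    `records` is a list of dicts with a "sex" key ("male" or "female").
    The layout and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)  # local generator so the split is reproducible
    generation, evaluation = [], []
    for sex in ("male", "female"):
        group = [r for r in records if r["sex"] == sex]
        rng.shuffle(group)
        half = len(group) // 2
        generation.extend(group[:half])   # model-generation group
        evaluation.extend(group[half:])   # model-evaluation group
    return generation, evaluation
```

Models would then be fitted on the first group only and their errors computed on the second.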
Stepwise multivariable linear regression analysis was used to develop new sonographic models. The natural logarithm of actual birth weight was defined as the dependent variable, while the independent variables included the four routine biometric indices (AC, FL, BPD and HC), the products of these indices (AC × FL, AC × BPD, AC × HC, FL × BPD and FL × HC), as well as additional variables derived from these indices including X², X³, X^(1/2), X^(1/3), log10(X), ln(X), e^X and 10^X (X representing each of the four biometric indices). A forward selection procedure was used to obtain the best-fit model. The cut-off value for selection or removal of covariates was set at P = 0.05. Sex-specific models were generated, based on either the male or female subgroup of the model-generation group. Because these newly developed sex-specific models might appear superior to the other models evaluated in this study solely because they were derived from the same population as that used in the evaluation phase, we also developed new sex-independent models based on the entire model-generation group to control for this potential bias. From the several models generated, the two sex-independent models and two sex-specific models with the highest R² values were chosen for further evaluation (Table 1). The first models (new sex-independent model 1 and new sex-specific model 1) were generated using the complete repertoire of independent variables (including simple biometric indices and the products, power, log and exponential transformations of the biometric indices, as described above). The second model in each group (new sex-independent model 2 and new sex-specific model 2) was generated by giving higher priority to one or more independent variables, using different blocks in the linear regression procedure. This was repeated using many variations, and the model with the highest R² value was selected as the second model.
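To make the pool of candidate predictors concrete, the sketch below builds, for a single examination, the variables listed above: the four indices, the five products and the eight transformations of each index. The function name, the dictionary keys and the assumption that measurements are in centimeters are illustrative only and are not taken from the study; the selection step itself (forward selection at P = 0.05) is not shown.

```python
import math

# The five index products named in the text.
PRODUCT_PAIRS = [("AC", "FL"), ("AC", "BPD"), ("AC", "HC"), ("FL", "BPD"), ("FL", "HC")]


def candidate_variables(ac, fl, bpd, hc):
    """Return the candidate regression variables for one examination.

    Inputs are assumed to be in centimeters (an assumption of this sketch).
    """
    indices = {"AC": ac, "FL": fl, "BPD": bpd, "HC": hc}
    candidates = dict(indices)  # the four simple indices
    for a, b in PRODUCT_PAIRS:
        candidates[f"{a}*{b}"] = indices[a] * indices[b]
    for name, x in indices.items():
        candidates[f"{name}^2"] = x ** 2
        candidates[f"{name}^3"] = x ** 3
        candidates[f"{name}^(1/2)"] = math.sqrt(x)
        candidates[f"{name}^(1/3)"] = x ** (1 / 3)
        candidates[f"log10({name})"] = math.log10(x)
        candidates[f"ln({name})"] = math.log(x)
        candidates[f"e^{name}"] = math.exp(x)
        candidates[f"10^{name}"] = 10.0 ** x
    return candidates
```

This gives 4 + 5 + 4 × 8 = 41 candidates per examination. Many of them are strongly correlated with one another, which is the multicollinearity concern raised later in the discussion.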
|Model||Reference or description||Remark||Equation||R²*|
|1||Hadlock I (AC-FL-BPD)12||Original model||Log10 EFW = 1.335 − 0.0034(AC)(FL) + 0.0316(BPD) + 0.0457(AC) + 0.1623(FL)||N/A|
|2||Hadlock I; sample-specific||With local coefficients||Log10 EFW = 1.14577 − 0.0046(AC)(FL) + 0.03263(BPD) + 0.05278(AC) + 0.19336(FL)||0.927|
|3||Hadlock I; sample-specific, sex-specific||With local sex-specific coefficients||Males: Log10 EFW = 1.19384 − 0.00449(AC)(FL) + 0.02920(BPD) + 0.05165(AC) + 0.19337(FL)||0.918|
|Females: Log10 EFW = 1.10382 − 0.00477(AC)(FL) + 0.03043(BPD) + 0.05447(AC) + 0.19922(FL)||0.937|
|4||Hadlock II (AC-FL-HC)12||Original model||Log10 EFW = 1.326 − 0.00326(AC)(FL) + 0.0107(HC) + 0.0438(AC) + 0.158(FL)||N/A|
|5||Hadlock II; sample-specific||With local coefficients||Log10 EFW = 1.14266 − 0.00446(AC)(FL) + 0.01036(HC) + 0.05153(AC) + 0.18904(FL)||0.927|
|6||Hadlock II; sample-specific, sex-specific||With local sex-specific coefficients||Males: Log10 EFW = 1.18113 − 0.0044(AC)(FL) + 0.00931(HC) + 0.05094(AC) + 0.19005(FL)||0.918|
|Females: Log10 EFW = 1.10743 − 0.00462(AC)(FL) + 0.00961(HC) + 0.05297(AC) + 0.19532(FL)||0.937|
|7||Schild et al.7; sex-specific||Original model||Males: EFW = 43576.579 + 1913.853(log10 BPD) + 0.01323(HC³) + 55.532(AC²) − 13602.664(AC^(1/2)) − 0.721(AC³) + 2.31(FL³)||N/A|
|Females: EFW = −4035.275 + 1.143(BPD³) + 1159.878(AC^(1/2)) + 10.079(FL³) − 81.277(FL²)|
|8||New sex-independent model 1||Local model, overall cohort||Ln EFW = 2.40075 − 0.00785(AC)(FL) + 0.09008(HC) + 0.12413(AC) + 0.34804(FL) − 0.14107(BPD) − 0.00225(AC)(HC) + 0.00549(AC)(BPD)||0.929|
|9||New sex-independent model 2||Local model, overall cohort||Ln EFW = 2.27131 − 0.00888(AC)(FL) + 0.08034(HC) − 0.00101(HC²) + 0.10783(AC) + 0.32646(FL) + 0.00619(FL)(BPD)||0.928|
|10||New sex-specific model 1||Local model, sex-specific||Males: Ln EFW = 2.33375 − 0.00483(AC)(FL) + 0.06371(HC) + 0.12748(AC) + 0.25083(FL) + 0.04006(BPD) − 0.00150(AC)(HC)||0.920|
|Females: Ln EFW = 2.64735 − 0.01161(AC)(FL) + 0.01366(HC) + 0.11794(AC) + 0.47650(FL) + 0.00117(AC)(BPD)||0.938|
|11||New sex-specific model 2||Local model, sex-specific||Males: Ln EFW = 2.21686 − 0.00824(AC)(FL) + 0.09304(HC) − 0.00122(HC²) + 0.10260(AC) + 0.31613(FL) + 0.00566(FL)(BPD)||0.919|
|Females: Ln EFW = 2.65191 + 0.10405(AC) − 0.00088(AC²) + 0.53721(FL) + 0.01398(HC) − 0.03125(FL²) + 0.00220(BPD²)||0.938|
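As an illustration of how the equations in Table 1 are applied, the sketch below evaluates the original Hadlock I model (Model 1) and its sample-specific, sex-specific version (Model 3) for a single set of measurements. The coefficients are copied from Table 1. The function names, the use of centimeters for the biometric indices and grams for the result, and the example measurements are assumptions of this sketch; the table itself does not state units.

```python
def hadlock_1(ac, fl, bpd):
    """Model 1 (original Hadlock I, AC-FL-BPD): estimated fetal weight.

    Inputs assumed in cm, output assumed in g.
    """
    log10_efw = 1.335 - 0.0034 * ac * fl + 0.0316 * bpd + 0.0457 * ac + 0.1623 * fl
    return 10 ** log10_efw


# Model 3 coefficients from Table 1: (intercept, AC*FL, BPD, AC, FL)
HADLOCK_1_SEX_SPECIFIC = {
    "male": (1.19384, -0.00449, 0.02920, 0.05165, 0.19337),
    "female": (1.10382, -0.00477, 0.03043, 0.05447, 0.19922),
}


def hadlock_1_sex_specific(ac, fl, bpd, sex):
    """Model 3: same biometric indices as Model 1, with sex-specific coefficients."""
    b0, b_acfl, b_bpd, b_ac, b_fl = HADLOCK_1_SEX_SPECIFIC[sex]
    log10_efw = b0 + b_acfl * ac * fl + b_bpd * bpd + b_ac * ac + b_fl * fl
    return 10 ** log10_efw


# Hypothetical term measurements (cm), for illustration only.
ac, fl, bpd = 34.0, 7.2, 9.2
for label, efw in [
    ("Model 1", hadlock_1(ac, fl, bpd)),
    ("Model 3, male", hadlock_1_sex_specific(ac, fl, bpd, "male")),
    ("Model 3, female", hadlock_1_sex_specific(ac, fl, bpd, "female")),
]:
    print(f"{label}: {efw:.0f} g")
```

For these example measurements the male equation returns a higher estimate than the female equation from identical biometry, which is the kind of difference a single sex-independent equation cannot represent.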
The newly developed models were compared to several previously published models (Table 1) including: 1) the original Hadlock models (AC-FL-BPD and AC-FL-HC, designated here as Hadlock I and II models, respectively)12, which were previously shown to be highly accurate as compared with other published models18; 2) sample-specific modified versions of the Hadlock I and II models with coefficients adjusted to our local cohort by linear regression analysis (these models were included to control for potential bias due to differences between our population and that used to generate the original models); 3) sex-specific versions of the Hadlock I and II models with sex-specific coefficients (determined by linear regression analysis) that are based on the local cohort (these models were included to aid interpretation of results and to determine whether the sex-related model optimization reflects merely a different set of sex-specific model coefficients or whether the optimal sex-specific models also differ in the combination of biometric indices incorporated into the model); and 4) a sex-specific model previously developed and evaluated by Schild et al.7 and Siemer et al.17 (this model was developed in a manner similar to our models and was included to assess the effect of differences in characteristics of the local population and the population used to generate the model).
The performance of each of the 11 models (Table 1) was assessed using the following measures of accuracy, where EFW is the estimated fetal weight and BW is the actual birth weight: 1) systematic error (mean of (EFW − BW)/BW × 100), reflecting the systematic deviation of a model from actual birth weight, expressed as a percentage of actual birth weight; 2) random error (standard deviation of the percentage error (EFW − BW)/BW × 100), a measure of precision rather than accuracy that reflects the random (non-systematic) component of the prediction error; and 3) fraction of estimates within 10% of actual birth weight.
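The three measures can be computed from paired lists of estimated and actual weights as in the sketch below. The function names are illustrative, and two details are assumptions that the text does not specify: the sample (n − 1) standard deviation is used for the random error, and an estimate exactly 10% away from birth weight is counted as within 10%.

```python
from statistics import mean, stdev


def percentage_errors(efw, bw):
    """Signed percentage error of each estimate: (EFW - BW) / BW * 100."""
    return [(e - b) / b * 100 for e, b in zip(efw, bw)]


def accuracy_summary(efw, bw):
    """Systematic error, random error and percentage of estimates within 10% of BW."""
    errors = percentage_errors(efw, bw)
    return {
        "systematic_error": mean(errors),   # mean percentage error
        "random_error": stdev(errors),      # sample SD of the percentage error
        "within_10_percent": sum(abs(x) <= 10 for x in errors) / len(errors) * 100,
    }
```

A model can have a systematic error near zero while still having a large random error, which is why the two measures are reported separately.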
In addition, in order to determine whether the accuracy of sex-specific models is related to the birth-weight subgroup, we calculated the systematic and random errors of the various models for three different subgroups of birth-weight ranges. The subgroups were defined as birth weight below the first quartile (< 25%), birth weight in the interquartile range (25–75%), and birth weight above the third quartile (> 75%). The quartiles were derived from the study cohort and calculated separately for male and female fetuses (first quartile, 2899 g for males and 2675 g for females; third quartile, 3788 g for males and 3607 g for females).
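The subgroup assignment can be written directly from the sex-specific quartile values quoted above. The function and label names in this sketch are illustrative, and assigning a birth weight exactly equal to a quartile value to the interquartile range is an assumption.

```python
# First and third quartiles of birth weight (g), as reported for the study cohort.
QUARTILES_G = {"male": (2899, 3788), "female": (2675, 3607)}


def birth_weight_subgroup(bw, sex):
    """Classify a birth weight (g) using the quartiles for the given fetal sex."""
    q1, q3 = QUARTILES_G[sex]
    if bw < q1:
        return "below first quartile"
    if bw > q3:
        return "above third quartile"
    return "interquartile range"
```

Because the quartiles are sex-specific, the same birth weight can fall into different subgroups for male and female fetuses; 2800 g, for example, is below the first quartile for males but within the interquartile range for females.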
Gestational age at the time of examination was recorded in the database along with details of sonographic examination and was calculated according to the last menstrual period and/or first trimester ultrasound results when available. Gestational age at the time of examination was further verified by comparing the examination date with delivery date and gestational age at delivery (obtained from the perinatal database). All sonographic fetal weight estimations were performed in our ultrasound unit by senior physicians specialized in ultrasonography or by experienced ultrasound technicians. In the latter case, the examination was reviewed and confirmed by a senior physician. The BPD was measured from the proximal echo of the fetal skull to the proximal edge of the deep border (outer–inner) at the level of the cavum septi pellucidi. The HC was measured as an ellipse around the perimeter of fetal skull22. The AC was measured in the transverse plane of the fetal abdomen at the level of the umbilical vein in the anterior third and the stomach bubble in the same plane; measurements were taken around the perimeter23. The FL was measured in view of the full femoral diaphysis and was taken from one end of the diaphysis to the other, not including the distal femoral epiphysis24. Routine sonographic examination does not include determination of fetal sex; thus, sonographers were blinded to fetal sex during the process of sonographic weight estimation.
Data analysis was performed with SPSS v15.0 software (SPSS Inc., Chicago, IL, USA). The independent-samples Student's t-test was used to compare normally distributed continuous variables; the Mann-Whitney U-test was used for non-normally distributed continuous variables (gestational age at delivery and interval between sonographic evaluation and delivery); and the chi-square test was used for categorical data. The accuracy of each model was compared between male and female fetuses using the independent Student's t-test for the systematic error, Levene's test (equality of variance) for the random error and the chi-square test for the fraction of weight estimations within 10% of birth weight.
Different models were compared (separate comparisons performed for male and female fetuses) using the paired Student's t-test for the systematic error, Pitman's test (equality of correlated variance) for the random error and McNemar's test for the fraction of estimates within 10% of actual birth weight. The one-sample t-test was used to assess whether systematic errors were significantly different from zero. Differences were considered significant at P < 0.05. For intermodel comparisons, the new sex-specific model 2 (model 11) was used as reference, and Bonferroni corrections were used to control for multiple comparisons as necessary, maintaining an overall type I error rate of 0.05.
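The study's analysis was run in SPSS. Purely as an illustration of the paired comparisons named above, the sketch below computes the paired t statistic for the difference in systematic error between two models and the Pitman (Pitman-Morgan) statistic for equality of correlated variances, which is based on the correlation between the sums and differences of the paired errors. The function names are hypothetical, P-values are not computed (they would come from a t distribution with n − 1 and n − 2 degrees of freedom, respectively), and the count of 10 comparisons against Model 11 in the Bonferroni example is an assumption.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """t statistic for the mean of the paired differences x - y (df = n - 1)."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))


def pitman_morgan_t(x, y):
    """Pitman-Morgan t statistic for equal variances of paired samples (df = n - 2).

    Positive when x has the larger variance, negative when y does.
    """
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    r = pearson_r(s, d)
    return r * math.sqrt((len(x) - 2) / (1 - r ** 2))


def bonferroni_alpha(overall_alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return overall_alpha / n_comparisons


# Example: 10 models each compared with the reference model (an assumed count).
print(bonferroni_alpha(0.05, 10))
```

Here `x` and `y` would be the percentage errors of two models on the same set of fetuses, so that each pair refers to one examination.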
A total of 3416 fetal weight estimations (1708 males, 1708 females) met the inclusion criteria. The cohort was randomly divided into two groups of equal size and with the same proportion of male and female fetuses, which were used for model-generation and model-evaluation, respectively (Table 2). The model-generation and model-evaluation groups were similar with regard to all characteristics (Table 2). In both groups male fetuses, compared with female fetuses, were characterized by a higher birth weight and a higher rate of cephalic index > 80 (Table 2).
|Model-generation group (n = 1708)||Model-evaluation group (n = 1708)|
|Characteristic||Males (n = 854)||Females (n = 854)||P†||Males (n = 854)||Females (n = 854)||P†||P*|
|Maternal age (years)||30.2 ± 5.3||30.5 ± 5.2||0.3||30.2 ± 5.0||30.2 ± 4.8||0.8||0.5|
|Nulliparity||372 (43.6)||342 (40.0)||0.2||361 (42.3)||340 (39.8)||0.3||0.6|
|Gestational age at delivery (weeks)||38.8 ± 2.5||38.7 ± 2.5||0.6||38.9 ± 2.4||38.8 ± 2.4||0.8||0.4|
|Delivery at < 37 weeks||126 (14.8)||114 (13.3)||0.5||113 (13.2)||111 (13.0)||0.9||0.4|
|Breech presentation||29 (3.4)||33 (3.9)||0.6||36 (4.2)||36 (4.2)||1.0||0.4|
|Days from fetal weight estimation to delivery||1.2 ± 1.0||1.2 ± 1.0||0.6||1.2 ± 1.0||1.2 ± 1.0||0.9||0.8|
|Fetal weight estimated:|
|On day of delivery||239 (28.0)||220 (25.8)||0.3||238 (27.9)||226 (26.5)||0.5||0.9|
|1 day prior to delivery||327 (38.3)||335 (39.2)||0.7||324 (37.9)||347 (40.6)||0.3||0.8|
|2 days prior to delivery||174 (20.4)||184 (21.5)||0.6||172 (20.1)||173 (20.3)||0.9||0.6|
|3 days prior to delivery||114 (13.3)||116 (13.6)||0.9||120 (14.1)||108 (12.6)||0.4||0.9|
|Birth weight (g)||3286 ± 708||3111 ± 700||< 0.001||3281 ± 708||3114 ± 670||< 0.001||0.9|
|Birth weight > 4000 g||118 (13.8)||67 (7.8)||< 0.001||105 (12.3)||59 (6.9)||< 0.001||0.2|
|Cephalic index (CI)||79.7 ± 26.3||79.4 ± 28.4||0.8||80.1 ± 36.0||78.4 ± 3.2||0.09||0.9|
|CI < 75||106 (12.4)||111 (13.0)||0.6||82 (9.6)||105 (12.3)||0.06||0.1|
|CI = 75–80||443 (51.9)||483 (56.6)||0.07||467 (54.7)||504 (59.0)||0.1||0.1|
|CI > 80||305 (35.7)||260 (30.4)||0.02||305 (35.7)||245 (28.7)||0.003||0.7|
Systematic error was highest for the original unadjusted models (Hadlock I (Model 1), Hadlock II (Model 4, for females) and Schild's (Model 7)) and these differences were statistically significant when using the new sex-specific model 2 (Model 11) as reference (1.6% to 4.9% vs. −0.4% to 0.5%; Table 3, Figure 1). Systematic error was lower for the sample-specific sex-independent Hadlock models (Models 2 and 5) and the new sex-independent models (Models 8 and 9), ranging from −1.4% to 1.5%. The adjusted sex-specific Hadlock models (Models 3 and 6) and the new sex-specific models (Models 10 and 11) were associated with even smaller systematic errors (−0.4% to 0.5%) and were the only models for which the systematic error was not statistically different from zero (Table 3, Figure 1). There were no differences in systematic error between the adjusted sex-specific versions of the Hadlock I and II models and the newly developed sex-specific models (0.0% to 0.4% vs. −0.4% to 0.5%; P = 0.4).
|Systematic error (%, mean (95% CI))||Random error (%, mean (95% CI))||EFW within 10% of BW (95% CI) (%)|
|Model||Males||Females||P||Males||Females||P||Males||Females||P|
|1: Hadlock I (AC-FL-BPD)||1.6§ (1.0–2.1)||3.9§ (3.3–4.5)||< 0.001||8.2 (7.9–8.5)||8.2 (7.9–8.5)||0.3||76.6††(75.2–78.0)||75.3††(73.7–76.9)||0.04|
|2: Hadlock I; sample-specific||−1.2 (−1.7 to −0.7)||1.4 (0.9–2.0)||< 0.001||7.7 (7.4–8.0)||7.8 (7.5–8.1)||0.6||80.1 (78.7–81.5)||82.7 (81.3–84.1)||0.2|
|3: Hadlock I; sample-specific, sex-specific||0.1¶ (−0.5 to 0.6)||0.4¶ (−0.1 to 1.0)||0.3||7.8 (7.5–8.1)||7.7 (7.4–8.0)||0.4||81.0 (79.6–82.4)||83.3 (81.9–84.7)||0.2|
|4: Hadlock II (AC-FL-HC)||−0.6 (−1.1 to −0.1)||2.3§ (1.7–2.8)||< 0.001||8.1 (7.8–8.4)||8.0 (7.7–8.3)||0.3||80.0 (78.6–81.4)||80.1 (78.6–81.6)||0.9|
|5: Hadlock II; sample-specific||−1.4 (−1.9 to −0.9)||1.4 (0.9–2.0)||< 0.001||7.8 (7.5–8.1)||7.7 (7.4–8.0)||0.7||80.6 (79.2–82.0)||83.7 (82.3–85.1)||0.1|
|6: Hadlock II; sample-specific, sex-specific||0.0¶ (−0.5 to 0.5)||0.3¶ (−0.3 to 0.8)||0.5||7.8 (7.5–8.1)||7.6 (7.3–7.9)||0.4||82.4 (81.0–83.8)||84.4 (83.0–85.8)||0.3|
|7: Schild's; sex-specific||2.0§ (1.2–2.7)||4.9§ (4.0–5.8)||< 0.001||10.8** (10.3–11.2)||12.4** (11.9–12.9)||0.03||68.7††(67.1–70.3)||65.1††(63.3–66.9)||0.1|
|8: New sex-independent model 1||−1.3 (−1.9 to −0.8)||1.3 (0.7–1.8)||< 0.001||7.7 (7.4–8.0)||7.7 (7.4–8.0)||0.7||80.1 (78.7–81.5)||83.9 (82.5–85.3)||0.06|
|9: New sex-independent model 2||−1.1 (−1.6 to −0.6)||1.5 (0.9–2.0)||< 0.001||7.8 (7.5–8.1)||7.8 (7.5–8.1)||0.6||80.1 (78.7–81.5)||83.3 (81.9–84.7)||0.1|
|10: New sex-specific model 1||− 0.4¶ (−1.2 to 0.1)||0.2¶ (−0.3 to 0.8)||0.2||7.7 (7.4–8.0)||7.6 (7.3–7.9)||0.5||81.3 (79.9–82.7)||84.3 (82.9–85.7)||0.1|
|11: New sex-specific model 2||− 0.4¶ (−1.1 to 0.0)||0.5¶ (−0.1 to 1.0)||0.08||7.8 (7.5–8.1)||7.7 (7.4–8.0)||0.5||81.3 (79.9–82.7)||84.1 (82.7–85.5)||0.1|
With regard to differences between male and female fetuses, in all the original unadjusted models (i.e. Hadlock I (Model 1), Hadlock II (Model 4) and Schild's (Model 7)) the systematic error was significantly higher for females than males (Table 3; Figure 1). In the case of the sample-specific Hadlock sex-independent models (Models 2 and 5) and the new sex-independent models (Models 8 and 9) the systematic error was still significantly different between males and females (P < 0.001) and was consistently negative for male fetuses (−1.1% to −1.4%) and positive for female fetuses (1.3% to 1.5%), reflecting underestimation and overestimation of fetal weight for males and females, respectively (Table 3). However, when considering the magnitude of systematic error (i.e. the absolute value of the systematic error), there were no differences between males and females using these sample-specific and new sex-independent models (P = 0.3–0.5, Figure 1). The use of sex-specific models (modified Hadlock models (Models 3 and 6) and the new models (Models 10 and 11)) eliminated sex-related differences in the systematic error (Table 3; Figure 1).
The random error demonstrated only minimal variation (7.6–8.2%), was similar for all models and was unrelated to fetal sex, with the exception of Schild's sex-specific model (Model 7), for which the overall random error was significantly larger (P < 0.001, using Model 11 as reference) and was higher for female than male fetuses (Table 3). Similarly, the fraction of estimates within 10% of actual birth weight was relatively similar for most models (80.0–84.3%) (Table 3).
In order to determine whether the advantages of sex-specific models are related to birth weight, we performed subgroup analysis in which systematic and random errors of the various models were calculated for three different birth-weight subgroups (below the first quartile, in interquartile range, and above the third quartile, as described in the Methods section) (Table 4). In general, weight estimations in the first-quartile subgroup were associated with a stronger tendency for overestimation of birth weight (i.e. more positive systematic error) while weight estimations in the third-quartile subgroup were associated with a tendency for underestimation of birth weight (i.e. more negative systematic error) (Table 4).
|Birth weight < lower quartile* (n = 398)||Birth weight in interquartile range* (n = 801)||Birth weight > upper quartile* (n = 399)|
|Model||Males (n = 213)||Females (n = 185)||P†||Males (n = 428)||Females (n = 373)||P†||Males (n = 213)||Females (n = 186)||P†|
|1: Hadlock I (AC-FL-BPD)||1.4 ± 9.5||4.8 ± 10.1§||< 0.001||2.2 ± 8.1§||4.4 ± 7.6§||< 0.001||− 0.4 ± 6.8‡§||2.0 ± 7.0§||< 0.001|
|2: Hadlock I; sample-specific||1.0 ± 9.0‡§||4.2 ± 9.4§||< 0.001||− 0.6 ± 7.3‡§||1.8 ± 6.8§||< 0.001||− 4.6 ± 6.0§||− 2.3 ± 6.2§||< 0.001|
|3: Hadlock I; sample-specific, sex-specific||2.7 ± 9.1§||3.0 ± 9.4||0.7||0.6 ± 7.2‡§||0.9 ± 6.8||0.6||− 3.6 ± 6.0§||− 3.1 ± 6.1||0.4|
|4: Hadlock II (AC-FL-HC)||−0.4 ± 9.6‡§||3.4 ± 9.5§||< 0.001||0.4 ± 7.7‡§||2.6 ± 7.6§||< 0.001||− 2.2 ± 6.9§||0.3 ± 6.9‡§||< 0.001|
|5: Hadlock II; sample-specific||0.4 ± 9.4‡§||4.3 ± 9.0§||< 0.001||− 0.7 ± 7.1§||1.8 ± 7.0§||< 0.001||− 4.5 ± 6.3§||− 2.1 ± 6.2§||< 0.001|
|6: Hadlock II; sample-specific, sex-specific||2.2 ± 9.4§||3.0 ± 9.0||0.4||0.6 ± 7.1‡§||0.7 ± 6.9‡||0.9||− 3.4 ± 6.3§||− 3.2 ± 6.1||0.8|
|7: Schild's; sex specific||14.3 ± 10.9§||20.1 ± 12.4§||< 0.001||2.5 ± 5.9§||3.2 ± 5.7§||0.1||− 8.5 ± 6.5§||− 7.4 ± 6.3§||0.1|
|8: New sex-independent model 1||0.5 ± 9.2‡§||4.1 ± 9.2§||< 0.001||− 0.7 ± 7.2‡§||1.6 ± 7.0§||< 0.001||− 4.6 ± 6.0§||− 2.2 ± 6.1§||< 0.001|
|9: New sex-independent model 2||0.8 ± 9.4‡§||4.2 ± 9.2§||< 0.001||− 0.5 ± 7.1‡§||1.8 ± 7.0§||< 0.001||− 4.3 ± 6.1§||− 2.0 ± 6.2§||< 0.001|
|10: New sex-specific model 1||1.7 ± 9.2||2.8 ± 9.1||0.2||0.0 ± 7.1‡||0.6 ± 6.8‡||0.2||− 4.2 ± 6.0||− 3.1 ± 6.1||0.1|
|11: New sex-specific model 2||1.7 ± 9.5||2.9 ± 9.1||0.2||0.0 ± 7.1‡||0.9 ± 6.9||0.1||− 4.0 ± 6.0||− 2.9 ± 6.2||0.1|
Similar to the findings for the overall cohort, systematic error was significantly higher for female than for male fetuses in the original Hadlock models (Models 1 and 4) and in the sample-specific and new sex-independent models (Models 2, 5, 8 and 9), independent of birth-weight subgroup (Table 4). Furthermore, as observed for the overall cohort, the use of sample-specific sex-specific models (modified Hadlock models (Models 3 and 6) and the new models (Models 10 and 11)) eliminated these sex-related differences in the systematic error in each of the three birth-weight subgroups (Table 4).
With respect to relative accuracy of the different models, the sample-specific sex-specific models were associated with the lowest systematic error in the interquartile-birth-weight subgroup (Table 4), similar to findings in the overall cohort (Table 3). Such a beneficial effect of the sample-specific sex-specific models was not clearly observed in the other two birth-weight subgroups (i.e. below the first quartile and above the third quartile).
In the present study we sought to determine whether the use of a sex-specific sonographic model improves the accuracy of fetal weight estimation, as well as to provide a better understanding of the reasons for sex-related differences in the accuracy of fetal weight estimation. Our study revealed several key findings: 1) unadjusted published models are associated with the highest systematic error, which is significantly higher for female than male fetuses; 2) adjustment of model coefficients to the local population decreases systematic error and results in a systematic error of similar magnitude but opposite in direction for male and female fetuses; 3) sex-specific (adjusted or newly developed) models are associated with the lowest systematic error and are the only models for which systematic error is not significantly different from zero and is similar for male and female fetuses (the latter observation being independent of birth-weight subgroup); and 4) random error (a measure of precision rather than accuracy) is unrelated to the type of model and to fetal sex.
Considering the sex-related differences in intrauterine growth patterns2–8 and in accuracy of sonographic weight estimation7, 9, it is reasonable to assume that the use of two distinct sex-specific models which are optimized for male and female fetuses will improve the accuracy of fetal weight estimation. However, this hypothesis has been tested in only a small number of studies7, 16, 17. Schild et al.7 developed a sex-specific model based on the results of 527 sonographic weight estimations which was subsequently found to be associated with the lowest mean absolute percentage error (6.8%) and the second best systematic error (−0.5 ± 8.6) when compared with 10 widely accepted sex-independent models17. Similarly, in the current study, using a large cohort of unselected women who underwent sonographic evaluation in a single tertiary center within 3 days prior to delivery, we were able to confirm our hypothesis and to demonstrate that sex-specific (adjusted or newly developed) models are associated with the lowest systematic error and are the only models that eliminate the sex-related differences in systematic error.
One possible explanation for the apparently higher accuracy of the newly developed sex-specific models may actually be unrelated to sex-specificity but rather to the fact that these new models were derived from the same population that was subsequently used in the evaluation phase, a potential bias that has not been addressed in previous studies in which sex-specific models were developed1, 17. In order to control for this potential bias, we have compared the new sex-specific models to sex-independent models that have been developed from the same population, as well as to modified versions of the Hadlock models I and II for which coefficients were adjusted to the study population, as has been previously described by Lee et al.25. Indeed, these control models demonstrated that adjustment of model coefficients to the local population significantly improves the accuracy of weight estimation. However, it was also clearly evident that the use of sex-specific models further improves the accuracy, independent of the adjustment of model coefficients to the local population.
It is unclear whether the higher accuracy achieved with sex-specific models is simply the result of a different set of model coefficients, adjusted to each of the sexes, or because the combination of biometric indices that provides the optimal fit to birth weight is different for males and females. The fact that there were no differences in systematic error between the sample-specific sex-specific versions of the Hadlock I and II models and the newly developed sex-specific models supports the former explanation, since the combination of biometric indices was identical for the male and female versions of the sample-specific sex-specific Hadlock I and II models. Nevertheless, in the process of development of the new models, we found that the optimal models for males and females differed with regard to the combination of fetal biometric indices (Table 1), although this conclusion may be limited by the potential inherent multicollinearity in these models (due to the fact that some of the indices and their transformations are highly correlated with each other).
The reason for the poor performance of Schild's sex-specific model in the current study is unclear. Possible explanations include: differences in the study population (c. 10% of the women in the study of Schild et al.1 had either gestational or pregestational diabetes and 30–40% delivered at < 37 weeks of gestation); the inclusion of weight estimations performed within up to 8 days of delivery (a considerable period during which significant fetal growth may occur); and the fact that sonographic examinations were performed by a large and heterogeneous group (18 residents and consultants)16. In addition, the fact that the performance of this model was evaluated using the same population from which it was generated and that the model was compared only to original, non-adjusted published models could have led to overestimation of its relative accuracy in the previous studies7, 16, 17.
With regard to the effect of birth-weight subgroup, we found that elimination of sex-related differences in the systematic error by using sex-specific models was observed in each of the different birth-weight subgroups. However, these sex-specific models were associated with highest accuracy (lowest systematic error) only in the larger interquartile birth-weight subgroup, while this advantage was not clearly observed in the extremes of birth weight (below the first quartile and above the third quartile). This observation is probably related to the general tendency for over- and underestimation of birth weight in cases of low and high birth weight, respectively18, as was observed also in the current study. These birth weight-related changes in systematic error shift the overall optimized systematic error achieved with sex-specific models towards a more positive or more negative systematic error, respectively. Clearly, an optimal customized model would need to account for multiple factors that are known to affect the accuracy of weight estimation, including fetal sex and birth-weight subgroup.
In conclusion, we have confirmed our hypothesis that the use of sex-specific models improves the accuracy of fetal weight estimation, principally because the optimal set of model coefficients differs for male and female fetuses. Improved accuracy is mainly the result of a decrease in systematic error, as random error was not affected by the use of such sex-specific models. Nevertheless, it should be emphasized that from a more clinical perspective, the differences in systematic errors were relatively small and thus did not translate into significant differences in other measures of accuracy which may be of more interest to the clinician (e.g. the proportion of estimations within 10% of birth weight). Still, although the absolute decrease in systematic error achieved with these sex-specific models is relatively small, the use of such models may be one of several steps that may eventually contribute to improvement of the accuracy of fetal weight estimation, including the use of alternative biometric indices25–28, incorporation of other clinical factors into the models29, 30 and the use of models optimized for specific weight ranges1, 31, 32.