Is obstetric and neonatal outcome worse in fetuses who fail to reach their own growth potential?



Dr Chang (January 1994) writes a fervent defence of the subject matter of the study by Danielian et al. (June 1992), but this was never under attack. We all agree that the main issue is growth rather than size, as was also stated in my letter (Gardosi 1993). But the fact that the purpose of a study is correct should not place incorrect methodology beyond criticism; my point about the selective use of nomograms by Danielian and his coworkers, while disregarding the adjustment tables for maternal size (Altman & Coles 1980), still stands. The letter from Danielian et al. (1993) fails to explain why they bothered to adjust for parity and sex, either of which results in no more birthweight variation than that observed for a difference of only 0.5 units in body mass index around the mean, such as between two mothers who are 2.5 cm and 2.5 kg above, vs 2.5 cm and 2.5 kg below, the population average in height and booking weight, respectively (Altman & Coles 1980, Table 2).

The view which is now being put forward is that it is irrelevant which weight-for-gestational age charts are used (Danielian et al. 1994); we could do away with them altogether (Chang 1994) in favour of routine serial growth assessment. This is even more worrying. Serial scanning does pick up abnormal growth and is often used to monitor high risk pregnancies, but proposals to apply it routinely to screen for abnormal fetal weight gain would, even if this were ever feasible, result in many false positive and false negative assessments, because an ultrasound estimation of fetal weight is after all just that: an estimate.

A simple calculation, applied to the study by Danielian et al. (1992), will illustrate how often a significant error is likely to occur. In their protocol, only one estimation of fetal weight was needed, as the second point used to calculate a change was the actual birthweight. The estimation of fetal weight was obtained between 28 and 34 weeks of gestation according to a fetal weight formula (Hadlock et al. 1985). This was used to obtain a weight centile from which an estimated birthweight was derived for comparison with the actual birthweight. Receiver–operator curves suggested that a drop of more than 5% between scan-predicted weight and birthweight was clinically significant, and this was observed to have occurred in 22% of pregnancies.

Assuming the accuracy of estimated fetal weights was as good in this study as the best results published in the literature, the standard deviation of the error was 7.3% (Hadlock et al. 1985; Robson et al. 1993). A 5% difference in estimated fetal weight would translate into a Z-score of 5/7.3 = 0.685, beyond which lies 25% of a normal distribution. This means that in a quarter of all pregnancies an initial overestimation of fetal weight is to be expected, resulting in an apparent weight drop of 5% or more due to ultrasound error alone. This is more than the 22% of cases found in the study by Danielian et al. (1992), most of which were therefore unlikely to represent a population of babies that were growth retarded by their own definition. This raises the question of why the expected excess over the number of false positives, due to the babies in the population whose weight was truly reduced, was not picked up in their study.
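The figure of one quarter can be checked with a few lines of Python. This is only a sketch of the arithmetic in the paragraph above: it assumes, as the calculation there does, that the percentage error of an estimated fetal weight is normally distributed with zero mean and a standard deviation of 7.3%. The function and variable names are illustrative and not taken from any of the cited studies.

```python
import math


def upper_tail(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumed inputs, taken from the figures quoted in the letter.
sd_error_pct = 7.3    # SD of estimated fetal weight error, in percent
threshold_pct = 5.0   # apparent weight drop regarded as clinically significant

# Z-score of a 5% overestimate at the scan, and the chance of it occurring.
z = threshold_pct / sd_error_pct
p_apparent_drop = upper_tail(z)

print(f"Z = {z:.3f}")
print(f"Chance of an apparent drop of 5% or more from error alone: {p_apparent_drop:.1%}")
```

The result is sensitive to the assumed standard deviation; a larger error, as might be expected outside a research setting, would increase the proportion further.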

Another source of error is systematic bias in the estimated fetal weight, which was not addressed in the study by Danielian et al. (1992) but apparently resulted in an average difference between estimated and actual birthweight of −76 g (−2.2%). The choice of fetal weight standard itself can also lead to considerable variation. Danielian and his coworkers used a weight formula by Hadlock et al. (1985) which is based on four parameters (HC, BPD, AC, FL); however, to calculate fetal weight centiles they then chose a reference standard by Jeanty et al. (1984) which is based on an altogether different weight formula (Shepard et al. 1982), derived from two parameters (BPD, AC), and has a different normal range. To illustrate the potential error due to this variable alone, the median estimated fetal weight at 28 weeks is 1284 g according to Jeanty et al. (1984) and Shepard et al. (1982), whereas the median is 1210 g by Hadlock's fetal weight standard (Hadlock et al. 1991) using his own fetal weight formula (Hadlock et al. 1985), a difference of 6.1%. According to fetal weight tables which Dr Chang himself helped to construct (Gallivan et al. 1993) and which are also based on Hadlock's weight formula, the median 28 week weight is 1256 g, which is still 2.2% lower than Jeanty's, whereas at 34 weeks it is 2445 g vs 2369 g (i.e., 3.2% higher). The outer limits for fetal weight show even more variation: Jeanty's 5th centile at 28 weeks is 957 g and Gallivan's 1058 g (+10.6%); at 34 weeks, the difference rises to 1750 g vs 2096 g (+19.8%).
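The size of the discrepancy between reference standards follows directly from the quoted weights. The short sketch below recomputes two of the comparisons between the standards of Jeanty and Gallivan, using only values given in the paragraph above and taking Jeanty's value as the reference in each case. The helper name `pct_diff` is hypothetical.

```python
def pct_diff(reference_g, other_g):
    """Difference of other_g from reference_g, as a percentage of reference_g."""
    return (other_g - reference_g) / reference_g * 100.0


# (label, Jeanty value in g, Gallivan value in g), as quoted in the letter.
comparisons = [
    ("5th centile at 28 weeks", 957, 1058),
    ("median at 34 weeks", 2369, 2445),
]

for label, jeanty_g, gallivan_g in comparisons:
    diff = pct_diff(jeanty_g, gallivan_g)
    print(f"{label}: Gallivan differs from Jeanty by {diff:+.1f}%")
```

Differences of this order are comparable to, or larger than, the 5% drop that the study sought to detect.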

It is therefore apparent that the methodology adopted in the study by Danielian et al. (1992) was not conducive to detecting a real 5% loss in fetal weight, and it seems unlikely that the clinical differences observed for the two subgroups were due to a failure of fetuses to reach their own growth potential. Yet clinical differences were still claimed, even though all three neonatal outcome measures assessed in this study (Apgar scores, intubation, transfer to neonatal unit) showed no differences. This highlights the potential problems of relying on soft outcomes, such as CTG abnormalities and operative delivery rates, to draw conclusions about fetal growth. Neonatal morphometry was, regrettably, not included but would have been more likely to show which babies were actually growth retarded. The need to define appropriate endpoints in such studies has been raised in previous correspondence about this paper (Owen 1993).

Turning now to the prospect of inferring growth from two consecutive estimated fetal weights, the potential for error must be even greater because either or both weight estimations can err in either direction, resulting in a large number of possible growth velocities. With luck, the systematic error may cancel out. The longer the interval between scans, the smaller the effect of estimated fetal weight error relative to the actual weight increment, but also the longer the delay before abnormal growth is detected. Chang et al. (1993) performed up to eight scans between 26 and 40 weeks but chose the first and last measurements to calculate changes in estimated fetal weight to show an association with neonatal morphometric indices. This resulted in the last scan occurring on average only five days before delivery. For prospective growth assessment it would be more useful to use an earlier second scan to assess growth velocity, but this may result in larger error due to forward projection. In a protocol of two scans at, say, 32 and 36 weeks, any error in growth velocity derived from the two estimated fetal weights would double during the next four weeks up to the estimated date of delivery. Using the definition of Danielian et al. (1992) for IUGR at birth (i.e., weight loss of more than 5%), this would be found after an over-estimation of fetal weight by 2.5% at the first scan alone (chance: Z = 2.5/7.3 = 0.342; 37%), or an under-estimation of fetal weight by 2.5% at the second scan alone (chance: 37%).
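The 37% figure can be reproduced in the same way as the single-scan calculation. The sketch below assumes, as the paragraph above does, a normally distributed error with a standard deviation of 7.3%, scans at 32 and 36 weeks, and a velocity error that doubles when projected over a further four weeks to term. The variable names are illustrative only.

```python
import math


def upper_tail(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumed inputs, taken from the figures quoted in the letter.
sd_error_pct = 7.3          # SD of estimated fetal weight error, in percent
birth_threshold_pct = 5.0   # apparent weight loss at birth defining IUGR
scan_interval_weeks = 4.0   # scans at 32 and 36 weeks
weeks_to_term = 4.0         # from the second scan to the estimated date of delivery

# An error in the measured increment grows in proportion to the projection period.
projection_factor = (scan_interval_weeks + weeks_to_term) / scan_interval_weeks

# Error at a single scan that is enough to produce the threshold at term.
single_scan_error_pct = birth_threshold_pct / projection_factor

z = single_scan_error_pct / sd_error_pct
p_single_scan = upper_tail(z)

print(f"Error needed at one scan: {single_scan_error_pct:.1f}%")
print(f"Z = {z:.3f}; chance of an error at least this large: {p_single_scan:.0%}")
```

This considers an error at one scan in isolation; errors at both scans in opposite directions would combine.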

No current strategies are successful, but researchers who propose routine serial scanning to screen for growth retardation must be aware of the errors inherent in this investigation, both in missing and in overstating IUGR. These are already apparent in the hands of scanning enthusiasts within the setting of research projects. In a population screening programme there are added issues, such as costs, logistics, scan performance in busy clinics and error from inter-observer variation, and a proposal to rely solely on scan-derived assessments of growth velocity would seem even more unrealistic.