Current Down syndrome screening strategies are based on multiple ultrasound and serum markers. The single most discriminatory marker is first-trimester ultrasound nuchal translucency (NT) measurement. However, to achieve good performance, operators need to follow a standard NT protocol, such as that promoted by The Fetal Medicine Foundation. This includes carrying out repeat measurements and using the largest value for risk calculation. In a study in this issue of the journal, Salomon et al. investigated circumstances in which just one NT measurement would be sufficient.
It is a statistical fact that routinely repeating any measurement of a marker and taking the average (or the largest) value will reduce variability and hence improve discriminatory power. Indeed, it is standard practice in biochemistry to test analytes in duplicate, unless costs are very high and intra-assay variability is known to be extremely low. The reason given in this study for using a single NT measurement in selected cases is different: the measurement may be sufficiently small that repeating it is unlikely to alter the estimated Down syndrome risk.
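The variance argument can be illustrated with a minimal simulation. It assumes Gaussian measurement error around a fixed true value; the true value of 1.5 mm, the error SD of 0.15 mm and the choice of three repeats are hypothetical figures chosen for illustration, not estimates from the study.

```python
import random
import statistics

def simulate_sd(n_repeats, combine, true_value=1.5, sd=0.15,
                n_trials=20000, seed=0):
    """SD, across simulated examinations, of the value reported when
    n_repeats noisy readings are combined by `combine` (mean or max).
    true_value and sd are hypothetical, in mm."""
    rng = random.Random(seed)
    reported = []
    for _ in range(n_trials):
        readings = [rng.gauss(true_value, sd) for _ in range(n_repeats)]
        reported.append(combine(readings))
    return statistics.stdev(reported)

sd_single = simulate_sd(1, statistics.mean)  # one reading only
sd_mean3 = simulate_sd(3, statistics.mean)   # average of three readings
sd_max3 = simulate_sd(3, max)                # largest of three readings

print(f"single: {sd_single:.3f}  mean of 3: {sd_mean3:.3f}  max of 3: {sd_max3:.3f}")
```

Under these assumptions both combined values vary less than a single reading, the average more so than the largest. Taking the largest also shifts the reported value upward, so reference distributions need to be derived under the same protocol.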
The distribution of NT in Down syndrome and unaffected pregnancies follows a log Gaussian model over most of the range, although there are limits beyond which the model deviates, and it is customary to ‘truncate’ risk for extreme NT levels to the nearest limit. Most risk calculation software further raises the bottom limit to avoid an apparently paradoxical situation in which lower NT is associated with a higher risk. Whilst this phenomenon may have a biological explanation, it contradicts common sense and can undermine confidence in calculated risks.
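The apparent paradox follows directly from the log Gaussian model when the spread in affected pregnancies exceeds that in unaffected pregnancies: the likelihood ratio then has a minimum and rises again at lower NT. The sketch below illustrates this and the effect of truncation. All means, standard deviations and the upper limit are hypothetical values on the log10 MoM scale, not parameters of any published risk calculator, and the function names are introduced here for illustration only.

```python
import math

# Hypothetical parameters (log10 MoM scale), for illustration only.
MEAN_UNAFF, SD_UNAFF = 0.0, 0.12
MEAN_DS, SD_DS = 0.30, 0.23

def gaussian_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def likelihood_ratio(mom):
    """Untruncated likelihood ratio (affected : unaffected) for an NT MoM."""
    x = math.log10(mom)
    return gaussian_pdf(x, MEAN_DS, SD_DS) / gaussian_pdf(x, MEAN_UNAFF, SD_UNAFF)

def turning_point_mom():
    """MoM at which the untruncated likelihood ratio is lowest.
    Obtained by setting the derivative of the log likelihood ratio to zero;
    it is a minimum when SD_DS > SD_UNAFF."""
    a = 1 / SD_UNAFF ** 2
    b = 1 / SD_DS ** 2
    x = (MEAN_UNAFF * a - MEAN_DS * b) / (a - b)
    return 10 ** x

def truncated_likelihood_ratio(mom, lower, upper):
    """Likelihood ratio after clamping the MoM to the truncation limits."""
    return likelihood_ratio(min(max(mom, lower), upper))

lower_limit = turning_point_mom()  # raise the bottom limit to the turning point
upper_limit = 2.5                  # hypothetical upper truncation limit

for mom in (0.5, 0.6, lower_limit, 1.0, 2.0):
    print(f"MoM {mom:.2f}: untruncated {likelihood_ratio(mom):.3f}, "
          f"truncated {truncated_likelihood_ratio(mom, lower_limit, upper_limit):.3f}")
```

With these illustrative parameters the untruncated likelihood ratio at 0.5 MoM exceeds that at the turning point, whereas the truncated version holds the ratio constant below the bottom limit.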
Thus, there will be some low NT values which are unlikely to increase sufficiently on repeat measurement to reach the bottom truncation limit and change the risk. The authors investigate this using simulated results based on estimates of intraoperator variability from the literature together with a published risk calculator. This yields a table of crown–rump length (CRL)-specific NT cut-off values, and applying them to a series of 165 actual scans shows that in about one-quarter of these cases, repeat measurement could have been avoided.
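The reasoning behind such cut-offs can be sketched as a simple Monte Carlo calculation. This is not the authors' simulation: it assumes Gaussian intraoperator error with a constant SD, treats the first measurement as the only information about the true NT, and uses a hypothetical bottom truncation limit of 1.2 mm and a hypothetical SD of 0.1 mm.

```python
import random

LOWER_LIMIT_MM = 1.2  # hypothetical bottom truncation limit for a given CRL
INTRA_SD_MM = 0.1     # hypothetical intraoperator SD, assumed constant

def prob_repeat_reaches_limit(first_nt_mm, lower_limit_mm, intra_sd_mm,
                              n_repeats=2, n_sims=50000, seed=1):
    """Estimate the probability that the largest of n_repeats further
    measurements reaches the bottom truncation limit, given the first one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Plausible true NT given the first reading (flat prior assumed).
        true_nt = rng.gauss(first_nt_mm, intra_sd_mm)
        # Further readings scatter around that true value.
        repeats = [rng.gauss(true_nt, intra_sd_mm) for _ in range(n_repeats)]
        if max(repeats) >= lower_limit_mm:
            hits += 1
    return hits / n_sims

p_low = prob_repeat_reaches_limit(0.80, LOWER_LIMIT_MM, INTRA_SD_MM)
p_near = prob_repeat_reaches_limit(1.15, LOWER_LIMIT_MM, INTRA_SD_MM)
print(f"first NT 0.80 mm: {p_low:.4f}   first NT 1.15 mm: {p_near:.4f}")
```

A cut-off could then be defined as the largest first measurement for which this probability stays below a chosen tolerance. Under these assumptions a first value well below the limit almost never reaches it on repeat, while one just below the limit often does.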
Two weaknesses are acknowledged by the authors, one concerning the published intraoperator variability, which was assumed to be unrelated to the NT level, and the other concerning the confirmatory series, which was retrospective and from a single center. They call for a prospective multioperator trial. I concur, since both average NT and variability can differ markedly between operators, profoundly affecting risks. Operator-specific parameters overcome these differences, but most risk calculators, including the one used in this study, use overall parameters.
While adoption of this approach may shorten the average NT scanning time, it could have a negative consequence unforeseen by the authors. External quality assessment of NT, apart from involving review of images, relies on epidemiological performance characteristics, such as the median multiple of the median (MoM), the standard deviation of log MoM and the slope against CRL, all three of which will be distorted if a large proportion of measurements are ‘censored’.
Finally, in the coming era of cell-free DNA testing, screening strategies are likely to widen the range of disorders sought by traditional markers, marginalizing the current focus on Down syndrome risk calculation.