International standards for early fetal size and pregnancy dating based on ultrasound measurement of crown–rump length in the first trimester of pregnancy
A. T. Papageorghiou,
Nuffield Department of Obstetrics & Gynaecology and Oxford Maternal & Perinatal Health Institute, Green Templeton College, University of Oxford, Oxford, UK
Correspondence to: Dr A. Papageorghiou, Nuffield Department of Obstetrics & Gynaecology, University of Oxford, Women's Centre, Level 3, John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK (e-mail: firstname.lastname@example.org)
There are no international standards for relating fetal crown–rump length (CRL) to gestational age (GA), and most existing charts have considerable methodological limitations. The INTERGROWTH-21st Project aimed to produce the first international standards for early fetal size and ultrasound dating of pregnancy based on CRL measurement.
Urban areas in eight geographically diverse countries that met strict eligibility criteria were selected for the prospective, population-based recruitment, between 9 + 0 and 13 + 6 weeks' gestation, of healthy well-nourished women with singleton pregnancies at low risk of fetal growth impairment. GA was calculated on the basis of a certain last menstrual period, regular menstrual cycle and lack of hormonal medication or breastfeeding in the preceding 2 months. CRL was measured using strict protocols and quality-control measures. All women were followed up throughout pregnancy until delivery and hospital discharge. Cases of neonatal and fetal death, severe pregnancy complications and congenital abnormalities were excluded from the study.
A total of 4607 women were enrolled in the Fetal Growth Longitudinal Study, one of the three main components of the INTERGROWTH-21st Project, of whom 4321 had a live singleton birth in the absence of severe maternal conditions or congenital abnormalities detected by ultrasound or at birth. The CRL was measured in 56 women at < 9 + 0 weeks' gestation; these were excluded, resulting in 4265 women who contributed data to the final analysis. The mean CRL and SD increased with GA almost linearly, and their relationship to GA is given by the following two equations (in which GA is in days and CRL in mm): mean CRL = −50.6562 + (0.815118 × GA) + (0.00535302 × GA^2); and SD of CRL = −2.21626 + (0.0984894 × GA). GA estimation is carried out according to the two equations: GA = 40.9041 + (3.21585 × CRL^0.5) + (0.348956 × CRL); and SD of GA = 2.39102 + (0.0193474 × CRL).
During pregnancy, accurate estimation of gestational age (GA), at the level of the individual, is essential to interpret fetal anatomy and growth patterns, predict the date of delivery and gauge the maturity of the newborn[1-3]. At a population level, GA estimation is required to determine rates of small-for-gestational-age fetuses and preterm birth accurately in order to allocate resources appropriately[4, 5].
GA has traditionally been calculated from the first day of the last menstrual period (LMP). However, in a proportion of pregnancies, depending on the locality, the LMP is unknown or the information is unreliable[6, 7]. In such cases, GA can be estimated by ultrasound measurement of fetal crown–rump length (CRL) or head circumference at < 14 weeks' and ≥ 14 weeks' gestation, respectively. Between 9 and 13 weeks' gestation, linear growth evaluated by CRL is rapid and the SD is rather small, which means that GA can be estimated accurately. In later pregnancy, head circumference is typically used for dating, as CRL can no longer be measured owing to curling of the growing fetus; however, variation is greater, which results in less accurate estimation of GA. For this reason, first-trimester ultrasound estimation of GA is recommended in clinical practice.
Various studies have been conducted to derive CRL reference charts for the estimation of GA, mostly in single institutions or geographical locations. A review of their methodological quality has shown several limitations including highly heterogeneous study designs and approaches to statistical analysis and reporting. All the studies have been ‘descriptive’, whereas we have consistently argued that ‘prescriptive’ standards should be used in clinical practice, reflecting how fetuses should grow rather than how they have grown in a given place and time. This could be achieved by first selecting pregnant women at low risk for fetal growth impairment, living in environments with minimal exposure to factors that have an adverse effect on growth. From such populations, women at low risk of adverse pregnancy outcomes who deliver healthy newborns without congenital malformations would then be identified[11-13].
Our aim in this study was to generate CRL data according to GA using an optimal study design and prescriptive approach in order to develop international, population-based standards for early fetal linear size estimation and ultrasound dating of pregnancy in the first trimester that can be used throughout the world.
INTERGROWTH-21st is a multicenter, multiethnic, population-based project, conducted between 2009 and 2014 in eight urban areas in eight different countries: the cities of Pelotas, Brazil; Turin, Italy; Muscat, Oman; Oxford, UK; Seattle, USA; Shunyi County, Beijing, China; the central area of the city of Nagpur (Central Nagpur), Maharashtra, India; and the Parklands suburb of Nairobi, Kenya. Its primary aim was to study growth, health, nutrition and neurodevelopment of fetuses from < 14 + 0 weeks' gestation to 2 years of age, using the same conceptual framework as the World Health Organization (WHO) Multicentre Growth Reference Study, in order to produce prescriptive growth standards to complement the existing WHO Child Growth Standards.
These urban areas had to be located at low altitude (≤ 1600 m), and women receiving antenatal care had to plan to deliver in these institutions or in a similar hospital located in the same geographical area. There also had to be an absence or low levels of major, known, non-microbiological contamination such as pollution, domestic smoke, radiation or any other toxic substances; this was evaluated during the study period at the cluster level using a data collection form specifically developed for the project. In the eight urban areas, we selected all institutions providing pregnancy and intrapartum care, in which > 80% of deliveries occurred.
To generate the CRL data for our stated aims, women with a singleton pregnancy that was conceived naturally were asked to participate in the Fetal Growth Longitudinal Study (FGLS), one of the three main components of the INTERGROWTH-21st Project, whose study methods have been described in detail elsewhere. Briefly, we recruited women from the selected populations with no clinically relevant obstetric or gynecological history, who met the entry criteria of optimal health, nutrition, education and socioeconomic status to create a group of affluent, clinically healthy women who were at low risk of intrauterine growth restriction and preterm birth. Recruitment occurred prospectively and consecutively at 9 + 0 to 13 + 6 weeks' gestation as estimated by LMP provided that: (1) the date was certain; (2) the agreement between LMP and CRL dating was ≤ 7 days; (3) the women had a regular 24–32-day menstrual cycle; and (4) they had not been using hormonal contraception or breastfeeding in the preceding 2 months. The women, who were all well-educated and living in urban areas, reported the date and certainty of their LMP at their first antenatal clinic visit in response to specific questions.
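The four dating-related recruitment conditions listed above can be expressed as a simple check. The following Python sketch is illustrative only: the `Candidate` record, its field names and the function names are hypothetical and do not come from the study's data collection forms, and the CRL-based GA is taken as an input without assuming which dating formula produced it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    """Hypothetical record; field names are illustrative only."""
    lmp: date                      # first day of the last menstrual period
    lmp_certain: bool              # the woman is certain of the date
    cycle_length_days: int         # usual menstrual cycle length
    hormonal_or_breastfeeding: bool  # hormonal contraception or breastfeeding in preceding 2 months
    ga_by_crl_days: int            # GA from the CRL scan, however it was derived


def ga_by_lmp_days(lmp: date, scan_date: date) -> int:
    """Gestational age in days, counted from the first day of the LMP."""
    return (scan_date - lmp).days


def meets_dating_criteria(c: Candidate, scan_date: date) -> bool:
    """Apply the recruitment window and criteria (1) to (4) from the text."""
    ga = ga_by_lmp_days(c.lmp, scan_date)
    in_window = 9 * 7 <= ga <= 13 * 7 + 6        # 9 + 0 to 13 + 6 weeks
    return (
        in_window
        and c.lmp_certain                         # (1) date is certain
        and abs(ga - c.ga_by_crl_days) <= 7       # (2) LMP and CRL dating agree within 7 days
        and 24 <= c.cycle_length_days <= 32       # (3) regular 24-32-day cycle
        and not c.hormonal_or_breastfeeding       # (4) no hormonal contraception or breastfeeding
    )
```

For example, a woman scanned 80 days after a certain LMP, with a 28-day cycle and a CRL-based GA of 80 days, would pass this check, whereas the same woman with a CRL-based GA of 70 days would not.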
A single type of ultrasound machine (Philips HD-9; Philips Ultrasound, Bothell, WA, USA) with an abdominal probe was the machine of choice to measure CRL. However, as the first contact with the study often occurred at several different clinics in the geographical area, it was considered acceptable to use other, locally available, machines for the CRL measurement at the first antenatal visit only, provided that they were evaluated and approved by the study team. All 39 ultrasonographers at the eight study sites underwent rigorous training and standardization specifically for CRL measurement. In accordance with the study's quality-control protocol, they also submitted images of the CRL measurements, which were reviewed blindly by our collaborators at the Société Française pour l'Amélioration des Pratiques Echographiques. The ultrasonographers were only certified to measure CRL in the study if they demonstrated adequate knowledge of the study protocol and the quality of the images submitted for review was satisfactory.
CRL was measured once using strict techniques and imaging criteria. A discrepancy between GA based on LMP and that derived from CRL of more than 7 days was a reason to exclude the woman from the study. All women were then followed to delivery with standardized antenatal care evaluation and regular ultrasound scans every 5 ± 1 weeks.
The INTERGROWTH-21st Project was approved by the Oxfordshire Research Ethics Committee ‘C’ (ref: 08/H0606/139) and the research ethics committees of the individual participating institutions, as well as the corresponding regional health authorities in which the project was implemented.
The sample size was based principally on the precision and accuracy of a single centile and regression-based reference limits[19, 20]. We have shown that with a sample of 4000, we would obtain a precision of 0.03 SD at the 3rd or 97th centile. Further details on the precision obtained at the 5th or 10th centile by sample size (ranging from 500 to 6000) are provided in a previous publication. We determined a mean target sample of 500 women per site, after excluding complicated pregnancies and those lost to follow-up. We expected that, overall, approximately 3% would be lost to follow-up, and that another 3% would be excluded (using criteria decided a priori) from the study population because of fetal/neonatal losses and congenital abnormalities. We also excluded mothers diagnosed with catastrophic or very severe medical conditions, those with severe unanticipated pregnancy-related conditions requiring hospital admission and those identified during pregnancy who no longer fulfilled all the entry criteria.
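The quoted precision can be checked approximately with a textbook large-sample result for normally distributed data: the standard error of a centile estimated as mean + Z × SD is roughly SD × sqrt((1 + Z^2/2)/n). The Python sketch below uses that approximation as an assumption; it is not necessarily the calculation used in the cited publications, and the function name is hypothetical.

```python
import math


def centile_se_in_sd_units(z: float, n: int) -> float:
    """Approximate standard error, in SD units, of the centile mean + z * SD
    estimated from n independent, normally distributed observations.

    Assumes the large-sample approximation Var = SD^2 * (1 + z^2 / 2) / n.
    """
    return math.sqrt((1 + z ** 2 / 2) / n)


# 3rd or 97th centile (|Z| = 1.88) with a sample of 4000
se_3rd = centile_se_in_sd_units(1.88, 4000)
```

Under this approximation, `se_3rd` is about 0.026 SD, which rounds to the 0.03 SD stated above.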
The statistical methods used are described in detail elsewhere. Briefly, data were first explored visually by a scatter plot of CRL against GA and vice versa. The relationship between GA and CRL is non-linear, although the distribution of CRL is conditionally normal at any given GA. We applied fractional polynomial models to the data by fitting separate models to the mean and SD of GA to account for increases in variance with greater CRL and gestation[23, 24]. Using equations of the mean and SD one can easily compute any desired centile using the relationship:
Centile = mean + (Z × SD)
where Z is the normal equivalent deviate (Z-score) corresponding to a particular centile, e.g. Z = –1.88, –1.645, –1.28, 0, +1.28, +1.645 and +1.88 for the 3rd, 5th, 10th, 50th, 90th, 95th and 97th centiles, respectively; the SDs in this equation are the predicted estimates from the regression analysis.
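The centile relationship described here (a centile is the predicted mean plus Z times the predicted SD) can be sketched in Python as follows. The Z-scores are those listed in the text; the function and dictionary names are hypothetical.

```python
# Z-scores for the centiles listed in the text.
Z_BY_CENTILE = {
    3: -1.88, 5: -1.645, 10: -1.28, 50: 0.0,
    90: 1.28, 95: 1.645, 97: 1.88,
}


def centile_value(mean: float, sd: float, centile: int) -> float:
    """Return mean + Z * SD for one of the tabulated centiles.

    `mean` and `sd` are the predicted values from the regression equations
    at the gestational age (or CRL) of interest.
    """
    return mean + Z_BY_CENTILE[centile] * sd
```

For instance, with a predicted mean of 50 and a predicted SD of 5, the 3rd and 97th centiles would be 50 − 1.88 × 5 = 40.6 and 50 + 1.88 × 5 = 59.4.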
To overcome the effect of data truncation at the limits of recruitment at 9 + 0 and 13 + 6 weeks' gestation, we explored three alternative statistical approaches. Truncation occurs when data are constrained by a restricted range of GA; such a restriction is commonly put in place for recruitment reasons, but also because fetal curling prevents accurate measurement beyond 13 + 6 weeks. In our analysis, all three statistical approaches gave very similar results, and we opted for the one (simulation for small and large CRL) that had the best fit at both the upper and lower limits of GA.
Fitted curves (3rd, 50th and 97th centiles) from different models were assessed visually for a good fit and by comparing the deviances from each model. Goodness of fit was assessed by a scatter plot of the distribution of residuals in Z-scores by CRL and also by counting the number of observations below the 3rd and above the 97th centiles. Assessment of increasing variability with gestation, and smooth changes of both mean and SD across GA, were undertaken as part of the fractional polynomial approach.
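The residual-based checks described above can be illustrated with a short Python sketch. It assumes that each observation comes with the mean and SD predicted by the fitted model at that gestational age; the function names are hypothetical and this is not the authors' code.

```python
from typing import Iterable, Tuple


def z_score(observed: float, mean: float, sd: float) -> float:
    """Residual expressed as a Z-score relative to the fitted mean and SD."""
    return (observed - mean) / sd


def tail_fractions(z_scores: Iterable[float], z_cut: float = 1.88) -> Tuple[float, float]:
    """Fractions of observations below the 3rd and above the 97th centile
    (|Z| > 1.88, the cut-off quoted in the text)."""
    zs = list(z_scores)
    below = sum(z < -z_cut for z in zs) / len(zs)
    above = sum(z > z_cut for z in zs) / len(zs)
    return below, above
```

If the model fits well, each fraction should be close to 0.03, and the mean Z-score within each gestational week should be close to zero.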
Of the 13 108 pregnant women screened between May 2009 and July 2013 at the eight study sites, 4607 (35%) met the clinical eligibility criteria and were enrolled in the study. All the women were closely followed up throughout pregnancy by the study team until delivery and discharge from hospital. A total of 4321 women had live singleton births in the absence of severe maternal conditions or congenital abnormalities detected by ultrasound or at birth. The sample size per country ranged from 311 in the USA to 640 in the UK. The overall maternal and pregnancy outcome characteristics are shown in Table 1. The CRL was measured in 56 women at < 9 + 0 weeks' gestation; these were excluded, resulting in 4265 women who contributed data to the final analysis (Figure 1).
Table 1. Maternal and pregnancy characteristics of the 4265 women enrolled in the Fetal Growth Longitudinal Study of the INTERGROWTH-21st Project who had crown–rump length measured between 9 + 0 and 13 + 6 weeks' gestation according to last menstrual period and a live singleton birth, in the absence of severe maternal conditions or congenital abnormalities detected by ultrasound or at birth
Data are given as mean ± SD or n (%).
≥ 37 + 0 weeks of gestation only. NICU, neonatal intensive care unit.
| Characteristic | Value |
| --- | --- |
| Maternal age (years) | 28.3 ± 3.9 |
| Maternal height (cm) | 162.2 ± 5.8 |
| Paternal height (cm) | 174.4 ± 7.3 |
| Maternal weight (kg) | 61.2 ± 9.1 |
| Maternal body mass index (kg/m²) | 23.2 ± 3.0 |
| Gestational age at first visit (weeks) | 11.8 ± 1.4 |
| Formal education (years) | 15.0 ± 2.8 |
| Hemoglobin level before 15 weeks' gestation (g/dL) | |
As we have reported elsewhere, evaluation of the similarities in CRL across the eight populations was performed using variance component analysis, standardized site difference and sensitivity analysis. All three analytical strategies demonstrated that the populations were similar enough to justify pooling the data.
Mean fetal size and SD increased with GA (Table 2, Table S1, Figure 2). Their relationship to GA can be defined between 58 and 105 days' gestation by the two equations below, in which GA is expressed in days and CRL in mm:
Mean CRL = −50.6562 + (0.815118 × GA) + (0.00535302 × GA^2)
SD of CRL = −2.21626 + (0.0984894 × GA)
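The two size equations reported for this study (mean CRL and SD of CRL as functions of GA in days, with coefficients as given in the abstract) can be sketched in Python as below. The function names are hypothetical, and raising an error outside 58 to 105 days is an illustrative choice for handling the stated range of validity.

```python
def mean_crl_mm(ga_days: float) -> float:
    """Mean CRL (mm) at a given gestational age in days."""
    return -50.6562 + 0.815118 * ga_days + 0.00535302 * ga_days ** 2


def sd_crl_mm(ga_days: float) -> float:
    """SD of CRL (mm) at a given gestational age in days."""
    return -2.21626 + 0.0984894 * ga_days


def crl_centile_mm(ga_days: float, z: float) -> float:
    """CRL (mm) at the centile with Z-score z, as mean + z * SD."""
    if not 58 <= ga_days <= 105:
        raise ValueError("equations are defined for 58 to 105 days' gestation")
    return mean_crl_mm(ga_days) + z * sd_crl_mm(ga_days)
```

At 84 days (12 + 0 weeks), these equations give a mean CRL of about 55.6 mm and an SD of about 6.1 mm; `crl_centile_mm(84, -1.88)` and `crl_centile_mm(84, 1.88)` then give the 3rd and 97th centiles.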
Table 2. Sample size and crown–rump length (CRL) according to gestational week
| Gestational age (weeks) | CRL (mm) (mean ± SD) |
| --- | --- |
| 9 + 0 to 9 + 6 | 27.47 ± 4.83 |
| 10 + 0 to 10 + 6 | 36.23 ± 6.10 |
| 11 + 0 to 11 + 6 | 49.39 ± 6.62 |
| 12 + 0 to 12 + 6 | 60.78 ± 7.07 |
| 13 + 0 to 13 + 6 | 72.53 ± 7.29 |
The data were then used to create a dating equation to allow GA estimation (as a dependent variable) in all women by measuring CRL (as an independent variable) (Figure 3, Table 3). The relationship can be defined when CRL is between 15 and 95 mm by the two equations below, in which CRL is expressed in mm and GA in days:
GA = 40.9041 + (3.21585 × CRL^0.5) + (0.348956 × CRL)
SD of GA = 2.39102 + (0.0193474 × CRL)
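As an illustration, the dating equations reported for this study (GA and SD of GA as functions of CRL in mm, with coefficients as given in the abstract) can be sketched in Python. The function names are hypothetical, and rounding to the nearest whole day before converting to weeks + days is an assumption made here for display purposes only.

```python
import math


def ga_days_from_crl(crl_mm: float) -> float:
    """Estimated gestational age (days) from CRL (mm)."""
    if not 15 <= crl_mm <= 95:
        raise ValueError("dating equation is defined for CRL of 15 to 95 mm")
    return 40.9041 + 3.21585 * math.sqrt(crl_mm) + 0.348956 * crl_mm


def sd_ga_days(crl_mm: float) -> float:
    """SD of the GA estimate (days) at a given CRL (mm)."""
    return 2.39102 + 0.0193474 * crl_mm


def format_weeks_days(ga_days: float) -> str:
    """Format GA as 'weeks + days' after rounding to the nearest day."""
    weeks, days = divmod(round(ga_days), 7)
    return f"{weeks} + {days}"
```

For a CRL of 64 mm, the estimated GA is about 89.0 days (12 + 5 weeks) with an SD of about 3.6 days, so the estimate can be reported together with its measure of variability.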
Table 3. Chart for pregnancy dating based on measurements of crown–rump length (CRL) in 4265 normal pregnancies
For the goodness-of-fit analysis, mean residuals by week of gestation expressed as Z-scores did not show any obvious pattern (–0.12, 0.00, –0.05, –0.06, 0.03 and 0.14 at 9, 10, 11, 12, 13 and 14 weeks' gestation, respectively).
We studied a large, international cohort of women from eight diverse geographical locations worldwide, with minimal constraints on fetal growth at both population and individual level (i.e. a prescriptive approach to growth evaluation), in order to construct standards for CRL and the corresponding GA estimation in the first trimester of pregnancy. These populations were judged to be similar enough for the data to be pooled into a single cohort. This is the first time that an international, early fetal linear size standard and equation for GA estimation have been produced. When fully implemented they will allow for uniform early pregnancy evaluation at all levels of healthcare across the world. Using the same standard to identify abnormal conditions early in pregnancy or make diagnoses is routine practice in most areas of medicine and is long overdue in obstetric care.
Our study has a number of important methodological and conceptual strengths. Firstly, we included a diverse range of geographical locations and populations from different ethnic backgrounds around the world to make the findings as generalizable as possible. This is of special relevance today given the extent of multi-ethnic populations and children of mixed parents. Secondly, unified protocols were used for recruitment, clinical care until hospital discharge and data collection, and rigorous quality-control processes were employed. Thirdly, the study was purposely prospective and population-based, and only included singleton pregnancies that were conceived naturally with a known LMP. Fourthly, only healthy women sampled from preselected, geographically defined populations with low adverse perinatal outcome rates were selected. Lastly, all participants were studied to the end of pregnancy, but women were excluded if fetal/neonatal deaths, severe pregnancy complications or congenital abnormalities occurred. This cohort of women, therefore, had the greatest potential for achieving optimal fetal growth.
The approach has allowed us to create an international prescriptive standard for early fetal growth. This is crucial for estimating GA because it is based on the assumption that the CRL values are from healthy fetuses that remained so for the remainder of the pregnancy. We based our strategy and rationale on the knowledge gained from our recent systematic review of existing charts for GA estimation, which showed that the overall quality of study design, statistical analysis and reporting was less than optimal. Only eight of the 29 previous studies identified and enrolled unselected or low-risk pregnancies, and while almost all the studies reported using some of the FGLS inclusion/exclusion criteria, no study used all of them. A comprehensive strategy for ultrasound quality control was not employed in any of the 29 studies. Many studies also used retrospective analysis of large databases of routinely collected clinical data. Such retrospective studies are at high risk of bias, as the quality of the recorded data is variable and the ability to perform prospective ultrasound quality assurance is compromised. In contrast, clinical application of our standard globally will allow fetal size centiles to be plotted uniformly, making comparisons of fetal size and GA across populations easier to interpret.
Furthermore, we compared our GA equation from the pooled populations with those of the two studies selected during the systematic review as having the lowest risk of methodological bias and that were conducted in populations with adequate medical care and nutritional conditions in developed countries, making them potentially eligible for the INTERGROWTH-21st Project[26, 27]. Interestingly, and reassuringly for the global introduction into clinical practice of our new international standards, the differences in GA estimation based on CRL, between these studies and ours, were small and unlikely to result in important clinical differences.
The first of these studies, carried out in 1973 in Scotland, was an analysis of 214 CRL measurements in 80 patients; the second was a population-based study in The Netherlands between 2002 and 2006 with 2079 individual CRL measurements. Compared with both studies, the difference in GA estimation was within ± 1 day of gestation, except for CRL > 80 mm, where the difference between the INTERGROWTH-21st equation and that of Verburg et al. approached 2 days and then exceeded it at a CRL of > 85 mm. These striking similarities suggest that early linear fetal growth, evaluated by CRL measurement, is uniform both over time and among different ethnic populations once they have reached an adequate level of health, nutrition and socioeconomic condition, reinforcing the appropriateness of using international standards.
A potential limitation of our study was the use of multiple ultrasonographers, as it has previously been argued that reference studies should be performed by a single operator in order to reduce interobserver error. In our opinion, this is not appropriate: it produces small studies concentrated in a single practice; devalues the contribution of international, multicenter studies; reduces external validity; and fails to recognize that clinical services are delivered in most institutions by many members of staff. Rather, studies should account for the variability introduced by ultrasonographers by taking steps to improve the quality and consistency of measurements through standardization, audit and quality control of all aspects of ultrasonography[16, 18, 28, 29].
A disadvantage of GA estimation based purely on the ultrasound measurement of fetal anatomical parameters is that all biological variation in GA for a given value of CRL disappears – an assumption that is, of course, biologically implausible. This is not a problem peculiar to ultrasound but also occurs with any other biological parameter being predicted by a single measurement. We therefore suggest that all information collected at the time of the first antenatal visit (including the reported LMP and assessment of its reliability) should be taken into account when estimating GA or assessing fetal growth during future antenatal visits[30]. When a reliable LMP and ultrasound estimate concur, small discrepancies in GA may mask inherent CRL measurement error. Conversely, an apparently reliable and accurate LMP with a substantial difference in estimated GA based on CRL should be considered as an indicator of possible growth disturbance or underlying pathology that needs to be monitored and corroborated[31, 32]. Finally, it is important to emphasize that all estimates of GA should be explained and given to women with the corresponding measure of variability, e.g. SD or centiles, to provide a measure of the error of the estimation.
In short, we have presented, building on the experience of decades of ultrasound work conducted by others, international standards for evaluating fetal linear size in the first trimester and a corresponding new equation for the estimation of GA from CRL that can be used across countries and populations. The new GA estimations are in close agreement with studies with a low risk of methodological bias conducted in populations from developed countries, suggesting that when high methodological standards are met and populations adequately selected, early fetal growth is similar across populations. The adoption of these standards, through their introduction via ultrasound machines and fetal database systems, will standardize the evaluation of fetal growth across levels of care and facilitate comparisons internationally.
This project was supported by a generous grant (no. 49038) from the Bill & Melinda Gates Foundation to the University of Oxford, for which we are very grateful. We would also like to thank the Health Authorities in Pelotas, Brazil; Beijing, China; Nagpur, India; Turin, Italy; Nairobi, Kenya; Muscat, Oman; Oxford, UK and Seattle, USA, who facilitated the project by allowing participation of these study sites as collaborating centers. We are extremely grateful to Philips Healthcare for providing the ultrasound equipment and technical assistance throughout the project. We also thank MedSciNet UK Ltd for setting up the INTERGROWTH-21st website and for the development, maintenance and support of the online data management system.
We thank the parents and infants who participated in the studies and the more than 200 members of the research teams who made the implementation of this project possible. The participating hospitals included: Brazil, Pelotas (Hospital Miguel Piltcher, Hospital São Francisco de Paula, Santa Casa de Misericórdia de Pelotas, and Hospital Escola da Universidade Federal de Pelotas); China, Beijing (Beijing Obstetrics & Gynecology Hospital, Shunyi Maternal & Child Health Centre, and Shunyi General Hospital); India, Nagpur (Ketkar Hospital, Avanti Institute of Cardiology Private Limited, Avantika Hospital, Gurukrupa Maternity Hospital, Mulik Hospital & Research Centre, Nandlok Hospital, Om Women's Hospital, Renuka Hospital & Maternity Home, Saboo Hospital, Brajmonhan Taori Memorial Hospital, and Somani Nursing Home); Kenya, Nairobi (Aga Khan University Hospital, MP Shah Hospital and Avenue Hospital); Italy, Turin (Ospedale Infantile Regina Margherita Sant' Anna and Azienda Ospedaliera Ordine Mauriziano); Oman, Muscat (Khoula Hospital, Royal Hospital, Wattayah Obstetrics & Gynaecology Poly Clinic, Wattayah Health Centre, Ruwi Health Centre, Al-Ghoubra Health Centre and Al-Khuwair Health Centre); UK, Oxford (John Radcliffe Hospital) and USA, Seattle (University of Washington Hospital, Swedish Hospital, and Providence Everett Hospital).
Members of INTERGROWTH-21st and its committees are listed in Appendix S1. Full acknowledgment of all those who contributed to the development of the INTERGROWTH-21st Project protocol appears at www.intergrowth21.org.uk.
SUPPORTING INFORMATION ON THE INTERNET
The following supporting information may be found in the online version of this article:
Appendix S1 Members of the International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st) and its committees
Table S1 Fetal crown–rump length chart based on gestational age according to last menstrual period