This was an observational cross-sectional study. The data and pregnancy outcomes were collected from Leto Maternity Hospital and from Attikon University Hospital between January 2009 and July 2010.

Risk assessment for chromosomal abnormality by a combination of ultrasound markers (fetal nuchal translucency thickness (NT) and nasal bone assessment) and biochemical measurements (free beta-human chorionic gonadotropin (β-hCG) and pregnancy-associated plasma protein-A (PAPP-A)) was carried out at 11 to 13 + 6 weeks of gestation at both hospitals, using the same protocol. Maternal blood was drawn at the time of the ultrasound examination and analyzed either simultaneously or within 24 h. Maternal demographic characteristics (weight, height, parity and smoking status), method of conception (spontaneous or assisted, including ovulation induction and *in-vitro* fertilization) and ultrasound parameters (crown–rump length (CRL) and NT) were recorded in a computer database (Astraia software; Astraia GmbH, Munich, Germany). Gestational age in days was defined by last menstrual period (LMP). In cases with uncertain dates or if the difference between the LMP- and CRL-derived dates was 7 days or more, the gestational age was corrected according to CRL. Women were considered to be parous if they had had a previous delivery at or beyond 24 weeks.
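The dating rule described above can be sketched as a small function. This is a minimal illustration, not the study's software: the function and parameter names are hypothetical, and the CRL-derived gestational age is assumed to be supplied by the caller, since the text does not state which CRL dating formula was used.

```python
def gestational_age_days(lmp_days, crl_days, lmp_certain=True, threshold_days=7):
    """Return the gestational age (days) to use for analysis.

    lmp_days: age derived from the last menstrual period, or None if unknown.
    crl_days: age derived from crown-rump length (dating formula not specified
              in the text; assumed to be computed elsewhere).
    """
    # Uncertain or missing dates: fall back to the CRL-derived age.
    if lmp_days is None or not lmp_certain:
        return crl_days
    # A discrepancy of 7 days or more: correct according to CRL.
    if abs(lmp_days - crl_days) >= threshold_days:
        return crl_days
    return lmp_days
```

For example, an LMP-derived age of 88 days with a CRL-derived age of 84 days stays at 88 days, whereas 91 versus 84 days is corrected to 84 days.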

Third-trimester ultrasound examination was performed routinely at 30–34 weeks and included measurements of biparietal diameter (BPD), head circumference (HC, calculated from BPD and occipitofrontal diameter measurements), abdominal circumference (AC, calculated from anteroposterior and transverse abdominal diameter measurements) and femur length (FL). Color Doppler was used to assess the umbilical artery (UA) and the fetal middle cerebral artery (MCA) and to measure their pulsatility indices (PI). Fetal Doppler studies were performed in the absence of fetal body and breathing movements over three consecutive cardiac cycles.

The study sample consisted of viable, singleton pregnancies with known outcomes that were delivered beyond 24 weeks. Exclusion criteria were hypertensive disorders of pregnancy, gestational diabetes mellitus and pre-eclampsia. Women with either a history of a previous pregnancy complicated by these conditions or a medical history of hypertension or diabetes mellitus (Type I or Type II) were also excluded. Finally, pregnancies with chromosomal abnormalities and/or structural defects, pregnancies resulting in miscarriage or intrauterine death and pregnancies diagnosed with severe early-onset growth restriction prior to the routine third-trimester scan were not considered in the analysis.

All ultrasound scans were performed by sonographers certified by The Fetal Medicine Foundation (FMF). The blood samples for PAPP-A and free β-hCG were analyzed by the Kryptor system (Brahms, Berlin, Germany). Biochemical analyte values were adjusted for maternal weight, method of conception, gestational age, cigarette smoking and race according to FMF algorithms and the results were converted to multiples of the median (MoM). The ethics committees of the two institutes approved the use of these data for analysis.

### Data management and statistical analysis

Descriptive exploratory data analysis was used and normality of the distributions was tested by the Kolmogorov–Smirnov test. The distributions of skewed parameters were made Gaussian by logarithmic transformation. Comparisons of normally distributed parameters between appropriate-for-gestational-age (AGA) and small-for-gestational-age (SGA) groups were made by unpaired *t*-test, whereas the Mann–Whitney *U*-test was applied for non-normally distributed continuous variables. When the examined variable was normally distributed in one group and non-normally distributed in the other, the comparison was carried out by Mann–Whitney *U*-test. Dichotomous categorical variables were compared by χ^{2} test.

The cohort of 4702 women with known outcomes and no pregnancy complications was used to construct reference ranges for birth weight (BW) in relation to gestational age, using the mean and SD model developed by Royston and Wright^{16}. Log-transformed BW was regressed against gestational age at delivery (GAD), with GAD expressed in days, as this is considered the optimal methodology with which to produce accurate reference ranges^{16}. Scaled absolute residuals derived by this regression were regressed against gestational age to define the association between SD of BW (SD_{GA}) and gestational age^{16}. The scaled absolute residuals were computed from residuals by removing the sign and multiplying by 1.25. Furthermore, we added fetal gender as a regressor to construct gender-adjusted BW standards. First- and second-degree terms for GAD sufficiently described our data and fetal gender was entered as a dichotomous variable in the regression equation. Specifically:

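$$\log \mathrm{BW}_{GA} = \beta_{0} + \beta_{1} \times \mathrm{GAD} + \beta_{2} \times \mathrm{GAD}^{2} + \beta_{3} \times \mathrm{gender}$$

where BW_{GA} is BW in relation to gestational age and β_{0} to β_{3} are the fitted regression coefficients.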
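The role of the factor 1.25 in the scaled absolute residuals can be illustrated numerically. For Gaussian residuals the mean absolute residual is SD × √(2/π), so multiplying by √(π/2) ≈ 1.25 gives a quantity whose mean is approximately the SD. The sketch below uses simulated data with a constant, hypothetical SD on the log scale; in the study these scaled residuals were regressed against gestational age rather than simply averaged.

```python
import random


def scaled_absolute_residuals(observed, fitted, scale=1.25):
    """Remove the sign of each residual and multiply by the scaling factor."""
    return [scale * abs(o - f) for o, f in zip(observed, fitted)]


# Simulated log-scale birth weights around a constant fitted value.
# Both numbers are hypothetical and chosen only for illustration.
random.seed(0)
true_sd = 0.05
fitted = [3.5] * 20000
observed = [random.gauss(mean, true_sd) for mean in fitted]

sar = scaled_absolute_residuals(observed, fitted)
mean_sar = sum(sar) / len(sar)  # close to true_sd for Gaussian residuals
```

Because each scaled absolute residual estimates the local SD, regressing them on gestational age yields SD_{GA} as a smooth function of gestational age.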

BW centiles were constructed as follows:

$$\log \mathrm{BW}_{\text{centile}} = \log \mathrm{BW}_{GA} + K \times \mathrm{SD}_{GA}$$

where K is the *z*-score for the corresponding centile and assuming that SD_{GA} was the same for male and female fetuses. Finally, we obtained the anti-log values for BW_{GA}.

Fetuses with BW at or below the 5^{th} centile according to this formula were classified as SGA. Our study cohort of 4702 women consisted of 4441 non-SGA and 261 SGA neonates, according to our reference ranges.
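The centile construction and the SGA classification can be sketched as follows. This is an illustration under stated assumptions: a base-10 logarithm is assumed (the text says only "log-transformed"), the function names are hypothetical, and the example mean and SD are invented values, not the study's fitted coefficients.

```python
from statistics import NormalDist


def bw_centile_grams(log10_mean, log10_sd, centile):
    """Birth weight (g) at a given centile: anti-log of mean + K * SD."""
    k = NormalDist().inv_cdf(centile / 100.0)  # z-score for the centile
    return 10 ** (log10_mean + k * log10_sd)


def is_sga(birth_weight_g, log10_mean, log10_sd, centile=5.0):
    """Classify as SGA when birth weight is at or below the chosen centile."""
    return birth_weight_g <= bw_centile_grams(log10_mean, log10_sd, centile)


# Hypothetical expected log10 BW and SD for one gestational age and gender.
example_mean, example_sd = 3.53, 0.05
fifth_centile = bw_centile_grams(example_mean, example_sd, 5)
```

In practice `log10_mean` and `log10_sd` would come from the gestational-age-specific and gender-specific regression equations.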

Regression analysis with first-, second- and third-degree equation terms was used in order to identify the best-fit polynomial equation that described the associations between log_{10} CRL and gestational age (calculated in days according to LMP) and log_{10} NT and CRL. The regression equations were fitted in the non-SGA fetuses. Delta values for NT and CRL for both non-SGA and SGA neonates were calculated as the difference between the observed and expected values, the latter being estimated from the regression equations.

Regression analysis was also applied in the subgroup of 2310 women who had undergone both first-trimester screening and a third-trimester growth ultrasound scan with Doppler assessment to develop nomograms for BPD, HC, AC, FL and estimated fetal weight (EFW) and to calculate the respective delta values. The EFW was extracted from the database and was calculated by the Hadlock formula^{17}. The 2189 non-SGA fetuses of this subgroup were investigated in order to define UA-PI and MCA-PI nomograms. MoMs for UA-PI and MCA-PI were computed by dividing observed by expected values for each biophysical parameter. The potential relation of UA-PI and MCA-PI with maternal and pregnancy characteristics was assessed.
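The two ways of expressing a measurement relative to its expected value, delta values and MoMs, can be sketched as follows. The helper names are hypothetical, and any coefficients passed to the polynomial are placeholders rather than the study's fitted equations.

```python
def expected_from_polynomial(coefficients, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... for a fitted polynomial."""
    return sum(c * x ** power for power, c in enumerate(coefficients))


def delta_value(observed, expected):
    """Delta value: difference between observed and expected measurement."""
    return observed - expected


def multiple_of_median(observed, expected):
    """MoM: observed measurement divided by the expected value."""
    return observed / expected
```

A delta of 0 or a MoM of 1 means the measurement equals the value expected for that gestational age (or CRL, in the case of NT).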

Regression diagnostics in terms of residual analysis were used to confirm the regression assumptions (linearity, homoscedasticity). Outliers were identified by Mahalanobis and Cook's distances and were removed to improve our regression models, but they are included in our presentation of descriptive statistics.

A first-trimester prediction model for SGA was developed by multiple logistic regression analysis. Receiver operating characteristic (ROC) curves were plotted based on the probabilities predicted by the model in order to evaluate the predictive value of the regression model. The predictors used were: delta NT, delta CRL, PAPP-A MoM, free β-hCG MoM, maternal age, maternal weight, maternal height, parity, smoking status and method of conception. Racial origin was not used as a predictor because 99.97% of the studied population was Caucasian.
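The area under a ROC curve built from predicted probabilities can be computed directly from its rank interpretation: it is the probability that a randomly chosen SGA case receives a higher predicted probability than a randomly chosen non-SGA case. The sketch below is a plain-Python illustration with a hypothetical function name; it is not the statistical package used in the study, and its pairwise loop is intended only for small examples.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from outcome labels (1 = SGA) and scores.

    Counts, over all case/non-case pairs, how often the case has the higher
    score; ties contribute one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

An area of 0.5 corresponds to a model that does not discriminate at all, and 1.0 to perfect separation of SGA from non-SGA pregnancies.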

Third-trimester prediction models for SGA were constructed separately for dBPD, dHC, dAC, dFL, *z* log_{10} EFW (*z*-score of log transformed EFW), UA-PI MoM and MCA-PI MoM. dBPD, dHC, dAC, dFL and *z* log_{10} EFW were not used simultaneously in the regression model due to significant collinearity among these variables. The individual ROC curves were plotted and compared. We combined third-trimester EFW (the best predictor of SGA in third trimester) with maternal/pregnancy characteristics and Doppler studies to develop a third-trimester combined model. Subsequently, first- and third-trimester parameters were combined by logistic regression modeling to develop an integrated model and evaluate its performance in identifying SGA fetuses.

Finally, contingency policies were evaluated. Bayes' theorem was applied to recalculate the risk in the 2310 pregnancies that had a third-trimester scan. This methodology has been used extensively in screening strategies^{18}. According to Bayes' theorem: posterior odds = prior odds × likelihood ratio. In our study, posterior odds (O_{posterior}) were the odds for SGA after reevaluation in the third trimester, while prior odds (O_{prior}) were those according to the first-trimester prediction model. The first-trimester prediction model for SGA was applied to calculate case-specific O_{prior}. Predicted probabilities (PP) obtained by the first-trimester prediction model were converted to O_{prior} by the equation:

$$O_{\text{prior}} = \frac{\mathrm{PP}}{1 - \mathrm{PP}}$$

We assumed that *z* log_{10} EFW (the best predictor of SGA in the third trimester) had a Gaussian distribution in SGA and AGA neonates, with respective means of M_{SGA} and M_{AGA} and standard deviations of SD_{SGA} and SD_{AGA}. Assuming that Y_{SGA} and Y_{AGA} are the distributions of *z* log_{10} EFW for SGA and AGA, respectively, with M_{SGA} ≠ M_{AGA} and SD_{SGA} ≠ SD_{AGA}, the likelihood ratio (LR) for each individual would be:

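$$\mathrm{LR} = \frac{\mathrm{SD}_{AGA}}{\mathrm{SD}_{SGA}} \times \exp\left[\frac{(x - M_{AGA})^{2}}{2\,\mathrm{SD}_{AGA}^{2}} - \frac{(x - M_{SGA})^{2}}{2\,\mathrm{SD}_{SGA}^{2}}\right]$$

where *x* is the estimated *z* log_{10} EFW by third-trimester scan for the individual.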
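The Bayesian update described above can be sketched as follows: the first-trimester predicted probability is converted to odds, multiplied by the ratio of the two Gaussian densities evaluated at the observed *z* log_{10} EFW, and converted back to a probability. The function names and the example distribution parameters are hypothetical; the study's estimated means and SDs are not given in this section.

```python
import math


def probability_to_odds(p):
    return p / (1.0 - p)


def odds_to_probability(odds):
    return odds / (1.0 + odds)


def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(x, m_sga, sd_sga, m_aga, sd_aga):
    """Density of x among SGA neonates divided by its density among AGA."""
    return gaussian_pdf(x, m_sga, sd_sga) / gaussian_pdf(x, m_aga, sd_aga)


def posterior_probability(prior_probability, x, m_sga, sd_sga, m_aga, sd_aga):
    """Posterior odds = prior odds x likelihood ratio, returned as probability."""
    prior_odds = probability_to_odds(prior_probability)
    lr = likelihood_ratio(x, m_sga, sd_sga, m_aga, sd_aga)
    return odds_to_probability(prior_odds * lr)


# Hypothetical distribution parameters, for illustration only.
PARAMS = dict(m_sga=-1.2, sd_sga=0.9, m_aga=0.0, sd_aga=1.0)
risk_small_fetus = posterior_probability(0.05, -2.0, **PARAMS)  # risk rises
risk_large_fetus = posterior_probability(0.05, 1.0, **PARAMS)   # risk falls
```

A low *z* log_{10} EFW gives a likelihood ratio above 1 and raises the first-trimester risk, whereas a high value gives a ratio below 1 and lowers it.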

We investigated three different strategies, with third-trimester scans being performed in 20%, 30% or 50% of pregnancies with the highest predicted probability for SGA according to the first-trimester model. O_{posterior} was calculated supposing that we rescanned 20%, 30% and 50% of fetuses and the remaining population maintained their first-trimester assigned risk. The contingency strategies were evaluated by ROC-curve analysis.
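A contingency policy of this kind can be sketched as follows: pregnancies are ranked by first-trimester predicted probability, the top fraction is rescanned and has its risk updated by the likelihood ratio, and the rest keep their first-trimester risk. The function name, the rounding of the rescanned group size and the example inputs are assumptions made for illustration.

```python
def contingent_risks(prior_probs, likelihood_ratios, rescan_fraction):
    """Final SGA risk per pregnancy under a contingent third-trimester scan.

    prior_probs: first-trimester predicted probabilities.
    likelihood_ratios: third-trimester likelihood ratios (used only for the
        pregnancies selected for rescanning).
    rescan_fraction: e.g. 0.2, 0.3 or 0.5.
    """
    n = len(prior_probs)
    n_rescan = round(n * rescan_fraction)
    # Indices ordered from highest to lowest first-trimester risk.
    ranked = sorted(range(n), key=lambda i: prior_probs[i], reverse=True)
    rescanned = set(ranked[:n_rescan])

    final = []
    for i, prior in enumerate(prior_probs):
        if i in rescanned:
            posterior_odds = prior / (1.0 - prior) * likelihood_ratios[i]
            final.append(posterior_odds / (1.0 + posterior_odds))
        else:
            final.append(prior)  # first-trimester risk is maintained
    return final


# Ten hypothetical pregnancies; a likelihood ratio of 2 for everyone.
priors = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
risks_20 = contingent_risks(priors, [2.0] * 10, 0.2)
```

With a 20% policy only the two highest-risk pregnancies in this example are updated; the resulting list of risks is what a ROC-curve analysis of the strategy would then be based on.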

The level of significance was set at *P* < 0.05 and all tests were two-sided.