Evaluation and validation of a new risk score (CLEOPATRA score) to predict the probability of premature delivery for patients with threatened preterm labor


Abstract

Objective

To develop a clinically useful tool to predict the probability of preterm delivery in patients with threatened preterm labor.

Methods

One hundred and seventy patients with preterm labor between 24 and 34 weeks of gestation were included. Preterm delivery < 37 weeks of gestation was the main endpoint of the study. The data were randomized and split into an evaluation set (n = 85) and a validation set (n = 85). The evaluation set was subjected to stepwise backward logistic regression analysis to quantify the relative impact of four potential risk factors, including individual patient factors, results of a rapid fetal fibronectin assay, and sonographic measurement of cervical length. Using the constant of the logistic regression analysis and the beta-coefficients for the identified risk factors the individual probability of preterm delivery for each woman of the validation dataset was calculated. The area under a receiver–operating characteristics curve (AUC) was used to evaluate the discriminating power of the score.

Results

The overall rate of preterm delivery was 27.1%. Logistic regression analysis was performed for the potential predictors of spontaneous preterm delivery identified by univariate analysis: positive fetal fibronectin, shortened cervical length, previous preterm delivery and maternal age. Because the fetal fibronectin assay is not available at all institutions worldwide, it was excluded from the initial model. Two risk factors were independent predictors of preterm delivery and were included in the CLEOPATRA I (clinical evaluation of preterm delivery and theoretical risk assessment) score: cervical length ≤ 2.5 cm and previous preterm delivery (odds ratios, 7.65 and 6.74, respectively). In the CLEOPATRA II model, positive fetal fibronectin and previous preterm delivery were associated with a higher risk of preterm delivery, with odds ratios of 17.9 and 4.56, respectively. The discriminatory power (AUC) of the models was: CLEOPATRA I, 0.69 (95% CI, 0.56–0.82); CLEOPATRA II, 0.81 (95% CI, 0.69–0.93).

Conclusion

In symptomatic women the risk for preterm delivery can be predicted best with the CLEOPATRA II score based on fetal fibronectin and previous preterm delivery. Copyright © 2005 ISUOG. Published by John Wiley & Sons, Ltd.

Introduction

Despite several clinical and monitoring improvements the incidence of prematurity in most European countries and in the United States has remained unchanged, at 8–9%, over the past decade1. The delivery of infants prior to 37 weeks of gestation complicates approximately 5–6% of births in Germany, and it is a leading cause of neonatal morbidity and mortality2; preterm delivery accounts for 60% of all perinatal deaths3.

To direct prevention strategies towards high-risk obstetric populations several means of identification of women at risk of preterm delivery have been suggested, including risk-scoring systems, biochemical markers of inflammation, such as fetal fibronectin, and screening for various infections, including bacterial vaginosis. Risk-scoring systems focusing on epidemiological, anamnestic and clinical risk factors associated with preterm delivery have been developed to decrease unnecessary intervention for patients with symptoms of preterm labor and to identify patients who might benefit from aggressive therapy such as tocolysis, corticosteroids and transfer to a tertiary care facility4–7. However, sensitivity and positive predictive values are low. Thus, most women who deliver preterm are not identified by the risk-scoring system and most women identified as high risk do not deliver preterm.

In the last few years transvaginal sonographic assessment of the uterine cervix has been found to be an effective marker with a high negative predictive value (NPV), and numerous investigators have observed that in singleton pregnancies a shortened cervical length is predictive of preterm delivery in both symptomatic and asymptomatic patients8–10. The most clinically useful biochemical approach to discriminate women who are at high risk for impending preterm delivery is measurement of fetal fibronectin in cervicovaginal secretions. Studies have demonstrated it to be an excellent marker, with an NPV of > 99% for predicting delivery within 7 or 14 days in symptomatic women11–14.

However, no study has evaluated a risk score for symptomatic women combining clinical risk factors, fetal fibronectin and cervical length determined by transvaginal ultrasound. We therefore designed an observational study to identify and quantify the most important risk factors for preterm delivery and to set up different risk scores, according to the routine availability of equipment and laboratory examination facilities, to predict the probability of premature delivery in patients with threatened preterm labor. We aimed to develop a risk score with high discriminatory power, based on test results available within hours, that would allow calculation of an exact risk.

Methods

One hundred and seventy women participated in this observational study in a tertiary care obstetric unit (Department of Obstetrics and Perinatal Medicine, Marburg) from November 2001 to January 2004. All women with singleton pregnancies receiving care at this institution were eligible for participation if they arrived at the hospital between 24 weeks and 34 + 6 weeks with intact membranes and symptoms that suggested preterm labor. Gestational age was assigned on the basis of the last menstrual period confirmed by first- or early second-trimester sonography; if there was a discrepancy of > 10 days, sonographic gestational age was used. Preterm labor was defined as the presence of uterine contractions on external tocodynamometry occurring at a frequency of four in 20 min or eight in 1 hour, or any uterine activity associated with changes in cervical effacement (at least 50%) and dilatation (at least 2 cm). Women whose pregnancies were complicated by cervical cerclage, cervical dilatation ≥ 3 cm, placenta previa, clinical criteria of intrauterine infection, vaginal bleeding of unknown origin, fetal growth restriction, pre-eclampsia, suspected fetal asphyxia or a major fetal anomaly were excluded from the study. Medically indicated preterm deliveries were not considered. The study was approved by the local ethics committee and all patients gave their informed consent to participate in the study.

Data collection at enrollment included pre-pregnancy weight and height, pregnancy history, demographic variables (age, parity, gravidity, miscarriages, previous preterm delivery), transvaginal cervical ultrasound, cervical sampling for fetal fibronectin, vaginal sampling for bacterial vaginosis or vaginal infection, and digital examination (cervical dilatation, effacement and Bishop score15). Body mass index (BMI) in kg/m² was derived from pre-pregnancy weight and measured height.

Samples for the fetal fibronectin assay were taken from the cervix during a speculum examination by placing a dry Dacron (DuPont, Wilmington, DE, USA) swab against the neck of the cervix for 10 s. The sample was analyzed using the rapid fetal fibronectin TLi System (Adeza Biomedical, Sunnyvale, CA, USA) qualitative method, with results reported as either positive (≥ 50 ng/mL) or negative (< 50 ng/mL). Digital cervical examination was then performed with sterile gloves and lubricant.

Vaginal secretions for bacterial vaginosis testing were collected with moistened cotton-tipped swabs placed in the posterior fornix. One swab was smeared in a drop of normal saline on a microscope slide, which was then examined by microscopy for clue cells. The other swab was placed in 10% potassium hydroxide (KOH) to determine whether there was an amine (fishy) odor. Vaginal secretions were tested with pH paper to determine whether the pH exceeded 4.5. Clinical diagnosis of bacterial vaginosis included the presence of clue cells plus two of the following: (1) homogenous vaginal discharge; (2) vaginal pH above 4.5; (3) amine odor with 10% KOH.

A dry cotton-tipped swab was used to collect secretions from the posterior fornix. The swab was placed in a transport medium and was analyzed using Gram staining and non-selective blood agar culture. The diagnosis of vaginal infection was based on the presence of enterococcal species, group B streptococcus, Chlamydia trachomatis, Ureaplasma urealyticum, or Neisseria gonorrhoeae, alone or in combination.

The standard care at the study hospital for patients with preterm labor between 24 and 34 weeks of gestation was administered to all patients. This involved administration of either magnesium sulfate or β-mimetics (fenoterol) as a tocolytic agent, combined with betamethasone as antenatal corticosteroid therapy to induce fetal lung maturity. In the presence of bacterial vaginosis or vaginal infection, systemic antibiotic treatment was started. The managing physician was blinded to the results of the fetal fibronectin assay and these did not influence therapy.

Sonographic scans of the uterine cervix were performed with a 4–8-MHz transvaginal transducer (HDI 3000, Philips Medical Systems, Hamburg, Germany) after the patient had emptied her bladder. The internal cervical os was identified in the sagittal plane, and the probe was moved until the entire cervical canal was visualized. The cervical length was measured as the distance between the internal os and the external os, identified by sonolucency of the cervical canal. All the sonographic measurements were performed by a single investigator (I.T.).

Statistical analysis

Preterm delivery, defined as birth before 37 weeks of gestation, was used as the major endpoint. The data were randomized and split into an evaluation set (n = 85) and a validation set (n = 85). In the description of subjects, categorical data were expressed as percentages, and continuous variables as mean and SD. Nominal values were analyzed with the χ2 test or two-sided Fisher's exact test. Univariate odds ratios (ORs) and 95% CIs were calculated. Variables that tended to influence the occurrence of preterm delivery (defined as a P-value of < 0.20 in the univariate analysis) were subjected to a stepwise backward logistic regression analysis using the maximum likelihood function. Nagelkerke R2 (approximately the proportion of the explained variance in the logistic regression model) was used to judge the goodness-of-fit of a model. Factors included in the model were used to set up the score for the probability of preterm delivery. Using the constant of the logistic regression analysis and the beta-coefficients for the identified risk factors, the individual probability of preterm delivery for each woman in the validation dataset was calculated. Discrimination of these scores was tested by plotting the true-positive rate (sensitivity) against the false-positive rate (1 − specificity); the area under the receiver–operating characteristics (ROC) curve (AUC) is a measure of the accuracy of the test16. AUCs between 0.7 and 0.8 were classified as ‘acceptable’ and those between 0.8 and 0.9 as ‘excellent’ discrimination17. For each curve the optimal cut-off point was calculated, defined as the point at the maximum distance between the ROC curve and the 45° line. For this cut-off risk, which separates a positive from a negative prediction, the corresponding sensitivity, specificity and positive predictive value were calculated. Data analyses were performed using SPSS for Windows (version 11.0; SPSS Inc, Chicago, IL, USA).
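The ROC analysis described in this section can be illustrated with a short sketch. It is not the authors' SPSS procedure: the function names (`roc_points`, `auc`, `best_cutoff`) and the toy scores are hypothetical, the AUC is computed with the trapezoidal rule, and the optimal cut-off is taken as the threshold that maximizes sensitivity minus (1 − specificity), which is the point on the curve farthest from the 45° line.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (threshold, false-positive rate, true-positive rate)


def roc_points(scores: List[float], labels: List[int]) -> List[Point]:
    """ROC points for the rule 'predict preterm delivery when score >= threshold'."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points: List[Point] = [(float("inf"), 0.0, 0.0)]  # nothing predicted positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / n_neg, tp / n_pos))
    return points


def auc(points: List[Point]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def best_cutoff(points: List[Point]) -> Point:
    """Point with the largest vertical distance above the 45-degree line."""
    return max(points, key=lambda p: p[2] - p[1])


# Hypothetical predicted risks and outcomes (1 = preterm delivery), not study data.
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
demo_labels = [1, 1, 1, 0, 0, 1, 0, 0]

demo_points = roc_points(demo_scores, demo_labels)
threshold, fpr, tpr = best_cutoff(demo_points)
print(f"AUC = {auc(demo_points):.3f}")
print(f"cut-off = {threshold}, sensitivity = {tpr:.2f}, specificity = {1 - fpr:.2f}")
```

Because the perpendicular distance to the 45° line is proportional to the vertical distance, maximizing either one selects the same cut-off.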

Results

Study population

A total of 170 patients were included in the analysis; 85 data records were used for the evaluation of the risk factors and 85 for the validation of the score. Table 1 presents the clinical and demographic factors of patients delivered at < 37 weeks' gestation and at ≥ 37 weeks' gestation in the evaluation set. The overall rate of preterm delivery was 27.1% (23/85). No significant differences were observed between patients delivered preterm and those delivered at term in the univariate statistics concerning maternal and gestational age at admission, BMI, gravidity, parity, miscarriages, bacterial vaginosis, Bishop score, vaginal pH and vaginal infection. Table 1 also shows the ORs, calculated by univariate analysis, for spontaneous preterm birth. For preterm delivery at < 37 weeks, a positive fetal fibronectin assay (OR, 17.5; 95% CI, 5.26–58.2; P < 0.001) and cervical length ≤ 2.5 cm (OR, 5.15; 95% CI, 1.77–15.0; P = 0.002) were the two strongest risk factors; previous preterm delivery (OR, 3.88; 95% CI, 1.05–14.3; P = 0.064) was a predictor of borderline significance for spontaneous preterm birth.

Table 1. Demographic and clinical characteristics of patients delivering at < 37 weeks or ≥ 37 weeks in the evaluation dataset

| Characteristic | Delivery < 37 weeks (n = 23) | Delivery ≥ 37 weeks (n = 62) | P* | Odds ratio (95% CI) |
|---|---|---|---|---|
| Maternal age (years, n (%)) | | | 0.13 | |
| < 20 | 0 (0.0) | 4 (6.5) | | |
| 20–35 | 20 (87.0) | 56 (90.3) | | Not computable |
| > 35 | 3 (13.0) | 2 (3.2) | | |
| Body mass index (kg/m², n (%)) | | | 0.66 | |
| < 19.8 | 8 (34.8) | 15 (25.0) | | |
| 19.8–29 | 14 (60.9) | 40 (66.7) | | 0.66 (0.23–1.88) |
| > 29 | 1 (4.3) | 5 (8.3) | | 0.36 (0.04–3.79) |
| Parity (n (%)) | | | 1 | |
| Nulliparous | 11 (47.8) | 28 (45.2) | | |
| Parous | 12 (52.2) | 34 (54.8) | | 0.90 (0.34–2.34) |
| Gravidity (n (%)) | | | 0.78 | |
| 0–1 | 5 (22.7) | 18 (29.0) | | |
| > 1 | 17 (77.3) | 44 (71.0) | | 1.40 (0.45–4.34) |
| Miscarriages (n (%)) | 10 (45.5) | 21 (33.9) | 0.44 | 1.63 (0.60–4.38) |
| Previous preterm delivery (n (%)) | 6 (26.1) | 5 (8.3) | 0.064 | 3.88 (1.05–14.3) |
| Vaginal pH ≥ 4.5 (n (%)) | 6 (30.0) | 16 (30.2) | 1 | 0.99 (0.32–3.04) |
| Bacterial vaginosis (n (%)) | 4 (20.0) | 15 (27.3) | 0.77 | 0.67 (0.19–2.32) |
| Vaginal infection (n (%)) | 14 (66.7) | 40 (74.1) | 0.57 | 0.70 (0.34–2.09) |
| Bishop score ≥ 4 (n (%)) | 10 (43.5) | 23 (37.1) | 0.62 | 1.30 (0.49–3.45) |
| Fetal fibronectin-positive (n (%)) | 15 (65.2) | 6 (9.7) | < 0.001 | 17.5 (5.26–58.2) |
| Cervical length ⩽ 2.5 cm (n (%)) | 17 (73.9) | 22 (35.5) | 0.002 | 5.15 (1.77–15.0) |
| Gestational age at enrollment (weeks, mean ± SD) | 30.2 ± 3.7 | 29.6 ± 3.6 | | |
| Gestational age at delivery (weeks, mean ± SD) | 33.4 ± 2.7 | 39.2 ± 1.3 | | |

P-values are the results of univariate statistical analysis. *Fisher's exact test or χ2 test. First category serves as reference. Data were not available for all cases.
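The univariate odds ratios in Table 1 can be recomputed from the 2 × 2 counts. The source does not state how the confidence intervals were obtained; the sketch below assumes the standard log (Woolf) method, and the function name `odds_ratio_ci` is hypothetical. Under that assumption it reproduces the reported values for fetal fibronectin (15 of 23 positive in the preterm group, 6 of 62 in the term group).

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with an approximate 95% CI (Woolf log method, an assumption).

    a: preterm, factor present    b: preterm, factor absent
    c: term, factor present       d: term, factor absent
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Fetal fibronectin in the evaluation set: 15/23 positive (preterm), 6/62 positive (term).
or_ffn, lo_ffn, hi_ffn = odds_ratio_ci(15, 23 - 15, 6, 62 - 6)
print(f"fetal fibronectin: OR {or_ffn:.1f} (95% CI {lo_ffn:.2f}-{hi_ffn:.1f})")

# Cervical length <= 2.5 cm: 17/23 (preterm), 22/62 (term).
or_cl, lo_cl, hi_cl = odds_ratio_ci(17, 23 - 17, 22, 62 - 22)
print(f"cervical length:   OR {or_cl:.2f} (95% CI {lo_cl:.2f}-{hi_cl:.1f})")
```

This method breaks down when any cell is zero, which is consistent with the 'Not computable' entry for maternal age.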

Evaluation of risk factors

The logistic regression analysis was performed for the potential predictors of spontaneous preterm delivery identified by univariate analysis. These were positive fetal fibronectin, shortened cervical length, previous preterm delivery and maternal age. An additional score was set up by excluding fetal fibronectin from the initial model and subjecting this to the stepwise backward logistic regression procedure.

Based on the potentially significant factors, the risk of preterm delivery for the individual patient with threatened preterm labor was calculated as a logistic function of z by the equation:

$$P(\text{preterm delivery}) = \frac{1}{1 + e^{-z}}$$

CLEOPATRA I model

To create the CLEOPATRA I (clinical evaluation of preterm delivery and theoretical risk assessment) score, the only variables included in the regression model were those that were available initially at admission of the patient and that showed a tendency to influence the risk for preterm delivery in the univariate analysis. Thus, fetal fibronectin was not used for this model. After the stepwise backward procedure, two risk factors remained in the final model (Table 2). Maternal age was excluded as being not significant. Nagelkerke R2 (0.275) showed poor goodness-of-fit.

Table 2. Results of the different logistic regression analyses

| Model | Predictor | Beta-coefficient | SE | P | Odds ratio | 95% CI | Nagelkerke R2 |
|---|---|---|---|---|---|---|---|
| CLEOPATRA I | | | | | | | 0.275 |
| | Cervical length (⩽ 2.5 cm) | 2.04 | 0.62 | 0.001 | 7.65 | 2.28–25.7 | |
| | Previous preterm delivery | 1.91 | 0.79 | 0.016 | 6.74 | 1.44–31.7 | |
| | Constant | −2.38 | 0.54 | < 0.001 | | | |
| CLEOPATRA II | | | | | | | 0.421 |
| | Fetal fibronectin | 2.88 | 0.64 | < 0.001 | 17.9 | 5.10–62.7 | |
| | Previous preterm delivery | 1.52 | 0.81 | 0.061 | 4.56 | 0.93–22.4 | |
| | Constant | −2.16 | 0.43 | < 0.001 | | | |

Logistic regression was performed including all predictors with a P-value < 0.2 (maternal age, cervical length, previous preterm delivery, fetal fibronectin). Only the variables with significant influence in the regression model (P-value < 0.1) were considered in the final models presented here. SE, standard error.

The z-value of the risk score I was defined by:

$$z = -2.38 + 2.04 \times \text{cervical length} + 1.91 \times \text{previous preterm delivery}$$

The variables were coded as follows: cervical length: ⩽ 2.5 cm = 1, > 2.5 cm = 0; previous preterm delivery: yes = 1, no = 0.

As an example, the resulting probabilities for preterm delivery based on the CLEOPATRA I score are presented in Table 3 according to cervical length and previous preterm delivery.

Table 3. Calculated risk based on CLEOPATRA I score

| Cervical length ⩽ 2.5 cm | Previous preterm delivery | Risk (%) |
|---|---|---|
| Yes | Yes | 83 |
| Yes | No | 42 |
| No | Yes | 38 |
| No | No | 8 |
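The calculated risks follow directly from the CLEOPATRA I constant and beta-coefficients in Table 2 and the variable coding given above. The sketch below is illustrative only; the function and constant names are hypothetical, and the coefficients are the rounded values from Table 2.

```python
import math

# Rounded coefficients of the CLEOPATRA I model as listed in Table 2.
CONSTANT = -2.38
BETA_SHORT_CERVIX = 2.04       # cervical length <= 2.5 cm coded as 1
BETA_PREVIOUS_PRETERM = 1.91   # previous preterm delivery coded as 1


def cleopatra_i_risk(short_cervix: bool, previous_preterm: bool) -> float:
    """Predicted probability of delivery before 37 weeks (logistic function of z)."""
    z = (CONSTANT
         + BETA_SHORT_CERVIX * int(short_cervix)
         + BETA_PREVIOUS_PRETERM * int(previous_preterm))
    return 1.0 / (1.0 + math.exp(-z))


for short_cervix in (True, False):
    for previous_preterm in (True, False):
        risk = cleopatra_i_risk(short_cervix, previous_preterm)
        print(f"cervix <= 2.5 cm: {short_cervix!s:5}  "
              f"previous preterm: {previous_preterm!s:5}  risk: {100 * risk:.0f}%")
```

With both risk factors present, z = −2.38 + 2.04 + 1.91 = 1.57, which corresponds to a risk of about 83%; with neither present, z = −2.38 and the risk is about 8%.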

CLEOPATRA II model

A further risk score was then designed using the variables of the initial CLEOPATRA I model and including the variable fetal fibronectin. Thus three candidate risk factors entered this model, of which two remained after the stepwise backward procedure (Table 2). Fetal fibronectin and previous preterm delivery were associated with higher risk of preterm delivery, with odds ratios of 17.9 and 4.56, respectively. Nagelkerke R2 (0.421) showed an improved goodness-of-fit.

The z-value of the risk score II was defined by:

$$z = -2.16 + 2.88 \times \text{fetal fibronectin} + 1.52 \times \text{previous preterm delivery}$$

The variables were coded as follows: fetal fibronectin: positive = 1, negative = 0; previous preterm delivery: yes = 1, no = 0.
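As a worked example, the CLEOPATRA II constant and beta-coefficients from Table 2 can be inserted into the logistic function for each combination of the two coded variables. The symbols x_fFN and x_PPD are introduced here only for illustration, and the probabilities are derived from the rounded coefficients; they are not values reported in the study.

```latex
\begin{align*}
z &= -2.16 + 2.88\,x_{\mathrm{fFN}} + 1.52\,x_{\mathrm{PPD}}, & P &= \frac{1}{1 + e^{-z}} \\
(x_{\mathrm{fFN}}, x_{\mathrm{PPD}}) = (1, 1):\quad z &= 2.24, & P &\approx 0.90 \\
(1, 0):\quad z &= 0.72, & P &\approx 0.67 \\
(0, 1):\quad z &= -0.64, & P &\approx 0.35 \\
(0, 0):\quad z &= -2.16, & P &\approx 0.10
\end{align*}
```

A positive fetal fibronectin result alone therefore shifts the predicted risk from about 10% to about 67%.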

Validation of predicted probabilities

Table 4 presents the clinical and demographic factors of patients delivered at < 37 weeks' gestation and at ≥ 37 weeks' gestation in the validation set. Figure 1 shows the two ROC curves of the scores for prediction of preterm delivery. The discriminatory power of the CLEOPATRA II AUC (0.811; 95% CI, 0.694–0.928) was excellent while that of the CLEOPATRA I AUC (0.692; 95% CI, 0.562–0.821) was not acceptable.

Figure 1.

Comparison of receiver–operating characteristics curves and areas under the curve (AUC) constructed for CLEOPATRA scores in the prediction of preterm delivery using the validation dataset. ----, CLEOPATRA I (AUC, 0.69; 95% CI, 0.56–0.82); ······, CLEOPATRA II (AUC, 0.81; 95% CI, 0.69–0.93); ———, 45° line.

Table 4. Demographic and clinical characteristics of patients delivering at < 37 weeks or ≥ 37 weeks in the validation dataset

| Characteristic | Delivery < 37 weeks (n = 22) | Delivery ≥ 37 weeks (n = 63) |
|---|---|---|
| Maternal age (years, n (%)) | | |
| < 20 | 0 (0.0) | 4 (6.3) |
| 20–35 | 19 (86.4) | 53 (84.1) |
| > 35 | 3 (13.6) | 6 (9.5) |
| Body mass index (kg/m², n (%)) | | |
| < 19.8 | 1 (4.5) | 11 (17.7) |
| 19.8–29 | 18 (81.8) | 46 (74.2) |
| > 29 | 3 (13.6) | 5 (8.1) |
| Parity (n (%)) | | |
| Nulliparous | 14 (63.6) | 35 (55.6) |
| Parous | 8 (36.4) | 28 (44.4) |
| Gravidity (n (%)) | | |
| 0–1 | 10 (47.6) | 26 (41.9) |
| > 1 | 11 (52.4) | 36 (58.1) |
| Miscarriages (n (%)) | 9 (40.9) | 25 (39.7) |
| Previous preterm delivery (n (%)) | 4 (18.2) | 5 (7.9) |
| Vaginal pH ≥ 4.5 (n (%)) | 8 (38.1) | 27 (48.2) |
| Bacterial vaginosis (n (%)) | 6 (30.0) | 22 (40.7) |
| Vaginal infection (n (%)) | 16 (76.2) | 35 (63.6) |
| Bishop score ≥ 4 (n (%)) | 10 (45.5) | 19 (30.2) |
| Fetal fibronectin-positive (n (%)) | 16 (72.7) | 9 (14.3) |
| Cervical length ⩽ 2.5 cm (n (%)) | 16 (72.7) | 25 (39.7) |
| Gestational age at enrollment (weeks, mean ± SD) | 28.8 ± 4.1 | 30.3 ± 3.5 |
| Gestational age at delivery (weeks, mean ± SD) | 34.3 ± 2.7 | 39.4 ± 1.3 |

Data were not available for all cases.

Cervical length and previous preterm delivery were the two factors in the CLEOPATRA I model and fetal fibronectin and previous preterm delivery in the CLEOPATRA II model that were associated with the primary outcome of delivery at < 37 weeks of gestation. The OR of spontaneous preterm delivery at < 37 weeks was analyzed by a combination of these significant parameters in both risk scores (Tables 5 and 6).

Table 5. Combination of variables included in CLEOPATRA I model in the prediction of preterm delivery using the validation dataset

| Cervical length (cm) | Previous preterm delivery | n | Delivery < 37 weeks (n (%)) | Odds ratio | 95% CI |
|---|---|---|---|---|---|
| > 2.5 | No | 39 | 5 (12.8) | Reference | |
| ⩽ 2.5 | No | 37 | 13 (35.1) | 3.68 | 1.16–11.7 |
| > 2.5 | Yes | 5 | 1 (20.0) | 1.30 | 0.40–4.29 |
| ⩽ 2.5 | Yes | 4 | 3 (75.0) | 2.73 | 1.21–6.18 |

Table 6. Combination of variables included in CLEOPATRA II model in the prediction of preterm delivery using the validation dataset

| Fetal fibronectin | Previous preterm delivery | n | Delivery < 37 weeks (n (%)) | Odds ratio | 95% CI |
|---|---|---|---|---|---|
| Negative | No | 54 | 5 (9.3) | Reference | |
| Positive | No | 22 | 13 (59.1) | 14.2 | 4.05–49.5 |
| Negative | Yes | 6 | 1 (16.7) | 1.40 | 0.44–4.50 |
| Positive | Yes | 3 | 3 (100.0) | NC | NC |

NC, not computable.

Discussion

In order to combat preterm delivery effectively, it is important to search for risk factors in the patient's medical history and/or use diagnostic tools. Demographic data, laboratory tests and physical examination data have been studied as predictors of spontaneous preterm delivery18–21. Unfortunately, however, the sensitivity and/or specificity of these tests are low.

In this study we evaluated the relationship between many single potential factors and preterm delivery, and found relatively few significant associations. Shortened cervical length, fetal fibronectin and previous preterm delivery were the most important predictors.

Cervical length is an established predictor of preterm delivery in patients with preterm labor22. Since the introduction of transvaginal sonographic measurement of cervical length and dilatation of the internal cervical os, several studies have shown the outstanding predictive value of this method in patients with preterm contractions as well as in asymptomatic patients8–10, 23. Direct comparison of sonographic and digital examinations to assess cervical morphology revealed vaginal sonography to be more accurate24. Furthermore, in contrast to digital examination, the interobserver variability of sonography is acceptable and can be improved further by standardization of the method. Our data confirmed these findings; patients who subsequently had a spontaneous preterm delivery were significantly more likely to have a cervical length ≤ 2.5 cm (73.9% vs. 35.5%; P = 0.002).

In 1991, Lockwood et al.25 published the first study in which the presence of fetal fibronectin in preterm labor patients was associated with preterm delivery. Although the presence of fetal fibronectin in vaginal fluids does not necessarily indicate the onset of labor (positive predictive value, 15–25%), its absence rules out labor within 7 days with a very high negative predictive value (97%–99.5%)11–14. Therefore, a negative result indicates a low likelihood of delivery, but a positive test should not be interpreted as an indication of labor or a reason for admission. In our univariate analysis the presence of a positive test result significantly increased the risk of preterm delivery (OR, 17.5; P < 0.001).

Epidemiological studies have demonstrated clearly that a prior history of preterm delivery is a major risk factor for its recurrence in a subsequent pregnancy10, 26. It has been suggested that several candidate genes for polymorphisms associated with labor exist, including those for oxytocin receptors, adrenocorticotropic hormones, thromboxane A2 receptor, endothelin-1, F-actin, tumor necrosis factor α (TNF-α), and interleukin-1 receptor antagonist27, 28. Our data are consistent with the epidemiological findings; in our study the risk for preterm delivery was increased in patients with a previous preterm delivery (OR, 3.88; 95% CI, 1.05–14.3), although the association was of borderline significance (P = 0.064).

Several studies have associated direct and indirect evidence of intra-amniotic infection with spontaneous preterm delivery29, 30. Conditions and microorganisms that have been implicated include bacterial vaginosis, group B streptococcus, Trichomonas vaginalis, mycoplasmas, fusobacteria and Ureaplasma urealyticum. More recent studies have focused on screening for bacterial vaginosis31, and its detection in the second and third trimesters of pregnancy has been linked with preterm delivery. Controversy remains regarding the benefit of treating asymptomatic women32. With the exception of high-risk patients, a policy of routine screening for bacterial vaginosis is not currently recommended, based on the results of a multicenter randomized controlled trial32. In our study bacterial vaginosis and vaginal infection had no significant influence as predictors of preterm delivery.

Preterm labor and birth have been studied at both extremes of maternal age. A young maternal age (i.e. ≤ 20 years) is often associated with an increased rate of preterm delivery33. Most of the complications associated with older maternal age are caused by age-related confounders such as diabetes and hypertension, which often lead to indicated preterm delivery. In our data, maternal age had no significant influence as a predictor of preterm delivery.

Most, but not all, studies find a low BMI to be a strong risk factor for preterm delivery. Whether this relationship is due to a specific vitamin or mineral deficiency associated with low caloric consumption or to other factors is unknown34. In our study, BMI had no significant influence as a predictor for preterm delivery.

Bishop scores, consisting of dilatation, effacement, station, consistency and position, are generally used to judge the state of parturition in humans15. However, determination of these parameters is based on experience and does not provide an objective estimation of cervical ripening. The digital examination is neither sensitive nor specific. Indeed, graded risk-assessment scores, despite the use of complex models developed using a multivariate analysis procedure, showed low sensitivity (approximately 20%) and low positive predictive value (< 30%)7, 8, 24.

Evaluation of risk factors

The relative impact of risk factors should be quantified to develop a risk score that allows the calculation of an exact probability for preterm delivery in patients with threatened preterm labor. This score should be reliable in identifying a high-risk group that may need aggressive treatment. When the predicted risk reaches a certain cut-off level, long-term hospitalization with bed rest and transport to a tertiary care center can be considered. In addition to our CLEOPATRA II score, which can only be calculated with the use of expensive biochemical testing (rapid fetal fibronectin TLi System) not available in all centers, we developed the CLEOPATRA I score, based primarily on the medical history of the woman, clinical examination and cervical length measurement.

Calculation and validation of the risk score

The methodological strength of this analysis was the splitting of data into an evaluation and a validation set. This is important as any regression analysis is designed to calculate the best fit for the analyzed sample itself. Therefore, the discriminatory power of the different scores may be less if a prediction is made for another sample drawn from the same population. To avoid overestimation of the discriminatory power, the data records were split randomly into an evaluation set and a validation set and only data of patients that were not used to set up the model were used for validation. The performance of the prognostic models was evaluated by their discriminatory power. Discrimination (i.e. the ability of a prognostic score to classify patients correctly as preterm delivery or non-preterm delivery) was measured by AUC16, 17.

We found the discriminatory power of the CLEOPATRA II score to be excellent (AUC, 0.81), whereas that of the CLEOPATRA I score (AUC, 0.69) fell just below the 0.7 threshold for acceptable discrimination. These results suggest that preterm delivery can be predicted with our CLEOPATRA II score, which is based on two independent risk factors.

Limitations of the risk model

A major limitation of our analysis is that the risk calculation is only possible for women who have been treated with standard care with tocolysis, betamethasone and hospitalization. A study without treatment would be unethical.

One other major limitation is that our results may not be applicable to institutions with different patient populations. As a consequence, parameters that were under- or not represented in our dataset had no chance of being identified as a significant risk factor or protective factor. For example, there were only nine patients in the evaluation set with an extreme maternal age (< 20 or > 35 years). Thus, extreme maternal age had no chance of showing its predictive value in this analysis, though it is known that young and older maternal age is often associated with an increased rate of preterm delivery33. Another potential criticism of our study might be the preprocessing of the data before the logistic regression analysis. While combining and dichotomizing the data, several simplifications were made that could have affected the final results. For example, previous preterm delivery was merged into one dichotomous variable (yes vs. no); we did not distinguish between different gestational ages at preterm delivery, or specify how many previous preterm deliveries there had been, although we noted that the risk for spontaneous preterm delivery increased as the gestational age of any previous preterm delivery decreased. The highest risk is in women with more than one preterm delivery or when the preterm delivery occurred before 28 weeks of gestation35–38. The risk of recurrence is lower if the previous pregnancy went to term. While this procedure eliminated potentially valuable information, it was the only way to create variables with adequate statistical power to ‘survive’ in the logistic regression model.

Other problematic aspects of the statistical modeling are potential interactions between the individual factors. Logistic regression is not the optimal statistical tool for complex interaction analysis. Furthermore, it has been recommended that such interactions should only be examined when there is a biological rationale for a potential interaction. For example, an interaction term ‘Bishop score ≥ 4’ AND ‘previous preterm delivery’ or ‘Bishop score ≥ 4’ AND ‘parity’ would provide more information than would each of the single variables alone.

In conclusion, in our population the results indicate that the probability of preterm delivery can be predicted by CLEOPATRA II scores. The discriminatory power was excellent. Further studies are needed in order to investigate the usefulness of our risk score in other centers according to the availability of equipment and laboratory examination facilities.
