Prediction of spontaneous preterm delivery in women with threatened preterm labour: a prospective cohort study of multiple proteins in maternal serum



I was interested to read the paper by Tsiartas and colleagues1 published in the June 2012 issue of BJOG. The authors reported the prediction of preterm labour by building a model using a cohort data set. As the authors point out in their conclusion, such a model needs to be tested in a new cohort in order to confirm its predictive ability. Why did the authors not divide their cohort into two, developing the model on one half and testing it on the other, or use other methods such as bootstrapping to test the reliability of their model?
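The two internal validation approaches raised above can be sketched as follows. This is only an illustration under stated assumptions: the cohort is synthetic, the "model" is a single hypothetical marker with a cut-off chosen by Youden's index, and every function name is invented for this sketch. It does not reproduce the authors' multivariable model.

```python
import random

def make_synthetic_cohort(n=120, seed=42):
    """Hypothetical data: (marker value, outcome) pairs, NOT from the paper."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        outcome = 1 if rng.random() < 0.35 else 0
        marker = rng.gauss(1.0 if outcome else 0.0, 1.0)
        cohort.append((marker, outcome))
    return cohort

def split_half(records, seed=0):
    """Shuffle a copy of the records and split it into two halves."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def sens_spec(records, threshold):
    """Sensitivity and specificity of the rule 'marker >= threshold'."""
    tp = fn = tn = fp = 0
    for marker, outcome in records:
        positive = marker >= threshold
        if outcome == 1:
            tp += positive
            fn += not positive
        else:
            fp += positive
            tn += not positive
    # Return 0.0 when a class is absent, to avoid division by zero.
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec

def youden(records, threshold):
    sens, spec = sens_spec(records, threshold)
    return sens + spec - 1

def fit_threshold(records):
    """'Develop' the toy model: pick the cut-off that maximises Youden's J."""
    candidates = sorted({marker for marker, _ in records})
    return max(candidates, key=lambda t: youden(records, t))

def bootstrap_optimism(records, n_boot=100, seed=1):
    """Mean of (apparent J in resample) - (J of the same cut-off in the original data)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample = [rng.choice(records) for _ in records]
        t = fit_threshold(resample)
        diffs.append(youden(resample, t) - youden(records, t))
    return sum(diffs) / len(diffs)

cohort = make_synthetic_cohort()

# Split-sample validation: develop on one half, test on the other.
development, validation = split_half(cohort)
cutoff = fit_threshold(development)
print("J in development half:", round(youden(development, cutoff), 3))
print("J in validation half: ", round(youden(validation, cutoff), 3))

# Bootstrap validation: estimate how optimistic the apparent performance is.
apparent = youden(cohort, fit_threshold(cohort))
optimism = bootstrap_optimism(cohort)
print("Optimism-corrected J: ", round(apparent - optimism, 3))
```

A drop in performance between the development and validation halves, or a sizeable bootstrap optimism, would suggest that the apparent accuracy of a model overstates what can be expected in a new cohort.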

They reported a sensitivity of 74%, a specificity of 87%, a positive predictive value (PPV) of 76%, a negative predictive value (NPV) of 86%, a likelihood ratio of 5.8 and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.88, concluding that serum proteins and cervical length constituted the best prediction model for the mentioned outcome.1 Sensitivity, specificity, PPV, NPV, positive likelihood ratio (sensitivity/[1 – specificity]) and negative likelihood ratio ([1 – sensitivity]/specificity), as well as the diagnostic odds ratio ([true positives × true negatives]/[false positives × false negatives], preferably with a value of more than 50), are among the measures used to evaluate the validity (i.e. accuracy) of a single test compared with a gold standard.2–4
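For readers who wish to check such figures, the measures listed above can all be derived from a single 2 × 2 table. The sketch below uses the standard textbook definitions; the function name and the counts in the example are hypothetical and are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures of a test against a gold standard, from 2 x 2 counts."""
    sens = tp / (tp + fn)          # true positives among those with the outcome
    spec = tn / (tn + fp)          # true negatives among those without it
    ppv = tp / (tp + fp)           # outcome present among test positives
    npv = tn / (tn + fn)           # outcome absent among test negatives
    inf = float("inf")
    lr_pos = sens / (1 - spec) if spec < 1 else inf
    lr_neg = (1 - sens) / spec if spec > 0 else inf
    # Diagnostic odds ratio, equal to LR+ / LR-.
    dor = (tp * tn) / (fp * fn) if fp * fn else inf
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }

# Hypothetical counts, chosen only for illustration.
for name, value in diagnostic_metrics(tp=40, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity, specificity and the likelihood ratios, the PPV and NPV computed this way depend on how common the outcome is in the sample.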

The positive likelihood ratio ranges from 1 to infinity (LR+ = 1–infinity) and the negative likelihood ratio from 0 to 1 (LR− = 0–1); a value of 1 for either is the worst situation for the test, because the result then carries no information. An LR+ of 5.8 on its own says little about the predictive value, because it should be compared with another LR+. Moreover, considering the range of possible LR+ values, an LR+ of 5.8 seems unimpressive. In such a situation, it is better to at least report the LR− as well.2,4
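As a rough check, both likelihood ratios can be approximated from the reported sensitivity (74%) and specificity (87%), using the standard definitions. Because these inputs are rounded, the result for the positive ratio differs slightly from the 5.8 reported, and the negative ratio shown is only an approximation that the authors did not report.

```latex
\[
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}
               = \frac{0.74}{1 - 0.87} \approx 5.7,
\qquad
\mathrm{LR}^{-} = \frac{1 - \text{sensitivity}}{\text{specificity}}
               = \frac{1 - 0.74}{0.87} \approx 0.30
\]
```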

The area under the ROC curve is usually reported for the diagnostic rather than the prognostic value of a model. The ROC curve for a model may be comparable with the LR+ for a test because both use sensitivity and 1 – specificity; however, for the LR+ the former is divided by the latter, whereas for the ROC curve sensitivity is plotted against 1 – specificity.4
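The relationship between the two quantities can be made concrete with a minimal sketch. Each cut-off of a score gives one pair (1 – specificity, sensitivity); the ROC curve plots these pairs, the AUC is the area beneath them (here by the trapezoidal rule), and the LR+ at a given cut-off is the ratio of the same two numbers. The scores, labels and function names below are hypothetical.

```python
def roc_points(scores, labels):
    """ROC points (1 - specificity, sensitivity), one per distinct cut-off."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        cutoff = ranked[i][0]
        # Count every subject scoring exactly at this cut-off (handles ties).
        while i < len(ranked) and ranked[i][0] == cutoff:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def lr_positive(point):
    """LR+ at one ROC point: sensitivity divided by (1 - specificity)."""
    fpr, tpr = point
    return tpr / fpr if fpr > 0 else float("inf")

# Hypothetical scores and outcomes, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print("AUC:", round(auc(points), 3))
for point in points[1:]:
    print(point, "LR+ =", lr_positive(point))
```

The AUC therefore summarises performance over all cut-offs, whereas an LR+ describes a single cut-off.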