Volume 33, Issue 14
Research Article

A new calibration test and a reappraisal of the calibration belt for the assessment of prediction models based on dichotomous outcomes

Giovanni Nattino

Corresponding Author

GiViTI Coordinating Center, Laboratory of Clinical Epidemiology, IRCCS ‐ Istituto di Ricerche Farmacologiche ‘Mario Negri’, Villa Camozzi, Ranica (BG), Italy

Correspondence to: Giovanni Nattino, GiViTI Coordinating Center, Laboratory of Clinical Epidemiology, IRCCS ‐ Istituto di Ricerche Farmacologiche ‘Mario Negri’, Villa Camozzi, Ranica (BG), Italy.

E‐mail: giovanni.nattino@marionegri.it

Stefano Finazzi

INO‐CNR BEC Center and Dipartimento di Fisica, Università di Trento, via Sommarive 14, 38123 Povo, Trento, Italy

Guido Bertolini

GiViTI Coordinating Center, Laboratory of Clinical Epidemiology, IRCCS ‐ Istituto di Ricerche Farmacologiche ‘Mario Negri’, Villa Camozzi, Ranica (BG), Italy

First published: 04 February 2014
Citations: 41

Abstract

Calibration is one of the main properties that any predictive model must satisfy. To overcome the limitations of many earlier approaches, a recent study proposed the calibration belt, a graphical tool that identifies the ranges of probability in which a model for a dichotomous outcome is miscalibrated. In this approach, the relation between the logit of the probability predicted by the model and the logit of the event rate observed in a sample is represented by a polynomial function, whose coefficients are fitted and whose degree is fixed by a series of likelihood‐ratio tests. Here we propose a test associated with the calibration belt and show how the algorithm that selects the polynomial degree affects the distribution of the test statistic. We calculate the exact distribution of the statistic and confirm its validity by numerical simulation. Starting from this distribution, we then reappraise the procedure for constructing the calibration belt and illustrate an application in the medical context. Copyright © 2014 John Wiley & Sons, Ltd.
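
The polynomial-logit relation and the degree selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_polynomial`, `select_degree`, `simulate_calibrated`), the starting degree of 1, the maximum degree of 4, and the use of the ordinary chi-square (1 df) 95% critical value at each step are all assumptions made for illustration. In particular, the sketch stops at degree selection and does not reproduce the distribution of the final test statistic, which the abstract says is affected by the selection algorithm.

```python
import math
import random

CHI2_1DF_95 = 3.841  # assumed per-step threshold: 95th percentile of chi-square, 1 df

def logit(p):
    return math.log(p / (1.0 - p))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_polynomial(g, y, degree, max_iter=50, tol=1e-9):
    """Fit logit(P(y=1)) = sum_k beta[k] * g**k (degree >= 1) by Newton-Raphson.

    g holds the logits of the predicted probabilities, y the 0/1 outcomes.
    Returns (beta, log_likelihood).
    """
    d = degree + 1
    rows = [[gi ** k for k in range(d)] for gi in g]
    beta = [0.0] * d
    beta[1] = 1.0  # start from perfect calibration: identity on the logit scale
    for _ in range(max_iter):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for x, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, eta))))
            w = mu * (1.0 - mu)
            for j in range(d):
                grad[j] += (yi - mu) * x[j]
                for k in range(d):
                    hess[j][k] += w * x[j] * x[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    loglik = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        # Numerically stable form of yi*eta - log(1 + exp(eta)).
        loglik += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return beta, loglik

def select_degree(g, y, max_degree=4, threshold=CHI2_1DF_95):
    """Forward selection: raise the degree while the likelihood-ratio statistic
    for the added term exceeds the (assumed) threshold."""
    degree = 1
    beta, loglik = fit_polynomial(g, y, degree)
    while degree < max_degree:
        beta_next, loglik_next = fit_polynomial(g, y, degree + 1)
        if 2.0 * (loglik_next - loglik) <= threshold:
            break
        degree, beta, loglik = degree + 1, beta_next, loglik_next
    return degree, beta, loglik

def simulate_calibrated(n, seed):
    """Hypothetical data: outcomes drawn from the predicted probabilities themselves."""
    rng = random.Random(seed)
    p = [rng.uniform(0.05, 0.95) for _ in range(n)]
    y = [1 if rng.random() < pi else 0 for pi in p]
    return [logit(pi) for pi in p], y

g, y = simulate_calibrated(2000, seed=1)
degree, beta, loglik = select_degree(g, y)
print(degree, [round(b, 3) for b in beta])
```

For a well-calibrated model the fitted polynomial should stay close to the identity on the logit scale, that is, an intercept near 0 and a linear coefficient near 1.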

