Volume 36, Issue 14
Research Article

Firth's logistic regression with rare events: accurate effect estimates and predictions?

Rainer Puhr

The Kirby Institute, University of New South Wales, Sydney, Australia

Georg Heinze

Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria

Mariana Nold

Institute of Medical Statistics, Computer Sciences and Documentation, University Hospital Jena, Jena, Germany

Lara Lusa

Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia

Angelika Geroldinger

Corresponding Author

E-mail address: angelika.geroldinger@meduniwien.ac.at

Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria

Correspondence to: Angelika Geroldinger, Medical University of Vienna, Vienna, Austria.

First published: 12 March 2017
Citations: 18

Abstract

Firth's logistic regression has become a standard approach for the analysis of binary outcomes with small samples. Although it reduces the bias in maximum likelihood estimates of coefficients, it introduces bias towards one‐half in the predicted probabilities. The stronger the imbalance of the outcome, the more severe the bias in the predicted probabilities. We propose two simple modifications of Firth's logistic regression that result in unbiased predicted probabilities. The first corrects the predicted probabilities by a post hoc adjustment of the intercept. The second is based on an alternative formulation of Firth's penalization as an iterative data augmentation procedure; the modification consists of introducing an indicator variable that distinguishes between original and pseudo‐observations in the augmented data. In a comprehensive simulation study, these approaches are compared with other attempts to improve predictions based on Firth's penalization and with other published penalization strategies intended for routine use, including a recently suggested compromise between maximum likelihood and Firth's logistic regression. Simulation results are scrutinized with regard to prediction and effect estimation. We find that both suggested methods not only give unbiased predicted probabilities but also improve the accuracy conditional on explanatory variables compared with Firth's penalization. While one method results in effect estimates identical to those of Firth's penalization, the other introduces some bias, but this is compensated by a decrease in the mean squared error. Finally, all methods considered are illustrated and compared for a study on arterial closure devices in minimally invasive cardiac surgery. Copyright © 2017 John Wiley & Sons, Ltd.
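
The first modification described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the usual Newton-type fit of Firth's penalized likelihood through the hat-matrix-modified score, and it assumes that the post hoc adjustment amounts to re-estimating only the intercept by maximum likelihood while the Firth slope estimates are held fixed. All function names and the simulated data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def firth_logistic(X, y, max_iter=200, tol=1e-8, max_step=5.0):
    """Firth-penalized logistic regression; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = sigmoid(X @ beta)
        w = pi * (1.0 - pi)
        info_inv = np.linalg.inv(X.T @ (X * w[:, None]))  # inverse Fisher information
        # Diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth's modified score: X'(y - pi + h * (1/2 - pi))
        score = X.T @ (y - pi + h * (0.5 - pi))
        step = info_inv @ score
        largest = np.max(np.abs(step))
        if largest > max_step:  # cap the step size for stability
            step *= max_step / largest
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def adjust_intercept(X, y, beta, max_iter=100, tol=1e-10):
    """Assumed post hoc adjustment: ML re-fit of the intercept, slopes as offset."""
    offset = X[:, 1:] @ beta[1:]
    b0 = beta[0]
    for _ in range(max_iter):
        pi = sigmoid(b0 + offset)
        step = np.sum(y - pi) / np.sum(pi * (1.0 - pi))  # 1-D Newton step
        b0 += step
        if abs(step) < tol:
            break
    adjusted = beta.copy()
    adjusted[0] = b0
    return adjusted

# Hypothetical rare-event data (roughly 5% events), for illustration only.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = rng.binomial(1, sigmoid(-3.0 + 0.5 * x)).astype(float)

beta_firth = firth_logistic(X, y)
beta_adjusted = adjust_intercept(X, y, beta_firth)

print("observed event proportion:      ", y.mean())
print("mean prediction, Firth:         ", sigmoid(X @ beta_firth).mean())
print("mean prediction, adjusted model:", sigmoid(X @ beta_adjusted).mean())
```

Under these assumptions the slope estimates are unchanged by the adjustment, which matches the abstract's statement that one of the two methods yields effect estimates identical to those of Firth's penalization, while the average predicted probability is pulled back to the observed event proportion.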
