Volume 20, Issue 11
Original Report

Measuring balance and model selection in propensity score methods

Svetlana V. Belitser

Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands

Edwin P. Martens

Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands

Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, Netherlands

Wiebe R. Pestman

Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands

Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, Netherlands

Rolf H.H. Groenwold

Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands

Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, Netherlands

Anthonius de Boer

Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands

Olaf H. Klungel

Corresponding Author

Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands

Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, Netherlands

O. H. Klungel, Department of Pharmacoepidemiology and Pharmacotherapy, Utrecht Institute of Pharmaceutical Sciences, Utrecht University, Sorbonnelaan 16, 3584 CA Utrecht, Netherlands. E‐mail: o.h.klungel@uu.nl
First published: 29 July 2011
Citations: 43

ABSTRACT

Purpose

Propensity score (PS) methods focus on balancing confounders between groups to estimate an unbiased treatment or exposure effect. However, little attention is paid to actually measuring, reporting, and using information on balance, for instance for model selection. We propose using a measure of balance in PS methods and describe several such measures: the overlapping coefficient, the Kolmogorov‐Smirnov distance, and the Lévy distance.

Methods

We performed simulation studies to estimate the association between bias (i.e., the discrepancy between the true and the estimated treatment effect) and measures of balance: the three measures above and several mean‐based measures.

Results

For large sample sizes (n = 2000), the average Pearson correlation coefficients between bias and the Kolmogorov‐Smirnov distance (r = 0.89), the Lévy distance (r = 0.89), and the absolute standardized mean difference (r = 0.90) were similar, whereas the correlation with the overlapping coefficient was weaker (r = −0.42). When the sample size decreased to 400, mean‐based measures of balance had stronger correlations with bias. Models including all confounding variables, their squares, and interaction terms resulted in smaller bias than models that included only main terms for confounding variables.

Conclusions

We conclude that measures for balance are useful for reporting the amount of balance reached in propensity score analysis and can be helpful in selecting the final PS model. Copyright © 2011 John Wiley & Sons, Ltd.
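To make the balance measures named in the abstract concrete, the following is a minimal Python sketch for a single covariate compared between a treated and a control group. It is not the authors' implementation: the function names, the pooled standard deviation used for the standardized difference, and the histogram-based estimate of the overlapping coefficient (with a hypothetical `bins` parameter) are all assumptions made for illustration. The Lévy distance is omitted to keep the sketch short.

```python
import numpy as np


def abs_standardized_mean_difference(x_treated, x_control):
    """Absolute difference in means divided by a pooled standard deviation.

    Assumption: the pooled SD is the square root of the average of the two
    sample variances; other conventions exist.
    """
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    pooled_sd = np.sqrt((x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0)
    return float(abs(x_treated.mean() - x_control.mean()) / pooled_sd)


def ks_distance(x_treated, x_control):
    """Kolmogorov-Smirnov distance: the largest vertical gap between the two
    empirical cumulative distribution functions."""
    x_treated = np.sort(np.asarray(x_treated, dtype=float))
    x_control = np.sort(np.asarray(x_control, dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([x_treated, x_control])
    cdf_t = np.searchsorted(x_treated, grid, side="right") / x_treated.size
    cdf_c = np.searchsorted(x_control, grid, side="right") / x_control.size
    return float(np.max(np.abs(cdf_t - cdf_c)))


def overlapping_coefficient(x_treated, x_control, bins=20):
    """Overlapping coefficient: shared area under the two distributions.

    Assumption: densities are approximated with a common histogram; a kernel
    density estimate would be an alternative. 1 means identical histograms,
    0 means no overlap.
    """
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    lo = min(x_treated.min(), x_control.min())
    hi = max(x_treated.max(), x_control.max())
    edges = np.linspace(lo, hi, bins + 1)
    count_t, _ = np.histogram(x_treated, bins=edges)
    count_c, _ = np.histogram(x_control, bins=edges)
    prop_t = count_t / count_t.sum()
    prop_c = count_c / count_c.sum()
    return float(np.minimum(prop_t, prop_c).sum())
```

For the two distance measures and the standardized difference, smaller values indicate better balance; for the overlapping coefficient, larger values do, which is consistent with the negative sign of its reported correlation with bias.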

