Volume 33, Issue 10
Research Article

Metrics for covariate balance in cohort studies of causal effects

Jessica M. Franklin

Corresponding Author

Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.

Correspondence to: Jessica M. Franklin, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 1620 Tremont St., Suite 3030, Boston, MA 02120, U.S.A.

E‐mail: jmfranklin@partners.org

Jeremy A. Rassen

Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.

Diana Ackermann

Department Global Epidemiology, Boehringer Ingelheim GmbH, Ingelheim, Germany

Dorothee B. Bartels

Department Global Epidemiology, Boehringer Ingelheim GmbH, Ingelheim, Germany

Department of Epidemiology, Hannover Medical School, Hannover, Germany

Sebastian Schneeweiss

Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.

First published: 09 December 2013
Citations: 87

Abstract

Inferring causation from non‐randomized studies of exposure requires that exposure groups can be balanced with respect to prognostic factors for the outcome. Although there is broad agreement in the literature that balance should be checked, there is confusion regarding the appropriate metric. We present a simulation study that compares several balance metrics with respect to the strength of their association with bias in estimation of the effect of a binary exposure on a binary, count, or continuous outcome. The simulations utilize matching on the propensity score with successively decreasing calipers to produce datasets with varying covariate balance. We propose the post‐matching C‐statistic as a balance metric and find that it had consistently strong associations with estimation bias, even when the propensity score model was misspecified, as long as the propensity score was estimated with sufficient study size. This metric, along with the average standardized difference and the general weighted difference, outperformed all other metrics considered in association with bias, including the unstandardized absolute difference, Kolmogorov–Smirnov and Lévy distances, overlapping coefficient, Mahalanobis balance, and L1 metrics. Of the best‐performing metrics, the C‐statistic and general weighted difference also have the advantage that they automatically evaluate balance on all covariates simultaneously and can easily incorporate balance on interactions among covariates. Therefore, when combined with the usual practice of comparing individual covariate means and standard deviations across exposure groups, these metrics may provide useful summaries of the observed covariate imbalance. Copyright © 2013 John Wiley & Sons, Ltd.
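As a rough illustration of two of the metrics named in the abstract, the sketch below computes an average standardized difference and a C-statistic in plain Python. It assumes the common textbook definitions (absolute difference in means divided by the pooled standard deviation, and the probability that a randomly chosen exposed subject has a higher score than a randomly chosen unexposed subject); the abstract does not state the article's exact formulas, so these may differ in detail. All function names and the toy data are hypothetical. For the post-matching C-statistic, the scores would presumably come from a propensity score model re-estimated in the matched sample, a step that is not shown here.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(exposed, unexposed):
    """Absolute difference in means divided by the pooled standard deviation.

    Assumed definition; the pooled SD averages the two sample variances.
    """
    pooled_sd = sqrt((variance(exposed) + variance(unexposed)) / 2)
    diff = abs(mean(exposed) - mean(unexposed))
    if pooled_sd == 0:
        # No variation in either group: balanced only if the means agree.
        return 0.0 if diff == 0 else float("inf")
    return diff / pooled_sd


def average_standardized_difference(exposed_rows, unexposed_rows):
    """Mean of the per-covariate standardized differences.

    Each row is a list of covariate values for one subject.
    """
    n_covariates = len(exposed_rows[0])
    diffs = [
        standardized_difference(
            [row[j] for row in exposed_rows],
            [row[j] for row in unexposed_rows],
        )
        for j in range(n_covariates)
    ]
    return sum(diffs) / n_covariates


def c_statistic(scores_exposed, scores_unexposed):
    """Share of exposed/unexposed pairs in which the exposed subject has the
    higher score, counting ties as one half. A value near 0.5 means the
    scores do not separate the groups."""
    wins = 0.0
    for e in scores_exposed:
        for u in scores_unexposed:
            if e > u:
                wins += 1.0
            elif e == u:
                wins += 0.5
    return wins / (len(scores_exposed) * len(scores_unexposed))


# Toy data (hypothetical): columns are age and a binary comorbidity flag.
exposed_rows = [[60, 1], [70, 0], [65, 1]]
unexposed_rows = [[50, 0], [55, 1], [60, 0]]
asd = average_standardized_difference(exposed_rows, unexposed_rows)
c_stat = c_statistic([0.7, 0.6, 0.5], [0.5, 0.4, 0.3])
```

Under these definitions, an average standardized difference near 0 and a C-statistic near 0.5 would both indicate that the compared groups look similar on the measured covariates.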

