Metrics for covariate balance in cohort studies of causal effects

Authors

  • Jessica M. Franklin,

    Corresponding author
    1. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
    • Correspondence to: Jessica M. Franklin, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 1620 Tremont St., Suite 3030, Boston, MA 02120, U.S.A.

      E-mail: jmfranklin@partners.org

  • Jeremy A. Rassen,

    1. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.
  • Diana Ackermann,

    1. Department Global Epidemiology, Boehringer Ingelheim GmbH, Ingelheim, Germany
  • Dorothee B. Bartels,

    1. Department Global Epidemiology, Boehringer Ingelheim GmbH, Ingelheim, Germany
    2. Department of Epidemiology, Hannover Medical School, Hannover, Germany
  • Sebastian Schneeweiss

    1. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, U.S.A.

Abstract

Inferring causation from non-randomized studies of exposure requires that exposure groups can be balanced with respect to prognostic factors for the outcome. Although there is broad agreement in the literature that balance should be checked, there is confusion regarding the appropriate metric. We present a simulation study that compares several balance metrics with respect to the strength of their association with bias in estimation of the effect of a binary exposure on a binary, count, or continuous outcome. The simulations utilize matching on the propensity score with successively decreasing calipers to produce datasets with varying covariate balance. We propose the post-matching C-statistic as a balance metric and find that it has consistently strong associations with estimation bias, even when the propensity score model is misspecified, as long as the propensity score is estimated with sufficient study size. This metric, along with the average standardized difference and the general weighted difference, outperformed all other metrics considered in association with bias, including the unstandardized absolute difference, Kolmogorov–Smirnov and Lévy distances, overlapping coefficient, Mahalanobis balance, and L1 metrics. Of the best-performing metrics, the C-statistic and general weighted difference also have the advantage that they automatically evaluate balance on all covariates simultaneously and can easily incorporate balance on interactions among covariates. Therefore, when combined with the usual practice of comparing individual covariate means and standard deviations across exposure groups, these metrics may provide useful summaries of the observed covariate imbalance. Copyright © 2013 John Wiley & Sons, Ltd.
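
As a rough illustration of two of the metrics named in the abstract, the following Python sketch computes an average standardized difference and a C-statistic for a matched sample. It is not the authors' implementation. The function names, the toy data, and the pooled-standard-deviation form of the standardized difference are assumptions made for this example. The C-statistic function also assumes that propensity scores have already been re-estimated in the matched sample by some separate model-fitting step that is not shown.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(exposed, unexposed):
    """Absolute difference in means divided by the pooled standard deviation.

    Assumed definition for this sketch: pooled SD = sqrt((s1^2 + s0^2) / 2).
    """
    diff = abs(mean(exposed) - mean(unexposed))
    pooled_sd = sqrt((variance(exposed) + variance(unexposed)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced only if the constants agree.
        return 0.0 if diff == 0 else float("inf")
    return diff / pooled_sd


def average_standardized_difference(covariates, exposure):
    """Mean of the per-covariate standardized differences.

    covariates: dict mapping covariate name -> list of values (hypothetical layout)
    exposure:   list of 0/1 exposure indicators, aligned with the values
    """
    diffs = []
    for values in covariates.values():
        exposed = [v for v, a in zip(values, exposure) if a == 1]
        unexposed = [v for v, a in zip(values, exposure) if a == 0]
        diffs.append(standardized_difference(exposed, unexposed))
    return mean(diffs)


def c_statistic(scores, exposure):
    """Probability that a random exposed subject has a higher score than a
    random unexposed subject, counting ties as one half.

    Values near 0.5 suggest the scores no longer discriminate between
    exposure groups, i.e. the covariates in the score model are balanced.
    """
    exposed = [s for s, a in zip(scores, exposure) if a == 1]
    unexposed = [s for s, a in zip(scores, exposure) if a == 0]
    concordant = 0.0
    for e in exposed:
        for u in unexposed:
            if e > u:
                concordant += 1.0
            elif e == u:
                concordant += 0.5
    return concordant / (len(exposed) * len(unexposed))


if __name__ == "__main__":
    # Toy matched sample (invented values, for illustration only).
    exposure = [1, 1, 1, 0, 0, 0]
    covariates = {"age": [60, 70, 80, 60, 70, 80], "score": [1, 2, 3, 2, 3, 4]}
    refit_ps = [0.2, 0.5, 0.8, 0.2, 0.5, 0.8]  # assumed re-estimated scores
    print(average_standardized_difference(covariates, exposure))
    print(c_statistic(refit_ps, exposure))
```

The pairwise loop in `c_statistic` is quadratic in sample size and is chosen here for readability; a rank-based computation would be preferable for large cohorts.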
