Drug safety data mining with a tree-based scan statistic

Authors

  • Martin Kulldorff,

    Corresponding author
    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
  • Inna Dashevsky,

    1. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
  • Taliser R. Avery,

    1. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
  • Arnold K. Chan,

    1. Harvard School of Public Health, Boston, MA, USA
    2. i3 Drug Safety, Waltham, MA, USA
  • Robert L. Davis,

    1. Kaiser Permanente Georgia, Atlanta, GA, USA
  • David Graham,

    1. Office of Drug Safety, Food and Drug Administration, Rockville, MD, USA
  • Richard Platt,

    1. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
    2. The HMO Research Network Center for Education and Research in Therapeutics
  • Susan E. Andrade,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Meyers Primary Care Institute, University of Massachusetts Medical School, the Fallon Foundation, and Fallon Community Health Plan, Worcester, MA, USA
  • Denise Boudreau,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Group Health Research Institute, Seattle, WA, USA
  • Margaret J. Gunter,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Lovelace Clinic Foundation, Albuquerque, NM, USA
  • Lisa J. Herrinton,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Kaiser Permanente Northern California, Oakland, CA, USA
  • Pamala A. Pawloski,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. HealthPartners Institute for Research and Education, Minneapolis, MN, USA
  • Marsha A. Raebel,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Kaiser Permanente Colorado, Denver, CO, USA
  • Douglas Roblin,

    1. The HMO Research Network Center for Education and Research in Therapeutics
    2. Kaiser Permanente Georgia, Atlanta, GA, USA
  • Jeffrey S. Brown

    1. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA
    2. The HMO Research Network Center for Education and Research in Therapeutics

Correspondence to: M. Kulldorff, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 133 Brookline Avenue, 6th Floor, Boston, MA 02215, USA. E-mail: martin_kulldorff@hms.harvard.edu

ABSTRACT

Purpose

In post-marketing drug safety surveillance, data mining can potentially detect rare but serious adverse events. An entire collection of drug–event pairs is traditionally assessed at a single predefined level of granularity. However, it is unknown a priori whether a drug causes one very specific adverse event or a set of related adverse events, such as mitral valve disorders, all valve disorders, or different types of heart disease. This methodological paper evaluates the tree-based scan statistic as a data mining method to enhance drug safety surveillance.

Methods

We used a three-million-member electronic health records database from the HMO Research Network. Using the tree-based scan statistic, we assessed the safety of selected antifungal and diabetes drugs, simultaneously evaluating overlapping diagnosis groups at different levels of granularity while adjusting for multiple testing. Expected and observed adverse event counts were adjusted for age, sex, and health plan, producing a log likelihood ratio test statistic.
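
The idea of scanning overlapping diagnosis groups can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the conditional Poisson form of the log likelihood ratio and a Monte Carlo permutation test to adjust for multiple testing, and the small diagnosis tree, the counts, and all function names are hypothetical. The expected counts are taken as already adjusted for age, sex, and health plan.

```python
import math
import random

# Hypothetical diagnosis tree: each node maps to its parent (None for the root).
TREE = {
    "heart disease": None,
    "valve disorders": "heart disease",
    "arrhythmia": "heart disease",
    "mitral valve disorders": "valve disorders",
    "aortic valve disorders": "valve disorders",
}
# Invented leaf-level counts; expected counts sum to the observed total.
OBSERVED = {"mitral valve disorders": 12, "aortic valve disorders": 9, "arrhythmia": 9}
EXPECTED = {"mitral valve disorders": 5.0, "aortic valve disorders": 5.0, "arrhythmia": 20.0}

def node_totals(tree, leaf_values):
    """Add each leaf value to the leaf itself and to every ancestor."""
    totals = {}
    for leaf, value in leaf_values.items():
        node = leaf
        while node is not None:
            totals[node] = totals.get(node, 0) + value
            node = tree[node]
    return totals

def llr(c, n, total):
    """Conditional Poisson log likelihood ratio; zero unless there is an excess."""
    if c <= n or n <= 0:
        return 0.0
    inside = c * math.log(c / n)
    # By convention 0 * log(0) = 0 when every event falls inside the node.
    outside = (total - c) * math.log((total - c) / (total - n)) if c < total else 0.0
    return inside + outside

def scan(tree, observed_leaf, expected_leaf):
    """Return the node (at any level of the tree) with the largest LLR."""
    total = sum(observed_leaf.values())
    obs = node_totals(tree, observed_leaf)
    exp = node_totals(tree, expected_leaf)
    scores = {g: llr(obs.get(g, 0), exp[g], total) for g in exp}
    best = max(scores, key=scores.get)
    return best, scores[best]

def monte_carlo_p(tree, observed_leaf, expected_leaf, n_sim=999, seed=0):
    """P-value for the maximum LLR, which accounts for all nodes scanned."""
    rng = random.Random(seed)
    observed_max = scan(tree, observed_leaf, expected_leaf)[1]
    leaves = list(expected_leaf)
    weights = [expected_leaf[leaf] for leaf in leaves]
    total = sum(observed_leaf.values())
    exceed = 0
    for _ in range(n_sim):
        # Under the null, events fall on leaves in proportion to expected counts.
        sim = dict.fromkeys(leaves, 0)
        for leaf in rng.choices(leaves, weights=weights, k=total):
            sim[leaf] += 1
        if scan(tree, sim, expected_leaf)[1] >= observed_max:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

best_node, best_llr = scan(TREE, OBSERVED, EXPECTED)
p_value = monte_carlo_p(TREE, OBSERVED, EXPECTED)
print(best_node, round(best_llr, 2), p_value)
```

With these invented counts, the intermediate node "valve disorders" scores higher than either of its leaves, which is the situation a single fixed level of granularity could miss. Because each simulated data set is compared on its maximum LLR over the whole tree, the resulting p-value is adjusted for the number of groupings evaluated.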

Results

Out of 732 evaluated disease groupings, 24 were statistically significant, divided among 10 non-overlapping disease categories. Five of the 10 signals are known adverse effects, four are likely due to confounding by indication, while one may warrant further investigation.

Conclusion

The tree-based scan statistic can be successfully applied as a data mining tool in drug safety surveillance using observational data. The total number of statistical signals was modest, and a signal does not by itself imply a causal relationship. Rather, data mining results should be used to generate candidate drug–event pairs for rigorous epidemiological studies to evaluate the individual and comparative safety profiles of drugs. Copyright © 2013 John Wiley & Sons, Ltd.