Confounding adjustment via a semi-automated high-dimensional propensity score algorithm: an application to electronic medical records
Article first published online: 30 JUN 2011
Copyright © 2011 John Wiley & Sons, Ltd.
Pharmacoepidemiology and Drug Safety
Volume 20, Issue 8, pages 849–857, August 2011
How to Cite
Toh, S., García Rodríguez, L. A. and Hernán, M. A. (2011), Confounding adjustment via a semi-automated high-dimensional propensity score algorithm: an application to electronic medical records. Pharmacoepidem. Drug Safe., 20: 849–857. doi: 10.1002/pds.2152
- Issue published online: 25 JUL 2011
- Manuscript Accepted: 22 MAR 2011
- Manuscript Revised: 19 MAR 2011
- Manuscript Received: 22 DEC 2010
- NIH. Grant Number: R01 HL080644
- Keywords: propensity score analysis
A semi-automated high-dimensional propensity score (hd-PS) algorithm has been proposed to adjust for confounding in claims databases. The feasibility of using this algorithm in other types of healthcare databases is unknown.
We estimated the comparative safety of traditional non-steroidal anti-inflammatory drugs (NSAIDs) and selective COX-2 inhibitors with respect to the risk of upper gastrointestinal bleeding (UGIB) in The Health Improvement Network, an electronic medical record (EMR) database in the UK. We then compared the adjusted effect estimates obtained when confounders were identified using expert knowledge with those obtained when they were identified by the semi-automated hd-PS algorithm.
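The abstract does not spell out how the hd-PS algorithm chooses its empirical covariates. As a rough illustration only, the sketch below assumes the Bross-style bias multiplier that is commonly described for hd-PS covariate prioritization: each binary candidate covariate is scored by the confounding bias it could induce if left unadjusted, and the top k are kept. The function names (`bias_multiplier`, `prioritize`) and all input values are hypothetical and are not taken from the study.

```python
import math


def bias_multiplier(p_c1, p_c0, rr_cd):
    """Bross-style multiplicative bias from one unadjusted binary covariate.

    p_c1:  covariate prevalence among the exposed
           (e.g. selective COX-2 inhibitor initiators)
    p_c0:  covariate prevalence among the comparator
           (e.g. traditional NSAID initiators)
    rr_cd: covariate-outcome relative risk
    """
    return (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)


def prioritize(candidates, k):
    """Return the names of the k candidates with the largest |log bias|."""
    scored = []
    for name, (p_c1, p_c0, rr_cd) in candidates.items():
        score = abs(math.log(bias_multiplier(p_c1, p_c0, rr_cd)))
        scored.append((score, name))
    # Largest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:k]]


# Hypothetical toy inputs (assumption): NOT values from the study.
candidates = {
    "code_A": (0.30, 0.10, 2.0),  # imbalanced and associated with outcome
    "code_B": (0.20, 0.20, 3.0),  # balanced across groups, so no bias
    "code_C": (0.05, 0.15, 1.5),  # mildly imbalanced
}

top = prioritize(candidates, k=2)
print(top)
```

In a full analysis the selected covariates (500 in this study) would then enter a propensity score model alongside, or instead of, the investigator-specified confounders; that modeling step is omitted here.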
Compared with the 411,616 traditional NSAID initiators, the crude odds ratio (OR) of UGIB was 1.50 (95% CI: 0.98, 2.28) for the 43,569 selective COX-2 inhibitor initiators. The OR dropped to 0.81 (0.52, 1.27) upon adjustment for known risk factors for UGIB that are typically available in both claims and EMR databases. The OR remained similar when further adjusting for covariates that are not typically recorded in claims databases (smoking, alcohol consumption, and body mass index; OR 0.81; 0.51, 1.26) or when adding 500 covariates empirically identified by the hd-PS algorithm (OR 0.78; 0.49, 1.22). Adjusting for age and sex plus 500 empirically identified covariates produced an OR of 0.87 (0.56, 1.34).
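The reported intervals can be sanity-checked with a little arithmetic. The sketch below assumes Wald-type 95% confidence intervals that are symmetric on the log scale (z = 1.96), which the abstract does not state explicitly. Under that assumption the point estimate should sit near the geometric mean of the two bounds, and the standard error of the log OR can be backed out from the interval width. The helper name `log_scale_summary` is hypothetical.

```python
import math

# (label, OR, lower bound, upper bound) as reported in the abstract
estimates = [
    ("crude", 1.50, 0.98, 2.28),
    ("known risk factors", 0.81, 0.52, 1.27),
    ("plus smoking, alcohol, BMI", 0.81, 0.51, 1.26),
    ("plus 500 hd-PS covariates", 0.78, 0.49, 1.22),
    ("age, sex, 500 hd-PS covariates", 0.87, 0.56, 1.34),
]


def log_scale_summary(lower, upper, z=1.96):
    """Return (geometric midpoint, SE of log OR) implied by a CI.

    Assumes the interval is exp(log(OR) +/- z * SE), i.e. symmetric
    on the log scale.
    """
    midpoint = math.sqrt(lower * upper)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, se


for label, odds_ratio, lower, upper in estimates:
    midpoint, se = log_scale_summary(lower, upper)
    print(
        f"{label}: reported OR {odds_ratio:.2f}, "
        f"CI midpoint {midpoint:.2f}, SE(log OR) {se:.3f}"
    )
```

Each implied midpoint falls within rounding distance of the reported OR, and every interval, including the crude one, contains 1.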
The hd-PS algorithm can be implemented in pharmacoepidemiologic studies that use primary care EMR databases such as The Health Improvement Network. For the NSAID–UGIB association for which major confounders are well known, further adjustment for covariates selected by the algorithm had little impact on the effect estimate.