Propensity Score Matching in Randomized Clinical Trials
Article first published online: 30 NOV 2009
© 2009, The International Biometric Society
Volume 66, Issue 3, pages 813–823, September 2010
How to Cite
Xu, Z. and Kalbfleisch, J. D. (2010), Propensity Score Matching in Randomized Clinical Trials. Biometrics, 66: 813–823. doi: 10.1111/j.1541-0420.2009.01364.x
- Issue published online: 30 NOV 2009
- Received December 2008. Revised September 2009. Accepted September 2009.
Keywords
- Clustered randomized trial
- Experimental design
- Optimal full matching
- Propensity score matching
- Randomization study
Summary
Cluster randomization trials with relatively few clusters have been widely used in recent years to evaluate health-care strategies. On average, randomized treatment assignment balances both known and unknown confounding factors between treatment groups; in practice, however, investigators can introduce only a small amount of stratification and cannot balance all the important variables simultaneously. This limitation is especially acute in small studies with many confounding variables. Such is the case in the INSTINCT trial, designed to investigate the effectiveness of an education program in increasing the use of tissue plasminogen activator (tPA) in stroke patients. In this article, we introduce a new randomization design, the balance match weighted (BMW) design, which applies the technique of optimal matching with constraints to a prospective randomized design and aims to minimize the mean squared error (MSE) of the treatment effect estimator. A simulation study shows that, under various confounding scenarios, the BMW design can yield substantial reductions in the MSE of the treatment effect estimator compared with a completely randomized or matched-pair design. The BMW design is also compared, in terms of efficiency and robustness of the treatment effect estimator, with a model-based approach that adjusts for the estimated propensity score and with the Robins-Mark-Newey E-estimation procedure. These investigations suggest that the BMW design is more robust and usually, although not always, more efficient than either of these approaches. The design is also seen to be robust against heterogeneous error. We illustrate these methods by proposing a design for the INSTINCT trial.
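The motivating point of the summary, that chance covariate imbalance inflates the MSE of the treatment effect estimator when clusters are few, can be illustrated with a small simulation. The sketch below is not the BMW design: it does not perform propensity score estimation or optimal full matching with constraints. As a simplified stand-in, it draws several candidate allocations and keeps the one with the smallest covariate imbalance, then compares its MSE with that of a single complete randomization. The function names, the imbalance criterion, the outcome model, and all parameter values are assumptions chosen only for illustration.

```python
import random
import statistics


def random_allocation(n, rng):
    """Assign half of the n clusters to treatment (1), the rest to control (0)."""
    assign = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(assign)
    return assign


def imbalance(x, assign):
    """Hypothetical balance criterion: sum over covariates of the
    absolute difference in arm means."""
    treated = [row for row, a in zip(x, assign) if a == 1]
    control = [row for row, a in zip(x, assign) if a == 0]
    return sum(
        abs(statistics.fmean(r[j] for r in treated)
            - statistics.fmean(r[j] for r in control))
        for j in range(len(x[0]))
    )


def best_of_k_allocation(x, k, rng):
    """Draw k candidate allocations and keep the best-balanced one."""
    candidates = [random_allocation(len(x), rng) for _ in range(k)]
    return min(candidates, key=lambda a: imbalance(x, a))


def difference_in_means(y, assign):
    """Unadjusted treatment effect estimate."""
    treated = [v for v, a in zip(y, assign) if a == 1]
    control = [v for v, a in zip(y, assign) if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def simulate_mse(n_clusters=24, n_cov=6, k=50, n_rep=200, effect=1.0, seed=0):
    """Monte Carlo MSE of the difference-in-means estimator under each design.

    Assumed outcome model: y = effect * treatment + sum(covariates) + noise.
    """
    rng = random.Random(seed)
    sq_err = {"complete": [], "best_of_k": []}
    for _ in range(n_rep):
        x = [[rng.gauss(0, 1) for _ in range(n_cov)] for _ in range(n_clusters)]
        noise = [rng.gauss(0, 1) for _ in range(n_clusters)]
        designs = {
            "complete": random_allocation(n_clusters, rng),
            "best_of_k": best_of_k_allocation(x, k, rng),
        }
        for name, assign in designs.items():
            y = [effect * a + sum(row) + e
                 for row, a, e in zip(x, assign, noise)]
            sq_err[name].append((difference_in_means(y, assign) - effect) ** 2)
    return {name: statistics.fmean(errs) for name, errs in sq_err.items()}


if __name__ == "__main__":
    for design, mse in simulate_mse().items():
        print(f"{design}: MSE = {mse:.3f}")
```

Under this assumed outcome model, in which every covariate is prognostic, the balance-seeking allocation should show a lower MSE than complete randomization. The size of the gap depends on the number of clusters, the number of covariates, and how strongly the covariates predict the outcome.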