The use of propensity scores to assess the generalizability of results from randomized trials


Elizabeth A. Stuart, Departments of Mental Health and Biostatistics, Johns Hopkins Bloomberg School of Public Health, 8th Floor, 624 North Broadway, Baltimore, MD 21205, USA.


Summary. Randomized trials remain the most accepted design for estimating the effects of interventions, but they do not necessarily answer a question of primary interest: will the programme be effective in a target population in which it may be implemented? In other words, are the results generalizable? There has been very little statistical research on how to assess the generalizability, or ‘external validity’, of randomized trials. We propose the use of propensity-score-based metrics to quantify the similarity between the participants in a randomized trial and a target population. In this setting the propensity score model predicts participation in the randomized trial, given a set of covariates. The resulting propensity scores are used first to quantify the difference between the trial participants and the target population, and then to match, subclassify or weight the control group outcomes to the population, assessing how well the propensity-score-adjusted outcomes track the outcomes that are actually observed in the population. These metrics can serve as a first step in assessing the generalizability of results from randomized trials to target populations. The paper lays out these ideas, discusses the assumptions underlying the approach and illustrates the metrics by using data on the evaluation of a schoolwide prevention programme called ‘Positive behavioral interventions and supports’.
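
As a rough illustration of the two steps described in the summary, the following Python sketch fits a participation model and then weights control outcomes to the population. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the data are simulated, there is a single covariate, the 'population' is taken to be the non-participants, and the control outcomes are weighted by the estimated odds of non-participation, (1 - e)/e, which is only one of several possible weighting schemes. All function and variable names (`fit_logistic`, `generalizability_check`, `ps_diff` and so on) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, s, iters=25):
    """Newton-Raphson fit of P(s = 1 | x) = sigmoid(b0 + b1 * x), one covariate."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, si in zip(x, s):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += si - p              # gradient, intercept
            g1 += (si - p) * xi       # gradient, slope
            h00 += w                  # entries of the 2 x 2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def mean(values):
    return sum(values) / len(values)


def generalizability_check(n=4000, seed=0):
    rng = random.Random(seed)
    # Simulated data (assumption): one covariate x drives both trial
    # participation s and the outcome y in the absence of the programme.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    s = [1 if rng.random() < sigmoid(-1.0 + xi) else 0 for xi in x]
    t = [rng.randint(0, 1) if si else 0 for si in s]   # randomized within trial
    y = [2.0 + 1.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

    # Step 1: the propensity score model predicts trial participation.
    b0, b1 = fit_logistic(x, s)
    e = [sigmoid(b0 + b1 * xi) for xi in x]

    # Metric: difference in mean propensity score, trial versus population.
    ps_diff = (mean([ei for ei, si in zip(e, s) if si == 1])
               - mean([ei for ei, si in zip(e, s) if si == 0]))

    # Step 2: weight trial control outcomes by the odds of non-participation
    # and compare with the outcomes actually observed in the population.
    controls = [(yi, (1.0 - ei) / ei)
                for yi, ei, si, ti in zip(y, e, s, t) if si == 1 and ti == 0]
    unweighted = mean([yi for yi, _ in controls])
    weighted = (sum(yi * wi for yi, wi in controls)
                / sum(wi for _, wi in controls))
    observed = mean([yi for yi, si in zip(y, s) if si == 0])

    return {"ps_diff": ps_diff, "unweighted": unweighted,
            "weighted": weighted, "observed": observed}


if __name__ == "__main__":
    for name, value in generalizability_check().items():
        print(f"{name}: {value:.3f}")
```

In this simulated setting, a large value of `ps_diff` signals that trial participants differ from the population, and a weighted control mean that lies close to the observed population mean suggests that the covariates in the participation model capture the relevant differences.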