Using Regression Models to Analyze Randomized Trials: Asymptotically Valid Hypothesis Tests Despite Incorrectly Specified Models
Article first published online: 5 FEB 2009
© 2009, The International Biometric Society
Volume 65, Issue 3, pages 937–945, September 2009
How to Cite
Rosenblum, M. and van der Laan, M. J. (2009), Using Regression Models to Analyze Randomized Trials: Asymptotically Valid Hypothesis Tests Despite Incorrectly Specified Models. Biometrics, 65: 937–945. doi: 10.1111/j.1541-0420.2008.01177.x
- Issue published online: 14 SEP 2009
- Received January 2008. Revised August 2008. Accepted August 2008.
Keywords: Causal effect; Generalized linear model; Misspecified model; Randomized trial; Robust methods
Summary
Regression models are often used to test for cause-effect relationships from data collected in randomized trials or experiments. This practice has deservedly come under heavy scrutiny, because commonly used models such as linear and logistic regression will often not capture the actual relationships between variables, and incorrectly specified models potentially lead to incorrect conclusions. In this article, we focus on hypothesis tests of whether the treatment given in a randomized trial has any effect on the mean of the primary outcome, within strata of baseline variables such as age, sex, and health status. Our primary concern is ensuring that such hypothesis tests have correct type I error for large samples. Our main result is that for a surprisingly large class of commonly used regression models, standard regression-based hypothesis tests (but using robust variance estimators) are guaranteed to have correct type I error for large samples, even when the models are incorrectly specified. To the best of our knowledge, this robustness of such model-based hypothesis tests to incorrectly specified models was previously unknown for Poisson regression models and for other commonly used models we consider. Our results have practical implications for understanding the reliability of commonly used, model-based tests for analyzing randomized trials.
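The kind of test described in the summary can be illustrated with a small simulation. The sketch below is not taken from the article. It assumes the simplest case, a linear working model with main terms for treatment and one baseline variable, and uses the Huber-White (HC0) sandwich estimator as the robust variance estimator. The data-generating process, the sample sizes, and the function names `robust_wald_z` and `simulate_type1_error` are all hypothetical choices made for illustration. The outcome depends nonlinearly on the baseline variable and not at all on treatment, so the working model is misspecified while the null hypothesis holds.

```python
import numpy as np


def robust_wald_z(y, X, j):
    """Fit OLS and return the Wald z-statistic for coefficient j,
    using the Huber-White (HC0) sandwich variance estimator."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over subjects of e_i^2 * x_i x_i^T
    meat = (X * resid[:, None] ** 2).T @ X
    cov = XtX_inv @ meat @ XtX_inv
    return beta[j] / np.sqrt(cov[j, j])


def simulate_type1_error(n_trials=2000, n=500, seed=0):
    """Estimate how often the robust test rejects at nominal level 0.05
    when treatment truly has no effect (hypothetical simulation setup)."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_trials):
        w = rng.normal(size=n)              # baseline variable
        a = rng.integers(0, 2, size=n)      # randomized treatment, independent of w
        y = np.exp(w) + rng.normal(size=n)  # nonlinear in w, no treatment effect
        # Misspecified working model: intercept + treatment + linear term in w
        X = np.column_stack([np.ones(n), a, w])
        z = robust_wald_z(y, X, j=1)
        if abs(z) > 1.96:
            rejections += 1
    return rejections / n_trials


rate = simulate_type1_error()
print(f"Empirical type I error at nominal 0.05: {rate:.3f}")
```

If the large-sample guarantee applies to this setup, the printed rejection rate should be close to the nominal 0.05 even though the working model omits the nonlinearity in the baseline variable. A simulation of this kind only illustrates the claim for one data-generating process. It does not substitute for the article's asymptotic argument, and it does not cover the Poisson and other generalized linear models the article also treats.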