We thank John Carlin for several long discussions; Jennifer Hill, Donald Rubin, Ross Stolzenberg, and an anonymous reviewer for helpful comments; and the National Science Foundation for support through grants SBR-9708424, SES-9987748, SES-0318115, and Young Investigator Award DMS-9796129. Direct correspondence to Andrew Gelman at email@example.com, www.stat.columbia.edu/~gelman.
AVERAGE PREDICTIVE COMPARISONS FOR MODELS WITH NONLINEARITY, INTERACTIONS, AND VARIANCE COMPONENTS
Article first published online: 18 MAY 2007
Volume 37, Issue 1, pages 23–51, December 2007
How to Cite
Gelman, A. and Pardoe, I. (2007), AVERAGE PREDICTIVE COMPARISONS FOR MODELS WITH NONLINEARITY, INTERACTIONS, AND VARIANCE COMPONENTS. Sociological Methodology, 37: 23–51. doi: 10.1111/j.1467-9531.2007.00181.x
- Issue published online: 18 MAY 2007
In a predictive model, what is the expected difference in the outcome associated with a unit difference in one of the inputs? In a linear regression model without interactions, this average predictive comparison is simply a regression coefficient (with associated uncertainty). In a model with nonlinearity or interactions, however, the average predictive comparison in general depends on the values of the predictors. We consider various definitions based on averages over a population distribution of the predictors, and we compute standard errors based on uncertainty in model parameters. We illustrate with a study of criminal justice data for urban counties in the United States. The outcome of interest measures whether a convicted felon received a prison sentence rather than a jail or non-custodial sentence, with predictors available at both individual and county levels. We fit three models: (1) a hierarchical logistic regression with varying coefficients for the within-county intercepts as well as for each individual predictor; (2) a hierarchical model with varying intercepts only; and (3) a nonhierarchical model that ignores the multilevel nature of the data. The regression coefficients have different interpretations for the different models; in contrast, the models can be compared directly using predictive comparisons. Furthermore, predictive comparisons clarify the interplay between the individual and county predictors for the hierarchical models and also illustrate the relative size of varying county effects.
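As a rough illustration of the idea, the sketch below computes one simple form of average predictive comparison for a binary input in a logistic regression with an interaction. It is a minimal example under stated assumptions, not the article's implementation: the model form, the coefficient values, the simulated predictor `v`, and the function names `expected_outcome` and `average_predictive_comparison` are all hypothetical. It averages the difference in expected outcome over the observed values of the other predictor, and takes the spread across parameter draws as the standard error. The general case of a continuous input, which requires weighting pairs of observations, is not covered here.

```python
import numpy as np


def inv_logit(z):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + np.exp(-z))


def expected_outcome(u, v, theta):
    """E(y | u, v, theta) for a hypothetical logistic model with an interaction:
    logit P(y = 1) = b0 + b1*u + b2*v + b3*u*v.
    """
    b0, b1, b2, b3 = theta
    return inv_logit(b0 + b1 * u + b2 * v + b3 * u * v)


def average_predictive_comparison(v, theta_draws, u_lo=0.0, u_hi=1.0):
    """Average predictive comparison for input u, changed from u_lo to u_hi.

    v           : observed values of the other predictor (stands in for the
                  population distribution of the predictors)
    theta_draws : array of shape (n_draws, 4), simulation draws of the
                  model parameters (assumed to be available from the fit)

    Returns (estimate, standard error). The estimate is the mean over
    parameter draws; the standard error is the standard deviation over draws.
    """
    per_draw = np.array([
        np.mean(expected_outcome(u_hi, v, th) - expected_outcome(u_lo, v, th))
        / (u_hi - u_lo)
        for th in theta_draws
    ])
    return per_draw.mean(), per_draw.std(ddof=1)


# Illustrative, simulated inputs (not data from the article).
rng = np.random.default_rng(0)
v = rng.normal(size=500)
theta_hat = np.array([-0.5, 1.0, 0.8, -0.6])                 # made-up point estimate
theta_draws = theta_hat + 0.1 * rng.normal(size=(200, 4))    # made-up uncertainty

apc, se = average_predictive_comparison(v, theta_draws)
print(f"average predictive comparison: {apc:.3f} (se {se:.3f})")
```

Because the result is on the probability scale, the same function could be applied to draws from differently specified models (for example, with and without varying coefficients) and the outputs compared directly, which is not true of the raw logistic regression coefficients.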