Bayesian information criterion for longitudinal and clustered data

Authors


Richard H. Jones, Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver, 13001 East 17th Place, Mail Stop B119, Aurora, CO 80045, USA.

E-mail: richard.jones@UCDenver.edu

Abstract

When a number of models are fit to the same data set, one method of choosing the ‘best’ model is to select the model for which Akaike's information criterion (AIC) is lowest. AIC applies when maximum likelihood is used to estimate the unknown parameters in the model. The value of −2 log likelihood for each model fit is penalized by adding twice the number of estimated parameters. The number of estimated parameters includes both the linear parameters and the parameters in the covariance structure. Another criterion for model selection is the Bayesian information criterion (BIC). BIC penalizes −2 log likelihood by adding the number of estimated parameters multiplied by the log of the sample size. For large sample sizes, BIC penalizes −2 log likelihood much more heavily than AIC does, making it harder to enter new parameters into the model. An assumption underlying BIC is that the observations are independent; in mixed models, the observations are not independent. This paper develops a method for calculating the ‘effective sample size’ for mixed models based on Fisher's information. The effective sample size replaces the sample size in BIC and can vary from the number of subjects to the number of observations. A number of error models based on a general mixed model are considered, including unstructured, compound symmetry (random intercept), first-order autoregression with observational error, and random intercept and slope. Copyright © 2011 John Wiley & Sons, Ltd.
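The penalty terms described above can be sketched in a few lines. This is a minimal illustration, not the paper's method: the log likelihood, parameter count, and sample sizes below are hypothetical, and the Fisher-information calculation of the effective sample size is not reproduced here; the effective sample size is simply passed in as a number between the subject count and the observation count.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 log L penalized by 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, sample_size):
    """Bayesian information criterion: -2 log L penalized by k*log(n).
    For mixed models, the paper proposes replacing n with an
    effective sample size derived from Fisher's information."""
    return -2.0 * log_likelihood + n_params * math.log(sample_size)

# Hypothetical fit: log likelihood of -100 with 3 estimated parameters
# (linear plus covariance parameters), 200 observations on 50 subjects.
ll, k = -100.0, 3
print(aic(ll, k))        # AIC penalty does not depend on sample size
print(bic(ll, k, 200))   # BIC with n = number of observations
print(bic(ll, k, 50))    # BIC with n = number of subjects
```

Because log(n) grows with the sample size, the two BIC values bracket what BIC with the effective sample size would give, which is the gap the paper's method addresses.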
