A General Maximum Likelihood Analysis of Variance Components in Generalized Linear Models



Summary. This paper describes an EM algorithm for nonparametric maximum likelihood (ML) estimation in generalized linear models with variance component structure. The algorithm provides an alternative analysis to approximate MQL and PQL analyses (McGilchrist and Aisbett, 1991, Biometrical Journal33, 131–141; Breslow and Clayton, 1993; Journal of the American Statistical Association88, 9–25; McGilchrist, 1994, Journal of the Royal Statistical Society, Series B56, 61–69; Goldstein, 1995, Multilevel Statistical Models) and to GEE analyses (Liang and Zeger, 1986, Biometrika73, 13–22). The algorithm, first given by Hinde and Wood (1987, in Longitudinal Data Analysis, 110–126), is a generalization of that for random effect models for overdispersion in generalized linear models, described in Aitkin (1996, Statistics and Computing6, 251–262). The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully nonparametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters can be sensitive to the specification of a parametric form for the mixing distribution. The nonparametric analysis can be extended straightforwardly to general random parameter models, with full NPML estimation of the joint distribution of the random parameters. This can produce substantial computational saving compared with full numerical integration over a specified parametric distribution for the random parameters. A simple method is described for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed involving simple variance component and longitudinal models, and small-area estimation.