Multilevel Latent Class Models with Dirichlet Mixing Distribution
Article first published online: 16 JUN 2010
© 2010, The International Biometric Society
Volume 67, Issue 1, pages 86–96, March 2011
How to Cite
Di, C.-Z. and Bandeen-Roche, K. (2011), Multilevel Latent Class Models with Dirichlet Mixing Distribution. Biometrics, 67: 86–96. doi: 10.1111/j.1541-0420.2010.01448.x
- Issue published online: 14 MAR 2011
- Received February 2009. Revised March 2010. Accepted March 2010.
Keywords
- Dirichlet distribution;
- EM algorithm;
- Latent class analysis (LCA);
- Multilevel models;
- Pairwise likelihood
Summary
Latent class analysis (LCA) and latent class regression (LCR) are widely used for modeling multivariate categorical outcomes in social science and biomedical studies. Standard analyses assume that data from different respondents are mutually independent, which excludes application of the methods to familial and other designs in which participants are clustered. In this article, we consider multilevel latent class models, in which subpopulation mixing probabilities are treated as random effects that vary among clusters according to a common Dirichlet distribution. We apply the expectation-maximization (EM) algorithm for model fitting by maximum likelihood (ML). This approach works well, but is computationally intensive when either the number of classes or the cluster size is large. We propose a maximum pairwise likelihood (MPL) approach via a modified EM algorithm for this case. We also show that a simple latent class analysis, combined with robust standard errors, provides another consistent and robust, but less efficient, inferential procedure. Simulation studies suggest that the three methods work well in finite samples, and that the MPL estimates often have precision comparable to that of the ML estimates. We apply our methods to the analysis of comorbid symptoms in the obsessive compulsive disorder study. Our models' random effects structure has a more straightforward interpretation than those of competing methods, and thus should usefully augment the tools available for LCA of multilevel data.
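To make the model structure concrete, the following is a minimal simulation sketch of the data-generating process described above, not the authors' implementation. It assumes binary items whose class-conditional response probabilities are the same in every cluster, so that only the mixing probabilities vary across clusters. The function name `simulate_mlcm` and the parameters `alpha` and `item_probs` are hypothetical names introduced for illustration.

```python
import numpy as np

def simulate_mlcm(n_clusters, cluster_size, alpha, item_probs, seed=0):
    """Simulate binary item responses from a multilevel latent class model.

    Assumptions (illustrative, not taken from the article):
      - alpha: Dirichlet parameters, one per latent class
      - item_probs: array of shape (n_classes, n_items) giving
        P(item = 1 | class), shared by all clusters
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    item_probs = np.asarray(item_probs, dtype=float)
    n_classes, n_items = item_probs.shape

    # Cluster-level random effects: each cluster draws its own vector of
    # class mixing probabilities from a common Dirichlet distribution.
    u = rng.dirichlet(alpha, size=n_clusters)  # (n_clusters, n_classes)

    # Subject-level latent classes, drawn from the cluster's own mixing vector.
    classes = np.empty((n_clusters, cluster_size), dtype=int)
    for i in range(n_clusters):
        classes[i] = rng.choice(n_classes, size=cluster_size, p=u[i])

    # Item responses: independent Bernoulli draws given the latent class.
    probs = item_probs[classes]  # (n_clusters, cluster_size, n_items)
    y = (rng.random(probs.shape) < probs).astype(int)
    return u, classes, y

# Example: 200 clusters of 4 members, 2 classes, 3 binary items.
u, classes, y = simulate_mlcm(
    n_clusters=200,
    cluster_size=4,
    alpha=[1.0, 2.0],
    item_probs=[[0.9, 0.8, 0.7], [0.1, 0.2, 0.3]],
    seed=1,
)
```

Under this sketch, the marginal class prevalences equal `alpha / alpha.sum()`, the mean of the Dirichlet distribution, while smaller values of `alpha.sum()` produce more variation in mixing probabilities between clusters and hence stronger within-cluster dependence in class membership.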