
THE SIMULTANEOUS DECISION(S) ABOUT THE NUMBER OF LOWER- AND HIGHER-LEVEL CLASSES IN MULTILEVEL LATENT CLASS ANALYSIS

Authors


  • Olga Lukočiené's contribution to this research was funded by the Netherlands Organisation for Scientific Research, MaGW/NWO project number 400-03-295. Direct Correspondence to Jeroen K. Vermunt, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, The Netherlands; e-mail: J.K.Vermunt@uvt.nl.

Abstract

Recently, several types of extensions of the latent class (LC) model have been developed for the analysis of data sets having a multilevel structure. The most popular variant is the multilevel LC model with finite mixture distributions at multiple levels of a hierarchical structure; that is, with LCs for both lower-level units (e.g., individuals, citizens, or patients) and higher-level units (e.g., groups, regions, or hospitals). A problem in the application of this model is that determining the number of LCs is much more complicated than in standard (single-level) LC analysis because it involves multiple, nonindependent decisions. We propose a three-step model-fitting procedure for deciding on the number of higher- and lower-level classes. We also investigate the performance of information criteria (BIC, AIC, CAIC, and AIC3) in the context of multilevel LC analysis, with different types of response variables. A specific difficulty associated with using BIC and CAIC in any type of multilevel analysis is that these measures contain the sample size in their formulae. We investigate whether this sample size should be the number of groups, the number of individuals, or whichever of the two corresponds to the level (higher or lower) of the model feature being decided. The three main conclusions of our simulation studies are that (1) the proposed three-step model-fitting strategy works rather well, (2) the number of higher-level units (K) is the preferred sample size for BIC and CAIC, both for decisions about higher- and lower-level classes, and (3) with categorical indicators, AIC3 and BIC based on the higher-level sample size are the preferred measures for deciding on the number of LCs at both the higher and lower level. With continuous indicators, BIC(K) performs better than AIC3. AIC performs best only in very specific situations, namely with poorly separated classes and categorical indicators.
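The role of the sample size in BIC and CAIC can be illustrated with a minimal sketch. It uses the standard textbook definitions of the four criteria (AIC = -2 log L + 2p, AIC3 = -2 log L + 3p, BIC = -2 log L + p ln n, CAIC = -2 log L + p (ln n + 1)), not formulas quoted from this abstract. The function name `information_criteria` and all numeric values (log-likelihood, parameter count, numbers of groups and individuals) are hypothetical and chosen only for illustration.

```python
import math


def information_criteria(log_lik, n_params, sample_size):
    """Return AIC, AIC3, BIC, and CAIC for one fitted model.

    Standard definitions are assumed; `sample_size` affects only BIC and CAIC.
    """
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2 * n_params,
        "AIC3": deviance + 3 * n_params,
        "BIC": deviance + n_params * math.log(sample_size),
        "CAIC": deviance + n_params * (math.log(sample_size) + 1),
    }


# Hypothetical fitted multilevel LC model (illustrative numbers only).
log_lik = -1234.5      # maximized log-likelihood
n_params = 20          # number of free parameters
n_groups = 50          # higher-level units (K in the abstract)
n_individuals = 1000   # lower-level units

# Same model, two candidate sample sizes for the BIC/CAIC penalty.
ic_groups = information_criteria(log_lik, n_params, n_groups)
ic_individuals = information_criteria(log_lik, n_params, n_individuals)

for name in ("AIC", "AIC3", "BIC", "CAIC"):
    print(f"{name:5s} n=K: {ic_groups[name]:9.2f}   n=N: {ic_individuals[name]:9.2f}")
```

AIC and AIC3 do not depend on the sample-size choice, whereas BIC and CAIC penalize each parameter less when the (smaller) number of groups is used, so the group-based versions tend to favor models with more classes than the individual-based versions.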
