Get access

A latent factor linear mixed model for high-dimensional longitudinal data analysis


Correspondence to: Xinming An, SAS Institute Inc., Cary, NC, U.S.A.



High-dimensional longitudinal data involving latent variables such as depression and anxiety that cannot be quantified directly are often encountered in biomedical and social sciences. Multiple responses are used to characterize these latent quantities, and repeated measures are collected to capture their trends over time. Furthermore, substantive research questions may concern issues such as interrelated trends among latent variables that can only be addressed by modeling them jointly. Although statistical analysis of univariate longitudinal data has been well developed, methods for modeling multivariate high-dimensional longitudinal data are still under development. In this paper, we propose a latent factor linear mixed model (LFLMM) for analyzing this type of data. This model is a combination of the factor analysis and multivariate linear mixed models. Under this modeling framework, we reduced the high-dimensional responses to low-dimensional latent factors by the factor analysis model, and then we used the multivariate linear mixed model to study the longitudinal trends of these latent factors. We developed an expectation–maximization algorithm to estimate the model. We used simulation studies to investigate the computational properties of the expectation–maximization algorithm and compare the LFLMM model with other approaches for high-dimensional longitudinal data analysis. We used a real data example to illustrate the practical usefulness of the model. Copyright © 2013 John Wiley & Sons, Ltd.