A nonparametric Bayesian model for inference in related longitudinal studies

Authors


Peter Müller, Department of Biostatistics, University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Box 447, Houston, TX 77030-4009, USA.
E-mail: pm@odin.mdacc.tmc.edu

Abstract

Summary. We discuss a method for combining different but related longitudinal studies to improve predictive precision. The motivation is to borrow strength across clinical studies in which the same measurements are collected at different frequencies. Key features of the data are heterogeneous populations and an unbalanced design across three studies of interest. The first two studies are phase I studies with very detailed observations on a relatively small number of patients. The third study is a large phase III study with over 1500 enrolled patients, but with relatively few measurements on each patient. Patients receive different doses of several drugs in the studies, with the phase III study containing significantly less toxic treatments. Thus, the main challenges for the analysis are to accommodate heterogeneous population distributions and to formalize borrowing strength across the studies and across the various treatment levels. We describe a hierarchical extension of suitable semiparametric longitudinal data models to achieve the inferential goal. A nonparametric random-effects model accommodates the heterogeneity of the population of patients. A hierarchical extension allows borrowing strength across different studies and different levels of treatment by introducing dependence across these nonparametric random-effects distributions. Dependence is introduced by building an analysis-of-variance (ANOVA)-like structure over the random-effects distributions for different studies and treatment combinations. Model structure and parameter interpretation are similar to standard ANOVA models. Instead of the unknown normal means as in standard ANOVA models, however, the basic objects of inference are random distributions, namely the unknown population distributions under each study. The analysis is based on a mixture of Dirichlet processes model as the underlying semiparametric model.
