Simulation of massive public health data by power polynomials


Hakan Demirtas, Division of Epidemiology and Biostatistics, University of Illinois at Chicago, 1603 West Taylor Street, MC 923, Chicago, IL 60612, U.S.A.



Situations in which multiple outcomes and predictors of different distributional types are collected are becoming increasingly common in public health practice, and joint modeling of mixed types has been gaining popularity in recent years. Evaluating, in simulated environments, the statistical techniques that have been developed for mixed data necessarily requires joint generation of multiple variables. Most massive public health data sets include different types of variables. For instance, in clustered or longitudinal designs, multiple variables are often measured or observed for each individual or at each occasion. This work is motivated by a need to jointly generate binary and possibly non-normal continuous variables. We illustrate the use of power polynomials to simulate multivariate mixed data on the basis of a real adolescent smoking study. We believe that our proposed technique for simulating such intensive data has the potential to be a handy methodological addition to the public health researcher’s toolkit. Copyright © 2012 John Wiley & Sons, Ltd.
