Simulation of massive public health data by power polynomials
Abstract
Studies that collect multiple outcomes and predictors of different distributional types are increasingly common in public health practice, and joint modeling of mixed data types has gained popularity in recent years. Most massive public health data sets include variables of different types; in clustered or longitudinal designs, for instance, multiple variables are often measured or observed for each individual or at each occasion. Evaluating the statistical techniques developed for mixed data in simulated environments therefore requires joint generation of multiple variables. This work is motivated by the need to jointly generate binary and possibly non‐normal continuous variables. We illustrate the use of power polynomials to simulate multivariate mixed data on the basis of a real adolescent smoking study. We believe that the proposed technique for simulating such intensive data has the potential to be a useful methodological addition to the toolkit of public health researchers. Copyright © 2012 John Wiley & Sons, Ltd.
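As a rough illustration of the idea, the sketch below pairs one non-normal continuous variable with one binary variable. It assumes the third-order (Fleishman-type) form of the power polynomial, Y = a + bZ + cZ^2 + dZ^3 with a = -c and Z standard normal, which the abstract itself does not specify. The function names, the coefficient values, the binary proportion p, and the latent correlation rho are all hypothetical choices made for this example. The sketch also omits the step a full method would need, namely adjusting the latent normal correlation so that the final variables reach a specified target correlation.

```python
import random
from statistics import NormalDist, fmean


def implied_moments(b, c, d):
    """Moments of Y = -c + b*Z + c*Z**2 + d*Z**3 for Z ~ N(0, 1).

    Returns (variance, skewness, excess kurtosis). The skewness and
    kurtosis expressions assume the coefficients give unit variance.
    """
    variance = b**2 + 6*b*d + 2*c**2 + 15*d**2
    skewness = 2*c*(b**2 + 24*b*d + 105*d**2 + 2)
    excess_kurtosis = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
                          + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2))
    return variance, skewness, excess_kurtosis


def simulate_mixed(n, b, c, d, p, rho, seed=0):
    """Draw n pairs (continuous y, binary x) from correlated latent normals.

    rho is the correlation of the latent normals, NOT the correlation
    of the final (y, x) pair; no correlation adjustment is done here.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(p)  # P(Z < threshold) = p
    ys, xs = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho*z1 + (1 - rho**2)**0.5 * rng.gauss(0.0, 1.0)
        ys.append(-c + b*z1 + c*z1**2 + d*z1**3)  # power polynomial transform
        xs.append(1 if z2 < threshold else 0)     # dichotomize second normal
    return ys, xs


# Illustrative coefficients (an assumption for this sketch); the closed-form
# expressions above map them to variance ~1, skewness ~2, excess kurtosis ~6.
COEFFS = (0.8263, 0.3137, 0.0227)
variance, skewness, excess_kurtosis = implied_moments(*COEFFS)
ys, xs = simulate_mixed(20000, *COEFFS, p=0.3, rho=0.5)
```

With these inputs, the sample mean of `ys` should sit near 0 and the sample mean of `xs` near the chosen proportion 0.3. In practice the coefficients would be obtained by numerically solving the three moment equations for a target skewness and kurtosis rather than being fixed by hand.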