A mixture of generalized hyperbolic distributions
Abstract
We introduce a mixture of generalized hyperbolic distributions as an alternative to the ubiquitous mixture of Gaussian distributions as well as their near relatives within which the mixture of multivariate t‐distributions and the mixture of skew‐t distributions predominate. The mathematical development of our mixture of generalized hyperbolic distributions model relies on its relationship with the generalized inverse Gaussian distribution. The latter is reviewed before our mixture models are presented along with details of the aforesaid reliance. Parameter estimation is outlined within the expectation–maximization framework before the clustering performance of our mixture models is illustrated via applications on simulated and real data. In particular, the ability of our models to recover parameters for data from underlying Gaussian and skew‐t distributions is demonstrated. Finally, the role of generalized hyperbolic mixtures within the wider model‐based clustering, classification, and density estimation literature is discussed. The Canadian Journal of Statistics 43: 176–198; 2015 © 2015 Statistical Society of Canada
Résumé
The authors present a mixture of generalized hyperbolic distributions as an alternative to the usual mixtures based on the Gaussian distribution, the Student t distribution, or the skew-t distribution. They review the properties of the generalized inverse Gaussian distribution, because the mathematical development they present rests on a link, set out in detail, between that distribution and the generalized hyperbolic distributions. They estimate the parameters with an expectation-maximization algorithm, then illustrate the performance of their model for clustering by applying it to simulated data and to a real data set. The authors demonstrate their model's ability to recover the parameters of the underlying distributions when these are Gaussian or follow a skew-t distribution. Finally, they discuss the role of the generalized hyperbolic distribution when a model is used for clustering, classification, or density estimation. La revue canadienne de statistique 43: 176–198; 2015 © 2015 Société statistique du Canada
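The abstract states that the model relies on the relationship between the generalized hyperbolic and generalized inverse Gaussian (GIG) distributions, without spelling that relationship out. A minimal sketch of one common way to express it is the normal variance-mean mixture X = mu + W*alpha + sqrt(W)*V, with W drawn from a GIG distribution and V from a zero-mean Gaussian. The function name `sample_gh`, the (lam, chi, psi) parametrization, and the mapping onto SciPy's `geninvgauss` are assumptions made for illustration; the article itself may use a different parametrization.

```python
import numpy as np
from scipy.stats import geninvgauss


def sample_gh(n, mu, alpha, sigma, lam, chi, psi, seed=None):
    """Draw n points via X = mu + W*alpha + sqrt(W)*V (hypothetical helper).

    W ~ GIG with density proportional to w**(lam - 1) * exp(-(psi*w + chi/w) / 2),
    V ~ N(0, sigma). Returns the samples and the latent weights W.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)

    # SciPy's geninvgauss(p, b) has density proportional to
    # x**(p - 1) * exp(-b * (x + 1/x) / 2); rescaling by sqrt(chi/psi)
    # with p = lam and b = sqrt(psi*chi) gives the (lam, chi, psi) form above.
    w = geninvgauss.rvs(
        lam,
        np.sqrt(psi * chi),
        scale=np.sqrt(chi / psi),
        size=n,
        random_state=rng,
    )

    # Zero-mean Gaussian part, one row per observation.
    v = rng.multivariate_normal(np.zeros(mu.size), sigma, size=n)

    # Location shift, skewness term scaled by W, Gaussian term scaled by sqrt(W).
    x = mu + w[:, None] * alpha + np.sqrt(w)[:, None] * v
    return x, w
```

Under this representation, setting `alpha` to zero gives a symmetric scale mixture of Gaussians, which is one way to see how symmetric heavy-tailed components arise as special or limiting cases. In an expectation-maximization fit, the latent weights `w` would be treated as missing data alongside the component labels.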