Penalized Gaussian Process Regression and Classification for High-Dimensional Nonlinear Data
Article first published online: 8 MAR 2011
© 2011, The International Biometric Society. No claim to original US government works.
Volume 67, Issue 4, pages 1285–1294, December 2011
How to Cite
Yi, G., Shi, J. Q. and Choi, T. (2011), Penalized Gaussian Process Regression and Classification for High-Dimensional Nonlinear Data. Biometrics, 67: 1285–1294. doi: 10.1111/j.1541-0420.2011.01576.x
- Issue published online: 14 DEC 2011
- Received February 2010. Revised November 2010. Accepted December 2010.
- Covariance kernel;
- Functional data;
- Nonparametric regression and classification;
- Penalized Gaussian process regression;
- Variable selection
Summary. A model based on a Gaussian process (GP) prior with a kernel covariance function can be used to fit nonlinear data with multidimensional covariates. It has been used as a flexible nonparametric approach for curve fitting, classification, clustering, and other statistical problems, and has been widely applied to complex nonlinear systems in many areas, particularly in machine learning. However, applying the model to large-scale and high-dimensional data is challenging, as with the meat data discussed in this article, which have 100 highly correlated covariates. For such data, the model suffers from large variance in parameter estimation and high predictive errors, and its computation is numerically unstable. In this article, a penalized likelihood framework is applied to GP-based models. Different penalties are investigated, and their suitability for the characteristics of GP models is discussed. The asymptotic properties are also discussed, with the relevant proofs. Several applications to real biomechanical and bioinformatics data sets are reported.
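To make the idea of a penalized likelihood for a GP model concrete, the following is a minimal sketch and not the authors' implementation. It assumes a squared-exponential kernel with one non-negative weight per covariate and an L1 (LASSO-type) penalty on those weights; the summary does not specify the kernel or the penalties, so these choices, along with the names `ard_kernel`, `penalized_neg_log_lik`, `v0`, `noise`, and `lam`, are illustrative assumptions.

```python
import numpy as np

def ard_kernel(X, w, v0=1.0):
    """Squared-exponential kernel with one non-negative weight w[q] per covariate (assumed form)."""
    Xs = X * np.sqrt(w)                          # scale column q by sqrt(w[q])
    sq = np.sum(Xs ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Xs @ Xs.T
    return v0 * np.exp(-0.5 * np.maximum(d2, 0.0))

def penalized_neg_log_lik(w, X, y, v0=1.0, noise=0.1, lam=1.0):
    """Negative GP log marginal likelihood plus an L1 penalty on w (illustrative penalty)."""
    n = X.shape[0]
    K = ard_kernel(X, w, v0) + noise * np.eye(n)  # noise term keeps K positive definite
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    nll = 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2.0 * np.pi)
    return nll + lam * np.sum(np.abs(w))

# Synthetic data for illustration only: the response depends on the first covariate.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)
w = np.array([1.0, 0.5, 0.0, 0.0, 0.2])
print(penalized_neg_log_lik(w, X, y, lam=2.0))
```

Minimizing this objective over `w` (subject to `w >= 0`) with a sufficiently large `lam` would tend to shrink the weights of uninformative covariates toward zero, which is the sense in which a penalty of this kind can act as variable selection.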