Robust Gaussian Graphical Modeling Via l1 Penalization
Version of Record online: 28 SEP 2012
© 2012, The International Biometric Society
Volume 68, Issue 4, pages 1197–1206, December 2012
How to Cite
Sun, H. and Li, H. (2012), Robust Gaussian Graphical Modeling Via l1 Penalization. Biometrics, 68: 1197–1206. doi: 10.1111/j.1541-0420.2012.01785.x
- Issue online: 21 DEC 2012
- Received August 2011. Revised March 2012. Accepted April 2012.
Keywords
- Coordinate descent algorithm
- Genetic network
- Iterative proportional fitting
- Penalized likelihood
Summary
Gaussian graphical models have been widely used as an effective method for studying the conditional independence structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers can lead to incorrect inference on the dependency structure among the genes. We propose an l1-penalized estimation procedure for sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, where the deviation of an observation is measured by its own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified likelihood, where the nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that, when outliers are present, the proposed robust method performs much better than the graphical Lasso for Gaussian graphical models in terms of both graph structure selection and estimation. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has a better biological interpretation than that obtained from the graphical Lasso.
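Two of the ingredients named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the tuning parameter `beta`, and the specific weight form (Gaussian density raised to the power `beta`, then normalized) are assumptions made for illustration, since the summary does not state the exact weighting. The second function is the textbook iterative proportional fitting update for a Gaussian graphical model with a fixed edge set; the l1-penalized coordinate gradient descent step is not shown.

```python
import numpy as np


def likelihood_weights(X, mu, K, beta):
    """Hypothetical observation weights, proportional to the Gaussian
    density of each row of X raised to the power beta (an assumption;
    the summary only says weights depend on each observation's likelihood).
    beta = 0 gives equal weights; larger beta downweights outliers more."""
    d = X - mu
    # Squared Mahalanobis distance of each observation under concentration K.
    maha = np.einsum("ij,jk,ik->i", d, K, d)
    w = np.exp(-0.5 * beta * maha)
    return w / w.sum()


def ipf_concentration(S, edges, n_sweeps=200, tol=1e-10):
    """Standard iterative proportional fitting: given a positive definite
    covariance S and a fixed edge list, return a concentration matrix K
    with K[i, j] == 0 for every non-edge and with inv(K) matching S on
    the diagonal and on every edge."""
    p = S.shape[0]
    # A diagonal start already satisfies the zero pattern.
    K = np.diag(1.0 / np.diag(S))
    covered = {v for e in edges for v in e}
    # Generating sets: every edge, plus every vertex that has no edge.
    blocks = [list(e) for e in edges]
    blocks += [[v] for v in range(p) if v not in covered]
    for _ in range(n_sweeps):
        K_old = K.copy()
        for c in blocks:
            r = [v for v in range(p) if v not in c]
            target = np.linalg.inv(S[np.ix_(c, c)])
            if r:
                K_cr = K[np.ix_(c, r)]
                K_rr = K[np.ix_(r, r)]
                # The Schur-complement term makes the marginal covariance
                # of block c equal to S[c, c] after the update.
                target = target + K_cr @ np.linalg.solve(K_rr, K_cr.T)
            K[np.ix_(c, c)] = target
        if np.max(np.abs(K - K_old)) < tol:
            break
    return K
```

In a robust pipeline of the kind described, one might pass a weighted covariance estimate as `S` and the edge set selected by the penalized step as `edges`; that wiring is left out here because the summary does not specify it.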