Maximum likelihood estimation of a multi-dimensional log-concave density


Richard Samworth, Statistical Laboratory, Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 0WB, UK.


Summary.  Let X1,…,Xn be independent and identically distributed random vectors with a (Lebesgue) density f. We first prove that, with probability 1, there is a unique log-concave maximum likelihood estimator inline image of f. The use of this estimator is attractive because, unlike kernel density estimation, the method is fully automatic, with no smoothing parameters to choose. Although the existence proof is non-constructive, we can reformulate the issue of computing inline image in terms of a non-differentiable convex optimization problem, and thus combine techniques of computational geometry with Shor's r-algorithm to produce a sequence that converges to inline image. An R version of the algorithm is available in the package LogConcDEAD—log-concave density estimation in arbitrary dimensions. We demonstrate that the estimator has attractive theoretical properties both when the true density is log-concave and when this model is misspecified. For the moderate or large sample sizes in our simulations, inline image is shown to have smaller mean integrated squared error compared with kernel-based methods, even when we allow the use of a theoretical, optimal fixed bandwidth for the kernel estimator that would not be available in practice. We also present a real data clustering example, which shows that our methodology can be used in conjunction with the expectation–maximization algorithm to fit finite mixtures of log-concave densities.