Keywords:

  • Binning;
  • Kernel density estimation;
  • Non-parametric statistics

Summary. The estimation of a density profile from experimental data points is a challenging problem, which is usually tackled by plotting a histogram. Prior assumptions on the nature of the density, from its smoothness to the specification of its form, allow the design of more accurate estimation procedures, such as maximum likelihood. Our aim is to construct a procedure that makes no explicit assumptions but still provides an accurate estimate of the density. We introduce the self-consistent estimate: the power spectrum of a candidate density is given, and an estimation procedure is constructed on the assumption, to be released a posteriori, that the candidate is correct. The self-consistent estimate is defined as a prior candidate density that precisely reproduces itself. Our main result is the derivation of the exact expression of the self-consistent estimate for any given data set, together with a study of its properties. Applications of the method require neither priors on the form of the density nor the subjective choice of parameters. A cut-off frequency, akin to a bin size or a kernel bandwidth, emerges naturally from the derivation. We apply the self-consistent estimate to artificial data generated from various distributions and show that it reaches the theoretical limit for the scaling of the square error with the size of the data set.
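
To make the role of a cut-off frequency concrete, the following is a minimal sketch of a generic Fourier-based density estimate with a hard frequency cut-off. It is not the self-consistent estimate itself: the cut-off `t_max` is chosen by hand here, which is exactly the kind of subjective choice the summary says the method avoids. The function names `ecf` and `cutoff_density`, the value of `t_max`, and the synthetic Gaussian data are all assumptions made for illustration.

```python
import numpy as np


def ecf(samples, freqs):
    """Empirical characteristic function: mean of exp(i*t*x_j) over the data."""
    return np.exp(1j * np.outer(freqs, samples)).mean(axis=1)


def cutoff_density(samples, grid, t_max, n_freq=513):
    """Density estimate that keeps only frequencies |t| <= t_max.

    t_max is a hand-picked cut-off (an assumption of this sketch); it plays
    the same role as a histogram bin size or a kernel bandwidth.
    """
    freqs = np.linspace(-t_max, t_max, n_freq)
    phi = ecf(samples, freqs)

    # Trapezoidal weights for the integral over frequency.
    dt = freqs[1] - freqs[0]
    weights = np.full(n_freq, dt)
    weights[0] = weights[-1] = dt / 2

    # Inverse Fourier transform: f(x) = (1 / 2*pi) * integral phi(t) exp(-i*t*x) dt
    kernel = np.exp(-1j * np.outer(grid, freqs))
    return (kernel @ (phi * weights)).real / (2 * np.pi)


# Synthetic data for illustration only: 2000 draws from a standard normal.
rng = np.random.default_rng(0)
samples = rng.normal(size=2000)
grid = np.linspace(-5, 5, 201)
density = cutoff_density(samples, grid, t_max=3.0)
```

A smaller `t_max` gives a smoother but more biased estimate, and a larger one lets in more sampling noise, just as with wide and narrow histogram bins. An estimate of this kind can also dip slightly below zero in the tails.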