Sampling from Dirichlet partitions: estimating the number of species

Authors

  • Thierry Huillet

    1. Laboratoire de Physique Théorique et Modélisation, Université de Cergy-Pontoise, UMR CNRS 8089, 2 avenue Adolphe Chauvin, 95032 Cergy-Pontoise, France
  • Christian Paroissin

    Corresponding author
    1. Laboratoire de Mathématiques et de leurs Applications, Université de Pau et des Pays de l'Adour, UMR CNRS 5142, Avenue de l'Université, 64013 Pau cedex, France

Abstract

The Dirichlet partition of an interval can be viewed as a generalization of several classical models in ecological statistics. We recall the unordered Ewens sampling formulae (ESF) from finite Dirichlet partitions. Because it is a key variable for estimation purposes, we focus on the number of distinct species visited during the sampling process. These results are illustrated in specific cases. We use these preliminary statistical results on frequency distributions to address the following sampling problem: what is the estimated number of species when sampling is from Dirichlet populations? The results obtained agree with those found in the theory of sampling from random proportions with Poisson–Dirichlet (PD) distribution. To conclude, we apply the suggested estimators to two sets of real data. Copyright © 2009 John Wiley & Sons, Ltd.
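
The sampling process described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' estimator and reproduces none of their results. It assumes a symmetric Dirichlet partition, with the same parameter theta for every species, built from normalized Gamma draws. It then counts how many distinct species appear in a sample of fixed size. The function name sample_distinct_species and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def sample_distinct_species(n_species, theta, sample_size, rng):
    """Count the distinct species seen in one sample (illustrative sketch).

    Assumption: species proportions follow a symmetric Dirichlet(theta, ..., theta)
    distribution over n_species species.
    """
    # A Dirichlet vector can be obtained by normalizing independent Gamma(theta, 1) draws.
    weights = [rng.gammavariate(theta, 1.0) for _ in range(n_species)]
    total = sum(weights)
    proportions = [w / total for w in weights]

    # Each sampled individual belongs to species i with probability proportions[i].
    draws = rng.choices(range(n_species), weights=proportions, k=sample_size)

    # The statistic of interest: number of distinct species visited.
    return len(set(draws))


# Hypothetical settings: 50 species, theta = 0.5, samples of 100 individuals.
rng = random.Random(0)
counts = [sample_distinct_species(50, 0.5, 100, rng) for _ in range(200)]
mean_distinct = sum(counts) / len(counts)
print(mean_distinct)
```

Averaging over repeated partitions, as in the last lines, gives a Monte Carlo approximation of the expected number of distinct species for the chosen settings. In the estimation problem the abstract poses, the total number of species is unknown and has to be inferred from such observed counts.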
