Inferring dispersal and migrations from incomplete geochemical baselines: analysis of population structure using Bayesian infinite mixture models

Authors

  • Philipp Neubauer,

    Corresponding author. Current affiliation:
    1. Dragonfly Science, Wellington 6141, New Zealand
    • Victoria University Coastal Ecology Laboratory, School of Biological Sciences, Victoria University of Wellington, Wellington, New Zealand
  • Jeffrey S. Shima,

    1. Victoria University Coastal Ecology Laboratory, School of Biological Sciences, Victoria University of Wellington, Wellington, New Zealand
  • Stephen E. Swearer

    1. Department of Zoology, University of Melbourne, Melbourne, Vic., Australia

Correspondence author. E-mail: neubauer.phil@gmail.com

Summary

  1. Geochemical and stable isotope tags are often used to attribute individual animals in a sample of mixed origins to distinct sources, be it spawning, overwintering or foraging habitats. In order for individuals to be uniquely classified to one source, modelling approaches generally assume that all potential sources have been characterized in terms of their geochemical signature. This assumption is rarely met in applications of geochemistry in environments where species distributions and spawning grounds are poorly known; statistical methods that can accommodate this problem are therefore essential.
  2. We develop nonparametric Bayesian mixture models for geochemical signatures that estimate the most likely number of sources represented in a mixed sample, both in the absence and presence of baseline data. We then use a marginal clustering framework to evaluate the probability that a fish comes from a particular source.
  3. Using both simulations and a previously analysed data set, we illustrate the method and highlight its potential merits and difficulties. These examples reveal how our interpretations of geochemical data sets can change when potentially unsampled sources are taken into account.
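
The infinite-mixture idea in item 2 can be illustrated with a small sketch. It assumes a Dirichlet process prior, expressed through its Chinese restaurant process (CRP) representation; the summary does not name the prior, so this is an assumption, as are the function names and the concentration parameter `alpha`. The sketch covers only the prior on the number of sources, not the likelihood for geochemical signatures or the authors' sampler.

```python
import random
from collections import Counter


def crp_partition(n_items, alpha, rng):
    """Draw one partition of n_items from a Chinese restaurant process.

    Returns the list of cluster sizes (one entry per hypothetical source).
    """
    sizes = []
    for i in range(n_items):
        # Item i joins existing cluster k with probability sizes[k] / (i + alpha)
        # and opens a new cluster with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        cum = 0.0
        for k, s in enumerate(sizes):
            cum += s
            if u < cum:
                sizes[k] += 1
                break
        else:
            sizes.append(1)
    return sizes


def prior_on_num_sources(n_items, alpha, n_draws=2000, seed=1):
    """Monte Carlo estimate of the prior distribution of the number of clusters."""
    rng = random.Random(seed)
    counts = Counter(
        len(crp_partition(n_items, alpha, rng)) for _ in range(n_draws)
    )
    return {k: c / n_draws for k, c in sorted(counts.items())}


def expected_num_sources(n_items, alpha):
    """Exact prior mean number of clusters: sum over i of alpha / (alpha + i)."""
    return sum(alpha / (alpha + i) for i in range(n_items))


if __name__ == "__main__":
    # Hypothetical mixed sample of 50 individuals, alpha = 1.
    dist = prior_on_num_sources(50, 1.0)
    for k, p in dist.items():
        print(k, round(p, 3))
    print("exact prior mean:", round(expected_num_sources(50, 1.0), 3))
```

Because the prior never fixes the number of clusters in advance, a full model built on it can assign individuals to sources that are absent from the baseline. In such a model the data update this prior through the likelihood of each signature under each cluster.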
