Selecting food web models using normalized maximum likelihood

Authors

  • Phillip P. A. Staniczenko,

    Corresponding author
    1. Centre for Biodiversity and Environment Research, University College London, London, WC1E 6BT, UK
    2. Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
    Current affiliation:
    1. Centre for Biodiversity and Environment Research, University College London, London, UK
  • Matthew J. Smith,

    1. Centre for Biodiversity and Environment Research, University College London, London, WC1E 6BT, UK
  • Stefano Allesina

    1. Centre for Biodiversity and Environment Research, University College London, London, WC1E 6BT, UK
    2. Computation Institute, University of Chicago, Chicago, IL, USA

Summary

  1. Ecological models link theory and data. They distil processes into a mathematical form that explains the salient features of observed data. Food webs describe the pattern of interactions between species in an ecosystem, and many models have been proposed to explain their structure. When selecting the most appropriate model for data, it is important to penalize overly complicated models.
  2. Here, we introduce to ecology the use of normalized maximum likelihood (NML) for model selection and demonstrate its application to models for food web structure. Unlike AIC, which penalizes models using the number of parameters, NML normalizes the likelihood of data given a model by the sum of likelihoods for all possible food webs with the same number of species. NML favours models that fit observed data well and all other data sets poorly, in contrast with overly flexible models that fit many (unobserved) data sets equally well and thus provide little information on the system under investigation. As such, NML represents a natural measure for comparing very different models and enables ecologists to determine not only whether a particular model is superior to others, but also whether, objectively, the model is a poor description of data.
  3. We used NML to compare models from four popular model families (cascade, niche, modular and group) and found that the best models performed much better than random graphs incorporating no ecological principles. However, models specified by empirical characteristics such as species body mass, taxonomic classification or habitat were frequently far from optimal and, in some cases, performed worse than random graphs. This suggests that ecological interactions cannot be explained by a single species trait or coarse-grained environmental factor. The ranking of empirically determined models using NML was generally consistent with model selection according to AIC, BIC and Bayes factors. We also show how NML can improve the development of new model families by measuring the effectiveness of incremental changes to existing families or combining families.
  4. NML offers ecologists a rigorous and elegant framework for revealing the defining features of data through the systematic formulation, testing and modification of models.
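
The normalization described in point 2 can be illustrated with a minimal sketch for the simplest baseline, a random graph in which every possible link occurs independently with probability p. This is an illustration under stated assumptions, not the authors' implementation: it assumes a directed web of S species with S*S possible links (cannibalism allowed), and the function names `max_log_likelihood` and `log_nml_random_graph` are hypothetical. The likelihood of each web is evaluated at its own maximum likelihood estimate (p = L / S^2 for a web with L links), and because that value depends only on L, the sum over all possible webs collapses to a sum over link counts weighted by binomial coefficients.

```python
from math import comb, exp, log


def max_log_likelihood(links, n_pairs):
    """Log-likelihood of a web with `links` links, maximized over p.

    Assumes independent links, so the estimate is p_hat = links / n_pairs.
    """
    if links == 0 or links == n_pairs:
        return 0.0  # p_hat is 0 or 1, so the web has probability 1
    p_hat = links / n_pairs
    return links * log(p_hat) + (n_pairs - links) * log(1.0 - p_hat)


def log_nml_random_graph(links, n_species):
    """Log NML of a web under the random-graph model (illustrative only).

    Assumption: directed links with self-loops, so n_species**2 possible links.
    """
    n_pairs = n_species ** 2
    # Log of each term in the normalizer: comb(n_pairs, k) webs have k links,
    # and all of them share the same maximized likelihood.
    log_terms = [
        log(comb(n_pairs, k)) + max_log_likelihood(k, n_pairs)
        for k in range(n_pairs + 1)
    ]
    # Sum the terms in log space to avoid underflow for larger webs.
    largest = max(log_terms)
    log_normalizer = largest + log(sum(exp(t - largest) for t in log_terms))
    return max_log_likelihood(links, n_pairs) - log_normalizer


if __name__ == "__main__":
    # Hypothetical example: 10 species and 25 observed links.
    print(log_nml_random_graph(25, 10))
```

Because the normalizer sums over every web with the same number of species, the NML values of all possible webs add up to one. A flexible model that fits many webs well inflates the normalizer and so lowers the score it can assign to the observed web.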
