Modelling distributions of fossil sampling rates over time, space and taxa: assessment and implications for macroevolutionary studies




  1. Observed patterns in the fossil record reflect not only macroevolutionary dynamics but also preservation patterns. Sampling rates vary not only over time and among major taxonomic groups, but also within time intervals across geography and environment, and among species within clades. Large databases of taxon occurrences in fossil-bearing collections allow us to quantify variation in per-collection sampling rates among species within a clade. We do this separately for different time/stratigraphic intervals and for different geographic or ecological units within those intervals. We then re-assess per-million-year sampling rates given the distributions of per-collection sampling rates.
  2. We use simple distribution models (geometric and lognormal) to describe how per-locality sampling rates are distributed among species, given occurrences among appropriate fossiliferous localities. We partition the data not only by time period but also by general biogeographic unit, to accommodate variation over space as well as among species.
  3. We apply these methods to occurrence data for Meso-Cenozoic mammals drawn from the Paleobiology Database and the New and Old Worlds fossil mammal database. We find that all models of distributed rates fit the data vastly better than the best uniform sampling rate, and that the lognormal in particular does an excellent job of summarizing sampling rates. We also show that the lognormal distributions differ fairly substantially among biogeographic units of the same age.
  4. As an example of the utility of these rates, we assess the most likely divergence times for basal (Eocene–Oligocene) carnivoramorphan mammals from North America and Eurasia using both stratigraphic and morphological data. The results allow unsampled taxa, or unsampled portions of sampled lineages, to be on either continent, and they also allow for variation in sampling rates among species. We contrast five models that use stratigraphic likelihoods in different ways, to summarize how these choices might affect macroevolutionary inferences.
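
To make the comparison between uniform and distributed sampling rates in point 3 concrete, the following is a minimal sketch and not the implementation used in this study. It assumes that each species' number of finds among n collections is binomial given its per-collection rate, that the lognormal is discretized into equal-probability rate classes, and that likelihoods are conditioned on a species being found at least once (unsampled species are never observed). The function names, the grid search, the AIC comparison, and the find counts are all hypothetical and serve only as illustration.

```python
import math
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Probability of k finds in n collections at per-collection rate p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def lognormal_rates(median, sigma, classes=20):
    """Rates at the midpoint quantiles of a lognormal, capped below 1."""
    z = NormalDist()
    return [
        min(median * math.exp(sigma * z.inv_cdf((j + 0.5) / classes)), 0.999)
        for j in range(classes)
    ]


def log_likelihood(finds, n, rates):
    """Log-likelihood of find counts, averaged over equally probable
    rate classes and conditioned on each species having >= 1 find."""
    p_zero = sum(binom_pmf(0, n, r) for r in rates) / len(rates)
    total = 0.0
    for k in finds:
        p_k = sum(binom_pmf(k, n, r) for r in rates) / len(rates)
        total += math.log(p_k / (1 - p_zero))
    return total


# Hypothetical data: finds per species among 100 collections (assumed values).
n_collections = 100
finds = [1] * 12 + [2] * 5 + [3, 4, 6, 9, 15, 28]

# Uniform model: one shared rate, chosen by grid search (1 parameter).
ll_uniform, p_hat = max(
    (log_likelihood(finds, n_collections, [i / 1000]), i / 1000)
    for i in range(1, 300)
)

# Lognormal model: median and log-scale spread, by grid search (2 parameters).
ll_lognormal, median_hat, sigma_hat = max(
    (log_likelihood(finds, n_collections, lognormal_rates(m, s)), m, s)
    for m in (0.005, 0.01, 0.02, 0.04, 0.08)
    for s in (0.25, 0.5, 1.0, 1.5, 2.0)
)

# AIC = 2 * (number of parameters) - 2 * log-likelihood; lower is better.
aic_uniform = 2 * 1 - 2 * ll_uniform
aic_lognormal = 2 * 2 - 2 * ll_lognormal
print(f"uniform:   p={p_hat:.3f}  lnL={ll_uniform:.2f}  AIC={aic_uniform:.2f}")
print(f"lognormal: median={median_hat}  sigma={sigma_hat}  "
      f"lnL={ll_lognormal:.2f}  AIC={aic_lognormal:.2f}")
```

With strongly skewed find counts such as these, a single shared rate cannot account for both the many species found once and the few found many times, which is the pattern a distributed-rate model is meant to capture. A coarse grid search is used here only to keep the sketch dependency-free; a numerical optimizer would be the more usual choice.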