• Calibration;
  • Continuous predictions;
  • Crossvalidation;
  • Regression;
  • Scoring rules

Summary

Calibration, the statistical consistency of forecast distributions and the observations, is a central requirement for probabilistic predictions. Calibration of continuous forecasts is typically assessed using the probability integral transform histogram. In this article, we propose significance tests based on scoring rules to assess calibration of continuous predictive distributions. For an ideal normal forecast we derive the first two moments of two commonly used scoring rules: the logarithmic score and the continuous ranked probability score. This naturally leads to the construction of two unconditional tests for normal predictions. More generally, we propose a novel score regression approach, where the individual scores are regressed on suitable functions of the predictive variance. Based on the Dawid–Sebastiani score, this conditional approach is applicable even for certain nonnormal predictions. Two case studies illustrate that the score regression approach typically has more power in detecting miscalibrated forecasts than the other approaches considered, including a recently proposed technique based on conditional exceedance probability curves.
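To make the idea of an unconditional test concrete, the following is a minimal sketch for the logarithmic score only, not the article's implementation. It relies on a standard fact: if the observation truly follows the forecast N(mu, sigma^2), the standardized error is standard normal, so the logarithmic score has mean 0.5 log(2 pi sigma^2) + 0.5 and variance 0.5. The function names, the simulated data, and the use of a normal approximation for the summed score deviations with a two-sided p-value are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def log_score(y, mu, sigma):
    """Logarithmic score (negative log density) of a N(mu, sigma^2) forecast at y."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2 * math.pi * sigma**2) + 0.5 * z**2


def unconditional_logs_test(ys, mus, sigmas):
    """Return (z, p) for H0: every forecast N(mu_i, sigma_i^2) is ideal.

    Hypothetical helper: uses E[LogS_i] = 0.5*log(2*pi*sigma_i^2) + 0.5 and
    Var[LogS_i] = 0.5 under H0, plus a normal approximation for the sum.
    """
    n = len(ys)
    # Sum of deviations of observed scores from their expectation under H0.
    diff = sum(
        log_score(y, m, s) - (0.5 * math.log(2 * math.pi * s**2) + 0.5)
        for y, m, s in zip(ys, mus, sigmas)
    )
    z = diff / math.sqrt(0.5 * n)
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p


# Simulated data (assumption): observations drawn from the stated forecasts.
rng = random.Random(1)
n = 2000
mus = [rng.uniform(-1, 1) for _ in range(n)]
sigmas = [rng.uniform(0.5, 2) for _ in range(n)]
ys = [rng.gauss(m, s) for m, s in zip(mus, sigmas)]

# Calibrated forecasts versus overconfident ones (stated sd halved).
z_ok, p_ok = unconditional_logs_test(ys, mus, sigmas)
z_bad, p_bad = unconditional_logs_test(ys, mus, [0.5 * s for s in sigmas])
```

In this sketch the overconfident forecasts yield a large positive statistic, because their scores exceed what an ideal forecast with the stated variance would incur on average. A test of this kind compares only the average score with its expectation, which is what distinguishes it from the conditional score regression approach.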