Summary. We consider forecasting using a combination of forecasts when no model coincides with a non-constant data generation process (DGP). Practical experience suggests that combining forecasts adds value, and that a combination can even dominate the best individual device. We show why this can occur when forecasting models are differentially mis-specified, and why it is likely to occur when the DGP is subject to location shifts. Moreover, simple averaging may then dominate a combination based on estimated weights. Finally, it cannot be proved that only non-encompassed devices should be retained in the combination. Empirical and Monte Carlo illustrations confirm the analysis.
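As a minimal illustration of how an average can dominate the best individual device, consider the following stylized two-forecast case. It is a sketch under assumptions introduced here, not the derivation used in the analysis: the symbols $f_1$, $f_2$, $e_i$, $b_i$, $\sigma_i^2$ and $\sigma_{12}$ are notation chosen for this example only. Let $e_i = y - f_i$ denote the error of forecast $f_i$, with bias $\mathsf{E}[e_i] = b_i$, variance $\mathsf{V}[e_i] = \sigma_i^2$ and covariance $\mathsf{C}[e_1, e_2] = \sigma_{12}$. The individual and equally weighted mean square forecast errors are then

\[
\mathrm{MSFE}_i = b_i^2 + \sigma_i^2 , \qquad
\mathrm{MSFE}_{\mathrm{avg}} = \left( \frac{b_1 + b_2}{2} \right)^2 + \frac{\sigma_1^2 + \sigma_2^2 + 2\sigma_{12}}{4} .
\]

In the special case of offsetting biases with equal variances and uncorrelated errors ($b_1 = -b_2 = b$, $\sigma_1^2 = \sigma_2^2 = \sigma^2$, $\sigma_{12} = 0$), this gives

\[
\mathrm{MSFE}_{\mathrm{avg}} = \frac{\sigma^2}{2} < b^2 + \sigma^2 = \mathrm{MSFE}_1 = \mathrm{MSFE}_2 ,
\]

so the average has a smaller MSFE than either forecast alone. The biases here stand in for the effect of differential mis-specification; the special case is chosen only because the cancellation is exact.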