Research Article
Understanding forecast verification statistics
Article first published online: 20 MAR 2008
DOI: 10.1002/met.51
Copyright © 2008 Royal Meteorological Society
Issue
Meteorological Applications
Special Issue: Forecast Verification
Volume 15, Issue 1, pages 31–40, March 2008
Additional Information
How to Cite
Mason, S. J. (2008), Understanding forecast verification statistics. Met. Apps, 15: 31–40. doi: 10.1002/met.51
Publication History
- Issue published online: 20 MAR 2008
- Article first published online: 20 MAR 2008
- Manuscript Accepted: 2 JAN 2008
- Manuscript Revised: 6 DEC 2007
- Manuscript Received: 18 SEP 2007
Funded by
- National Oceanic and Atmospheric Administration. Grant Number: AN07GP0213
