Predictability: Recent insights from information theory



[1] This paper summarizes a framework for investigating predictability based on information theory. This framework connects and unifies a wide variety of statistical methods traditionally used in predictability analysis, including linear regression, canonical correlation analysis, singular value decomposition, discriminant analysis, and data assimilation. Central to this framework is a procedure called predictable component analysis (PrCA). PrCA optimally decomposes variables by predictability, just as principal component analysis optimally decomposes variables by variance. For normal distributions the same predictable components are obtained whether one optimizes predictive information, the dispersion part of relative entropy, mutual information, Mahalanobis error, average signal to noise ratio, normalized mean square error, or anomaly correlation. For joint normal distributions, PrCA is equivalent to canonical correlation analysis between forecast and observations. The regression operator that maps observations to forecasts plays an important role in this framework, with the left singular vectors of this operator being the predictable components and the singular values being the canonical correlations. This correspondence between predictable components and singular vectors occurs only if the singular vectors are computed using Mahalanobis norms, a result that sheds light on the role of norms in predictability. In linear stochastic models the forcing that minimizes predictability is the one that renders the “whitened” dynamical operator normal. This condition for minimum predictability is invariant to linear transformation and is equivalent to detailed balance. The framework also inspires some new approaches to accounting for deficiencies of forecast models and estimating distributions from finite samples.
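The stated link between the regression operator, Mahalanobis norms, and canonical correlations can be checked numerically. The following is a minimal sketch, not the paper's own procedure: it assumes sample covariances stand in for the true ones, that "Mahalanobis norm" means whitening forecasts and observations by the symmetric square roots of their own covariance matrices, and that samples are stored as rows. All function names (`sym_power`, `sample_covariances`, `whitened_regression_svd`, `canonical_correlations`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def sym_power(c, p):
    """Raise a symmetric positive-definite matrix to the power p."""
    w, v = np.linalg.eigh(c)
    return (v * w**p) @ v.T

def sample_covariances(f, o):
    """Sample covariances of forecasts f (n, p) and observations o (n, q)."""
    f = f - f.mean(axis=0)
    o = o - o.mean(axis=0)
    n = f.shape[0]
    return f.T @ f / (n - 1), o.T @ o / (n - 1), f.T @ o / (n - 1)

def whitened_regression_svd(f, o):
    """SVD of the regression operator (observations -> forecasts),
    expressed in whitened (Mahalanobis) coordinates."""
    c_ff, c_oo, c_fo = sample_covariances(f, o)
    # Least-squares regression operator: f is approximated by reg @ o.
    reg = c_fo @ np.linalg.inv(c_oo)
    # Assumption: whitening each side by its own covariance implements
    # the Mahalanobis norms mentioned in the text.
    whitened = sym_power(c_ff, -0.5) @ reg @ sym_power(c_oo, 0.5)
    return np.linalg.svd(whitened)

def canonical_correlations(f, o):
    """Reference canonical correlations from the standard CCA eigenproblem."""
    c_ff, c_oo, c_fo = sample_covariances(f, o)
    m = np.linalg.solve(c_ff, c_fo) @ np.linalg.solve(c_oo, c_fo.T)
    rho2 = np.sort(np.linalg.eigvals(m).real)[::-1]
    return np.sqrt(np.clip(rho2, 0.0, None))
```

Applied to synthetic jointly normal data, the singular values returned by `whitened_regression_svd` should agree with `canonical_correlations` to within rounding error. Under the assumptions above, the columns of the left singular-vector matrix give the leading components in whitened forecast coordinates. Repeating the SVD without the whitening step generally gives different singular values, which illustrates why the choice of norm matters.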