## 1 Introduction

[2] The concept of potential predictability is based on the idealization that interannual variability of seasonal means can be partitioned into two independent components: weather noise variability that is inherently unpredictable beyond a few days (or a few weeks for certain large‒scale structures) and a slowly varying component that is potentially predictable [*Madden*, 1976]. The extent to which seasonal variability exceeds weather noise variability determines the degree of potential predictability. The source of potential predictability is often identified with slowly varying components of the climate system, such as sea surface temperature, sea ice, or soil moisture, which are relatively constant within a season but can change dramatically from year to year. In principle, potential predictability can also arise from internal atmospheric dynamics [*Frederiksen and Zheng*, 2007]. Strictly speaking, externally forced changes in greenhouse gases, aerosols, and insolation also contribute to potential predictability. The word “potential” is used because the persistent components themselves may not be predictable on seasonal timescales.

[3] There are two fundamentally different approaches to quantifying potential predictability. The first is based on generating an ensemble of climate realizations using a dynamical model. For instance, all members of an ensemble could be forced by the same sea surface temperature (SST) but initialized from slightly different atmospheric states [*Rowell et al.*, 1995; *Kumar and Hoerling*, 1995; *Zwiers*, 1996]. The spread of ensemble members for a given SST measures the weather noise variability, while the variation of the ensemble mean due to varying SSTs measures the boundary‒forced interannual variability. According to *Rowell et al.* [1995], this approach has the advantage of being able to detect weak signals, but it has the disadvantage of relying on models that are always imperfect.
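The ensemble-based partition described above can be illustrated with a short sketch. It uses a standard one-way analysis of variance on synthetic data and is not necessarily the exact estimator used in the cited studies. The function name `ensemble_variance_decomposition`, the array layout (years by members), and the synthetic signal and noise variances are assumptions made for illustration only.

```python
import numpy as np

def ensemble_variance_decomposition(x):
    """Split seasonal-mean variance into forced signal and weather noise.

    x: array of shape (n_years, n_members); each row holds the seasonal means
    of all ensemble members forced by the same boundary conditions
    (hypothetical layout chosen for this sketch).
    """
    x = np.asarray(x, dtype=float)
    n_years, n_members = x.shape
    ens_mean = x.mean(axis=1)

    # Weather noise: spread of members about the ensemble mean, pooled over years.
    noise_var = x.var(axis=1, ddof=1).mean()

    # The variance of the ensemble mean still contains noise_var / n_members,
    # so subtract it to obtain an unbiased estimate of the forced variance.
    ens_mean_var = ens_mean.var(ddof=1)
    signal_var = max(ens_mean_var - noise_var / n_members, 0.0)

    total = signal_var + noise_var
    ratio = signal_var / total if total > 0 else 0.0
    return signal_var, noise_var, ratio

# Synthetic check (assumed values): forced variance 1, noise variance 4.
rng = np.random.default_rng(0)
n_years, n_members = 2000, 10
signal = rng.normal(0.0, 1.0, size=(n_years, 1))         # same for all members
noise = rng.normal(0.0, 2.0, size=(n_years, n_members))  # differs per member
signal_var, noise_var, ratio = ensemble_variance_decomposition(signal + noise)
print(signal_var, noise_var, ratio)
```

In this sketch the ratio is the fraction of seasonal-mean variance attributable to the boundary forcing. Subtracting `noise_var / n_members` matters most for small ensembles, where the ensemble mean still carries a substantial amount of weather noise.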

[4] The second approach is to estimate potential predictability from statistical models fitted to a single realization of observation‒based data. This approach avoids problems due to inadequate dynamical and physical representations but requires long, homogeneous time series and makes statistical assumptions that may be violated. Just as dynamical models differ in their assumptions regarding the behavior of physical processes, statistical methods differ in their assumptions regarding the probabilistic structure of the underlying stochastic process. *Madden* [1976] proposed a frequency domain approach to estimating potential predictability in which weather noise variance is estimated from power spectra derived from 96‒day time series [see also *Shukla*, 1983; *Zwiers*, 1987]. *Shukla and Gutzler* [1983] proposed an analysis of variance method modified to account for autocorrelated time series [see also *DelSole and Feng*, 2013]. *Jones et al.* [1995] proposed a time domain approach in which a mixed model with errors having an autoregressive moving‒average structure is fitted to the time series. *Zheng* [1996] proposed this model for estimating autocorrelations, and *Feng et al.* [2012] proposed a simplified version of this model, called the analysis of covariance (ANOCOVA) model, for testing hypotheses about potential predictability. *Zheng et al.* [2000] proposed an analysis of variance method that can be applied to monthly mean time series. Finally, *Feng et al.* [2011] proposed a bootstrap technique that makes less restrictive assumptions about the underlying stochastic process.
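To make the single-realization idea concrete, the sketch below compares the observed interannual variance of seasonal means with the variance expected from weather noise alone. It assumes that daily weather noise follows a first-order autoregressive, AR(1), process. It is a simplified time-domain illustration and does not reproduce any of the cited methods. The function names, the AR(1) assumption, and all synthetic parameter values are hypothetical.

```python
import numpy as np

def ar1_noise_variance_of_mean(daily):
    """Variance of a seasonal mean expected from AR(1) weather noise alone.

    daily: array of shape (n_years, n_days), one season of daily values per
    year (assumed layout). Variance and lag-1 autocorrelation are estimated
    from anomalies about each season's own mean.
    """
    daily = np.asarray(daily, dtype=float)
    n_years, n_days = daily.shape
    anom = daily - daily.mean(axis=1, keepdims=True)
    norm = n_years * (n_days - 1)
    var_daily = (anom ** 2).sum() / norm
    phi = (anom[:, 1:] * anom[:, :-1]).sum() / norm / var_daily

    # Variance of an n_days mean of an AR(1) process:
    # (var / T) * [1 + 2 * sum_{k=1}^{T-1} (1 - k/T) * phi**k]
    k = np.arange(1, n_days)
    inflation = 1.0 + 2.0 * np.sum((1.0 - k / n_days) * phi ** k)
    return var_daily * inflation / n_days

def variance_ratio(daily):
    """Observed variance of seasonal means divided by the noise estimate."""
    seasonal_means = np.asarray(daily, dtype=float).mean(axis=1)
    return seasonal_means.var(ddof=1) / ar1_noise_variance_of_mean(daily)

def simulate_ar1(n_years, n_days, phi, rng):
    """Stationary AR(1) daily noise with unit-variance innovations."""
    x = np.empty((n_years, n_days))
    x[:, 0] = rng.normal(0.0, 1.0 / np.sqrt(1.0 - phi ** 2), size=n_years)
    for t in range(1, n_days):
        x[:, t] = phi * x[:, t - 1] + rng.normal(size=n_years)
    return x

rng = np.random.default_rng(1)
noise_only = simulate_ar1(200, 90, 0.5, rng)
# Add a season-long offset that changes from year to year (assumed signal).
with_signal = simulate_ar1(200, 90, 0.5, rng) + rng.normal(size=(200, 1))
ratio_null = variance_ratio(noise_only)
ratio_signal = variance_ratio(with_signal)
print(ratio_null, ratio_signal)
```

A ratio near one is consistent with weather noise alone, while a ratio well above one suggests potential predictability. Removing each season's mean biases the autocorrelation estimate slightly low, so the ratio for pure noise tends to sit a little above one. A formal test would also need to account for sampling variability.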

[5] A comparison of the above predictability estimates has not been made using the same data set spanning the same modern time period over the entire globe. The purpose of this study is to compare estimates of potential seasonal predictability derived by applying four methods to *observation‒based daily time series*. These methods are denoted as Shukla‒Gutzler (SG), Madden (MN), ANOCOVA, and bootstrap and are described in more detail in section 2. The method of *Zheng et al.* [2000] is not included since it is based on monthly means, but we will show that many of our results are consistent with the results shown in *Zheng et al.* [2000], which is notable given that their method does not use daily information. Furthermore, we consider only the autoregressive approach of *Feng et al.* [2012], which appears to be adequate in most cases. The estimates of potential predictability produced by these methods are of fundamental interest, and their differences highlight the sensitivity of these estimates to the choice of method. Application of these methods to synthetic data is discussed in section 3, while application to observation‒based data is discussed in section 5. The concluding section provides a summary and discussion of results.