2.1. Methodology, Model and Experiments
 Testing the impact of ocean initial conditions on seasonal forecast skill requires a comprehensive set of forecasts generated from different ocean states. The baseline experiment for this study is the ECMWF S3 seasonal forecasting system [Anderson et al., 2006; Molteni et al., 2007], in which the ocean initial conditions are created using the ECMWF ocean reanalysis system ORA-S3 [Balmaseda et al., 2008]. All available observations of temperature, salinity and altimetric sea level anomalies are used. The assimilation of altimeter data requires a prescribed mean dynamic topography (MDT), which in ORA-S3 is derived from a previous assimilation run that used subsurface temperature and salinity (MDT1). In methods (i) and (ii) the atmospheric fluxes are taken from the ERA40 reanalysis for the period January 1959 to June 2002 and from NWP operational analyses thereafter (ERA/OPS). In all methods the ocean model SSTs are strongly relaxed to analyzed daily SST maps from the OIv2 SST product [Reynolds et al., 2002] from 1982 onwards. In methods (ii) and (iii), no other ocean observations are used. The ocean model has a horizontal resolution of 1° × 1° with equatorial refinement. For further details see Balmaseda et al.
 The ocean model used for the analyses is also used for the coupled forecasts, where it is coupled daily to the atmospheric model, IFS cycle 31r1, at T159 horizontal resolution with 62 vertical levels extending up to 5 hPa. Table 1 summarizes the experiments conducted. The hindcast (sometimes called reforecast) experiments consist of 80 start dates, spanning the period 1987–2006 and sampled every three months (Jan, Apr, Jul and Oct). For each start date, an ensemble of 5 coupled forecasts with perturbed initial conditions is integrated 7 months ahead.
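 The size of the hindcast set follows directly from this design. The short sketch below simply enumerates the start dates; the function name and the choice of the first day of each month as the start day are assumptions made for illustration and are not stated in the text.

```python
from datetime import date

def hindcast_start_dates(first_year=1987, last_year=2006, months=(1, 4, 7, 10)):
    """Enumerate hindcast start dates.

    Assumption: forecasts start on the 1st of Jan, Apr, Jul and Oct.
    The day of the month is not given in the text.
    """
    return [date(year, month, 1)
            for year in range(first_year, last_year + 1)
            for month in months]

starts = hindcast_start_dates()
n_members = 5       # ensemble size per start date (from the text)
n_lead_months = 7   # forecast length in months (from the text)
n_forecasts = len(starts) * n_members  # total number of coupled integrations
```

 With 20 years and 4 start months per year, this gives the 80 start dates quoted above, and 400 coupled integrations per experiment.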
Table 1. Description of Experiments
|Experiment|Information in the Ocean Initial Conditions|
|i ALL|SST + Atmos obs + Ocean obs|
|ii NO-OCOBS|SST + Atmos obs|
|NOMOOR|ALL except moorings|
|NOALTI|ALL except altimeter|
|NOARGO|ALL except Argo|
|MDT0|As ALL, but using MDT0|
 The three initialization strategies can also be viewed as observing system experiment (OSE) type experiments in which atmospheric and oceanic data are successively withdrawn, as summarized in Table 1. Differences in forecast skill between ALL (method i) and NO-OCOBS (method ii) indicate the impact of ocean observations, while differences between NO-OCOBS and SST-ONLY (method iii) indicate the impact of the atmospheric data that went into ERA40/OPS. The combined impact of ocean and atmosphere information can be gauged from the differences between ALL and SST-ONLY. All skill scores have been cross-validated.
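 The text does not specify the cross-validation scheme. One common choice, assumed here purely for illustration, is leave-one-year-out: the anomaly for each year is computed relative to a climatology that excludes that year. The function name and the array layout are hypothetical.

```python
import numpy as np

def cross_validated_anomalies(x):
    """Leave-one-year-out anomalies (an assumed scheme, not the paper's stated one).

    x: array of shape (n_years, ...) holding one start month and lead time.
    Each year's anomaly is taken relative to the mean of all other years.
    """
    x = np.asarray(x, dtype=float)
    n_years = x.shape[0]
    total = x.sum(axis=0, keepdims=True)
    # Climatology for year k excludes year k itself.
    clim_without_year = (total - x) / (n_years - 1)
    return x - clim_without_year
```

 Applied to both forecasts and observations, this keeps the verifying year out of the climatology used to form its own anomaly.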
2.2. Assessment of Skill of Different Initialization Strategies
 The evolution of the SST bias in the coupled model is shown in Figures 1a and 1b for regions NINO3 and NINO4 (defined in Table 2) for the three experiments. The amplitude of the interannual variability of the coupled model as a function of lead time is shown in the lower panels. The latter is calculated as the ratio between the standard deviation of the interannual anomalies of the coupled model (computed separately for each ensemble member) and that of the observed SST.
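 The two diagnostics described above can be sketched as follows. This is a minimal illustration under stated assumptions: the array layout, the function names, and the averaging of the per-member standard deviations across the ensemble are choices made here and are not specified in the text. The amplitude diagnostic is a ratio of standard deviations, as defined above.

```python
import numpy as np

def forecast_drift(fc, obs):
    """Mean forecast-minus-observation bias as a function of lead time.

    fc:  (n_years, n_members, n_leads) forecast SST index for one start month
    obs: (n_years, n_leads) verifying observed SST index
    """
    fc = np.asarray(fc, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return fc.mean(axis=(0, 1)) - obs.mean(axis=0)

def variance_ratio(fc, obs):
    """Ratio of model to observed interannual standard deviation per lead time.

    The model standard deviation is computed separately for each ensemble
    member (as in the text) and then averaged over members (an assumption).
    """
    fc = np.asarray(fc, dtype=float)
    obs = np.asarray(obs, dtype=float)
    fc_anom = fc - fc.mean(axis=0, keepdims=True)     # remove model climate
    obs_anom = obs - obs.mean(axis=0, keepdims=True)  # remove observed climate
    std_per_member = fc_anom.std(axis=0)              # (n_members, n_leads)
    return std_per_member.mean(axis=0) / obs_anom.std(axis=0)
```

 A ratio below 1 indicates that the coupled model underestimates the observed interannual variability at that lead time.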
Figure 1. (top) Forecast drift as a function of forecast lead time for 4 start months in regions (a) NINO3 and (b) NINO4 for experiments ALL, NO-OCOBS and SST-ONLY. (bottom) Variance ratio as a function of lead time for the same experiments averaged over all start months in (c) NINO3 and (d) NINO4.
Table 2. Definition of Area Averaged Indices
 Both the model bias and the amplitude of the interannual variability are sensitive to the initial conditions. In the Eastern Pacific (NINO3; Figures 1a and 1c), ALL shows the strongest warm bias for forecasts initialized in April, July and October. The warm bias is symptomatic of initialization shock: the coupled model is not able to maintain the slope of the thermocline in the initial conditions, and fast dynamic adjustment takes place through a downwelling Kelvin wave, which depresses the thermocline in the Eastern Pacific, shutting down the upwelling and producing surface warming. The bias is close to zero in experiment NO-OCOBS, where the initial conditions have a flatter thermocline and the dynamic Kelvin wave adjustment is consequently weaker. The cold bias in SST-ONLY, which develops especially fast for October starts, is likely to be related to the thermocline being too shallow in the initial conditions. This leads to an overestimation of the cooling by upwelling and to the development of a cold bias as soon as the strong relaxation to SST used in the initialization process is turned off.
 The amplitude of the interannual variability seems to be related to the magnitude of the bias, the least activity occurring in the presence of warm bias. This is to be expected if convective processes set an upper limit on how large values of SST can be and could explain why the amplitude of the interannual variability in ALL is low. However, it does not explain the low levels of activity in NO-OCOBS and SST-ONLY, suggesting that the underestimation of the interannual variability in NINO3 is not only related to the initial conditions, but stems from other sources of error in the coupled model.
 In the Central Western Pacific (NINO4; Figures 1b and 1d) the initial conditions also have a large impact on the model bias and interannual variability. Here, ALL shows the smallest bias, followed by NO-OCOBS. The cold bias for SST-ONLY is the largest. The cold biases in NO-OCOBS and SST-ONLY are especially large during the second half of the year, consistent with the cold tongue penetrating too far west. In this region the amplitude of the interannual variability is related to the mean state and to the initialization procedure. For instance, overactive upwelling, characteristic of a cold tongue regime in this area, will produce an overestimation of the interannual variability, as happens in experiment SST-ONLY. The amplitude is underestimated in experiment ALL, even when the bias is small. The underestimation of the interannual variability in NINO4 for experiment ALL, and in NINO3 for all the experiments, suggests the existence of errors not corrected with the initialization, such as the underestimation of the atmospheric intraseasonal variability.
 Balmaseda et al. show that the assimilation of ocean observations has two main effects on the ocean mean state: it increases the heat content of the Equatorial Pacific by deepening the thermocline and increases the slope of the thermocline. Results shown in Figure 1 suggest that while the first correction is maintained during the forecast, thus avoiding the westward penetration of the cold tongue and the cold bias in NINO4, the slope of the thermocline is difficult to maintain, and is lost by rapid dynamical adjustment leading to the warm bias in the Eastern Pacific (NINO3).
 The impact of the initialization strategies on forecast skill is shown in Figure 2 as a function of lead time for region NINO4. In this region the most skillful forecast at all lead times is obtained by method (i), and the least skillful by method (iii). There is a clear advantage from assimilating ocean observations. The results hold for both RMS error (Figure 2a) and anomaly correlation (Figure 2b).
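 For reference, the two skill measures used in Figure 2 can be computed as in the sketch below. It assumes, as an illustration only, that skill is evaluated on ensemble-mean anomalies of an area-averaged index, arranged by start date and lead time; the function name and array layout are hypothetical.

```python
import numpy as np

def rmse_and_acc(fc_anom, obs_anom):
    """RMS error and anomaly correlation as a function of lead time.

    fc_anom:  (n_starts, n_leads) ensemble-mean forecast anomalies (assumed layout)
    obs_anom: (n_starts, n_leads) verifying observed anomalies
    """
    fc_anom = np.asarray(fc_anom, dtype=float)
    obs_anom = np.asarray(obs_anom, dtype=float)
    err = fc_anom - obs_anom
    rmse = np.sqrt((err ** 2).mean(axis=0))
    # Centred correlation over start dates, computed separately per lead time.
    f = fc_anom - fc_anom.mean(axis=0)
    o = obs_anom - obs_anom.mean(axis=0)
    acc = (f * o).sum(axis=0) / np.sqrt((f ** 2).sum(axis=0) * (o ** 2).sum(axis=0))
    return rmse, acc
```

 Lower RMS error and higher anomaly correlation both indicate better skill, which is why the two panels of Figure 2 give consistent rankings of the experiments.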
Figure 2. Impact of initialization strategies in forecast skill as a function of lead time in region NINO4, in terms of (a) RMS error and (b) anomaly correlation. The best skill is obtained by experiment ALL and the worst by SST-ONLY.
 Figure 3 shows the impact on forecast skill for the regions defined in Table 2. Figure 3a shows the relative reduction in the monthly mean absolute error (MAE) that results from adding information from ocean and/or atmospheric observations, for forecast range 1–7 months. For example, in the EQ3 region, not using ocean observations increases the MAE by 12%, not using atmospheric observations increases it by close to 15%, and using neither increases it by about 28%. With the exception of EQATL, the best scores are achieved by experiment ALL. This means that for the ECMWF system, which uses method (i), the benefits of ocean data assimilation and of using fluxes from atmospheric (re)analyses more than offset the problems arising from initialization shock. In the first 3 months of the forecast (not shown), the combined information from oceanic and atmospheric observations reduces the error by more than 40% in the different areas of the Equatorial Pacific (EQ3, NINO4, NINO3). Atmospheric observations are the main contributor to the reduction of forecast error. The contribution of the ocean observations is largest in the Central Western Pacific (13% in EQ3), but is negative in EQATL.
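 The relative MAE reduction plotted in Figure 3 can be illustrated with the sketch below. The text does not state which experiment is used as the normalizing reference; here the reduction is expressed relative to the MAE of the experiment that lacks the information, which is an assumption, as are the function names.

```python
import numpy as np

def mae(fc, obs):
    """Mean absolute error between forecast and observed values."""
    fc = np.asarray(fc, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean(np.abs(fc - obs)))

def mae_reduction_percent(mae_without, mae_with):
    """Percentage reduction in MAE from adding a source of information.

    Assumption: normalized by the MAE of the experiment WITHOUT the
    information (e.g. NO-OCOBS when measuring the impact of ocean data).
    A negative value means that adding the information degraded the forecast.
    """
    return 100.0 * (mae_without - mae_with) / mae_without
```

 Under this convention, the OCOBS bar for a region would be `mae_reduction_percent(mae_no_ocobs, mae_all)`, and a negative value, as reported for EQATL, means that the experiment with ocean observations has the larger error.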
Figure 3. Impact of initialization on forecast skill for different regions, as measured by the reduction in mean absolute error for the forecast range 1–7 months. (a) Comparison of initialization strategies for the period 1987–2006. OCOBS indicates the differences between strategies (i) and (ii), which differ in the use of ocean observations. ATOBS indicates the differences between (ii) and (iii), which differ in the use of atmospheric data, while OC + AT gives the differences between (i) and (iii) and represents the combined impact of atmospheric and oceanic data. (b) Comparison of altimeter, moorings and MDT for the period 1993–2006. ALTI indicates the difference in skill between NOALTI and ALL, and MOOR the difference between NOMOOR and ALL. MEAN indicates the differences from using the different MDTs. (c) Comparison between Argo, altimeter and moorings for the period 2001–2006. ARGO represents the difference between NOARGO and ALL. Only differences exceeding the 70% significance level of a one-tailed t-test are shown.
 The contributions of oceanic and atmospheric observations appear to be cumulative in reducing the MAE at all lead times. The OC + AT bars measure the difference in skill between (i) and (iii), confirming that the assimilation of atmospheric and oceanic data is markedly better than using SST alone. This suggests that the Luo et al. approach is not the best, at least not at the forecast ranges considered here.
 The degraded skill in the Equatorial Atlantic may be indicative of poor balance between the ocean initial conditions and the coupled model fluxes, perhaps symptomatic of coupled model errors [Davey et al., 2001]. It can also be indicative of errors in the ocean initial conditions, although comparison with independent data shows that assimilating ocean data does improve the quality of the ocean analysis [Balmaseda et al., 2008]. It could also reflect spurious variability produced by the non-stationarity of the ocean observing system. A closer look at the impact of the different ocean observing systems for more specific time periods is given in the next section.