Identifying the causes of the poor decadal climate prediction skill over the North Pacific

Authors


Abstract

[1] While the North Pacific region has a strong influence on North American and Asian climate, it is also the area with the worst performance in several state-of-the-art decadal climate predictions in terms of correlation and root mean square error scores. The failure to represent two major warm sea surface temperature events occurring around 1963 and 1968 largely contributes to this poor skill. The magnitude of these events rivals the largest observed temperature anomalies in the twenty-first century that might be associated with the long-term warming. Understanding the causes of these major warm events is thus of primary importance for improving predictions of North Pacific, North American and Asian climate. The 1963 warm event stemmed from the propagation of a warm ocean heat content anomaly along the Kuroshio-Oyashio extension. The 1968 warm event originated from the upward transfer of a warm water mass centered at 200 m depth. Because they were associated with long-lived ocean heat content anomalies, we expect those events to be, at least partially, predictable. Biases in ocean mixing processes present in many climate prediction models seem to explain the inability to predict these two major events. Such currently unpredictable warm events, if they occurred again in the next decade, would substantially enhance the effect of long-term warming in the region.

1. Introduction

[2] It has been recently hypothesized that initial-condition information could significantly improve skill in predicting the near-term climate relative to uninitialized historical simulations or climate projections [Smith et al., 2007; Keenlyside et al., 2008; Pohlmann et al., 2009; Mochizuki et al., 2010; Smith et al., 2010]. This initiated the inclusion in the recent CMIP5 project of a near-term climate prediction (also termed decadal prediction) exercise [Taylor et al., 2012], in which the climate predictability arising from both initial-condition information and changes in external forcings are exploited [Hawkins and Sutton, 2009; Meehl et al., 2009; Murphy et al., 2010; Doblas-Reyes et al., 2010]. Near-term climate prediction can also be viewed as a new tool to diagnose discrepancies between modeled and observed climate processes. Within the framework of the CMIP5 decadal climate prediction exercise, we focus in this article on the effective predictability of North Pacific Sea Surface Temperature (SST) departures which influence significantly North American climate variables such as temperature and precipitation on seasonal to decadal timescales [Latif and Barnett, 1994; Zhang et al., 1997; Mantua and Hare, 2002; Yu et al., 2007; Ault and St. George, 2010].

[3] On decadal timescales, the Pacific Decadal Oscillation (PDO) [Mantua et al., 1997], whose inter-hemispheric signature is also called Inter-decadal Pacific Oscillation (IPO) [Power et al., 1999; Folland et al., 2002], stands as the dominant North Pacific mode of decadal SST variability. The PDO has been hypothesized to be primarily driven by atmospheric variability through heat fluxes and advection by the mean ocean gyres [Saravanan and McWilliams, 1998; Wang and Chang, 2004] or wind-driven adjustments in the ocean circulation via Rossby waves [Frankignoul et al., 1997; Jin, 1997; Neelin and Weng, 1999]. The Pacific North American (PNA) teleconnection pattern [Blackmon et al., 1984; Trenberth and Hurrell, 1995] is argued to play a key role in this forcing [Trenberth and Hurrell, 1995; Newman et al., 2003]. The El Niño Southern Oscillation (ENSO) derived variability of the PNA has been hypothesized to be integrated and low-passed by the ocean to yield the decadal PDO pattern [Newman et al., 2003; Schneider and Cornuelle, 2005; Newman, 2007]. This low-pass filter builds on the re-emergence mechanism [Alexander and Deser, 2005; Alexander et al., 1999; Deser et al., 2003] and ocean dynamics may play a role in shaping the anomalies [Schneider and Cornuelle, 2005]. However, a positive feedback of the SST anomalies on the atmosphere may result in self-sustaining decadal oscillations [Latif and Barnett, 1994, 1996; Robertson, 1996; Kwon and Deser, 2007], possibly through a linkage with the tropical Pacific variability [Trenberth and Hurrell, 1995; Pierce et al., 2000; Deser et al., 2004; Vimont, 2005]. Boer and Lambert [2008] suggest a potential decadal predictability of North Pacific surface temperatures, which they quantify as the ratio of decadal variance over total variance, in the preindustrial control simulations performed in the framework of the CMIP3 project [Meehl et al., 2007]. 
Using a “perfect model approach”, Collins [2002] does not verify those findings in the Hadley Centre Coupled Model version 3. However, still using a “perfect model approach”, Hermanson and Sutton [2010] show a potential predictability of the IPO up to 2 years ahead in the same climate model as Collins [2002], for a particular set of initial climate states. Attempts at predicting the North Pacific climate 20 years into the future have suggested a skill dominated by greenhouse forcing [Meehl et al., 2010] in the Community Climate System Model version 3 [Collins et al., 2006]. It has also been argued that the North Pacific ocean heat content in the top 300 m is predictable up to a decade ahead [Mochizuki et al., 2010] thanks to diagnosed skill in predicting the Pacific Decadal Oscillation.

[4] Contrary to previous studies, which focused on a single-model set of climate predictions, our analyses rely on a wide ensemble of climate forecasting systems comprising the four involved in the ENSEMBLES project [Doblas-Reyes et al., 2010], the perturbed-parameter nine-member Met Office Decadal Climate Prediction System [Smith et al., 2007, 2010], and five of the participants in the CMIP5 project [Taylor et al., 2012]. We thus analyze hindcasts ranging from the very first decadal prediction attempts to the very latest ones, which use various initialization techniques and ensemble generation methods. All those state-of-the-art dynamical forecast systems exhibit consistently weak performance in predicting the North Pacific multiannual variability, as we illustrate in Section 3 after describing those systems together with the observational data sets and the analysis methods in Section 2. This failure raises the question of whether the multiannual climate variability in this region is fundamentally unpredictable or whether this variability is driven by potentially predictable mechanisms that the current generation of climate models is unable to capture. To investigate the reasons for this particularly low skill, we identify and describe, in Section 4, two major warm events which are consistently missed by every climate forecast system. In Sections 5 and 6, we show that different mechanisms are responsible for the two events, based on an extensive set of eleven observational data sets. Section 7 returns to the climate predictions to attempt to explain why they exhibit such low skill in the North Pacific and points to the representation of ocean mixing processes as the most probable weakness of the current generation of climate models, though those ocean mixing biases might be linked to uncertainties in the ocean-atmosphere fluxes. Sections 8 and 9 respectively provide a discussion and conclusions.

2. Data and Methods

2.1. The Forecast Systems

[5] We assess the performance of the current generation of climate forecast systems from three multimodel ensembles, the characteristics of which are summarized in Table 1:

Table 1. Summary Table of the Ensembles of Decadal Hindcasts Used in This Article (a)

| Ensemble Name | Models | Initialization |
| --- | --- | --- |
| CMIP5 | HadCM3 | Assimilation in the coupled model of anomalies from ERA40 and ERAint reanalyses and ocean observations |
| CMIP5 | MRI-CGCM3 | Assimilation in the coupled model of gridded ocean subsurface observations of T and S |
| CMIP5 | MIROC4 | Assimilation in the coupled model of ocean anomalies of gridded subsurface observations of T and S |
| CMIP5 | MIROC5 | Assimilation in the coupled model of ocean anomalies of gridded subsurface observations of T and S |
| CMIP5 | EC-Earth v2 | Full field initialization from NEMOVAR-ORAS4 ocean reanalysis and ERA40 (before 1989) and ERAint (after 1989) land/atmosphere reanalysis |
| DePreSys | 9 variants of HadCM3 | Nudging of anomalies in horizontal winds, atmospheric temperature, surface pressure, ocean temperature and salinity + online flux correction |
| ENSEMBLES | CNRM-CM3 | Full field initialization |
| ENSEMBLES | ECMWF-S3 | Full field initialization |
| ENSEMBLES | HadGEM1 | Full field initialization |
| ENSEMBLES | ECHAM5/OM1 | Nudging of observed SST anomalies in the coupled model |

(a) The second column provides the list of models included in the ensemble. The third column provides the initialization technique. More details are provided in Section 2.

[6] 1. The first one comprises contributions to the CMIP5 project [Taylor et al., 2012] produced with the HadCM3 [Gordon et al., 2000; Pope et al., 2000], the MRI-CGCM3 [Yukimoto et al., 2001], the MIROC4, the MIROC5 [Hasumi and Emori, 2004] and the EC-Earth v2 [Hazeleger et al., 2010] coupled climate models with respectively 10, 9, 3, 6, and 5 ensemble members per start date. These contributions consist of 10-year-long hindcasts initialized from estimates of observed climate states every 5 years over the period 1960–2005. The initialization date varies from 1 November to the following 1 January depending on the forecast system. We thus consider that the first forecast year starts in the first January of the hindcast. This multimodel ensemble will be referred to as CMIP5 in the following. For five realizations of the EC-Earth v2 model, the hindcasts also include one start date every year over the 1960–2005 period.

[7] 2. The second is a nine-member perturbed-parameter ensemble consisting of nine variants of the HadCM3 model [Hawkins and Sutton, 2009; Gordon et al., 2000] within the Decadal Climate Prediction System of the UK Meteorological Office [Smith et al., 2007, 2010]. The variants are obtained by simultaneously perturbing 29 atmosphere and sea-ice parameters [Murphy et al., 2004]. This set of hindcasts will be referred to as DePreSys. Ten-year hindcasts were started every year from 1960 to 2005, on 1 November, using an anomaly initialization technique [Robson, 2010].

[8] 3. The third one comprises contributions to the ENSEMBLES project [Doblas-Reyes et al., 2010] with four different coupled ocean-atmosphere models. The experimental setup is the one later chosen for the decadal prediction exercise of CMIP5 and consists of ten-year-long ensemble dynamical hindcasts initialized once every five years over the period 1960–2005, on 1 November. These hindcasts have three-member ensembles per model.

[9] Each near-term climate prediction produced with those systems is initialized from an estimate of the observed climate state and takes into account in different ways both natural and anthropogenic changes to the radiative forcing. The reader is referred to Table 1 for more details about the initialization techniques.

2.2. The Observational Data Sets

[10] The analyses of the key mechanisms driving the North Pacific climate variability rely on the following data sets: (1) SST: the NOAA Extended Reconstructed SST v3b data set (named ERSST in this article) [Smith et al., 2008] and the HadISST v1.1 data set from the UK Met Office (named HadISST) [Rayner et al., 2003]; (2) 3-dimensional ocean temperature: the NEMOVAR-COMBINE reanalysis [Balmaseda et al., 2010] (named NEMOVAR); (3) mixed layer depth: the NEMOVAR-COMBINE reanalysis [Balmaseda et al., 2010] and a gridded observational data set based on thermodynamic profiles [de Boyer Montégut et al., 2004]; (4) surface heat fluxes: the da Silva et al. [1994] (named DS94) and the OAFluxes [Yu et al., 2008] data sets; (5) 2-meter temperature (T2M): GHCN observations [Peterson et al., 1998], and the NCEP/NCAR R1 [Kalnay et al., 1996] (named NCEP), ERA-Interim [Dee et al., 2011] (named ERAint) and ERA40 [Uppala et al., 2004] reanalyses; and (6) 10-meter wind speed: the DFS4.3 data set [Brodeau et al., 2010] and the NCEP reanalysis.

2.3. Computation of Anomalies

[11] When assessing the forecast system SST skill in Section 3, the model or observation climatology is defined as a function of lead time, by averaging the hindcast SST across the starting dates, using only hindcast values for which observations are available at the corresponding dates. The model climatologies obtained in such a way are then subtracted from each raw hindcast to obtain anomalies over the whole hindcast period. The same method is applied to the observations to obtain anomalies over the whole observational period. The anomalies thus obtained are referred to as “per-pair” anomalies following Garcia-Serrano and Doblas-Reyes [2012]. For example, the computation of the “per-pair” climatologies to compare the CMIP5 ensemble-mean SST to the ERSST one will not take into account the hindcast that starts in 2005 for lead times longer than six years, since the ERSST data set ends in December 2011. However, we will still be able to compute “per-pair” anomalies for lead times longer than six years for this hindcast by subtracting the climatology computed from the nine preceding hindcasts.
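The “per-pair” procedure described above can be illustrated with a minimal sketch. The data layout (dictionaries keyed by start year and calendar year, one value per forecast year) and the function name `per_pair_anomalies` are assumptions made for illustration, not the processing code used for this article.

```python
def per_pair_anomalies(hindcasts, obs):
    """Compute lead-time-dependent ("per-pair") anomalies.

    hindcasts: {start_year: [value at forecast year 1, 2, ...]} (hypothetical layout)
    obs:       {year: value}; years without observations are simply absent
    Forecast year k of a hindcast started late in start_year is assumed to
    verify in calendar year start_year + k.
    """
    n_leads = len(next(iter(hindcasts.values())))
    mod_clim, obs_clim = [], []
    for k in range(n_leads):
        # Keep only the hindcast values that have an observed counterpart.
        pairs = [(h[k], obs[start + k + 1])
                 for start, h in hindcasts.items() if start + k + 1 in obs]
        # Assumes at least one pair exists at every lead time.
        mod_clim.append(sum(m for m, _ in pairs) / len(pairs))
        obs_clim.append(sum(o for _, o in pairs) / len(pairs))

    # Model anomalies are defined for every hindcast value, paired or not.
    mod_anom = {start: [h[k] - mod_clim[k] for k in range(n_leads)]
                for start, h in hindcasts.items()}
    # Observed anomalies exist only where observations do (None otherwise).
    obs_anom = {start: [obs[start + k + 1] - obs_clim[k]
                        if start + k + 1 in obs else None
                        for k in range(n_leads)]
                for start in hindcasts}
    return mod_anom, obs_anom
```

In this sketch a hindcast whose late forecast years have no observed counterpart is left out of the climatology at those lead times, but it still receives model anomalies there from the climatology of the other hindcasts, as in the 2005 example given in the text.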

[12] The hindcast performance is assessed from the bias-corrected “per-pair” anomalies independently of the initialization technique employed. Indeed, up to now, there is no guarantee that anomaly initialization techniques remove initial drifts or shocks [Robson, 2010]. This analysis method avoids spurious disparities in the assessed performance arising from post-processing the data from different forecast systems inconsistently. Hindcast skill is measured either using the anomaly correlation coefficient or Root Mean Square Error (RMSE). The significance level for the correlation skill is computed via a one-sided Student t-test which takes into account the autocorrelation of the time series.
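The text does not state how the autocorrelation enters the t-test. The sketch below therefore assumes one commonly used adjustment, in which the sample size is replaced by an effective sample size derived from the lag-1 autocorrelations of the two series. The function names and this particular formula are illustrative assumptions, not necessarily the authors' procedure.

```python
import math

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    return sum(dev[i] * dev[i + 1] for i in range(n - 1)) / sum(d * d for d in dev)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_t_stat(x, y):
    """Return (r, n_eff, t) for a correlation test with reduced sample size.

    Assumed adjustment: n_eff = n * (1 - r1x * r1y) / (1 + r1x * r1y),
    clamped to the interval [3, n]. Undefined when |r| equals 1.
    """
    n = len(x)
    r = pearson(x, y)
    rr = lag1_autocorr(x) * lag1_autocorr(y)
    n_eff = min(float(n), max(3.0, n * (1.0 - rr) / (1.0 + rr)))
    t = r * math.sqrt((n_eff - 2.0) / (1.0 - r * r))
    return r, n_eff, t
```

For a one-sided test, the returned t value would be compared with the upper critical value of a Student distribution with n_eff − 2 degrees of freedom, taken from a table or a statistics library.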

[13] When considering the observational data sets independently of the forecast systems for process analysis in Sections 4 to 6, the common period to all the data sets, i.e. from January 1958 to June 1994, is used to compute their climatological annual cycle. This annual cycle is then subtracted from the whole data set to obtain the anomalies.

[14] For plotting purposes, a smoothing is performed as the last step of the data processing, with a 12-month running mean in Sections 3 to 6 and with a 6-month running mean in Section 7.
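The two processing steps described for the observational data sets, removal of a climatological annual cycle computed over a reference period followed by a running mean, can be sketched as follows. The function names and the index-based reference period are illustrative assumptions. The series is assumed to be monthly and to start in January.

```python
def monthly_anomalies(values, ref_start, ref_end):
    """Subtract the mean annual cycle of values[ref_start:ref_end] from the
    whole monthly series. values[0] is assumed to be a January value."""
    clim = []
    for month in range(12):
        ref = [values[i] for i in range(ref_start, ref_end) if i % 12 == month]
        clim.append(sum(ref) / len(ref))
    return [v - clim[i % 12] for i, v in enumerate(values)]

def running_mean(series, window):
    """Plain running mean; returns len(series) - window + 1 values.
    How the window is centred in the article is not stated, so no time
    axis is attached to the output here."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]
```

With monthly data starting in January 1958, the common period from January 1958 to June 1994 would correspond to `ref_start=0` and `ref_end=438` in this layout.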

3. State-of-the-Art Climate Forecast Skill

[15] In the CMIP5 ensemble, the North Pacific region stands out, along with the Southern Ocean, as the region with the poorest anomaly skill score globally, for hindcasts averaged over the forecast time 2–5 years (Figure 1a). This characteristic appears in each individual forecast system included in the CMIP5 ensemble (Figure S1 in auxiliary material Text S1) as well as the ENSEMBLES and DePreSys ensembles (Figure S2 in auxiliary material Text S1). To compare quantitatively the performance of the CMIP5 hindcasts in the different world oceans, we compute the combined spatial and temporal correlation (Figure 1b) and RMSE (Figure 1c) of the predicted against observed SST anomalies as a function of start date, for each ocean after applying a 12-month running mean.

[Display equations defining the combined spatial and temporal correlation and RMSE scores are missing from the source text.]

where [math formula], SSTano_obs and SSTano_mod stand for the observed and modeled “per-pair” SST anomalies computed as described in Section 2, lonmin, lonmax, latmin, latmax correspond to the geographical limits defined for a given region, forecast1 and forecast10 correspond to the first and tenth forecast respectively, and t is the forecast time. Those scores quantify the ability of the CMIP5 hindcasts to reproduce the spatial patterns rather than the basin-averaged yearly SST anomalies as a function of the forecast time. The best performance is found in the Indian Ocean, where the correlation reaches about 0.4–0.6 and the RMSE about 0.2–0.3°C. The Southern region consistently has the lowest correlation across the hindcasts but also tends to have the lowest RMSE. The North Pacific Basin correlation is only slightly above the Southern Ocean one and its RMSE tends to be the highest. Accounting for both the RMSE and correlation scores, the North Pacific Basin thus appears as the region where the state-of-the-art decadal climate predictions perform the worst worldwide, followed closely by the South Pacific region.
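As an illustration of scores pooled over start dates, longitudes and latitudes at a fixed forecast time, the following sketch computes a correlation and an RMSE from nested lists. Whether the original scores are centred or area-weighted cannot be recovered from the text. The sketch assumes a centred Pearson correlation and equal weights, and the array layout and function name are hypothetical.

```python
import math

def pooled_scores(mod, obs, lead):
    """Correlation and RMSE at one forecast time, pooled over start dates
    and grid cells of a region.

    mod, obs: nested lists indexed [start_date][lead][grid_cell] holding
    "per-pair" SST anomalies (hypothetical layout, equal cell weights).
    """
    m = [v for start in mod for v in start[lead]]
    o = [v for start in obs for v in start[lead]]
    n = len(m)
    m_mean, o_mean = sum(m) / n, sum(o) / n
    cov = sum((a - m_mean) * (b - o_mean) for a, b in zip(m, o))
    var_m = sum((a - m_mean) ** 2 for a in m)
    var_o = sum((b - o_mean) ** 2 for b in o)
    cor = cov / math.sqrt(var_m * var_o)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(m, o)) / n)
    return cor, rmse
```

Because the grid cells enter the sums individually, a forecast with the correct basin mean but the wrong spatial pattern is penalized, which is the property of the scores emphasized in the text.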

Figure 1.

(a) Correlation between ERSST and CMIP5 ensemble-mean SST anomalies averaged across the lead times 2 to 5 years. (b) Correlation-skill and (c) Root Mean Square Error computed against ERSST across the starting dates, longitude and latitude dimensions after smoothing for various oceans in CMIP5 hindcasts: Indian (40°S–30°N, 20°E–120°E), North Atlantic (0–65°N, 100°W–40°E), North Pacific (0–65°N, 100–260°E), South Atlantic (45°S–0, 75°W–20°E), South Pacific (45°S–0, 120–285°E), and Southern (70–45°S) Oceans. (d) Per-pair smoothed CMIP5 SST anomalies averaged in the 155–235°E, 10–45°N box. In color: ensemble-mean for each forecast system, one color per start date. Thick line: multimodel ensemble-mean. The black lines correspond to ERSST data set. (e) Increase in CMIP5 SST correlation skill when removing the 1960s hindcasts.

[16] The forecast time series of ensemble-mean smoothed SST anomalies from each CMIP5 climate forecast system averaged over the region of lowest SST skill (155°E–235°E, 10°N–45°N) are shown in Figure 1d with one color per starting date, together with their observed counterparts in black. The multimodel ensemble-mean SST anomalies are shown with thick lines. In the forecasts starting between 1970 and 2005, the forecast systems generally follow the warming evolution of the observed anomalies. The ensemble nevertheless misses some large excursions from the warming trend, like the very sharp one around 1987 and the wider one around 1999, and tends to underestimate the slowdown of the warming in the twenty-first century, except in one of the forecast systems. The performance is generally much poorer in the 1960s. The ensemble-mean prediction from each forecast system systematically misses the two major warm events which peak in 1963 and 1968 (Figures S3 and S4 in auxiliary material Text S1). Even in the hindcasts initialized in November 1961 and 1966 performed with the EC-Earth (Figure S3 in auxiliary material Text S1) and DePreSys (Figure S4 in auxiliary material Text S1) forecast systems, the warm anomalies are damped in less than six months. A few ensemble members do seem able to capture those warm anomalies, but they are largely outnumbered.

[17] The failure to predict these two major warm events largely contributes to the particularly low skill in SST in the North Pacific region. Excluding the 1960s hindcasts substantially increases this skill (Figure 1e and Figures S5 and S6 in auxiliary material Text S1) in the areas where the SST skill is poor (Figure 1a), although it also reduces this skill in some other areas such as the south-eastern North Pacific. Since the forecast systems' failure to represent the 1963 and 1968 warm events stands as their most striking failure over the whole hindcast period, we focus in the following on the causes of these warm events and on why the forecast systems fail to capture them.

4. The Major Warm Events of 1963 and 1968

[18] The major warm events of 1963 and 1968, peaking at 0.3–0.4°C (Figure 2a), are the largest on record. Although they might appear to be small, they compete with the SST anomalies that might be associated with the long-term warming in the recent past in this region. These events are not related to the well-known modes of variability dominating the North Pacific climate: the PDO, the PNA, and ENSO [Rasmusson and Carpenter, 1982]. The events still appear after filtering out the effect of these modes by a multilinear regression at a range of lags from −1 year to +1 year at the grid point level (Figure 3). Large SST anomalies are associated with each one of these modes at the grid point level. However, as these SST anomalies have opposite signs across the North Pacific, the averaging over the North Pacific Ocean makes their integrated impact relatively small. The warm events also still appear after removing the effects of ENSO and volcanic eruptions [Thompson et al., 2010, Figure 3].
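The filtering step can be sketched for a single grid point as an ordinary least-squares regression of the SST anomaly series on copies of a climate index taken at lags of −12, 0 and +12 months, the lags listed in the caption of Figure 3. The sketch below is an assumption-laden illustration (hypothetical function names, a hand-written linear solver, no treatment of missing data), not the code used for this article.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial
    pivoting. Raises ZeroDivisionError if the system is singular."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def remove_mode(sst, index, lags=(-12, 0, 12)):
    """Return sst minus its least-squares regression on lagged copies of
    index (value of the index at time t + lag, in months).

    Only time steps for which every lagged index value exists are used and
    returned. The fitted intercept is kept in the residual.
    """
    first = max(0, -min(lags))
    last = len(sst) - max(0, max(lags))
    rows = [[1.0] + [index[t + lag] for lag in lags] for t in range(first, last)]
    y = [sst[t] for t in range(first, last)]
    k = len(lags) + 1
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta[1:], r[1:])) for r in rows]
    return [v - f for v, f in zip(y, fitted)]
```

Applying such a function at every grid point and then averaging the residuals over the box would give a filtered time series of the kind shown in black in Figure 3.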

Figure 2.

Anomalies smoothed out with a 12-month running mean and averaged in the 155–235°E,10–45°N box: (a) SST from the ERSST data set; (b) surface heat fluxes positive from the ocean to the atmosphere. In Figure 2a, the counterpart averaged in the 155–235°E, 35–45°N (170–235°E, 10–35°N) box and scaled by the area of averaging is added as a dashed (dotted) line from November 1960 (1965) to October 1965 (1970). In Figure 2b, total DS94 heat fluxes are computed as the sum of the turbulent and radiative heat fluxes (red), turbulent DS94 fluxes are computed as the sum of the latent and sensible heat fluxes (blue), and turbulent OAFluxes are computed as the sum of the latent and sensible heat fluxes (green). Vertical lines are drawn in January 1963 and 1968.

Figure 3.

ERSST SST anomalies (grey solid) averaged in the 155–235°E,10–45°N box. ERSST SST anomalies (black solid) after subtracting the multilinear regression of the North Pacific SSTs on the (a) Pacific Decadal Oscillation index (PDO), (b) Pacific North American mode (PNA), and (c) Multivariate ENSO index (MEI) at each grid point at lags −12, 0, and 12 months. The PDO, PNA and ENSO indices were taken from http://www.esrl.noaa.gov/psd/data/climateindices/. All indices have been previously smoothed out with a 12-month running mean.

[19] These two major warm events occur during a period in which the surface heat flux anomalies (155°E–235°E, 10°N–45°N) are from the ocean to the atmosphere (Figure 2b). The DS94 turbulent (in blue) and total (in red) heat fluxes are close to one another. The heat exchange between the ocean and the atmosphere is therefore dominated by turbulent heat fluxes. Around 1963 and 1968, the OAFluxes (in green) and DS94 turbulent and total surface heat fluxes show peaks corresponding roughly to the ones observed in the SST time series. This suggests that the ocean anomalies might have forced the atmosphere during these two major warm events. This behavior is not systematic in the region. The correlation between the ERSST anomalies and the DS94 total, the DS94 turbulent and the OAFluxes turbulent flux anomalies reaches respectively 0.5, 0.45 and 0.35. For example, the cooling around 1999 (Figure 2a), also missed by the forecast systems (Figure 1d), coincides with a peak in surface heat fluxes from the ocean toward the atmosphere (Figure 2b), which suggests that the atmosphere was forcing the ocean anomalies during this event. Note, however, that the DS94 and OAFluxes estimates of turbulent heat fluxes disagree, in particular during the 1963 event, when the peak occurs one year earlier and has a lower amplitude in OAFluxes than in DS94. The OAFluxes data set thus suggests a late contribution of the atmosphere in amplifying the initial warm ocean anomaly.

5. Horizontal Advection of the 1963 Heat Anomaly

[20] During 1963, ERSST SST anomalies in the study domain (Figure 4a) show a large positive anomaly confined to 180°E–205°E, 35°N–45°N and peaking at about 1.5°C, while weak anomalies cover the rest of the domain. The Hovmöller diagram (Figure 4b) of SST anomalies averaged in the 35°N–45°N latitude band from November 1960 to December 1963 suggests an eastward propagation of the anomaly at roughly 20° per year. This propagation also appears in Hovmöller diagrams of HadISST SST anomalies, NCEP and ERA40 near-surface air temperature anomalies and DS94 total and turbulent heat flux anomalies (Figure S7 in auxiliary material Text S1). Computation of backward trajectories launched in the 180°E–205°E, 35°N–45°N domain using the Ariane Lagrangian trajectory software [Blanke et al., 2001; Van Roekel et al., 2009; Getzlaff et al., 2006] confirms that the rate of propagation is consistent with advection of particles along the Kuroshio-Oyashio extension. Some trajectories are illustrated (Figure 4c) for particles launched in April 1963. Most of the particles originate in the 140°E–170°E longitude band, which corresponds to the original location of the warm anomaly seen in the Hovmöller diagram (Figure 4b). Note that the initial warm anomaly in the western North Pacific basin might have been triggered by a previous El Niño event, since our filtering method in the previous section only considered a 2-year window around an ENSO peak. However, understanding the origin of the warm anomaly before the date of initialization of the forecast systems is beyond the scope of this article.
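To relate the propagation rate of roughly 20° of longitude per year to a current speed, the rate can be converted to metres per second along a latitude circle. The sketch below assumes a spherical Earth of radius 6371 km and uses 40°N, the centre of the 35°N–45°N band, as the reference latitude. Both choices and the function name are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0            # mean Earth radius (assumed value)
SECONDS_PER_YEAR = 365.25 * 86400   # Julian year

def zonal_speed_m_per_s(deg_lon_per_year, latitude_deg):
    """Convert a zonal propagation rate in degrees of longitude per year
    into metres per second along a latitude circle."""
    km_per_deg = (math.pi / 180.0) * EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))
    return deg_lon_per_year * km_per_deg * 1000.0 / SECONDS_PER_YEAR

# 20 degrees of longitude per year at 40N, the centre of the averaging band.
speed = zonal_speed_m_per_s(20.0, 40.0)
```

Under these assumptions the rate corresponds to roughly 1700 km per year, or about 5 cm/s.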

Figure 4.

(a) Pattern of 1963 ERSST SST anomalies smoothed out with a 12-month running mean. (b) Hovmöller of the smoothed ERSST SST anomalies averaged in the 35–45°N latitude band. (c) Backward trajectories computed with the Ariane software (http://stockage.univ-brest.fr/grima/Ariane/ariane.html) of particles launched in the 180–205°E–35–45°N domain at 5 m depth. End (November 1960) and start points (April 1963) of these backward trajectories are indicated by circles. Ariane is an off-line diagnostic tool which provides the 3-dimensional Lagrangian trajectories from the 3-dimensional velocity and thermodynamic fields. The trajectories were computed from the monthly NEMOVAR data. (d) Annual-mean profiles of the NEMOVAR area-averaged temperature anomalies in the 180–205°E, 35–45°N domain for 1961, 1962 and 1963. The ensemble-mean is shown as a thick continuous line while the interval between maximum and minimum across the members is shown as dashed thin lines.

[21] The ocean vertical profiles of the annual temperature anomalies averaged over 180°E–205°E, 35°N–45°N (Figure 4d) show an anomalous heat reservoir extending down to roughly 300 meters and building up progressively from 1961 to 1963. A cold anomaly is present below this depth but does not seem to change substantially in this period. The amplitude of the SST anomaly experiences a seasonal cycle along its propagation, with maximum amplitude in late winter (Figure 4b). Superimposed on these seasonal variations, the SST and near-surface air temperature anomalies also seem to increase along their propagation (Figure 4b and Figure S7 in auxiliary material Text S1). The smaller long-term mean mixed layer depth in the central part of the North Pacific relative to the western part (Figure 5a) confines the heat content anomaly to an increasingly thinner layer along its pathway, which could favor the increase in SST anomaly.

Figure 5.

Late winter-early spring mean mixed layer depth (February–April) in meters. (a) Long-term mean from NEMOVAR ocean reanalysis; (b, c, and d) average of all the hindcasts (1960–2010) for every forecast time for three of the forecast systems.

6. Upward Transfer of the 1968 Heat Anomaly

[22] During 1968, the SST anomalies (Figure 6a) show a large warm feature of amplitude 0.8°C over 170°E–235°E, 10°N–35°N and a secondary peak in the north-western part with large amplitude but much smaller extent that explains only 10% of the total anomaly. The Hovmöller diagram of SST anomalies averaged in the 35°N–45°N latitude band from November 1965 to December 1968 does not show any particular propagative feature (not shown). The different spatial patterns of SST anomalies during 1963 (Figure 4a) and 1968 (Figure 6a) and the lack of propagative feature along the Kuroshio-Oyashio extension from 1965 to 1968 suggest that the 1963 and 1968 events are dynamically distinct. The monthly profiles of temperature anomalies from April 1966 to April 1967 (Figure 6b) show a warm anomaly centered at 200 meter depth at the end of the winter 1966 which persists during the summer below the mixed layer. This warm anomaly is transferred to the top 150 meters by April 1967. Since this upward transfer occurs in an area of large-scale downwelling, it might rather be caused by turbulent mixing processes through the re-emergence mechanism of SST anomalies previously observed in the North Pacific Ocean [Alexander et al., 1999, 2008]. The warm anomaly is later amplified, most probably by the atmospheric weather “noise”, which favors a stabilization of the vertical profile (Figure 6b) either through a decrease in wind speed in this region during the years 1967 and 1968 (Figure 6c), or through an Ekman-induced shoaling of the mixed layer (Figure 6d). Note, however, that those two atmospheric data sets bear large uncertainties.

Figure 6.

(a) Pattern of 1968 SST anomalies from ERSST data set smoothed out with a 12-month running mean. (b) Monthly profiles of NEMOVAR temperature anomalies averaged in the 170–235°E–10–35°N domain from April 1966 to April 1967 and from May 1967 to May 1968. The ensemble-mean is shown as a thick continuous line while the interval between maximum and minimum across the members is shown as dashed thin lines. (c) Smoothed wind speed anomalies from DFS4.3 (red) and NCEP (blue) data set averaged in the 170–235°E–10–35°N domain. The vertical line highlights January 1968. (d) In colors: pattern of sea level pressure anomalies, in hPa, in 1968 from the ERA40 reanalysis. The contours give the annual mean sea level pressure over the 1958/01-1994/06 period.

7. Potential Causes for the Forecast Systems to Miss Those Events

[23] The 1963 and 1968 warm events were associated with long-lived ocean heat content anomalies. Since the 1963 anomaly fed the atmosphere during its propagation along the Kuroshio-Oyashio extension (Figure 2b) and thus persisted against the atmospheric damping, we expect the 1963 large warming to be predictable. As long as the 1968 original deep anomaly is isolated from the surface, we also expect the ocean system to be able to persist it. However, when this anomaly reaches the surface, its amplification is controlled by some dynamical atmospheric weather “noise”. We thus expect the 1968 event to be predictable although not its maximum amplitude.

[24] Which potentially misrepresented processes in the forecast systems might then be responsible for their failure to represent those events? A vertical section of temperature anomalies at 146°E in the 35–45°N latitude band in the initial conditions of the climate forecast initialized in November 1960 with the EC-Earth forecast system (Figure 7) shows a warm anomaly extending down to 500 m with a meridional extent of 10°. The warm anomalies peak at 6°C at 75 m while the surface anomaly reaches only 1°C. The monthly sea surface temperature anomalies during the first six months of this climate forecast (Figure 8) show a warm anomaly which seems to travel eastward at an approximate speed of 5° of longitude in 6 months, i.e. slightly slower than the advective timescales (Figure 4b). This anomaly grows weaker and weaker along its propagation path, contrary to what occurs in the observations (Figure 4b), and it vanishes after six months, although two of the members are able to persist it for about one and a half years (not shown). A similar behavior can be observed in all the forecast systems considered in this study: the warm anomaly (Figures S8 and S9 in auxiliary material Text S1) is initially present but vanishes with different e-folding times depending on the forecast system. Note though that the CMIP5 hindcasts have been initialized at different dates between 1 November 1960 and 1 January 1961. A comparison in January 1961 is thus not a perfectly fair comparison of their performances (Figure S9 in auxiliary material Text S1). The turbulent surface heat flux anomalies in the EC-Earth forecast initialized in November 1960 (Figure 9a) are larger than the observed ones over the core of the initial warm anomaly during the first six months of the hindcast. Those excessively large surface heat flux anomalies contributed to damping this anomaly. 
However, in the forecasts initialized in 1961 and 1962, the heat flux anomalies tend to be lower than the observed ones, yet the SST anomalies are also damped within a few months. Although they contribute to damping the warm anomaly, errors in surface heat fluxes thus do not seem to be the main cause. The inability of the forecast systems to sustain the warm anomaly might rather come from the widespread strong biases in mixed layer depth and in turbulent and mesoscale eddy mixing processes in climate models in this region [Lienert et al., 2011] (Figure 5). An inaccurate representation of the warm anomaly in the initial conditions could also contribute to its particularly quick disappearance in some of the forecast systems.
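
As a rough order-of-magnitude check, the displacement quoted above (about 5° of longitude in 6 months) can be converted into a zonal speed, and an e-folding time can be estimated from a decaying anomaly amplitude with a log-linear fit. The sketch below is illustrative only: the reference latitude of 40°N (the middle of the 35–45°N band), the 30-day month, the function names and the synthetic amplitude series are all assumptions, not values or methods taken from this study.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius (m)


def zonal_speed_m_per_s(dlon_deg, months, lat_deg, days_per_month=30.0):
    """Zonal speed for a displacement of dlon_deg degrees of longitude in `months` months."""
    # The length of one degree of longitude shrinks with cos(latitude).
    metres_per_deg = math.radians(1.0) * EARTH_RADIUS_M * math.cos(math.radians(lat_deg))
    seconds = months * days_per_month * 86400.0
    return dlon_deg * metres_per_deg / seconds


def e_folding_time(times, amplitudes):
    """Least-squares fit of log(amplitude) = log(a0) - t / tau; returns tau.

    Amplitudes must be strictly positive (math.log raises ValueError otherwise).
    """
    logs = [math.log(a) for a in amplitudes]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    covariance = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    variance = sum((t - t_mean) ** 2 for t in times)
    slope = covariance / variance
    return -1.0 / slope


# 5 degrees of longitude in 6 months, evaluated at an assumed latitude of 40N.
speed = zonal_speed_m_per_s(5.0, 6.0, 40.0)
print(f"zonal speed: {100.0 * speed:.1f} cm/s")

# Synthetic anomaly decaying with a prescribed e-folding time of 3 months (not data).
months = [0, 1, 2, 3, 4, 5]
synthetic_amplitude = [math.exp(-m / 3.0) for m in months]
tau = e_folding_time(months, synthetic_amplitude)
print(f"e-folding time: {tau:.2f} months")
```

Under these assumptions the displacement corresponds to a speed of a few centimetres per second. Applied to the amplitude of the warm anomaly in each forecast system, the same log-linear fit would give one possible way of comparing how quickly the anomaly is lost.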

Figure 7.

In color: ensemble-mean “per-pair” temperature anomalies at 146°E in the initial conditions of the climate forecast initialized in November 1960 with EC-Earth version 2 in the framework of the CMIP5 project. Black contours show the maximum anomalies across the 5 members.

Figure 8.

Monthly ensemble-mean “per-pair” SST anomalies during the first six months of the climate forecast initialized in November 1960 with EC-Earth version 2 in the framework of the CMIP5 project.

Figure 9.

Per-pair smoothed turbulent heat flux anomalies averaged in the (a) 140–160°E, 35–45°N, (b) 155–180°E, 35–45°N and (c) 180–205°E, 35–45°N boxes. In color: hindcasts initialized in 1960, 1961 and 1962 with EC-Earth version 2, one color per start date. Thick lines: ensemble mean. Thin lines: individual members. Continuous and dashed black lines correspond to the OAFlux and DaSilva data sets, respectively.

[25] A horizontal section of temperature anomalies at 200 m depth in the initial conditions of the EC-Earth climate forecast initialized in November 1965 (Figure 10) shows a warm anomaly that peaks at 2.5°C, with a shape similar to that of the SST anomaly observed in 1968 (Figure 6a). The simulated turbulent surface heat flux anomalies (Figure 11) are weaker than the observed ones in the forecasts initialized in 1965, 1966 and 1967. The damping of the warm anomaly thus more likely comes from oceanic processes. The monthly “per-pair” climatologies of the EC-Earth forecast heat content anomaly averaged in the 170–235°E, 10–45°N, 100–300 m box (Figure 12), where the initial heat content anomaly is located in 1966, show a strong drift with respect to the NEMOVAR reanalysis from the first summer onward. An initial transfer of heat content from the intermediate layers toward the surface layers is systematically observed in the forecasts produced with the EC-Earth climate model [Du et al., 2012]. Although the hindcasts have been detrended a posteriori following the method described in section 2.3, the interaction between the large drift and the superimposed simulated variability remains a strong obstacle to the forecast system predicting the 1968 warm anomaly.
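
To make the notion of a lead-time-dependent drift concrete, the sketch below computes forecast and observed climatologies as a function of forecast time, using only the start dates for which both a hindcast and an observation are available, and then removes them to obtain anomalies. This is one common reading of a “per-pair” correction and is an assumption here: the method of section 2.3 is not reproduced, and the array layout, the function name and the synthetic data are hypothetical.

```python
import numpy as np


def per_pair_anomalies(hindcast, obs):
    """Remove lead-time-dependent climatologies estimated on common pairs.

    hindcast: array (n_start, n_member, n_lead)
    obs:      array (n_start, n_lead), NaN where no observation is available
    Returns hindcast anomalies, observed anomalies, model and observed climatologies.
    """
    ens_mean = hindcast.mean(axis=1)                  # (n_start, n_lead)
    valid = np.isfinite(ens_mean) & np.isfinite(obs)  # pairs present in both
    n_valid = valid.sum(axis=0)                       # pairs per lead time
    model_clim = np.where(valid, ens_mean, 0.0).sum(axis=0) / n_valid
    obs_clim = np.where(valid, obs, 0.0).sum(axis=0) / n_valid
    hindcast_anom = hindcast - model_clim[None, None, :]
    obs_anom = obs - obs_clim[None, :]
    return hindcast_anom, obs_anom, model_clim, obs_clim


# Synthetic illustration (not data): a common signal plus a drift growing with lead time.
rng = np.random.default_rng(0)
signal = rng.normal(size=(5, 4))                      # 5 start dates, 4 lead times
drift = np.array([0.0, 0.5, 1.0, 1.5])                # prescribed model drift
obs = 10.0 + signal
hindcast = np.repeat((10.0 + signal + drift)[:, None, :], 3, axis=1)  # 3 members

hindcast_anom, obs_anom, model_clim, obs_clim = per_pair_anomalies(hindcast, obs)
print("estimated drift per lead time:", model_clim - obs_clim)
```

In this idealized case the difference between the two climatologies recovers the prescribed drift. In a real forecast system the drift and the simulated variability can interact, which is the difficulty pointed out above, so removing a mean drift a posteriori does not guarantee that an initialized anomaly survives.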

Figure 10.

In color: ensemble-mean “per-pair” temperature anomalies at 207 m depth, in degrees Celsius, in the initial conditions of the climate forecast initialized in November 1965 with EC-Earth version 2 in the framework of the CMIP5 project. Black contours show the maximum anomalies across the 5 members.

Figure 11.

Per-pair smoothed turbulent heat flux anomalies averaged in the 170–235°E, 10–35°N box. In color: hindcasts initialized in 1965, 1966 and 1967 with EC-Earth version 2, one color per start date. Thick lines: ensemble mean. Thin lines: individual members. Continuous and dashed black lines correspond to the OAFlux and DaSilva data sets, respectively.

Figure 12.

Monthly “per-pair” climatologies of heat content anomaly averaged in the 170–235°E, 10–45°N, 100–300 m box, from the hindcasts produced with EC-Earth v2 in color and from the NEMOVAR reanalysis in black.

8. Discussion

[26] To describe the 1963 and 1968 events, we performed our analyses on a set of 11 observational data sets. However, during the 1960s, observations were sparser than they are nowadays. Sea surface temperature observations were frequent in this region thanks to commercial shipping lines, but few ocean thermodynamic profiles sampled the deep ocean. The NEMOVAR ocean reanalysis stands as a physical extrapolation of this sparse available information. However, the ocean model on which this physical extrapolation is based shares the same typical biases as the state-of-the-art ocean models included in the forecast systems whose performance we assessed in this article. Indeed, the ocean heat budgets we computed to investigate the mechanisms explaining the 1963 and 1968 events are not closed: the assimilation term constitutes a substantial contribution that allows the warm anomaly to persist along its propagation. The sparse observational coverage, the biases in the ocean model used for the physical extrapolation and the uncertainty in the atmospheric forcing fluxes underlie the uncertainty in the processes we identified as being involved in the development of the 1963 and 1968 events. However, those observational data sets will be highly challenging to improve for this particular period. A more accurate assessment of the mechanisms leading to the 1963 and 1968 events will require the ocean models to improve in such a way that the assimilation increments become substantially smaller than the other terms of the heat budget.
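
The closure issue can be stated schematically. In the sketch below all symbols are assumptions introduced for illustration, and the budget actually computed in this study may be decomposed differently: H′ is the heat content anomaly in a control volume, A′ the convergence of advective heat transport, Q′ the net surface heat flux, M′ the contribution of vertical and lateral mixing, and I′ the assimilation increment of the reanalysis.

```latex
\frac{\partial H'}{\partial t} = A' + Q' + M' + I',
\qquad
\text{closed by the model physics alone only if }
|I'| \ll \max\left(|A'|,\, |Q'|,\, |M'|\right).
```

When I′ is comparable to the physical terms, as reported above, part of the persistence of the anomaly is supplied by the observations through the assimilation rather than explained by the simulated processes.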

9. Conclusion

[27] In this work, we have used a wide variety of climate forecast systems to show that the North Pacific is the region where state-of-the-art decadal climate predictions of sea surface temperature (SST) perform the worst worldwide for forecast times ranging from the second to the fifth year, according to correlation and root mean square error (RMSE) measures. This systematic error is dominated by the models' inability to capture two major warm events in the 1960s. Based on an extensive set of 11 observational data sets, we investigated the mechanisms explaining those large warm events. We suggest that the 1963 event stemmed from the propagation of a warm anomaly along the Kuroshio-Oyashio extension. The 1968 warm event originated from the upward transfer of a warm water mass centered at 200 m depth. Over the whole hindcast period covered by the ENSEMBLES and CMIP5 projects, those two large warm events are unique and extreme. Their magnitudes compete with the largest observed temperature anomalies in the twenty-first century that might be associated with the long-term warming. We show that the initial warm anomaly vanishes in every forecast system and hypothesize that the widespread model biases in ocean mixing processes might be responsible for damping the associated heat content anomalies, though those ocean mixing biases might be linked to uncertainties in the ocean-atmosphere fluxes. Accurately representing the ocean mixing processes in ocean general circulation models is a priority, since the occurrence of such a warm event in the next decade could jeopardize climate prediction in the North Pacific as attempted by Mochizuki et al. [2010]. Although reducing systematic biases in ocean stratification and improving the representation of ocean mixing processes have been long-standing efforts, our conclusions suggest that resources devoted to improving the simulation of ocean mixing have the potential to significantly improve decadal climate prediction.
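
For readers less familiar with the two scores named above, the following minimal sketch computes a correlation and an RMSE between a series of forecast anomalies and the matching observed anomalies, one value per start date. It is not the verification procedure of this study: the function name, the handling of missing values and the example numbers are assumptions, and any averaging over forecast years two to five or over ensemble members is taken to have been done beforehand.

```python
import numpy as np


def correlation_and_rmse(forecast, observed):
    """Pearson correlation and root mean square error over the valid pairs."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    valid = np.isfinite(f) & np.isfinite(o)   # drop pairs with a missing value
    f, o = f[valid], o[valid]
    corr = np.corrcoef(f, o)[0, 1]
    rmse = np.sqrt(np.mean((f - o) ** 2))
    return corr, rmse


# Illustrative anomalies (not data): a forecast that misses one large warm event.
observed = [0.1, -0.2, 1.5, 0.0, -0.1, 0.2]
forecast = [0.1, -0.1, 0.1, 0.0, -0.2, 0.1]
corr, rmse = correlation_and_rmse(forecast, observed)
print(f"correlation: {corr:.2f}, RMSE: {rmse:.2f}")
```

A single missed event of large amplitude affects both scores at once, which is consistent with the way two events in the 1960s dominate the skill assessment summarized above.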

Acknowledgments

[28] Joan Ballester is gratefully acknowledged for interesting discussions about our results. This work was supported by the EU-funded QWeCI (FP7-ENV-2009-1-243964) and CLIM-RUN (FP7-ENV-2010-1-265192) projects, the MICINN-funded RUCSS (CGL2010-20657) project and the Catalan Government. The authors wish to thank the three reviewers for their fruitful suggestions. The authors thankfully acknowledge the computer resources, technical expertise and assistance provided by the Red Española de Supercomputación (RES).
