Will Arctic sea ice thickness initialization improve seasonal forecast skill?

Arctic sea ice thickness is thought to be an important predictor of Arctic sea ice extent. However, coupled seasonal forecast systems do not generally use sea ice thickness observations in their initialization and are therefore missing a potentially important source of additional skill. To investigate how large this source is, a set of ensemble potential predictability experiments with a global climate model, initialized with and without knowledge of the sea ice thickness initial state, was run. These experiments show that accurate knowledge of the sea ice thickness field is crucially important for sea ice concentration and extent forecasts up to 8 months ahead, especially in summer. Perturbing sea ice thickness also has a significant impact on the forecast error in Arctic 2 m temperature a few months ahead. These results suggest that advancing capabilities to observe and assimilate sea ice thickness into coupled forecast systems could significantly increase skill.


Introduction
The recent rapid reduction in Arctic summer sea ice has led to a large increase in demand for forecasts of sea ice cover and surface meteorological conditions at monthly and longer timescales [Eicken, 2013]. This is important information for end users interested in Arctic marine accessibility [e.g., Stephenson et al., 2011] and may even improve midlatitude meteorological forecast skill [e.g., Jung et al., 2014]. This interest has led to the development of a number of initialized sea ice seasonal prediction systems based on general circulation models (GCMs) [e.g., Sigmond et al., 2013; Chevallier et al., 2013; Wang et al., 2013], compared in Guemas et al. [2014a], as well as systems based on empirical methods [e.g., Schröder et al., 2014; Stroeve et al., 2014].
These operational prediction systems show some skill in predicting summer sea ice conditions up to a few months ahead, but diagnosing the source of forecast errors is problematic. Such forecast errors may be due to both inadequate representation of important physical processes in the model and incomplete knowledge of the initial state of key variables such as sea ice thickness and subsurface ocean state variables, which are not well observed.
Analysis of idealized "perfect model" experiments with coupled GCMs provides a setting where perfect knowledge of the initial model state exists and forecast quality is not affected by model bias [e.g., Collins, 2002; Pohlmann et al., 2004]. In such a setting, the sources of forecast error can be tested. There is an inherent limit to predictability in the Arctic climate system due to chaotic atmospheric variability, but studies such as Holland et al. [2010] and Blanchard-Wrigglesworth et al. [2011a] indicate that potential predictability, inherited from the initial state, exists for 1-2 years for Arctic sea ice area and for 2-4 years for sea ice volume. These timescales are robust across models [Tietsche et al., 2014], indicating that there is potential to extend skill in operational systems to longer lead times.
One potential source of untapped skill in initialized predictions is sea ice thickness [e.g., Doblas-Reyes et al., 2013]. Diagnostic analysis with models has shown that the sea ice thickness field is a strong predictor of summer sea ice extent [Chevallier and Salas-Mélia, 2012]. However, a lack of long-term pan-Arctic thickness observations prohibits this type of diagnostic analysis in the real climate system. Thickness observations at point locations have been used to initialize seasonal hindcasts [see Lindsay et al., 2012], but satellite observations are the only prospect for retrieving pan-Arctic thickness fields, and such retrieval methods are at an early stage of development. Pan-Arctic thickness fields are being constructed from CryoSat2 altimeter retrievals [Laxon et al., 2013] and from the Soil Moisture and Ocean Salinity (SMOS) mission's Microwave Radiometry instrument [e.g., Kaleschke et al., 2012]. However, both products have intrinsic problems: coverage is not year round, due to complications during the melt season, and is only available in regions of high concentration. In addition, SMOS is unable to measure thicknesses over 1 m. The data assimilation systems required to initialize the sea ice thickness field are also in their infancy [e.g., Tietsche et al., 2013; Mathiot et al., 2012]. Nevertheless, preliminary results in a coupled system assimilating SMOS thickness suggest that forecasts of sea ice cover could be improved by their inclusion [Yang et al., 2014].
Improving the situation described above will require considerable effort and resources. It is therefore important to determine the value of such developments to inform priorities for operational forecast and data processing centers. Previous studies have used GCMs to quantify the potential predictability that comes from perfect knowledge of individual ocean state variables, such as ocean temperature and salinity [Dunstone et al., 2011], but have not focused on sea ice. Here we present a set of idealized prediction experiments which are used to quantify the importance of Arctic sea ice thickness initialization for the prediction of other fields. In order to do this, two sets of "perfect model" experiments have been run with the Hadley Centre Global Environmental Model version 1.2 (HadGEM1.2) GCM. These sets of simulations were identical, except that the sea ice thickness field in the initial state was degraded in one set. This removes the portion of the forecast skill which results from memory of the sea ice thickness initial state.

Models and Experiments
We first briefly describe the HadGEM1.2 GCM used [Shaffrey et al., 2009] and the experiments performed.

Model Description
HadGEM1.2 is similar to HadGEM1, the Coupled Model Intercomparison Project 3 version of the UK Hadley Centre GCM, which is fully described in Johns et al. [2006]. The atmosphere component has a resolution of 1.25° latitude by 1.875° longitude with 38 layers in the vertical. The ocean component has a zonal resolution of 1° and a meridional resolution of 1° between the poles and 30° latitude, increasing smoothly to 1/3° at the equator, with 40 unevenly spaced levels in the vertical. A number of improvements to HadGEM1 are included in HadGEM1.2, including changes to the snow-free sea ice albedo, runoff into frozen soil, and the calculation of surface fluxes. Each of these changes improved the HadGEM1.2 mean state compared to HadGEM1 (see Shaffrey et al. [2009] for full details and model evaluation).
The sea ice component of HadGEM1.2 is identical to HadGEM1 and was fully described and evaluated by McLaren et al. [2006]. The sea ice component shares much of its code with the Los Alamos sea ice model (CICE) [Hunke and Lipscomb, 2004]. The ice pack is modeled as a five-category ice thickness distribution that evolves through advection, ridging, and thermodynamic growth or melt. Mean sea ice extent in the reference simulation used in this study is higher than mean observations during the satellite era, with sea ice volume also significantly higher than observed estimates [see Tietsche et al., 2014, Figure S1]. This is in part due to a cold bias in the North Pacific [McLaren et al., 2006] and should be taken into account when interpreting the results presented in section 3. However, predictability metrics indicate that sea ice extent and volume predictability in this model is fairly typical when compared to other GCMs [see Tietsche et al., 2014, Figure 1]. This and the strong performance of this model in reproducing many other climate indices [Johns et al., 2006] indicate that this is a useful model with which to investigate sea ice predictability.

Predictability Experiments and Sea Ice Thickness Perturbations
A set of "perfect model" ensemble prediction experiments was performed with HadGEM1.2 for the Arctic Prediction and Predictability on Seasonal-to-Interannual Time Scales (APPOSITE) project, as documented in Day et al. [2014]; this set is hereafter referred to as SITINIT (the initialized set). To initialize these simulations, 10 years were chosen from a 250 year present-day control simulation (with 1990 radiative forcings). In order to cover a range of sea ice initial states, these years were chosen to sample a matrix of high, medium, and low sea ice extent, volume, and Atlantic meridional heat transport states. For each of these years an ensemble of simulations was started on 1 January and 1 July, using initial conditions from the control simulation. Each ensemble contains 16 members, with each member having identical initial conditions to the reference run, except for a tiny, spatially varying, Gaussian white noise perturbation (with $\sigma = 10^{-4}$ K) to the sea surface temperature (SST) field. This is similar to the methodology used to assess sea ice predictability in previous studies [Koenigk and Mikolajewicz, 2009; Holland et al., 2010; Blanchard-Wrigglesworth et al., 2011a; Tietsche et al., 2014].
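The ensemble generation step described above can be illustrated with a minimal sketch. It assumes NumPy, a toy SST array on a hypothetical 4 x 8 grid, and that all 16 members receive an independent perturbation; the function name `make_sst_ensemble` and the seed are illustrative only and are not taken from the APPOSITE setup.

```python
import numpy as np

def make_sst_ensemble(sst_ref, n_members=16, sigma=1e-4, seed=0):
    """Return n_members copies of sst_ref, each with independent
    Gaussian white noise (standard deviation sigma, in K) added."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=sigma,
                       size=(n_members,) + sst_ref.shape)
    # Broadcasting adds a separate noise field to each ensemble member.
    return sst_ref[np.newaxis, ...] + noise

# Toy reference SST field (K); a real field would come from the control run.
sst_ref = np.full((4, 8), 271.35)
ensemble = make_sst_ensemble(sst_ref)
```

Because the perturbation is far smaller than any observable SST signal, the members start from effectively the same state and diverge only through the chaotic growth of these initial differences.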

10.1002/2014GL061694
Perfect model experiments such as those described above do not suffer from model error because the model is being used to predict itself. Neither do they suffer from a lack of information about the initial state, since the full atmosphere-ice-ocean state of the reference simulation is known precisely for each of the start times. Hence, under the assumption that they represent the most important features of the climate system well, they can be used to assess the potential to predict it [e.g., Collins, 2002; Latif et al., 2006]. Because the initial state of each of the ensemble prediction experiments is known perfectly, for each variable, the importance of memory from any given variable on the predictability of other fields can be quantified [e.g., Dunstone et al., 2011]. We repeated our original perfect model set of experiments (SITINIT) but removed information on the ice thickness anomaly to assess the importance of ice thickness memory to forecast skill.
For each initial state used in the SITINIT set, we chose to replace the sea ice thickness such that the grid box mean thickness was set to the model's climatological value (1 January or 1 July), but the grid box mean sea ice concentration and snow cover fields were held fixed. This was done to all Northern Hemisphere grid cells with two categories of exception:
1. Where there is ice cover in the original state, but none in the climatology, thickness was set to the minimum value of 0.10 m.
2. Where there is ice in the climatology, but none in the original state vector, all sea ice fields are set to zero.
All other fields, including ocean salinity and temperature, were left unchanged from the original initial condition. This is a conservative choice, since preliminary experiments which additionally modified the salinity for consistency with the ice volume change led to larger differences in ice cover. The set of forecasts initialized in this manner will hereafter be referred to as the SITCLIM set. Refer to the supporting information for full details.
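The replacement rules above can be sketched as follows. This is an illustration under stated assumptions, not the procedure documented in the supporting information: it treats a cell as ice covered when its grid box mean thickness is positive, works on plain NumPy arrays that cover Northern Hemisphere cells only, and ignores the snow field and the five-category thickness distribution. The name `degrade_thickness` is hypothetical.

```python
import numpy as np

H_MIN = 0.10  # minimum thickness (m) used where the climatology is ice free

def degrade_thickness(sit, sic, sit_clim):
    """Return (sit_new, sic_new) after replacing thickness by climatology.

    sit, sic : original grid box mean thickness (m) and concentration
    sit_clim : climatological grid box mean thickness (m) for the start date
    """
    sit_new = sit_clim.copy()   # default: climatological thickness
    sic_new = sic.copy()        # default: concentration held fixed

    ice_now = sit > 0.0         # assumption: ice presence defined by thickness
    ice_clim = sit_clim > 0.0

    # Exception 1: ice in the original state but none in the climatology.
    sit_new[ice_now & ~ice_clim] = H_MIN

    # Exception 2: ice in the climatology but none in the original state.
    only_clim = ~ice_now & ice_clim
    sit_new[only_clim] = 0.0
    sic_new[only_clim] = 0.0
    return sit_new, sic_new

# Four toy cells: ice in both, ice only now, ice only in climatology, no ice.
sit = np.array([2.0, 0.5, 0.0, 0.0])
sic = np.array([0.9, 0.4, 0.0, 0.0])
sit_clim = np.array([1.5, 0.0, 0.8, 0.0])
sit_new, sic_new = degrade_thickness(sit, sic, sit_clim)
```

The design keeps the ice edge of the original state intact in every case, which is what allows the experiment to isolate the effect of thickness from that of concentration.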
The creation of these initial conditions is an attempt to mimic the current operational forecast situation, where sea ice concentration is well known, but sea ice thickness is not (see Figures S1 and S2 in the supporting information), and is similar to the situation in the Canadian Sea Ice Prediction System, where the model's ice thickness field is relaxed to the Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS) climatology [see Sigmond et al., 2013].However, recently more sophisticated methods to generate sea ice thickness initial conditions, which are consistent with observations of the atmosphere-ocean state, have been developed [see Guemas et al., 2014b;Chevallier et al., 2013].

Metrics
Skill in this study is measured using the root-mean-square error (RMSE) of Collins [2002]. In a perfect model study, any ensemble member may be chosen randomly as the "truth" and the effective sample size can be increased by taking each member in turn as the truth and every other member as a forecast. Hence, the ensemble RMSE is defined as
$$\mathrm{RMSE}(t) = \sqrt{\frac{\sum_{j=1}^{n_d}\sum_{k=1}^{n_m}\sum_{i \neq k}\left[x_{kj}(t) - x_{ij}(t)\right]^2}{n_d\, n_m\, (n_m - 1)}} \qquad (1)$$
where $x_{ij}(t)$ is the sea ice extent at lead time $t$ for the $i$th member of the $j$th ensemble, $n_d$ is the number of start dates, and $n_m$ is the number of ensemble members. The state is said to be predictable at lead time $t$ when $\mathrm{RMSE}(t) < \sqrt{2}\,\sigma$, where $\sigma$ is the climatological standard deviation of the reference simulation [Collins, 2002]. Significance of this inequality is calculated using an F test with $n_d(n_m - 1)$ degrees of freedom, but note that this is a conservative estimate of the true number of degrees of freedom [see Collins, 2002].
To assess the skill of the SITINIT set, the RMSE is calculated exactly as shown in equation (1), so for each year, each SITINIT ensemble member is compared against every other SITINIT member. In this procedure each $x_{kj}$ is considered to be the "truth" and each $x_{ij}$ the forecast. The skill of the SITCLIM set is evaluated by considering how closely each of its members resembles the equivalent members in the SITINIT set, which are considered to be the truth. In practice, this is calculated by replacing $x_{ij}$ in equation (1) with $y_{ij}$, the equivalent member from the SITCLIM set. The difference between these two RMSE values gives the gain in skill between climatological and perfect initialization of the sea ice thickness field.
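The two RMSE calculations described above can be sketched in NumPy. The sketch assumes that the mean in equation (1) is taken over all ordered truth/forecast pairs with distinct member indices, and that the data are stored as arrays of shape (start dates, members, lead times); the function names and the toy data are illustrative and are not the authors' code.

```python
import numpy as np

def rmse_perfect(x):
    """Perfect-model RMSE: each member in turn is the truth and every
    other member of the same ensemble is a forecast.
    x has shape (n_d, n_m, n_t)."""
    n_d, n_m, _ = x.shape
    # diff[d, k, i, t] = x[d, k, t] - x[d, i, t]; the k == i terms are zero.
    diff = x[:, :, None, :] - x[:, None, :, :]
    sq_sum = (diff ** 2).sum(axis=(0, 1, 2))
    return np.sqrt(sq_sum / (n_d * n_m * (n_m - 1)))

def rmse_against_truth(x_truth, y):
    """RMSE of forecasts y (e.g. SITCLIM) against truth members x_truth
    (e.g. SITINIT), using only pairs with different member indices."""
    n_d, n_m, _ = x_truth.shape
    diff = x_truth[:, :, None, :] - y[:, None, :, :]
    off_diag = ~np.eye(n_m, dtype=bool)          # drop the k == i pairs
    sq_sum = (diff ** 2)[:, off_diag, :].sum(axis=(0, 1))
    return np.sqrt(sq_sum / (n_d * n_m * (n_m - 1)))

# Toy data: 10 start dates, 16 members, 12 lead months (values are synthetic).
rng = np.random.default_rng(0)
x_init = rng.normal(size=(10, 16, 12))
y_clim = x_init + rng.normal(scale=0.5, size=x_init.shape)

# Gain in skill from perfect rather than climatological thickness.
skill_gain = rmse_against_truth(x_init, y_clim) - rmse_perfect(x_init)
```

Both functions return one value per lead time, so `skill_gain` can be plotted directly against forecast month.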

Pan-Arctic Sea Ice Conditions
Correctly initializing the sea ice thickness field produces large improvements in the skill of sea ice extent forecasts initialized on 1 July (compare dashed curves in Figure 1a). The normalized RMSE (NRMSE) of sea ice extent (normalized by $\sqrt{2}\,\sigma$) is significantly lower for the SITINIT set than for the SITCLIM set from the first forecast month, approximately halving the error in the September forecast. In the SITCLIM set there is very little skill in the prediction of the following September extent (the NRMSE is close to one), although predictability returns during the subsequent freeze season. After 6 months the two forecast sets are statistically indistinguishable.
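To make the normalization concrete, the following toy example illustrates why an NRMSE close to one corresponds to very little skill. It assumes synthetic data drawn from a Gaussian with a hypothetical climatological standard deviation, and it uses 200 start dates (far more than in the study) only so that the estimate is stable; none of the numbers are model output.

```python
import numpy as np

def pairwise_rmse(x):
    """RMSE over all truth/forecast pairs; x has shape (n_d, n_m)."""
    n_d, n_m = x.shape
    diff = x[:, :, None] - x[:, None, :]   # zero where truth == forecast
    return np.sqrt((diff ** 2).sum() / (n_d * n_m * (n_m - 1)))

def nrmse(rmse, sigma):
    """Normalize by sqrt(2)*sigma so that 1 marks the no-skill limit."""
    return rmse / (np.sqrt(2.0) * sigma)

rng = np.random.default_rng(1)
sigma = 0.5                                          # hypothetical std dev
no_memory = rng.normal(0.0, sigma, size=(200, 16))   # independent draws
some_memory = 0.3 * no_memory                        # reduced ensemble spread

nrmse_no_memory = nrmse(pairwise_rmse(no_memory), sigma)
nrmse_some_memory = nrmse(pairwise_rmse(some_memory), sigma)
```

When members are independent draws from the climatological distribution, the expected squared difference between two members is twice the variance, so the NRMSE sits near one; any memory of the initial state narrows the ensemble spread and pulls the value below one.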
The SITINIT forecasts of sea ice extent, initialized on 1 January (solid curves in Figure 1a), are statistically more skillful than SITCLIM for the first month and then indistinguishable for the next five months. Erroneous thickness values in the first month lead to velocity errors near the sea ice edge, as they are related via the momentum equation. This results in concentration errors at and around the sea ice edge, leading to small but statistically significant extent errors. The SITINIT set is significantly more skillful during the subsequent August and September (months 7 and 8).

Geophysical Research Letters
At the beginning of the forecasts, for both January and July start dates, the NRMSE of the sea ice volume field (Figure 1b) is much smaller for the SITINIT set than for the SITCLIM set. The ensembles are indistinguishable after 1 year in the case of January start dates and after 2 years in the case of July start dates. It is not surprising that differences between the ensembles in the sea ice volume field last for longer than those in the sea ice extent field; persistence of sea ice thickness anomalies is much higher than that of sea ice extent anomalies (compare Figures 1 and 5 of Day et al. [2014]).

These results suggest that initializing sea ice thickness could substantially improve skill in predicting summer sea ice extent from forecasts initialized in July and to a lesser extent improve skill from forecasts initialized in January.

Regional Sea Ice Predictability
In order to understand the potential benefit of thickness initialization on both pan-Arctic sea ice conditions and regional predictability, we examine the spatial differences in RMSE between the SITCLIM and SITINIT forecast sets (see Figures 2 and 3). Comparing the RMSE of sea ice thickness (first rows), one can see that the magnitude of the difference in error between the SITCLIM and SITINIT ensembles, and the timescale over which it decays, are similar for the ensembles initialized in July and those initialized in January, and that differences are negligible after 2 years for both start dates (not shown).

Despite the strong similarity in the magnitude of sea ice thickness errors after January and July starts, the impact on the sea ice concentration field (Figures 2 and 3, second rows) is very different. The SITINIT July ensemble has significantly smaller RMSE than the SITCLIM ensemble, particularly during the first 6 months of the forecast. In contrast, the differences in sea ice concentration RMSE between the SITCLIM and SITINIT January ensembles are small (<8% in all grid cells, Figure 2), are isolated to the sea ice edge, and do not lead to a significant difference in forecasts of sea ice extent during the first 6 months (Figure 1).
The July ensembles are initialized during the melt season, and the spatial properties of the seasonal reduction of ice extent in the following months depend critically on the ice thickness. The thicker the ice, the more energy is required to melt the existing ice at a given point. The erroneous ice thickness in the SITCLIM set thus leads to errors in the ice concentration. The ice albedo feedback will further amplify the initial errors in the SITCLIM set.
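As a rough worked example of why thickness controls the timing of melt-out, the energy needed to melt a column of ice scales linearly with its thickness. The sketch below assumes pure ice at its melting point with typical textbook values for density and latent heat of fusion and a constant net heat flux; it ignores salinity, snow, and lateral melt, and the constants are not taken from HadGEM1.2.

```python
RHO_ICE = 917.0     # ice density (kg m^-3), assumed typical value
L_FUSION = 3.34e5   # latent heat of fusion (J kg^-1), assumed typical value
SECONDS_PER_DAY = 86400.0

def melt_energy(thickness_m):
    """Energy per unit area (J m^-2) needed to melt ice of given thickness."""
    return RHO_ICE * L_FUSION * thickness_m

def days_to_melt(thickness_m, net_flux_wm2):
    """Days needed to melt the ice under a constant net heat flux (W m^-2)."""
    return melt_energy(thickness_m) / net_flux_wm2 / SECONDS_PER_DAY

# Hypothetical comparison: true thickness 1.0 m, climatological value 1.5 m.
days_true = days_to_melt(1.0, 100.0)
days_clim = days_to_melt(1.5, 100.0)
```

Under these assumptions a 0.5 m thickness error shifts the melt-out date by a few weeks, which is long enough to determine whether a grid cell is ice free by September.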
In contrast, the January ensembles are initialized during the freeze season when sea ice is growing and consequently the sea ice edge is extending southward. The position of the winter sea ice edge is, to a large extent, determined by the ocean heat flux convergence [Bitz et al., 2005] and the persistence of related SST anomalies [Blanchard-Wrigglesworth et al., 2011b; Day et al., 2014]. As a result, during the freeze season, sea ice concentration is not affected by thickness anomalies farther north. However, because thickness anomalies persist until the following summer, there are differences in RMSE when the ice edge returns and moves north of its January extent the following summer (see Figure 2, fourth column).

Near-Surface Atmosphere Predictability
A defining feature of the ice-covered Arctic Ocean's surface energy budget is a strong seasonal cycle in surface fluxes [e.g., Serreze and Barry, 2009]. Over Arctic sea ice the winter net surface heat flux is upward, away from the surface, and is dominated by the conductive flux through sea ice, which depends on ice thickness (equation (S2)), and by sensible heat flux terms. During summer the net surface heat flux is downward into the surface and dominated by the latent heat flux term, due to sea ice surface melt [e.g., Persson et al., 2002], and the surface and 2 m temperatures are constrained to zero above sea ice, regardless of its thickness. Hence, one might expect the impact of initializing forecasts with different sea ice thickness fields on properties of the overlying atmosphere to be greater in winter. This is indeed the case; during the first month of the January initialized forecasts, errors in Arctic 2 m air temperature in the SITINIT ensemble are much smaller than errors in SITCLIM. However, there is little difference between those initialized in July, despite a similar difference in thickness error between the two periods (Figures 2 and 3, third rows).
Forecasts of 2 m temperature started in January have significantly lower RMSE values in the SITINIT set during the first forecast month, over 1°C smaller in some areas of the Arctic. Significant differences in the Arctic basin are present during the first three months of the forecast and are negligible after that. During January, the areas of the interior ice pack with the largest differences in 2 m temperature RMSE are colocated with the areas of largest difference in thickness and conductive heat flux errors (see Figures 2 and S3). The conductive heat flux through sea ice is inversely proportional to the sea ice thickness (see equation (S2)); hence, applying a positive (negative) perturbation to the sea ice thickness in a given forecast will decrease (increase) the conductive heat flux through it. Additional mean conductive heat flux errors, regionally more than 10 W m$^{-2}$, are directly induced in the perturbed forecasts in this way (see Figures 2 and S3). This flux changes the ice surface temperature, therefore leading to higher 2 m temperature errors. The increased 2 m temperature errors in the SITCLIM set are present during the first few months but are much reduced in magnitude and spatial extent, isolated largely to locations in the eastern sector of the Eurasian coastal shelf where the largest thickness and conductive heat flux RMSE persists (see Figures 2 and S3). Errors in the turbulent heat flux can also result in 2 m temperature errors. Such errors can be caused by just small shifts in the position of the sea ice edge. This mechanism leads to a large difference in 2 m temperature errors in the Barents Sea (Figure 2, first column).
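The inverse dependence of conductive heat flux on thickness can be illustrated with a simple slab model. This sketch is not equation (S2) from the supporting information: it assumes steady linear conduction through snow-free ice, an assumed typical conductivity, an ocean freezing temperature of -1.8°C at the ice base, and a hypothetical winter surface temperature.

```python
K_ICE = 2.03  # thermal conductivity of sea ice (W m^-1 K^-1), assumed value

def conductive_flux(thickness_m, t_surface_c, t_bottom_c=-1.8, k=K_ICE):
    """Upward conductive heat flux (W m^-2) through a snow-free ice slab,
    assuming a linear temperature profile between base and surface."""
    return k * (t_bottom_c - t_surface_c) / thickness_m

# Hypothetical winter case: surface at -30 C, true thickness 2.0 m,
# climatological (perturbed) thickness 1.5 m.
flux_true = conductive_flux(2.0, -30.0)
flux_perturbed = conductive_flux(1.5, -30.0)
flux_error = flux_perturbed - flux_true
```

Under these assumptions a 0.5 m negative thickness perturbation increases the upward flux by roughly 10 W m$^{-2}$, the same order as the regional conductive heat flux errors reported above.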
The SITINIT set started in July also has significantly lower 2 m temperature RMSE than the SITCLIM set during the first season of the forecasts. In contrast to the forecasts initialized in January, during the first forecast month the additional errors in the 2 m temperature field are found in the North Atlantic and North Pacific, in the location of the July marginal ice zone (MIZ). This closely follows the pattern of increased RMSE in the sea ice concentration field (see Figure 3), indicating that the reduction in 2 m temperature error results from reduced errors in forecasting the presence of sea ice in the SITINIT set, which in turn reduce turbulent heat flux errors, rather than from errors in thickness alone. As the forecasts progress to September, an additional reduction in 2 m temperature errors is located in the September MIZ.
Differences between the RMSE of sea level pressure between the ensemble sets are generally not significant, except during the first month of the January ensembles. In January there is a significant reduction in the mean sea level pressure RMSE over northern Eurasia in the SITINIT set (see Figure 4), reducing the RMSE by as much as 26%. It is notable that this area also experiences differences in temperature (see Figure 2) and snow cover RMSE (not shown). However, discerning the mechanisms responsible is difficult in this experimental setup. It does, however, indicate that differences in sea ice thickness conditions could impact northern European and Eurasian weather forecasts at monthly timescales.

Conclusions and Discussion
We have analyzed parallel sets of idealized experiments with the HadGEM1.2 GCM to explore the importance of sea ice thickness initial conditions for seasonal predictions. In summary:
1. Accurately initializing sea ice thickness has a large impact on forecasts of sea ice cover started in July, increasing their skill in predicting the following September extent. However, the same initialization method has relatively little impact on the forecasts initialized in January for the first 6 months but does increase skill during the following August to September.
2. The improvement in skill, caused by initializing thickness, on forecasts of the overlying atmosphere is largest in winter months, when errors in the sea ice thickness field lead to errors in conductive heat flux.
3. Beyond a lead time of 8 months, there is no significant difference in forecasts of sea ice extent or atmospheric fields between those ensembles initialized with perfect sea ice thickness conditions and those initialized with climatological ones.
In this model it is clear that information on the sea ice thickness field is crucial for prediction of Arctic sea ice concentration and extent on seasonal time scales, as well as for key high latitude atmospheric surface variables on monthly timescales. This indicates that advancing capabilities to observe and assimilate sea ice thickness into coupled forecast systems could unlock a previously untapped reservoir of forecast skill. However, the magnitude and timing of the impact of sea ice thickness initialization on forecast skill is extremely dependent on the variable under investigation and the initialization month of the forecast. There is a notable lack of impact on midlatitude variables, particularly over land, although we made conservative choices in significance testing and only have a small sample of start dates.
In the Arctic there are significant uncertainties in the observed climatology of many key variables and, consequently, large intermodel differences [e.g., Massonnet et al., 2012; Koenigk et al., 2014]. Of particular importance for this study is the spread in correlation length scale and timescale between models, which varies with climatological thickness [Day et al., 2014; Blanchard-Wrigglesworth and Bitz, 2014]. Hence, in order to determine whether these findings hold in general, it will be necessary to confirm these results with other GCMs and prediction systems in the future. The latter will be complicated to achieve for summer predictions, as satellite retrievals of sea ice thickness are only currently available from October to April and do not cover July, which was used as one of the initialization months in this study. However, the method presented in this study could be used to assess the importance of sea ice thickness initialization in an operational forecast system.
Thanks to Matthieu Chevallier and an anonymous reviewer for their considered and helpful suggestions, which greatly helped to improve this manuscript. Thanks also to Jeff Ridley and François Massonnet for their insightful comments on a draft and Alison McLaren for helpful information on the HadGEM1.2 sea ice model. The APPOSITE project (grant NE/I029447/1) was funded by the U.K. Natural Environment Research Council as part of the Arctic Research Program. GCM data produced for the APPOSITE project will be openly available at the British Atmospheric Data Centre.
Eric Calais thanks Matthieu Chevallier and one anonymous reviewer for their assistance in evaluating this paper.

Figure 1. Normalized RMSE of monthly mean (a) sea ice extent and (b) sea ice volume. The time series for the SITINIT set are shown in black; the SITCLIM set are in red. Forecasts initialized in January are shown in solid lines; July are shown in dashed. Dots indicate where differences between the perturbed and original sets are significant at the 5% level, calculated using a one-sided F test.

Figure 2. Maps of the difference in the RMSE of monthly mean (first row) sea ice thickness (SIT), (second row) sea ice concentration (SIC), (third row) 2 m temperature (TAS), (fourth row) conductive heat flux through sea ice (CHF), and (fifth row) turbulent heat flux (THF = sensible + latent) in the SITCLIM set minus the SITINIT set for January starts. The difference in (first column) January, (second column) March, (third column) May, and (fourth column) September. All colored points are significant at the 5% level according to a one-sided F test.

Figure 3. Maps of the difference in the RMSE of monthly mean (first row) sea ice thickness (SIT), (second row) sea ice concentration (SIC), (third row) 2 m temperature (TAS), (fourth row) conductive heat flux through sea ice (CHF), and (fifth row) turbulent heat flux (THF = sensible + latent) in the SITCLIM set minus the SITINIT set for July starts. The difference in (first column) July, (second column) September, (third column) November, and (fourth column) March. All colored points are significant at the 5% level according to a one-sided F test.