Predictability of Antarctic Sea Ice Edge on Subseasonal Time Scales

Coupled subseasonal forecast systems with dynamical sea ice have the potential of providing important predictive information in polar regions. Here, we evaluate the ability of operational ensemble prediction systems to predict the location of the sea ice edge in Antarctica. Compared to the Arctic, Antarctica shows on average a 30% lower skill, with only one system remaining more skillful than a climatological benchmark up to ∼30 days ahead. Skill tends to be highest in the west Antarctic sector during the early freezing season. Most of the systems tend to overestimate the sea ice edge extent and fail to capture the onset of the melting season. All the forecast systems exhibit large initial errors. We conclude that subseasonal sea ice predictions could provide marginal support for decision‐making only in selected seasons and regions of the Southern Ocean. However, major progress is possible through investments in model development, forecast initialization and calibration.


Introduction
Reliable predictions of the sea ice edge location are becoming increasingly important to ensure the safety of human activities at both poles. Furthermore, providing skillful predictions has been recognized as an important scientific challenge that will need to be addressed in the coming years (Alley et al., 2019). Previous efforts of the research community have focused mostly on the Arctic, partly due to the higher economic interests that are at stake and due to its proximity to highly populated regions. While the number of stakeholders that require sea ice predictions in the Arctic is relatively large and ranges from shipping companies to tourism (Emmerson & Lahn, 2012; Stephenson et al., 2011), Antarctic sea ice predictions in the past were relevant mostly for logistical aspects related to research activities. However, in recent years the tourism industry has also been flourishing around Antarctica (Eijgelaar et al., 2010), and the presence of the fishing industry in the Southern Ocean is expected to increase as well (Cheung et al., 2010; Smetacek & Nicol, 2015), calling for reliable Antarctic sea ice forecasts to manage the risks that come with enhanced activities.
Sea ice forecasting is relevant not only at short "weather" time scales (forecasts up to 10 days ahead) but also at subseasonal and seasonal time scales (forecasts from weeks to months ahead). The work by Chen and Yuan (2004) is one of the first attempts at providing seasonal predictions of the Antarctic sea ice cover using a statistical approach. Holland et al. (2013) evaluate the mechanisms of Antarctic sea ice predictability. More recently, Ordoñez et al. (2018) compared sea ice predictability between the Arctic and Antarctic. Both these studies are based on climate models as research tools. The systematic investigation of operational sea ice prediction systems, with the assimilation of the observed sea ice state and possibly ensemble based, is still at a very early stage.
While the Sea Ice Outlook (Blanchard-Wrigglesworth et al., 2017; Stroeve et al., 2014) established a framework to build and evaluate Arctic late-summer sea ice prediction capabilities in 2008, a similar exercise for the Antarctic region, targeting the February sea ice minimum (SIPN South, 2017), has been initiated only very recently (Massonnet et al., 2018, 2019), that is, almost 10 years later. In fact, the international scientific community has recognized the need to advance the field of sea ice prediction at both poles simultaneously. In this sense, the present study contributes to closing an important knowledge gap. The recently established database of the Subseasonal to Seasonal (S2S) Prediction Project (Vitart et al., 2012, 2016) has proven to be valuable for evaluating the predictive skill of operational S2S ensemble forecast systems in the Arctic (Wayand et al., 2019; Zampieri et al., 2018). The availability of comprehensive sets of both reforecasts and real-time forecasts allows for a robust assessment of the forecast skill over a relatively long time period (>10 years), covering the whole seasonal cycle. Here, we extend the analysis by Zampieri et al. (2018) for the Arctic to Antarctica, addressing the following two guiding questions:
• Are fully coupled forecasting systems in the Antarctic better than observation-based benchmark forecasts in predicting the sea ice edge?
• Does the predictive skill of dynamical forecast systems differ between the two hemispheres?
Thereby, the goal is to establish a reference against which future progress in Antarctic sea ice prediction can be quantified. To our knowledge, this study is the first assessment of the S2S forecast systems in the Antarctic, especially when it comes to focusing on the sea ice edge position, which is a crucial variable for navigation and for planning human activities in the Southern Ocean.

Data and Methods
The sea ice forecasts are verified against observations using a verification metric suitable for quantifying the accuracy of the sea ice edge location. The resulting forecast error is compared to that of observation-based benchmark forecasts to assess the predictive skills of the forecast systems and to understand associated shortcomings and model biases. This section briefly describes the main features of forecasts, observations, verification metrics, and benchmark forecasts used in this study. A more detailed description of the methods, forecasts, and observations can be found in the work of Zampieri et al. (2018), including its supporting information.

Forecasts and Observations
The ensemble sea ice forecasts considered here belong to the S2S database (Vitart et al., 2016), which provides sea ice concentration as a standard output variable. Here we focus on the six forecasting systems that employ a dynamical sea ice model in their coupled model: the National Centers for Environmental Prediction (NCEP), China Meteorological Administration (CMA), Météo-France (MF), European Centre for Medium-Range Weather Forecasts (ECMWF), UK Met Office (UKMO), and the Korea Meteorological Administration (KMA) forecast systems. Additionally, we also consider the old version of the ECMWF forecast system in which the sea ice concentration was prescribed based on combining initial sea ice fields with relaxation toward climatological fields (ECMWF Pres.), a method that could be described as "damped persistence." The technical features of these forecast systems are quite diverse: They differ in terms of initialization frequency (from daily to monthly), ensemble size (from 3 to 15 ensemble members), forecast length (from 44 to 60 days), and assimilation strategy. Only some of the systems directly assimilate sea ice concentration from observations and none of them assimilates sea ice thickness. Here, we consider the raw forecast data without calibration (bias/drift correction). The S2S webpage (2015) includes a detailed description of the S2S forecast systems.
The observations used to verify the forecasts are daily sea ice concentration fields retrieved from passive-microwave satellite measurements (OSI-450; OSI SAF, 2017; Lavergne et al., 2019). The sea ice edge has been defined as the 15% sea ice concentration contour line for both the forecast ensemble members and the observations. The verification results are averaged over a 12-year reforecast period (1999–2010) common to all of the S2S forecast systems. All the analyses have been conducted with the sea ice observation fields interpolated to the 1.5° × 1.5° grid on which the S2S forecasts are provided. A common conservative land-sea mask has been obtained by combining the land-sea masks of all the models and observations based on the following criterion: If a grid cell is classified as land in one forecast system or in the observations, such classification is extended to all the other forecast systems, thus excluding that grid cell from all the analyses. The verification has been constrained to this land mask to allow a fair comparison between the different systems.
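The two preprocessing steps described above, thresholding the concentration fields at 15% and merging the land-sea masks, can be sketched as follows. This is a minimal illustration and not the code used in the study; the function names, the array layout (members, latitude, longitude), and the convention that concentrations are fractions in [0, 1] are assumptions made for the example.

```python
import numpy as np

def sea_ice_probability(sic_members, threshold=0.15):
    """Fraction of ensemble members whose concentration exceeds the threshold.

    sic_members: array of shape (n_members, n_lat, n_lon), values in [0, 1]
    (assumed layout). For a single observed field, pass shape (1, n_lat, n_lon)
    to obtain a binary 0/1 field.
    """
    return (np.asarray(sic_members) > threshold).mean(axis=0)

def common_land_mask(land_masks):
    """Conservative mask: a cell is land if ANY product classifies it as land.

    land_masks: iterable of boolean arrays (True = land), all on the same grid.
    """
    return np.logical_or.reduce([np.asarray(m, dtype=bool) for m in land_masks])
```

Cells flagged by `common_land_mask` would then be excluded from every system and from the observations before any score is computed.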

Verification Metrics
The basic verification metric employed in this study is the Spatial Probability Score (SPS), which is defined as follows:

$$\mathrm{SPS} = \int_A \left[ P_f(x) - P_o(x) \right]^2 \, dA \tag{1}$$

Here $P_f$ and $P_o$ are the local sea ice probabilities (SIP: the ensemble-based probability of sea ice concentration being above a certain threshold, here 15% unless stated otherwise) of the forecast and the observation, respectively, at location $x$. A property of the SPS that makes this metric suitable for verifying ensemble forecasts is its ability to deal directly with probabilities, which avoids degrading probabilistic forecasts to deterministic ones. Since the sea ice observations considered here are deterministic and not probabilistic, their SIP simply consists of binary fields with 0 (no ice) and 1 (ice-covered cell). $A$ is the integration domain, which is the Northern Hemisphere for the Arctic forecasts and the Southern Hemisphere for Antarctic forecasts.
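On a regular latitude-longitude grid, the area integral of the squared difference between forecast and observed sea ice probability can be approximated by an area-weighted sum over grid cells. The sketch below assumes a spherical Earth with a radius of 6371 km and a boolean `valid` array that encodes both the hemisphere of interest and the common land-sea mask; these choices and all function names are illustrative assumptions.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption

def cell_areas(lat_centers_deg, dlat_deg, dlon_deg):
    """Area (km^2) of one grid cell per latitude row on a regular lat-lon grid."""
    lat = np.deg2rad(np.asarray(lat_centers_deg, dtype=float))
    dlat = np.deg2rad(dlat_deg)
    dlon = np.deg2rad(dlon_deg)
    return EARTH_RADIUS_KM**2 * dlon * (np.sin(lat + dlat / 2) - np.sin(lat - dlat / 2))

def spatial_probability_score(p_f, p_o, area, valid):
    """Area-weighted sum of (P_f - P_o)^2 over valid cells, in km^2.

    p_f, p_o, area, valid: arrays of shape (n_lat, n_lon).
    """
    sq_diff = (np.asarray(p_f, dtype=float) - np.asarray(p_o, dtype=float)) ** 2
    return float(np.sum(np.where(valid, sq_diff * area, 0.0)))
```

For a 1.5° grid, `cell_areas(lats, 1.5, 1.5)[:, None]` broadcast along longitude gives the `area` array expected by `spatial_probability_score`.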
Unlike the pan-Arctic sea ice extent, which measures only the total sea ice coverage, the SPS is designed to capture the accuracy of the sea ice spatial distribution and thus that of the sea ice edge location. Furthermore, the SPS can be decomposed into an Overestimation component (O = SPS fraction caused by a local overestimation of the ice edge extent) and an Underestimation component (U = SPS fraction caused by a local underestimation of the ice edge extent), which provide additional insight into the type of the forecast error (Zampieri et al., 2018). Finally, the SPS can also be normalized (Norm. SPS) if divided by the length of the sea ice edge (Melsom et al., 2019; Palerme et al., 2019). The Norm. SPS provides an estimate of the average distance between the (probabilistic) forecast edge and the (deterministic) observed edge. An advantage of this version of the metric is that it is easily understandable by potential forecast users. In this study the length of the observed climatological sea ice edge, defined as the median of the climatological SIP (Figure S1 in the supporting information), is used as normalization factor to assess longitudinal variations in Antarctic sea ice forecast skill (section 3.3).
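One natural way to realize the decomposition is to split the squared probability difference by its sign, so that O collects the cells where the forecast probability exceeds the observed one and U collects the remaining cells, with O + U equal to the SPS. The sketch below follows that reading; it is an assumption about the implementation rather than a description of the original code, and the function names are hypothetical. Dividing an area (km²) by an edge length (km) yields the distance interpretation of the Norm. SPS.

```python
import numpy as np

def sps_components(p_f, p_o, area):
    """Split the SPS into overestimation (O) and underestimation (U) parts.

    Assumes O integrates the squared difference where P_f > P_o and U where
    P_f < P_o, so that O + U equals the SPS.
    """
    diff = np.asarray(p_f, dtype=float) - np.asarray(p_o, dtype=float)
    over = float(np.sum(np.clip(diff, 0.0, None) ** 2 * area))
    under = float(np.sum(np.clip(-diff, 0.0, None) ** 2 * area))
    return over, under

def normalized_sps(sps_km2, edge_length_km):
    """SPS (km^2) divided by ice edge length (km): a mean edge distance in km."""
    return sps_km2 / edge_length_km
```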

Benchmark Forecasts
The predictive skill assessment of the forecast systems is based on the following approach: If for a given lead time the forecast SPS is lower than the SPS of the observation-based benchmarks, we consider this system to have predictive skill for that lead time. We employ two benchmark forecasts as reference to assess the predictive skills of the S2S forecast systems: (1) a probabilistic climatological forecast (CLIM) based on the observed sea ice conditions of the 10 years previous to the forecast target time at the same time of the year and (2) a deterministic persistence forecast (PERS) based on the observed sea ice state at the forecast initial time.
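The two benchmarks can be built directly from binary observed ice fields. In the sketch below, CLIM is taken as the cell-wise fraction of the 10 preceding years with ice present on the target calendar day, and PERS simply carries the observed field at initial time forward to every lead time. The dictionary layout keyed by year and the requirement that a forecast beat every benchmark passed in are assumptions made for this example.

```python
import numpy as np

def clim_forecast(obs_ice_by_year, target_year, n_years=10):
    """Probabilistic climatology: ice frequency over the n_years before target_year.

    obs_ice_by_year: dict mapping year -> binary ice field (0/1) observed on
    the target calendar day of that year (assumed layout).
    """
    years = range(target_year - n_years, target_year)
    return np.mean([obs_ice_by_year[y] for y in years], axis=0)

def pers_forecast(obs_ice_at_init):
    """Deterministic persistence: the observed field at initial time, unchanged."""
    return np.asarray(obs_ice_at_init, dtype=float)

def has_skill(sps_forecast, sps_benchmarks):
    """True if the forecast SPS is lower than the SPS of every benchmark given."""
    return all(sps_forecast < b for b in sps_benchmarks)
```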

Comparison of the Annual Mean Forecast Skills at the Two Poles
The annual mean forecast skills in predicting the Arctic and Antarctic sea ice edge location are shown in Figure 1 in terms of the SPS. In the following, we first focus on the Antarctic and then compare the predictive skills in the two hemispheres.
The ECMWF system (yellow line) is overall the most skillful system when it comes to predicting the Antarctic ice edge location. The system outperforms the CLIM and PERS benchmark forecasts from about days 5 to ∼30. The UKMO and KMA forecast systems (green and purple lines), which share the same model configuration, exhibit virtually identical results and show marginal predictive skill from days 8 to 15. The old version of the ECMWF forecast system (ECMWF Pres., magenta line) is less skillful than the benchmarks at all lead times and is characterized by a nonmonotonic growth of the forecast error. The nonmonotonicity is caused by the blending of different observations: First, the initial sea ice conditions are persisted up to day 15 of the forecast, and afterward the sea ice concentration is relaxed toward the climatological state based on the observations of the 5 years before the forecast target date. The NCEP forecast system (light blue line) shows a rapid growth of the forecast error and has on average no predictive skill over the benchmarks. The wide uncertainty band is the result of large interannual variability of the NCEP forecast error. The MF forecast system exhibits an error 30% larger than CLIM already at initial time, growing further with lead time. Finally, the CMA forecast system (not visible in Figure 1 because out of range for all lead times) is affected by strong biases related to the lack of assimilation of sea ice observations as well as to significant model biases in the polar regions. In the Antarctic, the ice edge extent is almost always and everywhere underestimated (Figure 3), pointing to a widespread warm bias in the CMA system.
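Statements such as "skillful from about days 5 to ∼30" follow from comparing the forecast SPS curve with the benchmark curves lead day by lead day. A minimal sketch of that comparison is given below; it assumes that a lead day counts as skillful only when the forecast SPS lies below the lower of the benchmark curves, and the function name is hypothetical.

```python
import numpy as np

def skillful_lead_days(lead_days, sps_forecast, *sps_benchmarks):
    """Lead days on which the forecast SPS is below every benchmark SPS.

    All inputs are 1-D sequences of equal length, indexed by lead day.
    """
    best_benchmark = np.minimum.reduce(
        [np.asarray(b, dtype=float) for b in sps_benchmarks]
    )
    skillful = np.asarray(sps_forecast, dtype=float) < best_benchmark
    return [int(d) for d, s in zip(lead_days, skillful) if s]
```

Because persistence is nearly perfect at very short lead times while climatology is hard to beat at long lead times, a window of skill that opens only after a few days and closes after a few weeks is the expected shape of the result.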
The results indicate some similarities between the two hemispheres. First, the model ranking in the Antarctic is comparable to that in the Arctic. The only exception is the NCEP forecast system, which shows a degradation of its predictive skill in the Southern Ocean relative to the skills of the other systems and benchmarks. With the exception of April and May, the NCEP sea ice edge extent tends to be overestimated in most places (Figure 3), pointing to a prevailing cold bias. Since the same sea ice model physics are implemented for both hemispheres, our results suggest that the NCEP forecast system would benefit from a more careful tuning of its parameters to match better the observed state in the Southern Ocean. A second feature common to the two hemispheres is the large initial error, which amounts to ∼50% of the CLIM error in the decently initialized systems (ECMWF, UKMO, KMA). As described in Zampieri et al. (2018), the initial error can have multiple sources, such as the adjustment of the sea ice edge to the sea surface temperature during the data assimilation, employment of different sea ice observations in the assimilation and verification phases and finally interpolation errors due to the regridding of the model and observational data to the coarse S2S grid. Understanding the relative contributions of different sources to the total initial error is challenging and beyond the scope of the present study.
Selected forecast users might be interested in the verification of different sea ice concentration contours rather than the usual 15% threshold that defines the ice edge. Figure S2 shows a moderate error reduction when considering a higher threshold (50%), both for the forecast systems (only ECMWF is displayed) and for the climatological benchmark. This leads to a slight increase of the predictive skill at longer lead times (the forecast loses predictive skill at day 39 instead of day 37) that could be explained by a reduced sensitivity of the compact ice to weather events. Moreover, we observe a substantial reduction of the initial error (∼40%), suggesting that this error is in part caused by a misrepresentation of dispersed sea ice in the marginal ice zone.
Finally, an obvious difference between the annual mean forecast errors in the two hemispheres is their overall magnitude. The Antarctic SPS is on average 2.6 times larger than the Arctic SPS. This difference is in part explained by the fact that the Antarctic sea ice edge is on average 1.8 times longer than the Arctic one (Figure S1). If one assumes errors in terms of ice edge distance to be regionally independent, then the forecast SPS would tend to be proportional to the length of the edge. However, under this assumption the sea ice edge length difference can explain only ∼70% of the hemispheric SPS discrepancy, while the remaining ∼30% reflects increased errors in terms of ice edge distance in the Antarctic. A way to account for variations in ice edge length explicitly is to normalize the SPS with the ice edge length; such an approach is taken in section 3.3.
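The ∼70% figure can be reproduced from the two ratios quoted above if it is read as the ratio of the edge-length factor to the SPS factor. This reading is an assumption, since the exact calculation is not spelled out in the text; the subscripts Ant and Arc and the symbol $L$ for the ice edge length are introduced here for illustration only.

```latex
\frac{\mathrm{SPS}_{\mathrm{Ant}}}{\mathrm{SPS}_{\mathrm{Arc}}} \approx 2.6,
\qquad
\frac{L_{\mathrm{Ant}}}{L_{\mathrm{Arc}}} \approx 1.8
\quad\Longrightarrow\quad
\frac{1.8}{2.6} \approx 0.69 \;(\sim 70\%),
\qquad
\frac{2.6}{1.8} \approx 1.44
```

Under the proportionality assumption, the residual factor of about 1.44 would correspond to a mean ice edge distance error that is roughly 40% larger in the Antarctic than in the Arctic.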

Seasonality and Components of the Antarctic Forecast Error
One of the strengths of the S2S database is the availability of forecasts all year round for a period of time longer than a decade. This allows us to assess seasonal variations of the forecast error.
The CLIM benchmark forecast exhibits seasonal variations of the SPS that correlate well to the length of the sea ice edge (Figure 2, dashed curves; compare with Figure S1). The SPS reaches its minimum value in March, immediately after the annual sea ice extent minimum and when the sea ice edge is the shortest. The CLIM SPS slowly grows during the following months as the ice edge becomes longer and stretches further to the north. The CLIM SPS maximum is finally reached during the melting season in November and December, when the Antarctic sea ice edge is the longest.
In general, the S2S forecast systems exhibit similar seasonal variations as the CLIM benchmark, in particular at the initial time. The only exception is CMA, which, as already mentioned, is affected by strong model and data assimilation-related biases that we do not further discuss. The ECMWF seasonality is in line with the CLIM benchmark, with the forecast error approaching the climatological error with increasing lead time.
Only during the second half of the freezing season (May to August) do the forecast errors at longer lead times significantly exceed the CLIM error due to an overall overestimation of the sea ice edge extent (Figure 3; ECMWF). The UKMO and KMA systems show a similar freezing-season bias, also linked to an overestimation of the ice edge extent. These two systems exhibit an additional degradation of the predictive skills during the melting season (December and January, Figure 2) for lead times longer than 18 days. This suggests that the two systems have difficulties transitioning into the sea ice melting regime when initialized during a maximum-extent phase. The NCEP forecast system is characterized by a similar bias that is largest during the melting season. Specifically, NCEP strongly overestimates the ice edge extent during most of the year, except in the first 2 months of the freezing season (March to May, Figure 3).
Figure 4 displays the longitudinal variation of the forecast and CLIM benchmark errors in terms of the Norm. SPS. In agreement with our previous findings, only the ECMWF forecast system is still partially skillful after one forecast month. The forecast error exceeds the error of the climatological benchmark after 32 forecast days in the east Antarctic sector (from 80°E to 170°E; Figure 4) and even earlier in the Haakon VII Sea. However, the system is skillful up to day 44 in some portions of the west Antarctic sector (Ross, Amundsen, and Weddell Seas), where the Norm. SPS remains up to 40 km lower compared to CLIM. The other forecast systems lose their predictive skill much faster and none of them is skillful at the monthly range in any location around Antarctica (Figure 4). The very similar UKMO and KMA systems are on average skillful up to day 18 (green lines lower than CLIM), whereas the remaining systems lose their predictive skill before day 8 (ECMWF Pres. and NCEP) or are not even skillful at initial time (MF and CMA).

Regional Skill in Terms of Ice Edge Distance
The skill in predicting the sea ice edge location differs substantially among the S2S forecast systems. However, the analysis of the annual mean longitudinal variation of the forecast error also reveals some features common to multiple systems. The forecasts are overall less skillful (relative to the climatological benchmark) in the eastern Antarctic [0°E; 180°E] than in the western Antarctic [−180°E; 0°E]. This does not necessarily imply that the models are particularly good at capturing the evolution of the sea ice edge in the west Antarctic regions, but rather that the climatological forecasts are more accurate in the eastern sectors because of a lower sea ice edge variability. Both CLIM (Figure 4; gray-dashed line) and the S2S forecasts (colored lines) exhibit larger errors in terms of ice edge distance (Norm. SPS) in the Ross and Weddell Seas, suggesting that formulating accurate subseasonal sea ice edge predictions in these regions is challenging because of the high complexity and variability of the local climate system. Our results agree with Massonnet et al. (2018), who find large sea ice area prediction uncertainties in the Weddell and Ross Seas for late summer.
A further error peak can be observed in the west Haakon VII Sea (0°E to 40°E). Unlike the previous error peaks in the Ross and Weddell Seas (featured both in the CLIM benchmarks and the S2S forecasts), the west Haakon VII Sea error peak is more pronounced for the forecast systems (ECMWF, ECMWF Pres., UKMO, KMA, and NCEP) than for the CLIM benchmark. The NCEP system displays a particularly fast error growth with lead time in this region. In contrast, in the more skillful systems (ECMWF, UKMO, and KMA) this regional error peak appears to be caused mainly by correspondingly large initial errors (≥100 km). More generally, the Antarctic average initial error in these systems is considerable (≥∼70 km), suggesting again that investments into the sea ice initialization procedure appear promising to enhance predictive capacity.

Discussion
This study provides the first thorough assessment of the skill of current operational ensemble forecasting systems in predicting the location of the Antarctic sea ice edge on subseasonal timescales. We find that only one of the considered forecast systems outperforms two benchmarks (persistence and climatology) for a wide range of lead times, namely from about 5 to 30 days. On average, the other systems perform worse than either persistence or climatology at any lead time considered here. The forecasts are in general more skillful in the west Antarctic sector than in the east Antarctic sector, where the climatological benchmark forecast provides a more accurate estimate of the sea ice edge location. In particular, the ECMWF forecast system outperforms the climatological benchmark forecast in the Ross, Amundsen, and Weddell Seas, where predictive skill up to 44 days into the forecast is found.
We identify two types of errors that are common to several forecast systems: (i) a "freezing-season bias" that affects ECMWF, UKMO, KMA, and MF and (ii) a "melting-transition bias" that affects UKMO, KMA, and NCEP (Balan-Sarojini et al., 2019;Blockley & Peterson, 2018). Both are caused by a systematic overestimation of the sea ice edge location (i.e., predicted to be too northward). While the first bias can be explained by a misrepresentation of thermodynamical processes in the coupled models, with the oceanic surface cooling and freezing too rapidly, the second bias could be linked to an initial overestimation of the sea ice thickness, which would delay the melting onset and thus the ice edge retreat in spring. At the moment we are not able to test this last hypothesis because the S2S database does not include sea ice thickness as a standard output variable.
The hemispheric comparison reveals that differences between the Arctic and Antarctic cannot be explained by differences in the sea ice edge length alone. This holds not only for the S2S forecast systems but also for the climatological benchmark forecast, suggesting that larger model biases in the Southern Ocean are not the major cause for this difference, but rather that this is due to an intrinsic property of the Antarctic climate system. The Antarctic forecast skill degradation points to a higher variability of the Antarctic sea ice edge at subseasonal time scales compared to the Arctic. Similar differences in skill between the hemispheres have been found for atmospheric predictions in polar regions and beyond (Bauer et al., 2015; Jung & Matsueda, 2016).
Given the relatively large forecast errors, ranging from 50 km to 250 km even for the best forecast systems, sea ice edge forecasts with state-of-the-art operational systems need to be used carefully. However, there might be some useful applications already. One example relates to the medium-term planning of ship tracks to optimize the provision of research stations in the Antarctic continent during the brief Antarctic summer and at the beginning of the freezing season. Furthermore, the probabilistic nature of the S2S forecasts could be beneficial for identifying the possibility of extreme sea ice conditions.
Our results suggest that current sea ice edge forecast capabilities for the Southern Hemisphere are lagging behind those for the Northern Hemisphere. Nevertheless, we anticipate that major improvements in forecast models and initialization techniques, together with further in situ observations to better understand the physical processes at the atmosphere-sea ice-ocean interfaces, will render Antarctic sea ice forecasts a valuable resource for guiding operational decision making in the Southern Ocean.