Subseasonal Prediction of Regional Antarctic Sea Ice by a Deep Learning Model

Antarctic sea ice concentration (SIC) prediction at seasonal scale has been documented, but a gap remains at subseasonal scale (1–8 weeks) due to limited understanding of ice‐related physical mechanisms. To overcome this limitation, we developed a deep learning model named Sea Ice Prediction Network (SIPNet) that can predict SIC without the need to account for complex physical processes. Compared to mainstream dynamical models like European Centre for Medium‐Range Weather Forecasts, National Centers for Environmental Prediction, and Seamless System for Prediction and Earth System Research developed at Geophysical Fluid Dynamics Laboratory, as well as a relatively advanced statistical model like the linear Markov model, SIPNet outperforms them all, effectively filling the gap in subseasonal Antarctic SIC prediction capability. SIPNet results indicate that autumn SIC variability contributes the most to sea ice predictability, whereas spring contributes the least. In addition, the Weddell Sea displays the highest sea ice predictability, while predictability is low in the West Pacific. SIPNet can also capture the signal of ENSO and SAM on sea ice.

Moreover, the inter-model spread in SIPN South is much larger than the observational uncertainty (Massonnet et al., 2022).
Although earlier investigations have made some progress in examining the seasonal sea ice prediction in the Antarctic, revealing the significant influence of upper-ocean heat content on winter sea-ice forecasts (Bushuk et al., 2021;Marchi et al., 2019) and the crucial role of sea-ice thickness in predicting summer sea ice (Bushuk et al., 2021;Morioka et al., 2021), research on the subseasonal scale (1-8 weeks) is limited and challenging. Zampieri et al. (2019) found that the subseasonal forecasting skill of the Antarctic sea ice edge is 30% lower than that of the Arctic.
Given the paucity of observations and challenges in simulating sea ice physics in the Antarctic, is it feasible to pursue an alternative approach by utilizing deep-learning (DL) methodology for sea ice forecasting at the subseasonal scale? By extracting sea ice spatiotemporal features at multiple scales, DL has an immense potential to capture signals of sea ice predictability and avoid errors caused by incomplete parameterization in the complicated ocean-atmosphere-ice system (Andersson et al., 2021;Chi & Kim, 2017;Kim et al., 2020;Liu et al., 2021). In this study, we develop a DL model called sea ice prediction network (SIPNet) to predict subseasonal Antarctic sea ice concentration (SIC) using only SIC as input. The aim is to examine whether the DL model can significantly outperform dynamical and statistical models and fill the gap in Antarctic sea ice forecasts at the subseasonal timescale.

Data
We used daily bootstrap SICs from 1979 to 2021 obtained from the National Snow and Ice Data Center (NSIDC) (Comiso, 2017) to train our DL model and evaluate its performance. The data set was derived from passive microwave radiometers with a spatial resolution of 25 km. To examine possible data dependencies, we conducted a comparative analysis utilizing daily SICs from the Ocean and Sea Ice Satellite Application Facility (OSI SAF) version 3 (Sørensen et al., 2022), with identifiers OSI-430-a (1979-2020) and OSI-450-a (2021). The data set was derived from passive microwave radiometers in conjunction with European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis version 5 (ERA5). SIC predictions from ECMWF and the National Centers for Environmental Prediction (NCEP) database of the subseasonal to seasonal (S2S) Prediction project (Vitart et al., 2017) and the Seamless System for Prediction and Earth System Research developed at Geophysical Fluid Dynamics Laboratory (GFDL-SPEAR; Delworth et al., 2020) were used for the comparative analysis. Additionally, Southern Annular Mode (SAM; Marshall, 2003) and Niño 3.4 indices (Rayner et al., 2003) were utilized to assess the SIPNet's ability to capture climate-variability signals in sea ice forecasting.

Methods
Unlike dynamical models, which rely on coupled processes among sea ice, atmosphere, and ocean, SIPNet uses only SIC data for training and forecasting, since sea ice variability is an integrated result of atmosphere-ocean-sea ice interactions. SIPNet comprises a multi-scale nested encoder-decoder structure with four main modules: input, encoder, decoder, and output (Figure 1a). We use the historical SIC of the preceding 8 weeks to predict the following 8 weeks. The input is the previous 8-week SIC sequence with dimensions 304 × 320 × 8, which has been cropped in the spatial dimensions. The reason and method for data cropping can be found in Text S1 in Supporting Information S1. The encoder reduces the spatial dimensions of the input SIC sequences by max-pooling and increases the number of feature maps through convolution layers (Ren et al., 2022). This process captures spatiotemporal connections from feature maps at different down-sampled scales. The decoder consists of a series of upsampling and convolutional layers that increase the spatial dimensions of the feature maps and reduce the number of sequences. At each stage, concatenations connect the feature maps acquired by the encoder and the multi-scale nested decoder at the same level, facilitating the integration of spatial dependencies across multiple scales. The decoder's output feature maps are passed to the output module, which comprises a single convolutional neural network (CNN) layer with eight 1 × 1 convolutional kernels. A sigmoid activation layer is then applied to the convolutional outputs, predicting the SIC in each grid cell. The loss is computed by comparing the predicted SIC with the observed SIC, and the model is trained by minimizing this loss step by step. The model uses zero padding to ensure that the output maps have the same spatial dimensions as the input SIC.
More detailed information regarding the model is provided in Text S1 in Supporting Information S1.
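As a rough illustration of how the spatial and channel dimensions evolve through such an encoder-decoder, the following Python sketch traces feature-map shapes only; it performs no convolution and is not the SIPNet implementation. The number of levels (`depth=4`), the filter counts (`base_filters=32`, doubling per level), and the function name `trace_shapes` are assumptions made for illustration, and the sketch follows a plain U-Net layout that omits the nested skip paths described above. Only the 304 × 320 × 8 input and the eight-channel 1 × 1 output layer come from the text.

```python
def trace_shapes(height=304, width=320, in_channels=8,
                 base_filters=32, depth=4, out_channels=8):
    """Trace (H, W, C) feature-map shapes through a U-Net-style network.

    depth and base_filters are illustrative assumptions, not SIPNet settings.
    """
    shapes = [("input", (height, width, in_channels))]
    h, w, c = height, width, base_filters
    skips = []
    # Encoder: convolution sets the channel count, 2x2 max-pooling halves H and W.
    for level in range(depth):
        shapes.append((f"enc{level}_conv", (h, w, c)))
        skips.append((h, w, c))
        h, w = h // 2, w // 2
        shapes.append((f"enc{level}_pool", (h, w, c)))
        c *= 2
    shapes.append(("bottleneck", (h, w, c)))
    # Decoder: upsampling doubles H and W, concatenation adds the encoder
    # channels from the same level, and convolution reduces the channel count.
    for level in reversed(range(depth)):
        h, w = h * 2, w * 2
        skip_h, skip_w, skip_c = skips[level]
        assert (h, w) == (skip_h, skip_w), "skip connection shapes must match"
        shapes.append((f"dec{level}_concat", (h, w, c + skip_c)))
        c = skip_c
        shapes.append((f"dec{level}_conv", (h, w, c)))
    # Output module: eight 1x1 kernels followed by a sigmoid, one map per week.
    shapes.append(("output_1x1_sigmoid", (h, w, out_channels)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:>20s}: {shape}")
```

Under these assumptions the 304 × 320 grid can be halved four times without remainder, because both dimensions are divisible by 16, and the output recovers the input's spatial dimensions.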
The SIC data set spanning 43 years was partitioned into three sets: training (1979-2010), model validation (2011-2016), and assessment (2017-2021). Compared to daily data, weekly average SIC has a higher signal-to-noise ratio and longer persistence. To address the inadequate weekly time series for model training, we apply data preprocessing before input (Figure 1a), which employs a strategy of expanding the weekly time series from daily data. See the Input section of Text S1 in Supporting Information S1 for details. These created weekly SIC time series are only employed for model training and validation. Model assessment is conducted using standard weekly data.
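One way to expand a weekly series from daily data is to start the 7-day averaging window on each of the seven possible days, which gives roughly seven times as many training sequences as a single fixed weekly calendar. The Python sketch below illustrates this idea on a scalar daily series. The actual procedure is described in Text S1 and may differ, so the offset scheme and the function names `weekly_means`, `make_samples`, and `expanded_samples` are assumptions.

```python
def weekly_means(daily, window=7):
    """Average consecutive, non-overlapping 7-day blocks of a daily series."""
    return [sum(daily[s:s + window]) / window
            for s in range(0, len(daily) - window + 1, window)]


def make_samples(weekly, n_in=8, n_out=8):
    """Pair each run of n_in weeks with the n_out weeks that follow it."""
    total = n_in + n_out
    return [(weekly[s:s + n_in], weekly[s + n_in:s + total])
            for s in range(len(weekly) - total + 1)]


def expanded_samples(daily, n_in=8, n_out=8):
    """Assumed expansion: build a weekly series from every start-day offset."""
    samples = []
    for offset in range(7):
        weekly = weekly_means(daily[offset:])
        samples.extend(make_samples(weekly, n_in, n_out))
    return samples


if __name__ == "__main__":
    daily = [float(d) for d in range(140)]  # 20 weeks of synthetic daily values
    print(len(make_samples(weekly_means(daily))), "standard samples")
    print(len(expanded_samples(daily)), "expanded samples")
```

In practice each daily value would be a two-dimensional SIC field rather than a scalar, but the windowing logic is the same.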
In addition, the forecasting results were evaluated seasonally, with the seasons defined as winter (June to August), spring (September to November), summer (December to February), and autumn (March to May). Three metrics, the anomaly correlation coefficient (ACC, see Equation 1), mean absolute error (MAE, see Equation 2), and integrated ice-edge error (IIEE, see Equation 3, Goessling et al., 2016) between observations and predictions, are used to assess the SIC prediction skill.
$$\mathrm{ACC} = \frac{\sum_{i=1}^{n} (p_i - \bar{p})(g_i - \bar{g})}{\sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2}\,\sqrt{\sum_{i=1}^{n} (g_i - \bar{g})^2}} \tag{1}$$
where $p_i$ and $g_i$ represent the predicted and observed SIC anomalies at time $i$, $\bar{p}$ and $\bar{g}$ are their means over the time series, and $n$ is the length of the time series.
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| P_i - G_i \right| \tag{2}$$
where $P_i$ represents the predicted SIC value at time $i$ and $G_i$ represents the observed SIC value at time $i$.
$$\mathrm{IIEE} = O + U \tag{3}$$
where IIEE is defined as the total area over which the predicted and observed sea ice extent (SIE) disagree, comprising both overestimated (O) and underestimated (U) areas. We create anomaly time series for 2017-2021 by subtracting the climatology of the same period from the weekly mean data. These metrics are calculated only in areas covered by sea ice (Figure 1c). The Antarctic is divided into five subregions based on the NSIDC classifications. In addition to the ECMWF, NCEP, and GFDL dynamical models, we also compare skill against a linear Markov model (Chen & Yuan, 2004; Wang, Yuan, Bi, et al., 2023). In contrast to earlier Markov models built in multivariate space, the Markov model here is constructed from SIC alone. Throughout the text, "target week" refers to the week being predicted, and "lead week" refers to the number of weeks before the target week at which the forecast was initialized.
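The three metrics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the evaluation code used in the study: ACC is taken as the centered correlation of anomalies, the ice edge is defined by a 15% SIC threshold (the common convention for sea ice extent), and every grid cell is assigned a nominal area of 625 km² (25 km × 25 km), whereas true cell areas vary across a polar stereographic grid. The function names are hypothetical.

```python
import math


def acc(pred_anom, obs_anom):
    """Anomaly correlation coefficient between two equal-length series."""
    n = len(pred_anom)
    p_bar = sum(pred_anom) / n
    g_bar = sum(obs_anom) / n
    cov = sum((p - p_bar) * (g - g_bar) for p, g in zip(pred_anom, obs_anom))
    var_p = sum((p - p_bar) ** 2 for p in pred_anom)
    var_g = sum((g - g_bar) ** 2 for g in obs_anom)
    denom = math.sqrt(var_p * var_g)
    return cov / denom if denom > 0 else float("nan")


def mae(pred, obs):
    """Mean absolute error between predicted and observed SIC values."""
    return sum(abs(p - g) for p, g in zip(pred, obs)) / len(pred)


def iiee(pred_sic, obs_sic, cell_area_km2=625.0, threshold=0.15):
    """Integrated ice-edge error over flattened grids of SIC fractions.

    Returns (IIEE, O, U) in km^2. The threshold and the constant cell area
    are assumptions made for this sketch.
    """
    over = under = 0.0
    for p, g in zip(pred_sic, obs_sic):
        pred_ice, obs_ice = p >= threshold, g >= threshold
        if pred_ice and not obs_ice:
            over += cell_area_km2      # ice predicted where none is observed
        elif obs_ice and not pred_ice:
            under += cell_area_km2     # observed ice that was not predicted
    return over + under, over, under


if __name__ == "__main__":
    print(acc([0.1, -0.2, 0.3], [0.2, -0.1, 0.4]))
    print(mae([0.5, 0.2], [0.4, 0.4]))
    print(iiee([0.2, 0.1, 0.9, 0.0], [0.1, 0.3, 0.8, 0.0]))
```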

SIPNet Versus Other Models at Subseasonal Scale
Here predictive capabilities of the SIPNet model were compared to the above-mentioned dynamical and statistical models. We also evaluated the prediction skill by using anomaly persistence forecast as a benchmark. The findings indicate substantial variation in forecast skill across various models, as depicted in Figure 2. In terms of MAE, all the dynamical models fail to exhibit any predictive skill compared to anomaly persistence, highlighting the errors in initial conditions made by the dynamical prediction systems. Zampieri et al. (2019) compared multiple S2S prediction systems for Antarctic sea ice and found that most systems tend to overestimate the SIE. The initial error in the dynamical models can have various sources, including adjustment of the sea ice edge during data assimilation, use of different sea ice observations in the assimilation and verification, and interpolation errors due to regridding (Zampieri et al., 2018).
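For reference, an anomaly persistence forecast carries the anomaly at initialization forward unchanged and adds it to the climatology of each target week. The Python sketch below shows this for a single grid cell. The function name, the 52-week climatology, and the clipping of the result to the valid SIC range [0, 1] are assumptions made for illustration.

```python
def anomaly_persistence(initial_sic, climatology, init_week, max_lead=8):
    """Persist the initial SIC anomaly over the following lead weeks.

    climatology: weekly climatological SIC (fractions) for one grid cell.
    init_week:   index of the initialization week within the climatology.
    """
    n_weeks = len(climatology)
    anomaly = initial_sic - climatology[init_week % n_weeks]
    forecasts = []
    for lead in range(1, max_lead + 1):
        target_week = (init_week + lead) % n_weeks
        value = climatology[target_week] + anomaly
        # Clip to the physically valid range (an assumption of this sketch).
        forecasts.append(min(1.0, max(0.0, value)))
    return forecasts


if __name__ == "__main__":
    clim = [0.5] * 52
    print(anomaly_persistence(0.6, clim, init_week=10))
```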
SIPNet consistently outperforms the dynamical models and the anomaly persistence model, with a lower MAE across all lead times. Without modeling the interactions between the atmosphere, ice, and ocean, SIPNet considers only SIC and still achieves higher forecast skill. As sea ice variability is a product of integrated ocean-atmosphere interactions, SIPNet can learn from the past and capture the main features of ice variability. While the Markov model outperforms most dynamical models in terms of MAE, it still has lower skill than SIPNet at all lead times. SIPNet also exhibits strong predictive capabilities for the sea ice edge, with the lowest IIEE, consistently below 1.78 × 10⁶ km² across all lead times (Figure 2b). This performance is significantly superior to that of the dynamical models.
In terms of ACC, the dynamical models, especially ECMWF, exhibit better performance than the anomaly persistence model, which displays a rapid loss in prediction skill (Figure 2c). This suggests that the ECMWF model captures the phase of SIC anomalies well in the Antarctic, but struggles in predicting the magnitude of SIC variability, as reflected by its large MAE. The fact that the dynamical models have some ACC skill suggests that their MAE skill has the potential to be further improved through post-forecast bias correction. GFDL-SPEAR exhibits the lowest ACC at lead times of 1-4 weeks, indicative of SIC initialization errors. Note that the GFDL-SPEAR predictions, which are developed for seasonal applications, are initialized solely on the first day of each month, resulting in only 60 prediction samples during the period from 2017 to 2021, which is much fewer than other models (at least 260 samples). This may potentially limit the forecast skill. GFDL-SPEAR demonstrates the capability of skillful predictions in regional Antarctic SIE at the seasonal timescale, surpassing a persistence forecast (Bushuk et al., 2021). SIPNet outperforms all other models, is skillful (ACC > 0.5) at 1-4 lead weeks, and showcases exceptional predictive ability in capturing both the phase and amplitude of Antarctic SIC anomalies.
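The post-forecast bias correction mentioned above can take many forms. One simple variant, sketched below in Python, estimates the mean forecast error at each lead time from past forecasts and subtracts it from new forecasts. Removing a constant offset leaves the correlation with observations unchanged (apart from clipping) while it can lower the MAE, which is why a model with ACC skill but a large MAE is a natural candidate. This is an illustrative assumption about how such a correction might look, not a procedure applied in this study, and the function names are hypothetical.

```python
def lead_bias(hindcasts, observations):
    """Mean forecast-minus-observation error at each lead time.

    hindcasts[k][l] and observations[k][l] hold the SIC of past forecast k
    at lead index l and the matching observation.
    """
    n_cases = len(hindcasts)
    n_leads = len(hindcasts[0])
    return [sum(h[l] - o[l] for h, o in zip(hindcasts, observations)) / n_cases
            for l in range(n_leads)]


def correct(forecast, bias):
    """Subtract the lead-dependent bias and clip to the valid SIC range."""
    return [min(1.0, max(0.0, f - b)) for f, b in zip(forecast, bias)]


if __name__ == "__main__":
    past_fc = [[0.6, 0.7], [0.8, 0.9]]
    past_obs = [[0.5, 0.5], [0.7, 0.7]]
    bias = lead_bias(past_fc, past_obs)
    print(bias)
    print(correct([0.5, 0.5], bias))
```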
We have performed additional analysis utilizing the OSI SAF SICs to assess the sensitivity of our model skill to different datasets. The results indicate that the model skill derived from both datasets is remarkably similar (Figure 2), suggesting that the differences in these two datasets are not sufficient to influence the model skill.
To examine whether the model relies excessively on the length of the training set, we trained SIPNet separately using 20 years (1993-2012) and 10 years (2005-2014) of data. The model skill (ACC) in these two experiments decreased by only 4.4% and 7.7%, respectively, compared to the model trained using 31 years of data (Figure S1 in Supporting Information S1). This suggests that the length of the training set only slightly influences the model skill. We have also investigated the model's predictive capacity for SIE extremes in February and September. Figures S2a and S2b in Supporting Information S1 reveal that the model's ability to forecast extreme SIE decreases as lead time increases. However, the forecast error remains relatively small at lead times of 1-3 weeks, especially during the rapid decline events in 2015/2016, when the predicted SIE error is 0.082 × 10⁶ km² at a 3-week lead in February. The prediction error for extreme SIE also does not depend on the SIE anomaly (Figures S2c and S2d in Supporting Information S1). Next, we examine the spatial and seasonal distributions of SIPNet's prediction skill.

Seasonal and Regional Features of Model Skill
Since ECMWF performs better than the other dynamical models in Antarctic sea ice prediction, we use it here to represent dynamical models. We present the 4-week-lead predictions from the SIPNet, Markov, and ECMWF models to reveal the seasonal and spatial features of prediction skill (Figure 3 and Figure S3 in Supporting Information S1). Consistent with earlier work (Bushuk et al., 2021), our results indicate that the predictability of Antarctic sea ice is seasonally dependent, reflecting different physical conditions. During autumn, SIPNet performs best, as evidenced by its highest ACC (Figure 3a) and lowest MAE (Figure S3 in Supporting Information S1). This superior performance could be related to the physical mechanism of fall predictability provided by predictable ocean heat content anomalies (Holland et al., 2013), presumably reflected in the initial SIC field. Conversely, SIPNet's lowest skill occurs in spring, when it registers the lowest ACC and highest MAE. The forecasting error during spring mainly occurs in the marginal ice zone. However, the model's ACC is still significant, as it can skillfully capture the phase of sea ice anomalies despite a relatively large error in their magnitude. During the cold season, SIC around the Antarctic continent away from the ice edge is nearly 100% with limited variability, resulting in relatively low ACC. However, the model is adept at capturing the sea ice climatology in these areas, so the MAE is small (less than 6%). Even in the West Pacific, the MAE is less than 18%, considerably less than the 20%-35% SIC standard deviation (Figure S4 in Supporting Information S1).
SIPNet exhibits skill patterns similar to those of the Markov model across all seasons, especially during the cold season (Figures 3a and 3b). Compared to the Markov model, SIPNet has higher skill in the Amundsen and Weddell Seas in autumn, where the Antarctic Dipole (ADP) is located (Figure 3d). This also reflects the limitations of the Markov model, which predicts the principal components under fixed spatial patterns of the coupled climate modes and therefore does not represent the synchronized spatiotemporal evolution of sea ice. Additionally, the ADP is the dominant mode of Antarctic sea ice variability, driven by ENSO and Antarctic air-sea interactions. The high skill in these regions suggests that SIPNet may be able to capture the remote climate-forcing signal on sea ice, particularly in fall (Figure S3d in Supporting Information S1). In addition, Figure 3 and Figure S3 in Supporting Information S1 illustrate a notable disparity between the skill patterns of SIPNet and ECMWF. Compared to ECMWF, SIPNet primarily demonstrates reduced MAE across all seasons, especially in spring, as depicted by green shades in Figure S3e in Supporting Information S1.
To further quantify the model skill in each region, we spatially averaged MAE and ACC at lead times of 1-8 weeks, as shown in Figure 4a and Figure S5 in Supporting Information S1, respectively. The results show that SIPNet outperforms persistence in terms of MAE and ACC across all regions at all lead times. The Weddell Sea shows the best performance, with the smallest MAE (less than 8%) even at a lead time of 8 weeks in the cold season, while also maintaining a low MAE in the warm season. Its ACC also leads other regions, except for the Amundsen and Bellingshausen Sea (Figure S5 in Supporting Information S1). The Markov and ECMWF models also outperform persistence in the Weddell Sea during summer and winter. Several factors could contribute to the high predictability of sea ice in the Weddell Sea. First, the high variability of sea ice associated with ENSO and the Amundsen Sea Low (ASL) could provide crucial SIC predictability. Additionally, the Weddell Sea dominates the multiyear ice coverage of the Southern Ocean and has a longer SIC persistence due to its thick ice. The steady eastward sea ice drift may be another contributing factor (Bushuk et al., 2021). The Indian Ocean displays a smaller MAE (below 5%) even at an 8-week lead time (Figure 4a) and a higher ACC (Figure S5 in Supporting Information S1) in autumn. The MAE in the Ross Sea is lower in winter but higher in summer. The high variability of Ross Sea ice does not contribute much to the overall model skill, possibly due to the complex physical processes involved. The West Pacific exhibits larger MAE and lower ACC across all seasons, likely because the region lacks strong atmospheric circulation that drives sea ice, leading to a lower signal-to-noise ratio in sea ice variability. Additionally, the West Pacific is characterized predominantly by thin ice and seasonal ice, leading to shorter ice persistence.
In summary, our SIPNet model performs best in the Weddell Sea and worst in the West Pacific.
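The regional averaging can be sketched as follows in Python. The longitude boundaries are the commonly used five-sector division of the Southern Ocean and are assumed here to match the NSIDC-based subregions used in this study. The average is unweighted, which assumes equal-area grid cells, and the names `SECTORS`, `in_sector`, and `regional_mean` are hypothetical.

```python
# Assumed sector boundaries in degrees east, (west edge, east edge).
SECTORS = {
    "Weddell Sea": (-60.0, 20.0),
    "Indian Ocean": (20.0, 90.0),
    "West Pacific": (90.0, 160.0),
    "Ross Sea": (160.0, -130.0),  # wraps across the date line
    "Amundsen/Bellingshausen Sea": (-130.0, -60.0),
}


def in_sector(lon, west, east):
    """True if a longitude falls inside [west, east), allowing wrap-around."""
    lon = (lon + 180.0) % 360.0 - 180.0  # normalise to [-180, 180)
    if west <= east:
        return west <= lon < east
    return lon >= west or lon < east


def regional_mean(values, lons, ice_mask):
    """Average a per-cell metric (e.g., MAE) over ice-covered cells by sector."""
    sums = {name: 0.0 for name in SECTORS}
    counts = {name: 0 for name in SECTORS}
    for value, lon, has_ice in zip(values, lons, ice_mask):
        if not has_ice:
            continue  # metrics are only evaluated where sea ice occurs
        for name, (west, east) in SECTORS.items():
            if in_sector(lon, west, east):
                sums[name] += value
                counts[name] += 1
                break
    return {name: sums[name] / counts[name]
            for name in SECTORS if counts[name] > 0}


if __name__ == "__main__":
    cell_mae = [0.1, 0.3, 0.2, 0.5]
    cell_lon = [0.0, -30.0, 170.0, 100.0]
    cell_ice = [True, True, True, False]
    print(regional_mean(cell_mae, cell_lon, cell_ice))
```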
In addition, SIPNet exhibits higher ACC than ECMWF in most regions, except for the Weddell Sea and West Pacific in spring (Figure S5 in Supporting Information S1). It also demonstrates better ACC than the Markov model in most regions, except for the Ross Sea in autumn at 5-8 week leads. All three models concur that the Amundsen/Bellingshausen Sea exhibits the largest forecast errors. Interestingly, they all also demonstrate the highest ACC in this region (Figure 4a and Figure S5 in Supporting Information S1), indicating that these models perform well in capturing SIC anomaly signals but struggle slightly with the magnitudes of anomalies in the region. The intricate physical mechanisms underlying this phenomenon merit further investigation, yet are beyond the scope of the present study. Moreover, whether the high ACC in the Amundsen and Weddell Seas reflects SIPNet's ability to capture the signal of climate forcing on sea ice requires further discussion.
We regressed the predicted and observed SIC anomalies from 2017 to 2021 onto the ENSO and SAM indices (Figures 4b and 4c). El Niño events generate stationary Rossby wave trains that curve poleward (Ding et al., 2012; Meehl et al., 2016; Yuan, 2004), known as the Pacific-South American (PSA) pattern (Mo & Higgins, 1998), weakening the ASL (Simpkins et al., 2016). The offshore wind associated with the ASL contributes to the formation of more sea ice over the Weddell Sea through both cold atmospheric advection and offshore ice drift (Li et al., 2014; Meehl et al., 2019; Raphael et al., 2016). Conversely, onshore winds melt and compress sea ice around the Amundsen and Bellingshausen Sea region, thus generating out-of-phase sea ice anomalies similar to the ADP pattern (Yuan, 2004). Although SAM is the leading mode of extratropical Southern Hemisphere atmospheric circulation (Thompson & Wallace, 2000), its influence on sea ice is weaker than that of the ENSO-related PSA (Yuan & Li, 2008). During a positive SAM phase, westerly winds intensify, producing a sea ice anomaly pattern opposite to that related to El Niño (Figure 4b), consequently weakening sea ice variability and reducing the strength of the ADP. The anomaly patterns presented in Figures 4b and 4c are highly similar, demonstrating that the SIPNet model captures the responses of sea ice to these forcings well.
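A regression of this kind reduces, at each grid cell, to the least-squares slope of the SIC anomaly series on the climate index. The Python sketch below computes such slopes for a list of grid cells. It assumes a simple simultaneous (zero-lag) regression with one index at a time, which may differ from the exact setup behind Figures 4b and 4c, and the function names are hypothetical.

```python
def regression_slope(index, anomalies):
    """Least-squares slope of an anomaly series on a climate index.

    The result is the SIC anomaly per unit change in the index.
    """
    n = len(index)
    x_bar = sum(index) / n
    y_bar = sum(anomalies) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(index, anomalies))
    var = sum((x - x_bar) ** 2 for x in index)
    return cov / var


def regression_map(index, anomalies_by_cell):
    """Apply the regression independently to every grid cell's time series."""
    return [regression_slope(index, series) for series in anomalies_by_cell]


if __name__ == "__main__":
    nino34 = [-1.0, 0.0, 1.0, 2.0]           # synthetic index values
    cells = [[-0.2, 0.0, 0.2, 0.4],          # cell responding positively
             [0.1, 0.0, -0.1, -0.2]]         # cell responding negatively
    print(regression_map(nino34, cells))
```

Applying the same function to the observed and predicted anomaly fields yields two maps whose similarity can then be compared.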

Concluding Remarks
In this study, we developed a DL model (called SIPNet) to forecast Antarctic SIC at the subseasonal timescale. To assess the model's predictive capability, we compared its performance against advanced dynamical models such as ECMWF, NCEP, and GFDL-SPEAR, as well as a linear Markov model.
We found that all the dynamical models fail to exhibit any predictive skill compared to anomaly persistence in terms of MAE. The Markov model is limited in predicting the synchronized spatiotemporal evolution of sea ice because it retains fixed spatial patterns of the coupled climate modes. By efficiently capturing the spatiotemporal features of sea ice and bypassing the intricate physical mechanisms, SIPNet outperforms persistence in terms of both MAE and ACC across all seasons and regions, at all lead times. Seasonally, SIPNet performed best in autumn, with the lowest MAE and the highest ACC, and worst in spring. Spatially, the Weddell Sea was the best-predicted region, with the smallest MAE (less than 9%) even at an 8-week lead time and a significant ACC, while the West Pacific was the worst-predicted region, with a larger MAE and lower ACC. In addition, despite the opposite SIC anomaly patterns related to ENSO and SAM from 2017 to 2021, which result in a weaker ADP pattern, SIPNet can still capture these forcing signals well. Furthermore, we demonstrated that the skill of SIPNet is not highly sensitive to the length of the training time series. SIPNet also exhibits strong capabilities in predicting sea ice edges and extreme sea ice events.
Overall, SIPNet shows exceptional predictive ability in capturing both the phase and amplitude of Antarctic SIC anomalies and outperforms the ECMWF, NCEP, GFDL-SPEAR, and Markov models at the subseasonal timescale. By employing an encoder-decoder structure that incorporates a residual CNN (ResNet) with TSAM, SIPNet can capture the intricate nature of sea ice variability, including nonlinear characteristics and spatiotemporal dependencies, while bypassing complex model physics and parameterization. SIPNet also has the advantages of minimal computing requirements and fast computation. The model effectively fills the gap in Antarctic SIC prediction at the subseasonal timescale. In future work, we will explore the possibility of enhancing the model's skill by incorporating various climate variables, including sea ice thickness (SIT), ocean heat content (OHC), sea surface temperature (SST), air temperature (AT), wind, and atmospheric pressure fields. We will also consider the influence of climate factors such as ENSO, ASL, SAM, and wave-3. These enhancements hold the potential to further improve the model's performance.