Where the White Continent Is Blue: Deep Learning Locates Bare Ice in Antarctica

In some areas of Antarctica, blue‐colored bare ice is exposed at the surface. These blue ice areas (BIAs) can trap meteorites or old ice and are vital for understanding the climatic history. By combining multi‐sensor remote sensing data (MODIS, RADARSAT‐2, and TanDEM‐X) in a deep learning framework, we map blue ice across the continent at 200‐m resolution. We use a novel methodology for image segmentation with “noisy” labels to learn an underlying “clean” pattern with a neural network. In total, BIAs cover ca. 140,000 km2 (∼1%) of Antarctica, of which nearly 50% located within 20 km of the grounding line. There, the low albedo of blue ice enhances melt‐water production and its mapping is crucial for mass balance studies that determine the stability of the ice sheet. Moreover, the map provides input for fieldwork missions and can act as constraint for other geophysical mapping efforts.


10.1029/2023GL106285
2 of 10 the redirection of the ice flow toward the surface lead to the emergence of ice from deeper, hence older layers (Sinisalo & Moore, 2010).This ice had been once formed upstream through compaction of snow under the weight of accumulating snow and firn layers that can be up to ca. 120 m thick (Ligtenberg et al., 2011;Van den Broeke, 2008).The transformation of snow into ice reduces the air content, resulting in more absorption of light, especially in the red part of the spectrum, which explains the distinctive blue color of a thick layer of ice (Bohren, 1983).
The blue color of exposed ice is clearly visible in satellite imagery.However, detecting BIAs at the continental scale remains challenging because data coverage at high latitudes is sparse (>82.5°S, the typical limit of sun-synchronous satellites) and spatial biases exist in the data caused by varying illumination conditions and atmospheric states (e.g., Bindschadler et al., 2008).Moreover, temporary snow cover on the ice can hinder the detection of BIAs in visual imagery.Over the past decades, several efforts have been performed to detect BIAs using satellite observations in the visual and shortwave infrared spectrum (e.g., Hu et al., 2022;Hui et al., 2014;Orheim & Lucchitta, 1990;Winther et al., 2001).All of these studies use spectral properties to distinguish BIAs from other surfaces.A downside of these approaches is that the spectral properties are quantified using a limited amount of ground observations and are assumed to be applicable over a larger area, ranging from observational scenes (Jawak et al., 2023;Orheim & Lucchitta, 1990;Winther et al., 2001) to the entire continent (Hu et al., 2022;Hui et al., 2014).In these BIA mapping efforts, temporary snow cover on blue ice cannot be distinguished from accumulation areas, which results in snow cover variation being regarded as a variation of BIA extent (Hu et al., 2022;Orheim & Lucchitta, 1990).
In this study, we aim to detect areas that (generally) expose blue ice at the surface, uniformly across the continent at 200-m resolution.To this end, we combine multiple observational data sets in a convolutional neural network (CNN;LeCun et al., 2015).CNNs have found wide use in recent environmental science research (e.g., Reichstein et al., 2019;Tuia et al., 2021), including cryospheric sciences (e.g., Baumhoer et al., 2021;Van der Meer et al., 2023), and are becoming one of the most used data science tools thanks to their performance, scalability, and inherent ability to model nonlinear processes (LeCun et al., 2015).To train the CNN, we use labels of Hui et al. (2014), which are based on the thresholding of a normalized difference index of mono-temporal data of a single sensor for most of the continent.These labels form the starting point for blue ice detection with multi-sensor data captured over 3 years, where we rely on a machine learning approach that uses so-called "noisy" labels to learn an underlying "clean" pattern.In the literature, it is demonstrated that deep neural networks first learn the underlying "clean" patterns of "noisy" labels before eventually adapting to the particular errors of individual scenes during later training epochs (Arpit et al., 2017;Han et al., 2018).To make blue ice well identifiable for the CNN and reduce the sensitivity to sensor-specific or temporal-specific biases, we use complementary multi-sensor and multi-temporal satellite data.We identify when the CNN has learned the "clean" pattern and starts to adjust to the "noise" with hand-labeled validation data and we stop the training process accordingly.Finally, we generate a continent-wide BIA map with corresponding uncertainty estimates and discuss the extent and location of BIAs across the continent.

Materials and Methods
As input for the CNN, we created an Antarctic-wide composite of Moderate Resolution Imaging Spectroradiometer (MODIS) satellite observations in the visual, near infrared, and shortwave infrared part of the spectrum (i.e., the first seven bands; all details of the processing are provided in Text S1 in Supporting Information S1).MODIS Terra passes approximately fifteen times per day over the polar area, and we generated 192 daily composites of these observations covering January, February, and early March of 2008March of , 2009March of , and 2010 (for temporal alignment with other data sets used in this study).In this process, we selected the median value per day of each pixel after reprojecting the atmospherically corrected MODIS Terra 5-Min L2 swath data (Vermote, 2015(Vermote, , 2022) ) into a polar stereographic projection (Snyder, 1987;Snyder & Voxland, 1989) and filtering out cloudy observations (Text S1 in Supporting Information S1).Next, we combined all 192 daily composites into a multi-day, seven-band composite (Figure S1 in Supporting Information S1) by taking the median value of each pixel for each band.We visually assessed that the median value of the large number of observations filters the noise introduced by imperfections in the cloud masking and temporary snow cover on blue ice.The seven-band composite of surface reflectances is available at https://doi.org/10.5281/zenodo.8333864.
We complemented the seven-band composite of MODIS data with three additional data sets (Text S6 in Supporting Information S1): (a) The cloud-free, MODIS-based mosaic of surface morphology (MOA2009) (Haran et al., 2021;Scambos et al., 2007) to provide additional data in areas where our MODIS composite is, for TOLLENAAR ET AL.
10.1029/2023GL106285 3 of 10 example, distorted by clouds that were not correctly masked out; (b) The RADARSAT-2 Antarctica mosaic (Crevier et al., 2010;MacDonald, Dettwiler and Associates (MDA), 2014), a C-band radar signal that shows low backscatter values co-located with blue ice.As C-band radar can penetrate several meters into snow (Jezek, 1999;Wessel et al., 2021), these data allow, on the one hand, to detect blue ice in areas where snow patches hamper the observation of blue ice that directly underlies these typically thin layers of snow (of up to 50 cm; Sinisalo & Moore, 2010) in multi spectral data; on the other hand, C-band radar data is difficult to use or interpret without complementary data; (c) Surface elevation data (TanDEM-X PolarDEM) (Dong et al., 2021;German Aerospace Center (DLR), 2020;Wessel et al., 2021), as the occurrence of blue ice is strongly linked to the surface topography (Takahashi et al., 1992).For example, BIAs are often located close to mountains protruding the ice (nunataks), which can act as barriers to the ice flow upstream, redirecting the ice flow toward the surface (Sinisalo & Moore, 2010).Nunataks can also prevent deposition of blowing snow on the lee-side of the mountain (e.g., Folco et al., 2002).Also, in areas where the surface slope increases, katabatic winds are enhanced, resulting in the erosion of snow, and hence the exposure of ice (e.g., Markov et al., 2019).The surface topography may allow the CNN to learn physical conditions for the presence of blue ice, making the algorithm less sensitive to particular confusing observations in the other data sets (e.g., temporary snow cover on increasing slopes typical for the formation of blue ice (Takahashi et al., 1992) or blue-appearing surface lakes with a perfectly level surface).We standardized all individual data sets so that the mean equals zero and the standard deviation equals one.Next, we aligned all data on a 200-m resolution grid through bilinear interpolation.
We divided the continent into regions for training, validation, and testing (Figure S2 in Supporting Information S1).Within each region, we created tiles of 512 by 512 pixels, corresponding to ca. 100 by 100 km.To train the CNN, we used existing BIA outlines (provided in the Quantarctica data package, Matsuoka et al., 2021) that were obtained by applying a threshold to a normalized difference index of two bands of Landsat data and MODIS data to cover the center of the continent (Hui et al., 2014).Labels were created by considering the centers of the pixels intersecting with the outlines (label one; BIA) or not (label zero; no BIA) and masking out areas such as the ocean (Text S3 in Supporting Information S1).For validation and testing, in five distinct areas of 250 by 250 km, we manually outlined BIAs on the same high-resolution visual imagery as used by Hui et al. (2014) (i.e., Bindschadler et al., 2008), aided by auxiliary maps (Environmental Research & Assesment, 2015).Comparing the hand labels to the outlines of Hui et al. (2014), we observed large patches looking like snow that appeared to be included in the latter (Figure S3 in Supporting Information S1).We quantified the amount of "noise" of the labels by comparing them with the "clean" hand labels and calculated a reference performance ("F1 noisy outlines" in Table 1).In total, there are 1,062 training tiles (including shifted overlapping tiles, Figure S2 in Supporting Information S1), 8 hand-labeled tiles overlapping with the training data (only used for estimating the noise-eliminating property of the CNN), and 32 hand-labeled tiles, without any spatial overlap with the training tiles, used for validation (18) and testing (14).
To predict the presence of blue ice, we employed the common U-Net neural network architecture (Ronneberger et al., 2015) with the PyTorch library (see Text S4 in Supporting Information S1; Paszke et al., 2019;Persson, 2021).We modified the input dimension to accommodate 10 input layers (MODIS composite (7), MOA (1), RADARSAT-2 (1), DEM (1); Figure S6 in Supporting Information S1) and predict a single output layer with the probability for the presence of blue ice at each pixel, where values of 0.5 and higher indicate blue ice.The Note.The F1 score of the noisy outlines is based on the "noisy" training labels compared to the "clean" hand labels.The extent and location of the regions are indicated in Figure S3 in Supporting Information S1.TP, FP, and FN denote true positive, false positive, and false negative, respectively.Following this procedure, we trained five CNNs (with different random seeds but identical hyperparameters) on the combined training and validation data to generate multiple BIA predictions.Moreover, we generated eight different predictions for each of the five trained CNNs by changing the tiles' orientation through a flips and rotations (i.e., test-time augmentation; Shanmugam et al., 2021).The tiles' predictions were combined into seamless maps (Text S5 in Supporting Information S1).The final continent-wide BIA map is an average of the resulting 40 maps and the uncertainty estimates correspond to the standard deviations of the 40 values at each pixel.We also converted the pixel-based map of BIAs to a collection of smooth polygons so that the map can be used at any resolution (Text S5 in Supporting Information S1).

Results
Our BIA map (Figure 1) reveals that the total BIA extent is 139,700 km 2 , corresponding to 1.03% of the surface of Antarctica (grounded ice, ice shelves, and islands; Mouginot et al., 2017).The quality of the map is quantified by three performance metrics: precision, sensitivity (i.e., recall), and F1 score (equations and values in Table 1).
Precision indicates how many of the predicted BIA pixels are, in reality, also blue ice, and sensitivity indicates how many pixels are identified correctly out of all true BIA pixels.The F1 score is the harmonic mean between the precision and the sensitivity, accounting for the trade-off between the two metrics.We calculate the F1 score by comparing the predictions of the CNNs to manually outlined BIAs in (a) regions used in the development of the CNN (validation) and (b) regions that have never been seen before by the CNN (test).The performance of the CNNs in test regions is similar to the performance in validation regions.The F1 score of the "noisy" outlines is calculated by comparing the "noisy" labels to the manually outlined BIAs and acts as reference performance (Table 1).The F1 score of our predictions outperforms this reference by +14% and +11% in the validation and test areas, respectively.The manually outlined BIAs are based on the same imagery as the "noisy" labels from Hui et al. (2014), avoiding potential biases toward our predictions that are not based on this imagery.Therefore, the improved BIA detection can be attributed to our approach in which multiple observational data sets have been combined through deep learning.The error-eliminating property of the CNNs is also confirmed by comparing the CNNs' predictions in erroneously labeled areas used for training to hand labels.In these areas of disagreement, we see that the CNNs still predict correctly (according to our hand labels) for 90.2 ± 2.5% of the pixels.
Given that there are large differences between our BIA mapping and the existing single-sensor-based mapping, we verify our predictions by calculating the spectral reflectance using Landsat and MODIS data.We define four types of regions: two types of areas of agreement to define the typical spectral reflectance (i.e., spectral fingerprint) of "blue ice" and of "no blue ice," and two types of areas of disagreement, to see if the spectral fingerprint in these areas corresponds to blue ice or not (Figure 2a).We find that the spectral fingerprint of blue ice has generally a lower reflectance, especially in the red part of the spectrum, confirming the blue appearance of ice.
As most areas where no blue ice is exposed correspond to snow-covered areas, the reflectance of the "no blue ice" class is high and uniform over the visible part of the spectrum.Moreover, in the two types of areas of disagreement, the spectral fingerprints resemble the spectral fingerprint of blue ice ("AOD type II" in Figure 2) and of no blue ice ("AOD type I").This resemblance is reflected in the data used for the construction of our BIA map, as well as in the data that are not used for constructing our BIA map (Figures 2b and 2c, respectively) and confirms that the new BIA map correctly reduces false positives and false negatives in the previous mapping.
In total, we identify 6,644 BIAs with sizes varying from less than a square kilometer to hundreds of square kilometers.A few (26) BIAs are larger than 1,000 km 2 , among which the renowned, large, interconnected BIA near the Queen Fabiola/Yamato mountains of 4,000 km 2 (Yoshida, 2010), and the largest patch of ca.8,000 km 2 , consisting of interconnected BIAs upstream of the Amery ice shelf.Half of the blue ice is located near the continent's grounding line (<20.9km; Li et al., 2015;Rignot et al., 2011Rignot et al., , 2014Rignot et al., , 2016)).Inversely, 4.6% of the surface around the grounding line (±20 km) exposes blue ice.BIAs that can act as traps for meteorites and/or old ice are typically located at high elevations (>1,500 m), where surface melt does not occur (Harvey, 2003;Sinisalo & Moore, 2010).Our map indicates that 18.2% of blue ice is located in areas higher than 1,500 m, implying that less than 0.2% of the surface (25,400 km 2 ) of Antarctica exposes high-elevation, potential meteorite-rich blue ice.33.6% of this high-elevation blue ice is located at higher latitudes (>82.5°S),where only limited observations are available.
The uncertainty quantification of the BIA map (Figure S11 in Supporting Information S1) reflects the influence of noise and biases in the individual data sets (e.g., penetration depth of radar, Figure S7 in Supporting Information S1), spatial misalignment of the different data sets, absence of similar examples in the training data, under-or overfitting of the neural network, and physical processes.Elevated uncertainties linked to physical processes (Figure 3) include (a) more persistent temporary snow cover (Hu et al., 2022), (b) wind scouring (Das et al., 2013;Scambos et al., 2012), (c) blue slush or surface lakes on snow or on blue ice (Dell et al., 2022), (d) changing wind patterns (Takahashi et al., 1992), and (e) crevasses.Although a disentanglement of the different uncertainty sources is unfeasible, the continent-wide uncertainty estimates allow us to quantify the total BIA-extent uncertainty by adding/subtracting one standard deviation to/from the prediction values to obtain a lower and upper bound of 114,000 and 174,000 km 2 , respectively.The uncertainties are most pronounced near the grounding line (<20 km, encompassing 49.3% of blue ice extent), where predictions range from 79.4% to 128.1% of the mean BIA extent in this region.In mountainous terrain (defined similarly to aviation terminology as areas where, in a radius of nearly 19 km, elevation differences of over 900 m are observed; European commission, 2016), where 33.3% of blue ice is located, predictions range from 85.9% to 118.6% of the BIA extent.In higher latitudes (>82.5°S,where 14.5% of blue ice is located), predictions range from 87.3% to 117.4%, and at high elevation (>1,500 m) from 87.5% to 117.6%.

Discussion and Outlook
We present a new BIA map of Antarctica obtained by a CNN trained with labels that are (clearly) incorrect in certain areas but have continent-wide coverage.Despite these "noisy" training labels, the classes "blue ice" and "no blue ice" are distinguishable for the CNN as we (a) combine data of different satellites and (b) use a limited set of high-quality, manually outlined validation labels to stop training when the CNN starts to adapt to the errors of training labels.Using multiple CNNs, we constructed 40 BIA maps, of which the average provides a continent-wide estimate of the presence of blue ice, and the standard deviation the corresponding uncertainty.Performance metrics show that the CNNs outperform the existing BIA outlines (Table 1).Moreover, an analysis of reflectance data in the visual, near infrared, and shortwave infrared parts of the spectrum indicates that our new map correctly improves continent-wide BIA estimates.Estimated uncertainties appear closely related to physical phenomena, such as changing wind patterns and surface melt.
Compared to earlier estimates on the extent of blue ice, our estimate of 1.03% (0.83%-1.28% for plus-minus one standard deviation) is within the independently estimated ranges of Winther et al. (2001) and Hu et al. (2022) (both ∼0.8% to ∼1.6%).However, the estimate is considerably lower than the currently only openly available and widely-used BIA map of Hui et al. (2014), which indicates an area of 1.67%, or 1.71% when calculated identically to our estimate, eliminating discrepancies introduced through resolution, rasterizing data, and the considered surface area.A spatial comparison indicates that 20.9% of our BIA extent was previously not marked as BIA by Hui et al. (2014), and that 52.4% of their blue ice is not classified as blue ice in our map and could instead be considered as, for example, accumulation areas.
Our new outlines of BIAs are valuable input for estimating the mass balance of the ice sheet and its changes over time (Bell et al., 2018).More specifically, in the grounding zone, BIAs enhance local-scale melt, given the typically low albedo of BIAs in combination with katabatic winds that warm and mix the air when flowing downward over the sloping topography (Bell et al., 2018;Lenaerts et al., 2016).In these regions, observations of surface melt-water production are hampered by limited satellite observations and discrepancies across different sensors are most pronounced over BIAs (Bell et al., 2018;De Roda Husman et al., 2023).Hence, the new BIA outlines can act as constraints for modeling and observing melt-water production that can lead to the development of (downstream) lakes, which in turn can eventually result in the collapse of ice shelves and hence an increase in mass loss of the ice sheet (Bell et al., 2018;De Roda Husman et al., 2023;Liston & Winther, 2005;Van den Broeke et al., 2023).Moreover, our BIA outlines can aid other geophysical mapping efforts, such as the Based on a set of processed Landsat scenes used for LIMA (Bindschadler et al., 2008) and the outlines of Hui et al. (2014).In areas where scenes overlap, we have taken the average values.Points south of 82.5°S, where no Landsat data are available, are discarded.10.1029/2023GL106285 7 of 10 calibration of elevation measurements (Wessel et al., 2021).Also, the new BIA map provides information for fieldwork missions and for assessing the potential of certain areas to find meteorites and/or old ice (Sinisalo & Moore, 2010;Spaulding et al., 2013;Tollenaar et al., 2022).
Future directions of BIA mapping efforts reside in (a) reducing uncertainties through further investigating temporal patterns of, for example, seasonal snow cover or multi-year changes; (b) enhancing the resolution, for  Bindschadler et al., 2008) in selected regions (indicated in italic titles).The Landsat satellite images have not been used as input for the convolutional neural network (no data >82.5°S).Estimated BIAs are drawn as black outlines.(a) High uncertainties related to persistent snow cover through which radar penetrates (e.g., Jezek, 1999), resulting in low backscatter values.(b) Elevated uncertainties correspond to wind scouring zones (green outlines) that have been identified by Das et al. (2013) through an empirical model that considers the surface slope, modeled wind and accumulation.(c) Surface lakes on the Nivlisen ice shelf (Dell et al., 2022) lead to low radar backscatter values and high uncertainty estimates, while Landsat imagery shows snow cover.(d) Blue ice formed on the lee side of a barrier that prevents the deposition of wind-drifting snow, where varying wind patterns result in larger uncertainties further downstream of the nunatak (Bintanja, 1999;Takahashi et al., 1992).(e) Elevated uncertainties related to crevasses within a BIA.
instance through using the RADARSAT-2 data set that covers the entire continent at 25-m resolution (Crevier et al., 2010); and (c) improving BIA outlines near the grounding line, where uncertainties are most pronounced, by using multiple classes for training a CNN specifically adapted to these regions to delineate the diverse glacial landscape with, for example, lakes on snow, lakes on ice, exposed bedrock, and moraines.

Conclusion
In this study, we map BIAs across Antarctica and present unprecedented estimates of the corresponding uncertainties.The BIA map is obtained with multi-sensor satellite observations captured over 3 years to reduce sensitivities to temporal snow cover.In our deep-learning framework, we train a CNN with a continent-wide set of "noisy" BIA outlines.After training, 90% of what we identified as erroneously labeled training pixels are correctly classified.This error-reducing capacity is obtained through (a) providing multiple observational data sets for training and (b) using a small set of manually outlined BIAs to decide when to stop training.A similar approach could be taken to outline land covers in other domains with limited ground truth.Our map reveals that ca.1% of Antarctica exposes blue ice, with potential impact on, for example, estimates of the mass balance of the ice sheet and its changes over time.
CNN has several hyperparameters (e.g., learning rate, weight decay) that influence its performance, which we optimized empirically through a random search (Text S4 in Supporting Information S1).During training, batches of training tiles (batch size of 20) are passed through the CNN and its internal parameters are updated after each batch to reduce the error between the predictions and the training labels.After all training tiles are passed through the CNN once, an epoch is completed and the performance of the CNN is validated on hand-labeled validation data (using the F1-score; Table1).Training stopped when no improved average validation F1 score was obtained during 20 epochs or after a maximum of 100 epochs (which was never reached).With our hardware setup containing an NVIDIA Quadro RTX 8000 GPU, a training session takes less than 2 hr.

Figure 1 .
Figure 1.Blue ice area (BIA) map, with prediction values resulting from averaging predictions of 40 different BIA maps obtained by five trained convolutional neural networks (see Section 2).Corresponding uncertainties are displayed in Figure S11 in Supporting Information S1.Systematic patterns within the prediction values between 0 and 0.5 correspond to sloping terrain.

Figure 2 .
Figure 2. Spectral fingerprints.Median, and 25th and 75th quantiles of spectral reflectances at 10,000 randomly selected points per type of area (e.g., area of agreement, blue ice).Bandwidths corresponding to blue, green, and red are indicated in respective colors, near infrared is indicated in orange.Histograms of values per band are provided in Figures S12 and S13 in Supporting Information S1.(a) Definition of areas of agreement and areas of disagreement between our outlines and the blue ice area outlines of Hui et al. (2014).(b) Based on the generated MODIS composite (Section 2).(c)Based on a set of processed Landsat scenes used for LIMA(Bindschadler et al., 2008) and the outlines ofHui et al. (2014).In areas where scenes overlap, we have taken the average values.Points south of 82.5°S, where no Landsat data are available, are discarded.

Figure 3 .
Figure 3. Blue ice area (BIA) prediction values, estimated uncertainties, radar backscatter (Crevier et al., 2010; MacDonald, Dettwiler and Associates (MDA), 2014), and Landsat satellite images (false color, Bindschadler et al., 2008) in selected regions (indicated in italic titles).The Landsat satellite images have not been used as input for the convolutional neural network (no data >82.5°S).Estimated BIAs are drawn as black outlines.(a) High uncertainties related to persistent snow cover through which radar penetrates (e.g., Jezek, 1999), resulting in low backscatter values.(b) Elevated uncertainties correspond to wind scouring zones (green outlines) that have been identified by Das et al. (2013) through an empirical model that considers the surface slope, modeled wind and accumulation.(c) Surface lakes on the Nivlisen ice shelf(Dell et al., 2022) lead to low radar backscatter values and high uncertainty estimates, while Landsat imagery shows snow cover.(d) Blue ice formed on the lee side of a barrier that prevents the deposition of wind-drifting snow, where varying wind patterns result in larger uncertainties further downstream of the nunatak(Bintanja, 1999;Takahashi et al., 1992).(e) Elevated uncertainties related to crevasses within a BIA.

Table 1
Averaged Performance Metrics of Five Convolutional Neural Networks (in Percentage), With Indicated Standard Deviations