Assessment of ICESat-2 Sea Ice Surface Classification with Sentinel-2 Imagery: Implications for Freeboard and New Estimates of Lead and Floe Geometry

high-resolution surface elevation profiling across the entire globe, including the sea ice cover of the Arctic and Southern Oceans. For sea ice applications, successfully discriminating returns between sea ice and open water is key for accurately determining freeboard (the extension of sea ice above local sea level) and new information regarding the geometry of sea ice floes and leads. We take advantage of near-coincident optical imagery obtained from the European Space Agency (ESA) Sentinel-2 (S-2) satellite over the Western Weddell Sea of the Southern Ocean in March 2019 and the Lincoln Sea of the Arctic Ocean in May 2019 to evaluate the surface classification scheme in the ICESat-2 ATL07 and ATL10 sea ice products. We find a high level of agreement between the ATL07 (specular) lead classification and visible leads in the S-2 imagery in these two coincident images across all six ICESat-2 beams, increasing our confidence in the freeboard products and deriving new estimates of the sea ice state. The S-2 overlays provide additional, albeit limited, evidence of the misclassification of dark leads due to clouds. Dark leads are no longer used to derive sea surface and thus freeboard as of the third release (r003) of the ICESat-2 sea ice products. We show estimates of lead fraction and more preliminary estimates of chord length (a proxy for floe size) using two metrics for classifying sea surface (lead) segments across both the Arctic and Southern Ocean for the first winter season of data collection.

Arctic sea ice heights and freeboards (October 2018 to March 2019) was presented in Kwok et al. (2019b). Sea ice thickness has been estimated from these freeboards using external snow loading estimates from the NASA Eulerian Snow on Sea Ice Model (NESOSIM) and modified versions of the Warren climatology, and through the combination of total (ice plus snow) freeboard from ICESat-2 with estimates of ice freeboard from the CryoSat-2 radar altimeter to derive snow depth and freeboard concurrently (Kwok et al., 2020a). The surface heights and freeboards were assessed against coincident airborne elevation data collected by NASA's Operation IceBridge (OIB; Kwok et al., 2019a). The along-track heights generated from the Airborne Topographic Mapper (ATM, Studinger, 2014) flown on OIB showed strong agreement with ATL07-derived heights, while the derived freeboard from ATM showed only small (0-4 cm) differences with ATL10 freeboards, depending on the method used to derive and compare freeboard. Coincident imagery obtained during the OIB surveys by the Continuous Airborne Mapping By Optical Translator (CAMBOT) pointed to erroneous classification of "dark leads" in the presence of clouds (Kwok et al., 2020b). A dark lead classification is associated with a dark surface and relatively low photon rates, unlike the high photon rates associated with specular leads (described more in Section 2.1). The dark lead segment classification is still included in the latest (r003) ATL07 sea ice height and surface classification product release, but dark leads have now been excluded from the derivation of freeboard in the latest (r003) ATL10 product release (the ATL07 and ATL10 products are described more in the following section). This change remains in place for the upcoming r004 data release. The Kwok et al. (2020b) study highlighted the significant utility of coincident high-resolution imagery for better understanding the performance of the ICESat-2 sea ice surface classification algorithm.
The assessment, however, was hindered by the high consolidation of the ice pack in the region profiled during the spring 2019 OIB campaign, limiting the number of leads/open water segments and the quality of the freeboard estimates due to the lack of reliable sea surface tie-points.
The main objective of this study is to further assess the performance of the surface classification scheme in ATL07 for discriminating between sea ice and open water. We take advantage of near-coincident optical imagery from the Sentinel-2 satellite mission to assess this specific aspect of the ATL07 algorithm performance. Successfully discriminating returns between sea ice and open water has benefits beyond deriving freeboard. Additional metrics for understanding the sea ice state include concentration, lead fraction (or density) and floe size. In the marginal seas, where waves can break up and fracture the ice (Horvat et al., 2020, and references therein), ice floes are thought to be smaller and less concentrated/consolidated (a higher lead fraction) than within the pack ice. The geometry of floes and leads is an important control on the strength of sea ice and its thermodynamic response to forcing (Feltham, 2005; Horvat et al., 2016), particularly in marginal, coastal, and seasonally ice-covered seas. The floe size distribution (Rothrock & Thorndike, 1984) is increasingly being introduced into sea ice components of climate models (Bateson et al., 2020; Roach et al., 2018), but only limited basin-scale observational constraints exist to date, for example, estimates derived from ESA's CryoSat-2 radar altimeter for the Arctic Ocean only (Horvat et al., 2019). ICESat-2 offers the exciting potential to provide new observational estimates of the floe size distribution, benefiting from the small footprint, high precision and along-track sampling rate across the multiple beams. Satellite tracks (e.g., those from ICESat-2) profile ice floes across random/unknown cross sections of the floe. The along-track cross section of a floe measured by satellite is commonly referred to as a chord length.
A collection of floe chord lengths can provide statistics of floe geometry, for example, moments of the floe size distribution and the open water fraction, under certain assumptions about the underlying floe geometry (Horvat et al., 2019).
The coincident S-2 scenes provide the ideal means for assessing new lead fraction and chord length estimates from ICESat-2 data, complementing the freeboard/thickness estimates already being generated. We choose to discuss lead fraction instead of sea ice concentration as concentration/extent was discussed in Horvat et al. (2020) and an assessment/comparison with the commonly used passive microwave-derived concentrations (taking into account the significant sampling differences) is beyond the scope of this study.
et al., 2020c) and the recent changes to this algorithm for Release 003 (r003) are discussed more in Kwok et al. (2020b), so here we provide only a basic overview of the methodology relevant to the results/discussion presented in this study.
ATL07 is generated by aggregating 150 photons from the ATL03 geolocated photon product (Neumann et al., 2019) independently along each of the six beams. The beams are arranged in "strong" and "weak" beam pairs with each beam pair separated by ∼3.3 km in the across-track direction and the strong/weak beams separated by ∼90 m across track and ∼2.5 km along track. A surface finding routine (ATL07/10 ATBD r003, Section 4.2) first windows the photon heights around an expected sea surface, then fits a best-guess Gaussian height distribution (convolved with the expected system response) to the photon height histogram to determine (1) a single segment height, (2) an associated quality flag based on the goodness of fit, and (3) associated metrics including photon rate. The heights are expressed relative to the WGS84 ellipsoid with the mean sea surface (MSS) and various time-variable geophysical corrections removed: ocean tides, solid earth tides, ocean loading, solid earth pole tides, and inverted barometer. Surface-classified height segments are produced for each of the six beams (three strong/weak pairs) independently. The photon rates of the strong beam are roughly four times higher than those of the weak beam, which results in mean segment lengths of ∼15 m for the strong beam and ∼60 m for the weak beam (Kwok et al., 2019b). Adding the individual laser footprint size of ∼11 m to the segment length provides an estimate of the spatial resolution of the segments (i.e., a mean of ∼26 m × 11 m for the strong beam and ∼71 m × 11 m for the weak beam).
An empirically based decision-tree algorithm is used to discriminate the returns between sea ice and open water (Kwok et al., 2016). The empirical thresholds were determined principally by (1) data collected in campaigns prior to the launch of ICESat-2 by the Multiple Altimeter Beam Experimental Lidar (MABEL), a test-bed instrument for ICESat-2 (McGill et al., 2013); and (2) postlaunch evaluation of the ATLAS performance. The three inputs to this decision tree (and the physical justifications) are as follows:
1. Photon rate (photon returns per laser pulse, the apparent reflectivity of the surface). Low photon rates are associated with water or very thin ice in open leads, which are generally darker than sea ice. However, specular and quasi-specular returns have been observed from smooth open water/thin ice surfaces in both ICESat and MABEL data.
2. Width of the Gaussian fit to the photon height distribution (the small-scale surface roughness). Used to partition the surface types into smooth/rough categories.
3. Background photon rate. Deviations compared to the photon rate indicate shadows, specular reflections, and/or atmospheric contamination.
The result of the decision tree determines the radiometric surface type, which includes the following classifications (denoted by the height_segment_type flag): cloud (0), snow covered, gray or rough ice (1), specular leads (2-5), smooth dark leads (6-7), and rough dark leads (8-9). This is considered the winter-time decision tree. As discussed in Section 1, in the latest ATL10 data release (r003), dark lead segments (height_segment_type = 6-9) have been dropped from the derivation of the local reference sea surface height (as detailed below) due to possible issues with cloud attenuation, so now only specular leads (height_segment_type = 2-5) are used for deriving the reference sea surface height and freeboard (Kwok et al., 2020b).
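The flag ranges listed above can be written down as a small lookup. The following is only an illustrative sketch: the function names and the returned label strings are our own (hypothetical) and are not part of the ATL07 product; only the integer ranges are taken from the text.

```python
def radiometric_surface_type(height_segment_type: int) -> str:
    """Map an ATL07 height_segment_type value to the class names used in the text.

    The label strings are illustrative; only the integer ranges come from the text.
    """
    if height_segment_type == 0:
        return "cloud"
    if height_segment_type == 1:
        return "ice"  # snow covered, gray or rough ice
    if 2 <= height_segment_type <= 5:
        return "specular_lead"
    if 6 <= height_segment_type <= 7:
        return "smooth_dark_lead"
    if 8 <= height_segment_type <= 9:
        return "rough_dark_lead"
    raise ValueError(f"unexpected height_segment_type: {height_segment_type}")


def eligible_for_sea_surface_r003(height_segment_type: int) -> bool:
    """In r003 only specular leads (2-5) can contribute to the reference sea surface."""
    return 2 <= height_segment_type <= 5
```

For example, `radiometric_surface_type(7)` returns `"smooth_dark_lead"`, and `eligible_for_sea_surface_r003(7)` is `False` because dark leads are excluded in r003.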
Due to the expected uncertainty in this radiometric surface classification approach (Kwok et al., 2016), a further filtering is applied based on the local height distribution:
4. Local height filter, based on the distribution of local (10 km) smooth height segments ($h_{sm}$), that is, segments with a Gaussian width < 0.13 m. Specifically, the lead segment height must be between the minimum of the smooth heights ($h_{sm\_min}$) and the maximum of either the second percentile of the smooth heights or $h_{sm\_min} + 2\sigma$, where σ = 2-3 cm is the expected uncertainty in surface height over smooth surfaces.
The result of the radiometric decision tree and local height filter sets the sea surface height flag (ssh_flag, 0 = sea ice, 1 = sea surface). Sea surface segments are considered candidate lead segments for deriving the reference sea surface height and thus freeboard in ATL10. Note that the summertime (non-winter) decision tree simply extends the classification of ice, specular leads and dark leads as potential melt ponds (not a feature of this analysis).
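The local height filter (step 4) can be sketched as follows. This is a minimal illustration rather than the ATBD implementation: the function names are hypothetical, σ is set to 0.02 m (the lower end of the 2-3 cm range quoted above), and a linear-interpolation percentile is assumed because the text does not specify the percentile method.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0-100) of a non-empty sequence."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def passes_local_height_filter(lead_height, heights, widths,
                               pct=2.0, sigma=0.02, max_width=0.13):
    """Check one candidate lead height against the local (10 km) smooth heights.

    heights, widths: segment heights (m) and Gaussian widths (m) in the local window.
    pct: height percentile (2 in the product; 20 in the relaxed variant used later).
    sigma: assumed height uncertainty over smooth surfaces (m).
    """
    smooth = [h for h, w in zip(heights, widths) if w < max_width]
    if not smooth:
        return False  # no smooth segments, so no local reference is available
    h_sm_min = min(smooth)
    upper = max(percentile(smooth, pct), h_sm_min + 2.0 * sigma)
    return h_sm_min <= lead_height <= upper
```

With smooth heights of 0.00, 0.01, 0.30, and 0.50 m, the upper bound is set by $h_{sm\_min} + 2\sigma$ = 0.04 m, so a candidate at 0.03 m passes while one at 0.10 m is rejected.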
Freeboards are calculated in ATL10 in 10 km along-track sections based on a reference sea surface derived from the available lead/sea surface segment heights (ATL07/10 ATBD r003, Section 5.1). Briefly, consecutive sea surface segments are grouped together into individual "leads" (to reduce noise in the individual sea surface estimates) before a single reference sea surface estimate is produced for consecutive 10 km along-track sections that include at least one sea surface segment, for each beam independently. A basic interpolation/end-filling procedure is used to extend the reference sea surface estimates into regions where a sea surface segment does not exist. A filtering is also applied to ensure that the 10 km sections are at least 25 km away from the coast and have a concentration (from passive microwave data) > 50%, that is, within the pack ice and away from regions thought to be more affected by waves. The 10 km reference sea surface heights must also lie within a set height window relative to the MSS (±0.5 m for the Arctic and ±1 m for the Southern Ocean) and differences between consecutive sea surface heights must be relatively small (see the ATL07/10 ATBD r003, Section 5 for more details). After the filtering, candidate lead segments (i.e., segments with ssh_flag = 1) used to generate a reference sea surface that also pass the above filters are set as lead segments (ssh_flag = 2) to make clear they have indeed been used to derive the reference sea surface height and thus freeboard in ATL10. Freeboard segments are then derived by differencing the sea ice segment heights from the local 10 km reference sea surface height. Negative freeboards are set to zero. More detail about the methodology is given in Kwok et al. (2020b, 2020c).
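The final differencing step can be illustrated with a deliberately simplified sketch. It assumes an unweighted mean of the sea surface segment heights as the reference in each 10 km section, which stands in for (and is not) the ATBD procedure, and it omits the lead grouping, interpolation/end-filling, and coast/concentration/MSS filters described above. The function name is hypothetical.

```python
from collections import defaultdict

SECTION_M = 10_000.0  # 10 km along-track sections


def section_freeboards(dist_m, heights, is_sea_surface):
    """Return one freeboard per segment (None where no freeboard is defined).

    dist_m: along-track distance of each segment (m).
    heights: segment heights (m).
    is_sea_surface: True where the segment is a sea surface (lead) segment.
    """
    # Collect sea surface heights per 10 km section.
    lead_heights = defaultdict(list)
    for d, h, s in zip(dist_m, heights, is_sea_surface):
        if s:
            lead_heights[int(d // SECTION_M)].append(h)
    # Simplification: unweighted mean as the reference sea surface.
    ref = {k: sum(v) / len(v) for k, v in lead_heights.items()}

    out = []
    for d, h, s in zip(dist_m, heights, is_sea_surface):
        k = int(d // SECTION_M)
        if s or k not in ref:
            out.append(None)  # lead segment, or section without a reference
        else:
            out.append(max(h - ref[k], 0.0))  # negative freeboards set to zero
    return out
```

Sections without any sea surface segment return `None` here, whereas the product fills them by interpolation.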

Sentinel-2 Imagery
Sentinel-2 is a constellation of two twin satellites, Sentinel-2A (S-2A) and Sentinel-2B (S-2B), operated by the European Space Agency (ESA) and launched in June 2015 and March 2017, respectively. The satellites host the MultiSpectral Instrument (MSI), which provides 13 reflective-wavelength bands in the wavelength region between 443 nm and 2,202 nm (visible, near-infrared, short-wave infrared). Depending on the band, spatial resolution varies between 10, 20, and 60 m, with the images covering an area of ∼110 × 110 km. For a given area, the shift by 180° between the two sun-synchronous polar orbits and a 290 km wide swath guarantee a revisit time of 5 days at the equator, which improves to one image per day at higher latitudes due to overlapping swaths with different viewing angles. Systematic global coverage of land surfaces and coastal waters (up to 20 km from the coast) is available between 84°N and 56°S. Additional imagery is collected near and over coastal regions of Antarctica, Baffin Bay and the North Atlantic (https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi/revisit-coverage).
We searched for coincident S-2 and ICESat-2 data over sea ice covered portions of the Arctic and Southern oceans. Our search algorithm matched the footprint of all S-2 images with a nominal cloud coverage of less than 10% acquired during fall, winter, and spring months (fall 2018 to spring 2019) within the search regions (Arctic: September-May/Antarctic: March-November) with the ICESat-2 reference ground track (RGT) times and locations (data from https://icesat-2.gsfc.nasa.gov/science/specs). No images for winter months (e.g., December/January in the Northern Hemisphere) are available due to the lack of sun illumination. We searched for overlapping data pairs with an acquisition time difference of less than 2 h to mitigate for ice drift, resulting in 14 candidate S-2 images. This imagery subset was then checked for image quality, presence/absence of sea ice, actual cloud cover, and for the extent of the overlap with ICESat-2 data. Two optimal coincident images were found, (1) over the Lincoln Sea of the Arctic Ocean in May 2019 (i.e., the end of winter, time difference of 94 min) and (2) over the Western Weddell Sea of the Southern Ocean in March 2019 (i.e., the start of the austral winter, time difference of only 7 min), as shown in Figure 1. This Arctic Ocean scene occurs late in the winter/spring season, after typical dates of melt onset in the Arctic. However, the location is in the coldest/thickest ice regime of the Arctic and no surface melt is visible in the imagery. Visual inspection of these scenes showed clear coincidence between the imagery and the ICESat-2 data with no obvious issues of drift misalignment (no correction was applied). Half of the Southern Ocean image is contaminated by clouds, as is clearly visible in Figure 1. We thus split these into ∼50 km along-track "coincident profiles" for visualization purposes (two Arctic profiles and one Southern Ocean profile). 
Fifty kilometers was also a good balance between data coverage and ease of interpretation/visualization.
Note that manual drift alignment/correction was attempted for some of the other candidate images but the lack of notable features (e.g., leads and ridges) hindered our efforts. Manual drift correction was successfully applied in generating coincident summer S-2/ICESat-2 scenes, as shown in Tilling et al. (2020).

Classification Evaluation
To compare the IS-2 ATL07/10 data with the S-2 imagery, we use a simple nearest neighbor interpolation scheme (the nearest geocoded pixel of the imagery to the given ATL07/10 beam segment) to produce a coincident profile of S-2 surface reflectance. In particular, we use the red band, which is available at the highest 10 m spatial resolution and convert the Digital Number (DN) provided in the Level-2A data products to surface reflectance by dividing by the quantification value (DN/10,000). We compare these profiles qualitatively as the lack of perfect time coincidence with the imagery, together with the possible impact of ice drift and the contrasting resolutions of the data, make it challenging to carry out a more robust quantitative assessment.
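The sampling and DN conversion described above can be sketched as below. This is a minimal illustration under stated assumptions: the image is a north-up regular grid held as a list of rows, the segment coordinates are already in the image's projected coordinate system, and the function and parameter names are hypothetical. On a regular grid, the pixel whose center is nearest to a point is simply the pixel that contains the point.

```python
QUANTIFICATION_VALUE = 10_000.0  # DN / 10,000 gives surface reflectance


def nearest_pixel_reflectance(x, y, band, x_origin, y_origin, pixel_size=10.0):
    """Surface reflectance of the image pixel nearest to the point (x, y).

    band: list of rows of Digital Numbers (row 0 is the northernmost row).
    x_origin, y_origin: projected coordinates (m) of the upper-left corner
        of the upper-left pixel.
    pixel_size: pixel edge length in m (10 m for the red band).
    Returns None if the point falls outside the image.
    """
    col = int((x - x_origin) // pixel_size)
    row = int((y_origin - y) // pixel_size)
    if not (0 <= row < len(band) and 0 <= col < len(band[0])):
        return None
    return band[row][col] / QUANTIFICATION_VALUE
```

Applying this to every ATL07/10 segment location along a beam yields the coincident reflectance profile that is compared with the segment classifications.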
PETTY ET AL.

Lead Fraction
To estimate lead fraction, we take the segment-length-weighted ratio of sea surface segments to the total length of valid segments in 10 km along-track sections across the combined three strong beams (we combine data from the three beams before sectioning the data). Note that the IS-2 measurements include three beam pairs spread over ∼6.6 km across track, so for this to be treated as a two-dimensional lead fraction estimate one needs to assume that the underlying ice is isotropic and homogeneous over this 10 km × 6.6 km window, something we plan to test more in future work. The along-track lead fraction is calculated as:
$$f_L = \frac{\sum_{i=1}^{N_{sl}} s_{l,i}}{\sum_{i=1}^{N_{all}} s_{l,i}}$$
where $s_l$ is the segment length, $N_{all}$ is the total number of ice and lead segments in the given section (10 km along track in this study), $N_{sl}$ is the total number of lead segments (specular only in r003), and $i$ is the height segment index (the numerator sums over lead segments only).
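As a small worked illustration of this segment-length-weighted ratio (the function name is hypothetical, and the inputs are assumed to be the segments of a single 10 km section after combining the three strong beams):

```python
def lead_fraction(segment_lengths, is_lead):
    """Segment-length-weighted lead fraction for one along-track section.

    segment_lengths: length of every valid (ice or lead) segment in the section (m).
    is_lead: True where the segment is classified as sea surface (lead).
    Returns None if the section contains no valid segments.
    """
    total = sum(segment_lengths)
    if total == 0:
        return None
    lead = sum(s for s, flag in zip(segment_lengths, is_lead) if flag)
    return lead / total
```

For four segments of 10, 20, 30, and 40 m where only the first is a lead, the lead fraction is 10/100 = 0.1 (10%), whereas an unweighted count would give 25%.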
The ATL07/10 sea surface classification approach is designed to be strict, i.e., possible leads are potentially thrown away by the height percentile filter (step 4 in Section 2.1) to reduce the likelihood of ice segments being erroneously used to derive a reference sea surface. This approach makes sense when one considers the high sensitivity of the freeboard estimate to errors in the reference sea surface height calculation. However, for ice floe/lead geometry analyses, such a strict filter could result in an underestimate of lead fraction (and an overestimate of chord length, discussed in the following section). We therefore adopt two approaches for deriving lead fraction: (1) $f_L^{v1}$, where we use the sea surface flag (ssh_flag) provided in the product, and (2) $f_L^{v2}$, where we use the ATL07 specular lead classification (2 ≤ height_segment_type ≤ 5) together with a less strict 10 km local height percentile filter (20% instead of 2%, see step 4 in Section 2.1) to determine the sea surface segments. In summary, $f_L^{v1}$ is derived using the ssh_flag from the product, whereas $f_L^{v2}$ is derived using a higher 20% local height filter. This is a relatively crude way of exploring the sensitivity of these lead estimates to the underlying classification algorithm, which we will aim to expand on in future work.

Chord Length
To calculate chord length, we devised a simple algorithm that splits the along-track segment data into floe (or chord) groups based on the sea surface classification flag. The sea surface segments are discarded. Similar to lead fraction, we use two approaches here: $l_C^{v1}$ utilizes the sea surface flag (ssh_flag) and $l_C^{v2}$ uses the sea surface classification derived in this study using the 20% local height filter. In both cases a "floe" group needs to include at least three height segments and have a maximum spacing between consecutive segments less than 300 m. Three hundred meters is the upper end of the tail of the distribution of segment lengths for the strong beams (Kwok et al., 2020c), meaning a gap between segments greater than this value is very likely to be driven by the presence of anomalously high segment lengths (which are generally more uncertain) or a data gap. Data gaps between segments are caused primarily by atmospheric scattering (e.g., by clouds), which could result in erroneously high chord lengths, especially as clouds can form preferentially over leads (although this is still being debated, e.g., Li et al., 2020). We also discard any floe groups that are smaller than 60 m, which we take to be the minimum resolved chord length due to considerations of footprint size and segment length, and groups that are longer than 50 km. We add 11 m, the approximate photon footprint resolution, to all groups to finally derive estimates of chord length. We run this analysis for each of the three strong beams independently.
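The grouping rules above can be sketched as follows. Several details are assumptions made only for this illustration and are not stated in the text: the extent of a group is taken as the along-track distance between its first and last segment centers, a group containing any gap of 300 m or more is discarded outright (rather than split), and groups at the start or end of the track are kept even though they are not bounded by a lead on both sides. All names are hypothetical.

```python
MIN_SEGMENTS = 3        # minimum number of height segments in a floe group
MAX_GAP_M = 300.0       # spacing between consecutive segments must be below this
MIN_CHORD_M = 60.0      # minimum resolved group extent
MAX_CHORD_M = 50_000.0  # maximum group extent
FOOTPRINT_M = 11.0      # approximate footprint added to each group


def split_into_groups(dist_m, is_sea_surface):
    """Split along-track segment distances into ice groups separated by sea surface."""
    groups, current = [], []
    for d, sea in zip(dist_m, is_sea_surface):
        if sea:
            if current:
                groups.append(current)
            current = []  # sea surface segments themselves are discarded
        else:
            current.append(d)
    if current:
        groups.append(current)
    return groups


def chord_lengths(dist_m, is_sea_surface):
    """Chord length estimates (m) for one beam; dist_m must be sorted ascending."""
    chords = []
    for group in split_into_groups(dist_m, is_sea_surface):
        if len(group) < MIN_SEGMENTS:
            continue
        largest_gap = max(b - a for a, b in zip(group, group[1:]))
        if largest_gap >= MAX_GAP_M:
            continue  # assumption: drop the whole group if it spans a data gap
        extent = group[-1] - group[0]
        if MIN_CHORD_M <= extent <= MAX_CHORD_M:
            chords.append(extent + FOOTPRINT_M)
    return chords
```

Passing the product ssh_flag as `is_sea_surface` corresponds to the $l_C^{v1}$ approach; passing a classification built with the relaxed 20% height filter corresponds to $l_C^{v2}$.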
Note that we do not provide estimates of lead length in this study as estimates were provided in Kwok et al. (2019b) and this is thought to depend more on the resolution of the ATL07 data, so investigations focused on the underlying ATL07 algorithm may be needed to produce more reliable lead length estimates in future work.

10.1029/2020EA001491
For both the lead fraction and chord length estimates, we bin the along-track data to a 25 km × 25 km polar stereographic grid (EPSG:3411 for the Arctic and EPSG:3412 for the Southern Ocean) using data collected from the first Arctic (November 1, 2018 to April 30, 2019) and austral (May 1, 2019 to September 30, 2019) winters.

Comparisons With Sentinel-2 Imagery
The S-2/ATL07 comparison for the ∼50 km profile (P1) within the western Weddell Sea on March 17, 2019 for strong beam 1 (gt1l in this backward spacecraft orientation) is shown in Figure 2. This cropped S-2 image (top panel) depicts a scene of mixed ice surfaces: large consolidated floes, small broken up floes, and the occasional lead opening. It also features very flat/thin ice for the first ∼20 km and thicker/rougher ice (relative heights ∼2-2.5 m higher) for the remaining 30 km along track. The final ∼5 km of the scene features clouds, as can be seen more clearly in the zoomed-out Figure 1. The ATL07 radiometric surface classification scheme detected 218 specular lead segments and 19 dark lead segments for this scene, with 196 candidate (and indeed utilized) sea surface segments. The ATL07/10 lead classification shows strong agreement with the S-2 imagery. As most of the lead segments are specular, the photon rate shows corresponding spikes in these same locations, along with drops in the background rate (as expected from background photons scattering away from the detector over specular surfaces). The scene highlights the ability of ICESat-2 to detect both narrow (10s of meters) and wide (100s of meters) openings in the ice cover. Dark lead classified segments are produced at ∼8 km and especially at ∼45-50 km along track, with no leads visible in the imagery in either location (although in the latter it is harder to assess due to the clouds). The presence of clouds in the last ∼5 km corresponds with a clear attenuation in photon rate around this part of the scene, similar to the Operation IceBridge comparisons given in Kwok et al. (2020b). The ∼8 km along-track dark lead is harder to diagnose, as this is associated with a sharp decline in photon rate, as one would expect from a lead, but no obvious lead or cloud is visible in the imagery. These comparisons benefit from the very small time difference (∼7 min) between the ICESat-2 data and S-2 image acquisition.
The derived lead fractions and chord lengths in this scene are also shown in Figure 2 (panel 6). The lead fractions are estimated as 1.38% (v1) and 1.54% (v2), while the mean chord lengths are estimated as 2.96 km (v1) and 2.49 km (v2). The remaining profiles using the strong and weak beams are given in the supplementary information (Figures S1-S5). Summary statistics (lead classification counts, lead fraction and chord lengths) are provided in Table 1. No obvious degradation of the classification performance is visible in the weak beam profiles, although ICESat-2 generally detects fewer leads and lower lead fractions than with the strong beams.
The S-2/ATL07 comparison for the northernmost ∼50 km profile (P2) within the Lincoln Sea, Arctic Ocean on May 25, 2019 for the middle strong beam (gt2l in this backward spacecraft orientation) is shown in Figure 3. The S-2 image depicts a scene of large consolidated floes, small broken up floes/leads, but also two much larger (>1 km) lead openings. No clouds are visible in the imagery (or implied by attenuations in the photon rate). The heights extend from just over 0 to ∼2-2.5 m. The ATL07 radiometric surface classification scheme detected 416 specular leads and two dark leads for this scene, with 220 candidate (and again utilized) sea surface segments. The ATL07/10 lead classification again shows strong agreement with the S-2 imagery. It is encouraging to note that the refrozen lead of low height/freeboard at ∼10 km along track is not classified as a specular or dark lead. Only a slight decline in photon rate is observed here. At ∼15-20 km along track, the S-2 scene shows some small but highly consolidated ice floes, with some leads detected, highlighting the challenge of lead/floe detection in the more consolidated ice regimes. The lead fractions for this scene are estimated as 2.55% (v1) and 6.37% (v2), while the chord lengths are estimated as 1.82 km (v1) and 1.99 km (v2). The difference between the v1 and v2 estimates is higher, and we can see this is mainly driven by the inclusion (or absence) of the segments within the large lead openings. Including the extra sea surface segments by using a higher (20%) height threshold in v2 increases lead fractions, as expected, but also produces the perhaps more counterintuitive result of increasing the mean chord length, as the ice floe groupings remain too small to be classified as a floe and are simply discarded. The profiles of this image for the remaining strong and weak beams are given in the supplementary information (Figures S6-S10) and summarized in Table 1.
We also include in the supplementary information the profiles from P3 (Figures S11-S16). We did not include these in the main manuscript as they generally featured fewer leads compared to the P2 comparisons. These additional beam comparisons show again a remarkably high level of agreement with the S-2 imagery in terms of the lead classification but some potentially missing lead classifications within narrower cracks/leads suggested by the S-2 image. The summary statistics are included in Table 1. Also noteworthy is that while the S-2 image in Figure 1 shows clouds around the northwestern section of P3, no indication of clouds can be seen in the ICESat-2 data (no photon rate attenuation), which is likely due to the higher time difference between the ICESat-2 data and S-2 image acquisition here (∼94 min).
[Figure caption, beginning truncated] ... (ssh_flag, red = ice, yellow = sea surface); (second panel) surface reflectance calculated from the red band of the Sentinel-2 image from nearest neighbor pixels to the ICESat-2 profile; (panels 3-5) segment height, photon rate and background rate respectively from ATL07. Gray shading in panels 2-3 indicates candidate leads (ssh_flag = 1) while in panels 4-5, blue shading indicates specular lead classifications (height_segment_type = 2-5) and red shading indicates dark lead classifications (height_segment_type = 6-9). Panel seven shows the derived chord lengths as horizontal bars (red = v1, magenta = v2) with each chord grouping shifted vertically to indicate the groups, and statistics of the mean chord lengths and lead fractions using these two approaches. Panel 8 shows the freeboard in ATL10, with the shading indicating actual leads used to derive reference sea surface (ssh_flag = 2). N indicates the number of segments in ATL07 and ATL10 as specified.
In Figure 4, we zoom in further on the P2 profile to highlight the ICESat-2 performance across this second large lead opening (at ∼41-43 km along track in Figure 3). The S-2 image, despite being coarse when looking at this scale, suggests the presence of young gray ice within this lead, which is categorized as a dark lead in two of the ATL07 segments (note the drops in photon rate compared to the increase in photon rate in the specular lead classifications). This appears to be an accurate classification (they are of similar height to the surrounding specular leads, so could be used to derive the reference sea surface) and highlights the potential utility of the dark lead classification in ATL07/10. Another thing of note here is that due to the strict local height filter in the ATL07 algorithm (step 4 in Section 2.1) a significant fraction of these segments are not classified as candidate lead segments (ssh_flag = 1). This results in some erroneous ice segments and small "chords" being included within the lead, biasing our statistics. These erroneous chords are removed when we move to the higher height percentile filter (v2), resulting in an increase in chord length (from 0.74 to 0.91 km) and a doubling in the lead fraction (from 9.12% to 19.17%), albeit in this highly localized lead profile.

Basin-Scale Assessments
To better understand the overall performance of the sea ice classification algorithm and to provide context for the basin-scale lead fraction/chord length estimates shown next, we first show basin-scale maps of key lead classification metrics from ATL07 across the Arctic and Southern Ocean for the winter study periods (Arctic: November 1, 2018 to April 30, 2019; Southern Ocean: May 1, 2019 to September 30, 2019). Figure 5 shows the radiometric lead fraction compared to all (segment length weighted) segments, the ratio of radiometric specular lead segments to dark lead segments, the fraction of specular lead segments that become sea surface segments (ssh_flag ≥ 1) and the fraction of sea surface (lead) segments compared to all segments, that is, the lead fraction. The radiometric lead and sea surface lead segments generally follow the spatial pattern expected from our past knowledge of the sea ice state: increases in leads (declines in concentration) toward the ice edge. In both hemispheres dark leads make up a significant (∼50%) fraction of the total population of radiometric leads. In both hemispheres a significant (>70%) fraction of the specular leads pass the height filter and are assigned as sea surface segments. There is a clear increase in the fraction of discarded specular leads along the ice edge in both hemispheres, but also within the Canadian Arctic Archipelago. As ATL10 applies a stricter 50% concentration filter from passive microwave data to avoid wave contamination (ATL07 uses 15%), and a 25 km coastal mask, many of these regions adjacent to the ice edge (in both hemispheres) and near coastlines will not be processed into freeboard. Sea ice concentrations from the monthly (final and near-real time) NSIDC Climate Data Record (CDR, Meier, Fetterer, & Windnagel, 2017) averaged over this same time period across both hemispheres are given in Figure 6 for context.

Table 1
Summary Statistics of the ICESat-2 Data in the Three S-2 Profiles (P1-P3 in Figure 1) Across all Six Beams
Note. Strong beams in this orientation are shaded gray. Columns 2-4: number of specular lead segments (height_segment_type = 2-5), dark lead segments (height_segment_type = 6-9), and sea surface segments (ssh_flag = 1 or 2; all candidate lead segments become lead segments in these profiles). No cloud segment classifications were found in these profiles, so they were not included in the table. Columns 5-8: lead fraction and chord length estimates using our two different (v1 and v2) metrics.

especially around the ice edge. We also note the increased lead fractions in areas of known polynya formation: the North Water Polynya to the northwest of Greenland and, to a lesser degree, Terra Nova Bay Polynya in the northern Ross Sea. Neither is as visible in the passive microwave-derived concentrations (Figure 6).
estimates to this one aspect of the ATL07 lead finding algorithm. Lead fractions within the pack ice of the Central Arctic are very low (<1%), which we discuss more in the discussion/summary section.
chord lengths. The Arctic shows a sharp increasing gradient within the more central Arctic region. Chord lengths are high (consistently greater than 10 km) in the central Arctic, which we discuss more in the following section. Figure 9 shows the probability distribution of the individual chord lengths for the Arctic and Southern Ocean using the $l_C^{v1}$ and $l_C^{v2}$ chord length estimates. The data are plotted on a log-log scale to highlight the power law nature of the underlying distributions. The consistency of the distributions across hemispheres and lead finding algorithm (v1 and v2) is encouraging. The distributions become more variable at the tail of the distribution (>10 km) due to the lower sampling rate of these higher chord lengths. While log-log distributions can help visually highlight power-law-like behavior in empirical data, more robust statistical tests are needed to truly test whether data are well characterized by a power law, especially due to the issues with the less well-observed tails of the distribution (Clauset et al., 2009; Horvat et al., 2019; Stern et al., 2018). We plan to explore this in future work as we compare against the chord length estimates from other observed data, as discussed below.

Discussion and Summary
The coincident Sentinel-2 (S-2)/ICESat-2 scenes provide crucial validation of the ICESat-2 sea ice classification procedure, with the caveat that these represent only a small fraction of the sea ice data produced from ICESat-2 to date. The specular lead classification in the sea ice products shows strong agreement with the imagery across all beams, while the reliability of the smaller number of dark leads found in these profiles was more questionable. The Southern Ocean profile added to the evidence presented in Kwok et al. (2020b) that dark leads are less reliable in the presence of clouds. However, the Arctic Ocean profile also showed two dark leads that appeared to be accurately classified (young, dark gray ice in the imagery). More coincident S-2 scenes across different sea ice regimes (time and space) would provide further insight into the potential utility of the dark leads, especially as they make up a significant fraction (∼50-60%) of the total number of radiometric leads detected by the current ATL07 algorithm. As discussed in Kwok et al. (2020b), excluding dark leads from the freeboard calculation has the downside of reducing coverage, especially in some of the more consolidated sea ice regimes, which motivates continued study to ensure that this exclusion is not overly restrictive.
It is interesting to note that the ATL07 segments over the relatively wide (>1 km) leads in our Arctic Ocean S-2 profile (Figure 3) are generally classified as specular, except for the two dark lead segments over regions of young, dark gray ice. Intuitively, we might expect all of these segments to be classified as dark leads, as the larger opening increases fetch and the potential for surface roughening. However, the ICESat-2 segments are classified mainly as specular until the photon rate drops drastically (to less than that over the snow-covered ice) over the small region of young gray ice. Similarly, the very thin cracks shown in the imagery (on the order of meters to tens of meters, that is, the segment resolution of the ATL07 data) were often classified as ice. It is unclear from the S-2 imagery (10 m resolution) to what extent these leads/cracks have refrozen, and thus how confident we can be in describing these segments as misclassified. Higher resolution imagery (e.g., DigitalGlobe's WorldView imagery) could provide useful further insight here and will be explored in future work.
The local height filter included in the lead (sea surface) finding algorithm further limits the number of leads identified. This appears reasonable when the goal is accurately determining a reference sea level for freeboard but less ideal for deriving sea ice state information, for example, lead fraction and chord length. This study presented a crude approach to relax this height filter (increasing the percentile threshold from 2% to 20%), which is worthy of further investigation as we seek to increase the utility of the ICESat-2 sea ice data.
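The effect of relaxing the height filter can be sketched as follows. This is a simplified illustration and not the ATL07 implementation: it assumes that a candidate (radiometrically classified) lead segment is retained only if its height falls at or below a given percentile of all segment heights in a local window, and it treats the whole synthetic profile as a single window. The function name, the synthetic heights, and the candidate flags are hypothetical.

```python
import numpy as np


def apply_height_filter(heights, is_radiometric_lead, percentile):
    """Keep candidate lead segments at or below a local height percentile.

    heights: segment heights (m) within one local window (assumed).
    is_radiometric_lead: boolean flags from the radiometric classification.
    percentile: height percentile threshold, e.g. 2 or 20.
    """
    heights = np.asarray(heights, dtype=float)
    candidates = np.asarray(is_radiometric_lead, dtype=bool)
    threshold = np.percentile(heights, percentile)
    return candidates & (heights <= threshold)


# Synthetic window of 1,000 segment heights (m); NOT ICESat-2 data.
rng = np.random.default_rng(1)
heights = rng.normal(loc=0.4, scale=0.15, size=1000)

# Hypothetical radiometric lead candidates: every tenth segment.
candidates = np.zeros(heights.size, dtype=bool)
candidates[::10] = True

strict = apply_height_filter(heights, candidates, percentile=2.0)
relaxed = apply_height_filter(heights, candidates, percentile=20.0)

print("candidates:", candidates.sum())
print("kept at 2%:", strict.sum())
print("kept at 20%:", relaxed.sum())
```

Every segment kept at the 2% threshold is also kept at 20%, so relaxing the filter can only add leads. Whether the added segments are true sea surface is a separate question that a sketch like this cannot answer.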
Validation, or at least comparison, of our basin-scale ICESat-2 estimates of lead fraction and chord length against existing observational estimates is still needed. Basin-scale lead fraction estimates have been produced from various satellite sensors, for example, NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) (Hoffman et al., 2019; Willmes & Heinemann, 2016) and Advanced Microwave Scanning Radiometer (AMSR-E) (Röhrs & Kaleschke, 2012), and also ESA's CryoSat-2 and Envisat radar altimeters (Röhrs & Kaleschke, 2012; Tilling et al., 2019). Chord length distributions have also been produced from a compilation of more sporadic satellite/airborne imagery estimates (Stern et al., 2018). However, comparisons with these data are hindered by the considerable differences in spatial/temporal sampling, and also by the different interpretations and definitions of leads across sensors and algorithms. As discussed earlier, the lead classification algorithm in ATL07 appears to require clear sections of open water to trigger a specular lead classification. Therefore, newly refrozen leads, which may be defined as leads in other (generally coarser resolution) products, may simply be classified as ice. The radar altimeter-derived estimates of lead fraction and chord length from CryoSat-2 (Horvat et al., 2019; Tilling et al., 2019) provide arguably the most similar data set for comparison/assessment. The CryoSat-2 lead fractions and chord lengths shown in Tilling et al. (2019) appear consistently higher and lower, respectively, than the ICESat-2 results presented here. For example, Tilling et al. (2019, Figure 1) show the percentage of CryoSat-2 waveforms classified as lead (a quantity related to, but not the same as, lead fraction) as ∼10% in the Central Arctic but up to ∼60% in the more peripheral seas, while the chord lengths (their Figure 2) are ∼3 km in the Central Arctic and ∼1 km in the peripheral seas. In contrast, the chord lengths presented in this study are consistently >10 km within the Central Arctic. There is better agreement, however, on the general spatial pattern of increasing lead fractions/decreasing chord lengths in the peripheral seas. Radar altimeters such as CryoSat-2 are highly sensitive to the presence of specular leads within the radar swath and have the added benefit of being largely unaffected by clouds. However, the CryoSat-2 footprint (a pulse-limited footprint of ∼400 m along track × 1.65 km across track, although off-nadir leads can still dominate returns from within the larger beam-limited across-track footprint of 15 km) is much larger than that of ICESat-2 (mean of ∼26 × 11 m for a given ATL07 strong beam segment). Future work will aim to better assess, and hopefully reconcile, the lead fraction and chord length estimates from these two missions, taking these significant sampling differences into account.
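To make the two quantities being compared across missions concrete, the sketch below computes a lead fraction and a set of chord lengths from a sequence of along-track segments. It is an illustrative simplification, not the v1 or v2 metric used in this study: it assumes the lead fraction is the summed length of lead segments divided by the total profile length, and that a chord is the summed length of consecutive ice segments bounded by a lead on both sides. The function names and the example profile are hypothetical.

```python
import numpy as np


def lead_fraction(seg_len, is_lead):
    """Fraction of along-track length classified as lead."""
    seg_len = np.asarray(seg_len, dtype=float)
    is_lead = np.asarray(is_lead, dtype=bool)
    return seg_len[is_lead].sum() / seg_len.sum()


def chord_lengths(seg_len, is_lead):
    """Summed length of each run of ice segments bounded by leads.

    Runs of ice at the start or end of the profile are discarded
    because only one of their edges is observed.
    """
    chords = []
    current = 0.0
    seen_lead = False
    for length, lead in zip(seg_len, is_lead):
        if lead:
            # A lead closes the current ice run, if there is one.
            if seen_lead and current > 0.0:
                chords.append(current)
            seen_lead = True
            current = 0.0
        elif seen_lead:
            current += length
    return np.array(chords)


# Hypothetical profile of eight 20 m segments: ice, lead, ice x3, lead x2, ice.
seg_len = [20.0] * 8
is_lead = [False, True, False, False, False, True, True, False]

lf = lead_fraction(seg_len, is_lead)
chords = chord_lengths(seg_len, is_lead)
print(lf)      # 0.375
print(chords)  # [60.]
```

Under these definitions, a sensor that merges or misses narrow leads would return fewer, longer chords from the same ice cover, so the sampling differences between missions affect both metrics.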
We also hope to explore further statistical testing of the power law hypothesis for the floe length distribution, which was recently challenged in Horvat et al. (2019) using CryoSat-2 chord length estimates. This reconciliation is further motivated by the need to provide reliable observational constraints on the floe size distributions being incorporated into sea ice components of Global Climate Models (Horvat et al., 2019). Airborne data from NASA's Operation IceBridge campaigns, which combine high-resolution imagery and laser altimetry, should prove invaluable.
Our hope is that ICESat-2 sea ice data can be used to provide routine and reliable basin-scale estimates of lead fraction, ice concentration, and chord length, in addition to its primary mission requirement of delivering accurate estimates of sea ice height and freeboard. Additional ICESat-2 sea ice algorithm testing and development is needed to further improve the classification accuracy, which can be guided by additional comparisons with imagery (satellite and airborne) and other airborne and satellite data sets. The ICESat-2 lead classification algorithm uses fixed empirical thresholds (discussed in Section 2.1), which can easily be tuned/calibrated as needed, as can other elements of the algorithm, including the height percentile threshold (explored only briefly in this study) and the photon aggregations. These thresholds also need to be better explored in terms of appropriate strong/weak beam differences. Efforts are also ongoing to provide further insight into the dark leads and their potential reintroduction by using a new cloud filter in the sea ice algorithm (Kwok et al., 2020b). However, this assumes that the dark lead classifications unaffected by clouds are more reliable, which is still unclear. The results and discussion presented here also raise the issue of how best to classify newly refrozen leads or gray ice in the sea ice products, or even where exactly to draw the line between sea ice and open water, another key consideration as we seek to increase the utility of the ICESat-2 sea ice products.

PETTY ET AL. 10.1029/2020EA001491

Acknowledgments
The authors would like to thank the entire ICESat-2 project for their continued efforts in delivering and maintaining the high-quality sea ice data analyzed in this study. AP would also like to thank Susan and Brian for the child-care support they provided in this challenging time, enabling him to work on and complete this manuscript.