Intercomparison of passive microwave sea ice concentration retrievals over the high-concentration Arctic sea ice



[1] Measurements of sea ice concentration from the Special Sensor Microwave Imager (SSM/I) using seven different algorithms are compared to ship observations, sea ice divergence estimates from the Radarsat Geophysical Processor System, and ice and water surface type classification of 59 wide-swath synthetic aperture radar (SAR) scenes. The analysis is confined to the high-concentration Arctic sea ice, where the ice cover is near 100%. During winter the results indicate that the variability of the SSM/I concentration estimates is larger than the true variability of ice concentration. Results from a trusted subset of the SAR scenes across the central Arctic allow the separation of the ice concentration uncertainty due to emissivity variations and sensor noise from other error sources during the winter of 2003–2004. Depending on the algorithm, error standard deviations from 2.5 to 5.0% are found with sensor noise between 1.3 and 1.8%. This is in accord with variability estimated from analysis of SSM/I time series. Algorithms that primarily use 85 GHz information consistently give the best agreement with both SAR ice concentrations and ship observations. Although the 85 GHz information is more sensitive to atmospheric influences, it was found that the atmospheric contribution is secondary to the influence of the surface emissivity variability. Analysis of the entire SSM/I time series shows that the trends in sea ice extent and area differ significantly among the algorithms. This indicates that long-term trends in surface and atmospheric properties, unrelated to sea ice concentration, influence the computed trends.

1. Introduction

[2] Sea ice acts as a very efficient inhibitor of fluxes between ocean and atmosphere, generally reducing them by a factor of 100 to 1000 compared to open water [Moritz, 1988]. At the same time the albedo of sea ice is an order of magnitude larger than the albedo of open water [e.g., Brandt et al., 2005]. It is evident that accurate knowledge of even small open water areas within the consolidated sea ice is required to describe fluxes and radiative processes at the Earth's poles. Applications encompass climate-oriented Coupled General Circulation Model (CGCM) studies as well as operational applications such as numerical weather prediction. Satellite passive microwave remote sensing allows the analysis of daily global sea ice concentration fields. Starting in the early 1970s with the Electrically Scanning Microwave Radiometer (ESMR), successive series of satellites have carried microwave radiometers. These provide comparable measurements at 6–37 GHz. The Special Sensor Microwave Imager (SSM/I) on board the Defense Meteorological Satellite Program (DMSP) series has been operational since 1987, operating at 19–85 GHz. The Advanced Microwave Scanning Radiometer (AMSR) instrument continues this heritage of measurements in the <100 GHz range, and the Special Sensor Microwave Imager Sounder (SSMIS) combines the <100 GHz channels with measurements for atmospheric profiling at 150 and 183 GHz.

[3] The sea ice concentrations derived from passive microwave imagery are affected by errors due to atmospheric absorption and emission, wind roughening over open water [Oelke, 1997; Andersen et al., 2006], as well as anomalous ice and snow emissivity [Wensnahan et al., 1993; Cavalieri, 1994; R. Tonboe et al., On the surface melt induced sea ice emissivity changes around Greenland and the impact on the computed ice concentration using radiometer algorithms, submitted to Remote Sensing Environment, 2007, hereinafter referred to as Tonboe et al., submitted manuscript, 2007]. The errors in the open water limit are relatively easy to assess owing to the known true ice concentration of 0%. Over sea ice, this is much more difficult as a reference ice concentration must be determined from high-resolution imagery or field observations. Comiso et al. [1997] report significant differences between two widely used algorithms, the NASA/TEAM and the Bootstrap. It is found that, on average, the Bootstrap algorithm gives ice concentration 10% higher than NASA/TEAM. The differences, when NASA/TEAM underestimates the ice concentration, are attributed to layering in the snow and young/thin ice types, which particularly affect the normalized polarization difference used by the NASA/TEAM algorithm. Conversely, cases where the Bootstrap algorithm ice concentration estimate is poor are suggested to be primarily due to fluctuations in the snow and ice surface temperature. Markus and Cavalieri [2000] developed a revised version of NASA/TEAM compensating for the effects of surface glaze and layering. They report that it seems to resolve the deficiencies in both NASA/TEAM and Bootstrap mentioned by Comiso et al. [1997]. Belchansky and Douglas [2002] compare Bootstrap and NASA/TEAM to Okean SLAR and Radarsat SAR data in the Arctic and find evidence of large interannual variations. They also find that the Bootstrap algorithm tends to be the more seasonally stable. 
Meier [2005] extends his comparison to AVHRR data over the Arctic marginal seas to include the CAL/VAL and NASA/TEAM2 algorithms and makes the same finding. From the total data set, he concludes that there are significant differences in bias but not in error standard deviation between the algorithms. There are, however, large day-to-day and scene-to-scene variations. He argues that the prime cause of error standard deviation may be the inability to correctly resolve the mixture of different surface types within the coarse SSM/I footprint. Hanna and Bamber [2001] find that, compared to AVHRR data in the Antarctic, an implementation of the Bristol algorithm with variable tie points is better than the NASA/TEAM algorithm but on par with the Bootstrap algorithm. Finally, Kwok [2002] compared openings in the Arctic sea ice, derived from consecutive SAR images, to NASA/TEAM and Bootstrap sea ice concentrations and found significant disagreement in both overall statistics and spatial structure of openings. Similarly, Rayner et al. [2003] found that considerable corrections were necessary to reconcile SSM/I based ice concentration time series with each other and with information from navigational ice charts. Agnew and Howell [2003] extensively compared navigational ice charts with results from NASA/TEAM and found large discrepancies in ice area of up to 44% in summer, whereas during ice consolidation in winter the differences were generally less than 10%, increasing toward spring.

[4] In the present study, we take into account all the above algorithms and construct a reference database, with a balanced seasonal and geographical coverage of the Arctic. Moreover, we will concentrate on the high-concentration regime because errors over low ice concentration surfaces are primarily caused by atmospheric effects, which can be adequately described by radiative transfer models [Oelke, 1997; Kern et al., 2003; Andersen et al., 2006], as well as smearing effects due to the large footprint [Meier, 2005]. The significance of the atmospheric effects on the estimate of total ice concentration decreases linearly with increasing ice concentration for most algorithms, to become negligible at 100% ice coverage [Oelke, 1997; Andersen et al., 2006]. At high ice concentrations, variations in ice and snow emissivity become significant and are currently not fully understood. Intercomparison of algorithms is one way to acquire knowledge of the natural variability.

[5] In the following two sections, we will introduce the reference data and the SSM/I ice concentration algorithms including pertinent issues. In section 4 we present the results from the comparison studies and conclude the paper with discussion and conclusions in section 5.

2. Data

2.1. Reference Ice Concentration Data

[6] In this study we apply ship observations, the Radarsat Geophysical Processor System (RGPS) and classified SAR imagery to provide independent references to intercompare passive microwave sea ice concentration retrievals. The different characteristics of each observation type are presented below.

[7] Visual observations of sea ice from ships can be very accurate. Difficulties mainly arise from the grazing viewing geometry that limits the perimeter around the ship for which the observation is valid. The inexperienced observer will often tend to overestimate ice concentrations because open water tends to be hidden behind the sea ice topography. In contrast to virtually all other observation methods, problems of ice type confusion are typically not an issue. In this work, we use ice diaries taken from two cruises to the Arctic by R/V Polarstern. One is the ARK-17/2 summer cruise that took place in August and September 2001 [Haas and Lieser, 2003]; the other is the Cryovex winter season cruise, which took place in March and April 2003 [Lieser, 2005]. Both cruises entered through the Fram Strait, and the summer cruise penetrated to the North Pole, whereas the winter cruise was limited to an area north of Svalbard; see Figure 1.

Figure 1.

Distribution of the ice observations (red dots) from onboard R/V Polarstern during (left) the cruise ARK-17/2 in August/September 2001 and (right) the cruise ARK-19/1 in March/April 2003. Also shown is the ice concentration calculated using the N90 algorithm for 6 September 2001 and 2 April 2003, respectively.

[8] The RGPS is described by Kwok and Cunningham [2002]. It exploits the fact that any given area within the reception perimeter of the Alaska SAR Facility is generally covered once every 3 days and from Tromsø Satellite station every 6 days. This includes the major part of the Arctic interior and enables the extraction of sea ice motion fields by tracking features in sequential overlapping SAR scenes. These motion vectors are subsequently used in a Lagrangian scheme that computes the cell area change, divergence and numerous derived quantities such as ice age spectra. Owing to the wetness of the sea surface and the following disappearance of traceable features, the RGPS is restricted to operate during the winter season from November through April. In direct comparison with passive microwave data, one potential shortcoming is the uncertainty of when inside the 3-day time window, a given divergence episode occurred and how thick the ice has potentially grown in the exposed open water area. According to the study by Alam and Curry [1998] ice thicknesses up to about 30 cm can form within 3 days by congelation (thermodynamic growth). With increased wind, the growth takes place in the frazil regime and the rate is initially much faster. Yet, it returns to the congelation regime when the ice has piled up to fill the opening. After that, growth continues by congelation and the thickness after 3 days is not much different. It is important to note that, in radiative terms, this ice will belong to the new or young ice category, mainly owing to the absence of snow [Cavalieri, 1994].
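The area-change step of such a Lagrangian scheme can be illustrated with a minimal sketch. This is not the RGPS implementation; it only assumes that the vertices of one cell are tracked between two observations, computes the cell area with the shoelace formula, and approximates the mean divergence rate by the fractional area change per unit time. The function names and the example cell are hypothetical.

```python
import numpy as np

def polygon_area(x, y):
    """Area of a simple polygon from its vertex coordinates (shoelace formula)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def cell_divergence(x0, y0, x1, y1, dt_days):
    """Mean divergence rate (per day) of one Lagrangian cell.

    Finite-difference approximation: fractional area change divided by the
    time between the two observations. The same vertices are assumed to be
    tracked from (x0, y0) to (x1, y1).
    """
    a0 = polygon_area(x0, y0)
    a1 = polygon_area(x1, y1)
    return (a1 - a0) / (a0 * dt_days)

# Hypothetical 10 km x 10 km cell that stretches by 300 m in x over 3 days.
x0, y0 = [0.0, 10.0, 10.0, 0.0], [0.0, 0.0, 10.0, 10.0]
x1, y1 = [0.0, 10.3, 10.3, 0.0], [0.0, 0.0, 10.0, 10.0]
div = cell_divergence(x0, y0, x1, y1, dt_days=3.0)

# Fractional area gained over the window; this opening may be open water or
# new ice, depending on when within the window the divergence occurred.
opening_fraction = max(0.0, div * 3.0)
```

The sketch makes the timing ambiguity explicit: the area change fixes how much opening occurred between the two observations but not when, which is why the state of refreezing in the opening remains uncertain.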

[9] Another independent type of sea ice reference data is obtained from C band SAR data classified using a neural network method similar to that of Kern et al. [2003], extended with optimum texture feature sets determined as described by Bøvith and Andersen [2005]. Experienced ice analysts of the Greenland Ice Service at the Danish Meteorological Institute carried out selection of training data and assessment of the resulting classification. Nonunique signatures of different surface types in SAR imagery typically preclude the correct classification throughout an entire scene. Most frequently, wind-roughened water is confused with ice, and thin, low backscattering ice types are confused with open water [e.g., Shokr, 1991; Gill, 2001]. We refer to this complex of nonresolvable classification difficulties as ambiguities, and to scenes, or portions thereof, affected by them as ambiguous. We accept these classification shortcomings, and rely on the expertise of the ice analysts to mask out misclassified areas, putting emphasis on the correct classification of the sea ice covered area. The result of combining the manual and automated inputs is regarded as a highly detailed ice chart. The ambiguities between low backscattering ice and smooth open water are a source of uncertainty. It is attempted to consistently classify such low backscatter surfaces as open water, so that a systematic bias toward overrepresentation of open water can be assumed. The resulting classified SAR image is collocated with SSM/I data, and ice concentration in a given SSM/I pixel overlap region can be derived as the simple ratio of SAR pixels classified as ice to the total number of SAR pixels. We apply the method at both 25 and 12.5 km resolution, consistent with the SSM/I low- and high-resolution channels. Radarsat ScanSAR wide scenes are averaged by a factor of 2 to 100 m pixel spacing, whereas Envisat ASAR wide-swath mode scenes are retained at their nominal resolution of 75 m. 
This yields more than 60,000 SAR pixels within a 25 km resolution SSM/I pixel and assures ample precision in the resulting concentration estimate. SAR scenes were ordered in 2003 and 2004 from locations well inside the ice edge and distributed geographically. A total of 68 scenes were analyzed and 59 scenes were found to be useful after the classification and masking. Of the 9 rejected scenes, the reason for rejection was either that the proportion of the scene covered with sea ice was small or that there were significant features that could not be unambiguously identified. The seasonal coverage of the data set can be assessed from Figure 2. In general, all seasons are reasonably well represented, when variations in ice extent are taken into account. The ice concentrations found in the wintertime SAR data are very high (see Figure 3) with ice concentrations above 90% being more abundant by at least 2 orders of magnitude. The few low-concentration sea ice observations are found only during summer. The geographical coverage is illustrated in Figure 4, with illustration of the various subsets defined in the analysis to follow. In general there is a reasonable geographical balance, perhaps with a slight overrepresentation in the Laptev Sea during winter. It should be noted that we have applied a very conservative definition of winter and summer. Winter is the period from 31 October till 31 March, while summer is defined as the period including June through September. This explains why the summer subset could seem to have a relatively thinner coverage. To facilitate more precise investigations, a subset of particularly unambiguous scenes was defined, shown in Figure 4d. This data set essentially consists of circumpolar Radarsat ScanSAR wide-swath stripes obtained in the months of November, January, March, and April. To ensure the highest consistency, the same ice analyst processed this data set in one batch. 
Subsequently, it was examined by independent experts, and four scenes were rejected owing to possible thin ice or open water ambiguities.
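The ratio of ice pixels to total pixels described above can be sketched as a block average over a classified SAR mask. This is only an illustration under simplifying assumptions: the SAR pixels are taken to lie on a regular grid aligned with the coarse cells (the actual collocation with SSM/I footprints is not reproduced here), analyst-masked pixels are excluded from both numerator and denominator, and the function name and synthetic scene are hypothetical.

```python
import numpy as np

def sar_ice_concentration(ice_mask, valid_mask, block):
    """Fraction of valid SAR pixels classified as ice in each block x block cell.

    ice_mask   : 2-D bool array, True where the classifier found ice
    valid_mask : 2-D bool array, False where analysts masked out ambiguous areas
    block      : number of SAR pixels per side of one coarse cell
    Returns NaN for cells that contain no valid pixels.
    """
    ny, nx = ice_mask.shape
    ny, nx = ny - ny % block, nx - nx % block  # drop incomplete edge cells
    shape = (ny // block, block, nx // block, block)
    ice = (ice_mask & valid_mask)[:ny, :nx].reshape(shape).sum(axis=(1, 3))
    valid = valid_mask[:ny, :nx].reshape(shape).sum(axis=(1, 3))
    return np.where(valid > 0, ice / np.maximum(valid, 1), np.nan)

# 100 m SAR pixels in a 25 km cell: 250 x 250 = 62,500 pixels per cell.
block = 25_000 // 100
rng = np.random.default_rng(0)
ice = rng.random((2 * block, 2 * block)) < 0.98   # synthetic scene, about 98% ice
valid = np.ones_like(ice, dtype=bool)
conc = sar_ice_concentration(ice, valid, block)
```

With 62,500 pixels per 25 km cell, the sampling contribution to the uncertainty of the ratio is small compared with classification ambiguities, which is consistent with the statement that the pixel count assures ample precision.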

Figure 2.

Seasonal distribution of synthetic aperture radar (SAR) data used in this study.

Figure 3.

Distribution of ice concentrations in the SAR data.

Figure 4.

Location of SAR scenes in (a) the total data set, (b) the winter subset, (c) the summer subset, and (d) the winter high-quality subset of particularly consistent scenes.

2.2. Accuracy Considerations

[10] It is difficult, if not impossible, to provide accurate error estimates of the different reference data sets, described above. It would be interesting to compare the SAR derived data to ship observations. However, this turns out to be impractical as each SAR scene would result in only one match against a ship observation and present-day SAR coverage will give less than one match per day. Generally, ice charts, as produced by ice operators, are widely held as the best available data set of the ice edge, and they are regularly validated by mariners, who are critically dependent on a correct classification. It should be noted that the ice charting community have the most stringent requirements to detail and accuracy and have commonly left airborne reconnaissance in favor of satellite SAR observations and that passive microwave imagery is used only for strategic navigation requirements [Bertoia et al., 2004]. Still, some indication of the uncertainty in the SAR derived ice concentrations was obtained through classifying the same scene by two independent analysts [Bøvith and Andersen, 2005]. It was found that the classifications into sea ice and water agreed within 2.1% between the two classifications. It is unclear whether this is a representative estimate of the error in the SAR classification method. In some scenes over high-concentration sea ice, the ice concentration variability is clearly less than the above 2.1% and the accuracy of the SAR classifications is therefore bound to be better. Interestingly, this means that we can put an upper limit to the errors in the SAR concentrations, by defining subsets with more or less stringent quality control.

[11] In ship observations and RGPS data it is even more difficult to quantify the uncertainty. However, the reference data sets are independent. In particular, (1) the RGPS data set provides a kinematic estimate of the ice concentration variability and is mainly affected by uncertainty in the refreezing of the sea ice during the 3-day window between observations, (2) the ship data are based on visual inspection mainly affected by representation errors between the large SSM/I footprints and the observed perimeter around the ship, and (3) the classified SAR ice concentrations are mainly based on microwave surface scattering and misclassification of thin ice as open water is possible.

[12] In addition, the difference in variability of SSM/I ice concentration algorithms, based on different channel combinations, can be considered to mainly reflect a random component that is not related to real ice concentration variation and gives an indication of the range of uncertainty in the estimated ice concentrations. Thus the conclusions that we make in the following are based on a chain of logical reasoning, with the SAR ice concentrations as the most significant evidence while the remaining observations (ship, RGPS, and interalgorithm differences), by reasonable agreement, act as support.

2.3. SSM/I Data

[13] The SSM/I instrument is described by Hollinger et al. [1987]. Two kinds of SSM/I data are used in this study, namely daily gridded DMSP F-13 SSM/I data [Maslanik and Stroeve, 1990] and level 2 swath data from the DMSP F-13, F-14, and F-15 satellites obtained from Fleet Numerical Modeling and Oceanography Center (FNMOC). The swath data from FNMOC have been known to include sporadic large errors in geolocation. However, Ritchie et al. [1998] show that except for these cases, which affect 0.5% of the data, the geolocations compare well with independently processed data from Remote Sensing Systems. Because the geolocation perturbations are rare and large (4–8 scans), we use a filter on the distance between scans to flag and remove them.
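A filter of this kind might look as follows. The sketch is not the filter used in this study: it assumes one reference position per scan, a nominal along-track scan spacing, and that a displaced block shows up as a pair of abnormal spacings (one on entry, one on exit). The names `flag_displaced_scans`, `nominal_km`, `tol_km`, and `max_len`, and all numeric values, are illustrative assumptions.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between points given in degrees."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dlat = p2 - p1
    dlon = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dlat / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def flag_displaced_scans(lat, lon, nominal_km=25.0, tol_km=10.0, max_len=8):
    """Flag blocks of scans bracketed by two abnormal scan-to-scan distances."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    step = great_circle_km(lat[:-1], lon[:-1], lat[1:], lon[1:])
    jumps = np.flatnonzero(np.abs(step - nominal_km) > tol_km)
    bad = np.zeros(lat.size, dtype=bool)
    # Pair jumps as (entry, exit); scans between a pair are treated as displaced.
    # An unpaired jump is left unflagged in this simple version.
    for entry, exit_ in zip(jumps[0::2], jumps[1::2]):
        if exit_ - entry <= max_len:
            bad[entry + 1 : exit_ + 1] = True
    return bad

# Synthetic track along a meridian with scans 10..14 shifted by 1 degree of latitude.
km_per_deg = np.pi * EARTH_RADIUS_KM / 180.0
lat = 70.0 + np.arange(30) * 25.0 / km_per_deg
lon = np.zeros(30)
lat[10:15] += 1.0
bad = flag_displaced_scans(lat, lon)
```

Pairing the jumps matters because the spacing inside a displaced block is normal; only the transitions into and out of the block are visible in the scan-to-scan distances.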

[14] The swath data are used when the temporal gap between reference and SSM/I data is considered important. The intercalibration study by Colton and Poe [1999] showed the F-8 to F-14 SSM/I instruments to agree to within the uncertainty of the applied methodology. Stroeve et al. [1997] performed an analysis of retrieved ice concentration through the 216 days overlap (May–September 1995) of the F-11 and F-13 SSM/I instruments. In the Central Arctic, monthly biases between 0.03% and 0.22%, monthly standard deviations of the difference between 0.76% and 0.98% and constant correlation coefficients of 0.999 were found. Larger errors were found near coasts, over the open ocean and in the marginal ice zone. In general the statistics deteriorated in late summer. H. Schyberg (unpublished data, 2003) in a more recent study compared the FNMOC data from DMSP F-13, F-14 and F-15 as used here and found no significant systematic differences in derived ice concentrations. Therefore we consider the smaller differences in the work of Stroeve et al. [1997] to be representative, or even conservative, estimates of the error of mixing the SSM/I data from different satellites.

3. SSM/I Ice Concentration Algorithms

[15] A set of seven passive microwave sea ice algorithms is considered in the following. We will only provide a brief description and refer to the original reference for the full details. All algorithms, their acronyms and the channel combinations used are given in Table 1. Channels are referred to in terms of frequency in GHz and polarization, which is either vertical or horizontal; the channel measuring the brightness temperature at 19 GHz vertical polarization is thus 19V.

Table 1. Overview of the Ice Concentration Algorithms (a)

| Acronym | Algorithm | Channels Used | Tie Points Reference | Reference |
|---|---|---|---|---|
| BRI | Bristol | 19V, 19H, 37V, 37H | Comiso et al. [1997] | Smith [1996] |
| CF | Bootstrap frequency mode | 19V, 37V | Comiso et al. [1997] | Comiso [1986] |
| CP | Bootstrap polarization mode | 37V, 37H | Comiso et al. [1997] | Comiso [1986] |
| N90 | near 90 GHz algorithm | 85V, 85H | Spreen et al. [2007] | Svendsen et al. [1987] |
| NT | NASA TEAM | 19V, 19H, 37V | Comiso et al. [1997] | Cavalieri et al. [1984] |
| NT2 | NASA TEAM2 | 19V, 19H, 37V, 37H, 85V, 85H | Markus and Cavalieri [2000] | Markus and Cavalieri [2000] |
| TUD | Technical University of Denmark hybrid | 19V, 37V, 85V, 85H | Pedersen [1998] | Pedersen [1998] |

(a) Special Sensor Microwave Imager (SSM/I) channels are referred to by frequency and appended “V” or “H” signifying vertical and horizontal polarization, respectively.

[16] It is noted that the Bootstrap algorithm is considered in its separate modes rather than following the exact specification, where a switch between modes is envisaged such that the polarization mode (CP) is applied over consolidated ice and the frequency mode (CF) is used near the ice edge and open water. This is thought to offer a clearer insight into the characteristics of the algorithm and allows one to test the usefulness of the separate modes. The TUD algorithm is a hybrid approach that combines the CF with the 85 GHz polarization difference. At concentrations below a threshold (20%), it will return the CF result exclusively, whereas at higher ice concentrations the result is the square root of the product of the CF algorithm and the scaled 85 GHz polarization difference.
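The combination rule of the hybrid can be written down in a few lines. The sketch below is an illustration only, not the code of Pedersen [1998]: it assumes that the threshold is applied to the CF concentration, takes the scaled 85 GHz polarization-difference concentration as a given input (the scaling itself is not shown), and clamps the product at zero so that the square root stays real for slightly negative retrievals. The function and argument names are hypothetical.

```python
import math

def tud_hybrid(c_cf, c_85, threshold=0.20):
    """Hybrid concentration as described in the text (fractions, 1.0 = 100%).

    c_cf : Bootstrap frequency-mode (CF) concentration
    c_85 : scaled 85 GHz polarization-difference concentration (scaling not shown)
    """
    if c_cf < threshold:
        # Low-concentration regime: return the CF result exclusively.
        return c_cf
    # Geometric mean of the two inputs; the clamp at zero is an assumption
    # made here to keep the root real.
    return math.sqrt(max(c_cf * c_85, 0.0))
```

For example, `tud_hybrid(1.0, 0.96)` returns the geometric mean of the two inputs, whereas `tud_hybrid(0.1, 0.5)` falls below the threshold and returns the CF value unchanged.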

[17] In general, all algorithms may return ice concentrations above 100% and below 0%. This is because the algorithms apply reference sea ice emissivities to span the range between 0 and 100%. These tie points are generally defined as emissivities of pure ice and open water surfaces. In practice the tie points are defined from analysis of satellite data and given as brightness temperatures. A retrieval of, for example, 110% is understood as a mixture of 110% sea ice of tie point radiative properties and −10% open water. Wherever possible, we have applied the most recent tie points provided by the authors of a given algorithm; Table 1 provides the pertinent references. Many of these tie points have been tuned to daily average brightness temperature data and our use of swath data may therefore introduce a slight inconsistency that may minimally affect the error standard deviations, but not the correlations, in subsequent comparisons. This possible inconsistency must be weighed against the possible large error from an increased time offset between SSM/I and reference observation. In addition, the diurnal variation in tie point emissivity during winter is minimal and the comparison statistics will only be slightly affected by a minor change in tie points. It is common to limit the range of the algorithms to the physically meaningful range in a postprocessing step, which makes good sense for most users of sea ice concentrations. However, “saturated” ice concentration estimates will have reduced sensitivity to real openings in the ice and this practice additionally complicates comparisons of algorithm statistics as part of the true bias and variance are hidden in the cut off portion of the retrievals. In the present study we avoid all such postprocessing with the exception of NT2 and N90. For N90, the problem is that the smooth interpolation between the ice and water points is only strictly valid inside the 0 to 100% concentration interval. 
In fact, outside the interval and depending on the tie points, the concentration is not generally monotonically increasing with decreasing polarization [Spreen, 2004]. Our choice of tie points results in a well-behaved characteristic in the domain of interest in this study. However, to avoid errors, pixels where the polarization falls below the ice tie point and the retrieved concentration is below 100% are discarded. This error is only found in very few cases. For the NT2 algorithm, tie points are integrated in tables of simulated brightness temperatures that are used in a minimization scheme to find the combination of sea ice and atmospheric contributions matching the satellite observations. Since only solutions for ice concentrations up to 100% are allowed, the variability of the NT2 algorithm is not directly comparable to the remaining algorithms. We have therefore extended these tables to enable solutions to be found in the range between 0 and 120%; we name this version of the algorithm “unconstrained” and refer to it as NT2U henceforth. Owing to the dynamic scheme used to compensate for the atmospheric contribution, it is more difficult to account exactly for the consequences of this change. However, the resulting ice concentrations are generally below the hard threshold of 120%, and the fields show a spatial coherency, which suggests that the scheme is still stable. We give both the original and unconstrained results for reference.
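How a retrieval above 100% arises from tie points can be shown with a single-channel linear mixing sketch. The algorithms in Table 1 use channel combinations rather than one brightness temperature, so this is a simplified illustration only; the tie-point values and function names are hypothetical.

```python
def linear_concentration(tb, tb_water, tb_ice):
    """Unclipped ice concentration (fraction) from one brightness temperature in K."""
    return (tb - tb_water) / (tb_ice - tb_water)

def clip_to_physical(c):
    """Postprocessing step that limits the result to the 0 to 100% range."""
    return min(max(c, 0.0), 1.0)

TB_WATER, TB_ICE = 180.0, 250.0   # hypothetical tie points in K

# An ice surface that is 7 K brighter than the ice tie point.
c = linear_concentration(257.0, TB_WATER, TB_ICE)   # 77 / 70 = 1.1

# The same value read as a mixture: 110% tie-point ice and -10% open water.
tb_mix = c * TB_ICE + (1.0 - c) * TB_WATER
```

Clipping maps this retrieval to 100% and thereby removes the information that the surface deviates from the ice tie point, which is the part of the bias and variance that the comparison in this study aims to retain.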

[18] Errors in retrieved sea ice concentration may generally be attributed to emissivity variations, atmospheric emission, and extinction as well as general sensor noise and errors caused by smearing due to the coarse spatial resolution of the sensor. In winter, physical changes in the ice and overlying snow layers control variations in surface emissivity. In general, inhomogeneities such as layering in the snow will cause a larger response in the horizontally polarized channels, leading to increased polarization differences. Other effects may derive from atypically coarse snow grains leading to increased volume scattering that tends to affect the spectral gradient rather than the polarization. We adopt the term surface effects from Markus and Cavalieri [2000] but extend it to refer to all the effects in the snow/sea ice system that affect the spectral and polarization characteristics of sea ice emissivity. Thin ice types also display a high polarization signature, but generally a normal spectral gradient, and the potential result of both thin ice and layering effects is reduced ice concentration estimates for all algorithms that use horizontally polarized channels [Wensnahan et al., 1993]. Thin ice affects heat fluxes between atmosphere and ocean significantly; surface effects generally do not. This represents an important distinction between surface and thin ice effects. The surface and thin ice effects affect the algorithms differently; for example, the Bootstrap formulation tends to be less sensitive to thin ice and layering effects with a polarization signature [Comiso et al., 1997] because it will switch to vertically polarized channels (CF mode) in the presence of significant amounts of these surface types. On the other hand, the CF mode will be sensitive to anomalous volume scattering characteristics in the snow and sea ice [Tonboe et al., 2006]. 
Owing to varying penetration depths at different frequencies, the emissivity response to snow and ice variations is also highly dependent on the depth at which the variations occur. In general, the low-frequency channels will respond more to changes in the sea ice, whereas the 85 GHz channels barely penetrate the typical snow cover. A homogeneous snow cover tends to decrease polarization and raise the emissivity [Campbell et al., 1978] and constitutes the most important part of the typical winter first year ice signature variability. Although it has not been reported, this mechanism can be conceived to modify the sensitivity to thin ice, as at higher observing frequencies, a thinner snow layer may mask the signal from the underlying thin ice. While the emissivity of snow and ice may be modeled depending on microphysical parameters, these changes are not fully understood owing to the complexity and heterogeneity of naturally occurring sea ice and overlying snow layers. A detailed account of sea ice and snow emissivity modeling can be found in the work of Tonboe et al. [2006]. In summer, a multitude of additional effects related to, for example, snow wetness, melt ponds, and extensive melt and refreezing cycles affect the emissions from the sea ice surface. Most fundamentally, melt ponds cover the sea ice with water and are indistinguishable from open water. Their extent and depth are generally determined by sea ice topography [Eicken et al., 2004; Lüthje et al., 2006].

[19] Ice concentration retrievals are also affected by errors due to emission and absorption caused by atmospheric water vapor and cloud liquid water as well as emissivity variations of the open water due to roughening of the surface by wind. Sensitivities to atmospheric errors are analyzed in detail by Oelke [1997] and Andersen et al. [2006]. In general, it is found that algorithms not using the 85 GHz and horizontally polarized information tend to be less vulnerable to atmospheric variability. Ice emissivity errors inherently tend to affect the retrievals more at high ice concentrations while atmospheric errors tend to affect retrievals in the low-concentration range more owing to the general high emissivity of sea ice. In the following sections, we will deal with the high-concentration range exclusively and we will therefore not consider any methods of ameliorating the atmospheric influence such as weather filters or other external correction schemes unless explicitly stated.

4. Results

4.1. Algorithm Intercomparison

[20] In winter, it is possible by simple means to illustrate and quantify differences between the algorithm retrievals. The following calculations are based on daily gridded DMSP F-13 SSM/I data [Maslanik and Stroeve, 1990] in the cold season between 31 October and 31 March. We restrict the analysis to sea ice of high and stable concentration by means of a threshold on the daily ensemble mean concentration. A high threshold will select a relatively small area, whereas a low threshold will select larger areas but also allow more ice concentration variability. A reasonable trade-off is tentatively found at a threshold value of 90%; however, results with other thresholds, for example, 80%, are not significantly different, as the gradient toward the marginal sea ice zone tends to be sharp. Figure 5 shows concentration anomalies defined as

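$$A_{i,\mathrm{alg}} = \left\langle C_{i,\mathrm{alg}} \right\rangle - \left\langle C_{\mathrm{alg}} \right\rangle_{\overline{C} > 90\%} \qquad (1)$$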

where A denotes the ice concentration anomaly, subscript i is a pixel index and subscript alg denotes one of the selected algorithms. Temporal averaging over a cold season is denoted by brackets and daily ensemble averaging is denoted by an overbar. The pixels that obey the condition of daily ensemble mean ice concentration larger than 90% for more than 90% of the cold season are enclosed in the blue contour in Figure 5. The second term of equation (1) therefore represents the seasonal average retrieved ice concentration over the typical perennial sea ice with secondary contributions from seasonal sea ice areas. This technique mainly removes bias between the algorithms and helps to compare them as well as to highlight the spatial structure of the retrievals. Figure 5a covers the season 2000–2001. For the NT algorithm, we find lower ice concentrations stretching from the Fram Strait area across the North Pole. This pattern is well known and was found by Tonboe et al., submitted manuscript (2007) to be related to snow metamorphosis where warm Atlantic air masses penetrate into the Arctic Ocean. Similarly, in accord with Comiso et al. [1997], we find the Bootstrap algorithm negatively biased in coastal areas such as North of Greenland and the Canadian Archipelago. The NT2 algorithm is limited to return concentrations of maximum 100%, which means that it gives near constant results over the consolidated sea ice. The much higher variability displayed by NT2U is the result of lifting this constraint. The relatively coherent behavior of NT2U may be noted as an indication that the algorithm is stable even in this extended solution range. Some similarities between algorithms exist, especially in the area extending from North of Svalbard toward the Fram Strait, where most algorithms consistently yield lower ice concentrations. The most striking difference in that area is found with the CF algorithm that even appears positively biased here. 
The area is known as a dynamic area, and likely causes for the low-concentration retrievals may be the occurrence of thin ice and/or layering effects arising through frequent cycles of melt and subsequent refreezing (Tonboe et al., submitted manuscript, 2007). Interannual variability is significant; this is exemplified in Figure 5b, which shows the anomalies of the following season (2001–2002), where the effects north of Fram Strait are pronounced. Differences from the preceding season are also apparent in other areas such as the Beaufort and East Siberian Seas, where a positive bias prevails. To quantify these findings, statistics from all pixels of the season that obey the condition of daily ensemble mean ice concentration larger than 90% are presented in Table 2. The differences in standard deviation indicate that the algorithms respond differently to surface and atmospheric effects. The high sensitivity to atmospheric effects has often been held against the 85 GHz based algorithms [e.g., Lubin et al., 1997; Markus and Cavalieri, 2000]. In general, atmospheric effects tend to increase the retrieved ice concentrations [Andersen et al., 2006]. Here we find that those algorithms give relatively low variability and at the same time display the lowest concentrations near the ice edge, where atmospheric effects are strongest. This indicates that surface effects and/or thin ice are the dominating source of the observed variations. Interannual variations are assessed from the four winter seasons and show that average concentrations are lower especially during 2002–2003. However, the NT algorithm consistently shows higher variations from year to year, whereas TUD, N90, NT2 and BRI show the least variability. 
Consistent with error propagation theory, the TUD algorithm, which is approximately computed as the square root of the product of the CF and N90 algorithms, displays standard deviations that are lower than those of its two input algorithms by a factor of $\sqrt{2}$. There are also variations in the standard deviations of the algorithms from year to year, where in particular CF and BRI have low values during the 2000–2001 season. Comparing the mean values for NT2 and NT2U adds to the explanation of the widely different behavior noted in Figure 5. The unconstrained version, NT2U, has a significant positive bias. We speculate that this corresponds to bias in the sea ice tie points implicitly specified through the NT2 lookup tables. The variation around this “natural” mean value is truncated by the cutoff at 100% in the official version of the algorithm, NT2, which consequently shows much lower standard deviations.
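The reduction in standard deviation expected for a geometric mean can be checked numerically. The following is a minimal sketch, not the authors' computation: it assumes that the two inputs have independent Gaussian errors of equal size around 100%, which real CF and N90 retrievals need not satisfy. The function names and the noise level `sigma` are illustrative.

```python
import math
import random


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def geometric_mean_sd_ratio(sigma=4.0, n=20000, seed=1):
    """Ratio of the SD of sqrt(a * b) to the mean SD of the inputs a and b.

    a and b are independent, centred on 100% concentration, with the same
    (hypothetical) standard deviation sigma.
    """
    rng = random.Random(seed)
    a = [100.0 + rng.gauss(0.0, sigma) for _ in range(n)]
    b = [100.0 + rng.gauss(0.0, sigma) for _ in range(n)]
    combined = [math.sqrt(x * y) for x, y in zip(a, b)]
    return sample_sd(combined) / (0.5 * (sample_sd(a) + sample_sd(b)))


ratio = geometric_mean_sd_ratio()
print(f"SD ratio: {ratio:.3f}, 1/sqrt(2) = {1 / math.sqrt(2):.3f}")
```

Under these assumptions the ratio comes out close to 1/√2; correlated errors in the two inputs would weaken the reduction.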

Figure 5.

Concentration anomalies for the cold seasons of (a) 2000–2001 and (b) 2001–2002. Algorithms used are indicated on the individual maps. See text for derivation.
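The anomaly derivation described in the text can be sketched as follows. This is an illustrative reading of the description, not the authors' code: it assumes the reference term is each algorithm's own mean over all day and pixel samples whose daily ensemble mean exceeds the threshold. The data layout (a dictionary of daily pixel lists) and all names are hypothetical.

```python
def seasonal_anomalies(conc, threshold=90.0):
    """conc maps algorithm name -> list of daily maps (lists of pixel values, %).

    Returns algorithm name -> per-pixel anomaly: the seasonal pixel mean minus
    that algorithm's mean over samples whose daily ensemble mean > threshold.
    """
    algs = list(conc)
    n_days = len(conc[algs[0]])
    n_pix = len(conc[algs[0]][0])

    # Daily ensemble mean (the overbar): average over algorithms.
    ens = [[sum(conc[a][d][i] for a in algs) / len(algs) for i in range(n_pix)]
           for d in range(n_days)]

    result = {}
    for a in algs:
        # Temporal mean per pixel (the angle brackets).
        pix_mean = [sum(conc[a][d][i] for d in range(n_days)) / n_days
                    for i in range(n_pix)]
        # Reference level from the high-concentration samples only.
        selected = [conc[a][d][i] for d in range(n_days) for i in range(n_pix)
                    if ens[d][i] > threshold]
        if not selected:
            raise ValueError("no samples exceed the ensemble-mean threshold")
        reference = sum(selected) / len(selected)
        result[a] = [m - reference for m in pix_mean]
    return result


# Synthetic example: algorithm "B" reads 4% higher than "A" over pack ice.
demo = {
    "A": [[100.0, 96.0, 50.0], [98.0, 94.0, 40.0]],
    "B": [[104.0, 100.0, 60.0], [102.0, 98.0, 50.0]],
}
anom = seasonal_anomalies(demo)
print(anom)
```

In the synthetic example the two high-concentration pixels receive identical anomalies for both algorithms, which illustrates how the subtraction removes the bias between algorithms while keeping the spatial structure.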

Table 2. Statistics During the Cold Season of 31 October to 31 March of Ice Concentration Algorithms Inside an Area Delimited by Daily Ensemble Mean Concentrations Larger Than 90%^a

^a The amplitude rows show the difference between the extreme values for each algorithm. NT2U is a version of NT2 modified to allow estimates of ice concentration larger than 100%.

2001– 2002

4.2. Comparisons to RGPS Divergence Data

[21] Kwok [2002] compared open water fractions calculated from NT and Bootstrap to openings derived from the divergence of the flow field as measured by the RGPS for the period January through April 1998. It was found that both algorithms, and in particular NT, overestimated the fraction of open water. Furthermore, case studies revealed structural characteristics in the passive microwave record that appeared unrealistic. In order to examine possible algorithm-dependent differences, we compare gridded divergence fields from RGPS, as retrieved from the RGPS website, to changes in SSM/I concentration. The latter are calculated from daily gridded DMSP F-13 SSM/I data [Maslanik and Stroeve, 1990], between the beginning and end of the 3-day time window dictated by the SAR revisit time. The analysis is carried out for the entire 4-year period of RGPS data available at the time of the analysis, and the results are shown in Figure 6. We confirm Kwok's observations and extend them to a wider range of algorithms, in that widely varying and mostly low correlation coefficients are found between the fields of RGPS divergence and SSM/I concentration change. The analysis was also done on shorter time series of basic RGPS divergence fields with similar results. Owing to the low correlation values, it is not possible to assess whether any algorithm or group of algorithms shows more consistency with the RGPS fields. The RGPS data represent a kinematic constraint on the openings in the sea ice and consequent formation of new ice. The low correlation between the RGPS and passive microwave estimates reinforces the finding that the variations in sea ice concentrations over the consolidated sea ice cannot be translated directly into variations in openings in the sea ice cover. 
Furthermore, if thin ice was a dominating source of the ice concentration variability, we would expect the CF algorithm, with its low sensitivity to thin ice, to give correlations markedly lower than the other algorithms. The relatively uniform results for the different algorithms may suggest that surface effects are the dominant source of variability in retrieved ice concentration over the perennial sea ice of the Arctic Ocean.
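The comparison described above amounts to correlating two fields on a common grid. A minimal sketch follows, assuming that positive divergence denotes opening and that it is compared with the change in open water fraction (100% minus ice concentration) over the same 3-day window. The sign convention, the use of `None` for missing cells, and all names are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def divergence_correlation(divergence, conc_start, conc_end):
    """Correlate divergence with the 3-day change in open water fraction.

    Cells where any of the three inputs is None (no data) are skipped.
    """
    pairs = [(d, (100.0 - e) - (100.0 - s))
             for d, s, e in zip(divergence, conc_start, conc_end)
             if d is not None and s is not None and e is not None]
    div, open_water_change = zip(*pairs)
    return pearson(div, open_water_change)


# Synthetic grid cells in which opening maps exactly onto lost concentration.
divergence = [0.0, 0.5, None, 1.0, 1.5]           # percent opening, 3 days
conc_day0 = [100.0, 100.0, 99.0, 100.0, 100.0]    # percent concentration
conc_day3 = [100.0, 99.5, 99.0, 99.0, 98.5]
r = divergence_correlation(divergence, conc_day0, conc_day3)
print(f"correlation: {r:.3f}")
```

The synthetic case is constructed to give perfect correlation; the low values reported in Figure 6 indicate how far the real retrievals are from that idealized behavior.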

Figure 6.

Correlations between Radarsat Geophysical Processor System (RGPS) divergence and corresponding ice concentration changes calculated by the following algorithms: (top) BRI, NT2, and NT, and (bottom) CF, CP, N90, and TUD. Each symbol represents the correlation of gridded deformations measured by RGPS over a 3-day period with ice concentration changes over the same 3-day period and on the same grid.

4.3. Comparison to Ship Observations

[22] Comparisons were made between SSM/I ice concentrations and ice observations from two R/V Polarstern cruises to the Arctic in summer and winter, respectively. For each Polarstern sea ice observation the SSM/I swath pixel closest in time and matching the geographic coordinates was selected. Using DMSP F13, F14 and F15 SSM/I data, the time difference was always quite small, that is, a few hours at most. The results are shown in Tables 3 and 4. In both cases, the ship negotiated both high- and medium-concentration sea ice. Yet, the summer cruise extended much farther into the perennial sea ice, therefore seeing relatively more high-concentration sea ice. Consequently, error standard deviations in the range up to 17% are found with a tendency toward smaller values during summer. The bias values, as expected owing to surface wetness, become increasingly negative in summer and correlations are consistently reduced. In both summer and winter the algorithms based on 85 GHz information are among the best as determined from both correlation and error standard deviation, while of the low-frequency algorithms, the BRI and CP algorithms obtain the most favorable overall statistics. In summer, the correlations are found to be consistently lower for the lower-resolution algorithms. Gaussian smoothing was applied to the high-frequency algorithms to match the spatial resolution of the other algorithms. The difference is therefore not explained by the effects of open melt ponds modulated by differences in spatial resolution. However, refrozen melt ponds were frequently observed during the summer cruise. The surface fraction covered by the meltwater was about 40% on average [Haas and Lieser, 2003]. Multiple reflections at the layers (atmosphere, ice, and water) cause interference that influences the difference of the vertically and horizontally polarized signals, which is an important means of calculating the sea ice concentration [Liu et al., 1998]. 
In addition, snow grain sizes increase owing to repeated melt and refreeze cycles, observed frequently during the cruise and causing increased scattering [Mätzler and Wegmüller, 1987; Haas and Lieser, 2003]. We speculate that these effects are likely to be stronger for the lower-frequency signals owing to deeper penetration, thus the 85 GHz channels may be less affected.
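The three statistics reported in Tables 3 and 4 can be computed from collocated pairs as sketched below. This is a minimal illustration that assumes the bias is the mean of SSM/I minus ship concentration and the error standard deviation is the sample standard deviation of those differences; the helper names and the sample values are hypothetical.

```python
import math


def closest_in_time(obs_time, swath_times):
    """Index of the swath whose time (hours) is nearest to the observation."""
    return min(range(len(swath_times)),
               key=lambda k: abs(swath_times[k] - obs_time))


def comparison_stats(ssmi, ship):
    """Return (bias, error SD, correlation) for collocated concentrations."""
    n = len(ssmi)
    diff = [s - o for s, o in zip(ssmi, ship)]
    bias = sum(diff) / n
    error_sd = math.sqrt(sum((d - bias) ** 2 for d in diff) / (n - 1))
    ms, mo = sum(ssmi) / n, sum(ship) / n
    cov = sum((s - ms) * (o - mo) for s, o in zip(ssmi, ship))
    var_s = sum((s - ms) ** 2 for s in ssmi)
    var_o = sum((o - mo) ** 2 for o in ship)
    return bias, error_sd, cov / math.sqrt(var_s * var_o)


# Made-up collocations, in percent concentration.
nearest = closest_in_time(7.0, [1.0, 6.5, 12.0])
bias, error_sd, corr = comparison_stats([88.0, 95.0, 70.0, 100.0],
                                        [90.0, 100.0, 80.0, 100.0])
print(nearest, round(bias, 2), round(error_sd, 2), round(corr, 3))
```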

Table 3. Comparison Between SSM/I Ice Concentrations and Observations From R/V Polarstern During the Cryovex Cruise, March–April 2003^a
Algorithm    Bias    Error SD    Correlation

^a The average concentration is 89.9, and the standard deviation (SD) is 18.5.

Table 4. Comparison Between SSM/I Ice Concentrations and Observations From R/V Polarstern During the ARK 17/2 Cruise, August–September 2001^a
Algorithm    Bias    Error SD    Correlation

^a The average concentration is 85.2, and the SD is 12.6.


4.4. Comparison to Classified SAR Data

[23] From the total comparison data set, described in section 2, we initially analyzed the correlation between SAR based and SSM/I concentrations as a function of the SAR based concentration. The result is shown in Figure 7. In general, we obtain high correlations, around 0.9, at lower concentrations, giving justification to both methods. On the other hand, the correlations decrease significantly at high ice concentrations. Inspection of the color codes in Figure 7 shows that part of the reason for the poor correlation at high concentration is the near-constant SAR concentration. However, some scenes in the high-concentration regime have variability near 5–10% but consistently correlate poorly with SSM/I derived ice concentrations. Recalling that the SAR scenes are selected arbitrarily only to be well within the ice edge, this is in good accord with the results obtained from the comparison to RGPS as well as with the generic error estimates for SSM/I based ice concentration retrieval of approximately 5–10% [e.g., Steffen et al., 1992]. Comparison to SAR data is shown in Table 5. Note that for TUD and N90 we give the statistics based on the full 85 GHz SSM/I resolution as well as on data averaged to the resolution of the SSM/I low-resolution channels. The latter are signified by an appended “l” on the name of the algorithm. However, the differences between the high- and low-resolution results are small.

Figure 7.

Scene correlations between SAR- and Special Sensor Microwave Imager (SSM/I)–derived ice concentrations as function of average SAR scene ice concentration. Color is assigned according to the standard deviation of the concentrations within the SAR scene.

Table 5. Results of Comparison Between the Full Set of SAR Scenes and SSM/I Concentrations^a
                         Average    SD     Number of Observations
SAR (low resolution)     97.7       9.0    10084
SAR (high resolution)    97.6       9.8    41012
                         Bias    Error SD    Correlation

^a Synthetic aperture radar (SAR)–derived ice concentrations are derived at both 25 km (low resolution) and 12.5 km (high resolution) resolution consistent with the high- and low-resolution SSM/I channels. For the two high-resolution algorithms, TUD and N90, statistics are given on the basis of 12.5 km resolution data as well as on data averaged to 25 km resolution, denoted by an appended “l.”

[24] Compared to the results from the ship observations, the SAR comparison shown in Table 5 has lower error standard deviations. Part of this difference is probably explained by the relatively higher and less variable ice concentrations of the SAR data set, that tend to limit the errors owing to smearing and atmospheric influences. Biases are in general less than 5% and variability much as expected with NT and N90 underestimating most. This is in agreement with, for example, the work of Comiso et al. [1997], Belchansky and Douglas [2002], Meier [2005], and for N90 it is a consequence of the tie points that were derived to match NT retrievals [Kaleschke et al., 2001; Spreen et al., 2007]. In general, the N90 and TUD algorithms are by far the most sensitive to atmospheric influences. However, these algorithms have the lowest error standard deviations, indicating that for conditions found within the consolidated sea ice, the atmospheric influence is secondary to retrieval of sea ice total concentration.

[25] Limiting the analysis to the wintertime subset, shown in Figure 4b, the variability of the SAR ice concentrations is reduced further, and the average concentrations are above 98%; see Table 6. One consequence is that the bias increases consistently with an increase in average SSM/I retrievals of about 1–2%. The error standard deviation differences between different algorithms are similar to the full data set; however, their absolute value is 1–3% lower (with the exception of CF and BRI). As expected in the high-concentration domain, the correlations with SAR are small for all algorithms. However, the differences in standard deviation between N90 and NT as well as BRI and NT are found to be significant at the 95% level using an F variance test. Furthermore, we find that the variability of the SAR concentrations is significantly lower than that of the SSM/I retrievals. Recalling that the SAR concentrations are biased toward high variability by the misclassification of thin ice as open water, it implies that the variability of the SSM/I retrievals cannot be explained by sea ice concentration variability. The fact that the 85 GHz algorithms (N90, TUD) have less variability indicates that atmospheric variability is not significant and suggests that surface effects or thin ice are the primary influence. We recall that thin ice types will mimic a mixture of sea ice and water in all SSM/I algorithms except CF as well as in the SAR data. Yet, we do not find different statistics with CF as opposed to other algorithms to indicate that this is a significant effect. While these findings are representative of large-scale conditions within the consolidated sea ice, it is likely that locally and during ice growth, ice concentration errors may be dominated by weather effects and thin ice.

Table 6. Same as Table 5 but for the Wintertime Subset
                         Average    SD     Number of Observations
SAR (low resolution)     98.5       3.7    5957
SAR (high resolution)    98.6       4.2    24298
                         Bias    Error SD    Correlation

[26] To address possible reservations against the quality of the SAR concentration data set, we define a high-quality winter subset of the most consistently classified data, shown in Figure 4d. The resulting statistics are shown in Table 7, and it is clear that this subset closely represents fully ice-covered surfaces. Essentially, the obtained error standard deviations will be due to surface effects and sensor noise. Sensor noise was estimated adding random deviates from a Gaussian distribution with standard deviation following the prelaunch characteristics from Hollinger et al. [1987] to MY and FY ice tie point brightness temperatures. For the 19 and 37 GHz channels 448 realizations for each surface type were computed and for the 85 GHz channels: 1792, to reflect the 4 fold sampling density of these channels relative to the low-resolution channels. From this data set, ice concentrations were derived and for the N90 and TUD algorithms, statistics were derived both from the full and a 4 sample averaged data set. For NT2, this procedure is not straightforward and the noise estimates are repeated from NT based on the finding by Markus and Cavalieri [2000], that the two algorithms have similar sensitivities. With sensor noise between 1.3 and 1.7%, we conclude that the major source of error is due to emissivity variations. The influence is found to vary significantly between algorithms with a range from 2.5 to 5.0% for TUD and CP, respectively. The NT2 algorithm actually gives a value of 1.4%, but this value is not comparable to the remaining algorithms because of the cutoff at 100%, discussed previously. In itself it indicates that a constant ice concentration estimate agrees better with the SAR concentrations than the estimate provided by the ice concentration algorithms. 
We note a reasonable agreement with the standard deviation figures obtained in section 4.1 and that the SAR derived data set captures most of the spatial variability in emissivity over the high-concentration sea ice. Both sets of statistics, however, underrepresent the larger variations at the ice edge owing to more frequent melt freeze episodes and increased precipitation due to the proximity to the relatively warm open ocean.
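The sensor noise estimate described above is a Monte Carlo experiment. The sketch below shows the idea only: it replaces the actual algorithms with a hypothetical single-channel linear retrieval, and the tie points (180 K and 250 K) and the noise level (0.7 K) are placeholder values, not the Hollinger et al. [1987] figures. The realization count of 1792 and the 4-sample averaging follow the text.

```python
import math
import random


def retrieve(tb, tb_water=180.0, tb_ice=250.0):
    """Hypothetical linear retrieval (%), standing in for a real algorithm."""
    return 100.0 * (tb - tb_water) / (tb_ice - tb_water)


def sample_sd(values):
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def noise_sd(n_real, ne_dt=0.7, average=1, seed=0, tb_ice=250.0):
    """SD of retrieved concentration when Gaussian noise is added to the
    ice tie point brightness temperature; optionally average groups of
    `average` consecutive samples before retrieval."""
    rng = random.Random(seed)
    tb = [tb_ice + rng.gauss(0.0, ne_dt) for _ in range(n_real)]
    if average > 1:
        tb = [sum(tb[k:k + average]) / average
              for k in range(0, n_real, average)]
    return sample_sd([retrieve(t) for t in tb])


full = noise_sd(1792)             # full sampling density
avg4 = noise_sd(1792, average=4)  # 4-sample averages, 448 values
print(f"full: {full:.2f}%, 4-sample average: {avg4:.2f}%")
```

With these placeholder values the full-resolution noise is about 1% in concentration, and averaging four samples roughly halves it, as expected for independent noise.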

Table 7. Same as Table 5 but for the Subset of Particularly Unambiguous SAR Scenes and With Sensor Noise Estimated as Described in the Text
                         Average    SD     Number of Observations
SAR (low resolution)     99.7       0.7    3669
SAR (high resolution)    99.7       1.0    15079
                         Bias    Error SD    Correlation    Sensor Noise

[27] During summer all algorithms have negative biases; see Table 8. Relative to the high-quality winter subset the NT, CF, CP and BRI bias change more than 10% while TUD, N90 and NT2 changes are smaller. In general, the differences in error standard deviation are more subtle in contrast to the results from ship observations. This illustrates the large spatial and temporal variability in summer. It is likely that to fully describe the variations in skill, an even larger data set is required than the one presented here.

Table 8. Same as Table 5 but for the Summertime Subset
                         Average    SD      Number of Observations
SAR (low resolution)     94.6       17.4    1556
SAR (high resolution)    94.0       19.1    6276
                         Bias    Error SD    Correlation

[28] Finally, we investigate geographical variations during the winter season. The areas chosen are (1) Baffin Bay, (2) the Arctic perennial ice cover (MY), (3) all parts of the Arctic with seasonal ice coverage (FY) (excluding the East Greenland Current), and (4) the East Greenland Current including the Fram Strait; see Figure 4a. Some subsets are relatively small and the results are therefore more uncertain. The differences in correlation coefficient here, as well as in the preceding tables, are minute and cannot be interpreted with confidence. The bias can be easily adjusted through tie point setting, whereas standard deviation is a measure of the random error in the ice concentration retrievals and is an inherent property of the algorithm. The results are therefore presented with the algorithms ranked by standard deviation. Note that the standard deviation of NT2, due to the cutoff at 100% as described earlier, is not comparable to that of the other algorithms. As in other tables, we give the statistics for NT2; however, it is excluded from the ranking. In spite of concerns over the significance, a relatively clear result emerges in Tables 9a–9d in that the TUD algorithm is consistently among the two best algorithms. It is followed by the BRI, CP, and N90 algorithms. These algorithms have more variable rankings, and it is difficult to assert the significance of the result based on this analysis alone; however, it agrees well with the findings in section 4.1.

Table 9a. Ranking of Algorithms on Regional Winter Subset, Shown Schematically in Figure 4a, Depending on Magnitude of Error Standard Deviation^a
Ranking    Baffin Area (368 pts)

^a SAR average ice concentration is 93.8, and corresponding standard deviation is 9.9. Notice that since NT2 cuts off all ice concentration retrievals larger than 100%, its statistics are not comparable to the remaining algorithms, and it is therefore not ranked.

Table 9b. Ranking of Algorithms on Regional Winter Subset, Shown Schematically in Figure 4a, Depending on Magnitude of Error Standard Deviation^a
Ranking    MY Area (2598 pts)

^a SAR average ice concentration is 98.5, and corresponding standard deviation is 3.1. Notice that since NT2 cuts off all ice concentration retrievals larger than 100%, its statistics are not comparable to the remaining algorithms, and it is therefore not ranked.

Table 9c. Ranking of Algorithms on Regional Winter Subset, Shown Schematically in Figure 4a, Depending on Magnitude of Error Standard Deviation^a
Ranking    FY Area (1688 pts)

^a SAR average ice concentration is 99.3, and corresponding standard deviation is 1.7. Notice that since NT2 cuts off all ice concentration retrievals larger than 100%, its statistics are not comparable to the remaining algorithms, and it is therefore not ranked.

Table 9d. Ranking of Algorithms on Regional Winter Subset, Shown Schematically in Figure 4a, Depending on Magnitude of Error Standard Deviation^a
Ranking    East Greenland Area (579 pts)

^a SAR average ice concentration is 98.5, and corresponding standard deviation is 3.0. Notice that since NT2 cuts off all ice concentration retrievals larger than 100%, its statistics are not comparable to the remaining algorithms, and it is therefore not ranked.


4.5. Consequences for Ice Area and Extent Estimates

[29] It is natural to explore the possible effects of the differences between algorithms on long-term properties of ice extent and area. We have therefore calculated ice concentration fields for the complete time series of daily gridded SSM/I brightness temperatures [Maslanik and Stroeve, 1990]. The data set includes the three DMSP satellites F8, F11, F13 and extends from July 1987 through June 2004. Since the 85 GHz channels on the F8 satellite were unstable, the F8 subset is excluded from the NT2, N90 and TUD algorithm time series, leaving the period from January 1991. As mentioned by Cavalieri et al. [1999] the data set has gaps. We reject data from days with excessive amounts of missing data and fill daily maps with only few missing pixels by linear interpolation in time. The land and ocean masks by Stroeve [2000], weather filter by Cavalieri et al. [1995], and a cutoff at 30% ice concentration are applied to all algorithms. Finally, ice concentrations exceeding 100% are truncated and daily sea ice extent and area values are averaged into monthly means. This procedure is slightly simpler than that of Cavalieri et al. [1999], but it leads to comparable results and is thought to be sufficiently accurate for intercomparison of algorithms.
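The extent and area calculation for a single daily map can be sketched as below. This is a simplified illustration of the steps listed above (truncation at 100%, cutoff at 30%), assuming that land and missing pixels are marked with `None` and that every grid cell has a nominal area of 625 km², which ignores the true variation of cell area across a polar grid. Masks, weather filter and temporal gap filling are omitted.

```python
def extent_and_area(conc, cell_area_km2, cutoff=30.0):
    """Return (extent, area) in km^2 for one daily concentration map.

    conc: ice concentrations in percent, None for land or missing data.
    cell_area_km2: area of each grid cell, same length as conc.
    """
    extent = 0.0
    area = 0.0
    for c, a in zip(conc, cell_area_km2):
        if c is None:
            continue              # masked pixel
        c = min(c, 100.0)         # truncate retrievals above 100%
        if c < cutoff:
            continue              # treated as open water
        extent += a               # whole cell counts toward extent
        area += a * c / 100.0     # cell weighted by concentration
    return extent, area


conc_map = [105.0, 80.0, 20.0, None, 50.0]
cell_areas = [625.0] * len(conc_map)
extent, area = extent_and_area(conc_map, cell_areas)
print(extent, area)
```

Because extent counts whole cells while area weights them by concentration, area estimates inherit concentration errors inside the pack ice, whereas extent is sensitive mainly to retrievals near the cutoff at the ice edge.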

[30] Initially, we estimate trends in area and extent by linear least squares fitting of monthly anomalies with respect to mean monthly values including all seasons. The results are shown in Figures 8a and 8b and in Table 10. For NT they correspond well with the estimate of −34000 ± 8300 and −29300 ± 8300 km²/year by Parkinson et al. [1999] for ice extent and area, respectively. The increasing rate of change found from the comparison of the long and short time series is in line with the findings by, for example, Comiso [2001]. The differences between the algorithms are relatively smaller for the ice extent estimates, and all algorithms are generally close to being within 1 standard deviation of each other. Basing the same analysis only on winter months, shown in Figures 8c and 8d and Table 10, tends to accentuate the differences, especially for ice area. This is taken as evidence of the reduced differences between algorithms during the summer months, found in section 4.3. For the 1987–2004 time series, trends in sea ice area of −20,900 and −29,000 km²/year are found for the BRI and NT algorithms, respectively. This difference is significantly larger than the standard deviations, which are between 5000 and 5600 km²/year. For the CF and NT algorithms, the differences are even larger. However, since the CF algorithm, in its official implementation, is not to be used over consolidated ice, this is not a reasonable comparison.
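The trend estimation can be illustrated with a short sketch: remove the mean of each calendar month, then fit a straight line to the anomalies by least squares. The synthetic series (a seasonal cycle plus a prescribed trend of −30 in units of 1000 km²/yr) and the function names are hypothetical, and the sketch assumes a gap-free series starting in January.

```python
import math


def monthly_anomalies(values):
    """Subtract each calendar month's mean; values[0] is taken as January."""
    climatology = []
    for month in range(12):
        month_values = values[month::12]
        climatology.append(sum(month_values) / len(month_values))
    return [v - climatology[k % 12] for k, v in enumerate(values)]


def trend_per_year(anomalies):
    """Least squares slope of monthly anomalies, per year."""
    n = len(anomalies)
    t = [k / 12.0 for k in range(n)]
    t_mean = sum(t) / n
    y_mean = sum(anomalies) / n
    sxx = sum((a - t_mean) ** 2 for a in t)
    sxy = sum((a - t_mean) * (y - y_mean) for a, y in zip(t, anomalies))
    return sxy / sxx


# Ten years of synthetic monthly values: seasonal cycle plus linear decline.
series = [10000.0
          + 3000.0 * math.cos(2.0 * math.pi * (k % 12) / 12.0)
          - 30.0 * k / 12.0
          for k in range(120)]
slope = trend_per_year(monthly_anomalies(series))
print(f"fitted trend: {slope:.1f} (prescribed: -30.0)")
```

The fitted slope is close to, but slightly smaller in magnitude than, the prescribed trend, because subtracting monthly means also removes the within-year part of the trend.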

Figure 8.

Observed trends in (a) ice extent and (b) ice area, for the data set excluding data from the F8 satellite (black) and the full data set (grey), as well as (c, d) the same but excluding summer months (May through September). Error bars indicate ±1 standard deviation.

Table 10. Estimated Trends of Ice Extents and Areas, in Units of 1000 km²/yr, From Different Algorithms and Periods With Corresponding Standard Deviations
Annual ice extent (1991–2004)    −47.8 ± 6.1    −50.1 ± 6.1    −44.8 ± 6.0    −43.4 ± 6.1    −43.3 ± 6.1    −41.9 ± 6.0    −44.1 ± 5.6
Annual ice extent (1987–2004)    −32.7 ± 4.1    −34.6 ± 4.1    −29.5 ± 4.0    −31.7 ± 4.1
Winter ice extent (1991–2004)    −47.1 ± 6.8    −49.3 ± 6.8    −43.6 ± 6.8    −43.6 ± 6.8    −42.8 ± 6.8    −42.0 ± 6.9    −43.5 ± 6.0
Winter ice extent (1987–2004)    −31.8 ± 4.7    −33.6 ± 4.8    −28.3 ± 4.7    −30.7 ± 4.8
Annual ice area (1991–2004)      −42.1 ± 6.1    −40.8 ± 6.1    −46.0 ± 6.3    −46.0 ± 6.3    −45.5 ± 6.3    −38.1 ± 6.0    −43.5 ± 5.8
Annual ice area (1987–2004)      −27.4 ± 4.2    −26.3 ± 4.2    −28.8 ± 4.2    −32.0 ± 4.3
Winter ice area (1991–2004)      −36.1 ± 7.0    −32.9 ± 6.7    −43.8 ± 7.8    −45.5 ± 7.9    −43.4 ± 7.4    −35.2 ± 7.3    −40.1 ± 6.6
Winter ice area (1987–2004)      −20.9 ± 5.0    −17.6 ± 5.0    −26.9 ± 5.4    −29.0 ± 5.6

[31] We explore the differences between the algorithms more closely by selecting the BRI algorithm as reference and subtracting its extent and area anomalies from those of the remaining algorithms. The main reason for selecting the Bristol algorithm is that among the non-85 GHz algorithms it consistently showed the lowest standard deviation in the wintertime comparisons in sections 4.1, 4.3, and 4.4. However, the choice is not important for the analysis, as the main objective is simply to explore differences between the trends of different algorithms and investigate the statistical significance of these differences. Analyzing the differences has two distinct advantages: It eliminates variations in the time series that are common to all algorithms, and it allows the statistical significance of the differences to be tested conveniently by testing the hypothesis of a relative trend different from zero. Figure 9 shows the resulting time series for the wintertime subset. Using the full data set (not shown) gives approximately the same results, except that the trends of ice extent and area are more similar. While the amplitude of the monthly extent and area anomalies is generally larger than 10^6 km² [cf. Parkinson et al., 1999; Comiso, 2001], taking the difference between algorithms reduces this amplitude by at least an order of magnitude and even more for ice extent. This reduction in amplitude mostly accounts for real fluctuations in the ice extent and area, whereas the residual variation is mostly due to differences in the ice concentration algorithms. The independent variations between the algorithms are larger for ice area than ice extent, consistent with the more complicated sensitivities to ice and snow surface parameters.

Figure 9.

Observed differences between ice extent (black) and ice area (grey) winter time series relative to BRI: (a, b) results for the 1991–2004 period for CF, CP, N90, NT, NT2, and TUD and (c) results for the 1987–2004 period for CF, CP, and NT.

[32] Most algorithms display a trend relative to BRI for both ice extent and ice area; however, it is clear from Figure 9 that this trend is not identical for all algorithms. This indicates that temporal changes in interfering properties, such as surface and atmospheric effects, affect the algorithms in different ways. In particular, it is interesting to note that only CF has a negative trend in ice extent relative to BRI, while N90 has the most positive trend. According to Andersen et al. [2006], these algorithms are, respectively, the least and most sensitive to atmospheric effects. The difference in sensitivity is largest for surface wind speed. We speculate that increased atmospheric effects in connection with the increasing surface temperatures, documented by Comiso [2001], and/or long-term variations in storm tracks, may counter the downward trend in ice extent by raising retrieved ice concentrations at the ice edge. This effect would be more pronounced in the algorithms that are more sensitive to atmospheric effects.

[33] To quantify whether the relative trends in Figure 9 are statistically significant, two-sided significance levels are computed conservatively using Student's t distribution with adjusted standard error and degrees of freedom following Santer et al. [2000]. Results for ice extent are not shown since the trends relative to BRI are in most cases significantly different from zero. The exceptions are (1) NT, which consistently shows little trend relative to BRI on the 1987–2004 time series but a significant trend on the 1991–2004 time series, and (2) TUD, which shows an insignificant trend relative to BRI, mainly owing to a relatively larger and unexplained variability of the difference between the two algorithms. The results for ice area, given in Table 11, are much more varied. However, nonsignificant trends when summer data are included in the analysis seem to be a uniting feature, which is in line with the more uniform skill found for summer data in section 4. During winter the trends of the CP, NT, NT2, and partly CF against BRI are all significant to a very high level of confidence. The TUD and N90 are more similar to BRI. This is again in line with the results for the wintertime comparison in section 4.
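The adjusted significance test can be sketched as follows. This is an illustrative reading of the approach of Santer et al. [2000], in which the lag-1 autocorrelation r1 of the regression residuals reduces the sample size to an effective value n_eff = n(1 − r1)/(1 + r1) that is used for both the standard error and the degrees of freedom. The exact estimator details, the function name and the synthetic data are assumptions.

```python
import math
import random


def adjusted_trend_test(y):
    """Return (slope, adjusted SE, t ratio, adjusted degrees of freedom)."""
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((a - t_mean) ** 2 for a in t)
    slope = sum((a - t_mean) * (v - y_mean) for a, v in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [v - (intercept + slope * a) for a, v in zip(t, y)]
    sse = sum(e * e for e in resid)
    # Lag-1 autocorrelation of the residuals.
    r1 = sum(resid[k] * resid[k + 1] for k in range(n - 1)) / sse
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    if n_eff <= 2.0:
        raise ValueError("effective sample size too small for a trend test")
    se_slope = math.sqrt(sse / (n_eff - 2.0) / sxx)
    return slope, se_slope, slope / se_slope, n_eff - 2.0


# Synthetic difference series: trend of 0.5 per month plus AR(1) noise.
rng = random.Random(0)
noise, series = 0.0, []
for k in range(168):
    noise = 0.6 * noise + rng.gauss(0.0, 1.0)
    series.append(0.5 * k + noise)

slope, se_slope, t_ratio, dof = adjusted_trend_test(series)
print(f"slope={slope:.3f}, se={se_slope:.4f}, t={t_ratio:.1f}, dof={dof:.1f}")
```

The two-sided significance level then follows from comparing the t ratio with Student's t distribution for the adjusted degrees of freedom, for example from statistical tables or a library such as SciPy's `scipy.stats.t`.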

Table 11. Trends of Differences in Ice Area Relative to the BRI Algorithm in 1000 km²/yr^a

^a S indicates the statistical significance in percent with which the hypothesis of zero slope is rejected.

1991–2004 (Annual)
1987–2004 (Annual)
1991–2004 (Winter)
1987–2004 (Winter)

[34] In conclusion, we state that (1) the trends in ice extent derived through different algorithms typically have differences with statistically significant trend, (2) these relative trends show a pattern consistent with the sensitivities of the algorithms to atmospheric contamination, (3) the differences of the ice area estimates show larger amplitudes consistent with the more complicated sensitivities to ice and snow surface parameters, and (4) the trend of the differences is typically between 5 and 10% of the total observed trend in extent and area.

5. Conclusions

[35] We have presented the results from an intercomparison of seven passive microwave sea ice concentration algorithms as well as a comparison to a comprehensive collection of reference data over surfaces within the Arctic perennial ice cover. While most previous work has concentrated on a few algorithms, it seems justified to consider a large ensemble of approaches since there are significant differences. Results from the comparison to ship-based observations and SAR-based concentrations are generally similar to the extent that algorithms based on 85 GHz data have better skill. Recalling that these algorithms are more sensitive to atmospheric effects, it indicates that retrieval of high ice concentrations is most affected by surface effects and limitations of the sensor, that is, smearing and sensor noise. A particularly trusted data subset allowed the separation of these contributions. This data set essentially consisted of sea ice of concentration in excess of 99%, and variations due to smearing could be neglected. The error standard deviations ranged from 2.5 to 5% before and from 2.1 to 4.7% after subtraction of the variance due to sensor noise. The variability obtained from wintertime high ice concentration areas, using an ensemble mean concentration threshold of 90%, agreed with these numbers with ranges of standard deviation generally between 3 and 5%.
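The subtraction of sensor noise variance can be reproduced with a short calculation, assuming that the noise is independent of the emissivity-related error so that variances add. The pairing used here, the lowest noise value (1.3%) with the lowest error standard deviation (2.5%) and 1.7% with 5.0%, is an assumption chosen because it reproduces the quoted range.

```python
import math


def residual_sd(total_sd, noise_sd):
    """SD remaining after removing an independent noise contribution."""
    return math.sqrt(total_sd ** 2 - noise_sd ** 2)


low = residual_sd(2.5, 1.3)   # lower end of the reported range
high = residual_sd(5.0, 1.7)  # upper end of the reported range
print(f"{low:.1f}% to {high:.1f}%")
```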

[36] For all algorithms during winter, the error standard deviations exceeded the standard deviations of the SAR derived concentrations. Consequently, very low values of the correlation between the two data sets were found. Similarly low correlations were found when comparing SSM/I ice concentration changes to openings in the ice cover derived from RGPS ice motion data. This kinematic evidence allows us to conclude that during winter and in perennial sea ice, the prevailing ice concentrations are very high and the variability in ice concentrations derived from SSM/I measurements is mainly due to variations in ice emissivity. In the high ice concentration domain, a constant ice concentration actually matches the SAR concentration better than the SSM/I derived concentrations. Importantly, at intermediate concentrations, the correlations between SAR and SSM/I concentrations are between 0.8 and 0.9, and those between ship observations and SSM/I are somewhat lower. As a consequence of the prevailing high-concentration sea ice in our data set, we are not able to accurately determine an exact concentration threshold where the correlation between SSM/I and SAR concentrations becomes insignificant.

[37] During summer, the main results showed decreases in retrieved ice concentrations, leading to bias changes of typically −5 to −10%. However, for the ship observations a larger spread in error standard deviation and correlation was found, whereas the comparison against SAR data showed more uniform values. The comparison to ship data led to error standard deviations that were lower during summer than during winter, whereas for the comparison against SAR the opposite was the case. This is mainly because of the very different routes taken during the winter and summer cruises: the winter cruise stayed largely in the marginal ice zone, whereas the summer cruise penetrated through the perennial sea ice to the North Pole.

[38] Analysis of the entire SSM/I time series showed that the differences between the algorithms have significant temporal trends affecting both ice extent and area estimates. All algorithms indicate a significant sea ice retreat; however, they differ significantly in their estimates of the rate of this retreat. For ice extent the sensitivity to atmospheric influences appears crucial: the least sensitive algorithm showed a negative trend against a moderately sensitive algorithm, whereas the most sensitive algorithm showed the largest positive trend. A possible explanation is an increase in atmospheric effects related to the increasing surface temperatures, such as reported by Comiso [2001]. This, as well as long-term changes in storm tracks, could lead to increased atmospheric effects and increased ice concentration retrievals that would counteract the negative trend in ice extent more for the most sensitive algorithms. While it is beyond the scope of the present paper, it is considered worthwhile to examine this in more detail in the future. For ice area, significant trends between algorithms are established only during winter. When summer data are included, the differences in trends are reduced. This is consistent with the observation of more uniform errors in the comparison to summertime SAR data. Interestingly, the N90 and TUD derived ice area time series consistently yield insignificant trends relative to BRI, which is in accord with the finding that these three algorithms gave the overall best statistics in comparison with both ship and SAR data.

[39] A consequence of our findings for the winter season is that applications that rely on ice concentrations within the perennial sea ice during winter should not rely exclusively on SSM/I-based observations. Since it is demonstrated that the errors pertain to uncertainties in the tie point emissivities, moderately increased resolution, such as offered by AMSR and the possible future CMIS instruments, will not significantly improve the situation. In the long term, improved understanding of the radiative processes in snow and sea ice would be beneficial and might also allow more reliable detection of important snow and sea ice parameters. In the meantime, increased use of high-resolution observations, such as SAR and to some extent VIS/IR observations, is an important supplement.


[40] This work was supported by the European Union 5th Framework Programme project IOMASA (EVK3-2001-00116) and the German Science Foundation (DFG). Special thanks are due to ice analysts Gert B. Christensen and Ib Westergaard for their indispensable efforts as well as to Thorsten Markus for making the NASA/TEAM2 software code available. We are grateful for the fruitful discussions and comments of Thorsten Markus and Ron Kwok as well as for the helpful comments of two anonymous reviewers.