Seasonal forecasts of the September 2012 Arctic sea ice thickness and extent are conducted starting from 1 June 2012. An ensemble of forecasts is made with a coupled ice-ocean model. For the first time, observations of the ice thickness are used to correct the initial ice thickness distribution to improve the initial conditions. Data from two airborne campaigns are used: NASA Operation IceBridge and SIZONet. The model was advanced through April and May using reanalysis data from 2012, and for June–September it was forced with reanalysis data from the previous seven summers. The ice extent in the corrected runs averaged lower in the Pacific sector and higher in the Atlantic sector compared to control runs with no corrections. The predicted total ice extent is 4.4 ± 0.5 M km², 0.2 M km² less than that predicted by the control runs but 0.8 M km² higher than the observed September extent.
 As activities increase in the Arctic in response to reduced summer sea ice extent and increased interest in natural resources, seasonal predictions of ice extent or ice concentration are increasingly in demand, both for the region as a whole and for specific locations. To date, most interest is in the total ice extent at the time of the annual minimum, in September. This is a focus of the SEARCH Sea Ice Outlook activity (http://www.arcus.org/search/seaiceoutlook) in which individuals can submit forecasts of the total sea ice extent for the Arctic as measured by the National Snow and Ice Data Center (NSIDC) Sea Ice Index [Fetterer et al., 2002] or for specific regions. A wide variety of methods are utilized in this exercise, from coupled atmosphere–ocean–ice numerical models, to ice–ocean-only models, to a range of statistical methods, and even to heuristic arguments or popular polls. The outlook activity begins with forecasts made in the first week of June. A major limitation of these efforts has been the lack of near-real time estimates of the ice thickness over broad areas of the Arctic that would aid in the forecast procedures.
 Here we use newly available quick look estimates of ice thickness made by two different field campaigns in late March and early April 2012 to improve the estimate of the initial conditions for forecasting ice conditions in September. A coupled ice–ocean model is used to project the ice evolution from the springtime measurements to September. The following four steps are performed to make the forecasts.
 1. The model is first initialized using historical reanalysis forcing data starting in 1948 and continuing through the end of March 2012. Ice concentration and sea surface temperatures are assimilated from January 1979 through March 2012. This provides the first guess ice thickness fields for 1 April 2012.
 2. The observations from the two field campaigns (IceBridge and SIZONet, described in section 3) are clustered in 50-km samples. While the observations span the period 14 March–9 April 2012, we consider all of the observations to have been made on 1 April. The PIOMAS first guess thickness distribution for 1 April is then corrected to match the observations with an optimal interpolation procedure.
 3. As the first forecasts are made in the first week of June, the atmospheric reanalysis data for April and May 2012 are used to advance the model to the end of May, including the assimilation of ice concentration and SST data.
 4. For the months of June–September the model is forced with the summer weather as represented in the reanalysis data from the previous seven summers. This creates an ensemble of seven members. The mean ice extent and the standard deviation provide an estimate of the September ice extent and the uncertainty. In addition the ice edge and its variability in specific regions can be examined.
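 As an illustration of step 4, the sketch below summarizes a seven-member ensemble by its mean and standard deviation. The extents are made-up placeholder values, not model output, and the choice of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev


def ensemble_summary(extent_by_forcing_year):
    """Return (mean, standard deviation) of the September extents.

    extent_by_forcing_year maps the year whose summer forcing drove a
    member to that member's September ice extent (million km^2).
    """
    values = list(extent_by_forcing_year.values())
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean(values), stdev(values)


# Hypothetical placeholder extents, one per forcing year 2005-2011.
# These are NOT results from the forecast system described in the text.
example_extents = {
    2005: 5.0, 2006: 5.0, 2007: 4.0, 2008: 4.5,
    2009: 5.0, 2010: 4.5, 2011: 4.0,
}

center, spread = ensemble_summary(example_extents)
print(f"forecast: {center:.2f} +/- {spread:.2f} million km^2")
```

 The same summary can be applied separately to the control and corrected ensembles, so that the difference in the means can be judged against the spread.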
 The forecast exercise is made both with the first guess ice thickness fields (control) and the fields corrected to match the ice thickness observations (corrected). Perhaps more interesting than the result of this single exercise are the numerous questions that this type of study raises about the role and utility of observations in improving forecasts. Some of these questions are addressed in section 6.
2. Model and Forcing Data
 The numerical ensemble seasonal forecasting system consists of the Pan-Arctic Ice–Ocean Modeling and Assimilation System (PIOMAS) [Zhang and Rothrock, 2003], the NCEP/NCAR atmospheric reanalysis forcing data, and satellite observations of ice concentration and sea surface temperature. PIOMAS is a coupled ice–ocean model that assimilates satellite sea ice concentration [Lindsay and Zhang, 2006] and sea surface temperature [Schweiger et al., 2011].
 The seasonal forecast system is based on the assumption that the current climate is not fundamentally different from the recent past. This means that reanalysis data from the recent past may capture the current climate variability and therefore may be used to drive PIOMAS for ensemble seasonal forecasts. Here, the ensemble consists of seven members, each of which uses a unique set of NCEP/NCAR atmospheric forcing fields from recent years such that ensemble member 1 uses 2005 NCEP/NCAR forcing, member 2 uses 2006 forcing, and member 7 uses 2011 forcing. Each member starts with the same initial ice and ocean conditions on 1 June 2012. One limitation to using the reanalysis data is that there is no interaction between the atmosphere and the ice conditions. The advantage is that it is very simple to use past years and thereby quickly obtain an estimate of the range of possible outcomes given the current initial ice and ocean conditions. Only the previous seven summers are used because the near-surface atmospheric properties depend strongly on the ice conditions, so using recent years with low ice extent is appropriate. More details about the ensemble prediction procedure can be found in Zhang et al.
3.1. Operation IceBridge Quick Look Data
 Sea ice thickness data from Operation IceBridge (OIB) are taken from the quick look data product available via the National Snow and Ice Data Center (http://nsidc.org/data/docs/daac/icebridge/evaluation_products/sea-ice-freeboard-snowdepth-thickness-quicklook-index.html). The OIB quick look sea ice thickness data were obtained on 12 flights of the NASA P-3B Orion aircraft between 14 March and 2 April in the western Arctic Ocean basin and the Beaufort/Chukchi sea region. The quick look data were processed in an expedited manner to support the development of seasonal sea ice prediction capabilities such as this. Due to the expedited nature of the data production process, additional uncertainties may be present in the quick look data. The full assessment of the uncertainties in the quick look data is an ongoing research project, which will be undertaken after the release of the final OIB 2012 sea ice data products [Kurtz et al., 2012].
 Sea ice thickness is inferred from measurements of the height of the sea ice and snow layers above sea level and an assumption of hydrostatic balance. The hydrostatic balance equation relating the measured sea ice freeboard ($h_f$, the height of the surface snow-plus-ice layer above the local sea surface elevation) and snow depth ($h_s$) to the sea ice thickness ($h_i$) is
$$h_i = \frac{\rho_w}{\rho_w - \rho_i}\, h_f - \frac{\rho_w - \rho_s}{\rho_w - \rho_i}\, h_s,$$
where the densities of snow, sea ice, and sea water are $\rho_s = 320$ kg m$^{-3}$, $\rho_i = 915$ kg m$^{-3}$, and $\rho_w = 1024$ kg m$^{-3}$, respectively. Uncertainties in these densities are included in the estimates of the thickness uncertainties. The OIB sea ice thickness data are provided at a spatial resolution of 40 m along the aircraft track. The uncertainty in the retrieved sea ice thickness is provided at each along-track measurement location through propagation of the uncertainties of each component of the hydrostatic balance equation.
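 The thickness retrieval and a simplified form of its uncertainty propagation can be sketched as follows. The sketch assumes the standard hydrostatic relation for a snow-covered floe, h_i = [rho_w h_f - (rho_w - rho_s) h_s] / (rho_w - rho_i), with the densities quoted above. As a simplification it propagates only the freeboard and snow depth uncertainties; the density uncertainties that the data product also includes are omitted. The function names and the sample inputs are hypothetical.

```python
import math

RHO_SNOW = 320.0    # kg m^-3, from the text
RHO_ICE = 915.0     # kg m^-3, from the text
RHO_WATER = 1024.0  # kg m^-3, from the text


def ice_thickness(h_f, h_s, rho_s=RHO_SNOW, rho_i=RHO_ICE, rho_w=RHO_WATER):
    """Ice thickness (m) from total freeboard h_f and snow depth h_s (m)."""
    return (rho_w * h_f - (rho_w - rho_s) * h_s) / (rho_w - rho_i)


def ice_thickness_sigma(sigma_f, sigma_s,
                        rho_s=RHO_SNOW, rho_i=RHO_ICE, rho_w=RHO_WATER):
    """First-order thickness uncertainty (m).

    Assumption: freeboard and snow depth errors are independent and the
    densities are treated as exact, which understates the full uncertainty.
    """
    d_hf = rho_w / (rho_w - rho_i)            # sensitivity to freeboard
    d_hs = (rho_w - rho_s) / (rho_w - rho_i)  # sensitivity to snow depth
    return math.hypot(d_hf * sigma_f, d_hs * sigma_s)


# Hypothetical sample: 0.50 m freeboard and 0.25 m of snow.
h_i = ice_thickness(0.50, 0.25)
# 0.057 m is the snow depth uncertainty quoted in the text; the 0.05 m
# freeboard uncertainty is a made-up illustrative value.
sigma_i = ice_thickness_sigma(0.05, 0.057)
print(f"h_i = {h_i:.2f} m, sigma = {sigma_i:.2f} m")
```

 Because the freeboard sensitivity rho_w / (rho_w - rho_i) is roughly 9, a few centimeters of freeboard error translate into several tens of centimeters of thickness error.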
 Laser altimetry data from the Airborne Topographic Mapper (ATM) system [Krabill, 2009] were used to determine hf. The sea surface elevation is determined at discrete locations through the measurement of surface elevation over open water and newly frozen leads. In the quick look data products, open water and newly frozen leads were identified using surface temperature results from a KT19 infrared pyrometer to retrieve the open water fraction within the viewing area. The sea surface height was then constructed along each flight line by subtracting out known sea surface height parameters (including the geoid, tidal, and atmospheric pressure induced fluctuations) and using an ordinary Kriging approach to interpolate between the discrete sea surface height observations and each measurement location. Uncertainties in the sea surface height are determined from the Kriging error in the interpolation scheme. Due to the irregular spacing of lead observations the freeboard uncertainty is highly variable along each flight track.
 Snow depth is determined from the University of Kansas' snow radar system [Leuschen, 2010; Panzer et al., 2010] and is retrieved through the identification of the air–snow and snow–ice interfaces and determination of the distance between them. The air–snow and snow–ice interfaces are identified following the method described in Kurtz and Farrell, with an update described in the OIB data products manual [Kurtz et al., 2012] to account for the lack of radiometric calibration of the radar data. The snow depth is then calculated by differencing the air–snow and snow–ice interfaces in the time domain and multiplying this difference by the speed of light within the snow pack. Following the results described in Farrell et al., the uncertainty in the snow depth is here estimated to be 5.7 cm. Further refinement of the snow depth uncertainty is expected through the comparison with coincident in situ data collected in 2011 and 2012.
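 The time-domain conversion can be sketched as below. Two points are assumptions that the text does not state: the delay between the interfaces is taken to be a two-way travel time (hence the factor of one half), and the refractive index of snow is taken from a commonly used empirical density relation, n = (1 + 0.51 rho)^1.5 with rho in g cm^-3. The function names are hypothetical.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def snow_refractive_index(rho_snow_kg_m3):
    """Empirical refractive index of dry snow (assumed relation)."""
    rho_g_cm3 = rho_snow_kg_m3 / 1000.0
    return (1.0 + 0.51 * rho_g_cm3) ** 1.5


def snow_depth_from_delay(two_way_delay_s, rho_snow_kg_m3=320.0):
    """Snow depth (m) from the delay between the two interface returns.

    Assumption: the delay is a two-way travel time, so the one-way path
    through the snow is half the distance travelled.
    """
    speed_in_snow = C_VACUUM / snow_refractive_index(rho_snow_kg_m3)
    return 0.5 * speed_in_snow * two_way_delay_s


# Hypothetical delay of 2 ns between the air-snow and snow-ice returns.
depth = snow_depth_from_delay(2.0e-9)
print(f"snow depth = {depth * 100:.1f} cm")
```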
3.2. Airborne Electromagnetic Induction Snow and Ice Thickness From SIZONet
 Measurements of total ice-plus-snow thickness were made with an Airborne ElectroMagnetic (AEM) induction sounding system [Eicken et al., 2007]. The surveys were part of the Seasonal Ice Zone Observation Network (SIZONet). Three helicopter flights were conducted from Barrow, Alaska, between 7 and 9 April. Each flight resulted in approximately 220 km of profile data. Data from similar flights exist from every spring since April 2007 and other flights have been conducted in many regions in the Arctic and Antarctic (see the Unified Sea Ice Thickness Climate Data Record; psc.apl.uw.edu/sea_ice_cdr [Lindsay, 2010]).
 Measurements were made with an EM sensor suspended 20 m below a helicopter and towed at an altitude of 10 to 15 m above the surface. The retrieval method is based on the contrast of electrical conductivity between sea ice and ocean. Electromagnetic fields are used to determine the range from the instrument to the ice–water interface and a laser altimeter is used to range to the snow surface; the difference of the two distances gives the ice-plus-snow thickness. This approach is based on a 1D representation of sea ice and a full description is given in Haas et al. The instrument can be operated from helicopters or fixed-wing aircraft for long-range surveys [Haas et al., 2010].
 When comparing AEM sea ice thickness with other products, two properties must be considered: AEM thicknesses always include snow depth and the footprint of the EM ranging is approximately 40–50 m. The consequence of the first point is that either snow depth has to be added to other sea ice thickness products or removed from AEM thickness estimates. The second point results in footprint smoothing of the ice thickness profile. Deformed sea ice features, such as ridges, are underestimated in maximum thickness and overestimated in width. It is assumed that the footprint effect has little impact on larger scale mean AEM thickness estimates, which is backed by inter-comparisons of thickness products from different methods [Schweiger et al., 2011].
 The airborne EM sensor used in the SIZONet 2012 field campaign is designed to overcome the footprint limitation. MAiSIE, the Multi-Sensor Sea Ice Explorer, features an enhanced EM concept to allow a geophysical inversion for sub-footprint-scale sea ice thickness [Pfaffhuber et al., 2012]. To meet the time constraints for a seasonal outlook in 2012, the AEM data for this study were processed with the traditional 1D processing and released shortly after the end of the field campaign. Mostly first-year sea ice was found with intermittent multi-year sea ice floes. The typical thickness of level first-year ice, represented by the maximum of the ice thickness distribution, was calculated to be 2.0 m.
 The AEM measurements were corrected to ice thickness by subtracting an estimate of the snow depth determined from the OIB snow depth measurements. All of the OIB mean ice thickness and snow depth measurements were used to determine a linear relationship between the ice-plus-snow depth, $h_{i+s}$, and the snow depth alone, $h_s$.
This relationship was then used to determine the snow depth for each AEM point measurement before the clustering procedure.
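 The snow correction can be sketched as a least-squares line fitted to paired OIB values and then applied to each AEM point. The paired values below are invented for illustration, so the resulting slope and intercept are not the coefficients used in the study; the function names are hypothetical as well.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def aem_ice_thickness(em_range, laser_range, slope, intercept):
    """Ice-only thickness (m) for one AEM point.

    em_range: distance from the sensor to the ice-water interface (m).
    laser_range: distance from the sensor to the snow surface (m).
    """
    ice_plus_snow = em_range - laser_range
    snow_depth = slope * ice_plus_snow + intercept
    return ice_plus_snow - snow_depth


# Hypothetical OIB pairs: ice-plus-snow depth (m) and snow depth (m).
oib_ice_plus_snow = [1.5, 2.0, 2.5, 3.0, 3.5]
oib_snow_depth = [0.20, 0.25, 0.30, 0.35, 0.40]
slope, intercept = fit_line(oib_ice_plus_snow, oib_snow_depth)

# Hypothetical AEM point: sensor 14.0 m above the ice-water interface
# and 12.0 m above the snow surface.
thickness = aem_ice_thickness(14.0, 12.0, slope, intercept)
print(f"ice thickness = {thickness:.2f} m")
```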
 The point data from each of the campaigns were grouped into 50-km clusters, independent of any grid, to determine the local ice thickness distribution. Each campaign was clustered independently, but all flights from each campaign were clustered together regardless of the date flown. All points within a 50-km circle were used to form the thickness distribution for the cluster using 10-cm bins. The centroid location of the observations was retained. The locations of the circles were chosen to minimize the number of clusters and maximize the number of points in each cluster.
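 Forming the thickness distribution for one cluster can be sketched as follows. The sketch takes the circle center as given; the procedure used to place the circles is not reproduced here. It also assumes that "50-km circle" refers to the diameter (a 25-km radius), which the text does not specify, and it uses a simple average of latitude and longitude for the centroid, which is only adequate for small clusters away from the pole and the date line.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cluster_distribution(points, center, radius_km=25.0, bin_width_m=0.10):
    """Thickness distribution of the points inside one circle.

    points: iterable of (lat, lon, thickness_m).
    Returns None for an empty cluster, else a dict with the point count,
    the centroid, and the area fraction per 10-cm bin (keyed by bin index).
    """
    members = [p for p in points
               if great_circle_km(p[0], p[1], center[0], center[1]) <= radius_km]
    if not members:
        return None
    counts = {}
    for _, _, h in members:
        # Small offset guards against round-off at exact bin edges.
        k = math.floor(h / bin_width_m + 1e-9)
        counts[k] = counts.get(k, 0) + 1
    n = len(members)
    centroid = (sum(p[0] for p in members) / n,
                sum(p[1] for p in members) / n)
    fractions = {k: c / n for k, c in sorted(counts.items())}
    return {"n": n, "centroid": centroid, "fractions": fractions}


# Hypothetical points; the last one lies far outside the circle.
sample_points = [
    (75.00, -150.0, 1.23),
    (75.10, -150.2, 1.27),
    (75.05, -149.9, 2.05),
    (78.00, -150.0, 3.00),
]
cluster = cluster_distribution(sample_points, center=(75.0, -150.0))
print(cluster)
```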
 For the OIB measurements, only points that had an estimated uncertainty of m were included, where h is the mean thickness measurement and is the reported uncertainty. For unbiased errors the uncertainty of the mean ice thickness for the 50-km clusters is quite small because there were about 1000 point measurements in each cluster. However, a significant unknown bias may exist for the cluster means.
3.4. Model Comparisons
 The location of each cluster is plotted on a map of the mean ice thickness from the PIOMAS first guess field for 1 April 2012 in Figure 1a. We see that the model consistently underestimates the thickness of thick ice near the pole and overestimates the thickness of thinner ice in the Beaufort and Chukchi seas. This bias pattern in the PIOMAS model thickness has been observed previously when the model thickness was compared to submarine ice draft measurements [Zhang and Rothrock, 2005; Lindsay and Zhang, 2006; Schweiger et al., 2011]. We have not yet established the ultimate source of this model bias.
4. Correcting the Model Ice Thickness
 A simple optimal interpolation (OI, or Kriging) procedure is used to merge the model estimates of the mean ice thickness and the collocated observed mean thicknesses and thickness distributions. First, the difference between the observations and the first guess model estimates is determined for each observation location:
$$\Delta h = h_{obs} - h_{mod}.$$
The difference at each location is shown in Figure 1b. The difference is then interpolated to the locations of all model grid points using three parameters: the uncertainty in the observations εobs = 0.5 m [Kurtz et al., 2012], the uncertainty in the model estimates εmod = 1.0 m [Schweiger et al., 2011], and a correlation length scale for the model errors, Lerr = 500 km. The length scale for the model errors is not well known. The interpolated correction field was then added to the first guess model estimate of the mean ice thickness for 1 April 2012 to provide a revised estimate of the ice thickness. The interpolated difference field is also shown in Figure 1b. The revised model mean ice thickness has no bias with respect to the observations and a correlation of R = 0.88 (N = 214).
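 A minimal sketch of such an optimal interpolation, using the three parameters above, is given below. Several choices are assumptions that the text does not specify: a Gaussian form for the error correlation, uncorrelated observation errors, and flat x-y coordinates in kilometers. The small linear solver is included only to keep the example self-contained.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def correlation(p, q, l_err):
    """Assumed Gaussian correlation of model errors between two points."""
    return math.exp(-0.5 * (math.dist(p, q) / l_err) ** 2)


def oi_correction(obs_xy, obs_minus_model, grid_xy,
                  eps_obs=0.5, eps_mod=1.0, l_err=500.0):
    """Interpolate observation-minus-model differences to grid points.

    obs_xy, grid_xy: lists of (x_km, y_km); obs_minus_model: list of m.
    Returns the correction (m) to add to the first guess at each grid point.
    """
    n = len(obs_xy)
    # Model error covariance between observation sites, plus
    # uncorrelated observation error on the diagonal.
    cov = [[eps_mod ** 2 * correlation(obs_xy[i], obs_xy[j], l_err)
            + (eps_obs ** 2 if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    weights = solve(cov, list(obs_minus_model))
    return [sum(eps_mod ** 2 * correlation(g, obs_xy[j], l_err) * weights[j]
                for j in range(n)) for g in grid_xy]


# Hypothetical case: one observation 1 m thinner than the model.
correction = oi_correction([(0.0, 0.0)], [-1.0],
                           [(0.0, 0.0), (500.0, 0.0), (5000.0, 0.0)])
print([round(c, 3) for c in correction])
```

 With an isolated observation, the correction at the observation site is the difference scaled by εmod² / (εmod² + εobs²), and it decays toward zero over a few length scales.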
 The model, however, does not use the mean ice thickness as a state parameter, but instead uses a 12-bin thickness distribution to characterize the ice thickness. The model thickness distribution is modified in the same manner as the mean thickness. For each observation cluster the observed thickness distribution is divided into the same 12 bins that the model uses. The observed minus model area fraction difference is obtained for each bin independently at each observation location and the difference is then interpolated to the model grid locations using OI. The errors in the area fractions for the individual bins are not known, but we have again used an error for the model that is twice that of the observations, i.e., εobs = 0.05, εmod = 0.1. (It is the ratio of the errors that is important for the OI procedure.) The interpolated difference in the area fraction of each bin is then added to the model first guess to obtain the new model initialization. Because the weighting of the model and the observations is the same for all bins at each location, the distribution remains normalized. In examining the correction fields for each of the bins it is apparent that area is removed from some bins and added to others in regions where the observations differ from the model distributions. For example, where the model is too thick in the mean, area is removed from the thicker bins and added to thinner bins. The net result is a change in the mean thickness very similar to what we calculated above using the mean thickness field.
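 The normalization argument can be checked with a short sketch for a single location with one isolated observation, where the OI weight reduces to εmod² / (εmod² + εobs²). The 12 bin centers and both distributions below are invented for illustration and are not the model's actual thickness categories.

```python
def oi_weight(eps_obs, eps_mod):
    """Weight given to an isolated observation at its own location."""
    return eps_mod ** 2 / (eps_mod ** 2 + eps_obs ** 2)


def correct_distribution(model_frac, obs_frac, weight):
    """Add the weighted observed-minus-model difference to each bin."""
    return [m + weight * (o - m) for m, o in zip(model_frac, obs_frac)]


def mean_thickness(fractions, bin_centers):
    """Mean thickness (m) implied by an area-fraction distribution."""
    return sum(f * c for f, c in zip(fractions, bin_centers))


# Hypothetical bin centers (m) and area fractions; each list sums to 1.
BIN_CENTERS = [0.0, 0.25, 0.75, 1.25, 1.75, 2.5,
               3.5, 4.5, 6.0, 8.0, 10.0, 14.0]
model = [0.02, 0.03, 0.05, 0.10, 0.20, 0.25,
         0.15, 0.10, 0.05, 0.03, 0.01, 0.01]
observed = [0.02, 0.05, 0.10, 0.20, 0.30, 0.18,
            0.08, 0.04, 0.02, 0.01, 0.00, 0.00]

# Only the ratio of the errors matters: both pairs give the same weight.
w = oi_weight(eps_obs=0.05, eps_mod=0.10)
w_mean_field = oi_weight(eps_obs=0.5, eps_mod=1.0)

corrected = correct_distribution(model, observed, w)
print(f"weight = {w:.2f}, sum of corrected fractions = {sum(corrected):.6f}")
print(f"mean thickness: model {mean_thickness(model, BIN_CENTERS):.2f} m, "
      f"corrected {mean_thickness(corrected, BIN_CENTERS):.2f} m")
```

 Since both input distributions sum to one, their bin-wise differences sum to zero, so any common weight leaves the corrected distribution normalized.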
5. Ensemble Forecasts
 The forecasts in this exercise are made starting from the first of June to conform to the Sea Ice Outlook project guidelines. The NCEP reanalysis data for the months of April and May 2012 are used to force the model to obtain an estimate of the ice conditions for the first of June. This is a two-month hindcast starting with the revised initial conditions for the ice thickness distribution on 1 April. Control runs are also made with the original first guess ice conditions from 1 April. The forecast ice conditions are thus estimated for the months of June–September with two seven-member ensembles, one ensemble for the control and one for the corrected or initialized ice conditions.
Figure 1c shows the mean thickness difference between the corrected and the control runs forecast for September and the ice extent lines for the seven corrected runs. The ensemble mean ice extent is significantly lower in the corrected runs in the region north of the East Siberian Sea. This reflects the reduced ice thickness in the Chukchi Sea in the corrected runs, which has migrated to the west. We also see a large amount of variability in the ensemble runs in this region so that our confidence in the ice extent forecast here is low. In the Beaufort Sea, where we had abundant ice thickness observations, there is only a very small reduction in the estimated ice extent in the corrected runs and a small reduction in the ice thickness. Near the Barents Sea and in Fram Strait the initialized forecast shows an extent greater than the control run. This reflects the increased ice thickness of the initialized run in the region near the pole; this anomaly migrated closer to Svalbard in September.
 The observed September 2012 mean ice extent is also shown in Figure 1c. The predicted extent from both the control run and the corrected run is generally lower than what actually occurred. The differences are greatest in the Pacific sector, though there is also a significant overestimation of the extent in both ensembles near the Barents Sea and an underestimation in Fram Strait. The largest difference between the initialized and the control runs is near the East Siberian Sea where the thinning for the initialized runs is largest and where the mean of initialized runs nearly matches the observed extent. Here the forcing from 2007 for the initialized runs produced an ice edge even farther north than what was observed. In the Pacific sector the mean ice edge of the initialized runs is closer to the observed edge than that of the control runs because of the thinning imposed by the observations.
 The net effect of the reduced extent on the Pacific side in the initialized run compared to the control run and increased extent on the European side is that the forecast of the total ice extent in the Arctic is similar in the initialized and control runs. Figure 2 shows the time evolution of the computed total ice extent for the entire Arctic and the differences for each pair of runs (control minus corrected). The observed total extent from the Sea Ice Index [Fetterer et al., 2002] is also shown for the summer months. The ensemble median of the initialized forecasts is 4.4 ± 0.5 M km², i.e., 0.2 M km² less than the control forecast of 4.6 ± 0.5 M km². Although the median of the initialized forecasts was less than that of the control forecast, it was still 0.8 ± 0.5 M km² higher than the actual observed September mean extent of 3.6 M km².
 This first exercise in initialized sea ice seasonal prediction naturally raises as many questions as it answers. Many of the procedures are admittedly ad hoc and point to where further improvements should be made.
 An extraordinary effort was made by the field teams to provide the near-real time quick look data products used in the forecasts. With repeated campaigns the effort will likely be smoother. The timing of the campaigns in late March and early April is not ideal for predicting the end-of-summer ice conditions from the first of June, but the ability of a model to use the information from early in the season makes the observations useful in correcting the model estimates of the initial ice thickness. The observations are also spatially limited, with no information about the model error on the Siberian side of the basin where there is significant variability in the ensemble members.
 The length scale used to interpolate the sparse observations to the model grid is another area of uncertainty. We used a length scale of 500 km because the correlation length scale of the model ice thickness is on the order of 500 km or more. However, there is a great deal of small-scale variability in the difference between the observations and the model. Because the model is used to project forward six months from the observations, we thought it best to smooth the differences and extend the observed differences beyond the immediate vicinity in which they were obtained. This smoothing and extrapolation is accomplished through the optimal interpolation procedure. If a 250-km length scale is used, the corrections to the initial ice thickness are more localized to the positions of the observations. The net effect on the predicted ice edge in this case compared to one using a 500-km length scale is very small in most areas, but in the area north of the Laptev Sea, where the uncertainty is large, the mean differences are large.
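 The effect of the length scale on how far a correction spreads can be illustrated with a short sketch. It assumes a Gaussian error correlation, which is an illustrative choice; the text does not state the functional form used.

```python
import math


def gaussian_correlation(distance_km, length_scale_km):
    """Assumed Gaussian error correlation at a given separation."""
    return math.exp(-0.5 * (distance_km / length_scale_km) ** 2)


# Fraction of the at-site correlation remaining at several distances.
for distance in (0.0, 250.0, 500.0, 1000.0):
    wide = gaussian_correlation(distance, 500.0)
    narrow = gaussian_correlation(distance, 250.0)
    print(f"{distance:6.0f} km: L=500 km -> {wide:.3f}, "
          f"L=250 km -> {narrow:.3f}")
```

 Under this assumption the 250-km scale retains far less influence at a distance of 500 km than the 500-km scale does, which is the sense in which the corrections become more localized.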
 An additional issue for initialized forecasts is model bias. The model may have a bias in the ice extent, either regionally or for the Arctic as a whole. We saw above that there is a regional bias in the model ice thickness. A related issue is the time interval over which the model bias develops. If the bias develops quickly, the initialization may not be able to correct the forecast, whereas if the bias develops over a number of years the initialization will potentially improve the forecast. We currently have little information as to how quickly the model bias develops, but the map in Figure 1c shows that the anomaly introduced by the observations is significantly smoothed and reduced in magnitude in just six months.
 Why were the September forecast ice extents too high? Four-month forecasts depend heavily on the nature of the uncertain forcing fields. One consideration is that the ensemble of forcing years taken from the recent past is inherently conservative, given the upward trend in air temperatures and downward trend in ice extent. A second consideration is that the atmospheric forcing data are not interactive with the forecast ice cover, so the thin or reduced ice is not allowed to influence near-surface air temperatures. Finally, it may be that some aspects of the weather in the summer of 2012, such as the large storm that passed through the region in August, created conditions particularly conducive to a large melting event. The proximate and ultimate causes of the record-low ice extent observed in 2012 remain active research questions.
 Poor verification of this single forecast tells us little about eventual possible improvements in the accuracy of the forecast using observations because of the large uncertainty in the projected ice extent. In most locations the difference in the mean ice extent between the corrected and control runs is small compared to the variability seen in the ensembles. However, this first attempt to use ice thickness observations to improve forecasts shows that observations closer to the forecast time for forecasts made over shorter time intervals may have more of an impact. This forecast season also demonstrates that much of the uncertainty in long-range forecasts is likely not due to uncertain initial ice conditions but to uncertain summer weather.
 This study was conducted with support from the National Science Foundation Office of Polar Programs, the National Aeronautics and Space Administration Cryosphere Program and Operation IceBridge, the Office of Naval Research, and the Center for Remote Sensing of Ice Sheets.
 The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.