Six Arctic Ocean Model Intercomparison Project model simulations are compared with estimates of sea ice thickness derived from pan-Arctic satellite freeboard measurements (2004–2008); airborne electromagnetic measurements (2001–2009); ice draft data from moored instruments in Fram Strait, the Greenland Sea, and the Beaufort Sea (1992–2008) and from submarines (1975–2000); and drill hole data from the Arctic basin, Laptev, and East Siberian marginal seas (1982–1986) and coastal stations (1998–2009). Despite an assessment of six models that differ in numerical methods, resolution, domain, forcing, and boundary conditions, the models generally overestimate the thickness of measured ice thinner than ∼2 m and underestimate the thickness of ice measured thicker than about ∼2 m. In the regions of flat immobile landfast ice (shallow Siberian Seas with depths less than 25–30 m), the models generally overestimate both the total observed sea ice thickness and rates of September and October ice growth from observations by more than 4 times and more than one standard deviation, respectively. The models do not reproduce conditions of fast ice formation and growth. Instead, the modeled fast ice is replaced with pack ice which drifts, generating ridges of increasing ice thickness, in addition to thermodynamic ice growth. Considering all observational data sets, the better correlations and smaller differences from observations are from the Estimating the Circulation and Climate of the Ocean, Phase II and Pan-Arctic Ice Ocean Modeling and Assimilation System models.
 Dramatic decreases in Arctic sea ice are predicted by some climate models to the degree that multiyear ice may be lost during this century. Critical to the accuracy and reliability of high-latitude climate forecasts is a better understanding of sea ice dynamics and thermodynamics through the proper simulation of sea ice and its responses to atmospheric forcing across a range of temporal and spatial scales. Assessment of model performance regarding sea ice would include, at least, comparisons with observations of the interrelated sea ice characteristics of motion, strain, deformation, concentration, age, and thickness. Evaluation of modeled sea ice behavior, however, is limited by incomplete observational data across the scales that characterize sea ice growth, melt, motion, and deformation.
 With the beginning of the satellite record in the late 1970s, sea ice concentration became widely available as a product derived from passive microwave brightness temperatures [Gloersen et al., 1992]. However, estimating sea ice thickness remotely is not straightforward although procedures for estimating thickness as well as velocity from the satellite record have been developed [Laxon et al., 2003; Kwok et al., 2004]. Thickness is important to probability estimates of sea ice survival over the melt season [Untersteiner, 1961] and its distribution appears to be undergoing rapid changes [Wadhams, 1990; Rothrock et al., 1999; Wadhams and Davis, 2000].
 The focus of this paper is the ability of six coupled Arctic Ocean Model Intercomparison Project (AOMIP) models to simulate sea ice thickness and to identify trends and differences among the AOMIP model ice thickness results by comparing them with the broad range of derived sea ice thickness data that is now available. The observational data include (1) gridded ice thickness derived from the ICESat satellite for 10 campaigns from fall and winter 2004 to 2008; (2) ice thickness transect data from electromagnetic airborne measurements (2001–2009); (3) ice draft from 24 moored instruments equipped with upward looking sonars (ULS) and ice profiling sonars (IPS) from 1992 to 2008 from the Beaufort Sea, Fram Strait, and the Greenland Sea; (4) ice draft from submarines equipped with upward looking sonar (1975–2000); (5) ice thickness at drill holes through sea ice from 187 sites taken in spring from 1982 to 1986 across the Siberian marginal seas; and (6) fast ice thickness from 51 Russian coastal stations (1998–2009).
 While one advantage of using model results can be the availability of ice freeboard, draft, snow depth, and similar variables to be compared directly with observations, our original AOMIP experiment focused on ice thickness. With six models and six data sets, the need to simplify our approach became apparent. All data were converted to thickness as described below. A range of thickness measurements is important to assess model performance. When and where ice is thin and/or in low concentration there is potential for thermodynamic ice growth and/or high speed drift.
 Because of differences among model forcing and parameterizations, our goal is to identify differences between modeled and observed sea ice thickness as a foundation for model improvement. However, the complexity of isolating specific model attributes from among the full suite of parameterizations, forcing, and boundary conditions is beyond the scope of this paper. Indeed, Kim et al.  have shown that model parameters need to be varied simultaneously to tune models to optimal values and identifying the most important parameters is “not trivial.”
2. Summary of Previous Work
Bourke and Garrett  first reported on the mean ice thickness distribution in the Arctic Ocean from data taken between 1960 and 1982. Rothrock et al.  showed that the mean ice draft in most of the central portion of the Arctic Ocean had declined from 3.1 m in 1958–1976 to 1.8 m in the 1990s (a 40% decrease). The submarine ice draft data in the data release area (DRA) were fit with multiple linear regression expressions of location, time and season by Rothrock et al.  for the period 1975–2000. They found the annual mean ice draft declined from a peak of 3.42 m in 1980 to a minimum of 2.29 m in 2000. ICESat ice thickness estimates for 2003–2008 for the same area of the Arctic Ocean as represented by the regression equations show a continued decline to less than 1.0 m in the DRA in the fall of 2007. Wadhams and Davis  found a 43% decline in the ice draft at the pole from 1976 to 1996. Winsor , however, found no trend in six cruises between the pole and the Beaufort Sea from the 1990s. Airborne electromagnetic (EM) surveys by Haas et al.  showed a thinning of 20% in the region of the North Pole between 1991 and 2004, with a sharp drop to only 0.9 m in the summer of 2007 related to the replacement of old ice by first-year ice.
 Direct comparison of model results to observed sea ice thickness has been limited because pan-Arctic sea ice thickness data were not widely available at useful resolutions. The lack of observational data was carefully circumvented by Gerdes and Köberle  who compared results from several IPCC modeled outputs against sea ice thickness from a hindcast model (AWI1) positively evaluated against other AOMIP models. They concluded that differences among the IPCC models were likely due to the different effective wind stress forcing and the coupling methodologies with the ocean, a conclusion consistent with studies showing that atmospheric forcing fields essentially drive the results of sea ice simulations [Walsh and Crane, 1992; Bitz et al., 2002; Hunke and Holland, 2007] more so than the details of the sea ice model itself [Flato et al., 2004].
 Recently however, several studies have compared model results with available sea ice thickness data sets to calibrate, validate and improve sea ice representation in the regional and global models. For example, Koldunov et al.  analyzed Arctic sea ice parameters simulated by the fully coupled climate model ECHAM5/Max Planck Institute for Meteorology Hamburg Primitive Equation Ocean Model (MPI-OM) for the period from 1980 to 1999 and compared them with observations collected during field programs and by satellites. It was found that the major biases in sea ice thickness arise from errors in the ECHAM5/MPI-OM atmosphere and associated errors in surface forcing fields (especially wind stress). In contrast, the identical coupled ocean–ice model, when driven by NCEP–NCAR reanalysis fields showed increased skill in its ice and ocean circulation parameters. However, common to both model runs was too strong ice export through the Fram Strait and a substantially biased heat content in the interior of the Arctic Ocean.
 An individual model evaluation using sea ice thickness observational data is a paper by Schweiger et al.  (this section). Practically all available sources of sea ice thickness were used for comparison with Pan-Arctic Ice-Ocean Modeling and Assimilation System (PIOMAS) simulated results. It was concluded that in general PIOMAS relative to observations, appears to overestimate the thickness of thin ice and underestimate the thickness of thick ice. In this paper we are able to compare PIOMAS output with other models and to evaluate them based on several data sets including coastal station data and drill hole data.
 For sea ice concentration, satellite-derived values were compared with several AOMIP models to show that they reproduced wintertime observations reasonably well when ice concentration was near 100% but underestimated the September ice concentration minimum [Johnson et al., 2007]. The variability among model results exceeded the variability among four satellite-derived observational data sets suggesting the need to further constrain model performance and reduce sensitivity to prescribed forcing.
 Assessment of the ice age-thickness relationship using model results shows that for Northern Hemisphere–wide averages the notion of thicker ice being older is reasonable at decadal scales, but for specific years and at scales less than hundreds of kilometers, ice age is not a good proxy for ice thickness [Hunke and Bitz, 2009]. At interannual timescales, the Northern Hemisphere–averaged ice age is not well correlated with any of the three common ice descriptors: thickness, area or volume.
 This paper is organized as follows. Sections 3 and 4 describe the different AOMIP model characteristics and the data sets against which they are compared. Section 5 on the data biases follows. The methods used to prepare the model and observational data are then discussed in section 6. Next we compare model results and data using Taylor diagrams modified to retain units of ice thickness residuals and model-data correlations (section 7). Linear regressions between the model and observed data are also presented and analyzed (section 8). This is followed by a comparison of modeled ice growth rates to empirically derived thermodynamic ice growth rates and observed sea ice growth rates taken from select coastal stations (section 9). A discussion and summary are last in sections 10 and 11.
 The AOMIP project and its models are described by Holloway et al.  with considerable detail to be found on the AOMIP Web site (http://www.whoi.edu/projects/AOMIP/). The six models used in this paper are from the Jet Propulsion Laboratory (Estimating the Circulation and Climate of the Ocean, Phase II (ECCO2)), Goddard Space Flight Center (GSFC), Institute of Numerical Mathematics Ocean Model (INMOM) Russian Academy of Science, the Naval Postgraduate School Arctic Modeling Effort (NAME), the National Oceanography Centre Southampton (ORCA), and the PIOMAS of the University of Washington (hereinafter UW model). Specific sea ice parameters for these models are shown in Table 1 and discussed below.
Table 1. Model Configuration and Selected Parametersa
See AOMIP Web site for additional details (http://www.whoi.edu/projects/AOMIP/). MY2.5, Mellor-Yamada level 2.5 turbulence closure model; KPP, K profile parameterization; TKE, turbulent kinetic energy; TVD, total variation diminishing; EEN, energy-enstrophy conserving scheme; MPDATA, see Smolarkiewicz and Margolin ; BL, boundary layer.
dynamics: 2 (1 for open ocean, 1 for mean grid cell ice), thermodynamics: 7 categories for ice
2: mean grid cell ice thickness or open water
centered momentum, upwind, A + D
centered second order
Prather, second-order, second moment conserving
teardrop plastic rheology, line successive relaxation solver
cold snow: −0.85, melting snow: 0.78
0.7–0.8 (surface temperature dependent)
0.5–0.65 (clear sky, snow thickness dependent)
0.1–0.65 (ice thickness dependent)
0.1–0.72 (clear sky, ice thickness dependent)
0.1–0.575 (ice thickness and surface temperature dependent)
01.–0.5 (clear sky, ice thickness dependent)
Surface momentum exchange coefficients
1.4 × 10−3
1.14 × 10−3
2.75 × 10−3
1.63 × 10−3
1.1 × 10−3
5.4 × 10−3
5.5 × 10−3
5.0 × 10−3
5.5 × 10−3
Cw = 0.0055
 The Arctic domain of ECCO2 uses a regional configuration of the Massachusetts Institute of Technology general circulation model (MITgcm) [Marshall et al., 1997; Losch et al., 2010; Nguyen et al., 2011]. The domain has southern boundaries at ∼55°N in the Atlantic and Pacific sectors. The grid is locally orthogonal with horizontal grid spacing of approximately 18 km. There are 50 vertical levels ranging in thickness from 10 m near the surface to approximately 450 m at a maximum model depth of 6150 m. The model employs the rescaled vertical coordinate “z*” of Adcroft and Campin  and the partial cell formulation of Adcroft et al.  which permits accurate representation of the bathymetry. Bathymetry is from the S2004 (W. Smith, unpublished manuscript, 2004) blend of the Smith and Sandwell  and the General Bathymetric Charts of the Oceans (GEBCO) 1 arc min bathymetric grid. The nonlinear equation of state of Jackett and McDougall  is used. Vertical mixing follows Large et al. . A seventh-order monotonicity-preserving advection scheme [Daru and Tenaud, 2004] is employed and there is no explicit horizontal diffusivity. Horizontal viscosity follows Leith  but is modified to sense the divergent flow [Fox-Kemper and Menemenlis, 2008].
 The ocean model is coupled to the MITgcm sea ice model described by Losch et al. . Ice mechanics follow a viscous-plastic rheology and the ice momentum equations are solved numerically using the line successive overrelaxation (LSOR) solver of Zhang and Hibler . Ice thermodynamics are represented using a zero heat capacity formulation and seven thickness categories. Salt rejection during sea-ice formation is explicitly treated with a subgrid salt plume parameterization [Nguyen et al., 2009]. The model includes prognostic variables for snow thickness and for sea ice salinity. Boundary conditions are monthly and taken from the global optimized ECCO2 solution [Menemenlis et al., 2008]. Initial conditions are from the World Ocean Atlas 2005 [Antonov et al., 2006; Locarnini et al., 2006]. Atmospheric boundary conditions are from the Japanese 25 year Reanalysis Project (JRA25) [Onogi et al., 2007]. The integration period is from 1992 to 2008. A comprehensive assessment of the solution used in this study is given by Nguyen et al.  where the model parameters are optimized from 1992 to 2004 using ice thickness data from submarine and mooring ULS, sea ice concentration and velocity, and ocean hydrography.
 The GSFC model is based on the generalized Princeton Ocean Model (POM) which can accommodate σ coordinates (the original POM), but also z levels and a mixture of σ and z levels, as the vertical coordinate [Blumberg and Mellor, 1987; Mellor et al., 2002]. The results presented here are from a version which uses only z levels. The model domain covers the Arctic Ocean and the North Atlantic and extending to 15°S, with a horizontal resolution of 0.35–0.45°. Vertical resolution is 26 levels ranging from 6 to 500 m layer depths. Transport at the open boundaries is defined by an inflow of 0.8 Sv through Bering Strait, which equals the amount that exits through the model's southern boundary at approximately 15°S. Monthly T and S are restored at the open boundary buffer zones, but no other restoring is used in the GSFC model.
 The vertical mixing coefficients are determined from 2.5-layer turbulence closure [Mellor and Yamada, 1974] which requires computation of the kinetic energy and kinetic energy times mixing length as additional prognostic quantities. The ocean model is coupled to a two-layer dynamic-thermodynamic snow-ice model where sea ice is described as a generalized viscous medium [Mellor and Kantha, 1989; Häkkinen and Mellor, 1992; Häkkinen and Geiger, 2000]. Ice-ocean momentum, heat and salt exchange is described by a flow over a rough surface based on the theory of Yaglom and Kader . The solar radiation can penetrate below the ocean surface to distribute shortwave solar heating. P-E is from Rasmusson and Mo  and the Sellers formula as used by Parkinson and Washington  for shortwave radiation. The model uses NCEP wind stress with results from a cold start at January 1948 using daily NCEP Reanalysis data.
 The INMOM is a “terrain-following” σ coordinate ocean model [Moshonkin et al., 2011]. The global version of the INMOM with low spatial resolution is used as the oceanic component of the IPCC climate model INMOM [Volodin et al., 2010] presented by the Intergovernmental Panel on Climate Change . The present version of the model covers the North Atlantic (open boundary at approximately 20°S), Arctic Ocean, and Bering Sea regions including the Mediterranean and Black Seas. A rotation of the model grid avoids the problem of converging meridians over the Arctic Ocean. The model North Pole is located at the geographical equator, 120°W. The 1/4° horizontal eddy-permitting resolution is used. There are 27 unevenly spaced vertical σ levels. A Laplacian operator along the geopotential surface is used for the lateral diffusion on the tracers and a bi-Laplacian operator along σ surface is used for the lateral viscosity on momentum. The vertical viscosity and diffusion coefficients are calculated by Monin-Obukhov-Kochergin [Kochergin, 1987] parameterization. The elastic-viscous-plastic dynamic-thermodynamic sea ice model [Hunke, 2001; Yakovlev, 2009] is coupled to the ocean model. Surface forcing is from the CORE forcing data set [Large and Yeager, 2004]. The surface turbulent fluxes are calculated using the bulk formulae. A climatological monthly runoff from CORE is applied along the coasts. Surface salinity is restored toward monthly climatology with a relaxation scale of approximately 12 days both for the open ocean and under sea ice. Temperature and salinity restoring toward monthly climatology is used at the open boundaries.
 The ORCA model is a global z level OGCM based on the NEMO ocean code [Madec, 2006] and uses the global tripolar ORCA grid at 1/4° horizontal resolution. The effective resolution is ∼27.75 km at the equator, increasing to 6–12 km in zonal and ∼3 km in meridional directions in the Arctic Ocean. Thus the model resolves large eddies in the Arctic Ocean and “permits” smaller ones. The configuration was developed by the DRAKKAR project and is described by Bernard et al.  as the ORCA025-G70 configuration. The version of the model used here has a higher vertical resolution (64 vertical levels) than the ORCA025-G70, with thicknesses of the model levels ranging from ∼6 m near the surface to ∼204 m at 6000 m. The “partial step” topography [Adcroft et al., 1997; Pacanowski and Gnanadesikan, 1998] is used, whereby the bottom cell is variable and more able to represent small topographic slopes near the Arctic shelves, resulting in the more realistic along-shelf flow [e.g., Bernard et al., 2006; Penduff et al., 2007]. The ocean model is coupled asynchronously to the sea ice model every five oceanic time steps through a nonlinear quadratic drag law [Timmermann et al., 2005].
 The sea ice model LIM2 [Fichefet and Morales Maqueda, 1997] is based on the viscous-plastic (VP) rheology with an elliptic yield curve [Hibler, 1979] and Semtner's two-layer ice, one-layer snow thermodynamics [Semtner, 1976]. The latter is updated with sea ice thickness distribution [Fichefet and Morales Maqueda, 1997]. Other features of the model are the positive-definite, second-order, second moments conserving advection scheme [Prather, 1986], ice thickness–dependent albedo [Payne, 1972], lateral ice thermodynamics and a simple snow-ice formation mechanism due to hydrostatic imbalance [Fichefet and Morales Maqueda, 1997]. Sea ice salinity is taken equal to 4, the average value of sea ice salinity in the Central Arctic Ocean. Heat exchange between the ocean and sea ice is calculated as a product from the departure of surface temperature from the salinity-dependent freezing point and friction velocity at the ice-ocean interface. Solar radiation penetrates snow-free ice, increasing latent heat storage in brine pockets [Fichefet and Morales Maqueda, 1997].
 Surface forcing is provided by the DRAKKAR Forcing Set 3 [Brodeau et al., 2010]. This data set is a combination of precipitation and downward longwave and shortwave radiation fields from the CORE forcing data set [Large and Yeager, 2004] and 10 m wind, 2 m air temperature, and 2 m specific humidity from the ECMWF ERA40 reanalysis product. The turbulent air/sea and air/ice fluxes are calculated by the model using the bulk formulae [Large and Yeager, 2004]. A climatological monthly runoff [Dai and Trenberth, 2002] is applied along the coasts. Surface salinity is restored toward monthly climatology with a relaxation scale of 180 days for the open ocean and 12 days under sea ice.
 The pan-Arctic coupled ice-ocean model used in this study consists of a Hibler-type sea ice model [Zhang and Hibler, 1997] coupled to a regional adaptation of the Parallel Ocean Program (POP) [Smith et al., 1992; Smith and Gent, 2002]. The sea ice model employs a viscous-plastic rheology, two ice thickness categories (mean ice thickness and open water), the zero-layer approximation of heat conduction through ice and a simplified surface energy budget [Zhang et al., 1999; Maslowski et al., 2000]. The ice strength is parameterized in this model as a function of the mean grid cell ice thickness, which tends to underestimate ice drift and deformation [Maslowski and Lipscomb, 2003; Kwok et al., 2008]. The ocean model is a z coordinate ocean model with an implicit free surface and 45 vertical levels, with layer thickness ranging from 5 m near the surface to 300 m at depth.
 The model domain includes all sea ice covered oceans and marginal seas of the Northern Hemisphere, extending to ∼30°N in the North Pacific and to ∼45°N in the North Atlantic. Both components of the coupled model use identical horizontal grid configured at 1/12° (∼9 km) in a rotated spherical coordinate system to eliminate the North Pole singularity. The model lateral boundaries are solid and no mass flux is allowed through them, however, a virtual annual cycle salt flux is prescribed for most major rivers as a function of river run-off. Surface layer (0–5 m) temperature and salinity are restored toward monthly climatology (PHC) [Steele et al., 2001] on timescales of 365 and 120 days, respectively.
 The model was forced with daily average atmospheric fields (downward longwave and shortwave radiation, surface air temperature, specific humidity, wind velocity and stress) from the European Centre for Medium-Range Weather Forecasts (ECMWF) 1979–1993 reanalysis and 1994–2004 operational products. Additional details of model configuration, initialization, and integrations are given by Maslowski et al. [2004, 2008].
 The UW model is the coupled PIOMAS, a regional version of the global Parallel Ocean and Ice Model (POIM) [Zhang and Rothrock, 2003]. The sea ice model is the multicategory thickness and enthalpy distribution (TED) sea ice model [Zhang and Rothrock, 2001; Hibler, 1980]. It employs a teardrop plastic rheology [Zhang and Rothrock, 2005], a mechanical redistribution function for ice ridging [Thorndike et al., 1975; Hibler, 1980], and a line successive relaxation dynamics model to solve the ice momentum equation [Zhang and Hibler, 1997]. The TED ice model also includes a snow thickness distribution model following Flato and Hibler . TED has 12 categories each for ice thickness, ice enthalpy, and snow depth. The centers of the 12 ice thickness categories are 0, 0.26, 0.71, 1.46, 2.61, 4.23, 6.39, 9.10, 12.39, 16.24, 20.62, and 25.49 m. The ocean model is based on the POP developed at Los Alamos National Laboratory [Smith et al., 1992]. The model domain of PIOMAS covers the Northern Hemisphere north of 48°N. The POP ocean model has been modified to incorporate open boundary conditions [Zhang and Steele, 2007] so that PIOMAS is able to be one-way nested to a global POIM [Zhang, 2005] with open boundary conditions along 49°N. The PIOMAS finite difference grid is based on a generalized orthogonal curvilinear coordinate system with the “North Pole” of the model grid placed in Greenland. The model horizontal resolution ranges from 6 to 75 km with a mean resolution of 22 km for the Arctic, Barents, Greenland-Iceland-Norwegian Sea, and Baffin Bay. The POP ocean model has 30 vertical levels of varying thicknesses to resolve surface layers and bottom topography. The first 13 levels are in the upper 100 m and the upper six levels are each 5 m thick. The model bathymetry is obtained by merging the IBCAO (International Bathymetric Chart of the Arctic Ocean) data set and the ETOPO5 (Earth Topography Five Minute Gridded Elevation) data set [see Holland, 2000]. PIOMAS is forced by daily NCEP/NCAR reanalysis [Kalnay et al., 1996] surface forcing fields, i.e., 10 m surface winds, 2 m surface air temperature (SAT), specific humidity, precipitation, evaporation, downwelling longwave radiation, sea level pressure, and cloud fraction. Cloud fraction and SAT are used to calculate downwelling shortwave radiation following Parkinson and Washington . Model forcing also includes river runoff of freshwater in the Arctic Ocean. Climatological river runoff (i.e., no interannual variability) is provided as in the work of Hibler and Bryan . The calculations of surface momentum and radiation fluxes follow Zhang and Rothrock . No climate restoring is allowed. No data assimilation is performed for this study, although PIOMAS is able to assimilate ice concentration and sea surface temperature data.
4. Observational Data
 Ice thickness from models is compared with thickness derived from freeboard (ICESat), indirect measurements of thickness (airborne electromagnetic), thickness computed from ice draft (moorings and submarines), and thickness measured directly (drill holes). Much of the data used in this study is available from the new Unified Sea Ice Thickness Climate Data Record [Lindsay, 2010]. The CDR has summary statistics for moorings, submarines, aircraft, and satellite measurements of ice draft and ice thickness. The summary statistics include mean, minimum, maximum, and standard deviation of the measurement as well as the full probability density distribution. There are currently over 3000 samples in the archive which can be accessed along with documentation and metadata at http://psc.apl.washington.edu/sea_ice_cdr/.
4.1. ICESat Campaigns
 The Arctic Ocean sea ice thickness grid with 25 × 25 km resolution is shown (Figure 1a) for 2004–2008 and five fall and five winter ICESat campaigns [Kwok et al., 2009]. The duration, start and end dates of the fall and winter campaigns are variable (Table 2) with a typically three to four month separation between the fall and winter campaigns. The five fall campaigns start between 24 September and 25 October and end between 8–27 November. Winter campaigns start between 17 February and 12 March and end between 21 March and 14 April.
Table 2. ICESat Campaign Periods
24 Sep to 18 Nov
17 Feb to 21 Mar
3 Oct to 8 Nov
17 Feb to 24 Mar
21 Oct to 24 Nov
22 Feb to 27 Mar
25 Oct to 27 Nov
12 Mar to 14 Apr
2 Oct to 5 Nov
17 Feb to 21 Mar
 The ICESat thickness data are derived from freeboard (distance above the water line to top of the snow cover) obtained from the Geoscience Laser Altimeter System (GLAS). The methodology for determining freeboard, snow depth, and ice thickness from ICESat's 70 m footprint is given by Kwok et al. [2007, 2009]. The empirical relationship between thickness and freeboard for the first year (FY) ice in late winter is discussed by Alexandrov et al. . Satellite grid point values were computed and a 50 km Gaussian smoothing applied. The satellite hole is filled using an interpolation procedure described by Kwok et al. . The gridded ICESat ice thickness estimates are available at the Jet Propulsion Laboratory at http://rkwok.jpl.nasa.gov/icesat/index.html.
4.2. Electromagnetic Airborne Soundings
 Thickness data were obtained using EM induction sounding that computes the distance to the ice/water interface by evaluating the amplitude and phase of a secondary EM field induced by eddy currents in the seawater. With airborne measurements, the height of the EM instrument above the air-snow surface is measured with a laser altimeter. Ice plus snow thickness is the difference between the EM distance measurement to the ice/water interface and the laser height of the snow [Haas et al., 2009].
 The accuracy of the EM method is ±0.1 m over level ice under typical summer conditions [Haas et al., 1997; Pfaffling et al., 2007] with only small effects from melt ponds [Haas et al., 1997; Eicken et al., 2001]. The horizontal extent of induced eddy currents results in a measurement footprint area of up to 3.7 times the instrument height above the water [Reid et al., 2006]. Surveys (Figure 2) were performed using a sensor (“EM-Bird”) towed from helicopters and fixed wing aircraft from icebreakers and land bases in various regions of the eastern and western Arctic [Haas et al., 2006, 2008, 2009, 2010] generally in April/May and August/September.
4.3. Upward Looking Sonar and Ice Profiling Sensors From Moorings
 Eleven moorings with upward looking sonars (ULS) deployed in Fram Strait and the Greenland Sea (Figure 1b) by the Alfred Wegener Institute for Polar and Marine Research, Bremerhaven, Germany acquired almost 25 station years of data between 2002 and 2004 as a contribution to the World Climate Research Programme's Arctic Climate System Study/Climate and Cryosphere (ACSYS/CliC) Project. The ice draft data are available from the CDR as well as the National Snow and Ice Data Center Web site with data descriptions by Witte and Fahrbach .
 Sea ice draft data are available on the continental shelf of the Eastern Beaufort Sea for the period April 1990 to September 2003 from IPS instruments deployed by H. Melling at the Institute for Ocean Sciences (IOS), Canada. Data are described by Melling and Riedel [2008, and references therein]. Sea ice draft data in the central Beaufort Sea for the period 2003–2008 were acquired through the Beaufort Gyre Exploration Project (BGEP, A. Proshutinsky, PI). The point data are available at the Woods Hole Oceanographic Institute Web site (http://www.whoi.edu/beaufortgyre/).
Melling and Riedel  estimate for their data an accuracy of ±0.05 m draft for level ice. The ACSYS/CliC Workshop [Steffen, 2004] on sea ice thickness requires an accuracy of ±0.05 m for draft for ULS and IPS.
4.4. Upward Looking Sonar Measurements From Submarines
 Submarines have traversed the Arctic regularly since 1958 measuring the draft of the overhead sea ice using upward looking sonar (ULS). The processed and publicly available data (archived at National Snow and Ice Data Center (NSIDC) and available as 50 km averages at the CDR) include 42 cruises from 1975 to 2000 covering 120,000 km. The cruises took place between April and November, although most of the data were collected in late spring (April–May) and in late summer–fall (August–October) [Rothrock and Wensnahan, 2007].
 The draft data are produced for periods when the submarine was traveling in a straight line at constant speed and depth. The basic data product is ice draft along the cruise track (Figure 2). The original data typically have a spacing of 1–8 m with a footprint size of 2–7 m depending on the submarine depth. Processed data segments vary in length from a few to several hundred kilometers.
4.5. Pack Ice and Fast Ice Measurements From Drill Holes
 Historical ice thickness data are available from the “Atlas of ice and snow of the Arctic Basin and Siberian Shelf Seas” [Romanov, 1995]. This data set contains sea ice and snow from spring (mid-March–mid-May) measurements collected during aircraft landings associated with the Soviet Union's Sever airborne and North Pole drifting station programs. The High-Latitude Airborne Annual Expeditions Sever took place in 1937, 1941, 1948–1952, and 1954–1993 [Konstantinov and Grachev, 2000]. The data set is derived from as few as 7 landings (1937) to nearly pan-Arctic coverage in the 1970s. The data set contains measurements of 23 parameters, including (1) ice thickness and snow depth on the runway and surrounding area; (2) ridge, hummock, and sastrugi dimensions and areal coverage; and (3) snow density. The data used in this paper are a subset of those used to create the atlas “Morphometric Characteristics of Ice and Snow in the Arctic Basin” (self published by I. P. Romanov in 1993 and republished by Backbone in 1995). Romanov provided these data to NSIDC in 1994 (see http://nsidc.org/data/g02140.html for full description and data in ASCII format). We use two data sets from Romanov Atlas. The first one referenced as Siberian Seas data set includes observations from drill hole in springs of 1982–1986 covering the Kara, Laptev, East Siberian, and Russian part of the Chukchi Sea. The second part, named “Romanov Atlas” includes all digital data from NSIDC data archive and covers the entire Arctic Basin for 1930s–1980s.
 We also use data from 51 coastal stations where sea ice thickness was measured monthly through drill holes. The data represent thicknesses of fast sea ice in the vicinity of the coastal station, mostly first-year ice, undeformed by ridging or rafting. Monthly data are available for 1998–2009. The data were provided by the Arctic and Antarctic Research Institute, St. Petersburg, Russia. Although these data and the data from the Romanov Atlas are unique, and the accuracy of such direct measurements is likely less than 0.05 m, we cannot make here a formal statement regarding error of drill hole position and accuracy.
5. Data Biases
 ICESat estimates [Kwok et al., 2009] of ice drafts are consistently within 0.5 m (one standard deviation) of profiles from a mid-November 2005 submarine cruise and 4 years of ice draft from moorings (BGEP-WHOI and AIM-IOS) in the Chukchi and Beaufort Seas. The standard deviation of the uncertainty in ICESat thickness estimates is 0.37 m [Kwok and Rothrock, 2009]. The ICESat measurements, when converted to drafts, are smaller on average by 0.1 ± 0.42 m than adjusted ULS submarine drafts and by 0.14 ± 0.51 m than ULS moored drafts [Kwok et al., 2009]. ICESat ice thickness estimates for 2003–2008 for the same area of the Arctic Ocean as represented by their regression equations match well with the earlier submarine records [Kwok and Rothrock, 2009]. Shifts in the individual satellite campaign timing introduce seasonal and interannual variability within the data set, although thicknesses represent near maximum end of winter and minimum end of summer data and this bias may be small.
 Assessment of model performance using sea ice drift and deformation derived from satellite data indicates little agreement between modeled patterns of sea ice deformation fields and the linear features produced from the RADARSAT Geophysical Processor System (RGPS) at days to seasons and from kilometers to near basin scale [Kwok et al., 2008]. Compared to the RGPS products, specific model shortcomings included slow ice drift along coastal Alaska and Siberia, poor temporal rates of regional ice cover divergence, and low deformation-related ice volume production [Kwok and Sulsky, 2010].
 Ice thickness from electromagnetic induction is computed by differencing the EM inferred distance to the ice/water interface with the laser height of the snow plus ice. Thus, the ice thickness from the EM measurements includes snow thickness. Due to the footprint, the measured, unconsolidated ridge thickness can be less than 50% of its “true” thickness from (primarily) drill holes [Haas et al., 2008] and from ULS [Haas and Jochmann, 2003]. This would result in underestimates of mean ridge thicknesses. However, due to the footprint, EM measurements would overestimate ice thickness if the sensor is over the flanks of a ridge, which would compensate for too small estimate over ridge crests [Haas et al., 1997]. The EM thickness distributions are most accurate with respect to modal thickness, while mean thickness can still be used for relative comparisons between regions and years.
 Ice draft from moorings is converted to thickness as the product of draft and the seawater to sea ice density ratio (1.115) [e.g., Bourke and Paquette, 1989]. This assumes that contributions to thickness are negligible from variations in the seawater density and any overlying snow water equivalent. Snow cover, melt ponds and deformed ice provide sources of error. Ice draft from ULS and IPS mounted on moorings is overestimated on average in rough ice. Note that NSIDC has been alerted to an error in the way a bias correction was applied for the AWI data, but pending further clarification these data are used assuming the accuracy noted above.
Rothrock and Wensnahan  identify the following submarine ice draft measurement errors: precision error; error in identifying open water (ice of zero draft); sound speed error; error caused by sonar footprint size variations; error from uncontrolled gain and thresholds; error due to vessel trim. There are also differences between analog versus digitally recorded data with paper charts biased toward thicker ice by over 0.30 m due to their coarser temporal resolution. The drafts are obtained from the “first return” or from the depth of the deepest ice within the footprint. Rothrock and Wensnahan  estimate the overall bias due to this effect of the submarine ULS data from the actual draft as +0.29 m with a standard deviation of ±0.25 m. Rodrigues  finds a bias based on the sonar beam width and ice roughness larger than that found by Rothrock and Wensnahan . For this study we have corrected for the submarine draft bias of +0.29 m described by Rothrock and Wensnahan .
 The Romanov ice thickness data were obtained at sites adjacent to aircraft landing sites such as refrozen leads. Although measurements were taken from nearby ice floes, these data may overrepresent level, undisturbed ice. Hunke  compared model output with the aircraft landing data (1958–1986) and found that model output was too thick while the same model output was comparable to submarine draft data from 1986 to 1988, although the landing and submarine time periods do not overlap. If the Romanov data are indeed biased “thin” it may not reflect the prevailing ice thickness with large degree of deformation away from landing sites.
 The observational data have very different spatial resolutions; moored instruments and drill holes produce point data, while the ICESat data were processed using a 50 km Gaussian smoothing; and the Unified Sea Ice Thickness Climate Data Record provides the statistical mean ice thickness at 50 km intervals for the submarine ULS data.
 Although the data sets have the above mentioned biases, we are not aware of a thorough comparison among all data sets, perhaps in part because of the lack of broad temporal and/or regional overlap among them.
 In the following discussion, model minus observed thickness values are referred to as residuals; a positive residual indicates a model overestimates observation. Where model results overlapped observed data temporally, model ice thicknesses were extracted from the nearest model grid point and monthly averaged. We compare the observed data with the nearest model grid point, an approach perhaps advantageous to models with finer resolution. For models with coarse resolution, a 50 km weighted average which is used by Rothrock and Wensnahan , might be advantageous.
 Record length correlation coefficients and residuals were computed from the monthly time series for each of the moored ULS and the 51 coastal stations data. Annual correlation coefficients and residuals were computed for each of the model-data pairs from ICESat, airborne EM, Romanov Atlas, and submarine data sets. From these, grand mean correlation coefficients and residuals for each observational platform were computed.
 To show the residuals and correlation coefficients, a modified Taylor Diagram [Taylor, 2001] is used where the radial angle is proportional to the residual with ±π corresponding to ±2 m residuals, and the distance from the origin is proportional to the correlation coefficient where r equals 1 on the unit circle (Figure 3). Model performance is quantified by comparing the area swept ∣(1 − r)θ∣ by the radial “tip” (1 − r) rotated from zero to the residual value (θ).
 The linear relationships between observed ULS and coastal station thickness data and model output are computed from the record lengths determined from each instrument or station location. Seasonally averaged thickness is used for ICESat and Romanov's data and monthly averages were used for the multiple locations of the airborne EM and submarine ULS data.
7. Comparisons Between Observations and Model Results
7.1. Siberian Seas
 The residuals from the Romanov drill hole data from the Siberian Seas (Kara, Laptev, East Siberian, and Chukchi) were averaged from 1982 to 1986 and contoured using a color bar defined so that zero is white (Figure 4). Except for the GSFC, the models have positive residuals (consistent with the argument that the Romanov data are “thin”). Spatially, residuals are larger in the east (East Siberian and Chukchi Seas) than in the west (Kara and Laptev Seas), particularly for INMOM. GSFC has near-zero residuals in the eastern Siberian marginal seas, and small, negative residuals in the west.
 The residuals are all less than 0.75 m (Figure 5a). The UW, ECCO2, and NAME are correlated with data above 0.6 and have residuals smaller than ±0.20 m. The GSFC correlation is larger than 0.6 and the residual is negative, about −0.60 m. The INMOM correlation is less than 0.5 and the residual exceeds 0.60 m. (Post-2001 results from the ORCA model were not available at the time of our analysis.)
7.3. Airborne EM
 The correlations are all less than 0.5. UW, NAME, INMOM, GSFC and ORCA underestimate ice thickness by as much as 0.40 while ECCO2 overestimates by ∼0.30 m (Figure 5b). GSFC, UW and NAME have almost identical residuals, underestimating the observations by ∼0.5 m. The negative residual occurs, perhaps, because the EM measurements include snow depth with the ice thickness and EM overestimates ice thickness if the sensor is over the flanks of a ridge.
7.4. Moored ULS
 All six models have residuals less than ±0.25 m and moderate correlations with the data (Figure 5c). ECCO2, UW and GSFC are correlated at ∼0.6. INMOM, NAME, and ORCA correlations are weaker with similarly sized residuals of ∼0.25 m. ECCO2 demonstrates the best agreement with the moored ULS data with its very small residual and correlation of 0.6. GSFC and UW have negative residuals of ∼−0.15 m.
7.5. Submarine ULS
 ECCO2 has the highest correlation (∼0.7) and a residual less than +0.2 m (Figure 5d). UW and GSFC have correlations above ∼0.6. INMOM and NAME have positive residuals less than +0.70 m with correlations less than 0.6. ORCA has the weakest correlation (0.48) and a negative residual slightly larger than −0.30 m.
7.6. Coastal Stations
 All models overestimate thickness at the coastal stations except GSFC (Figure 5e) which has a near zero residual. The correlations are similar, between ∼0.4 and ∼0.6 with NAME having the largest correlation (>0.6). INMOM and ORCA have the largest residuals.
7.7. Romanov Atlas Data at NSIDC
 All models overestimate thickness for these data except GSFC (Figure 5f), consistent with Figure 4. GSFC has a negative residual (∼−0.20 m) while INMOM, NAME, UW, and ORCA have positive residuals, larger than residuals from any other measurements. The largest residuals are consistent with other findings discussed above suggesting the Romanov data probably underrepresent thicker, likely heavily ridged ice.
8. ULS Residuals
 Here we focus on assessment of ULS residuals because of (1) broad spatial and temporal coverage, (2) relatively well understood accuracy and biases, and (3) similarities between measurements in time at a point from a mooring and a model grid point. Figure 6 displays the correlations and residuals from the moored ULS instruments to show that UW and ECCO2 have residuals generally less than 0.70 m (Figures 6a and 6b), with progressively larger residuals for GSFC and NAME (Figures 6c and 6d), and the largest residuals (approaching ±2 m) from INMOM and ORCA (Figures 6e and 6f). Model correlations range between zero and one without any apparent relationship with the residuals.
9. Linear Relationships
 The linear fit computed from modeled and observed thickness pairs are shown in Figure 7 for satellite (Figure 7a), airborne EM (Figure 7b), moored ULS (Figure 7c), submarine ULS (Figure 7d), coastal stations (Figure 7e), and the Romanov Atlas (Figure 7f). The model and observed thicknesses are those used to form the residuals. The gray bar in the lower right depicts the measurement error from section 4.
 The slopes are less than one in all but four cases and the y intercepts are positive in all but three of the 32 cases. The regression lines cross y = x at variable locations with a mean of 2.2 m (range is −3.4 to 7.7 m) which we now use to separate “thin” from “thick” ice in the remaining discussion.
 INMOM, ECCO2, UW, and GSFC overestimate satellite-derived thickness less than 1m (Figure 7a). INMOM, ECCO2 and UW overestimate satellite-derived thickness when less than 2.0 m, and INMOM overestimates ice less than 3 m. The models underestimate the thickest ice by two or more meters. NAME is omitted as it overlapped the satellite record in 2004 only.
 INMOM, UW, and ECCO2 overestimate EM thickness less than ∼1 m, and all models underestimate airborne EM thickness when thicker than ∼3.5 m (Figure 7b). ORCA is omitted.
 GSFC strongly underestimates moored ULS thickness less than 2 m and NAME and ORCA overestimate ice thinner than 1.5 m (Figure 7c). ECCO2 estimates thickness quite well across the range.
 All models overestimate thickness when submarine ULS measurements are less than ∼2 m (Figure 7d) and underestimate ice measured to be thicker than ∼4 m. The slopes of the INMOM, UW, GSFC, ECCO2, and ORCA regression lines are similar. UW, GSFC, ECCO2, and ORCA have y intercepts near one, with larger y intercepts for INMOM and NAME showing the potentially large errors for “open” water.
 All models except NAME and GSFC overestimate fast-ice ice thickness from coastal stations when measured to be less than 1.5 m (Figure 7e). GSFC is unable to reproduce the range of the observed fast-ice measurements. UW and ECCO2 have similar slopes and y intercepts.
 INMOM, NAME, UW, and ORCA overestimate the Romanov Atlas thickness less than 3 m, and all models overestimate thickness where it is measured to be less than 1 m (Figure 7f). (ECCO2 is omitted as mode output does not overlap the Romanov Atlas data.)
 We have compared six AOMIP models having different numerical methods, resolution, domain, forcing, and boundary conditions against ice thickness acquired using six different methodologies. Despite differences among models and data, the model performance is dominated by an overestimate of thickness of measured ice thinner than ∼2 m and an underestimate of thickness of measured ice thicker than ∼2 m.
 For all observational platforms, models generally have regression slopes less than one ( ), positive y intercepts ( ), and cross the y = x line (perfect fit) at a mean observed thicknesses of about 2 m. Overestimating thin ice is larger for the satellite and submarine ULS data (Figures 7a and 7d). For the thickest ice measured by satellite, airborne EM and submarine ULS, the models underestimate thickness by several meters (Figures 7a, 7b, and 7d), a consistent pattern across platforms and across models.
 The AOMIP models compare well, in the mean, with the submarine ULS with a mean residual of 0.12 m. We note that the Louvain-la-Neuve (LIM3) model [Vancoppenolle et al., 2009a] underestimated submarine draft observations converted to thickness by −0.55 ± 1.04. They obtained a positive model bias for thin ice and a negative bias for the thick ice, similar to our results. Vancoppenolle et al. [2009b] performed sensitivity study of ice thermodynamics to the sea ice salinity and demonstrated a 0.30 m reduction in the model bias when the salt evolution model is used instead of constant or prescribed varying salt profiles.
 The model biases indicate a regional dependency with small residuals in the Beaufort Sea and the central Arctic Ocean (moored and submarine ULS data) and an increase in the size of the positive residuals on the Siberian Shelf (station data and Siberian Seas data). Rothrock et al.  and Vancoppenolle et al. [2009a], comparing model results with submarine ULS, obtained a persistent pattern of model biases with positive values in the Beaufort Sea, north of Greenland and toward the Alaskan and East Siberian Shelves, and with negative values in the Arctic Transpolar Drift and toward Fram Strait. Wilchinsky and Feltham  found a similar pattern in their simulations and demonstrated that using sliding friction in sea ice rheology can reduce the biases.
 Adjusting the albedo may compensate for missing model physics, in particular the lack of parameterization of albedo with ice age or melt fraction. This is a common problem with many of the AOMIP sea ice models where albedos were obtained from an 18 km regional Arctic optimized solution where we minimized the misfits between all oceanic and sea ice data available to us, and not just sea ice data alone. Sea ice models typically use only a few sea ice and snow albedos to obtain basin-scale mean sea ice thickness that is on the same order as basin-wide observed thickness. In the case of ECCO2, albedos were optimized by minimizing data-model misfits between all oceanic and sea ice data. However, we note that adjusting the albedo may compensate for missing model physics, in particular the lack of parameterization of albedo with ice age or melt fraction.
 The mechanisms that create the thickest ice in the Arctic include the complex dynamics of ridging. However, in areas away from strong dynamic processes, such as shallow, coastal regions covered by landfast ice, thermodynamics are responsible for ice growth. We look next for cases where thermodynamic ice growth, based on observed air temperature, explains observed ice thickness by evaluating the data from fast-ice stations along the Russian coast. We expect that heat from the ocean here is small or negligible.
 We select from the set of 51 coastal stations specific years and stations where the observed ice thickness is consistent with the thermodynamic thickness computed from empirical formula and observed air temperatures. When the monthly mean observed sea ice thickness agrees with estimates from thermodynamics, then modeled ice thickness at these stations and months different from observed must be due to errors in modeled thermodynamic ice growth, the inclusion of model dynamics that do not apply under the observed conditions, or the application of forcing substantially different from observations.
 The thermodynamic ice thickness (hi) is computed for all coastal stations for September–April. A subset of monthly ice thicknesses is created with the criteria that the observed mean September ice thickness is zero, the observed mean September air temperature is less than 0°C, and the observed thickness agrees with the computed thickness for at least September–December.
 Thickness is computed from Zubov's  formula where hi−1 is the ice thickness in cm and Ti−1 the temperature (°C) for the prior 30 days. A useful discussion of this empirical formula and the thermodynamics of sea ice is given by Makshtas . A different formula, such as Maykut and Untersteiner's  relationship for snowless (or snow covered) ice [Weeks and Hibler, 2010], could be used. We compared these formulas (not shown) to highlight differences in resulting thicknesses. Some formulas produce faster growth. Which formula is “right”? In the case of the Zubov formula above, we identified 22 “cases” of observed ice thickness from the coastal station data that closely match the thermodynamic ice thickness for at least 4 consecutive months and as much as 9 consecutive months. These cases are from 13 unique coastal stations, 8 unique years, and consist of 132 months of ice thickness values.
 The resulting data are divided into a “thermodynamics” group consisting of observed ice thickness that agrees with the Zubov calculation, and “models” consisting of the modeled ice thickness from the locations and months where “thermodynamics” apply. Figure 8 shows “thermodynamics” as the gray area (drawn as a set of gray lines connecting thicknesses through the months) and “models” as red lines connecting monthly thickness for the 22 cases. The thicker red lines identify modeled ice thickness that was zero in September, as observed.
 It is evident from Figure 8 that all models have ice thickness values outside the range of “thermodynamics” suggesting the inclusion of dynamics such as ridging that thicken the ice. While all models except INMOM have cases where the September thicknesses are zero, as observed, all models also have cases where the September ice thickness is nonzero. The September thickness for INMOM is always greater than zero, more than 150 cm in one case. In these cases, modeled ice thickness greater than “thermodynamics” indicates the model has included dynamic factors such as ridging.
 For ECCO2 there are cases where the ice becomes thicker than observed by October and substantially thicker by December–March. For those cases with zero ice thickness in September, the UW model tracks just above “thermodynamics” through November, GSFC tracks “thermodynamics” closely through December and then decreases noticeably below “thermodynamics” through May. ORCA and NAME have only one case with a zero September thickness, and both track “thermodynamics” well.
 Ice growth was computed using forward differences from both “thermodynamics” and “models.” While model sea ice “growth” likely includes nonthermodynamic processes, we refer to both observed and modeled changes as ice “growth,” for ease of this discussion. Dynamic processes may be relevant in modeled sea ice thickness, particularly when the values fall outside “thermodynamics.” Nonthermodynamic processes must be involved when the ice growth exceeds a rate limit based on the coldest (monthly) air temperatures.
Figure 9 (top) shows the ice growth rates for “thermodynamics” and for the “models.” For September all models have mean ice growth greater than one standard deviation from “thermodynamics” except GSFC. ECCO2, UW, and ORCA have October mean ice growth larger than one standard deviation from “thermodynamics.” There are cases where ECCO2 and GSFC ice growth substantially exceeds ∼3.0 cm d−1. In November ECCO2 has the largest model growth rate and exceeds “thermodynamics” by more than one standard deviation. From December to February the mean model growth rates are within approximately one standard deviation of “thermodynamics.”
 The ratios of modeled ice growth to the thermodynamic ice growth (for each “case,” each model mean, and the mean of all models) are shown for each month in Figure 9 (bottom). In September, mean model ice growth is ∼4.5 times faster than “thermodynamics.” INMOM has individual cases where the model to observed rate exceeds ten, and UW and INMOM have mean September growth rates approximately 8 times the observed. In October the ice growth ratio is reduced but still above one. For November–March the ratios are near one.
 If the bias is driven by the applied surface forcing, then errors exist in the CORE surface forcing, DRAKKAR Forcing Set 3, ECMWF ERA40 reanalysis, the JRA25, and NCEP/NCAR Reanalysis radiation fields in September and October but not November–April. This seems an unlikely cause for overestimating thickness. Rather, the inclusion of dynamics processes such as ridging is present.
 To compensate for missing or poorly known physical processes, model parameters such as the ice demarcation thickness Ho between thin and thick ice are adjusted [Hibler, 1979; Smedsrud, 2011]. For example, the optimization described by Nguyen et al.  uses both hydrographic and sea ice data to obtain Ho of 0.6, and albedos of 0.71 for wet ice, 0.87 for dry snow, and 0.81 for wet snow. AOMIP models use a range of 0.25–0.5 m for Ho. Although assigning new ice to an instant thickness often greater than “reality” could account for “fast” growth, eliminating those cases from Figure 9 (cases with the thick red lines) does not substantially change the results that growth is frequently too “fast.”
 All models except INMOM are capable of producing sea ice thickness at rates comparable to “thermodynamic” because they have thickness values that overlay the observations (Figure 8). INMOM consistently overestimates thickness, and in no cases has thickness values at the “thin” end of observed. Sea ice that is thicker than observed is likely due to dynamic factors such as ridging in these fast ice regions.
 Finally, in order to better understand the role of sea ice dynamics in the simulated sea ice thickness in the regions of prevailing immobile fast ice conditions, we conducted two numerical experiments with ORCA model. All parameters and forcing conditions in these experiments were identical except that in our second experiment we introduced an empirical algorithm to parameterize immobile fast ice. In this algorithm, we assumed that in the regions with depths less than 28 m (typical observed fast-ice edge locations) and for times after 1 November and until 1 May, sea ice does not move and therefore cannot be deformed and/or ridged. Differences in sea ice thickness between control and experimental model runs after 10 years of simulation (1983–1992) are shown in Figure 10. As it was expected from our hypothesis that dynamics are increasing sea ice thickness beyond “thermodynamics,” the simulated sea ice thickness in the coastal regions with fast ice was decreased significantly, approaching observed values at coastal stations. Interestingly, the areas of intensive sea ice ridging in the experimental model run are located at the fast ice edge and we can expect that significant changes have to be observed not only in sea ice characteristics but also in oceanic parameters such as currents, temperature and salinity fields due to the influence of fast ice. The analysis of these results will be the subject of a different paper.
 Sea ice thickness from six AOMIP models were compared with thickness across the Arctic basin from (1) satellites; (2) airborne EM; (3) moored ULS in Fram Strait, Greenland Sea, and the Beaufort Gyre (ULS, IPS); (4) submarine ULS across the central basin; and (5) drill holes through fast along coastal Russia and within the ice pack. We find that the models overestimate thickness of ice thinner than ∼2 m and underestimate the thickness of measured ice thicker than ∼2 m. Overestimating thin ice is problematic for forecasting areas of open water and perhaps the timing of the seasonal cycle. Underestimating thick ice hinders long-term forecasts where the proper role of multiyear ice is critical. For locations and years along the arctic coastline where fast-ice growth closely follows a thermodynamic growth law, the AOMIP models produce ice at rates substantially higher than those expected from observations and thermodynamics for September. Averaging over all observational data sets, the correlations and smaller differences from observed thickness are better in the ECCO2 and UW models.
 The processes of fast-ice formation need to be parameterized in the AOMIP models to bring ice thickness values and ice growth rates closer to observations particularly in the early months of the ice growth cycle. The ice thickness threshold of ∼2 m, the thickness from our analysis that discriminates positive from negative model biases, is similar to the commonly accepted thickness distinguishing undeformed, first-year from multiyear Arctic sea ice [World Meteorological Organization, 1985]. Our results show that the overestimation of “thin” sea ice is, in part, due to ice growth that is much too rapid for areas where thermodynamic ice growth is expected such as the Russian coastal fast-ice regions.
 This research is supported by the National Science Foundation Office of Polar Programs covering awards of AOMIP collaborative research projects: ARC-0804180 (M.J.), ARC-0804010 (A.P.), ARC-0805141 (W.M.), ARC080789, and ARC0908769 (J.Z.). Travel support to attend AOMIP meetings and publications fees for Y.A., I.A., B.d.C., S.H., R.K., R.L., and A.N. were provided by OPP project ARC-0804010. C.H. is grateful for support with data acquisitions through the Alfred Wegener Institute in Germany and various EU projects. This research is also supported by the Russian Foundation of Basic Research, projects 09-05-00266 and 09-05-01231. At the National Oceanography Centre Southampton, this study was funded by the UK Natural Environment Research Council as a contribution to the Marine Centres' Strategic Research Programme Oceans 2025. The NOCS-ORCA simulations were undertaken as part of the DRAKKAR collaboration [Bernard et al., 2006]. NOCS also acknowledges the use of UK National High Performance Computing Resource.