Accuracy of analyzed stratospheric temperatures in the winter Arctic vortex from infrared Montgolfier long-duration balloon flights 2. Results



[1] Five long-duration flights with the Montgolfier infrared (MIR) balloon lasting 15 days, on average, have been conducted in the Arctic winter stratospheric vortex in 1997, 1999, and 2000. Temperatures from the European Centre for Medium-Range Weather Forecasts (ECMWF), Met Office (MO), National Centers for Environmental Prediction (NCEP), Data Assimilation Office (DAO), and NCEP/NCAR reanalysis (REA) have been compared to the observations from 4 to 146 hPa. Occasional large errors (>14 K) occur in each analysis, mainly above 30 hPa. In 2000 the standard deviations of ECMWF, MO, and DAO with respect to the measured temperatures range from 1.0 to 1.3 K, whereas NCEP and REA have substantially larger errors. In 1999 the flights took place during a major warming, and all operational models had large standard deviations and substantial biases. Preoperational versions of the new ECMWF model with increased stratospheric resolution and assimilation of the advanced microwave sounding unit, which none of the other models assimilated, show small biases and standard deviations.

1. Introduction

[2] To calculate ozone destruction, accurate temperature fields are important. Several studies have investigated the quality of analyzed temperatures using a variety of methods. Manney et al. [2002, hereinafter MSP] intercompared various analyses. They found large differences in the winters 1995/1996 and 1999/2000, but their method does not allow a determination of which analysis is best. Davies et al. [2002] found Met Office (MO) temperatures to be too cold in December 1999 and January 2000 and too warm in February and March 2000 compared to radiosondes, whereas the European Centre for Medium-Range Weather Forecasts (ECMWF) agreed better. Manney et al. [1996] compared radiosondes to the MO and the National Centers for Environmental Prediction (NCEP) analyses in the winter 1994/1995 and found a warm bias of up to 1.9 K and 0.3 K for the MO and the NCEP temperatures, respectively. For temperatures below 200 K the warm bias is up to 3.7 K and 1.7 K for MO and NCEP temperatures, respectively. Knudsen [1996] compared radiosondes to ECMWF analyses or first-guess fields (the 6-hour forecasts from the previous analyses assumed to be independent of errors on individual radiosondes) and found large biases until 30 January 1996. On that day, ECMWF moved to a three-dimensional (3-D) variational assimilation allowing a better combination of the radiosondes and the low-resolution TIROS Operational Vertical Sounding (TOVS) temperature data from the National Oceanic and Atmospheric Administration (NOAA) satellites. After 30 January the biases were reduced, but the analysis temperatures were still found to underestimate the polar stratospheric cloud (PSC) extent by 23% at 30 hPa. Pullen and Jones [1997] used data from 28 ozonesonde stations, of which some are not included in the assimilation, and found a warm bias of the MO analyses of 1.7 K below the nitric acid trihydrate melting point (TNAT) in the winter 1994/1995. Pawson et al. [1999] compared TOVS satellite-retrieved temperatures to Freie Universität Berlin (FUB) analyses, which are based primarily on radiosondes, and found that the satellite temperatures were too low at 50 hPa if the temperatures decreased up to 10 hPa. This might have some bearing on analysis temperatures in the cases where they have not been constrained sufficiently by radiosonde observations. They also found that the FUB temperatures were, on average, lower than the satellite temperatures below TNAT.

[3] Another result from the above studies is the relatively large standard deviation, of 2–3 K depending on the analysis, in the difference between analyses and individual sondes and increasing with altitude [e.g., Manney et al., 1996; Pullen and Jones, 1997]. The authors generally attributed this scatter to small-scale vertical structures in the temperature profiles and probably also to mesoscale orographically forced waves not captured by the analyses.

[4] Here we report on an experimental evaluation of the temperatures of a variety of analyses in the winter Arctic stratospheric vortex, using data obtained from five long-duration Montgolfier infrared (MIR) balloon flights carried out in 1997, 1999, and 2000. Data from other flights of less than 2 days of duration have not been used here. The advantage of these data is that they are completely independent from the analyses. Knudsen et al. [1996] already made such a comparison but for much shorter flights from 1992 to 1995. They found large errors in the ECMWF temperatures, in agreement with the previously mentioned studies. The MIR data have also been used to assess the accuracy of trajectory calculations [Knudsen et al., 2001].

[5] The companion paper [Pommereau et al., 2002] describes the balloon flights and the accuracy of the measurements. This paper is organized in the following way: first, a short summary of the flights and the quality of the measurements (section 2) and a description of the analyses (section 3). In section 4 the errors of the analyzed temperatures are described, and finally some conclusions are presented.

2. MIR Flights and Meteorological Measurements

[6] In 1997, 1999, and 2000 five long-duration MIR hot-air balloons were launched from ESRANGE (67.9°N, 21.1°E). The launch times and durations are given in Table 1. The balloons stayed well inside the polar vortex during all flights, with two exceptions: the balloon moved to the edge of the vortex at the end of the second MIR flight in 1997, and the second balloon in 1999 was transported out of the vortex. More details about the flights are given in the companion paper [Pommereau et al., 2002].

Table 1. Launch Time and Duration for Five MIR Flights
Balloon        Launch Time, UT       Launch Day Number    Duration (Days)
MIR 1, 1997    24 February 1800      55.77                13.04
MIR 2, 1997    17 March 2100         76.87                21.11
MIR 1, 1999    18 February 1900      49.81                7.10
MIR 2, 1999    19 February 1500      50.61                17.38
MIR 2, 2000    18 February 1800      49.76                16.92

[7] As described in the companion paper [Pommereau et al., 2002], the quality of the measurements of the meteorological sensors has been checked by comparison between the various sensors flown onboard the same balloon and with independent radiosondes as well. The result is that the GPS position has an accuracy of ±100 m, both vertically and horizontally, except in 1997 where the horizontal accuracy was reduced to ∼1 km due to a clock problem. The temperature was measured with aluminized Veco microthermistors mounted on 1-m-long booms on opposite sides of the gondola. Because of the solar heating of the gondola, only nighttime measurements could be used. The temperature measurements have an accuracy of ±0.5 K and a precision of ±0.4 K below 10 hPa. The pressure measurement has an accuracy of ±2 hPa and a precision of ±1 hPa, except for the second MIR flight in 1999 where a leak in the pressure sensor caused a drift in the pressure. Therefore the GPS altitude was converted to pressure by log-linear interpolation using the ECMWF analyses. Compared to Väisälä radiosondes the ECMWF geopotential height has an accuracy that is much better than 100 m at 30 hPa [World Meteorological Organization (WMO), 1998]. The conversion from geopotential height to geometric height was done as in the work of McPeters et al. [1999]. A height error of 100 m corresponds to 1.5 hPa at 100 hPa and less above. In 2000 the pressure was also inferred from the GPS height. The data were generally transmitted every 9 min (1997) and every 10 min (1999 and 2000).
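The height-to-pressure conversion and the quoted sensitivity (100 m corresponding to about 1.5 hPa at 100 hPa) can be illustrated with a minimal sketch. The profile below is not ECMWF data: it is a hypothetical isothermal atmosphere at 215 K, the function name `pressure_from_height` is introduced here for illustration only, and the distinction between geometric and geopotential height is ignored.

```python
import math

R_DRY = 287.05   # J kg^-1 K^-1, specific gas constant of dry air
G0 = 9.80665     # m s^-2, standard gravity


def pressure_from_height(z, z_levels, p_levels):
    """Log-linear interpolation: ln(p) is taken as linear in height between levels.

    z_levels (m) must be strictly increasing; p_levels (hPa) are the matching pressures.
    """
    for i in range(len(z_levels) - 1):
        z0, z1 = z_levels[i], z_levels[i + 1]
        if z0 <= z <= z1:
            w = (z - z0) / (z1 - z0)
            ln_p = (1.0 - w) * math.log(p_levels[i]) + w * math.log(p_levels[i + 1])
            return math.exp(ln_p)
    raise ValueError("height outside the range of the analysis levels")


# Hypothetical stand-in for an analysis profile: isothermal atmosphere at 215 K.
T_ISO = 215.0                      # K, assumed
H = R_DRY * T_ISO / G0             # pressure scale height, about 6.3 km
p_levels = [130.0, 110.0, 90.0, 70.0, 50.0, 30.0, 10.0]
z_levels = [H * math.log(1000.0 / p) for p in p_levels]

# Pressure change caused by a 100 m height error near 100 hPa.
z_100 = H * math.log(1000.0 / 100.0)
p_ref = pressure_from_height(z_100, z_levels, p_levels)
dp_100m = p_ref - pressure_from_height(z_100 + 100.0, z_levels, p_levels)
```

Under these assumptions `dp_100m` comes out close to 1.6 hPa, of the same size as the value quoted in the text, and it scales with pressure, so the error is smaller at higher levels.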

3. Description of Analyses

[8] All analyses assimilate temperatures or raw radiances from the TOVS System (HIRS, MSU, and SSU) onboard NOAA TIROS satellites. In April 1998 the NOAA-15 spacecraft was launched carrying the AMSU instrument with high vertical resolution [ECMWF, 1999]. As of March 2000, ECMWF is the only model assimilating AMSU data as described in section 3.1.

3.1. ECMWF

[9] The ECMWF analyses available in 1997 were from a 3-D variational data assimilation, operational since 30 January 1996 [Courtier et al., 1998]. The number of vertical levels is 31 from the ground to 10 hPa (referred to as ECMWF31), and the levels used here are at or near 10, 30, 50, 70, 90, 110, and 130 hPa. The analyses are extracted from T106 truncated spherical harmonic fields on a 1.125°, 1.125°, and 1.5° latitude-longitude grid in 1997, 1999, and 2000, respectively. For 1999, two versions of ECMWF coexisted. First, the same 31-level version, which since November 1997 has used a 4-D assimilation scheme [Rabier et al., 2000], was available. Second, preoperational runs (operational on 5 May 1999) of the new 50-level model up to 0.1 hPa (ECMWF50) with a level spacing of 1.5 km throughout the stratosphere [Simmons et al., 1999] were available. The 50-level version includes for the first time the assimilation of the AMSU data. In 2000 the number of levels increased to 60 (ECMWF60), but the level spacing in the stratosphere remained practically constant (levels used here are at or near 10, 12, 15, 19, 23, 29, 36, 44, 55, 67, 80, 96, and 113 hPa). The analyses were available every 6 hours.

3.2. Met Office

[10] During the period studied here the stratospheric data assimilation system of the Met Office in the United Kingdom used the analysis correction scheme in which observations are gradually inserted [Swinbank and O'Neill, 1994]. The analysis has 42 levels up to 0.28 hPa and a horizontal resolution of 2.5° latitude by 3.75° longitude. There were no major changes in the assimilation system from 1997 to 2000, but erroneous ozone data were used at the top of the model in the winters 1998–1999 and 1999–2000 (MSP). This resulted in decreased stratospheric temperatures but does not seem to have an effect around 1 March (MSP), when the MIR data were obtained. The data used in this study are those produced for the Upper Air Research Satellite (UARS) project. The analyses are interpolated to the 22 UARS standard pressure levels (mainly the levels at 10, 15, 22, 32, 46, 68, 100, and 147 hPa are used in this study). The MO analyses are available at 1200 UT only.

3.3. Data Assimilation Office

[11] The DAO at the Goddard Space Flight Center (GSFC) produced stratospheric data assimilation analyses on a 2° latitude by 2.5° longitude grid in 1997 and 1999 (GEOS 1) and on a 1° latitude by 1.25° longitude grid in 2000 (GEOS 3). The observations are gradually inserted using the incremental analysis update scheme [Schubert et al., 1993]. The analysis has 46 levels up to 0.4 hPa, of which mainly 10, 30, 50, 70, 100, and 150 hPa are used here. The analyses are produced every 6 hours.

3.4. National Centers for Environmental Prediction

[12] The NCEP analysis of the U.S. Climate Prediction Center (CPC) provides temperature and geopotential heights on pressure surfaces from 70 to 0.4 hPa on a 65 × 65 polar stereographic grid interpolated to 5° longitude by 2° latitude. The levels used here are mainly 10, 30, 50, 70, 100, and 150 hPa. The NCEP CPC analyses are based on an objective analysis scheme that does not include a forecast model. The analyses are calculated by the successive correction method [Cressman, 1959; Finger et al., 1965] and include radiosondes and TOVS satellite data. During the period in question, several changes were introduced in the assimilation of satellite data, but the impact of these changes is usually less than 1 K below 10 hPa (MSP). Winds are derived from the geopotential heights using a balanced wind approximation [Randel, 1987; Newman et al., 1988]. At 100 hPa and below the analyses are those of the NCEP T126 Global Data Assimilation System (GDAS) analysis. The NCEP analyses are available at 1200 UT only.

3.5. Reanalysis

[13] The NCEP/NCAR reanalyses are available back to 1948 [Kalnay et al., 1996; Kistler et al., 1996]. They are produced from the same model as the NCEP GDAS analyses except that derived satellite temperatures are used instead of raw radiances. Since March 1997, an error in the filtering of the TOVS data has existed (MSP). This leads to temperature increases near 100 hPa, and the analyses are currently being reprocessed. Preliminary results of the reprocessing give up to ∼1 K lower temperatures in the January–March 2000 average. The analyses are made on a 2.5° latitude-longitude grid and have 17 levels up to 10 hPa, of which 10, 20, 30, 50, 70, 100, and 150 hPa are used here. They are available every 6 hours.

4. Temperature Errors

[14] The analyzed temperatures are interpolated in space and time by a linear procedure (log-linearly in pressure). Figures 1–3 show the differences between the analysis temperatures and the observed temperatures. In 2000 the average, 〈T〉, of the two temperature sensors, T1 and T2, was used. In 1997 and 2000, only measurements below 10 hPa have been used. In 2000 the differences between T1 and T2 are also shown to give an indication of the uncertainties involved in the measurements.
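As an illustration of the interpolation step, the sketch below interpolates an analysis temperature to a balloon pressure (linear in ln p) and then to the observation time (linear in time). Horizontal interpolation is omitted for brevity, and all numbers, including the level temperatures and the MIR reading, are hypothetical values chosen only to make the example concrete.

```python
import math


def interp_log_pressure(p, p_lo, t_lo, p_hi, t_hi):
    """Temperature at pressure p, with T taken as linear in ln(p) between two levels."""
    w = (math.log(p) - math.log(p_lo)) / (math.log(p_hi) - math.log(p_lo))
    return (1.0 - w) * t_lo + w * t_hi


def interp_time(t, t0, v0, t1, v1):
    """Linear interpolation in time between two analysis times t0 and t1 (hours)."""
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * v0 + w * v1


# Hypothetical analysis temperatures (K) at 50 and 30 hPa for two analysis times.
t_0000 = interp_log_pressure(40.0, 50.0, 196.0, 30.0, 200.0)   # 0000 UT
t_0600 = interp_log_pressure(40.0, 50.0, 197.0, 30.0, 201.0)   # 0600 UT

# Balloon observation at 0200 UT and 40 hPa (hypothetical reading).
t_analysis = interp_time(2.0, 0.0, t_0000, 6.0, t_0600)
t_observed = 197.9
difference = t_analysis - t_observed   # analysis minus observation, as in Figures 1-3
```

Because the weight is computed in ln p, the 40 hPa value lies closer to the 50 hPa temperature than a weight linear in p would give.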

Figure 1.

1997 differences between various analyses and MIR temperatures as a function of pressure.

Figure 2.

As Figure 1 but for 1999.

Figure 3.

As Figure 1 but for 2000. Also, the difference between the two temperature sensors is shown. The largest differences of the REA analyses do not fit onto the plot.

[15] In 1999 all operational analyses represent the measurements quite poorly, which is probably due to the warming taking place [Naujokat, 2000]. ECMWF31 is 20 K too cold at the 10 hPa top of the model, and often the analyses showed unrealistic temperatures below 173 K. This occurred during extended periods of the 1998–1999 winter and could be caused by the coarse vertical resolution of satellite-based temperatures, which could easily give a wrong temperature at 10 hPa when the lowest temperatures occur between the two uppermost levels at 10 and 30 hPa. MO, DAO, NCEP, and REA show a positive bias of 3–5 K around 10–15 hPa in 1999 and, except for NCEP, also in 1997. At 5 hPa, MO, DAO, and NCEP exhibit a temperature error of 20 K, and this could also be related to the major warming taking place during the flights. Thus the temperature gradient between the North Pole and 60°N reversed on 20 February (day number 50) at 5 hPa. In contrast, the new ECMWF model (ECMWF50) shows almost no bias due to the assimilation of the higher-resolution AMSU data.

[16] In Table 2 the average temperature differences are shown for each year. In 2000 all analyses except NCEP and REA have improved compared to 1997 and 1999. For NCEP the large standard deviations in 2000 are due to the large discrepancies at the beginning of the flight at low pressures, and removing the first two nights of data reduces the standard deviation to 1.12 K and changes the mean to 0.39 K. It should be noted, however, that this would also reduce the ECMWF standard deviation to 0.92 K and change the mean to −0.50 K.

Table 2. Average Temperature Difference (Model − MIR) and Standard Deviation for Each Year in Different Pressure Intervals
Year    MO             DAO            NCEP            REA            ECMWF31          ECMWF50/60       N
10–146 hPa
1997    0.42 ± 1.40    0.65 ± 1.61    0.06 ± 1.87     1.16 ± 1.67    0.31 ± 1.42      NA               1717
1999    1.76 ± 2.59    1.96 ± 2.53    1.79 ± 1.81     1.33 ± 2.98    −1.56 ± 7.30     −0.08 ± 1.19     1756
2000    0.32 ± 1.15    0.62 ± 1.28    0.72 ± 1.88     0.69 ± 3.38    NA               −0.64 ± 1.02     1131
10–30 hPa
1997    0.32 ± 1.79    1.38 ± 2.20    −0.71 ± 2.17    2.41 ± 2.13    0.21 ± 1.69      NA               406
1999    2.71 ± 3.30    2.58 ± 3.18    1.74 ± 2.37     2.05 ± 2.43    −10.50 ± 8.20    −0.55 ± 1.26     504
2000    −0.29 ± 1.33   1.08 ± 1.28    1.67 ± 3.25     0.77 ± 2.38    NA               −1.16 ± 1.20     255
30–146 hPa
1997    0.44 ± 1.25    0.41 ± 1.29    0.28 ± 1.70     0.48 ± 1.22    0.33 ± 1.37      NA               1311
1999    1.36 ± 2.11    1.69 ± 2.15    1.79 ± 1.53     1.03 ± 3.09    1.91 ± 2.08      0.10 ± 1.10      1267
2000    0.41 ± 1.07    0.48 ± 1.24    0.44 ± 1.07     0.67 ± 3.62    NA               −0.49 ± 0.91     876

[17] In Table 2 the results have been split into the two pressure intervals 10–30 hPa and 30–146 hPa (there are only a few data below 100 hPa). Generally, the standard deviations increase with height, except for REA in 1999 and 2000 and ECMWF50 in 1999. This increase could be due to increasing errors in both the analysis and the MIR temperatures. The latter is seen in the top right panel of Figure 3. Often, the magnitude of the biases (i.e., the averages of the differences) also increases with height. In the pressure interval 30–146 hPa, NCEP has improved substantially from 1997 to 2000.

[18] Note that the standard deviations give the uncertainties on each data point and not on the averages. Because of the difference between T1 and T2, not much significance can be assigned to the biases in 2000, but ECMWF seems to have a small cold bias in its analyses, whereas the other analyses have a small warm bias. In 1999 the analyzed temperatures are biased warm except for ECMWF50. In 1997, only REA has a significant bias.

[19] The MIR measurements are autocorrelated, which complicates the statistical treatment, so for the temperature differences in 2000 the average for each of the 19 nights is formed. These averages are independent, and Table 3 shows their mean ± the standard deviation ± the uncertainty of the standard deviation. This latter uncertainty, s(σ), is calculated from the standard deviation, σ, as follows:

$$ s(\sigma) = \frac{\sigma}{\sqrt{n - 1}} $$

where n is the number of nights.
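The nightly averaging can be sketched as follows. The function name and the sample data are hypothetical. The form s(σ) = σ/√(n − 1) used in the code is an assumption adopted here because it reproduces the s(σ) entries of Table 3 for n = 19 (for example, 0.46/√18 ≈ 0.11); it should not be read as a confirmed statement of the original formula.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def nightly_statistics(night_ids, differences):
    """Return (mean, sigma, s_sigma) of the per-night mean temperature differences.

    night_ids[i] labels the night to which differences[i] belongs.
    At least two nights are needed for the sample standard deviation.
    """
    by_night = defaultdict(list)
    for night, diff in zip(night_ids, differences):
        by_night[night].append(diff)
    night_means = [mean(values) for values in by_night.values()]
    n = len(night_means)
    sigma = stdev(night_means)            # sample standard deviation over nights
    s_sigma = sigma / math.sqrt(n - 1)    # assumed form, see the lead-in
    return mean(night_means), sigma, s_sigma


# Hypothetical differences (K) for three nights with two samples each.
example = nightly_statistics([1, 1, 2, 2, 3, 3], [0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
```

Averaging within each night first means that autocorrelated samples from the same night contribute a single value, so the spread is estimated from quantities that can be treated as independent.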

Table 3. Mean of Each Night's Average Temperature Differences in 2000 (Analyzed − Observed), Standard Deviation (σ), and Uncertainty in the Standard Deviation (s(σ))
Difference                      Mean ± σ ± s(σ)
T1 − T2                         −0.35 ± 0.22 ± 0.05
ECMWF − 〈T〉                    −0.63 ± 0.46 ± 0.11
MO − 〈T〉                       0.31 ± 0.67 ± 0.16
NCEP − 〈T〉                     0.49 ± 1.23 ± 0.29
DAO − 〈T〉                      0.58 ± 0.92 ± 0.22
REA − 〈T〉                      1.10 ± 4.21 ± 0.99
ECMWF low resolution − 〈T〉     −0.51 ± 0.56 ± 0.13
Using observed pressure
ECMWF − 〈T〉                    −0.47 ± 0.39 ± 0.09
MO − 〈T〉                       0.49 ± 0.62 ± 0.14

[20] From Table 3 it can be inferred that the standard deviation of the ECMWF nightly means in 2000 is smaller than the MO and DAO standard deviations with 68% confidence, whereas it is smaller than the NCEP and REA standard deviations with more than 95% confidence.
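One way to reproduce these confidence statements from the Table 3 entries is sketched below. The test itself is an assumption, since the text does not state how the confidence levels were derived: the difference of two standard deviations is divided by the quadrature sum of their uncertainties and compared with the Gaussian thresholds of 1 (68%) and 1.96 (95%), treating the uncertainties as independent.

```python
import math


def sigma_separation(sigma_a, s_a, sigma_b, s_b):
    """Difference sigma_b - sigma_a in units of its combined (quadrature) uncertainty."""
    return (sigma_b - sigma_a) / math.sqrt(s_a ** 2 + s_b ** 2)


# (sigma, s(sigma)) in K, taken from Table 3.
ECMWF = (0.46, 0.11)
OTHERS = {
    "MO": (0.67, 0.16),
    "DAO": (0.92, 0.22),
    "NCEP": (1.23, 0.29),
    "REA": (4.21, 0.99),
}

z_scores = {name: sigma_separation(*ECMWF, *values) for name, values in OTHERS.items()}

for name, z in z_scores.items():
    level = ">95%" if z > 1.96 else (">68%" if z > 1.0 else "<68%")
    print(f"{name}: z = {z:.2f} ({level})")
```

With these inputs the MO and DAO separations fall between 1 and 1.96, while the NCEP and REA separations exceed 1.96, which matches the wording of the text under the stated assumptions.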

[21] The observed pressure was not used in 2000 because it deviated by up to 5 hPa from the pressure calculated from the GPS heights. The difference ± standard deviation of the calculated and observed pressure is 1.6 ± 0.9 hPa. Using observed instead of calculated pressure to calculate the temperature differences results in smaller standard deviations for ECMWF and MO, as shown in the last two rows of Table 3. This could be because the uncertainties in the GPS heights cause substantial errors in the calculated pressure at the lower levels.

[22] Figures 4–6 show the differences of the analyzed temperatures with respect to the MIR temperatures as a function of time. In 1999 the results for both flights are plotted on top of each other. Some of the analyses show a long-wavelength signal, especially NCEP and REA. The largest positive discrepancies for REA in 2000 occur over Russia. Some of this long-wavelength signal could be due to stationary anomalies in stratospheric analyses, which might be caused by regional differences in the radiosondes used [Bowman et al., 1998]. However, the large discrepancies seen in REA in 2000 cannot be explained by these anomalies. Other explanations include displacements of the temperature field. In Figures 4 and 5 (and to a lesser extent Figure 6) there is often a larger temperature discrepancy at the beginning of each night, when the balloon descends from higher altitude. Especially in 1999 during the major warming the vertical temperature gradient is not determined correctly by any of the models except NCEP and ECMWF50. Only a few measurements are taken during the descent, so these large discrepancies have only a minor impact on the biases given in Table 2.

Figure 4.

1997 differences between various analyses and MIR temperatures as a function of time. Also, the temperature is shown.

Figure 5.

As Figure 4 but for 1999. Notice the changes in scale.

Figure 6.

As Figure 4 but for 2000. Also, the difference between the two temperature sensors is shown. Notice the change in scale for the REA analyses.

[23] Figure 7 shows the temperature errors of the various models as a function of the MIR temperature for 2000 (here only the most recent data are shown). The three models with the lowest spatial resolution (MO, NCEP, and REA) seem to have a cold bias at the highest temperatures. Of special interest is the accuracy of the analyses at temperatures below TNAT (around 195 K in this case). The same three analyses (MO, NCEP, and REA) apparently have a warm bias below TNAT. In Table 4 the slopes of the regression lines are given for each year. In 2000 the intercepts are 17.4, 16.1, 15.4, −3.3, and −1.2 K for MO, DAO, NCEP, REA, and ECMWF, respectively. With slope and intercept it is possible to correct each of the five analyses, but the correction is only valid around 1 March 2000.
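The regression and the correction it allows can be sketched as follows. If the error is modeled as T_model − T_MIR = a·T_MIR + b, then a corrected temperature follows from T_MIR ≈ (T_model − b)/(1 + a). The data below are synthetic, generated from an illustrative slope and intercept of the same magnitude as the 2000 values quoted in the text, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_analysis(t_model, slope, intercept):
    """Invert T_model = T_obs + slope * T_obs + intercept to estimate T_obs."""
    return (t_model - intercept) / (1.0 + slope)


# Synthetic example: observed temperatures (K) and analysis values built from
# an assumed error law with slope -0.09 and intercept 17.4 K (illustrative only).
t_mir = [190.0, 195.0, 200.0, 205.0, 210.0]
t_model = [t - 0.09 * t + 17.4 for t in t_mir]

errors = [m - o for m, o in zip(t_model, t_mir)]          # model minus MIR
slope, intercept = fit_line(t_mir, errors)
t_corrected = [correct_analysis(m, slope, intercept) for m in t_model]
```

A negative slope with a positive intercept gives a warm bias at low temperatures and a cold bias at high temperatures, and the correction inherits the restriction stated in the text: it is meaningful only for the period and temperature range over which the regression was fitted.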

Figure 7.

2000 temperature errors of the models with respect to the average MIR temperature as a function of the MIR temperature. The linear regression line is shown as well as the slope of the regression line and the 68% confidence limits of the slope. Some of the data did not fit into the plot.

Table 4. Slope of Correlation Line Between Temperature Difference (Model − MIR) and MIR Temperature
1997    −0.01 ± 0.006    0.02 ± 0.01     0.01 ± 0.01     0.07 ± 0.01     −0.01 ± 0.01    NA
1999    0.31 ± 0.027     0.00 ± 0.01     −0.03 ± 0.01    0.03 ± 0.01     −0.23 ± 0.01    0.02 ± 0.01
2000    NA               −0.09 ± 0.01    0.02 ± 0.01     −0.07 ± 0.01    −0.08 ± 0.02    0.00 ± 0.01

[24] The PSC temperatures were observed during two consecutive nights, and the observed and calculated temperatures are shown in Figure 8. The ECMWF temperatures represent the measurements quite well during the first night, whereas they are a bit too cold during the second night. The other high-resolution analysis, DAO, also captures the observed temperatures quite well during the first night, but it is too warm during the second. MO and NCEP do not quite capture the minimum temperatures, and this could partly be explained by their lower resolution. The magenta line thus shows the result of degrading the ECMWF horizontal and temporal resolution to the one used by MO. It was mainly the degradation of the temporal resolution that mattered, so this does not explain the large discrepancies of the REA data, which may be due to an error (see section 3.5). Table 3 shows that using the degraded resolution does increase the standard deviations but not significantly so. The difference between the standard deviations of MO and ECMWF at the same horizontal and temporal resolution is not significant either. Using higher horizontal resolution (up to 0.4° is possible at ECMWF) has hardly any effect, except in cases of stronger lee waves. In this example the MO, NCEP, and REA analyses would severely underestimate the PSC activation, but more data are needed to confirm this. However, this is in agreement with Manney et al. [2002] and Davies et al. [2002], who show that the biases may change during the winter.

Figure 8.

Observed and analyzed temperatures during the period where temperatures below TNAT were encountered in 2000. The ECMWF temperatures are shown both for normal resolution (1.5° latitude-longitude grid, 6 hourly analyses) (red) and for low resolution (2.5° latitude by 3.75° longitude, 24 hourly analyses) (magenta).

[25] Another contribution to the scatter between analyses and actual temperature is the sub-grid-scale waves caused by the upward propagation of orographic gravity waves. Figures 9–11 show the scatter of the difference between the analyses (ECMWF31 in 1997, ECMWF50 in 1999, and ECMWF60 in 2000) and the MIR temperatures relative to the average calculated for each night. Since the horizontal displacement of the MIR at night is usually larger than the analysis grid spacing (up to 1500 km in 1997 during a single night) and the vertical excursion covers several model layers, it provides a reasonable estimate of the impact of sub-grid-scale features. The average amplitude of the 1 standard deviation scatter compared to the low vertical resolution ECMWF31 varies from ±1.16 K during the first flight of 1997 to ±0.93 K in the second in 1997. It does not significantly improve in 1999 in ECMWF50 (±0.97 K on average) or in 2000 in ECMWF60 (±0.90 K on average) even though the spacing between levels is reduced to 1.5 km. Only a few cases of lee waves of ±2.5 K amplitude have been positively identified above the Ural and the Siberian mountain ridge associated with fast surface westerly winds [Pommereau et al., 1999]. Another case with ±9.5 K amplitude was identified above Scandinavia [Hertzog et al., 2002]. Otherwise, there is no correlation with orography, except that there is generally a noticeable drop in scatter when the balloon passes over the sea. For example, during the second MIR flight in 1997 the night numbers 2, 8, 14, 19, and 24 are spent over the Arctic Sea at around 180°E. During these nights, Figure 9 shows a standard deviation below 1 K in agreement with Murphy and Gary [1994]. Over land the scatter can occasionally also be low due to, for example, weak surface winds. There is also a drop in scatter in 1999, when the balloon was returning to the northern Atlantic in an easterly flow, where gravity waves usually do not propagate upward. While we thus have not seen a major impact of gravity waves on the MIR temperatures, Dörnbrack and Leutbecher [2001] have shown that gravity waves might strongly enhance the potential for ice formation, at least over Scandinavia.

Figure 9.

1997 analyzed−observed temperature relative to the average of each night (top panel), the standard deviation for each night (middle), and the corresponding orography height (bottom).

Figure 10.

As Figure 9 but for 1999.

Figure 11.

As Figure 9 but for 2000.

5. Conclusions

[26] We have compared temperatures from ECMWF, MO, DAO, NCEP, and REA to observed temperatures in late winter/early spring in 1997, 1999, and 2000. Occasional large errors occur in all the analyses and consist of the following: (1) a cold bias up to more than 20 K above 30 hPa in the operational ECMWF model in the winter 1998–1999, possibly due to the assimilation of coarse resolution satellite data at the top of the model; this error was also seen in the winter 1997–1998; preoperational runs with the new ECMWF model, which became operational on 5 May 1999, do not show this error; (2) a warm bias of about 20 K at 5 hPa in the MO, DAO, and NCEP analyses during one night in February 1999 during a major warming; (3) an alternating cold and warm bias of up to 14 K magnitude from 30 to 100 hPa in the NCEP/NCAR reanalyses in 2000.

[27] Because of solar heating of the gondola, only nighttime measurements could be used. In 2000 the ECMWF standard deviations are lower than the MO and DAO standard deviations with more than 68% confidence, whereas they are lower than the NCEP and REA standard deviations with more than 95% confidence. If the first two nights, where NCEP had a particularly large bias of up to 8 K above 30 hPa, were removed, the NCEP standard deviation would be larger than the ECMWF standard deviation with only 68% confidence. If the ECMWF resolution is degraded to the resolution of MO, the difference in standard deviation between the two is not significant even at the 68% confidence level. These results for 2000 could change in different meteorological conditions, but during a major warming in 1999, the new ECMWF model also had substantially smaller biases and standard deviations than the other models. The situation might change, however, when other models start assimilating AMSU data (as MO did in November 2000 with a 3-D variational assimilation).

[28] During two nights, temperatures below TNAT were encountered. We found that models with 24 hourly analyses (MO and NCEP) did not capture the temperature minima. Despite their 6 hourly analyses, the REA data did not capture the temperature minima at all, but reprocessing of the data to correct for an error in the filtering of the TOVS data (since March 1997) may improve the situation. However, two nights are not enough to draw any robust conclusions. Although a few cases of lee waves have been positively identified, no major impact of lee waves on the temperatures was found.


[29] ECMWF, MO, NASA, and NCEP are thanked for providing their analyses and the radiosonde stations for their work. In particular, we acknowledge A. P. McNally and Adrian Simmons for making preoperational ECMWF data available to us, Richard Swinbank for valuable comments and the MO UARS correlative data, Ron Nagatani for his help in acquiring the NCEP CPC analyses, and Don Hooper of NOAA-CIRES Climate Diagnostics Center (CDC) for his help in obtaining the NCEP/NCAR reanalyses. Portions of the NCEP/NCAR reanalyses were provided by CDC, from their Web site. The project was supported by the French Programme of Atmospheric Chemistry (PNCA), the Centre National d'Etudes Spatiales (CNES), and the DG XII of the European Commission (projects Lagrangian Experiment and THESEO 2000 - EUROSOLVE under contract ENV4CT970504 and EVK2-CT-1999-00047, respectively).