Comparisons between observed AIRS radiances and radiances computed from coincident in situ profile data are used to validate the accuracy of the AIRS radiative transfer algorithm (RTA) used in version 4 processing at Goddard Space Flight Center. In situ data sources include balloon-borne measurements with RS-90 sensors and frost point hygrometers and Raman lidar measurements of atmospheric water vapor. Estimates of the RTA accuracy vary with wave number region but approach 0.2 K in mid- to lower-tropospheric temperature and water vapor sounding channels. Temperature channel radiance biases using ECMWF forecast/analysis products are shown to be essentially identical to those observed with coincident sonde observations, with somewhat higher biases in water vapor channels. Some empirical adjustments to the RTA channel-averaged absorption coefficients were required to achieve these stated accuracies.
 The Atmospheric Infrared Sounder (AIRS) on NASA's Aqua satellite platform [Aumann et al., 2003] measures 2378 high-spectral resolution infrared radiances between 650 and 2665 cm−1 with a nominal resolving power (λ/Δλ) of 1200. Atmospheric profiles are retrieved with an iterative algorithm [Susskind et al., 2003] that minimizes the differences between the observed (or cloud-cleared) radiances and radiances computed with the AIRS radiative transfer algorithm (RTA). A key component of the retrieval accuracy is therefore the accuracy of the AIRS RTA, which is often referred to as the fast (forward) model.
 This paper presents the continuing validation of the AIRS RTA, specifically the RTA used in the AIRS version 4 processing package that was used to reprocess all AIRS data beginning in mid-2005. We have previously discussed the prelaunch version of the AIRS RTA [Strow et al., 2003a]. That work concentrated on the fast parameterization of the channel-averaged transmittances and some of the unique spectroscopy included in these transmittances. The error characteristics of the AIRS RTA discussed by Strow et al. [2003a, see Figures 3 and 4] were the fast model RMS fitting errors, which may impact individual observations, but should be minimal in the statistical means that are the subject of this paper.
 Differences between AIRS radiances computed from validation profile data and observed radiances presented here can arise from a variety of sources, including errors in the (1) AIRS radiometric calibration; (2) AIRS spectral calibration and instrument line shape; (3) AIRS fast model parameterization; (4) cloud contamination in fields of view selected as clear; (5) validation data, including time/space mismatches and uncertainties in minor gas abundances; and (6) spectroscopy used in the AIRS RTA. Much more research is needed to reliably assign observed bias differences to these various error sources; however, the results presented here suggest that the AIRS RTA errors are approaching the nominal AIRS noise level of 0.2 K, which is sufficient to meet AIRS weather forecasting requirements. Other studies (see section 2.2) have shown that the AIRS radiometric calibration is good to ∼0.1–0.2 K. The AIRS version 4 RTA assumes that the channel center frequencies are fixed at their nominal values as of September 2002. Since drift of the AIRS frequencies could introduce biases into the RTA validation statistics, we summarize here our analysis of the AIRS frequency calibration (see section 4), which suggests that the effects of frequency drifts during the validation period are at most a ∼0.1 K brightness temperature shift.
 The scope of this paper is generally limited to describing how we determined bias differences between observed radiances and those computed from validation profiles with the AIRS RTA, and modifications to the version 4 AIRS RTA based on early validation studies. The general sources of these bias errors are discussed, but a detailed analysis of their origins, especially given that AIRS has ≈2000 low-noise channels, will not be given here.
 Two categories of validation data are examined here: (1) in situ measurements of the atmospheric state with balloon-borne sondes and Raman lidar recorded coincident with AIRS overpasses from a variety of sources/campaigns and (2) European Center for Medium-Range Weather Forecasts (ECMWF) global analysis and forecast fields.
 Our strategy was to evaluate the RTA performance with ∼10% of the early validation data, modify the RTA transmittances if warranted on the basis of these data, and then evaluate the final version 4 RTA with all the in situ sonde validation data. The ECMWF data provided a secondary test of the RTA for channels dominated by CO2 over a much wider range of atmospheric conditions, with biases almost identical to the sonde biases. ECMWF water vapor fields are not sufficiently accurate for AIRS validation and exhibit much larger standard deviations than the sonde data for channels primarily sensitive to water vapor.
Section 2 provides some background on the development of the RTA, reviews the AIRS radiometric accuracy [Aumann et al., 2006], and introduces our strategy for validating the AIRS RTA. In section 3 we introduce the various validation data sets used for this work. Section 4 utilizes some of these data sets to validate the AIRS spectral calibration and establishes liens against the RTA caused by the use of a fixed frequency scale. Section 5 discusses the core validation of the RTA and is the main result of this work. Section 6 presents limitations to the daytime validation because the current RTA does not model non-LTE emission. Liens against the current RTA because of variability in atmospheric gases that are not available in the validation data sets are reviewed in section 7.
2.1. AIRS RTA
 The AIRS RTA effectively parameterizes atmospheric transmittances in 100 pressure layers [Strow et al., 2003a] using the AIRS spectral response functions (SRFs) measured during prelaunch testing [Strow et al., 2003b]. (The AIRS SRFs are available from the authors, or can be found at http://asl.umbc.edu/pub/airs/srf). The RTA includes surface and atmospheric emission/absorption, as well as surface reflected thermal and solar terms. The vast majority of channels have RMS transmittance fitting errors of less than 0.1 K, with a mean error of about 0.04 K with an independent profile set (see details given by Strow et al. [2003b]). These numerical fitting errors are small compared to estimated spectroscopy errors of ∼0.2 K or more.
 The version 4 AIRS RTA allows the user to vary the H2O, O3, CH4, and CO mixing ratios. In addition, the CO2 profile can be adjusted with a constant scale factor. All other gases are fixed, and their absorption coefficients are lumped together into the fixed gas absorption coefficients (including CO2) assuming reasonable values for their abundances.
 The starting point for the spectroscopy used in the AIRS RTA is largely provided by our pseudo line-by-line algorithm kCARTA [Strow et al., 1998; De Souza-Machado et al., 2002] which is based on the HITRAN 2000 line parameter database [Rothman et al., 2003] and a number of recent improvements to the CO2 line shape [see Strow et al., 2003a, section III] and the MT-CKD water vapor continuum [Clough et al., 2005]. However, as will be discussed later, some of the AIRS version 4 RTA absorption coefficients were modified on the basis of early validation studies.
 The actual radiance computations presented here use a version of the AIRS RTA that can be run without the complete machinery of the AIRS operational retrieval system. The code, called the Stand-Alone AIRS Radiative Transfer Algorithm (SARTA), is available upon request (see http://asl.umbc.edu/pub/rta/sarta for details).
2.2. Radiometric Calibration
 A prerequisite to the AIRS RTA validation is the establishment of accurate Level 1B AIRS radiances. Two independent approaches to AIRS radiometric validation presented in this issue [Tobin et al., 2006b; Aumann et al., 2006] establish that the AIRS radiometric calibration is accurate to ∼0.1–0.2 K. Aumann et al. [2006] compared AIRS derived SSTs to NOAA/NCEP's RTGSST sea surface temperature product and found agreement to better than 0.1 K. This approach does weakly depend upon the accuracy of the water continuum in the AIRS 2616 cm−1 channel, which is difficult to measure since it is so small. Aumann et al. [2006] have also shown that the AIRS radiometric stability is better than 0.01 K/year, which is a prerequisite to our analysis of the effect of variable atmospheric CO2 on computed AIRS radiances presented in this paper. Tobin et al. [2006a] compared observed AIRS radiances to the Scanning-HIS radiometer high-spectral resolution radiances recorded while under-flying AIRS and found agreement of better than 0.2 K between the two measurements for channels that are not sensitive to absorption above the aircraft altitude (∼20 km).
 Successful processing of the AIRS data into geophysical variables as soon as possible is a high priority for the NASA EOS program. Although an intense 2-year validation campaign was executed by NASA for validation of AIRS, the requirement to generate retrievals of all data as soon as possible only allowed us to use a small subset of these data, namely the ARM-TWP and ARM-SGP Phase 1 data sets [Tobin et al., 2006a], for validation and improvement of the RTA before it was delivered to NASA for version 4 processing. Since many sounding channels have a significant component of surface emission, we restricted our initial validation to the ARM-TWP site RS-90 sonde data, since that site's ocean location allowed us to accurately characterize the surface emissivity term in the RTA. We also restricted ourselves to night-only scenes to avoid reflected solar radiation in the shortwave AIRS channels.
 Our early examination of the biases between the observed AIRS radiances and those computed from our initial RTA using the ARM-TWP RS-90 sonde profiles highlighted a number of spectral channels with biases larger than desired. Some of these biases were in spectral regions sensitive to atmospheric variables outside the measurement range of the RS-90 sondes, especially high-altitude temperature, and ozone. However, many spectral regions had biases that were small enough to suggest they may be spectroscopic in origin, but large enough that they were likely not due to sonde inaccuracies. This led us to make adjustments to fast model transmittances in channels where we felt that the sonde data are more accurate than the spectroscopy, based solely on the ARM-TWP nighttime sonde data that we had in hand at the time.
 Note that many of the channels that we adjusted (see section 5.1 for details) are sensitive to CO2 or H2O transmittances under cold conditions and long path lengths that are difficult to accurately replicate in the laboratory. Moreover, some of the largest adjustments, which were made to the CO2 absorption coefficients and improve the bias by ∼0.8 K, are equivalent to a difference of only 0.02 in transmittance in a laboratory spectrum. This fact has led us to the conclusion that further improvements to the spectroscopy required for high-spectral resolution sounders will probably come from validation data such as presented here rather than from laboratory spectroscopy.
 The biases and standard deviations between observed and computed radiances presented here for the sonde data include a much larger data set than what was used to modify the RTA transmittances. The ARM-TWP Phase 1 data represent less than 10% of the sonde data used to generate the bias statistics, giving us a rather large independent data set for validating the RTA. The brightness temperature biases change by less than 0.05 K max if the ARM-TWP Phase 1 sondes are removed from the validation data set. In addition, we present here RTA biases using ECMWF analysis/forecast model fields, which are almost identical (for temperature sounding channels) to those derived from the sonde data set. The ECMWF data set covers a much wider range of profile types, but with similar biases and standard deviations.
 ECMWF started assimilating AIRS radiances in early October 2003 (A. Collard, private communication, 2005). This means that, in principle, the ECMWF fields are no longer a truly independent data set for bias evaluation of AIRS. However, by comparing AIRS biases between identical months of different years, we see no significant changes in the biases before and after assimilation of the AIRS data. This is not surprising, since the initial assimilation of AIRS data at ECMWF is using very small amounts of AIRS data that are weighted rather low until the system is better understood. In any case, we use ECMWF here as a secondary validation source for CO2 channels, primarily to give us a wider range of atmospheric states than we have with the sonde validation data sets.
3. Validation Data Sets
3.1. Sonde/LIDAR Data Set
 The AIRS validation effort benefited from a large number of campaigns that launched balloon-borne temperature and humidity sensors and recorded Raman lidar water profiles coincident with AIRS overpasses [Fetzer, 2006]. Table 1 lists the in situ validation data sets used in the RTA validation, and Figure 1 shows the locations of the sites used to record these data. Tobin et al. [2006a] describe the most comprehensive set of AIRS validation measurements, taken at the Department of Energy's Atmospheric Radiation Measurement Program's (ARM) Southern Great Plains (SGP) and Tropical Western Pacific (TWP) sites. The phases for the ARM data sets refer to time periods from 2002 to 2004 that ranged from 4 to 8 months in length.
Table 1. Number of Sonde Launches, LIDAR Observations, Coincident With AIRS
Table 1 lists the various data sets used for the AIRS RTA validation and nominal values for the number of sondes launched (or LIDAR measurements) during the AIRS validation campaign that we used for this work. The McMillan/ABOVE data sets (W. McMillan, private communication, 2004) were Vaisala RS-90 sondes released from the Chesapeake Bay Lighthouse, located approximately 27 km east from the mouth of the Chesapeake Bay. This provides an ocean scene, although care must be taken to reject nearby AIRS fields of view that are contaminated by land. The Minnett data sets [Szczodrak et al., 2004] are RS-90s (some RS-80s were also launched, but not used in this work) launched from the Explorer of the Seas cruise ship in the Caribbean. The Vömel data sets [Miloshevich et al., 2006] are frost point hygrometers from the NOAA CMDL laboratory that were launched in a variety of locations. The Whiteman/LIDAR data sets are Scanning Raman LIDAR measurements of water vapor profiles [Whiteman et al., 2006]. The sonde releases examined here were generally limited to those launched within one hour of the AIRS overpass and 30 km of the AIRS FOV center location.
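The coincidence criteria stated above (launch within one hour of the overpass and within 30 km of the FOV center) can be sketched as a simple filter. This is only an illustration under stated assumptions: the record layout (dictionaries with `time`, `lat`, `lon` keys), the function names, and the use of a spherical-Earth haversine distance are hypothetical and are not taken from the processing code used for this work.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical-Earth assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_coincident(sonde, fov, max_hours=1.0, max_km=30.0):
    """True if the sonde launch lies within the time and distance windows of the FOV.

    `sonde` and `fov` are hypothetical records: dicts with 'time' (datetime),
    'lat' and 'lon' (degrees).
    """
    hours_apart = abs((sonde["time"] - fov["time"]).total_seconds()) / 3600.0
    km_apart = great_circle_km(sonde["lat"], sonde["lon"], fov["lat"], fov["lon"])
    return hours_apart <= max_hours and km_apart <= max_km
```

The default thresholds mirror the nominal values quoted in the text; the text notes that the limits were "generally" applied, so any real selection may have deviated from them.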
 A surface with a well-characterized emissivity is needed to validate AIRS channels that contain significant emission from the surface, which includes a large number of temperature and water vapor sounding channels. For this reason, we used scenes over ocean for much of this work. Because of the difficulty in characterizing solar reflection in the shortwave channels, we also limited much of our validation efforts to nighttime scenes. The ARM-TWP, McMillan/ABOVE, and Minnett sonde data sets satisfy these requirements. The ARM-SGP, Vömel, and LIDAR data sets were used for validation of water vapor channels peaking in the mid- and upper troposphere (∼1250–1620 cm−1) since surface emission was less important for these channels, and the data could be screened sufficiently for clouds even over land.
 An algorithm developed by Aumann et al. was used for cloud filtering. This technique tests for clear conditions using a single FOV, and has been tested extensively against more straightforward tests that use scene uniformity as the basic test for clear fields of view. Scenes with cloud contamination greater than 0.5 K in brightness temperature are estimated to be eliminated by this filter.
 This estimate appears reasonable, since all the average window scene biases for each of the validation sites studied here were 1 K or less. For the ARM-TWP, ABOVE, and ocean Vömel sites we used the ECMWF model SSTs in the computed radiances, for the Minnett site we used Minnett's MAERI derived SSTs, and for the ARM-SGP sites we used the surface temperature and emissivity values given by Tobin et al. [2006a]. For the LIDAR cases the ECMWF land surface model temperatures were used. The sea surface emissivity by Masuda et al. was used for ocean scenes. The more recent model by Wu and Smith may be more appropriate, but the difference between these two models for the profiles used here is only on the order of 0.05 K. The land scene data were only used to examine mid- to upper tropospheric water biases, which are relatively insensitive to surface emission.
 In order to study bias errors of channels with low-altitude weighting functions, we retrieved an effective SST from the nighttime spectra using a set of channels between 2600 and 2633 cm−1 that have very little water absorption. This effective SST is the radiative skin temperature, if there is no cloud contamination, and partially accounts for residual cloud contamination if it does exist. Our computed radiances use this retrieved effective SST, which lowered the window bias errors to close to zero (by definition) in the shortwave windows, and to about 0.2 K max in the longwave, with the largest errors in the 800 cm−1 region. Simulations of this process indicate that the increase in the longwave bias after fitting for an effective SST in the shortwave is probably due to small residual cloud contamination, since this difference does not appear to depend on the water column amount.
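The idea of fitting an effective SST so that the shortwave window bias goes to zero can be illustrated with a toy calculation. The sketch below is not the retrieval used for this work: the forward model `toy_window_radiance` (a single transmittance, emissivity, and atmospheric temperature) is a hypothetical stand-in for the RTA, and the fixed-point update is an assumed scheme that works because window-channel brightness temperature responds almost one-to-one to surface temperature. The two channel frequencies are window channels named elsewhere in this paper.

```python
import math

C1 = 1.191042e-5  # first radiation constant, mW m^-2 sr^-1 (cm^-1)^-4
C2 = 1.4387752    # second radiation constant, K cm


def planck_radiance(nu, temp):
    """Planck radiance at wave number nu (cm^-1) and temperature temp (K)."""
    return C1 * nu ** 3 / (math.exp(C2 * nu / temp) - 1.0)


def brightness_temperature(nu, radiance):
    """Inverse Planck function: brightness temperature (K) for a radiance."""
    return C2 * nu / math.log(1.0 + C1 * nu ** 3 / radiance)


def toy_window_radiance(nu, t_surf, emis=0.98, trans=0.97, t_atm=285.0):
    """Hypothetical window-channel forward model (NOT the AIRS RTA)."""
    surface = emis * trans * planck_radiance(nu, t_surf)
    atmosphere = (1.0 - trans) * planck_radiance(nu, t_atm)
    return surface + atmosphere


def retrieve_effective_sst(nus, obs_bt, forward, first_guess, max_iter=50, tol=1e-5):
    """Adjust the surface temperature until the mean window bias vanishes."""
    t_surf = first_guess
    for _ in range(max_iter):
        calc_bt = [brightness_temperature(nu, forward(nu, t_surf)) for nu in nus]
        residual = sum(o - c for o, c in zip(obs_bt, calc_bt)) / len(nus)
        t_surf += residual  # fixed-point step; assumes dBT/dTs is close to 1
        if abs(residual) < tol:
            break
    return t_surf


# Synthetic check: "observations" generated with a 300 K surface.
window_nus = [2610.8, 2616.1]
obs_bt = [brightness_temperature(nu, toy_window_radiance(nu, 300.0)) for nu in window_nus]
sst = retrieve_effective_sst(window_nus, obs_bt, toy_window_radiance, first_guess=298.0)
```

Because the fit forces the mean shortwave window residual to zero, the shortwave window bias is zero by construction, which is why only the longwave window bias carries information after this step.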
Table 2 lists the approximate percentage and the absolute number of nighttime clear sonde/LIDAR observations that made it through our cloud-filtering process. The high percentage of clear observations from Minnett is presumably due to the fact that conditions are generally clear in the Caribbean.
Table 2. Summary of Number of Clear Observations Over Ocean at Night
Figure 2 gives a summary of the resulting nighttime ocean sonde biases. Figure 2 includes the mean AIRS brightness temperature spectrum for these cases, which can be referenced later when looking at the many bias spectra shown in this paper that do not include the brightness temperature spectrum itself. Detailed discussion of the biases in Figure 2 is given in section 5.
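For readers who want to reproduce this kind of summary, the per-channel bias and standard deviation can be computed as sketched below. The function name, the list-of-spectra layout, and the sign convention (observed minus computed) are assumptions made for illustration; the text does not state the sign convention explicitly.

```python
from statistics import mean, stdev


def bias_statistics(obs_bt, calc_bt):
    """Per-channel mean and sample standard deviation of obs - calc (K).

    obs_bt and calc_bt are lists of spectra; each spectrum is a list of
    brightness temperatures with one entry per channel.
    """
    n_chan = len(obs_bt[0])
    bias, spread = [], []
    for ch in range(n_chan):
        diffs = [o[ch] - c[ch] for o, c in zip(obs_bt, calc_bt)]
        bias.append(mean(diffs))
        spread.append(stdev(diffs))
    return bias, spread


# Three hypothetical matchups, two channels each (values are made up).
obs = [[250.0, 290.0], [251.0, 291.0], [249.0, 289.4]]
calc = [[249.8, 290.1], [250.9, 290.8], [248.9, 289.5]]
bias, spread = bias_statistics(obs, calc)
```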
3.2. ECMWF Data Sets: Clear Field-of-View Selection
 Comparisons between observed radiances and radiances computed from ECMWF analysis/forecast model fields provided a much larger validation data set over a wider geographical region and range of atmospheric conditions. In addition, this global data set allowed us to build up significant statistics quickly before the sonde data sets were available. ECMWF is expected to have very good statistical accuracy since it assimilates a wide range of global data, including sondes. One result of the work presented here is validation of the ECMWF model statistical accuracy, relative to our sonde observations.
 Bias calculations between observed and computed AIRS radiances using ECMWF data were restricted to ocean scenes in order to easily characterize the surface emission. We implemented a clear filter early in the AIRS mission, and have applied it to all AIRS Level 1B radiances for more than 29 months of data. This filter, described below, accepts about 1–2% of all Level 1B data; the accepted observations are stored in an online database that we will call the “uniform_clear” data set.
 The term uniform_clear is suggestive of the basis of this clear filter, which requires adjacent scenes to have nearly identical radiances in window channels. The observed brightness temperatures of four window channels (900.22, 960.95, 2610.8, and 2616.1 cm−1) for the field of view (FOV) under consideration must all be within 0.3 K for all eight adjacent scenes. In addition, a small selection of window channel brightness temperatures (including the 2616 cm−1 channel at night) must agree within 4 K with brightness temperatures computed from the nearest (in time and space) ECMWF model. This test removes scenes with uniform stratus clouds. Finally, the observed SST (computed from these window channels with the atmospheric correction computed using the ECMWF profile) must be greater than 273 K in order to avoid ice. These threshold values are saved in the data set, so they can be made more stringent at a later time. The 0.3 K uniform scene requirement is quite strict, and is the major determiner of a clear scene. During the daytime, our uniform_clear filter derives the effective SST from only the longwave channels. Our uniform_clear filter gives almost identical biases (within 0.02 K) on large data sets as the filter by Aumann et al. that we used for the sonde data set filtering.
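The three tests that make up the filter can be sketched as follows. This is an illustration, not the operational filter: the dictionary layout and function name are hypothetical, the uniformity test is interpreted here as each neighbor agreeing with the center FOV, and the ECMWF comparison is applied to the same four channels because the text does not list the "small selection" actually used.

```python
WINDOW_CHANNELS = (900.22, 960.95, 2610.8, 2616.1)  # cm^-1, as listed in the text


def is_uniform_clear(center_bt, neighbor_bts, model_bt, sst,
                     uniform_tol=0.3, model_tol=4.0, min_sst=273.0):
    """Apply the uniformity, model-agreement, and ice tests to one FOV.

    center_bt, model_bt: dicts mapping channel frequency -> brightness temperature (K)
    neighbor_bts: list of eight such dicts, one per adjacent FOV
    sst: observed sea surface temperature (K)
    """
    if len(neighbor_bts) != 8:
        return False  # edge-of-swath FOVs lack a full set of neighbors
    # Test 1: all eight neighbors agree with the center FOV in every window channel.
    uniform = all(abs(nb[ch] - center_bt[ch]) <= uniform_tol
                  for nb in neighbor_bts for ch in WINDOW_CHANNELS)
    # Test 2: agreement with model-computed values (rejects uniform stratus).
    near_model = all(abs(center_bt[ch] - model_bt[ch]) <= model_tol
                     for ch in WINDOW_CHANNELS)
    # Test 3: SST threshold to avoid ice.
    return uniform and near_model and sst > min_sst
```

Keeping the thresholds as arguments reflects the point made in the text that they are stored with the data set so they can be tightened later.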
 As with the sonde data, we derived an effective SST from shortwave channels that are very insensitive to water, and used that derived effective SST in our bias calculations. The mean derived effective SST between ±45° latitude was about 0.5 K larger than the ECMWF value, which is within about 0.1 K of the expected bias (see discussion by Aumann et al.).
4. Spectral Calibration
 The AIRS RTA cannot be constructed without accurate spectral response functions (SRFs) and thus we consider the spectral calibration of AIRS part of validation of the forward model. We define the AIRS SRFs to include both the SRF shape and the SRF center frequencies. The SRF shapes and center frequencies were measured during prelaunch testing [Strow et al., 2003a]. There is no direct way to characterize the AIRS SRF shapes in orbit, although SRF width errors may have a unique bias signature. What can be measured accurately are the SRF centroids, which were expected to shift slightly during launch. However, these small shifts, which are due to a movement of the AIRS focal plane relative to the spectrometer optical axis, apply to all channels, if one uses units of a fraction of the SRF width. Early postlaunch evaluation by Gaiser et al.  shows a negative shift of ∼13 microns, or equivalently 13% of an SRF width, relative to prelaunch test values.
 Brightness temperature errors in the forward model due to errors in the prelaunch SRF shape measurements are estimated to be generally less than 0.1 K. This assumes that residual fringing in the SRFs, which depends on the temperature of the spectrometer entrance filter, has been characterized in orbit, which was done early in the mission. (The next section discusses the fringe effects in more detail.)
4.1. Channel Center Frequencies
 Here we summarize errors in the AIRS RTA due to shifts of the SRF center frequencies. A more detailed analysis of the AIRS center frequencies will be reported in the future. Monthly means of our uniform_clear data set of clear ocean fields of view (including both observed radiances and radiances computed from the ECMWF model fields) were zonally averaged and placed into bins 10° wide in latitude. Each of these binned observed brightness temperature spectra is cross-correlated with a series of computed spectra with slightly offset frequency scales to find the best match. The computed spectra start with the binned ECMWF spectra, which are offset in frequency using a spline fit versus wave number to determine the frequency derivatives. This technique is very accurate for a wide selection of AIRS channels.
 The peak in the correlation should occur when the computed spectrum is on the correct frequency scale. These correlations were done separately for each AIRS module (see Gaiser et al. for the definition of a module). Seven modules, comprising ∼40% of the AIRS channels, were averaged to determine the final frequency calibration. These seven modules were those with the most defined spectral structure due to atmospheric spectral lines. We also purposely avoided using channels in the 2370–2400 cm−1 spectral region, where frequency calibration errors cannot be differentiated from radiance errors.
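The correlation search described above can be illustrated on synthetic data. The sketch below is a simplified stand-in, not the analysis code: it uses a made-up spectrum of Gaussian lines, a central finite difference in place of the spline fit for the frequency derivative, a first-order Taylor expansion to offset the computed spectrum, and a plain grid search over candidate shifts. All names and numerical values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def finite_difference(values, step):
    """Derivative on a uniform grid (central differences, one-sided at the ends)."""
    n = len(values)
    deriv = [0.0] * n
    deriv[0] = (values[1] - values[0]) / step
    deriv[-1] = (values[-1] - values[-2]) / step
    for i in range(1, n - 1):
        deriv[i] = (values[i + 1] - values[i - 1]) / (2.0 * step)
    return deriv


def best_frequency_shift(observed, computed, step, candidates):
    """Return the candidate shift whose offset computed spectrum best matches observed."""
    slope = finite_difference(computed, step)
    best_shift, best_corr = None, -2.0
    for shift in candidates:
        # First-order offset of the computed spectrum onto a shifted frequency scale.
        shifted = [c + shift * d for c, d in zip(computed, slope)]
        corr = pearson(observed, shifted)
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift


# Synthetic brightness temperature spectrum with five absorption lines (made up).
LINE_CENTERS = (702.0, 705.5, 709.0, 713.2, 716.0)


def synthetic_bt(nu):
    return 280.0 - sum(20.0 * math.exp(-0.5 * ((nu - c) / 0.5) ** 2) for c in LINE_CENTERS)


STEP = 0.05        # grid spacing, cm^-1
TRUE_SHIFT = 0.010  # imposed frequency offset, cm^-1
grid = [700.0 + STEP * i for i in range(361)]
computed = [synthetic_bt(nu) for nu in grid]
observed = [synthetic_bt(nu + TRUE_SHIFT) for nu in grid]
candidates = [0.001 * i for i in range(-30, 31)]
recovered = best_frequency_shift(observed, computed, STEP, candidates)
```

The search only works where the spectrum has structure: in a flat region the derivative vanishes and every candidate shift gives the same correlation, which is consistent with restricting the analysis to modules with well-defined spectral lines.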
Figure 3 shows these results for nighttime over a 29 month period, on a relative scale calibrated in percent of the SRF full width half max (FWHM). Several features are clear. There is a yearly periodicity in the frequency shifts coupled with a small negative drift. This slow negative drift would be more pronounced in Figure 3 were it not for a sudden increase in the frequencies of +0.12% of the SRF width due to slight changes in the spectrometer operating temperature that occurred after the Aqua spacecraft was shut down for about one week to protect it from a very large solar flare in November 2003. The periodic nature of the frequency calibration also depends on latitude. However, we believe that latitude is actually a proxy for orbital time since the most likely cause of these shifts with latitude is the instrument response to solar illumination. Note that the daytime spectral calibration shown in Figure 4 has a different pattern of variations, since these occur at different orbital times. The seasonal component of these periodic variations is most likely a function of the solar beta angle (the angle between the Earth-Sun line and the Aqua orbit plane), which varies throughout the year. A limitation of this work is the lack of data at the higher latitudes, where there may be slightly larger frequency excursions.
 An examination of the frequency drifts between month i and month i + 12 shows a clear secular drift that was about −0.25% in mid-2003 declining to −0.16% of a FWHM by mid-2004. Extrapolation of this trend suggests that the slow nonperiodic drift of the frequencies will cease sometime in 2005. We therefore expect variations of approximately 0.6% of a FWHM throughout the mission. The main conclusion here is that the AIRS frequencies are extremely stable, and will shift less than the original specification of 1% of the SRF FWHM during the mission [Gaiser et al., 2003]. Figure 5 compares the AIRS mean sonde bias spectrum to the brightness temperature errors that will be induced by a 0.6% change in the frequency calibration, showing a maximum of ±0.2 K. There are regions where the two curves have the same magnitude, including the 700–760 cm−1 temperature sounding region and the 1250–1350 cm−1 water vapor sounding region. The correlation between the bias and frequency calibration error curve in the 700–760 cm−1 region is about 0.25, suggesting that the two are slightly related. These two curves exhibit little correlation in the water vapor sounding region. The transmittance adjustments discussed in section 5.1 show no correlation with the frequency calibration errors.
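Differencing month i against month i + 12 removes any purely annual cycle and leaves the secular drift, which a short synthetic example makes concrete. The series below is made up (an annual sine plus a linear trend) and is not the measured AIRS record. The second helper converts a shift quoted in percent of a FWHM into wave number units under the assumption that the FWHM equals ν divided by the nominal resolving power of 1200; the true channel widths come from the measured SRFs and may differ.

```python
import math


def year_over_year_drift(monthly_values):
    """Difference between month i + 12 and month i for every available i."""
    return [monthly_values[i + 12] - monthly_values[i]
            for i in range(len(monthly_values) - 12)]


def fwhm_percent_to_wavenumber(percent_fwhm, nu, resolving_power=1200.0):
    """Convert a shift in percent of a FWHM to cm^-1, assuming FWHM = nu / R."""
    fwhm = nu / resolving_power
    return percent_fwhm / 100.0 * fwhm


# Hypothetical 29-month record in percent of a FWHM: annual cycle plus linear drift.
series = [0.10 * math.sin(2.0 * math.pi * m / 12.0) - 0.02 * m for m in range(29)]
drift = year_over_year_drift(series)             # annual cycle cancels, drift remains
shift_cm = fwhm_percent_to_wavenumber(0.6, 720.0)  # 0.6% of a FWHM near 720 cm^-1
```

With these made-up inputs every element of `drift` equals the imposed trend of −0.24 per year, and a 0.6% shift near 720 cm−1 corresponds to roughly 0.0036 cm−1 under the stated FWHM assumption.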
 Note that the version 4 AIRS RTA uses fixed frequency channels, so none of these observed variations in the channel frequencies are included in the processing, including the November 2003 shift. However, as discussed above, these shifts are at or below specification and too small to affect weather-related products.
 These results do strongly suggest that the AIRS frequency calibration can be determined to well below 0.1% of a FWHM for more demanding applications such as climate monitoring. It should also be noted that many AIRS channels throughout the spectrum have extremely low sensitivity to frequency shifts and may not need adjustments even for climate applications.
 The AIRS spectrometer entrance filters are not wedged, and consequently introduce a small periodic modulation into the shape of the AIRS SRFs with a free-spectral range of ∼1.2 cm−1. This fringing was fully characterized during prelaunch calibration [Strow et al., 2003b] and has been included in the SRF shapes. If the fringing were ignored in the SRF model, which is not the case, it would only impact a handful of channels by a maximum of 0.3–0.4 K. However, the locations of these fringes relative to the SRF can change if the SRF center frequencies shift or if the fringes shift. The more dominant effects are shifts of the fringes caused by changes in the entrance filter temperature.
 The locations of the fringes postlaunch were determined by examining the radiometric gain as a function of the spectrometer temperature. It is difficult to validate the fringe positions with bias evaluation, since they impact the radiances by 0.3–0.4 K max for a handful of channels, and by much less for the vast majority of channels. However, the November 2003 shutdown of the Aqua platform shifted the fringes because it was impossible to reconfigure the AIRS spectrometer thermal configuration to have unshifted SRF center frequencies as well as unshifted fringes. The AIRS spectrometer temperature was set after the November 2003 event to minimize shifts to the SRF frequencies, and as noted earlier this was largely accomplished, with a resultant shift of only −0.12% of a FWHM relative to their pre-November 2003 values. This adjustment of the spectrometer temperature did cause the entrance filter temperature to change, shifting the fringes slightly.
Figure 6 shows the expected nominal change in brightness temperatures after the November 2003 shutdown due to the shift in the fringe positions. This was computed using the before and after values of the fringe filter temperatures. In the bottom left of Figure 6 is a zoom of the computed signal change in the 2180–2230 cm−1 spectral region where the effect is largest. Also plotted is experimental confirmation that this model is correct. We have formed the difference between the mean monthly observed radiances between month i + 12 and month i where November 2003 is located between these two months. This difference was averaged using eleven values for months i. Using identical months greatly lowers the dependence of these differences on the atmospheric state. The agreement is extremely good, and shows that our prelaunch model of the effects of the fringes on the SRFs works well in orbit.
 Unfortunately, the version 4 RTA does not use the final value for the post-November 2003 entrance filter temperature, and instead uses a temperature about 0.4 K too low. That results in bias errors in the post-November 2003 time frame in the AIRS RTA of about 2/3 the magnitude of the values shown in Figure 6 and 1/3 the magnitude in the pre-November 2003 time frame. However, the vast majority of channels have sensitivities to the fringes well below requirements for weather applications, as is clearly evident in this graph. There are only 156 channels with fringe errors >0.02 K, and only 9 channels with fringe errors >0.10 K.
5. Validation Results
 The rather involved integration of a new RTA into the AIRS retrieval algorithms requires that any changes to the RTA take place in time for extensive testing and modification of the retrieval algorithms. Consequently, the version 4 AIRS RTA, which was delivered in January 2004, could only benefit from the ARM-TWP Phase 1 data set and the ECMWF data set.
 Given uncertainty about the absolute accuracy of the ECMWF model fields, especially for water vapor, we performed initial validation, and adjustment of the RTA, with the ARM-TWP Phase 1 data set. The ARM-TWP Phase 1 biases suggested that there were a number of problems with the AIRS RTA at the 0.5–1 K level. Some of these bias errors, especially for CO2 channels, were also present in the ECMWF validation set biases. Fortunately, the ARM-TWP site has a high water burden, which was helpful in discovering water vapor continuum errors in the shortwave part of the spectrum during this initial validation phase.
5.1. Transmittance Adjustments
 The magnitudes of some of the ARM-TWP Phase 1 bias errors were small enough that they could potentially be due to either spectroscopy or instrument SRF errors, but large enough that they did not appear to be sonde errors, except for high-peaking CO2 and H2O channels.
Figure 7 illustrates the problem with high-peaking CO2 channels, where the ARM-TWP Phase 1 bias error spectrum is grayscale-coded to indicate what percentage of each channel radiance is dependent on emission/absorption above 60 hPa. Above 60 hPa we use the ECMWF temperature profile.
 We also used the ECMWF water vapor profile rather than the RS-90 profile above 200 hPa, since the RS-90 accuracy can drop significantly above 200 hPa, depending on the air temperature.
 We chose to modify a subset of channels based on the ARM-TWP Phase 1 results by applying static scalar multipliers to the appropriate channel absorption coefficients. We will not describe this process in detail, since the purpose of this paper is to validate our delivered RTA. However, some background on how the version 4 RTA was developed will help one understand its characteristics. This tuning was implemented by applying scaling multipliers to the optical depths of the main atmospheric gases that reduced the bias close to zero. See Figure 8 for a plot of these multipliers. The absorption in most channels is dominated by a single gas, and our tuning is generally applied to the dominant gas for each channel.
 The water vapor absorption in the RTA is modeled as the sum of (1) the portion within ±25 cm−1 of the spectral line centers and (2) the far wing continuum due to the wings of lines beyond ±25 cm−1. For window channels we applied the tuning to the continuum optical depth, while in the main water band we applied tuning to the optical depth due to spectral lines. Attempts to tune the water continuum inside the water band did not succeed, because of somewhat complicated implementation issues related to the fact that the water continuum is expected to have a smoother variation with wave number than was needed for the optical depth multipliers. We have not yet tried to differentiate between errors in the self- versus foreign-broadened continuum, and have only made corrections to the continuum in the RTA by a multiplicative constant. For the vast majority of channels, this is equivalent to tuning either the self or foreign terms, since one usually strongly dominates over the other. For example, most window channels are totally dominated by the self-continuum.
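As a minimal sketch of the two tuning modes described above, the following Python applies a scalar multiplier to either the line or the continuum part of the water optical depth. The function and argument names are hypothetical, the numerical values are invented, and a single monochromatic layer stands in for the channel-averaged, multilayer calculation of the actual RTA.

```python
import math

def tuned_water_transmittance(tau_lines, tau_continuum,
                              line_mult=1.0, continuum_mult=1.0):
    """Transmittance of one layer after scaling the water optical depths.

    tau_lines:     optical depth from lines within +/-25 cm^-1 of line centers
    tau_continuum: optical depth from the far-wing continuum
    """
    tau = line_mult * tau_lines + continuum_mult * tau_continuum
    return math.exp(-tau)

# Window channel (hypothetical values): tune the continuum only.
t_window = tuned_water_transmittance(0.01, 0.20, continuum_mult=1.05)
# Channel in the main water band: tune the line optical depth only.
t_band = tuned_water_transmittance(1.50, 0.10, line_mult=0.97)
```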
 The largest adjustment applied to the water continuum was in the shortwave, where the continuum was too small by factors of up to 10×. However, since the water continuum is so small in the shortwave, these adjustments only changed the brightness temperatures by ∼0.5 K, and only in scenes with a high water vapor burden. These numerically large changes to the shortwave water continuum were verified with the analysis of a large set of cloud-free uplooking AERI data from the ARM-SGP site kindly provided by the Univ. of Wisconsin AERI program [Turner et al., 2004].
 High-altitude CO2 and O3 channels that peaked above 60 hPa were not tuned. The scaling multipliers for lower peaking CO2 channels that had weighting function tails extending above 60 hPa were reduced by the fraction of the weighting function that was above 60 hPa.
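The text does not give the exact formula used to reduce the multipliers. One plausible reading, shown here purely as an assumption, is that the departure of the multiplier from unity is scaled by the fraction of the weighting function lying below 60 hPa. The function name and the example values are hypothetical.

```python
def reduced_multiplier(multiplier, frac_above_60hpa):
    """Shrink a tuning multiplier's departure from unity.

    Assumption: "reduced by the fraction above 60 hPa" is read as scaling
    (multiplier - 1) by the fraction of the weighting function BELOW 60 hPa.
    """
    if not 0.0 <= frac_above_60hpa <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return 1.0 + (multiplier - 1.0) * (1.0 - frac_above_60hpa)

m_low = reduced_multiplier(1.04, 0.0)     # entirely below 60 hPa: unchanged
m_mixed = reduced_multiplier(1.04, 0.25)  # 25% of the weighting function above
m_high = reduced_multiplier(1.04, 1.0)    # entirely above 60 hPa: not tuned
```

Under this reading a channel peaking wholly above 60 hPa keeps a multiplier of exactly 1, consistent with the statement that such channels were not tuned.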
 Note that the untuned RTA already had several enhancements to account for CO2 line-mixing and duration-of-collision effects that are not normally included in line-by-line algorithms [see Strow et al., 2003b, section III]. In particular, the inclusion of P/R-branch line mixing in CO2 in the 710–750 cm−1 region was shown to improve bias calculations by ∼1 K and in the 2388–2392 cm−1 region by 1–2 K. The scaling multipliers arrived at using the ARM-TWP Phase 1 data in the 700–760 cm−1 range are quite small with a mean of 0.995 and a 1-sigma standard deviation of ±0.02. The maximum brightness temperature corrections produced by these multipliers are on the order of 0.5 K. Since these adjustments vary somewhat randomly around unity we suspect a cause other than CO2 spectroscopy. In the important 2380–2390 cm−1 sounding region the absorption coefficients were adjusted in the range of 0.98 to 1.05, large enough to modify brightness temperatures by close to 1 K.
Figure 9 shows the effects of tuning on the sonde biases. The curves in Figure 9 include all RS-90 observations over the ocean, namely all three ARM-TWP phases, Minnett, and ABOVE. The high biases in the 650–700 and 2300–2380 cm−1 range involve channels peaking very high in the atmosphere that were not tuned significantly since there was no RS-90 temperature data at those high altitudes. The computed spectra used for these bias graphs used fitted values for the effective SST, as discussed earlier.
 The large biases in the 1100 cm−1 region are due to O3, whose profile comes from the ECMWF analysis/forecast fields. The 0.3 K offset in Figure 9b in the 2410–2590 cm−1 range is due to the water vapor continuum being too small, which was verified with the ARM AERI uplooking data.
Figure 10 illustrates the effect of these scaling multipliers on the strong water band biases. In Figure 10 the bias is plotted versus the channel observed brightness temperature, which is a good proxy for altitude. Since the tuning was determined using ARM-TWP Phase 1, we show here biases for ARM-SGP Phase 1, a totally different set of sondes and geographic locale. The circles are biases using the version 4 RTA, while the diamonds are for the untuned RTA. A grayscale codes the wave number of the channel. Clearly tuning drastically lowers the spread of bias values for this independent data set. Note, however, that instead of the near-zero bias that one would get for ARM-TWP Phase 1 (by definition), the ARM-SGP Phase 1 biases are larger by up to 0.6 K, although they are close to zero in the lower atmosphere (higher observed brightness temperatures).
5.2. Validation of Version 4 AIRS-RTA: Bias Results
 The main results of this validation study are shown in Figure 11, where the mean RTA biases and standard deviations relative to all RS-90 ocean scene profiles are plotted in Figure 11b, and the RTA biases and standard deviations relative to all ocean scene ECMWF profiles are plotted in Figure 11c. The ECMWF bias mean is taken over 24 months of data between ±45° latitude. By definition, the upper altitude channels in the 650–700 cm−1 and the 2300–2480 cm−1 range have almost the same biases since the ECMWF model fields are used to supplement the RS-90 profiles above 60 hPa for temperature. Again, these biases are computed using effective SSTs derived from the observed window channel radiances.
Figure 12 is a zoom of Figure 11 that more clearly shows how the biases in the 650–700 cm−1 region oscillate from close to zero (in-between lines, meaning lower in the atmosphere) to more than 1 K on top of lines peaking high in the atmosphere. Since the lower pressure CO2 transmittance in the high-peaking channels should be more accurate than those peaking lower in the atmosphere, we suspect that these bias errors are not RTA errors but rather ECMWF temperature errors. A comparison between high-altitude limb-sounding temperature retrievals made with the MIPAS spectrometer on ENVISAT and ECMWF model temperatures supports this conjecture [see Dethof et al., 2004, Figure 6 (bottom)]. They found that global mean ECMWF temperatures in the lower stratosphere were up to 3 K too cold, a result that is compatible with our ∼1 K biases for strong lines with a significant stratospheric component. Additional work is needed to more carefully intercompare the MIPAS results with those presented here. In addition, the AIRS radiometric calibration has not been definitively validated at low brightness temperatures [Aumann et al., 2006], so some uncertainties remain with the accuracy of the AIRS cold radiances.
 ECMWF model temperature data are expected to be quite accurate in the troposphere, and indeed we find that the RTA biases for our RS-90 sonde data set are almost identical to those for the ECMWF fields between 700 and 780 cm−1, and between 2386 and 2400 cm−1, with an RMS agreement of 0.05 K. The agreement between observed and computed radiances for both data sets in this range is 0.1–0.2 K. Moreover, the ECMWF standard deviations are very low, from a max of 0.24 K down to 0.14 K in the 2386–2400 cm−1 tropospheric temperature sounding region, indicating that AIRS and ECMWF agree very well on an observation-by-observation basis. This suggests that ECMWF model fields are extremely useful for validation of the radiometric and spectral performance of future sounders, such as the IASI sounder on METOP and the CrIS sounder on NPP/NPOESS. Of course, ECMWF fields are not sufficiently accurate for validation of the RTA in regions sensitive to water vapor, as evidenced by the much higher standard deviations, and biases, when compared to the RS-90 data set results.
Figure 13 zooms in on the 800–1150 cm−1 window region. Excluding the ozone region, the biases are −0.1 to −0.2 K. This is very good agreement with the shortwave region, which was used to determine the effective SST for each scene. Simulation has shown that the small negative slope in the bias from 950 to 800 cm−1 could be due to small amounts of residual clouds in these scenes. Note that for the tropical ARM-TWP scenes the brightness temperature depression in the 800 cm−1 region is as high as 7 K, so these low biases indicate that the water continuum is quite accurate.
 Biases in the strong water vapor band are shown in Figure 14. Here we see that the ECMWF bias is larger, with more wavelength variations than the sonde bias. The standard deviation of the ECMWF bias is also about twice as large as for the sonde data. This result is not surprising, and is essentially one reason for the existence of AIRS: to improve numerical weather forecasting of atmospheric water vapor. Still, the overall shapes of the sonde and ECMWF bias curves are similar, with negative biases below 1350 cm−1 and positive biases at higher wave numbers. The biases generally increase for higher-peaking channels where both the sondes and ECMWF are expected to have higher water vapor errors.
 The shortwave biases shown in Figure 15 are dominated by high positive biases due to suspected errors in the ECMWF upper atmospheric temperature fields. The 0.5 K biases seen below 2200 cm−1 are due to incorrect carbon monoxide (CO) mixing ratios in the profile, and do not reflect RTA errors. McMillan et al. discuss the retrieval of CO with AIRS and show initial results that validate the quality of the RTA via validation of the accuracy of the retrieved CO mixing ratios.
 Note that the biases in the shortwave window beyond 2400 cm−1 are very flat in Figures 15b and 15c for both the RS-90 sonde data and for ECMWF. This strongly suggests that our values for the shortwave water continuum and water lines are quite accurate. The 2616 cm−1 channel is of particular interest, since it has the least amount of water vapor absorption of any AIRS channel, and is the basis for validation of the AIRS radiometric calibration [Aumann et al., 2006]. The accuracy of the 2616 cm−1 water vapor continuum absorption was tested by fitting the variation in the 2616 cm−1 channel biases versus the ECMWF total column water to a linear equation. Separate fits were performed for each of 28 months of observations in order to estimate errors in the offset and slope. The mean offset was −0.51 K ± 0.11 K, where the standard deviation is taken over the 28 monthly averages. The slope in the bias versus total column water, if multiplied by the mean total column water amount of ∼30 mm, is −0.042 K ± 0.077 K. The error bound for the water continuum absorption in this channel is quite low compared to an estimated accuracy of 0.2 K for the AIRS radiometric calibration (see discussion by Aumann et al. ). The total gas absorption for the 2616 cm−1 channel for 30 mm of water is 0.16 K due to water and 0.13 K due to the N2 continuum, CO2, and CH4. Errors in the absorption due to these three gases should be well below the error in the water continuum component for this channel.
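The monthly fitting procedure just described can be illustrated with a short Python sketch. The function names and the synthetic data are assumptions made for illustration (two invented months rather than the 28 used here), and an ordinary least-squares line is assumed for the "linear equation" in the text.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Ordinary least-squares fit y = offset + slope * x."""
    xm, ym = mean(x), mean(y)
    sxx = sum((xi - xm) ** 2 for xi in x)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope

def monthly_fit_summary(months, mean_tcw_mm=30.0):
    """months: list of (tcw_mm, bias_K) sequences, one pair per month."""
    fits = [fit_line(tcw, bias) for tcw, bias in months]
    offsets = [o for o, _ in fits]
    effects = [s * mean_tcw_mm for _, s in fits]  # slope scaled to mean TCW (K)
    return (mean(offsets), stdev(offsets)), (mean(effects), stdev(effects))

# Synthetic data for two hypothetical months.
tcw = [10.0, 20.0, 30.0, 40.0]
months = [
    (tcw, [-0.5 - 0.001 * w for w in tcw]),
    (tcw, [-0.6 - 0.002 * w for w in tcw]),
]
(offset_mean, offset_sd), (effect_mean, effect_sd) = monthly_fit_summary(months)
```

The spread of the per-month offsets and scaled slopes plays the role of the ± values quoted above.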
 RTA errors for the 2616 cm−1 continuum absorption also depend on the statistical accuracy of the ECMWF total column water, which is the independent variable in this least squares fit. We examine the bias error in the ECMWF total column water using the brightness temperature differences between the 2616 cm−1 channel and the 2607.8 cm−1 channel which contains several unresolved weak water lines. Following Aumann et al. , we denote bt2616 as the 2616 cm−1 brightness temperature and bt2607 as the 2607.8 brightness temperature, and find that TCW ≈ (bt2616 − bt2607) × 8, where TCW is the total column water in mm. For the RS-90 sonde data presented here, where the TCW is known, the mean observed error in bt2616-bt2607 is 0.04 K ± 0.22 K. This indicates that the AIRS RTA will return a value for bt2616-bt2607 of close to 0 K if the TCW in the profile is accurate. For the ECMWF clear observations, we find that the mean bias in bt2616-bt2607 is −0.05 K ± 0.11 K, or equivalently an error in TCW of about 1.3%. An error of 1.3% in TCW only introduces an error in bt2616 of 0.004 K, leading to confidence in our assessment of the error in the computation of the water vapor absorption in the 2616 cm−1 channel of −0.042 K ± 0.077 K given above. This error bound is used by Aumann et al.  to derive an estimate for the absolute radiometric accuracy of AIRS.
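The conversion from the bt2616 − bt2607 bias to a fractional TCW error can be checked with a few lines of Python. The function name is hypothetical; the factor of 8 mm per K, the −0.05 K mean bias, and the 30 mm mean TCW are taken from the text.

```python
def tcw_from_bt_difference(delta_bt_k, scale_mm_per_k=8.0):
    """Approximate total column water (mm): TCW ~ (bt2616 - bt2607) * 8."""
    return delta_bt_k * scale_mm_per_k

# Mean ECMWF bias in bt2616 - bt2607, expressed as a TCW error.
tcw_error_mm = tcw_from_bt_difference(-0.05)
mean_tcw_mm = 30.0  # mean total column water quoted in the text
tcw_error_percent = 100.0 * abs(tcw_error_mm) / mean_tcw_mm
```

This gives a TCW error of 0.4 mm in magnitude, or about 1.3% of the 30 mm mean, in line with the value quoted above.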
Figure 16 is a further zoom of the important 2400 cm−1 temperature sounding region where AIRS noise levels are in the 0.1 K range. The ECMWF and sonde biases are remarkably similar throughout this whole spectral range, giving us confidence that our tuning of the CO2 absorption coefficients is valid, at least statistically. From 2385 to 2392 cm−1 the standard deviations of the sonde biases and the ECMWF biases are very low, around 0.2 K.
 A different view of the biases in the water band is given in Figure 17, similar to Figure 10, in order to visualize how the biases vary with altitude. Nighttime biases in the 1350–1615 cm−1 spectral range, which have weighting functions peaking in the 250–550 hPa range, are plotted. Figures 17a–17c show the mean bias for all of the RS-90 sondes, Vömel's frost point sondes, and Whiteman's LIDAR measurements, respectively. Overall these curves are fairly similar. A 0.5 K bias is roughly equal to a 10% change in water for these channels, although this value can vary by more than 50% depending on the channel. So, for the RS-90 sondes the bias in percent water vapor is roughly 3%, rising to 8% at the higher altitudes. This good agreement is not surprising, since the RTA has been tuned to the ARM-TWP Phase 1 measurements. The larger biases at higher altitudes may more reflect variability in the ECMWF fields above our 200 hPa cutoff, above which we do not use the sonde data in the bias calculations. A 10% error in the ECMWF fields above 200 hPa corresponds roughly to a 0.2 K error for the higher-peaking channels, so care must be taken in determining the source of the bias error in these channels.
 However, it is interesting that frost point biases shown in Figure 17b are very similar to the high-altitude RS-90 biases. We used the frost point sonde data up to 60 hPa, which drastically reduces the sensitivity of the bias calculations to errors in the ECMWF water fields. The frost point biases agree very well with the RS-90 biases until reaching observed temperatures of 265 K, where the frost point biases drop another 0.3 K. Figure 17c shows Whiteman's Raman LIDAR biases, which lie somewhere in between the RS-90 and Vömel's frost point results, so the overall agreement is excellent.
 The AIRS Water Vapor Experiment-Ground (AWEX-G) experiment [Whiteman et al., 2006; Miloshevich et al., 2006] performed in November 2003 attempted to intercompare various atmospheric water vapor sensors. Their work deemed the frost point hygrometer as the reference-quality standard of known absolute accuracy, and by comparing RS-90 measurements taken simultaneously with the frost point they developed empirical corrections to the RS-90 series of sondes. These corrections have so far been applied to the ARM-TWP Phase 1 RS-90 data set. In Figure 17d we show the effect of the AWEX-G derived RS-90 corrections, by plotting the difference between radiances computed using the AWEX-G derived corrections minus those using a standard RS-90 sonde. These corrections, if applied to the RS-90 biases in Figure 17a, would be added to the RS-90 biases, and would thus flatten out this curve, close to zero bias, at all observed brightness temperatures. Since the AWEX-G corrections depend on the sensor temperature, these corrections may vary between data sets, but should not change drastically. However, that would introduce a more significant difference between our frost point biases (Figure 17b) and the supposedly corrected RS-90 results. The same would hold true for Whiteman's LIDAR results. Since the AWEX-G corrections are just being introduced, we will not discuss them further here. As the AWEX-G corrections are applied to all the RS-90 sonde data these results will be reevaluated.
 We observe significant differences in the RS-90 biases between day and night. Figure 18a shows day/night bias differences that vary from ∼0.2 K to ∼0.5 K with increasing altitude. This implies that the RS-90 water profile has a dry bias during the day of roughly 4–10%. This agrees with Miloshevich et al. , who observed a 6–8% RS-90 daytime dry bias by comparing microwave to RS-90 derived total column water vapor at the ARM-SGP site. They attribute these differences to solar heating of the sensor during the day. Figure 18b also shows day/night differences in the frost point hygrometer biases, of roughly 5–15% in the water vapor mixing ratio, but with a wet bias during the day. Note that both sensors have smaller day/night differences at lower altitudes. The frost point data shown here only involved 14 sonde launches, 7 day and 7 night, so this result is based on limited statistics.
 Spectra of the water vapor biases for our largest statistical set of data, the RS-90s, are shown in Figure 19, where both the mean and standard deviations for the biases are plotted for night, day, and the average of night plus day. Figure 19 shows the higher standard deviation for the day biases. We also include in Figure 19 biases below 1350 cm−1, which are systematically low for nighttime compared to high wave numbers. During the day the bias curve is flat at all wave numbers. The average of the day and night biases is close to zero except at the lower wave numbers where the nighttime droop in the bias starts to dominate. Although it is interesting that this average bias is so small, the preceding discussion suggests that the error bars on these bias curves are probably on the order of ±0.5 K (or about 10% in water mixing ratio) given the differences between the RS-90, frost point, and LIDAR biases.
 The ARM program has recently determined that the microwave calibration of their RS-90 sensors was in error for the ARM best estimate data utilized here [Liljegren et al., 2005; Tobin et al., 2006a]. To correct this error, all ARM RS-90 water vapor amounts should be lowered by 3%. This would shift the observed-minus-computed biases presented here by ∼−0.2 K inside the main water band and by ∼−0.05 K to 0.1 K in the 12 micron window channels. These 3% corrections were not performed for any of the results presented here.
 The AIRS RTA does not include nonlocal thermodynamic equilibrium (non-LTE) emission that occurs in the upper atmosphere during solar illumination. This decision assumed that non-LTE effects would not be significant in important temperature sounding channels. Figure 20 shows the only region of the AIRS spectrum affected by non-LTE. This plot is the mean difference between day and night radiances in our uniform_clear data set, averaged over 24 months and over a ±45° latitude range. We have done some limited theoretical modeling of non-LTE emission that agrees with these observations to ∼1 K, which is not yet sufficiently accurate for retrievals, but does prove that we are indeed seeing non-LTE emission.
 As you move from the CO2 R-branch band head near 2400 cm−1 into the band, non-LTE emission reaches the AIRS noise level (about 0.15 K at a 250 K observation temperature) around 2387 cm−1, more quickly than anticipated. This has rendered several good high-altitude temperature sounding channels unusable for daytime observations. We present these observations because they represent limitations to the version 4 AIRS RTA that should be recognized by users. However, the non-LTE emission is not difficult to model, and will likely be included in future versions of the AIRS RTA.
7. Variable Gases
 A fundamental limitation in computing AIRS radiances from the RTA is the specification of minor gas mixing ratios that are not part of the standard retrieval. In the present AIRS retrieval CO2, CH4 and N2O are fixed at a single climatological profile. The version 4 AIRS RTA is capable of varying the column CO2 by multiplying the mixing ratio profile by a single scalar multiplier. CH4 profiles can be varied on all of the 100 AIRS RTA layers. However, at present N2O cannot be varied in the AIRS-RTA and more work is needed to determine if this limitation needs to be removed.
 The major purpose of this section is to estimate RTA bias errors introduced by our use of fixed abundances for various minor gases in the RTA validation profiles. One could view this as a limitation of the present RTA, which may be fixed in the future by modifications of the RTA and retrievals of some of the minor gas abundances from the AIRS radiances.
 It is well known that atmospheric CO2 has spatial and temporal variability at the few percent level, which led us to include the capability to scale the CO2 profile in the AIRS RTA. However, the version 4 AIRS temperature retrievals keep the CO2 mixing ratio fixed at 370 ppmv, a reasonable value for 2002. Therefore we also kept the CO2 mixing ratio fixed at 370 ppmv for the RTA validation comparisons. This seemed to be a reasonable first approach, especially for weather applications.
 The absolute accuracy of the CO2 spectroscopy in the AIRS RTA is difficult to estimate, given the importance of the far-wing line shape on the AIRS CO2 radiances. We have previously shown [Strow et al., 2003a] that P/R-branch line-mixing, a ∼1 K effect in midtropospheric sounding channels at 15 microns, must be included in the CO2 line shape to achieve good agreement with high-spectral resolution aircraft observations of upwelling radiances. Those results used the HITRAN values for CO2 line strengths, which gave us excellent agreement with observations (0.2 K or better in the 700–760 cm−1 sounding region), indicating that the HITRAN CO2 line strength error estimate of 4% (equivalent to ∼0.5 K maximum in brightness temperature) may be overestimated. This highlights the fact that atmospheric radiance observations by AIRS are pushing the state-of-the-art in laboratory spectroscopy for even simple molecules like CO2. Note that the fixed gas tuning multipliers in the 700–760 cm−1 region vary about unity quite randomly, which would not be the case if the cause was an incorrect CO2 mixing ratio or incorrect spectral line shape.
 In the 2390–2400 cm−1 sounding region we had to modify the fixed gas absorption coefficients (which include CO2) by 3–5%. The need for these corrections could arise from sources other than spectroscopy, such as slightly inaccurate SRF wings, and initial postlaunch frequency calibration errors. However, this region is very difficult to quantify with laboratory spectroscopy, and to theoretically model [see Strow et al., 2003a], given the extreme sub-Lorentzian behavior of the absorption. Note that a 5% adjustment to this spectral region is quite small, amounting to an error of less than 0.02 in transmittance in a laboratory spectrum taken under ideal conditions. Our use of a constant 370 ppm of CO2 for the ARM-TWP data used to modify the RTA in this spectral region may also introduce some bias error into the RTA, but we will show that this is probably less than 0.2 K. Consequently, we suspect that a significant fraction of our modification of the transmittances in this region is of spectroscopic origin.
 CO2 is increasing continuously, and varies seasonally by up to ∼3%, so the use of a constant CO2 amount in the retrievals and in the validation studies presented here could introduce bias errors that vary over time and space, which would be especially problematic for climate studies. For example, simulations reported by Engelen et al.  show that errors on the order of 0.5 K in the temperature profile can arise from the use of a constant CO2 mixing ratio over the whole globe.
 Consequently, it is possible that some of our adjustments to the CO2 channel absorption coefficients were due to an incorrect CO2 mixing ratio that is changing with time. We have tested this by recomputing all the RS-90 bias calculations, using the NOAA-CMDL climatology [GLOBALVIEW-CO2, 2004; Tans and Conway, 2005] of mean monthly global estimates of CO2 mixing ratios. Figure 21b, solid curve, shows that the CMDL climatology has a maximum effect of ∼0.2 K, which is on the order of magnitude of the biases in many CO2 channels. The true variation in CO2 mixing ratios observed by AIRS may be slightly lower since most CO2 channel weighting functions peak in the 600 hPa range, and have little sensitivity in the boundary layer where the CMDL climatology is valid.
 Also shown in Figure 21 is the estimated change in the brightness temperatures due to variable CO2 over the expected 7-year lifetime of the AIRS instrument. (This calculation assumes a 30 ppmv change in CO2 mixing ratio, based on a max seasonal cycle of 16 ppmv at high northern latitudes, and a 14 ppmv overall increase in CO2 over 7 years.) These changes approach 1 K, and, if ignored, will make AIRS temperature retrievals useless for climate studies. Clearly, future improvements to the AIRS retrieval algorithm must take variable CO2 into account.
 Several groups [Chédin et al., 2003; Crevoisier et al., 2004; Engelen et al., 2004; Aumann et al., 2005] have reported on preliminary studies to detect variable CO2 from the AIRS radiances. These authors concentrated on tropical regions and, except for Aumann et al. , looked at limited time spans of data. The study by Aumann et al.  is quite compelling since it is very straightforward, using differences between two AIRS channel radiances to deduce seasonal and secular trends in CO2. Their technique does require some knowledge of global N2O variability in order to achieve quantitative results.
 Our approach here is to study CO2 variability within the context of bias evaluation relative to ECMWF temperature fields over a long enough time period to detect seasonal trends. In addition, our uniform_clear data set has a sufficient number of observations that are free of significant cloud contamination to enable detection of variable CO2 at higher latitudes. This, of course, assumes that the monthly ECMWF temperature fields are unbiased over the latitudes studied here.
 We present here a detailed examination of the nighttime biases for the AIRS channel centered at 791.75 cm−1. This channel was chosen because it has a lower sensitivity to temperature profile errors than other CO2 channels, and has been well characterized by laboratory measurements [Strow, 1993]. In addition, this channel has only minor interferences from other gases, although a reasonably accurate surface temperature is needed. The AIRS radiances for this channel were taken from our uniform_clear data set, night only, that was used to generate the biases shown in Figures 11–16. The ECMWF computed radiances for each AIRS observation used an effective SST derived from several channels near 2616 cm−1, as discussed earlier. We also modified the ECMWF total column water with a single scalar multiplier to agree with the AIRS observed total column water. Radiances for both 370 and 385 ppmv of CO2 were computed, giving us a CO2 sensitivity for every AIRS observation, and allowing us to calibrate the biases relative to ECMWF in units of ppmv of CO2.
 Monthly averages of the bias relative to ECMWF were computed for a 29-month period. Figure 22 shows the time dependence of zonal averages of the 791.75 cm−1 channel biases, separated into 10° latitude bins. The right-hand scale is calibrated in brightness temperature bias, while the left-hand scale is calibrated in CO2 ppmv offset by 370 ppmv. Note that the individual curves in this plot have been offset from each other (in ppmv units) for clarity. In addition, an overall mean bias has been removed from the 29-month data set, forcing a brightness temperature bias of 0 K to equal 370 ppmv of CO2. (The absolute bias error for this channel is estimated to be 5 ppm, or 0.17 K, on the basis of NOAA/CMDL global mean CO2 for the month of March 2003.) The latitude range is limited by decreasing numbers of clear ocean FOVs at the higher latitudes.
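The calibration of the bias in units of CO2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the scene values are hypothetical, the bias is taken as observed minus computed, and the response between 370 and 385 ppmv is assumed linear. The scene values were chosen so that the sensitivity matches the equivalence of 5 ppm and 0.17 K quoted above.

```python
def co2_from_bias(bias_k, bt_370, bt_385, reference_ppmv=370.0):
    """Express an observed-minus-computed bias (K) as a CO2 amount (ppmv).

    bt_370, bt_385: computed brightness temperatures (K) for the same scene
    with 370 and 385 ppmv of CO2; their difference gives a finite-difference
    sensitivity in K per ppmv.
    """
    sensitivity = (bt_385 - bt_370) / (385.0 - 370.0)
    if sensitivity == 0.0:
        raise ValueError("channel has no CO2 sensitivity for this scene")
    return reference_ppmv + bias_k / sensitivity

# Hypothetical scene: adding 15 ppmv of CO2 cools the channel by 0.51 K.
co2 = co2_from_bias(bias_k=-0.17, bt_370=280.00, bt_385=279.49)
```

With these values an observation 0.17 K colder than the 370 ppmv calculation corresponds to roughly 375 ppmv.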
 These results show clear evidence of the CO2 seasonal cycle, larger in the Northern Hemisphere than in the Southern Hemisphere, where the seasonal cycle reverses phase. The 30–50° latitude seasonal cycle has a peak-to-peak variation of about 8 ppmv, which is very close to the NOAA CMDL values for this latitude. A secular increase of ∼2 ppmv per year is evident in the Northern Hemisphere. This result means that AIRS has the stability, and sensitivity (at least for averaged data), to easily detect global variation of CO2, which has also been pointed out by Aumann et al. .
 Another view of these data is given in Figure 23, where the data for 1 year are averaged over 3-month periods, and then plotted versus latitude. Figure 23 more clearly shows the phase reversal of the CO2 cycle between the Northern and Southern hemispheres. A more detailed analysis of these results is forthcoming. As mentioned earlier, these results show that a more refined validation of the AIRS RTA will require taking CO2 variability into account and users should keep this in mind when using the version 4 RTA.
 Atmospheric CH4 variability may also impact validation of the AIRS RTA and retrieval products. There are numerous CH4 features in the 1230–1370 cm−1 region that overlap with the water vapor lines. The version 4 AIRS RTA allows for variable CH4 profiles, but the lack of in situ data did not allow us to vary the CH4 profile in the RTA validation data sets. Instead we used the AIRS RTA reference CH4 profile which has a midtropospheric mixing ratio of ∼1.8 ppmv. It is possible that some of the RTA validation, and subsequent tuning, has been impacted by variable CH4. A preliminary examination of the variable CH4 in the AIRS radiances is summarized in Figure 24, which is identical to Figure 23 except we are now plotting the biases in a strong CH4 channel centered at 1304.35 cm−1.
Figure 24 shows reductions in CH4 of about 12% from the Northern to Southern hemispheres. The NOAA/CMDL CH4 climatology [GLOBALVIEW-CH4, 2005] for April 2002, for example, has a 7% drop from 50°N to 50°S. Our MAM (March April May) curve shows a drop of about 7% as well, although it also shows an increase in CH4 in the midlatitudes that is not reflected in the CMDL climatology. The 1304.35 cm−1 channel is sensitive to CH4 in the lower stratosphere as well as the midtroposphere, which could make comparisons with the lower-tropospheric CMDL climatology problematic. Note that the variations in this channel's brightness temperature biases are rather large, since a 10% change in column CH4 is roughly equal to a brightness temperature change of 1 K. This result shows that just as in the case of CO2, reasonable CH4 variability can be extracted from AIRS radiances, and that a more refined validation of the AIRS RTA will need to account for the true CH4 profile. Since CH4 features appear throughout much of the water sounding band, further improvements in the water vapor channels may also depend on properly accounting for CH4.
 Detailed validation of the RTA channels sensitive to O3 has not yet been performed. We have, by default, computed biases relative to the ECMWF model O3 fields, which generally exhibit some of the largest biases in the AIRS radiance spectra. The line structure of O3 is very dense and complicated, and absolute O3 line intensities are difficult to measure in the laboratory. The AIRS RTA is based on the HITRAN 2000 database, which was updated recently to HITRAN 2004 [Rothman et al., 2005]. This update had some relatively significant changes in O3 line parameters, which we used to update the AIRS RTA (these updates are not in the version 4 RTA). The changes to the O3 lines in HITRAN 2004 translate into modifications of the RTA channel-averaged absorption coefficients of ∼3%.
Figure 25 shows biases for October 2002 relative to ECMWF between ±45° latitude, for the version 4 RTA ozone (solid curve) and for a modified version of the RTA using the HITRAN 2004 O3 line parameters (shaded curve). The new line parameters give significantly lower biases in the strong portion of the O3 band between 1040 and 1060 cm−1. In the weaker portions of the band the new biases are just slightly worse. This result suggests that the AIRS O3 retrievals may be improved with the addition of the HITRAN 2004 O3 line parameters.
7.4. SO2 and HNO3
 Neither SO2 nor HNO3 profiles can be varied in the version 4 AIRS RTA; their values are set at climatological estimates. Experience with the AIRS observations has shown that AIRS can easily detect SO2 from a number of volcanic eruptions each year. We have developed a prototype RTA with variable SO2 [Carn et al., 2005] that was used to measure SO2 output by the October 2002 Mt. Etna eruption. At this time, we do not believe that the version 4 RTA is generally compromised by uncertainties in background SO2 amounts, since they are very small and volcanic eruptions are isolated. However, care must be taken in using the RTA, or doing retrievals, in regions where volcanic gases are present. The range of channels sensitive to SO2 can be clearly seen in Figure 26, which shows the AIRS RTA biases relative to ECMWF for about 70 scenes over ocean near the Mt. Etna eruption. The spectral region from 1320 to 1380 cm−1 clearly shows depressions of up to 10 K in the bias due to absorption by SO2. Comparisons between SO2 amounts retrieved using our prototype RTA and those from other techniques [Carn et al., 2005] show rough agreement (generally within 50%), the best that can be expected considering the difficulty of comparing measurements of these types.
 We have also observed variable HNO3 in the AIRS spectra (see Figure 27). Figure 27 shows the change in the bias, relative to ECMWF, between a high-latitude (∼40–50° latitude) zonally averaged uniform_clear bias spectrum and one from the midlatitudes (∼10–30°) for the 879 and 896 cm−1 HNO3 bands. The bias differences that result from the use of a constant HNO3 amount are quite large, over 1 K for the 879 cm−1 band. We have also detected changes in HNO3 with latitude in the 763 and 1325 cm−1 bands. The measurement of daily HNO3 with AIRS may prove interesting, so the inclusion of variable HNO3 in the AIRS RTA should prove worthwhile.
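The comparison described above, differencing two zonally averaged bias spectra and reading off the largest change near a band center, can be sketched as follows. The function name `peak_bias_difference`, the list-based spectra, and the window half-width are assumptions for illustration, not the procedure used to produce Figure 27.

```python
def peak_bias_difference(wavenumbers, bias_a, bias_b, center, half_width):
    """Find the largest |bias_a - bias_b| within a spectral window.

    wavenumbers    : channel centers in cm^-1
    bias_a, bias_b : bias spectra (K) on the same channels, e.g. from two
                     latitude bands (hypothetical inputs)
    center, half_width : window definition in cm^-1
    Returns (wavenumber, difference_K), or None if no channel is in the window.
    """
    best = None
    for nu, a, b in zip(wavenumbers, bias_a, bias_b):
        if abs(nu - center) <= half_width:
            diff = a - b
            if best is None or abs(diff) > abs(best[1]):
                best = (nu, diff)
    return best
```

Calling the function once per band center (for example 879 and 896 cm−1) returns the channel with the largest latitude-dependent bias change in each band.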
 The ability of the AIRS RTA to accurately compute the observed AIRS radiances has been tested with a variety of data sources. The combination of several relatively independent data sources (RS-90 sondes, the NOAA/CMDL frost point hygrometer, the GSFC Scanning Raman LIDAR, and the ECMWF analysis/forecast fields) indicates that the AIRS RTA accuracy is ∼0.2 K for mid- to lower-tropospheric CO2 channels and ∼0.2–0.5 K for mid- to lower-tropospheric water channels. These error bars are only slightly higher than the proven radiometric accuracy of AIRS [Tobin et al., 2006b; Aumann et al., 2006]. RTA errors for upper-tropospheric water vapor channels are harder to characterize because in situ measurements are more difficult, with day/night variability of 0.6 K or more. The water vapor profiles from the ARM sites have not been reduced by the 3% figure recommended by Liljegren et al.; such a reduction would change the RTA biases by at most approximately −0.2 K in regions of strong water emission. We also observe the dry bias in the RS-90s during daytime operation reported by Miloshevich et al.
 The RTA performance for stratospheric CO2 channels is difficult to quantify to better than ∼1 K because of the lack of in situ data and because of indications from another data source (MIPAS on ENVISAT) that the ECMWF stratospheric model data are biased by several K.
 Empirical adjustments were made to the RTA absorption coefficients for some channels on the basis of the ARM-TWP Phase 1 RS-90 validation data set, which comprised about 10% of all AIRS in situ validation data studied here. These adjustments were shown to be valid by analysis of the remaining 90% of the validation data. The ECMWF model data biases for CO2 channels are almost identical to the sonde biases, giving us additional confirmation of the accuracy of the RTA. However, we have shown that these empirical adjustments may be compensating for a variety of effects that are not due to spectroscopy errors but to some combination of frequency calibration errors and/or incorrect mixing ratios for CO2, CH4, or N2O in the RTA validation profiles. More importantly, the sources of these nonspectroscopic adjustments are all changing with time and must be addressed for climate studies.
 The use of fixed mixing ratios for a number of gases limits the accuracy of the RTA. We are currently testing the capability of the RTA to vary the vertical CO2 profile. We have shown that AIRS radiances contain very good climatological information on the global variability of CO2, and that this information must be supplied to RTA calculations if temperature retrieval accuracy is to be maintained over the life of the mission. In addition, further improvements to the absolute accuracy of the CO2 channels in the RTA must somehow take into account the CO2 variability in the validation data sets. These comments also apply to CH4 channels, although the RTA already has the capability to vary the complete CH4 profile. We have also shown that improvements to the RTA O3 channels may be possible with updated O3 line parameters from the HITRAN 2004 database.
 Simple bias calculations using ECMWF model fields have shown that SO2 and HNO3 variability is easily seen by AIRS, and consequently the capability to vary these gases needs to be added to the RTA.
 The reflected thermal component of the AIRS RTA has not been validated to date. The high ocean emissivities make this term rather small and hard to evaluate, but it will also be difficult to evaluate this term over land given uncertainties in emissivity and surface temperature from ground truth.
 We have not discussed here the effects of atmospheric scattering on AIRS radiances. Mineral dust clouds occur quite often, and can be spatially uniform, allowing them to pass through the cloud-clearing process. To mitigate this problem, mineral dust scattering may have to be added to the RTA in the future.
 In summary, the RTA is performing very well relative to our knowledge of the underlying spectroscopy. For mid- to lower-tropospheric channels the RTA is approaching 0.2 K in accuracy. Uncertainties in the validation data at higher altitudes make it more difficult to prove this level of accuracy for high-altitude channels. However, for weather applications the RTA appears sufficiently accurate. The RTA is not accurate enough in an absolute sense for climate applications, but is accurate enough to reliably compute changes in radiances at the climate level. We expect that continued examination of AIRS validation data will lead to improvements in basic atmospheric spectroscopy that can be applied to other instruments, especially the follow-on instruments to AIRS: CrIS on NPP/NPOESS and IASI on METOP.
 This work was funded by NASA HQ under grant NNG04GG03G. ECMWF has kindly provided their operational forecast and analysis data, via NOAA/NESDIS, to the AIRS project for validation. The outstanding performance of the AIRS instrument is the result of the JPL and BAE groups that developed and calibrated AIRS, and we thank them for their efforts.