The validation of version 2.2 (v2.2) H2O measurements from the Earth Observing System (EOS) Microwave Limb Sounder (MLS) on the Aura satellite is presented. Results from comparisons with the Aqua Atmospheric Infrared Sounder (AIRS), Vaisala radiosondes, a frost point hygrometer, and WB57 aircraft hygrometers are presented. Comparisons with Aura MLS v1.5 H2O and with the Goddard Global Modeling and Assimilation Office Earth Observing System (GEOS-5) analyses are also discussed. For H2O mixing ratios less than 500 ppmv, MLS v2.2 has an accuracy better than 25% between 316 and 147 hPa. The precision is 65% at 316 hPa and improves to 25% at 147 hPa. This performance is better than expected from MLS measurement systematic error analyses. MLS overestimates H2O for mixing ratios greater than 500 ppmv, which is consistent with a scaling error in either the calibrated or calculated MLS radiances. The accuracy of MLS v2.2 H2O from 121 to 83 hPa, which is expected to be better than 15%, cannot be confirmed at this time because of large disagreements among the hygrometers used in the AVE campaigns. The precision of the v2.2 H2O from 121 to 83 hPa is 10–20%. The vertical resolution is 1.5–3.5 km depending on height. The horizontal resolution is 210 × 7 km2 along and perpendicular to the Aura orbit track, respectively. Relative humidity is calculated from H2O and temperature. Its precision, accuracy, and spatial resolution are worse than those of H2O.
 Water vapor is a key component in weather and climate as an agent of energy transfer and a greenhouse gas. Accurate water vapor and relative humidity measurements are needed for model testing and improvement, weather forecasting, and predicting future climate change. This paper provides an assessment of the accuracy, precision, and resolution of the version 2.2 (v2.2) Microwave Limb Sounder (MLS) H2O and relative humidity with respect to ice (RHi) products in the upper troposphere and lower stratosphere (UTLS, 316–83 hPa). A similar assessment for H2O in the stratosphere and mesosphere (pressure less than 83 hPa) is given by Lambert et al. [2007]. We also provide quality screening rules for using the data in scientific studies.
 The paper is organized as follows: section 2 presents the measurement method, measured precisions and estimated accuracy, section 3 is a zero-order validation of the data set showing similar behavior with older data sets and meteorological dynamics, section 4 gives results from detailed coincident comparisons between MLS and other sensors, and section 5 gives a summary of the accuracy, precision and spatial resolution of the v2.2 UTLS MLS H2O.
2. H2O and RHi Measurement
2.1. MLS Overview
 MLS observes thermal microwave–far infrared emission from the Earth's atmosphere in five spectral regions. The H2O and RHi measurements described in this paper are retrieved from measurements of the 183 GHz H2O rotational line spectrum. MLS looks forward from the Aura spacecraft and vertically scans the Earth's limb from near the surface to 90 km every 24.7 s. The vertical scan rate varies with altitude with a slower scan in the troposphere and lower stratosphere (0–27 km). The slower vertical scan provides a spectrum every ∼400 m in the troposphere and lower stratosphere.
 This paper describes the use and validation of the v2.2 H2O and RHi data. H2O is retrieved from calibrated MLS radiance observations by the MLS data processing algorithms [Livesey et al., 2006; Jarnot et al., 2006]. The Goff-Gratch [List, 1951] function is used to compute RHi from retrieved H2O and temperature. H2O and RHi are measured or calculated on defined pressure and horizontal grids. The UTLS H2O retrieval pressure grid is 316, 261, 215, 178, 147, 121, 100, 83, …hPa, or 12 levels per decade change in pressure (lpd, ∼1.3 km). The horizontal grid places the profiles every 1.5° along the orbit track. The horizontal grid is phased such that a profile coincides with the equator. There are 240 profiles per orbit at defined but not equally spaced latitudes. MLS retrieves slightly under 3500 H2O and RHi profiles per day.
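The retrieval grid quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the grid is uniform in log10(pressure) and anchored at 1000 hPa, and it uses a nominal 7 km scale height to convert the level spacing to kilometers. Neither assumption is stated in the text; both are chosen because they reproduce the listed levels and the ∼1.3 km spacing.

```python
import math

def mls_pressure_grid(p_start_hpa=1000.0, levels_per_decade=12, n_levels=20):
    """Pressures uniformly spaced in log10(p): levels_per_decade per factor of 10.

    The 1000 hPa anchor is an assumption made for this illustration.
    """
    return [p_start_hpa * 10.0 ** (-i / levels_per_decade) for i in range(n_levels)]

grid = mls_pressure_grid()

# Keep the UTLS part of the grid (316 to 83 hPa) and round to whole hPa.
utls_levels = [round(p) for p in grid if 82.0 < p < 320.0]
print(utls_levels)

# Approximate vertical spacing for an assumed 7 km pressure scale height.
scale_height_km = 7.0  # assumption, not from the paper
spacing_km = scale_height_km * math.log(10.0) / 12
print(round(spacing_km, 2))
```
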
 The MLS H2O and RHi products are reported in separate Level 2 Geophysical Product (L2GP) files, each covering a 24 h period from midnight to midnight universal time. The L2GP files store the data in a Hierarchical Data Format (HDF)-EOS version 5 “swath” format with the swath name (H2O or RHI) describing the product. The MLS Version 2.2 data quality document [Livesey, 2007] gives more information on the file format.
2.2. Proper Use of MLS UTLS H2O and RHi Data
 Each MLS level 2 data point is reported with a corresponding precision value. These reflect the likely contributions of radiometric noise to the data and, in regions where measurement sensitivity is less, the contribution of a priori information. These issues are discussed in more detail in section 2.5. The precisions are set to negative values in situations when the retrieved precision is larger than 50% of the a priori precision, an indication that the data are biased toward the a priori value.
 Three additional data quality metrics are provided for each vertical profile: “Status,” “Quality,” and “Convergence.” The “Status” field is an integer indicating circumstances where profiles are not to be used, or may be suspect because of instrumental and/or retrieval issues. Odd values of “Status” indicate profiles that should never be used. Some nonzero, even values of “Status” occur when the retrieval algorithm detected cloud signatures in some radiances and chose either to ignore them or to deemphasize them by substantially increasing their precision values. Another even value is used if the Goddard Global Modeling and Assimilation Office Earth Observing System (GEOS-5) analysis [Rienecker et al., 2007] temperature and reference geopotential height data are missing and the retrieval uses a climatological profile for the a priori temperature. The impact of this on MLS data varies with species and height. More details on the “Status” field are given by Livesey [2007], and the meanings of its values are presented in Table 1. The “Quality” field describes the goodness of fit between the measured and calculated radiances (larger values imply better fits). Finally, “Convergence” compares the fit achieved for a “chunk” of ∼10 profiles to that expected by the retrieval algorithms; larger values imply poorer convergence. Values in the range 1.0–1.1 indicate that acceptable convergence has been achieved for all 10 profiles. For UTLS H2O and RHi, “Convergence” and “Quality” are tightly linked: poor “Convergence” is always associated with poor “Quality” in some profiles. Since “Quality” is specific to each profile, whereas “Convergence” applies to a chunk of 10 profiles, we chose to ignore “Convergence” and screen by “Quality.”
Table 1. Meaning of Bits in the “Status” Field (the “Status” value in the L2GP file is the total of the appropriate entries)
Bit 0: flag, do not use this profile (see bits 8–9 for details)
Bit 1: flag, this profile is “suspect” (see bits 4–6 for details)
Bit 4: information, this profile may have been affected by high altitude clouds
Bit 5: information, this profile may have been affected by low altitude clouds
Bit 6: information, this profile did not use GEOS-5 temperature a priori data
Bit 8: information, retrieval diverged or too few radiances available for retrieval
Bit 9: information, the task retrieving data for this profile crashed (typically a computer failure)
 The data quality criteria that must be met when using MLS UTLS H2O and RHi data between 316 and 83 hPa are as follows: (1) the precision value for the data point is positive; (2) the “Status” of the profile is even; and (3) the “Quality” of the profile is greater than 0.9. Refer to Lambert et al. [2007] for pressures lower than 83 hPa.
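The three screening rules can be applied in one pass. The following is a minimal sketch, not the MLS team's software: the function name, argument names, and array layout (profiles along the first axis, pressure levels along the second) are assumptions made for illustration, and reading the fields from the L2GP files is left out.

```python
import numpy as np

def screen_utls(values, precision, status, quality, quality_min=0.9):
    """Return a copy of `values` with rejected points set to NaN.

    values, precision : arrays of shape (n_profiles, n_levels)
    status, quality   : arrays of shape (n_profiles,)
    """
    values = np.asarray(values, dtype=float)
    precision = np.asarray(precision, dtype=float)
    status = np.asarray(status, dtype=int)
    quality = np.asarray(quality, dtype=float)

    # Rules 2 and 3 apply to whole profiles: even Status, Quality above threshold.
    profile_ok = (status % 2 == 0) & (quality > quality_min)

    # Rule 1 applies point by point: the reported precision must be positive.
    good = (precision > 0) & profile_ok[:, None]
    return np.where(good, values, np.nan)

# Three made-up profiles on two levels, for illustration only.
screened = screen_utls(
    values=[[100.0, 50.0], [80.0, 40.0], [60.0, 30.0]],
    precision=[[10.0, -5.0], [8.0, 4.0], [6.0, 3.0]],
    status=[0, 1, 16],
    quality=[1.2, 1.5, 0.5],
)
```

In this example the first profile keeps only its first level (the second has negative precision), the second profile is rejected for its odd “Status,” and the third is rejected for its low “Quality.”
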
2.3. MLS v2.2 H2O and RHi Measurements
 MLS measures the 183 GHz H2O rotational line spectrum as a function of height as shown in Figure 1 (top). The spectra also show weaker emissions from neighboring lines of other molecules: N2O, O3(ν2), HNO3, ClO, O3, and HCN. Each spectrum measured by MLS is associated with a field-of-view (FOV) pointing called tangent pressure. The tangent pressure of each pointing is determined from line width measurements of O2 and O18O and from FOV limb scan tangent height data as described by Schwartz et al. [2007]. The temperature profile is retrieved simultaneously with tangent pressure. The temperature and tangent pressure measurements are constrained quantities in the H2O retrieval.
 The 12 lpd vertical gridding of the v2.2 H2O presented new challenges that required significant changes to the retrieval configuration. The most significant changes are the addition of vertical and horizontal regularization and of dynamically determined a priori values and uncertainties in the troposphere. Regularization is a profile smoothing technique, performed both horizontally and vertically, that constrains the second derivative of the profile in both dimensions. It is based on the second-order Tikhonov constraint [Livesey et al., 2006; Rodgers, 2000]. Version 2.2 uses a series of initial estimate retrievals to establish the vertical and horizontal regularization, a priori values, and uncertainties. First, a linear multispecies retrieval [Livesey et al., 2006] is performed on selected R2 radiances (Figure 1) having an optical depth less than 0.4. This produces a good H2O measurement for pressures less than 147 hPa. The regularization, a priori value, and uncertainty for this retrieval are based on a zonal climatology. Second, an estimate of H2O for pressures greater than 316 hPa is derived from a middle tropospheric RHi retrieval using opaque low-looking radiances. The theory of this measurement is given by Soden and Bretherton [1993] and expanded upon for the MLS limb viewing geometry in Text S1, section S1, in the auxiliary material. Third, an initial retrieval of H2O at 316, 215, and 147 hPa (standard levels used in v1.5) is performed in which H2O for pressures less than 147 hPa and greater than 316 hPa is constrained to the values from the two previous H2O retrievals. This 316–147 hPa H2O retrieval has no vertical or horizontal regularization and uses an appropriate subset of R2 radiances.
 H2O retrieved for the three altitude ranges are joined together and serve as the initial estimate in a multispecies non linear retrieval. H2O, N2O, HNO3, ClO, O3, SO2, HCN, and CH3CN are all simultaneously retrieved. H2O is retrieved from 316 to 10.0−5 hPa and the other molecules are retrieved from 100 hPa to their maximum altitude which varies by molecule. O3 and HNO3 from 316 to 147 hPa are constrained to that retrieved from the R3 (240 GHz) radiances [Livesey et al., 2007; Santee et al., 2007] which produces the best MLS estimate of these molecules in the troposphere. H2O for pressures greater than 316 hPa is constrained to that computed from the middle tropospheric RHi retrieval. The retrieval uses selected radiances from all the R2 bands whose limb tangent pressure is <350 hPa or in the case of band 2 and band 23, optical depth <0.4. The a priori profile for pressures greater than 10 hPa is the initial estimate profile interpolated to 12 lpd. The regularization for pressures greater than 10 hPa constrains the more highly sampled H2O retrieval to follow the horizontal and vertical profile shapes of the initial H2O retrieval. The a priori uncertainty for pressures greater than 10 hPa is the minimum of six times the retrieved uncertainty for the initial retrieved H2O or the value associated with the zonal climatology. The a priori, a priori uncertainty, and regularization for pressures less than or equal to 10 hPa are based on a zonal climatology.
 Because of deficiencies in our understanding of the instrument and/or forward model, the radiance residual of the fit is greater than instrument noise as shown in Figure 1 (bottom). Increasing the radiance precision by 0.003 times the radiance improves the retrieval convergence rate from 35% to 80% (“Convergence” ≤ 1.01). This amount of radiance precision inflation (RPI) is equivalent to 0.001 K for a space signal to 0.75 K for a 250 K signal. This is much less RPI than was used for v1.5 which was 0.7 K for all radiances. The residual shows that we are fitting the radiances to within ∼2%.
 Other differences from v1.5 include increasing the optical depth limit from 0.2 to 0.4 for band 2 radiances, using radiances from the digital autocorrelator spectrometer (DACS, band 23), and eliminating the temperature and tangent pressure retrieval in this phase. Spectroscopic changes include an increase in the H2O line strength by 0.7% and an increase in the H2O line width by 4%. The line strength change corrected an error in the Jet Propulsion Laboratory (JPL) spectral catalog [Pickett et al., 1998; B. Drouin, personal communication, 2005], and the line width value is from the cavity measurements by Meshkov used to determine the H2O, N2, and O2 continuum absorption.
2.4. Differences Between v2.2 and v1.5
Figure 2 (left) compares v1.5 and v2.2 H2O profiles between 10°S and 10°N for 25 January to 7 February 2005. The tropics are chosen because of widespread interest in using these data for tropical UTLS investigations. Version 2.2 H2O has twice as many vertical levels below 22 hPa as v1.5. Figure 2 (middle) shows the percent differences. The zig-zag nature of the mean difference, beginning just below the tropopause (∼100 hPa) and propagating to higher altitudes, is mostly a smoothing artifact of the relatively coarse vertical gridding of the v1.5 H2O. Downsampling 12 lpd data to 6 lpd using the forward model smoothing function [Livesey et al., 2006; Read et al., 2006] introduces these artifacts. The biases caused by the minor spectroscopic changes to H2O are smaller (shown later).
Figure 2 (right) shows the estimated single profile precision and the standard deviation of retrievals about the mean H2O. The benefit of using more radiances with less RPI in v2.2 is evident as its single profile uncertainty is comparable to or better than v1.5 in the lower stratosphere even though the vertical resolution is better at most heights.
 In the absence of RPI, the measured H2O variability is the root sum square of the H2O estimated precision and the atmospheric variability. RPI leads to an overestimation of the retrieved H2O precision. Version 2.2 shows agreement between the estimated precision and the measured variability in the lower stratosphere as shown in Figure 2 (right). This agreement is a near match between atmospheric H2O variability and the additional increase in the precision caused by the v2.2 RPI. A method of estimating the H2O precision independently of the retrieval algorithm is to measure the variability of closely coincident profiles on the ascending and descending sides of the Aura orbit [Lambert et al., 2007]. This approach requires that diurnal, dynamical, and chemical effects are negligible over a 12 h period. For H2O, these conditions probably apply to pressures less than 100 hPa in the lower stratosphere. That analysis shows that the estimated precision from the v2.2 MLS retrieval algorithm is overestimated by 25%. For pressures less than 100 hPa in the tropics, the close agreement between the estimated precision and measured variability for H2O is consistent with ∼5% atmospheric variability after correcting for the RPI. Also noteworthy is the decrease in H2O variability for most of the upper tropospheric levels seen in v2.2 relative to v1.5.
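The root-sum-square argument above can be written out explicitly. The worked relation below is a sketch under two labeled assumptions: that "overestimated by 25%" means the reported precision is 1.25 times the true noise-driven precision, and that the measured variability equals the reported precision exactly. The symbols are introduced here for illustration and are not the paper's notation.

```latex
% sigma_meas : measured variability of retrieved H2O
% sigma_prec : true (noise-driven) precision
% sigma_atm  : atmospheric variability
% sigma_est  : precision reported by the retrieval (inflated by RPI)
\begin{align}
  \sigma_{\mathrm{meas}}^{2} &= \sigma_{\mathrm{prec}}^{2} + \sigma_{\mathrm{atm}}^{2}, \\
  \sigma_{\mathrm{est}} &= 1.25\,\sigma_{\mathrm{prec}}, \qquad
  \sigma_{\mathrm{meas}} \approx \sigma_{\mathrm{est}}, \\
  \sigma_{\mathrm{atm}} &= \sigma_{\mathrm{prec}}\sqrt{1.25^{2} - 1}
    = 0.75\,\sigma_{\mathrm{prec}} = 0.6\,\sigma_{\mathrm{est}} .
\end{align}
```

Under these assumptions, an atmospheric variability of ∼5% would correspond to a true precision near 6.7% and a reported precision near 8.3%. These numbers are arithmetic consequences of the stated assumptions, not values taken from Table 2.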
2.5. Precision and Spatial Resolution
 Each MLS H2O and RHi data point is accompanied with a precision. The precision for H2O is taken from the diagonal elements of the solution covariance matrix [Livesey et al., 2006]. It is the error estimated from combining the radiance precision with the a priori uncertainty. A positive precision means that it is less than 50% of the a priori uncertainty indicating that the retrieved H2O is mostly from radiance information. Typical examples of the single profile precisions are given in Table 2 and shown in Figure 2. Table 3 gives RHi precision which incorporates both H2O and temperature precisions.
Table 2. Typical Single Profile Precisions and Resolutions for H2Oa
A range is presented because the estimated precision depends on H2O concentration. The larger value is for latitudes poleward of 60° and the lower value is for the tropics.
V is the vertical resolution. H∥ is the horizontal resolution along the measurement track. The horizontal resolution perpendicular to the measurement track is 7 km for all heights. A range is given for V and H∥ because the H2O averaging kernel full width at half maximum varies with concentration. Generally, better resolution is achieved under drier conditions.
Same as footnote b except the larger value refers to a tropical average.
V is the vertical resolution. H∥ is the horizontal resolution along the measurement track. The horizontal resolution perpendicular to the measurement track is 7 km for all heights. The resolution of the RHi product will be dominated by the spatial resolution of the temperature product [Schwartz et al., 2007].
A range is presented because the estimated precision depends on H2O concentration. The smaller value is a tropical average, and the larger value is for latitudes poleward of 60°.
 MLS H2O measurements are related to the true atmosphere according to

$$x_{\mathrm{MLS}} = x_a + A\left[\eta^{t}\left(\eta\eta^{t}\right)^{-1}x - x_a\right] \qquad (1)$$

where xMLS is the logarithm of MLS retrieved H2O, x is the logarithm of the true H2O, xa is the logarithm of the H2O a priori profile, ηt(ηηt)−1 is the forward model smoothing function [Read et al., 2006; Livesey et al., 2006], which is a least squares solution matrix that fits the MLS grid point representation function to the true profile, and A is the retrieval averaging kernel [Rodgers, 1990]. In general, the vectors xMLS, xa, and x consist of points associated with a vertical and horizontal location, and the matrices η and A operate in both the horizontal and vertical dimensions. Equation (1) is valid for situations where the measurement system responds linearly to the profile fluctuations being smoothed. For H2O this is often not the case for pressures greater than 147 hPa, and therefore equation (1) may not accurately represent the smoothing.
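The smoothing relation described above can be illustrated numerically. The sketch below assumes the linear form x_MLS = x_a + A(F x − x_a), where F is a least squares operator that fits the coarse retrieval grid to a finely sampled true profile. Triangular (linear interpolation) basis functions and `np.linalg.pinv` are used as stand-ins for the MLS representation functions and smoothing function; they are assumptions for illustration, as are all function and variable names.

```python
import numpy as np

def hat_basis(fine, coarse):
    """Linear-interpolation basis of the coarse grid evaluated on the fine grid.

    Both grids must be increasing (e.g. -log10 of pressure).
    Returns an array of shape (n_fine, n_coarse).
    """
    basis = np.zeros((fine.size, coarse.size))
    for j in range(coarse.size):
        unit = np.zeros(coarse.size)
        unit[j] = 1.0
        basis[:, j] = np.interp(fine, coarse, unit)
    return basis

def smooth_to_mls(vmr_true_fine, vmr_apriori, avg_kernel, fit_operator):
    """Apply x_MLS = x_a + A (F x - x_a) in log(VMR) space and return VMR."""
    x = np.log(vmr_true_fine)
    xa = np.log(vmr_apriori)
    return np.exp(xa + avg_kernel @ (fit_operator @ x - xa))

# Illustrative grids in an arbitrary increasing height-like coordinate.
coarse = np.array([0.0, 1.0, 2.0, 3.0])
fine = np.linspace(0.0, 3.0, 31)
B = hat_basis(fine, coarse)
F = np.linalg.pinv(B)  # least squares fit of coarse-grid values to a fine profile

# A true profile that the coarse grid can represent exactly (made-up values, ppmv).
coarse_vmr = np.array([200.0, 50.0, 10.0, 4.0])
true_fine_vmr = np.exp(B @ np.log(coarse_vmr))
apriori_vmr = np.full(coarse.size, 20.0)

perfect = smooth_to_mls(true_fine_vmr, apriori_vmr, np.eye(4), F)
no_info = smooth_to_mls(true_fine_vmr, apriori_vmr, np.zeros((4, 4)), F)
```

With an identity averaging kernel the result reproduces the coarse-grid truth, and with a zero averaging kernel it returns the a priori profile, which are the two limiting cases of the relation.
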
Figure 3 shows the vertical and horizontal averaging kernels, A, for UTLS H2O. The vertical resolution and the horizontal resolution along the orbit track are each the full width at half maximum (FWHM) of the corresponding averaging kernel. The width of the averaging kernel varies with H2O concentration. The horizontal resolution perpendicular to the orbit track is 7 km, the FWHM of the azimuth antenna pattern [Cofield and Stek, 2006]. The orbit tracks are separated by 10°–20° of longitude at middle to low latitudes, with much finer sampling in the polar regions.
 The resolution of the relative humidity product is most likely represented by the temperature or H2O product having the poorer resolution. The temperature averaging kernels are given by Schwartz et al. . The quantified resolutions for H2O and RHi as a function of height are given in Tables 2 and 3.
 A major component of the validation of MLS data is the quantification of the various sources of systematic uncertainties. Systematic uncertainties arise from instrumental issues (e.g., radiometric calibration, FOV characterization), spectroscopic uncertainty, and approximations in the retrieval formulation and implementation. This section summarizes the results of a comprehensive quantification of these uncertainties that was performed for all MLS products. More information on this assessment is given in Appendix A.
 The impact on MLS measurements of radiance (or pointing where appropriate) of each identified source of systematic uncertainty has been quantified and modeled. These modeled impacts correspond to either 2-σ estimates of uncertainties in the relevant parameters, or an estimate of their maximum reasonable errors based on instrument knowledge and/or design requirements. The effect of these perturbations on retrieved MLS products has been quantified for each source of uncertainty by one of two methods.
 In the first method, sets of modeled errors corresponding to the possible magnitude of each uncertainty have been applied to simulated MLS cloud-free radiances, based on a model atmosphere, for a whole day of MLS observations. These sets of perturbed radiances have then been run through the MLS data processing algorithms, and the differences between these runs and the results of the “unperturbed” run have been used to quantify the systematic uncertainty in each case. The impact of the perturbations varies from product to product and among uncertainty sources. Although the term “systematic uncertainty” is often associated with consistent additive and/or multiplicative biases, many sources of systematic uncertainty in the MLS measurement system give rise to additional scatter in the products. For example radiometric calibration errors cause both over and underestimation of H2O. The extent to which such terms average down is estimated to first order by these “full up studies” through their separate consideration of the bias and scatter each source of uncertainty introduces into the data. The difference between the retrieved product in the unperturbed run and the original “truth” model atmosphere is taken as a measure of uncertainties due to retrieval formulation and numerics. Another retrieval of the unperturbed radiances is performed with the H2O a priori profile increased by 50%. This adjustment was only applied to the initial estimate retrievals.
 In the second method, the potential impact of some (typically small) systematic uncertainties has been quantified through calculations based on a simple model of the MLS measurement system (see section S2 in auxiliary material). Unlike the full up studies, these calculations only provide estimates of % bias error introduced by the source in question; this approach is unable to quantify additional scatter for these sources of uncertainty.
 Finally, MLS observations are affected by thick clouds associated with deep convection. The MLS level 2 data processing algorithms discard or deemphasize radiances by increasing the precision of those identified as being affected by clouds [Livesey et al., 2006]. The contribution of cloud effects to the systematic uncertainty, both from the presence of clouds not thick enough to be screened out by the cloud filtering and from the loss of information through omission of cloud-impacted radiances, has been quantified by adding “cloud-induced radiances” [Wu et al., 2006] from a representative cloud field to the simulated radiances and comparing retrievals based on these radiances to the unperturbed results. The cloud-induced effects are estimated by considering only the cloudy profiles (having a vertically summed ice water content greater than 2 mg m−3 in the “truth” field).
Figure 4 shows a scatterplot of retrieved H2O versus true H2O from the retrieval algorithm test (Appendix A). The retrieval algorithm test shows how accurately the MLS retrieval algorithms retrieve a known H2O in the absence of systematic errors and noise. The truth H2O has not been smoothed by equation (1). Most of the scatter in Figure 4 arises from neglecting the two-dimensional (2-D) smoothing effect. Applying the 2-D averaging kernel (Figure 3 and equation (1)) to the truth (not shown) reduces the RMS difference by a factor of 3. A noteworthy feature in Figure 4, which is clearly evident at 261 and 178 hPa and slightly evident at 215 hPa, is that the retrieved H2O asymptotically approaches 3–4 parts per million by volume (ppmv) as the true H2O approaches ∼1 ppmv. Applying proper smoothing appears to eliminate this problem at 178 and 215 hPa but not at 261 hPa. At these low concentrations and high pressures, the dry continuum is the dominant absorber. Retrieved temperature and limb tangent pressure errors will cause an error in the calculation of the dry continuum absorption that may have a relatively stronger “knock-on” effect on H2O. At lower pressures (<178 hPa) there is no clear evidence that such asymptotic behavior exists; however, the concentration is never allowed to fall below 0.1 ppmv because a logarithmic representation is used for H2O. Therefore, in addition to a % bias, there is a minimum measurable H2O.
Figure 5 shows the contributions of several categories of systematic errors to H2O. Table 4 summarizes these errors in tabular form. Sources of error contributing more than 20% are (1) pointing, (2) radiometric/spectroscopic, (3) clouds, and (4) retrieval. Pointing error contributions are approximately evenly divided among O2 line width uncertainty, FOV direction offset uncertainty for R1A/B, and FOV direction offset uncertainty for R2. The dominant radiometric error source is from the gain compression error. Errors in cloudy scenes are due to inaccuracies of neglecting their radiation scattering and emissions in the level 2 forward model. The retrieval uncertainty is a sum of algorithm and a priori considerations. Figure 5 and Table 4 are a statistical summary of the scatter shown in Figure 4. Most of the error associated with the retrieval algorithm is caused by neglecting the proper 2-D smoothing. Applying the mid latitude 2-D averaging kernel to the data set (itself an approximation because the averaging kernel depends on the H2O volume mixing ratio (VMR)) reduces the retrieval bias and standard deviation to 3% and 5%, respectively, for pressures ≤215 hPa. These values are similar to the impact of increasing the H2O a priori by 50%. The bias and standard deviation of the differences between smoothed true and retrieved H2O at 316 and 261 hPa are half of that under retrieval in Table 4. We have chosen the unsmoothed values for the error assessment because in most situations including comparisons shown in this paper, we do not have high-resolution 2-D correlative data. Therefore the best we can do is smooth the one dimension that has better resolution than MLS and perform an unsmoothed comparison in the other dimension.
Table 4. Systematic Uncertainty Contributions (Bias/Random) for H2Oa
Forward Model, %/%
Clear Sky Total, %/%
Totals do not include errors due to clouds; unk (unknown) contributes nothing to the total.
 Because RHi is derived from H2O and temperature, it includes errors from both sources. Temperature errors add an additional 12–15% per kelvin (% K−1) to the RHi error because the saturation mixing ratio depends exponentially on temperature.
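The quoted temperature sensitivity can be checked numerically. The sketch below uses the commonly quoted Goff-Gratch formulation for saturation vapor pressure over ice; the coefficients are reproduced from that standard form and should be verified against List [1951] before any quantitative use. The function names, the 200 to 220 K range, and the conversion of mixing ratio to partial pressure (VMR × total pressure) are assumptions made for illustration.

```python
import math

def e_sat_ice_hpa(t_kelvin):
    """Saturation vapor pressure over ice (hPa), Goff-Gratch form."""
    t0 = 273.16  # triple point temperature used as the reference (K)
    ratio = t0 / t_kelvin
    log10_e = (-9.09718 * (ratio - 1.0)
               - 3.56654 * math.log10(ratio)
               + 0.876793 * (1.0 - t_kelvin / t0)
               + math.log10(6.1071))
    return 10.0 ** log10_e

def rhi_percent(h2o_vmr, pressure_hpa, t_kelvin):
    """RHi in percent from H2O volume mixing ratio (mol/mol), pressure, temperature."""
    return 100.0 * h2o_vmr * pressure_hpa / e_sat_ice_hpa(t_kelvin)

def saturation_sensitivity_per_kelvin(t_kelvin, dt=0.01):
    """d ln(e_sat) / dT by central difference: fractional RHi error per kelvin."""
    upper = math.log(e_sat_ice_hpa(t_kelvin + dt))
    lower = math.log(e_sat_ice_hpa(t_kelvin - dt))
    return (upper - lower) / (2.0 * dt)

for temperature in (200.0, 220.0):
    percent_per_k = 100.0 * saturation_sensitivity_per_kelvin(temperature)
    print(temperature, round(percent_per_k, 1))
```

For temperatures typical of the upper troposphere, this gives roughly 15% per kelvin at 200 K and roughly 13% per kelvin at 220 K, in line with the range quoted in the text.
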
3. “Zero-Order” Validation
 This section shows that the MLS H2O or RHi measurements exhibit correct behavior. We show that the MLS v2.2 H2O near the tropopause is consistent with meteorological dynamics. We show comparisons with the Goddard global modeling and assimilation office Earth Observing System (GEOS-5) analyses [Rienecker et al., 2007].
3.1. Consistency With Meteorological Fields
Figure 6 shows mapped v2.2 H2O fields for 28 January 2005 at 261, 215, 178, and 147 hPa. Overlaid is the ±3.5 potential vorticity unit (PVU; 1 PVU = 10−6 K m2 kg−1 s−1) contour from GEOS-5 analyses, which indicates the dynamical tropopause [Highwood and Berrisford, 2000; Schoeberl, 2004]. Typically, air poleward of the ±3.5 PVU contour is in the stratosphere (∣PV∣ > 3.5 PVU) and air equatorward of it is in the troposphere (∣PV∣ < 3.5 PVU). Stratospheric H2O concentrations are less than 10 ppmv and tropospheric values are usually greater. The 5–10 ppmv H2O contours closely follow the dynamical tropopause. The poleward movement of the tropopause with higher pressures is reflected by a similar poleward movement of higher mixing ratios seen by MLS. Intrusions of stratospheric air into low latitudes and of tropospheric air into high latitudes, as indicated by PV, show corresponding dry and moist features in the MLS data. The correspondence is also good at 316 hPa. This comparison shows that the MLS v2.2 H2O 5–10 ppmv contours closely follow the ±3.5 PVU contour at all the heights shown.
Figure 7 shows the 1200 UT GEOS-5 H2O interpolated to 261, 215, 178, and 147 hPa with the GEOS-5 ±3.5 PVU contour overlaid, for comparison with the v2.2 MLS maps in Figure 6. There is good visual agreement between v2.2 MLS and GEOS-5 for tropospheric H2O, that is, in the region equatorward of the ±3.5 PVU contour. Poleward of the ±3.5 PVU contour, MLS is drier and more in accordance with stratospheric values, being less than 10 ppmv. The horizontal resolution of GEOS-5 (0.625° × 0.5°, longitude × latitude) is much higher than the MLS horizontal sampling, and the GEOS-5 maps therefore show detailed features that are missed or smeared out in the MLS maps.
Figure 8 shows a coincident difference density plot of 159 d (17 August 2004 to 27 March 2007) of MLS v2.2 and GEOS-5 H2O. The latitude dependence of the scatter is minor and the global view shown here captures the essence of the comparison. The 316 hPa H2O shows good overall agreement for VMR between 100 and 500 ppmv. For lower pressures, MLS H2O is ∼40% drier than GEOS-5 H2O for concentrations greater than 10 ppmv. The correlation for coincident comparisons disappears when GEOS-5 H2O < 10 ppmv for pressures ≥147 hPa. This is probably due to the assimilation of Aqua Atmospheric Infrared Sounder (AIRS) radiances in the GEOS-5 analyses. As shown in section 4 and in section S1 in the auxiliary material, the AIRS measurement technique is not well suited for stratospheric conditions.
4. Coincident Comparisons
 This section focuses on coincident comparisons of MLS v2.2 with AIRS v4 measurements and several in situ sensors. Here we compare MLS v2.2 humidities with those from another technique that were measured approximately at the same time and location. Results of comparisons with AIRS, Vaisala RS92/90 radiosondes, hygrometers on the Aura Validation Experiment (AVE) campaigns, and balloon borne cryogenic frost point hygrometer (CFH) are discussed. Comparisons with other satellite and the balloon borne sensors that mainly focus on stratospheric H2O are presented elsewhere [Lambert et al., 2007].
4.1. Comparison With Aqua AIRS
 AIRS is a spectrally resolved (ν/Δν ≈ 1200) infrared sounder with 2378 channels covering 650–2675 cm−1 that was launched on the EOS Aqua satellite on 4 May 2002. AIRS retrieves H2O on 28 standard levels of which we consider the 500, 400, 300, 250, 200, and 150 hPa levels. On the basis of radiosonde, aircraft and balloon frost point comparisons, the accuracy of the v4 AIRS humidity is 5%, 10%, 15%, 20%, 25% and 25% for the aforementioned heights, respectively [Divakarla et al., 2006; Tobin et al., 2006; Hagan et al., 2004; Gettelman et al., 2004]. The accuracies for pressures less than 250 hPa are based on a few low latitude intercomparisons. The measurement precision based on extensive comparisons at Atmospheric Radiation Measurement sites [Stokes and Schwartz, 1994; Ackerman and Stokes, 2003] and radiosondes is 30–35% between 316 and 178 hPa [Tobin et al., 2006; Divakarla et al., 2006]. The horizontal resolution at nadir is 45 km and the vertical resolution is specified as a 20% accuracy for humidity in 2 km layers [Fetzer et al., 2003]; however, no averaging kernels have been produced for AIRS v4 H2O. While lower and middle tropospheric water vapor has been extensively examined, no analyses have assessed AIRS water vapor globally at pressures lower than 250 hPa.
 Aqua and Aura fly in a satellite formation known as the “afternoon” or A train [Schoeberl et al., 2006], in which Aura follows Aqua by 15 min. MLS, which looks forward, sees the same atmosphere ∼8 min after AIRS. Currently ∼300,000 coincidences are available.
 Although AIRS and MLS observe atmospheric thermal emission from H2O, their humidity measurements are fundamentally different. The AIRS instrument, which looks down, observes radiances that are proportional to the logarithm of relative humidity in the troposphere [Soden and Bretherton, 1993] as described in section S1 in the auxiliary material. The AIRS RHi measurement requires a thermal gradient along its line of sight where H2O has its strongest observable emission. Under these conditions, the AIRS relative humidity measurement is insensitive to temperature errors. The MLS limb viewing geometry measures radiances that are proportional to H2O VMR (see section S2 of auxiliary material). MLS does not require a thermal gradient along its line of sight but does need to view through a semi transparent atmosphere. In this situation temperature errors do not strongly affect the MLS H2O VMR measurement. These conditions do not ideally exist in the upper troposphere for either instrument. The thermal gradient becomes small eventually disappearing at the tropopause, and the atmosphere becomes nearly opaque when viewed on the limb. Both conditions are challenging for AIRS and MLS. As previously discussed, when MLS scans its FOV to low tangent heights (0–5 km), the atmosphere is usually opaque and MLS becomes a relative humidity sounder like AIRS.
 Comparing H2O VMR also folds in AIRS temperature errors. These are typically ∼1 K root-mean-square (RMS, accuracy and precision) and contribute less than 5% to the H2O VMR uncertainties for both precision and accuracy. Likewise RHi comparisons include MLS temperature uncertainties. For MLS, the temperature accuracy between 316 and 178 hPa is −2–0 K and the precision is 2 K [Schwartz et al., 2007]. These larger values have a bigger impact as discussed later.
4.1.1. Horizontal Smoothing of the AIRS Data
 For each available MLS v2.2 day, we produced an MLS-like geophysical data set on the MLS standard horizontal and vertical grids from AIRS v4 data. First, we produce AIRS measurements at the AIRS horizontal resolution along the MLS limb tangent measurement track. The MLS measurement footprint is shown in Figure 9 amidst AIRS footprints. The MLS measurements almost evenly split two of the AIRS 30 cross track measurement sweeps. The cross track measurement sweeps are separated by 45 km along the orbit track. Each pair of nearest cross track measurements along the AIRS orbit track is quality screened with the QUAL_TEMP_PROFILE_MID = 0 flag. If both members of the pair have good flags, they are averaged, weighted by their normalized orthogonal distance from the MLS measurement track (the weights are almost 0.5/0.5 most of the time). If only one of the pair has a good flag, the good measurement is used, and if both are bad then neither is used. At this stage we have a measurement track of AIRS data following the MLS observation path with points every 45 km (with some missing data gaps). Next, we apply the forward model smoothing function, equation (1), along the horizontal direction to produce the data on the MLS horizontal grid. This procedure properly down samples the higher AIRS horizontal resolution to the coarser MLS horizontal resolution. The horizontal averaging kernel contribution is neglected because in the troposphere its resolution is close to the MLS horizontal grid point separation. The horizontal smoothing is done for each of the 28 AIRS standard vertical levels. The AIRS vertical profile is interpreted as a constant H2O VMR between two adjacent grid points; hence the profile has a stair step appearance. We assign the pressure of the H2O VMR to be the geometric mean of the two adjacent pressure levels.
After the horizontal smoothing, the AIRS vertical profile is interpolated with respect to the logarithm of H2O to the MLS vertical levels without additional smoothing. MLS vertical smoothing is neglected because the vertical resolution of the AIRS H2O product is expected to be comparable to or worse than MLS. We ignore any consequence of the 8 minute time difference.
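The pair averaging and the vertical interpolation described above can be illustrated with a minimal Python sketch. It is not the production code: the function names are hypothetical, the pair weights are assumed to be linear in the orthogonal distance (nearer footprint weighted more), and the interpolation is assumed to be linear in the logarithm of pressure, which the text does not state. The horizontal smoothing with equation (1) is not included.

```python
import numpy as np

def average_pair(v1, v2, d1, d2, good1, good2):
    """Combine the two AIRS cross-track values that straddle the MLS track.

    d1, d2 are orthogonal distances from the MLS track (same units).
    Assumption: linear weighting, so the nearer footprint gets more weight.
    Returns NaN when neither footprint passes the quality screening.
    """
    if good1 and good2:
        w1 = d2 / (d1 + d2)          # ~0.5 when the track splits the pair
        return w1 * v1 + (1.0 - w1) * v2
    if good1:
        return v1
    if good2:
        return v2
    return float("nan")

def airs_to_mls_levels(airs_p_hpa, airs_layer_vmr, mls_p_hpa):
    """Interpolate AIRS layer H2O to MLS pressure levels in log(VMR).

    airs_p_hpa holds the N+1 pressures bounding N layers. Each layer value
    is assigned to the geometric mean of its two bounding pressures.
    Assumption: interpolation is linear in log(pressure).
    """
    p = np.asarray(airs_p_hpa, dtype=float)
    p_mid = np.sqrt(p[:-1] * p[1:])                 # geometric-mean pressures
    x = np.log(p_mid)
    y = np.log(np.asarray(airs_layer_vmr, dtype=float))
    order = np.argsort(x)                           # np.interp needs increasing x
    return np.exp(np.interp(np.log(mls_p_hpa), x[order], y[order]))
```

Note that np.interp holds the end values constant outside the range of the AIRS layer pressures, so MLS levels beyond that range should be discarded rather than trusted.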
4.1.2. Data Issues
 The data require further screening. Figure 10 shows maps made from coincident MLS and AIRS data for 25 April 2006. These data are screened by each instrument's recommended criteria. The region equatorward of ±3.5 PVU is in the troposphere and shows much better agreement than the region poleward of ±3.5 PVU, which is in the stratosphere. H2O values less than 10 ppmv are shaded grey on these maps and, with very few exceptions occurring at 316 hPa, MLS shows the stratosphere as grey. AIRS shows a much wetter stratosphere, often with values over 10 ppmv and more than 2 times higher than MLS. This corroborates previous work by Gettelman et al. showing poor AIRS performance for H2O < 20 ppmv. Therefore we do not use any AIRS data when MLS H2O is less than 20 ppmv.
 Another feature one sees is a tendency for MLS to measure both higher and lower H2O than AIRS over the moist regions especially at 316 and 261 hPa. This is best illustrated in a time series as shown in Figure 11 (but is also apparent in Figure 10). In the orbit shown, both “Quality” and “Convergence” have good values showing that the radiances are well fit. MLS at 316 hPa in both v2.2 and v1.5 sometimes have H2O oscillating high and low about the AIRS H2O. We also show two non standard MLS levels, 464 and 383 hPa, to illustrate a change made in v2.2 and its possible impact on the 316 hPa H2O. MLS v2.2 uses a single tropospheric layer RHi retrieval as a constraint for all pressures greater than 316 hPa. The peak of the weighting function associated with this retrieval is typically 500–350 hPa (middle troposphere) depending on the H2O concentration. The MLS v2.2 H2O calculated from the middle tropospheric RHi tracks the AIRS measurements at 464 and 383 hPa much better than the zonal climatology used in v1.5. Pay special attention to the feature over South Africa where v2.2 MLS retrieves H2O < 1 ppmv at 316 hPa. AIRS shows a precipitous drop in H2O from 464 to 383 hPa that is not properly captured in the MLS middle tropospheric H2O. The result is that the initial 383 hPa H2O constraint used by MLS is too moist which causes overcompensation with some horizontal oscillations at 316 hPa. In this example, the 261 hPa level appears unaffected.
 Another artifact is the high bias seen in MLS concentrations, particularly noticeable when H2O > 1000 ppmv, which will be argued later to be evidence of a ∼1% transmission or gain error in the instrument. As the example shows, these artifacts occur in the absence of clouds and are not detectable with any of our quality flags. The time series also show that for H2O < 20 ppmv at high latitudes, AIRS is consistently wetter than MLS.
 Last we show a mapped comparison of the MLS middle tropospheric RHi and AIRS 383 hPa RHi in Figure 12 for 25 April 2006. The weighting function of the radiance with respect to RHi typically peaks at 350–400 hPa for typical tropical H2O concentrations. The morphologies of the MLS and AIRS fields show good agreement. MLS, however, is consistently higher than AIRS in the tropics. A noteworthy feature is a large region over Antarctica having RHi near 100%. This is a surface reflection feature (not properly modeled in the operational MLS forward model) off the 4 km Antarctic plateau, which is observed in a limb viewing geometry.
4.1.3. Results From Comparisons
 One hundred fifty-nine days of available v2.2 MLS H2O measurements from 17 August 2004 to 27 March 2007 are scattered against the horizontally smoothed coincident AIRS data and shown in Figure 13. No data are used when MLS measures <20 ppmv at any level. A line showing the medians of the differences of the MLS H2O binned by AIRS H2O shows the best agreement at all heights when the mixing ratio is greater than 100 ppmv and less than 500 ppmv. When the AIRS H2O measurement is less than 100 ppmv, the coincident MLS H2O is lower until AIRS measures 30 ppmv. The positive MLS median difference for AIRS H2O less than 30 ppmv is probably a sampling bias caused by not using MLS measurements less than 20 ppmv. When AIRS measures H2O greater than 500 ppmv, MLS tends to be higher.
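The binned median line described above can be sketched as follows. This is an illustrative reconstruction only: the function name and the bin edges are hypothetical, the percent difference is assumed to be normalized by the AIRS value, and the 20 ppmv screening is applied to each coincident pair rather than to whole profiles.

```python
import numpy as np

def binned_median_percent_diff(mls, airs, bin_edges, mls_min=20.0):
    """Median of 100*(MLS - AIRS)/AIRS in bins of AIRS H2O (ppmv).

    Pairs with MLS H2O below mls_min are discarded first (simplified,
    per-pair version of the screening). Empty bins are returned as NaN.
    """
    mls = np.asarray(mls, dtype=float)
    airs = np.asarray(airs, dtype=float)
    keep = mls >= mls_min
    mls, airs = mls[keep], airs[keep]
    pct = 100.0 * (mls - airs) / airs
    idx = np.digitize(airs, bin_edges) - 1      # bin i covers [edge_i, edge_i+1)
    medians = np.full(len(bin_edges) - 1, np.nan)
    for i in range(len(bin_edges) - 1):
        in_bin = idx == i
        if in_bin.any():
            medians[i] = np.median(pct[in_bin])
    return medians
```

Because pairs are removed according to the MLS value while the bins are defined by the AIRS value, the driest AIRS bins retain only the pairs where MLS read high, which is the sampling bias mentioned in the text.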
 There is little latitude dependence in the comparison equatorward of 60°. The agreement with AIRS is poorer at 316 and 261 hPa poleward of 60° as shown in Figure 14 for the Southern Hemisphere. The northern high latitudes (not shown) show that the 316 hPa H2O (which is much wetter) is similar to the global comparison but the 261 hPa level looks like that shown in Figure 14. For pressures less than 261 hPa, MLS H2O concentrations are mostly less than 20 ppmv and therefore those levels are not shown.
Figure 15 shows the median and RMS of the MLS and AIRS differences as a function of pressure level. Also shown are the root-sum-square (RSS) of the estimated uncertainties of bias and RMS from the MLS systematic error analysis and from AIRS. The mean differences for the four levels are ∼−5%. Their RMS is much larger, being 35–65%. The RMS of the AIRS and MLS differences is less than the estimated combined RMS of the two instruments. The RMS as a function of height is revealing. It has a minimum at 261 hPa. This is the altitude or, more appropriately, the H2O concentration range where these techniques best overlap. The reduction in the RMS of the differences from 215 to 178 hPa seen in the comparisons between MLS and AIRS may be influenced by a much larger reduction in atmospheric variability rather than better correlative agreement.
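The comparison between the observed RMS of the differences and the RSS of the two instruments' estimated uncertainties amounts to the following check, sketched here in Python. The function names are hypothetical and the numbers in the usage note are placeholders, not values taken from Table 4 or Figure 15.

```python
import math

def rss(*terms):
    """Root-sum-square of independent uncertainty estimates."""
    return math.sqrt(sum(t * t for t in terms))

def rms(values):
    """Root-mean-square of a sequence of coincident differences."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def within_combined_uncertainty(differences, sigma_a, sigma_b):
    """True when the observed RMS does not exceed the RSS of the two
    instruments' individually estimated uncertainties."""
    return rms(differences) <= rss(sigma_a, sigma_b)
```

For example, with placeholder uncertainties of 60% and 35%, the combined estimate is rss(60.0, 35.0), about 69%, so an observed RMS difference of 50% would fall inside it.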
Figures 16 and 17 show scatter density plots and PDFs, respectively, for coincident AIRS and MLS measurements when MLS detects a cloud in its FOV. The MLS processing handles these cloud detections [Wu et al., 2007] by significantly increasing the radiance uncertainty for the heights where the cloud is detected. MLS coincident cloud detections affect 5% of all profiles considered here. The scatter density plots are similar to those for the whole data set except there are higher densities of H2O measurements in the more humid bins. The PDFs of the differences have biases that are sometimes more positive and have a higher RMS than the global comparison. These differences will include sampling effects in addition to cloud effects. For example, compare the global scatter density (Figure 13) at 316 hPa that shows a high density of points measured between 50 and 300 ppmv to the cloudy scene scatter density that has its highest density between 300 and 700 ppmv. MLS measurements in clouds show a large increase in scatter and a small change in bias compared to clear sky conditions.
4.1.4. Relative Humidity
Figures 18 and 19 show scatter density and PDFs of the differences between AIRS and MLS RHi between 30°S and 30°N at 383, 316, 261, and 215 hPa. We only show the tropics here because it avoids the stratosphere where AIRS does poorly and the peak of the weighting function for the MLS middle tropospheric RHi retrieval is near 383 hPa. MLS middle tropospheric RHi is 5–15% RHi greater than AIRS 383 hPa RHi. The improved agreement between MLS and AIRS for RHi > 80% is caused by setting all RHi > 110% to 110% for the cloud ice processing routines [Wu et al., 2007]. RHi derived from the “nonopaque” limb viewed measurements of H2O (P ≤ 316 hPa) show a progressive overestimation when RHi > 40%. The PDFs in Figure 19 show MLS having worse agreement with AIRS for RHi and a higher RMS difference than the H2O comparisons. Unlike H2O, MLS RHi for pressures less than 383 hPa is very sensitive to temperature errors (12–15% K−1). The increased RMS differences from 316 to 215 hPa are within the MLS analyzed systematic uncertainty which uses a 2 K MLS temperature precision [Schwartz et al., 2007]. The MLS middle troposphere RHi, like AIRS is a direct measurement of RHi and is more immune to temperature errors. This accounts for the significantly improved RMS for the coincident differences. The large jump seen in the bias from 316 hPa to 383 hPa (as well as the tendency for MLS to overestimate H2O and RHi at higher mixing ratios and relative humidities) is consistent with a 1.3% fractional radiance error in the MLS measurement system. The MLS middle tropospheric RHi measurement is very sensitive to a fractional radiance error whereas the limb viewed optically thin measurements of RHi or H2O are not (see auxiliary material for a discussion of measurement differences between opaque versus transparent conditions).
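The quoted temperature sensitivity of RHi derived from H2O can be checked with a short calculation. At fixed H2O VMR and pressure, RHi is inversely proportional to the saturation vapor pressure over ice, so its fractional sensitivity is |d ln e_si/dT|. The sketch below assumes the Clausius-Clapeyron relation with a constant latent heat of sublimation; the constants and the function name are assumptions of this example, not the formulation used in the MLS processing.

```python
L_SUB = 2.834e6   # J/kg, latent heat of sublimation (assumed constant)
R_V = 461.5       # J/(kg K), specific gas constant of water vapor

def rhi_temperature_sensitivity(t_kelvin):
    """Fractional change in RHi per kelvin of temperature error,
    at fixed H2O VMR and pressure: |d ln e_si / dT| = L / (R_v T^2)."""
    return L_SUB / (R_V * t_kelvin ** 2)

for t in (225.0, 215.0, 205.0):
    print(f"T = {t:.0f} K: {100.0 * rhi_temperature_sensitivity(t):.1f}% per K")
```

For upper tropospheric temperatures of roughly 205–225 K this gives values between about 12% and 15% per kelvin, in line with the range quoted above, so a 2 K temperature uncertainty alone maps to roughly a 25–30% RHi uncertainty.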
4.1.5. Probability Distributions in Clear and Cloudy Scenes
 A PDF plot of RHi for AIRS v4 and MLS v2.2 is shown in Figure 20. The MLS RHi distribution is separated into clear and cloudy situations based on AIRS cloud data to show that the cloudy scenes have RHi near 100%. Only tropical data (20°S–20°N) are shown as this eliminates most of the stratospheric data, which are poorly measured by AIRS. The MLS data are separated into three curves: all data, clear sky, and cloudy sky. The maximum of the PDF for the “all data” curves for both instruments occurs at approximately the same RHi (10–30% RHi). The RHi of the PDF maximum increases with decreasing pressure.
 A notable difference is that the AIRS PDF decreases sharply for RHi > 70–80% whereas MLS shows a slowly decaying tail well into supersaturation. This behavior is expected in part because AIRS RHi is more immune to temperature errors than MLS RHi which is derived from H2O and temperature. A study of the impact of temperature noise on RHi PDFs presented by Buehler and Courcoux  shows that PDFs like those observed by MLS would be consistent with a random temperature uncertainty of 2–10 K neglecting all other errors. It is unlikely that MLS has a 10 K temperature error [Schwartz et al., 2007]. As shown above, the long tail seen for supersaturated values that is particularly evident in the 316 and 261 hPa PDF is further evidence of a fractional radiance error in the MLS measurement system.
 Colocating clouds detected by a down-looking sounder with the MLS limb sounding geometry is not simple [Kahn et al., 2007]. Clouds detected by AIRS may partially fill the MLS measurement footprint. There is uncertainty in defining a cloudy scene where the MLS measurement volume is 100% RHi. There are ∼6 AIRS scenes per MLS measurement. For this analysis, we find the nearest 6 AIRS scenes (2 cross track and 3 along track) and calculate the average logarithm of the cloud top pressures (CTP) and the average cloud fraction (CF). A cloudy scene is defined as one having an average CF in excess of 50%. For those scenes where the CF exceeds 50%, the CTP is identified with the nearest MLS pressure level and PDFs are produced for the MLS upper tropical tropospheric heights. Once a scene is identified as cloudy, all levels under that measurement are excluded from further consideration, because their cloudiness is indeterminable. We assume that the geometric mean of the scenes' CTP better represents the altitude of the average CF than the minimum CTP used for the CTP validation [Kahn et al., 2007]. We also produce a clear sky PDF which is defined as one where the AIRS maximum CTP of the six scenes is more than one half the grid separation under the MLS standard level. Both clear and cloudy curves are plotted. The cloudy curves show distribution peaks between 110 and 120% RHi at 215 and 178 hPa. A clearly defined peak does not show up at 316 and 261 hPa. A deficiency in the approach used here is that the CF is not always associated with the average CTP and the MLS volume averaged H2O is not necessarily 100% RHi. That association is probably best at 215 and 178 hPa where cirrus anvils from deep convection develop.
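The cloudy scene identification described above can be sketched in Python. The function name is hypothetical, the nearest MLS level is assumed to be chosen in log pressure, and the clear sky test is not included.

```python
import math

def classify_cloudy_scene(ctp_hpa, cloud_fraction, mls_levels_hpa, cf_threshold=0.5):
    """Return the MLS pressure level assigned to a cloudy scene, or None.

    ctp_hpa, cloud_fraction: values for the ~6 nearest AIRS scenes
    (2 cross track by 3 along track). A scene is cloudy when the mean
    cloud fraction exceeds cf_threshold; its cloud top is the geometric
    mean of the cloud top pressures (mean of log CTP).
    Assumption: "nearest" MLS level is measured in log pressure.
    """
    mean_cf = sum(cloud_fraction) / len(cloud_fraction)
    if mean_cf <= cf_threshold:
        return None
    mean_log_ctp = sum(math.log(p) for p in ctp_hpa) / len(ctp_hpa)
    return min(mls_levels_hpa, key=lambda p: abs(math.log(p) - mean_log_ctp))
```

Once a level is returned, all MLS levels at higher pressure in that profile would be dropped from the PDFs, following the rule that levels under a cloudy scene have indeterminable cloudiness.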
4.2. Radiosonde Comparisons
 The radiosonde network provides many opportunities for coincident comparisons. The accuracy of the upper tropospheric H2O measurement is strongly sensor-dependent and often poor. The best currently available humidity sensors for the upper troposphere are the Vaisala RS90 (discontinued) and RS92 radiosondes. The accuracy and precision specifications for the Vaisala RS92/90 radiosonde are 0.5/0.2 K for temperature and 5/2% RHi for T > 223 K, respectively [Vaisala Oyj, 2003]. These values lead to an H2O RMS uncertainty of 15–20% for typical relative humidities between 316 and 178 hPa measured coincidentally with MLS. Comparisons between them and the cryogenic frost point hygrometer (CFH) using a chilled mirror technique [Vömel et al., 2007a] show that the nighttime RS92/90 are 5–8% drier for T = 218 K that increases to a 13% moist bias at T = 203 K [Vömel et al., 2007c]. The daytime RS92/90 RHi show 50% errors due to radiation effects [Vömel et al., 2007c]. We focus our comparisons only with the nighttime RS92/90 radiosonde here.
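The step from the radiosonde RHi and temperature specifications to an H2O uncertainty can be illustrated with a simple error propagation. H2O VMR is proportional to RHi times the saturation vapor pressure over ice, so independent relative errors add in quadrature. The sketch assumes the Clausius-Clapeyron relation with a constant latent heat of sublimation, and the example RHi of 30% and temperature of 225 K are assumed values chosen only for illustration.

```python
import math

L_SUB = 2.834e6   # J/kg, latent heat of sublimation (assumed constant)
R_V = 461.5       # J/(kg K), specific gas constant of water vapor

def h2o_relative_uncertainty(rhi_percent, sigma_rhi_percent, t_kelvin, sigma_t_kelvin):
    """Relative H2O VMR uncertainty from independent RHi and temperature errors.

    VMR ~ RHi * e_si(T) / p, so
    (sigma_VMR / VMR)^2 = (sigma_RHi / RHi)^2 + (d ln e_si/dT * sigma_T)^2.
    """
    rh_term = sigma_rhi_percent / rhi_percent
    t_term = L_SUB / (R_V * t_kelvin ** 2) * sigma_t_kelvin
    return math.sqrt(rh_term ** 2 + t_term ** 2)

# Assumed example: 30% RHi, 5% RHi error, 225 K, 0.5 K temperature error.
print(f"{100.0 * h2o_relative_uncertainty(30.0, 5.0, 225.0, 0.5):.0f}%")
```

Under these assumptions the result lies within the 15–20% range quoted above, and it is dominated by the RHi term. The relative uncertainty grows quickly for drier air because the RHi error is an absolute quantity.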
 The spatial resolution of the radiosonde is much better than MLS; therefore coincident comparisons will have differences caused by comparing a volume average to a point measurement. Unlike with AIRS where it is possible to correctly degrade its better horizontal resolution, with radiosonde comparisons the resolution difference is an additional source of random systematic error. This issue was addressed in the UARS MLS upper tropospheric H2O validation. Level flight data from the Measurement of Ozone and water vapor by Airbus In-service Aircraft (MOZAIC) [Marenco et al., 1998] showed 20–30% H2O variability along 1° (110 km) at ∼215 hPa [Read et al., 2001].
Figure 21 shows the locations of available MLS v2.2 and nighttime RS92/90 measurements within 1° of great circle (110 km) and 3 h. H2O scatterplots for 4 MLS pressures are shown at right. As with AIRS, there is no agreement under stratospheric conditions, therefore no RS92/90 data are used when MLS measures less than 10 ppmv. We also do not use any RS92/90 data when T < 203 K. The MLS tendency toward overestimation of H2O at high mixing ratios is slightly evident in the 316 hPa comparison but not at the other pressures. Figure 22 shows the PDF of the differences and their statistics as a function of pressure. MLS is on average 10% wetter than the RS92/90 at 316 hPa which is consistent with the CFH comparisons that show a small dry bias for the Vaisala. The RMS of their differences at 316 hPa is similar to AIRS. MLS agreement with the Vaisala at 261 hPa is slightly worse than with AIRS (−10% versus −5%) and the RMS of the differences is larger. The small dry bias also stands in contrast to the CFH comparison. The 215 and 178 hPa H2O comparisons between RS92/90 and MLS where the majority of the measurements are colder than 223 K show MLS being 25–35% drier. This behavior is qualitatively similar to the CFH and RS92/90 under these conditions but MLS is much drier. As with AIRS, MLS tends to measure lower concentrations in the driest conditions. The RMS of the coincident differences is large also. The increased RMS seen in the RS92/90 comparisons with MLS relative to AIRS for pressures less than 316 hPa exceeds that expected from comparing a point measurement to a volume average but due to the limited scope of the UARS MLS study (i.e., smaller distance with no vertical averaging) it cannot be completely ruled out as an explanation.
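The coincidence criterion used above (within 1° of great circle arc and 3 h) can be expressed with the haversine formula. The sketch below is illustrative only; the function names and the (latitude, longitude, time in hours) tuple layout are assumptions of this example.

```python
import math

def great_circle_deg(lat1, lon1, lat2, lon2):
    """Central angle in degrees between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2.0) ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))

def is_coincident(mls, sonde, max_deg=1.0, max_hours=3.0):
    """mls and sonde are (lat_deg, lon_deg, time_hours) tuples."""
    close_in_space = great_circle_deg(mls[0], mls[1], sonde[0], sonde[1]) <= max_deg
    close_in_time = abs(mls[2] - sonde[2]) <= max_hours
    return close_in_space and close_in_time
```

Working in degrees of arc avoids committing to a particular Earth radius; 1° corresponds to roughly 110 km at the surface.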
4.3. Aura Validation Experiment Campaigns
 We present results from the Aura Validation Experiment (AVE) campaigns. Table 5 gives the campaigns considered in this validation, and the hygrometers with their reported accuracies. Most of the hygrometers are flown on board a WB57 NASA research aircraft that is capable of reaching an altitude of 70 hPa, ideal for making measurements in the tropical tropopause layer (TTL). All the instruments except ALIAS measure H2O vapor. ALIAS measures total water which is H2O in vapor and particles. We focus on the Costa Rica AVE (CRAVE) campaign where all these instruments participated and there are 2 d of MLS v2.2 data. The AVE-1 and AVE-2 campaigns which flew some of these instruments on the WB57 show identical results.
Table 5. Aura Validation Campaigns Considered in Validation of MLS v2.2 H2O. Whenever a range is given, the higher value is for H2O < 10 ppmv. Campaign dates: 22 Oct to 18 Nov 2004; Jun to 24 Jun 2005; 9 Jan to 9 Feb 2006.
Figure 23 shows H2O measurements taken by the WB57 hygrometers on 22 January 2006. The WB57 flew directly along the MLS measurement track. The Harvard water (HW) instrument did not fly on this day, but in all other WB57 flights including CRAVE it has always shown excellent agreement with the integrated-cavity output spectroscopy (ICOS) instrument. The comparisons are typical for the flights during all the AVE campaigns. Most notably, in the lower stratosphere MLS v2.2 (and v1.5) is 20–30% drier than ICOS and ALIAS.
 The NOAA Aeronomy Laboratory water (AW) instrument shows inconsistent behavior. On the 22 January flight and during the AVE-1 campaign, the AW instrument often appears to have a hysteresis effect: its H2O measurements agree within ∼10% of MLS during level flight following takeoff (pressure < 110 hPa) on the outbound leg of the flight but are 20–40% wetter than MLS on the level flight portion after the aircraft has performed a dive and climb maneuver.
 The persistent 30% dry bias for pressures less than 100 hPa seen in MLS comparisons with well calibrated in situ hygrometers is greater than the 8–12% estimated accuracy for MLS. Understanding these differences is the main focus of this section. We consider MLS vertical and horizontal smoothing effects and radiance measurement consistency.
Figure 24 shows profile comparisons among MLS v2.2 and the WB57 hygrometers on 22 January and 7 February 2006, another day with good spatial coincidences. These profiles consist of the San Jose, Costa Rica takeoff and landing data and level flight data within 5°S of San Jose. Also shown are the 21 January and 7 February 2006 H2O profiles from CFH launched on a balloon from Heredia, Costa Rica. The 22 January 2006 CFH flight did not make it to the tropopause and was according to its file description “contaminated.” The individual measurements are binned into 24 lpd pressure levels and shown as solid circles colored by instrument type. Their % differences are shown at right. MLS is 30% drier than ALIAS, ICOS, and HW but is within 10% of CFH for pressures ≤121 hPa. The 7 February day underscores the excellent agreement between ICOS and HW instruments. Now we apply the MLS averaging kernel using equation (1) to the ICOS and CFH profiles which are shown as stars in Figure 24. In order to smooth the ICOS profile, it needed to be extrapolated upward because the averaging kernels have decaying nonzero responses to altitudes above that observed by the ICOS instrument. This was done by calculating the ratio of ICOS to CFH 24 lpd binned H2O profile between 101 and 67 hPa and applying that ratio to the CFH profile above 68 hPa and joining it to the ICOS H2O profile. Adding the MLS smoothing effect improves the agreement with CFH for pressures ≤121 hPa but does nothing to change the lower stratospheric bias with ICOS.
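The upward extrapolation of the ICOS profile described above can be sketched as follows. This is a minimal illustration, not the procedure actually coded for the paper: the function name is hypothetical, both profiles are assumed to be already binned onto a common pressure grid, and the ratio is assumed to be the mean of the level-by-level ratios in the overlap range. The subsequent smoothing with equation (1) is not included.

```python
import numpy as np

def extend_with_scaled_profile(p_hpa, icos_vmr, cfh_vmr,
                               overlap_hpa=(101.0, 67.0), join_hpa=68.0):
    """Extend an aircraft profile upward using a scaled balloon profile.

    p_hpa: common binned pressure grid; icos_vmr may be NaN where the
    aircraft did not sample. The ICOS/CFH ratio is averaged over the
    overlap range and applied to CFH at pressures below join_hpa
    (that is, at altitudes above the join level).
    """
    p = np.asarray(p_hpa, dtype=float)
    icos = np.asarray(icos_vmr, dtype=float)
    cfh = np.asarray(cfh_vmr, dtype=float)
    hi, lo = overlap_hpa
    in_overlap = (p <= hi) & (p >= lo) & np.isfinite(icos) & np.isfinite(cfh)
    ratio = np.mean(icos[in_overlap] / cfh[in_overlap])
    extended = icos.copy()
    above = p < join_hpa
    extended[above] = ratio * cfh[above]
    return extended
```

By construction the extended profile keeps the ICOS-to-CFH offset at all higher altitudes, which is why the smoothed ICOS profile retains its lower stratospheric bias relative to MLS.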
 Horizontal smoothing by MLS is an unlikely explanation for the large bias seen by MLS and most WB57 hygrometers for pressures ≤100 hPa. Tropical lower stratospheric H2O is well mixed and slowly varying. Level flight data from the WB57 at altitudes >100 hPa show slowly varying H2O that should be captured accurately by the MLS 2-D retrieval. Day-to-day measurements by the aircraft and CFH show little variation in lower stratospheric H2O. Section 2.4 shows that the atmospheric variability of tropical lower stratospheric H2O is ∼5%, well under the 20–30% difference seen with most WB57 hygrometers. It also implies that the large H2O difference seen between CFH and the WB57 hygrometers is an instrument difference because the H2O in the different air masses should agree within 5%.
 An off-line exercise was conducted to study the implications of the in situ H2O measurements on the MLS measurement system. MLS simultaneously retrieves H2O, N2O, HNO3, ClO, HCN, O3, SO2, and CH3CN from a subset of R2 radiances shown in Figure 1. The off-line exercise uses the same processing software and configuration used in production but is run in one-dimensional (1-D) mode (neglect horizontal gradients) using a single radiance scan nearest San Jose, Costa Rica (profile 10 in Figure 23) and retrieves one profile for each of the suite of molecules above. We chose the profile from 22 January instead of 7 February because the CFH agrees much better with the WB57 hygrometers for pressure >121 hPa allowing better isolation of the impact of the large difference seen for pressures <121 hPa. We will show that forcing H2O to match that of ICOS which is representative of HW and ALIAS produces poor radiance residuals and degrades the accuracy of retrieved N2O from the same radiometer.
Figure 25 shows results from these runs. A control run is done which is equivalent to the MLS v2.2 H2O retrieval except that it assumes a 1-D atmosphere. The radiance residual from the retrieved profile is shown. The retrieved H2O profile is also shown along with the CFH and ICOS profiles. The next test was to constrain the H2O to that measured by CFH between 215 and 14 hPa. Altitudes outside this range used the control run retrieved H2O. Its residual, shown below the “mls residual,” is visually identical to that of the control run. Next we constrain H2O to ICOS profiles, “WB57a” and “WB57b,” having different extrapolations above 46 hPa. Profile WB57a simply scales the CFH profile above 68 hPa as previously described. Profile WB57b is the same as profile WB57a except that above 46 hPa, H2O is made to match CFH at pressures less than 32 hPa. The residuals of their fits are shown as the “wb57a residual” and “wb57b residual.” The residuals are much worse and are inconsistent with the MLS radiance measurements. The last test was a simple attempt to correct the radiances for the instrument gain compression error to see if it could explain a 30% bias. The radiance residual shows some improvement for the unused (grey) measurements; however, its retrieved H2O is only slightly different relative to the control run and does not make H2O 30% wetter.
 MLS composition retrievals are often interrelated. For R2, the accuracy of retrieved 201 GHz N2O critically depends on the accuracy of H2O. H2O is the dominant absorber and close to N2O in the intermediate frequency spectrum (see Figure 1). Figure 26 compares the test case retrievals of N2O to the MLS standard product, R4 N2O retrieved from its 653 GHz line and ALIAS N2O measurement from the WB57. The R4 N2O is considered a superior measurement because its line is stronger and more spectrally isolated from interfering molecules. The accuracy of the R4 N2O is ∼10% [Lambert et al., 2007]. Retrieved N2O from the control, CFH H2O, and the gain compression correction experiment are similar and agree within 20% of the R4 N2O for pressures less than 50 hPa and 15% of ALIAS N2O at 68 hPa. The impact of constraining H2O to that measured by ICOS on retrieved N2O is substantial and unphysical leading to almost 0 part per billion volume (ppbv) at 68 and 46 hPa.
 MLS composition retrievals of O3, HNO3, HCl, N2O, that use the same measurement principles as that for H2O show ∼10% agreement with other correlative sources [Froidevaux et al., 2007; Livesey et al., 2007; Santee et al., 2007; Lambert et al., 2007]. It is hard to identify a mechanism that would preserve the good agreement for these molecules and H2O in the upper troposphere but cause a 30% error in the lowermost stratosphere. This study shows that the higher H2O concentrations measured by most of the WB57 hygrometers are inconsistent with MLS radiance measurements and calculations.
Figure 27 provides an overall summary of the comparisons performed in this paper. Globally MLS v2.2 H2O is 10% and 20% drier than v1.5 at 316 and 215 hPa, respectively, and within 2% for 147 and 100 hPa. The standard deviation of the scatter between the two MLS versions closely matches the retrieval algorithm systematic error standard deviation value in Table 4. GEOS-5 is more humid than MLS v2.2 at all levels considered here. The moist bias for 261–147 hPa exceeds 30%. Specific instrument comparisons are discussed in turn.
 Comparisons between AIRS and MLS equatorward of 60° show agreement within each instrument's expected accuracy provided that in addition to the recommended quality control, no MLS H2O measurements less than 20 ppmv are compared. The RMS of the coincident H2O measurement differences is much less than the RSS of the MLS and AIRS estimates of their individual RMS uncertainties. A more detailed look at their coincident comparisons shows that the agreement is best for H2O concentrations between 100 and 500 ppmv. MLS progressively measures higher concentrations than AIRS when H2O > 500 ppmv. MLS is drier than AIRS when the concentrations are between 30 and 100 ppmv. The good overall statistics (−6% bias, 26% RMS) seen for 178 hPa may be fortuitous because the majority of measurements are between 20 and 40 ppmv where MLS-AIRS average difference is small but the correlation is poor. The percent difference density plot (Figure 13) clearly shows for 178 hPa H2O > 30 ppmv, MLS becomes progressively drier. Poleward of 60°, MLS is typically ∼30% drier than AIRS for H2O less than 100 ppmv. H2O measurements made in the presence of clouds detected by MLS that are successfully cleared by the AIRS cloud clearing algorithm show ∼10% agreement on average with large RMS differences of 50–100%.
 Dynamical arguments and comparisons with CFH and AVE hygrometers support the claim that for mixing ratios less than ∼20 ppmv MLS is more accurate than AIRS. Unknown is which technique is more accurate for H2O between 20 and 100 ppmv particularly at 178 hPa.
 The Vaisala RS90 and RS92 nighttime radiosonde comparisons with MLS v2.2 show ∼10% biases at 316 and 261 hPa. The RMS of the differences between MLS and RS92/90 is the same as with AIRS at 316 hPa and considerably larger than with AIRS at 261 hPa. That the RMS between MLS and either AIRS or RS92/90 at 316 hPa is nearly the same probably reflects the MLS measurement itself rather than horizontal averaging issues. At 261 hPa, horizontal averaging probably plays a role in the larger RMS difference we see between MLS and RS92/90 versus AIRS. This is expected because the radiosonde horizontal resolution is much better than that of either AIRS or MLS and we are unable to make the smoothing correction that was applied to the AIRS data. The comparison at 215 and 178 hPa is much poorer, showing MLS being 35% drier with an RMS of 70%. The scatterplots show some correlation at 215 hPa and almost none at 178 hPa. Both the bias and the RMS for the 215 and 178 hPa levels exceed the combined uncertainty estimates from both instruments. The accuracy of the RHi measurement from the Vaisala RS92/90 radiosonde is only specified for T > 227 K; however, additional comparisons between them and the CFH at temperatures less than 223 K show that the nighttime RS92/90 is ∼10% wetter than the CFH measurement. The large dry bias between MLS and RS92/90 is therefore only partially explained by having the majority of the 215 and 178 hPa measurements colder than 223 K. The poor agreement with the RS92/90 radiosonde at 215 hPa is in contrast with the AIRS comparisons, which show excellent agreement. As with AIRS, there is uncertainty regarding the accuracy of the radiosonde measurements at 178 hPa.
 The results for MLS comparisons with the CFH hygrometer in Table 6 are from the validation study by Vömel et al. [2007b]. Comparisons with CFH at multiple locations over the globe show dry biases, the worst being −27% and −24% at 215 and 178 hPa, respectively. Agreement for P ≤ 147 hPa is better than 10%.
Table 6 (surviving comment entries): (1) MLS overestimates H2O for vmr > 500 ppmv; occasionally erroneous low values (<1 ppmv) and high-value fliers are retrieved in the tropics. (2) Unsuitable for scientific use.
 Some of the AVE campaigns extensively focused on validating the accuracy of Aura instruments in the UTLS region. Unfortunately, the H2O measurements do not confidently provide confirmation that MLS is measuring UTLS H2O within its estimated accuracy because of large differences among the in situ instruments themselves. The statistics for HW and ICOS in Figure 27 include results from AVE-2 and CRAVE campaigns for which we have v2.2 data. Three WB57 hygrometers (HW, ICOS, and ALIAS) are 20–30% wetter than MLS between 121 and 68 hPa. CFH shows better than 10% agreement with MLS.
 The large disagreement between HW (Lyman α) and balloon frost point hygrometers (multiple instruments using the chilled mirror principle) is a historical problem [Kley et al., 2000]. The HW and ICOS rely heavily on laboratory calibrations using their instruments to measure a known concentration of H2O. The CRAVE ALIAS total H2O data is from a direct absorption measurement that only relies on HITRAN line parameters [Rothman et al., 1998] coupled to the instrument's well known absorption cell path length. The frost point method is fundamentally a temperature measurement coupled to the H2O over ice saturation function [List, 1951]. Historically and currently, satellites and remote sensors on balloons agree within 10% for lower stratospheric H2O [Kley et al., 2000; Pumphrey, 1999; Lambert et al., 2007].
 We explored several avenues to try to understand the large bias with the WB57 hygrometers. Vertical and horizontal smoothing effects were ruled out. An MLS gain compression effect that introduces some error in the calibrated radiances is shown to be inconsequential. Finally, tests were done using a representative H2O profile measured by CFH and ICOS in the MLS retrieval system to see how well it fits the radiances and its effect on the retrievals of interfering species like N2O. The results show that radiance residuals using CFH H2O are as good as MLS H2O but the ICOS are much worse. Forcing the MLS retrieval to use the CFH H2O had a minor impact on retrieved N2O but the wetter ICOS H2O caused the retrieved N2O to have an unphysical retrieval near 0 ppbv at 68 and 46 hPa. The higher H2O measured in the lower stratosphere by the WB57 hygrometers are inconsistent with MLS radiance measurements and would seriously degrade the accuracy of retrieved N2O as well.
 The tendency for MLS to overestimate H2O for concentrations greater than 500 ppmv or RHi greater than 40% for pressures greater than 178 hPa is consistent with a radiance scaling error in the MLS measurement system. The distinguishing feature is the persistent 40% moist bias in the MLS middle tropospheric RHi retrieval when compared to AIRS. The MLS middle tropospheric RHi retrieval is used to constrain H2O for pressures greater than 316 hPa. Both of these biases are consistent with a 1.3% radiance error in the MLS measurement system. For H2O less than 100 ppmv, a ∼1% radiance scaling error causes only a ∼1–2% H2O error which is insignificant compared to the total estimated accuracy. This scaling error will be fixed in the next MLS version.
Table 6 gives our estimate of the precision, accuracy, and spatial resolution of MLS v2.2 H2O. The precision for H2O between 316 and 215 hPa is from the variability of coincident differences seen between MLS and AIRS, which is lower than the scatter derived from the systematic error analysis. The RMS of comparisons with the RS92/90 is larger than our estimate but includes additional uncertainties associated with spatial averaging effects and low-temperature performance problems with the Vaisala radiosondes. Precisions from 178 to 83 hPa reflect the variability of coincident differences between MLS and the in situ CFH, HW, and ICOS hygrometers. These comparisons also show less variability than that from the systematic error analysis except at 83 hPa.
 Minimum H2O in Table 6 is the concentration where MLS is unreliable. MLS uses a logarithmic basis for H2O whose minimum allowed concentration is 0.1 ppmv. In addition, some pressures ≥178 hPa exhibit an asymptotic tendency toward a few ppmv for H2O ≈ 1 ppmv.
 Accuracy for H2O between 316 and 215 hPa mirrors AIRS for this pressure range. The MLS-AIRS comparisons show MLS biased by ∼−5% (dry), but the respective accuracies for AIRS (15–25%) and MLS (30–60%, Table 4) are larger. The MLS systematic error analysis could be an overestimate because it represents the total accumulation of several maximum possible errors. Therefore we choose the smaller AIRS figures for accuracy. Comparisons with the Vaisala RS92/90, given its better accuracy (∼10%, T > 227 K), show that the accuracy of the MLS H2O measurement may be closer to 10% at 316 and 261 hPa. We are choosing the more conservative AIRS values because of the limited number and limited global sampling of the radiosonde comparisons.
 Accuracies for 178–83 hPa are from the systematic error analysis. The comparisons done at 178 hPa show mixed results: differences from AIRS, HW, and ICOS show MLS dry biases smaller than the MLS accuracy estimate, whereas comparisons with CFH and RS92/90 show a larger ∼25% dry bias. At pressures less than 178 hPa the suite of in situ hygrometers considered for this validation shows disagreements of 20–30% among themselves, which are well outside the MLS estimated accuracy. Since the agreement between MLS and Vaisala RS92/90 or AIRS at the higher pressure tropospheric levels is better than that from our systematic error analysis, one might expect the actual accuracy for 178–83 hPa H2O to be at least as good as that in Table 6. The results from a validation study between MLS and CFH [Vömel et al., 2007b] indicate better accuracy than that given in Table 6 for pressures ≤ 147 hPa.
 The estimated vertical and along track horizontal resolutions are from the FWHM of the averaging kernel. The horizontal resolution perpendicular to the measurement track is the FWHM of the MLS azimuth antenna pattern. MLS only makes measurements along the orbit track which are separated by 10–20° in longitude at middle to low latitudes with finer sampling at high latitudes. Therefore there are significant regions of the globe that are not observed daily by MLS.
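As an illustration of how a resolution figure can be read off an averaging kernel, the following minimal sketch computes the FWHM of a sampled, single-peaked curve by linear interpolation of the half-maximum crossings. The function name `fwhm` and the Gaussian test kernel are hypothetical and are not taken from the MLS processing; real averaging kernel rows can have negative side lobes that this sketch does not treat specially.

```python
import math

def fwhm(x, y):
    """Full width at half maximum of a single-peaked sampled curve.

    x: strictly increasing coordinates (e.g., height in km)
    y: kernel values at those coordinates
    Half-maximum crossings are located by linear interpolation.
    """
    peak = max(range(len(y)), key=lambda i: y[i])
    half = y[peak] / 2.0

    def crossing(i_in, i_out):
        # Interpolate between a point at or above half max (i_in) and one below (i_out).
        frac = (y[i_in] - half) / (y[i_in] - y[i_out])
        return x[i_in] + frac * (x[i_out] - x[i_in])

    # Walk left from the peak until the next point drops below half maximum.
    i = peak
    while i > 0 and y[i - 1] >= half:
        i -= 1
    if i == 0:
        raise ValueError("left half-maximum point lies outside the grid")
    left = crossing(i, i - 1)

    # Walk right from the peak in the same way.
    j = peak
    while j < len(y) - 1 and y[j + 1] >= half:
        j += 1
    if j == len(y) - 1:
        raise ValueError("right half-maximum point lies outside the grid")
    right = crossing(j, j + 1)
    return right - left

# Hypothetical kernel: a Gaussian with sigma = 1 km centered at 10 km.
heights_km = [0.1 * k for k in range(201)]
kernel = [math.exp(-0.5 * (z - 10.0) ** 2) for z in heights_km]
width_km = fwhm(heights_km, kernel)  # close to 2.355 km for a unit-sigma Gaussian
```

The same routine would apply to an antenna pattern sampled in angle, provided the pattern is single peaked over the sampled range.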
Table 7 gives estimates for precision, accuracies, and spatial resolution for the MLS RHi. The values in Table 7 are derived by combining H2O values from Table 6 with estimates of the temperature accuracy and precision [Schwartz et al., 2007]. The spatial resolution is the maximum for temperature and H2O.
 One caveat to remember is that MLS RHi between 316 and 215 hPa is not useful for supersaturation studies because of the likelihood of a radiance scaling error that causes the highest mixing ratios to be overestimated. Another consideration that affects RHi at all pressures is the contribution of temperature noise to the MLS RHi PDF. A method for removing temperature noise is to recompute RHi from MLS H2O using a very accurate temperature measurement. This would only be recommended for P ≤ 178 hPa. The 316 hPa level sometimes retrieves excessively dry and wet values in the tropics that have good quality and status flags, which may be caused by the poor vertical resolution of the 383 hPa H2O derived from the middle tropospheric RHi retrieval. Retrievals in cloudy conditions add ∼ more variability to H2O comparisons between 316 and 147 hPa. Last, more comparisons with reliable sensors are needed between 178 and 147 hPa, and resolving the differences among the in situ instruments for H2O mixing ratios < 5 ppmv is critically important.
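The recomputation of RHi from H2O and an independent temperature can be sketched as follows. This is a minimal example, not the MLS algorithm: it assumes the saturation vapor pressure over ice from the Murphy and Koop (2005) fit (valid above 110 K), which may differ from the saturation function used in the MLS processing, and the function names are hypothetical.

```python
import math

def saturation_pressure_ice_pa(temperature_k):
    """Saturation vapor pressure over ice in Pa (Murphy and Koop, 2005 fit; T > 110 K)."""
    t = temperature_k
    return math.exp(9.550426 - 5723.265 / t + 3.53068 * math.log(t) - 0.00728332 * t)

def rhi_percent(h2o_ppmv, pressure_hpa, temperature_k):
    """Relative humidity with respect to ice, in percent.

    h2o_ppmv: water vapor volume mixing ratio
    pressure_hpa: pressure of the retrieval level
    temperature_k: temperature at that level (assumed to come from an accurate source)
    """
    partial_pressure_pa = h2o_ppmv * 1e-6 * pressure_hpa * 100.0
    return 100.0 * partial_pressure_pa / saturation_pressure_ice_pa(temperature_k)

# Hypothetical example: 10 ppmv at 147 hPa, evaluated at two temperatures 1 K apart.
rhi_cold = rhi_percent(10.0, 147.0, 200.0)
rhi_warm = rhi_percent(10.0, 147.0, 201.0)
```

With this fit the saturation pressure changes by roughly 15% per kelvin near 200 K, which shows why temperature noise contributes so strongly to the RHi distribution.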
Appendix A: Systematic Error Analysis
 Two approaches have been used to quantify the impact of systematic errors on MLS measurements. The first, more accurate, approach quantifies the errors in an “end-to-end” manner. A set of simulated MLS radiances has been generated from a model atmosphere. Various sets of perturbations simulating systematic error sources have been added to these radiances, and the perturbed radiances were run through the MLS retrieval system (typically for a whole day, ∼3500 profiles). The differences in the retrieved geophysical products between the perturbed runs and an unperturbed “control” run are measures of the impact of the given systematic error source. The difference between the unperturbed “control” run and the original model atmosphere (truth) is a measure of the errors due simply to “retrieval numerics” (comparing the results of each perturbation against truth would double count this error). In order to simplify interpretation, radiometric noise (scatter) was not applied to the simulated radiances, although the retrieval's optimization strategy did assume the same noise levels as seen in the real instrument.
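The bookkeeping of the end-to-end approach can be sketched as below for a single pressure level. This is only an illustration under assumed inputs: the function name, the dictionary keys, and the choice of mean and standard deviation as summary statistics are hypothetical and not taken from the MLS analysis software.

```python
from statistics import mean, pstdev

def systematic_error_impact(perturbed, control, truth):
    """Summarize one end-to-end test at a single pressure level.

    Each argument is a list of mixing ratios, one per profile, in matching order:
    perturbed: retrievals from radiances with the error source applied
    control:   retrievals from the unperturbed radiances
    truth:     the model atmosphere used to simulate the radiances
    """
    perturbed_minus_control = [p - c for p, c in zip(perturbed, control)]
    control_minus_truth = [c - t for c, t in zip(control, truth)]
    return {
        # Mean shift attributable to the error source.
        "bias": mean(perturbed_minus_control),
        # Profile-to-profile variability introduced by the error source.
        "scatter": pstdev(perturbed_minus_control),
        # Retrieval-numerics error, evaluated once from the control run only.
        "numerics_bias": mean(control_minus_truth),
    }

# Hypothetical three-profile example (ppmv).
result = systematic_error_impact([4.4, 5.5, 6.6], [4.0, 5.0, 6.0], [4.1, 5.0, 5.9])
```

Differencing each perturbed run against the control, rather than against truth, keeps the retrieval-numerics term from being counted once per test.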
 The second quantification approach is used for error sources that are likely to be small and/or hard to quantify in the end-to-end system. This uses a simple analytical expression for the MLS constituent error due to an expected perturbation in radiance, as described in section S2 in the auxiliary material. This approach is most appropriate for measurements in optically thin situations where linearity can be safely assumed.
Table A1 lists these error sources and their magnitudes. It was not practical to perform individual tests of each of the large number of systematic errors on each molecule retrieved in MLS v2.2. Instead, computations are grouped in a manner that provides the essential information with a minimum number of retrieval runs. A complexity of this approach arises from the strong interdependence among various parameters in the MLS system. For example, an error in the O2 line width will lead to inaccuracies in the MLS tangent pressure retrieval. This “pointing error” will in turn impact retrievals of gas mixing ratios, often in a complicated manner with errors introduced in strong species (e.g., O3) having their own knock-on impacts on weaker species (e.g., CO).
Table A1. Sources and Magnitudes of Systematic Uncertainties for the Aura MLS
Comments and Caveats
Varies by band. The percent change in the lower sideband response is 1.7% for band 15/18, 3.6% for band 16/19, and 3.1% for band 17/20.
Approximately 2% of difference between the channel and radiometer radiance.
Plus an additional error of 2 × estimated Ruze scattering.
propagation of contaminant species line width errors
simulation data exercise
 The remainder of this appendix describes these systematic errors and the basis for their quantification. Each section title indicates whether the end-to-end or the “simplified” approach is used.
A1. Spectral and Radiometric Calibration
A1.1. Sideband Fraction: End-to-End
 All the radiometers except R1A/B are double sideband, simultaneously observing and combining two radio frequency signals (above and below the local oscillator frequency). The receivers' relative responses to the two signals (“sideband fractions”) were measured prior to launch [Jarnot et al., 2006]. Sensitivity tests have been performed that perturb the sideband fractions by the amounts given in Table A1, consistent with the uncertainty in the prelaunch calibration. An error in the sideband fraction for R3 will affect all species through its impact on the tangent pressure retrieval (from the 240 GHz 18OO line). In order to decouple these effects, three separate runs have been performed. In the first run, only radiometers R2, R4 and R5 were perturbed; the second run perturbed only R3; and the third perturbed all of R2–R5 (R1A and R1B, being single sideband, are not relevant here). The first case isolates the direct impact of these errors on composition for molecules retrieved in R2, R4 and R5, while the second considers the impact of the R3 sideband fraction on non-R3 molecules through its effect on the tangent pressure (“pointing”) retrieval. The third test combines the first two.
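To make the role of the sideband fraction concrete, the sketch below combines two sideband radiances and evaluates the radiance change caused by scaling the lower sideband fraction. It is a simplified illustration: it assumes the two fractions are normalized to sum to one and that the Table A1 percentages act as relative changes in the lower sideband fraction, neither of which is stated in the text, and the function names are hypothetical.

```python
def double_sideband_radiance(lower_k, upper_k, lower_fraction):
    """Combine two sideband radiances (K); the fractions are assumed to sum to 1."""
    return lower_fraction * lower_k + (1.0 - lower_fraction) * upper_k

def sideband_perturbation_effect(lower_k, upper_k, lower_fraction, percent_change):
    """Radiance change (K) when the lower sideband fraction is scaled by percent_change."""
    perturbed_fraction = lower_fraction * (1.0 + percent_change / 100.0)
    return (double_sideband_radiance(lower_k, upper_k, perturbed_fraction)
            - double_sideband_radiance(lower_k, upper_k, lower_fraction))

# Hypothetical case: a 100 K signal in the lower sideband against a 20 K upper sideband.
delta_k = sideband_perturbation_effect(100.0, 20.0, 0.5, 1.7)
```

Under these assumptions the radiance error scales with the contrast between the two sidebands, so it vanishes when both sidebands see the same radiance.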
A1.2. Filter Position: Simplified
 The spectral response of each MLS channel was measured before launch [Jarnot et al., 2006]. The critical characteristics of the MLS filter shapes are the effective filter position and width (the effects of more subtle changes in filter shape are expected to be insignificant compared to these). This study bases the uncertainties in effective position and width on the instrument design specifications. It is anticipated, and the data received thus far confirm, that the instrument performance is exceeding the design requirement.
A1.3. Spectrometer Nonlinearity: Simplified
 The finite digitization resolution of the individual MLS signals introduces a small nonlinearity into the measurement system. The 0.02 K radiance error used in the quantification represents an absolute worst case.
A1.4. Power Supply Interaction: Simplified
 Large variations in the total signal level in one MLS spectrometer can lead to artifacts in the signals measured by other spectrometers sharing the same power supply. MLS data taken postlaunch with spectrometers in a diagnostic mode indicate a 0.05 K upper limit for this effect.
A1.5. Hot Reference Standing Waves: End-to-End
 Standing waves are a consequence of multiple reflections within the MLS optics [Jarnot et al., 2006]. The MLS radiometric calibration, involving views to hot and cold targets through a switching mirror, is performed through a different optical path from the limb views. All views will have different characteristic standing waves. Calibrated radiances accordingly exhibit spectral standing wave artifacts. Standing wave artifacts associated with cold reference views have been characterized from extended observations of cold space from high-altitude limb views through the main MLS antenna. These are removed in the MLS processing as described by Jarnot et al. Standing waves associated with the hot reference are harder to characterize and are not corrected for in the processing. These lead to spectral artifacts in the MLS gain calibration. MLS views of the Moon, representative of a hot reference viewed through the main antenna, have been used to provide an estimate of these spectral artifacts (peak magnitudes shown in Table A1). The THz radiometer views the limb and its hot and cold references through the same optical path; therefore the effects of standing waves associated with the THz optics are the same for all views and are removed in radiometric calibration [Pickett, 2006].
A1.6. Gain Compression: End-to-End
 Nonlinearities in the broadband radiometer amplifier responses cause an overestimation of calibrated radiances for narrow spectral features. Laboratory measurements using an MLS receiver and a simple model are used to generate spectrally distorted data used in the error analysis. The magnitude of this effect is approximately proportional to the difference between the radiance on an individual channel of interest and the spectrally averaged radiance seen by the radiometer. This instrumental artifact affects the entire instrument and its effect is studied with a single end-to-end test.
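The stated proportionality can be written as a simple per-channel estimate. In the sketch below the default factor of 0.02 is taken from the Table A1 comment “approximately 2% of difference between the channel and radiometer radiance”, on the assumption that this comment refers to gain compression; an unweighted channel mean stands in for the spectrally averaged radiance seen by the radiometer, and the function name is hypothetical.

```python
def gain_compression_error(channel_radiances_k, fraction=0.02):
    """Per-channel radiance error (K), proportional to the departure from the band mean.

    channel_radiances_k: calibrated radiances for the channels of one radiometer
    fraction: assumed proportionality constant (0.02, i.e. about 2%)
    """
    band_mean = sum(channel_radiances_k) / len(channel_radiances_k)
    return [fraction * (radiance - band_mean) for radiance in channel_radiances_k]

# Hypothetical spectrum: a narrow 200 K feature on a flat 100 K background.
errors_k = gain_compression_error([100.0, 100.0, 100.0, 200.0])
```

In this toy case the narrow feature is overestimated while the background channels are slightly underestimated, which is the sense of the distortion described above.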
A1.7. Radiometric Calibration: Simplified
 This accounts for errors associated with level 1 data processing, such as uncertainties in the temperatures of the hot and cold references, the calculation of gain, and other effects described by Jarnot et al.
A2. Field-of-View Calibration
A2.1. Antenna Pattern Shape: End-to-End
 The GHz and THz antenna patterns were measured prior to launch [Cofield and Stek, 2006; Pickett, 2006], and this assessment uses a perturbation representing 2-σ of the prelaunch shape uncertainty. The R4 patterns had the estimated Ruze scattering doubled [Ruze, 1966]. Five separate end-to-end tests were run, four testing the impact on pointing and temperature by perturbing the R1A, R1B and R3 patterns in different configurations and a fifth perturbing only the R2, R4 and R5 patterns. For some molecules, the impact of the first four tests is categorized as a pointing error.
A2.2. Antenna Pattern Scan Dependence: Simplified
 The GHz antenna patterns change shape slightly depending on the scan angle of the instrument [Cofield and Stek, 2006]. The magnitude of this error is estimated by convolving a representative radiance profile with the scan extrema antenna patterns. The THz antenna has no such dependence [Pickett, 2006].
A2.3. Antenna Transmission: Simplified
 A small portion of the atmospheric radiance is absorbed and scattered by the GHz antenna system. These losses were measured prior to launch, and the systematic error assessment is based on 2-σ of their uncertainty [Cofield and Stek, 2006]. The THz radiometer views the limb and its reference targets through its main antenna; therefore its transmission properties are removed in radiometric calibration [Pickett, 2006].
A2.4. Scan Jitter: End-to-End
 Postlaunch analysis of the GHz scan encoder data revealed that the GHz antenna exhibits a 20 arc sec peak-to-peak departure (jitter) from the desired uniform linear motion. This jitter integrates down to a ±4 arc sec pointing error for each 1/6 s integration [Read et al., 2006]. These effects mimic a small broadening of the antenna pattern.
 In addition, the effect of the continuous scan motion during the 1/6 s integration is ignored by the forward model, which simply assumes a “step and stare” scan. The additional complexity of accounting for the continuous scan motion is futile because the scan jitter, which cannot be modeled, dominates the scan pointing error [Read et al., 2006]. The end-to-end quantification of this error was based on a random jitter model equivalent to 20 arc sec peak-to-peak motion.
A3. Spectroscopy and Forward Model
A3.1. Continuum: End-to-End
 The continuum refers to background atmospheric absorption from O2, N2, and H2O. These have been measured in the laboratory from 170 to 260 GHz [Meshkov, 2006; Meshkov and De Lucia, 2007]. Water vapor absorption at 2.5 THz was measured by H. M. Pickett (unpublished result, 2004). The H2O continuum for 620–660 GHz is from Pardo et al. The combined N2 and O2 continuum for 620–660 GHz is derived from in-orbit data. We assume 10% accuracy for these measurements [Meshkov and De Lucia, 2007; H. M. Pickett, unpublished result, 2004]. Two end-to-end tests were conducted, one for the dry air continuum and another for the H2O continuum.
A3.2. Line Strength: Simplified
 The line strength error is twice the dipole moment measurement error for the molecule. These vary by molecule and are given in Table A2.
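The factor of two is what one would expect if the line strength S scales with the square of the dipole moment μ, as is standard for electric dipole transitions. Under that assumption (the symbols S and μ are introduced here for illustration), a small fractional error in μ propagates as:

```latex
S \propto \mu^{2}
\quad\Longrightarrow\quad
\frac{\delta S}{S} \approx 2\,\frac{\delta \mu}{\mu}
```

For example, a 1% uncertainty in the measured dipole moment would correspond to roughly a 2% uncertainty in line strength.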
Table A2. Strength and Line Width Uncertainties for Aura MLS Molecules
A3.3. Line Width: End-to-End
 The line width errors for the molecules and lines targeted by MLS are presented in Table A2. These errors are the root sum square of a systematic component and a random component rounded to the nearest percent and apply over the full temperature range in the atmosphere. The systematic component reflects the typical interlab agreement in measured line widths. While most molecules have not been measured at more than one laboratory, for the few cases where multiple laboratory measurements exist we find that agreement is ∼3%. The random component is 2σ of the measurement noise propagated into line width. These errors are assessed through ten end-to-end tests which perturbed molecule line widths individually or in groups. Strong signal molecules such as OH, O2, O3, and H2O, are perturbed both individually and together while intermediate strength molecules N2O, HCl, ClO, HNO3, and CO, and weak signal molecules BrO, HO2, HOCl, HCN, and SO2, were studied in separate tests.
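The combination rule described above (root sum square of the systematic and random components, rounded to the nearest percent) can be sketched as follows. The function name and the example inputs are hypothetical; only the ∼3% systematic figure comes from the text.

```python
import math

def line_width_uncertainty_percent(systematic_percent, random_2sigma_percent):
    """Root sum square of the two components, rounded to the nearest percent."""
    combined = math.hypot(systematic_percent, random_2sigma_percent)
    return round(combined)

# A 3% interlaboratory (systematic) term combined with a hypothetical 4% random term.
total_percent = line_width_uncertainty_percent(3.0, 4.0)  # sqrt(9 + 16) = 5
```

Because the terms add in quadrature, a random component well below 3% has little effect on the rounded total.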
A3.4. Numerics: Simplified
 This error concerns the accuracy of the radiative transfer quadratures and the handling of instrument calibration data. Comparisons with other forward models show that the calculations agree to within 1% [Read et al., 2006].
A4. Pointing
 Field-of-view pointing knowledge is vital for accurate measurements of constituent profiles. In this assessment pointing errors arise from the propagation of instrument and spectroscopic errors described below into the tangent pressure retrieval.
A4.1. O2 Line Width: End-to-End
 The 3% uncertainty in O2 line width approximately corresponds to a 100 m pointing error.
A4.2. Field-of-View Direction Offsets: End-to-End
 The MLS field-of-view directions for the different radiometers were measured before launch, and their accuracies and uncertainties were verified in flight by scanning the field of view across the Moon [Cofield and Stek, 2006]. The 2-σ uncertainty is 0.002° or 100 m at the limb. This effect is quantified in two end-to-end tests, one perturbing radiometers R1A and R1B FOV directions and the other perturbing the R2, R3 and R4 FOV directions.
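The correspondence between 0.002° and about 100 m at the limb can be checked with simple geometry. The sketch below is an approximation under stated assumptions: a spherical Earth of radius 6371 km, a nominal Aura orbit altitude of 705 km (not given in the text), a 20 km tangent height chosen for illustration, and the small-angle approximation; the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0    # mean Earth radius (spherical approximation)
ORBIT_ALTITUDE_KM = 705.0   # nominal Aura altitude, an assumption not stated in the text

def limb_distance_km(tangent_height_km):
    """Straight-line distance from the satellite to the limb tangent point."""
    r_sat = EARTH_RADIUS_KM + ORBIT_ALTITUDE_KM
    r_tan = EARTH_RADIUS_KM + tangent_height_km
    return math.sqrt(r_sat ** 2 - r_tan ** 2)

def pointing_offset_m(angle_deg, tangent_height_km=20.0):
    """Vertical displacement at the tangent point for a small pointing angle error."""
    return math.radians(angle_deg) * limb_distance_km(tangent_height_km) * 1000.0

offset_m = pointing_offset_m(0.002)                  # a little over 100 m
jitter_m = pointing_offset_m(4.0 / 3600.0)           # ±4 arc sec per integration
```

The same conversion applies to the scan jitter figures in A2.4 once the angles are expressed in degrees.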
A4.3. R3 Sideband Fraction: End-to-End
 The intensity of the 18OO line used to determine tangent pressure in the troposphere and lower stratosphere depends on the R3 sideband fraction and is characterized as described earlier. When considered as a pointing error contribution, this error is only applicable to molecules retrieved in radiometers R2, R4, and R5.
A4.4. R1/R3 Field-of-View Pattern: End-to-End
 An error in the respective fields of view for R1 and band 8 also adds error to the tangent pressure measurement, as described in A2.1 and A2.2. Four end-to-end tests perturbed the FOV shapes for R1 and R3 in different combinations. The results of the R3 test as a pointing error are only applicable to molecules retrieved in radiometers R2, R4, and R5.
A5. Temperature: Simplified
 An error in retrieved temperature will impact retrievals of constituent profiles. As with pointing, these errors will have their fundamental origins in an instrument or forward model error, and thus be captured in the quantification of these error sources. Here we consider the additional possibility of a temperature error arising from an unknown source. We use 1% uncertainty based on the validated accuracy of the MLS temperature measurement.
A6. Retrieval Related Errors
A6.1. Algorithm: End-to-End
 The impacts of the retrieval algorithm have been quantified simply by comparing the results of an unperturbed end-to-end retrieval to the known model atmosphere from which the radiances were generated (truth). As described above, in order to avoid accounting for this effect in all the other error quantifications, all other end-to-end quantifications have been compared to this unperturbed run.
A6.2. A Priori: End-to-End
 The retrieval process uses a priori information for the atmospheric state [Livesey et al., 2006]. To test the sensitivity of the retrieved atmospheric composition to the a priori information, the a priori profiles were perturbed by the amounts given in Table A3. Three end-to-end tests were performed perturbing the temperature, H2O and O3 a priori profiles individually, while a fourth perturbed the a priori profiles for the remaining species together.
Table A3. A Priori Perturbations
a priori + 3 K
a priori × 1.5
P ≥ 100 hPa, a priori vmr ≤ 0.5 ppmv: a priori + 0.2 ppmv; a priori vmr > 0.5 ppmv: a priori × 1.5
100 > P ≥ 1 hPa: a priori × 1.5
P < 1 hPa: a priori + 2 ppmv
P > 10 hPa: a priori = 0
P ≤ 10 hPa: larger of a priori × 2 or a priori + 0.5 ppmv
a priori × 1.5
change a priori from 0 to 0.5 ppbv
larger of a priori × 1.5 or a priori + 0.04 ppbv
P ≥ 100 hPa: larger of a priori × 2 or a priori + 0.5 ppmv
P < 100 hPa: a priori × 2
a priori × 1.5
change a priori from 0 to 10 ppbv
a priori × 0.5
change a priori from 0 to 10 pptv
a priori × 1.5
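Two of the rule types listed in Table A3 can be sketched in code. The species labels of the table did not survive in this text, so the sketch below is deliberately generic: the function names are hypothetical, and the piecewise rule assumes that the three consecutive pressure-range entries belong to one species and that the middle range is 100 > P ≥ 1 hPa.

```python
def perturb_larger_of(apriori, factor, offset):
    """Return the larger of (a priori × factor) and (a priori + offset)."""
    return max(apriori * factor, apriori + offset)

def perturb_piecewise(apriori_ppmv, pressure_hpa):
    """Pressure-dependent rule transcribed from three consecutive Table A3 entries."""
    if pressure_hpa >= 100.0:
        # Small mixing ratios get an additive perturbation, larger ones a scaling.
        if apriori_ppmv <= 0.5:
            return apriori_ppmv + 0.2
        return apriori_ppmv * 1.5
    if pressure_hpa >= 1.0:
        return apriori_ppmv * 1.5
    return apriori_ppmv + 2.0

# "Larger of a priori × 2 or a priori + 0.5 ppmv" for small and large a priori values.
small = perturb_larger_of(0.1, 2.0, 0.5)   # the additive term wins
large = perturb_larger_of(1.0, 2.0, 0.5)   # the multiplicative term wins
```

The "larger of" form ensures that very small a priori values still receive a meaningful perturbation.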
A7. Contaminant Species
 An error in the retrieval of a given species can cause errors in the retrieval of other species. As with pointing and temperature these errors originate from an error in the forward model or instrument characterization, and are captured by their respective quantifications. Here we consider only those errors caused by line width errors in the contaminating molecule.
A8. Clouds: End-to-End
 Thick cirrus associated with convection introduce additional absorption and radiation scattering that are not included in the operational v2.2 forward model. The MLS processing identifies and ignores radiances strongly affected by clouds, and the retrieval of a spectrally flat extinction parameter helps in the fitting of any remaining radiances that are moderately affected. In order to quantify the possible errors introduced by such a strategy, an end-to-end test was performed on a set of radiances that included the effects of thick cirrus clouds based on an off-line scattering model.
Acknowledgments
 We are very grateful to the MLS instrument and data/computer operations and development teams (at JPL and from Raytheon, Pasadena) for their support through all phases of the MLS project, in particular D. Flower, G. Lau, J. Holden, R. Lay, M. Loo, G. Melgar, D. Miller, B. Mills, M. Echevarri, E. Greene, A. Hanzel, A. Mousessian, S. Neely, and C. Vuu. We greatly appreciate the efforts of Bojan Bojkov and the Aura Validation Data Center (AVDC) team, whose work facilitated the MLS validation activities. Thanks to the Aura Project for their support throughout the years (before and after Aura launch), in particular, M. Schoeberl, A. Douglass (also as cochair of the Aura validation working group), E. Hilsenrath, and J. Joiner. We also acknowledge the support from NASA Headquarters, P. DeCola for MLS and Aura, and M. Kurylo, J. Gleason, B. Doddridge, and H. Maring, especially in relation to the Aura validation activities and campaign planning efforts. The aircraft campaigns themselves involved tireless hours from various coordinators, including D. Fahey, E. Jensen, P. Newman, M. Schoeberl, L. Pfister, R. Friedl, K. Thompson, and others involved with campaign flight management and support. The research described here, done at the Jet Propulsion Laboratory, California Institute of Technology, was performed under contract with the National Aeronautics and Space Administration. We thank three anonymous referees for their thorough review and comments on this manuscript.