[6] The theory of retrieval of atmospheric state parameters from remote measurements is well understood [see, e.g., Rodgers, 2000, and references therein] and leads, after linearization of the radiative transfer problem and after writing the solution into the context of Newtonian iteration in order to take nonlinearity into account, to the following estimators of state parameters:
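Written out with the symbols defined in the following paragraph, and assuming the standard regularized Newton-type iteration with Levenberg-Marquardt damping [Rodgers, 2000], the estimator referred to later as equation (1) takes the form:

```latex
x_{i+1} = x_i + \left( K^T S_y^{-1} K + R + \lambda I \right)^{-1}
          \left[ K^T S_y^{-1} \left( y_{meas} - y(x_i) \right) - R \left( x_i - x_a \right) \right]
\tag{1}
```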
Here K is the m_{max} × n_{max} Jacobian, containing the partial derivatives of all m_{max} simulated measurements y under consideration with respect to all unknown parameters x, superscript T denotes a transposed matrix, x is the n_{max}-dimensional vector of unknown parameters, and x_{a} is the related a priori information. The term y_{meas} is the m_{max}-dimensional vector of measurements under consideration, and y(x_{i}) is the forward modeled spectrum using parameters x_{i} from the ith step of iteration. R is an n_{max} × n_{max} regularization matrix, and S_{y} is the m_{max} × m_{max} covariance matrix of the measurement. The optional term λI (tuning scalar times the identity matrix) damps the step width x_{i+1} − x_{i}, bends its direction toward the direction of the steepest descent of the cost function in the parameter space, and prevents a single iteration from causing a jump of parameters x beyond the linear domain around the current guess x_{i} [Levenberg, 1944; Marquardt, 1963].
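The roles of the constraint term R and the damping term λI can be illustrated with a minimal scalar sketch. It assumes the update x_{i+1} = x_{i} + (K^{T} S_{y}^{−1} K + R + λI)^{−1} [K^{T} S_{y}^{−1} (y_{meas} − y(x_{i})) − R (x_{i} − x_{a})], reduced to one parameter and one measurement. The function name `retrieve_scalar`, the exponential toy forward model, and all numerical values are hypothetical and are not part of the processor described here.

```python
import math


def retrieve_scalar(y_meas, forward, jacobian, x_a, s_y, r, lam=0.0, n_iter=30):
    """Scalar sketch of the regularized, damped Newton-type iteration.

    forward(x) plays the role of y(x), jacobian(x) the role of K,
    s_y is the measurement variance, r the regularization strength,
    and lam the Levenberg-Marquardt damping scalar.
    """
    x = x_a  # start the iteration from the a priori value
    for _ in range(n_iter):
        k = jacobian(x)
        lhs = k * k / s_y + r + lam                       # K^T S_y^-1 K + R + lambda*I
        rhs = k / s_y * (y_meas - forward(x)) - r * (x - x_a)
        x = x + rhs / lhs
    return x


# Toy forward model y(x) = exp(x); the "true" parameter is 1.5 (assumed values).
y_meas = math.exp(1.5)
x_unconstrained = retrieve_scalar(y_meas, math.exp, math.exp, x_a=1.0, s_y=0.01, r=0.0)
x_constrained = retrieve_scalar(y_meas, math.exp, math.exp, x_a=1.0, s_y=0.01, r=100.0)
x_damped = retrieve_scalar(y_meas, math.exp, math.exp, x_a=1.0, s_y=0.01, r=0.0, lam=500.0)
print(x_unconstrained, x_constrained, x_damped)
```

In this sketch the damping term changes only the path of the iteration, not its fixed point, whereas a nonzero constraint pulls the solution from the unconstrained result toward the a priori value.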
2.2. Retrieval Parameters
[10] For the application of the retrieval processor discussed here, the parameters to be retrieved are (1) temperature (T), (2) absolute tangent altitudes, (3) so-called “continuum,” and (4) radiometric zero level calibration correction (so-called “offset”). Pressure, temperature, and altitude are related by the hydrostatic equation, where in our implementation temperature and altitude are the independent variables, while pressure is the dependent variable.
[11] 1. Temperature is represented and retrieved on a fixed altitude grid, which does not depend on the actual tangent altitudes. As the global reference frame, the World Geodetic System 1984 (WGS84) is used. The use of a fixed altitude grid avoids postdifferentiation and resampling, which are required when the tangent altitudes, which are retrieval parameters themselves and thus change during the retrieval, define the altitude grid. The grid width has been chosen quite fine (4–44 km: 1 km; 44–70 km: 2 km; 70 km–80 km: 5 km; 80 km–100 km: 10 km; 100 km–120 km: 20 km; gridding for non-LTE reference calculations even finer), which allows comfortable diagnostics, such as averaging kernels on a sub–tangent altitude grid, and avoids degradation of the vertical resolution beyond the loss of resolution due to regularization.
[12] 2. MIPAS is a limb scanning instrument where the measurement geometry is varied by angular movement of an elevation scan mirror, contrary to a detector array instrument with fixed angular distances between different lines of sight. This means that in case of MIPAS not only the absolute altitude of the tangent altitude pattern is variable but also the relative distances between adjacent tangent altitudes. The scheme presented here formally retrieves each single tangent altitude as an absolute quantity, while information on the relative tangent altitudes, i.e., vertical distances between adjacent tangent altitudes, is retrieved implicitly. A hydrostatic pressure profile is then calculated (see below). This retrieval scheme is distinct from usual schemes, which retrieve pressure and relative tangent altitudes, and then either assign the altitudes via the hydrostatic approximation or just report retrieved species abundances as a function of pressure rather than altitude [see, e.g., Ridolfi et al., 2000]. The scheme chosen here has several advantages. First, ray path geometry is calculated correctly in each iteration. Second, it avoids a shortcoming of the usual schemes, in which the compensation of altitude mispointing by pressure correction neglects the altitude dependence of all other atmospheric state parameters, in particular the abundances of species. Furthermore, since a priori knowledge on the instrument pointing is given in geometric coordinates, retrieval of tangent pressure rather than tangent altitude would require an update of the constraint term in equation (1) in each step of iteration. This delicate complication is avoided by using geometric coordinates.
However, the main advantage of the retrieval of tangent altitude instead of pressure is computational efficiency: While the calculation of the partial derivatives of spectral radiances with respect to pressure requires extensive radiative transfer calculations, the evaluation of the partial derivatives with respect to tangent altitudes requires only incremental variation of the boundaries of integration of precalculated pencil beam radiances over the field-of-view function of the instrument. Certainly, many of these arguments depend on the actual instrument specification, the nature of available a priori data, and the data structure of the forward model used. Therefore the choice of geometric altitude as pointing variable may not be appropriate for other instruments.
[13] 3. The KOPRA radiative transfer model includes continua of H_{2}O [Clough et al., 1989, version CKD 2.4; Clough, 1995], N_{2} [Lafferty et al., 1996], O_{2} [Thibault et al., 1997] and CO_{2} [Echle and Höpfner, 2000], based on CO_{2} χ factors by Menoux et al. [1987, 1991] to account for the non-Lorentzian shape of the line wings. However, it is common experience that in emission spectroscopy the calculated continuum often does not fit the actual background radiation perfectly. Therefore an additional, locally wave number-independent background continuum component is fitted to the measurement in order to prevent errors in modeled continua from being propagated into the retrieval of target parameters. This empirical additional absorption coefficient, which does not depend on any physical continuum model, compensates for atmospheric contributions of weak wave number dependence not reproduced by the radiative transfer forward model. Such effects may include continua due to superimposed wings of far-off transitions, uncertainties of H_{2}O, O_{2} and N_{2} continua, and, most importantly, signal from aerosols and clouds. Since no such continua have been observed at high tangent altitudes, this quantity is set to zero for altitudes higher than 32 km. For altitudes below, the fit parameter formally is the abundance of a locally (i.e., within a microwindow) grey absorbing/emitting constituent. In order to allow for variations of background continuum with wave number, an individual parameter is retrieved for each microwindow, constrained to smoothness in both the altitude and the wave number domains (see below).
[14] 4. As the zero-offset calibration is not perfectly known, this quantity is also retrieved directly from the spectra. While at lower altitudes the offset information in the spectra is largely correlated with the continuum, unambiguous offset information is contained in high altitude spectra with tangent altitudes above 32 km, where the continuum is zero. The zero offset calibration error is assumed to be constant with tangent altitude for each limb scan. The partial derivative of the spectrum with respect to zero offset is unity, since the offset is additive.
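The fixed retrieval grid described under item 1 can be written down explicitly. The following minimal sketch builds the altitude levels from the grid widths quoted there; the function name `build_altitude_grid` and the convention that each segment boundary is itself a grid level are assumptions made for illustration.

```python
def build_altitude_grid():
    """Return the fixed retrieval altitude grid in km.

    Segments are (lower bound, upper bound, grid width), all in km,
    taken from the grid widths quoted in the text.
    """
    segments = [(4, 44, 1), (44, 70, 2), (70, 80, 5), (80, 100, 10), (100, 120, 20)]
    grid = [segments[0][0]]  # lowest level, 4 km
    for lower, upper, step in segments:
        z = lower
        while z < upper:
            z += step
            grid.append(z)
    return grid


altitude_grid = build_altitude_grid()
print(len(altitude_grid), altitude_grid[0], altitude_grid[-1])
```

Under these assumptions the grid has 59 levels between 4 km and 120 km, considerably more than the up to 17 tangent altitudes of a limb sequence, which is why regularization is needed.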
[15] Pressure is not derived from the spectra directly but calculated on the basis of the hydrostatic approximation within each iteration, based on the current guess of the temperature profile and a pair of pressure and geometrical altitude values, calculated from the pressure and geopotential height data from the European Centre for Medium-Range Weather Forecasts (ECMWF) meteorological analysis. For examples shown here, this reference point was selected at 20 km. The altitude-dependent mean molecular mass of air is calculated from vertical profiles of the main atmospheric constituents (N_{2}, O_{2}, H_{2}O, Ar, CO_{2}, O) as provided by the extended Mass Spectrometer Incoherent Scatter (NRLMSISE-00) atmospheric model [Picone et al., 2002]. The variation of the gravitational acceleration g(ϕ, z) with latitude ϕ and altitude z is calculated as
with R_{e}(ϕ) the local distance to the Earth's center, R_{c}(ϕ) the distance to the y axis of the ellipse along the ellipse's normal, ω the Earth's angular velocity, and g_{0}(ϕ) the gravitational acceleration on ground in m/s^{2}, which is calculated as
Pressure, as all atmospheric state parameters, is represented on the same fine altitude grid as temperature. Because of the hydrostatic adjustment of pressure on the basis of retrieved temperatures and tangent altitudes between successive iterations, the correct temperature derivatives to be used in the retrieval would be a total differential of the type
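A form consistent with this description, given here as a sketch in which y denotes the simulated spectrum and T and p the temperature and the hydrostatically adjusted pressure, is:

```latex
\frac{dy}{dT} = \frac{\partial y}{\partial T} + \frac{\partial y}{\partial p}\,\frac{\partial p}{\partial T}
```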
However, it was found to be accurate enough to omit this postdifferentiation, and to tolerate one or two additional iterations instead, since the evaluation of pressure derivatives with KOPRA is computationally expensive.
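The hydrostatic adjustment of pressure performed within each iteration can be sketched as follows. This is a simplified illustration and not the processor's implementation: it assumes a constant mean molecular mass of air and a constant gravitational acceleration, whereas the scheme described above uses altitude-dependent values of both, and it requires the reference altitude to coincide with a grid level. The function name and all numerical values in the example are hypothetical.

```python
import math

R_GAS = 8.314462618   # universal gas constant, J/(mol K)
M_AIR = 0.0289644     # mean molar mass of air, kg/mol (assumed constant here)
G0 = 9.80665          # gravitational acceleration, m/s^2 (assumed constant here)


def hydrostatic_pressure(z_km, temp_k, z_ref_km, p_ref_hpa):
    """Integrate dp/p = -M g / (R T) dz upward and downward from a reference level."""
    i_ref = z_km.index(z_ref_km)  # reference level must be a grid point
    p = [0.0] * len(z_km)
    p[i_ref] = p_ref_hpa
    # Integrate upward from the reference level, using layer-mean temperatures.
    for i in range(i_ref + 1, len(z_km)):
        t_mean = 0.5 * (temp_k[i] + temp_k[i - 1])
        dz_m = (z_km[i] - z_km[i - 1]) * 1000.0
        p[i] = p[i - 1] * math.exp(-M_AIR * G0 * dz_m / (R_GAS * t_mean))
    # Integrate downward from the reference level.
    for i in range(i_ref - 1, -1, -1):
        t_mean = 0.5 * (temp_k[i] + temp_k[i + 1])
        dz_m = (z_km[i + 1] - z_km[i]) * 1000.0
        p[i] = p[i + 1] * math.exp(M_AIR * G0 * dz_m / (R_GAS * t_mean))
    return p


# Isothermal toy profile with the reference point at 20 km (illustrative numbers only).
levels_km = [10, 15, 20, 25, 30]
temperatures_k = [220.0] * 5
pressures_hpa = hydrostatic_pressure(levels_km, temperatures_k, 20, 55.0)
print(pressures_hpa)
```

Because the pressure profile is rebuilt from the current temperature guess in every iteration, a change in retrieved temperature changes pressure at all levels away from the reference point, which is the origin of the total-differential issue discussed above.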
[16] The standard treatment of a limb scanning sequence is to assume horizontal homogeneity of each atmospheric state parameter, i.e., to define state parameters as functions of altitude but constant in latitude and longitude. Since this simplification can trigger systematic errors, the algorithm is coded to handle latitudinal and longitudinal variation of state parameters. Approaches of different levels of sophistication are supported: The simplest way is just to include horizontal gradient information on state parameters from external sources in the radiative transfer forward model and to keep the gradients fixed while the parameters themselves are retrieved. Retrieval of horizontal gradient information from the spectra is also supported, which needs sufficient regularization [von Clarmann et al., 2000; Steck, 2000]. The most rigorous approach supported by the code is direct retrieval of 2D fields in an optimal estimation scheme which uses a series of successive limb scans in a tomographic approach rather than single limb scans (T. Steck, personal communication, 2003). However, none of these options has been activated for processing of data presented in this paper.
2.3. Microwindow Selection
[17] Since retrieval of tangent altitudes and temperature precedes the retrieval of species abundances, no reliable information on atmospheric state parameters is available from a previous retrieval step. This implies that, in order to avoid error propagation caused by unknown species abundances, only transitions of species should be used whose abundances change only slightly and are well known. Traditionally, both in absorption and emission spectroscopy, small spectral regions (so-called “microwindows”) which contain only CO_{2} lines are used for this purpose [see, e.g., Abbas et al., 1984; Rinsland et al., 1992; Stiller et al., 1995]. Errors caused by variations of CO_{2} mixing ratios are discussed in sections 3 and 4. In order to obtain independent information on tangent altitudes and temperature, spectral lines of different temperature dependence are needed. The IMK-IAA retrieval does not explicitly force the microwindow selection toward CO_{2} transitions, but uses an objective method to find optimal microwindows which minimize the propagation of retrieval noise, uncertainties in species abundances and certain instrumental uncertainties to retrieved temperatures and tangent altitudes [von Clarmann and Echle, 1998; Echle et al., 2000]. This optimization approach leads to microwindows which exclude spectral grid points where the gain of additional information is overcompensated by uncertainties of model parameters, such as the unknown abundances of atmospheric species other than CO_{2}, O_{2}, or N_{2}.
[18] Since, for reasons of efficiency, the standard setting of the IMK-IAA retrieval scheme assumes LTE and neglects line coupling, microwindows where these effects are important are also avoided. The MIPAS performance data assumed for the microwindow optimization in terms of nominal noise equivalent spectral radiance (NESR_{0,requirement}), noise equivalent spectral radiance after apodization (NESR_{APO,requirement}), and radiometric errors in terms of gain and offset are listed in Table 1. The selected microwindows used for pointing and temperature retrievals, the most prominent CO_{2} transitions, and the main interfering species are listed in Table 2. These microwindows all fall in the MIPAS A band. However, in case of corrupted data in band A, the microwindow selection scheme will automatically fall back to spectral transitions in other MIPAS bands. Co-registration uncertainties of the MIPAS bands in terms of pointing direction have been estimated by actively scanning the field of view across Mercury, used as a bright infrared source [Nett, 2003]. Since no evidence of any asymmetries of the field-of-view shapes has been found, these data suggest that tangent altitudes of bands A, AB, B and C coincide to better than 12 m, while effective band D tangent altitudes are lower by about 46 m. This excellent altitude alignment of the MIPAS spectral bands is explained by the fact that the field of view is determined by a field stop within the telescope which is common to all detectors, and detectors are all mounted in the same orientation with respect to the incident beam (M. Birk, personal communication, 2003). Band-dependent atmospheric refraction contributes only 0.3 m in the worst case (between bands A and D at 6 km tangent altitude) and thus is negligible. As a consequence, tangent altitudes retrieved from MIPAS band A radiances are valid also for the bands AB, B and C, while a correction by −46 m could be considered for band D.
Table 1. MIPAS Specifications and Band Characterization
MIPAS Band  A  AB  B  C  D
Spectral Coverage, cm^{−1}  685–970  1020–1170  1215–1500  1570–1750  1820–2410
Spectral Sampling, cm^{−1}  0.025  0.025  0.025  0.025  0.025
NESR_{0,requirement}^{a}  50  40  20  6  4.2
NESR_{0,in flight}^{a}  38.62  21.56  14.91  3.92  3.91
NESR_{APO,requirement}^{a}  30.35  24.28  12.14  3.64  2.549
NESR_{APO,in flight}^{a}  23.44  13.09  9.05  2.38  2.37
Gain Error, % of true radiance  2  2  2  2  2
Offset Error^{a}  100  80  40  12  8.4
Table 2. Microwindows for Tangent Altitude and Temperature Retrievals^{a}
Spectral Coverage, cm^{−1}  Number of Spectral Points  Nominal Altitudes, km  Prominent Transitions  Main Interfering Species
687.175–692.125  7–17  33–42; 52–68  02201–01101 R24–28; 11101–10001 Q10–38; 01101–00001 R26–30  O_{3}, NO_{2}, H_{2}O
693.500–693.650  7  47  01101–00001 R32  O_{3}, NO_{2}
737.525–741.300  7–42  6; 21–30; 36–68  10001–01101 R21–25  O_{3}, NO_{2}
746.350–746.500  7  6; 9; 47; 68  11101–02201 R5  O_{3}
780.850–781.000  7  9; 42; 47–60  11101–02201 R54; 12201–03301 R30  O_{3}, NO_{2}
801.450–801.625  7–8  12–18; 33; 39–52; 68  11101–01101 R12  O_{3}, NO_{2}, ClONO_{2}
968.900–969.100  7–8  6–12  00011–10001 R10  O_{3}, COF_{2}
[19] While the IMK-IAA processor supports dedicated microwindows for different latitude bands (polar, midlatitudinal, tropical), as well as latitude-dependent initial guess profiles of atmospheric constituents, the early results shown here have been generated with microwindows optimized for midlatitudinal conditions and a midlatitudinal profile of CO_{2} mixing ratios. Estimated errors due to assumed one-sigma differences of 2 ppmv between the actual and the assumed mixing ratios of CO_{2} are at most 0.1 K and 26 m for temperature and tangent altitude, respectively.
[20] In order to achieve the best possible trade-off between noise-induced random error and systematic errors due to unknown state parameters, both the microwindow selection and the detailed microwindow definition (boundaries, grid point selection within each microwindow) are altitude dependent.
[21] Both offset and continuum are assumed to vary only slowly with wave number. For practical reasons, this behavior is taken into account in the retrieval processor by assuming that these quantities do not vary within a microwindow but can take different values in different microwindows.
2.4. Constraints
[22] Because the retrieved atmospheric state parameters are represented on an altitude grid finer than the vertical spacing of the tangent altitudes of the measurements, because the partial derivatives of the measurements with respect to temperature and with respect to the line-of-sight pointing are correlated, and because of the large number of fit variables, the inverse problem is ill-posed and needs regularization. The regularization matrix R is set up as a block diagonal matrix with one block each for temperature (R_{T}), tangent altitudes (R_{LOS}), continuum (R_{c}), and offset (R_{o}). No correlation constraints between these blocks are introduced in this matrix, i.e., off-diagonal blocks are all zero.
[23] Temperature is constrained with a Tikhonov-type smoothness constraint:
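Assuming the usual first-order Tikhonov form, which matches the symbols explained in the following sentence and the later reference to equation (4), the temperature block reads:

```latex
R_T = \gamma \, L_1^T L_1
\tag{4}
```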
where L_{1} is a first-order differences operator weighted by the actual grid width, and γ a scalar which controls the strength of regularization. For this application, γ has been chosen as 1.059 K^{−2} (at 1 km altitude sampling), such that the number of degrees of freedom of the retrieved temperature profile (typically about 15) is only slightly smaller than the number of tangent altitudes of the limb sequence (up to 17, depending on cloud contamination) [Steck, 2002]. This corresponds to the expected achievable altitude resolution of a measurement where all altitude information is given by measurement geometry while the information content of pressure broadening is quite limited due to limited spectral resolution.
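How such a smoothness constraint is assembled can be sketched in a few lines. The example assumes the form R_{T} = γ L_{1}^{T} L_{1} and interprets "weighted by the actual grid width" as division of each difference by the local grid spacing; both the interpretation and the function names are assumptions made for illustration.

```python
def first_difference_operator(z_km):
    """(n-1) x n first-order differences operator, each row divided by the grid width."""
    n = len(z_km)
    rows = []
    for i in range(n - 1):
        dz = z_km[i + 1] - z_km[i]
        row = [0.0] * n
        row[i] = -1.0 / dz
        row[i + 1] = 1.0 / dz
        rows.append(row)
    return rows


def tikhonov_matrix(z_km, gamma):
    """Return gamma * L1^T L1 as a nested list (n x n)."""
    l1 = first_difference_operator(z_km)
    n = len(z_km)
    return [[gamma * sum(row[i] * row[j] for row in l1) for j in range(n)]
            for i in range(n)]


# Five levels at 1 km sampling with the regularization strength quoted in the text.
grid_km = [4.0, 5.0, 6.0, 7.0, 8.0]
r_t = tikhonov_matrix(grid_km, gamma=1.059)
for row in r_t:
    print(row)
```

Each row of the resulting matrix sums to zero, so a constant offset of the whole profile is not penalized; only differences between adjacent levels are constrained, which is why this type of constraint smooths the profile without pulling it toward the absolute values of the a priori.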
[24] All-inclusive accuracies of the initial line-of-sight pointing information in terms of tangent altitude errors are specified as 1800 m (absolute, above ground), 900 m (relative error between first and last measurement of a limb scan), and 300 m (relative error between two successive tangent altitudes), each at the 95% confidence limit.
[25] This information is used to constrain the tangent altitude retrieval by optimal estimation [Rodgers, 2000]. While the specified tangent altitude accuracy does not follow a Gaussian distribution, a pointing covariance matrix has been approximated as (K. Ressel, personal communication, 1996)
where U is an n × n transformation matrix of the type
n is the number of tangent altitudes within a limb sequence, and S_{LOS} is a diagonal matrix where the first n − 1 diagonal entries are (150 m)^{2}, while the last diagonal element is (900 m)^{2}. This approach approximates one-sigma uncertainties as uncertainties at 95% confidence limit scaled down by a factor of 2. This slightly overestimates relative pointing uncertainties, since it assumes random accumulation of relative pointing errors, which ends up in a top-to-bottom one-sigma uncertainty of 600 m instead of 450 m as specified. Since the star tracking system based pointing data are independent of the spectral measurements, their use as a priori information in the sense of optimal estimation seems adequate for routine data analysis and similar applications except validation of the initial pointing information itself. For the latter purpose, i.e., to detect biases of the initial line-of-sight pointing data, the variance of the absolute pointing a priori information is set to infinity, while the variances of the relative tangent altitudes remain unchanged. Using the first-order equivalence of a pure smoothing constraint and optimal estimation for this particular case, which has been pinpointed by Steck and von Clarmann [2001], R_{LOS} can be expressed using the Tikhonov formalism (see, e.g., equation (4)) for this purpose. No significant difference in the results due to these different regularization approaches has been found for cases analyzed so far.
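The 600 m figure quoted above can be checked with a short sketch. It assumes that U maps the vector of tangent altitudes z onto the n − 1 differences between adjacent tangent altitudes plus one absolute tangent altitude, (z_{1} − z_{2}, ..., z_{n−1} − z_{n}, z_{n}), so that the covariance of z follows from propagating the diagonal S_{LOS} through the inverse of U. This structure of U, the choice of which tangent altitude carries the absolute uncertainty, and the function names are assumptions made for illustration.

```python
import math


def pointing_covariance(n, sigma_rel_m=150.0, sigma_abs_m=900.0):
    """Covariance of n tangent altitudes under the assumed structure of U.

    With d = U z = (z_1 - z_2, ..., z_{n-1} - z_n, z_n), each z_i is the sum of
    d_i, ..., d_n, so cov(z_i, z_j) is the sum of the variances of the
    components shared by both sums.
    """
    variances = [sigma_rel_m ** 2] * (n - 1) + [sigma_abs_m ** 2]  # diagonal of S_LOS
    return [[sum(variances[max(i, j):]) for j in range(n)] for i in range(n)]


def sigma_of_difference(cov, i, j):
    """One-sigma uncertainty of z_i - z_j from the covariance matrix."""
    return math.sqrt(cov[i][i] + cov[j][j] - 2.0 * cov[i][j])


cov = pointing_covariance(17)                    # 17 tangent altitudes per limb sequence
sigma_adjacent = sigma_of_difference(cov, 0, 1)
sigma_top_to_bottom = sigma_of_difference(cov, 0, 16)
print(sigma_adjacent, sigma_top_to_bottom)
```

For 17 tangent altitudes, 16 independent relative errors of 150 m each accumulate to sqrt(16) × 150 m = 600 m, which is the top-to-bottom value stated in the text.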
[26] The continuum is set to zero above 32 km (hard constraint). This is justified, because none of the physics which causes any signal to be parameterized and compensated by the empirical continuum, such as far wing radiation, aerosol, or water vapor continuum, has any significant effect at these high altitudes. Below 32 km the regularization approach of the continuum in the altitude domain is the same as for temperature, i.e., a Tikhonov-type first-order smoothing constraint with γ = 10^{10} km^{2} (at 1 km altitude sampling). In order to enforce the continuum character of the empirically fitted continuum, additional first-order smoothness regularization is applied in the wave number domain (i.e., between microwindows) (γ = 2.5 × 10^{7} km^{2} at 1 cm^{−1} spectral distance).
[27] Since there is no atmospheric continuum above 32 km, all radiance offset in these high altitude spectra carries information on the (altitude independent) zero calibration correction. Therefore this quantity does not need to be constrained, i.e., R_{o} = 0. Regularization of the offset calibration correction in the wave number domain may be worth considering but has not been implemented, since there has not been any evidence of necessity. Certainly, the assumption of altitude-constant offset is an implicit constraint in itself, but not in the formal sense of equation (1). This issue falls rather in the definition of the retrieval parameter space.