Corresponding author: L. Wang, 5825 University Research Court, RM 4001, College Park, MD 20740-3823, USA. (firstname.lastname@example.org)
A consistent long-term stratospheric temperature record is crucial for global change studies. In this study, we compare newly developed Stratospheric Sounding Unit (SSU) layer-averaged stratospheric temperatures with lidar, GPS Radio Occultation (RO), and Microwave Limb Sounder (MLS) stratospheric temperature profiles, each with unique error characteristics, spatial and temporal coverage, and observational principles. The vertical temperature profiles are converted into SSU-equivalent layer temperatures, and a diurnal correction is applied to adjust the observations to an identical observational time. The comparison is carried out on pentad grids with a 2.5° latitude × 2.5° longitude resolution. Grid-by-grid comparison of SSU and MLS from August 2004 to May 2006 gives mean differences of -0.041, 0.169, and -0.447 K with standard deviations of 1.180, 1.485, and 1.715 K for SSU channels 1–3, respectively. The correlations of GPS RO and SSU brightness temperature anomalies are 0.943, 0.877, and 0.699 for channels 1–3, respectively; that is, the correlation decreases with altitude (channel number). SSU channel 3 brightness temperature anomalies are correlated with lidar observations, with correlation coefficients of 0.839 at the Hohenpeissenberg Observatory in Germany and 0.725 at the Observatoire de Haute-Provence in France. Overall, given the known limitations and advantages of the three independent data sets, the comparison results do not show that the newly developed SSU data set differs significantly from any of them.
The temperature evolution in the stratosphere is an integral component of global change because, like surface warming, it can provide evidence of natural and anthropogenic climate forcing [Ramaswamy et al., 2006]. Moreover, stratospheric temperature changes are crucial for understanding stratospheric ozone variability and trends [e.g., Thompson and Solomon, 2009] as well as dynamics in the stratosphere [e.g., Lin et al., 2009]. Indeed, stratospheric temperature changes have been used as an important means of evaluating climate model performance [e.g., Forster et al., 2011; Shine et al., 2003].
The Stratospheric Sounding Unit (SSU), carried on NOAA polar-orbiting satellites, was the only instrument making near-global middle and upper stratospheric temperature observations (with weighting functions peaking near 15.0, 5.0, and 1.5 hPa for the three channels) from 1979 to 2006. Because of its uniqueness, the SSU long-term observations have been extensively used to assess temperature changes in the stratosphere and their possible causes [Brindley et al., 1999; Liu and Weng, 2009; Ramaswamy et al., 2001; Randel et al., 2009; Seidel et al., 2011; Shine et al., 2003]. However, it is well known that the SSU instruments suffered from a gas leak in the pressure-modulated CO2 cells, which changed the channel weighting functions during operation [Brindley et al., 1999]. Furthermore, when SSU observations from different satellites were linked together, the long-term increase in atmospheric CO2 concentration [Shine et al., 2008] and orbital drift [Nash and Forrester, 1986] introduced spurious time-varying signals into the 27-year SSU time series, making it difficult to use in climate-related studies.
Only very limited SSU analyses were available because of the above challenges. The pioneering work by Nash and his colleagues [Nash, 1988; Nash and Brownscombe, 1982; Nash and Forrester, 1986; Nash and Edge, 1989] produced the first widely used SSU analysis. However, the details of the data processing were not provided in the peer-reviewed literature; thus, certain aspects of the original processing remain unknown to this day. Because of this, another SSU stratospheric temperature analysis was recently generated and released at the Center for Satellite Applications and Research (STAR) in NOAA [Wang et al., 2012]. This data set comprises global pentad layer temperatures based on SSU pixels from all viewing angles, with a grid resolution of 2.5° latitude × 2.5° longitude, for the middle stratosphere (TMS), upper stratosphere (TUS), and top stratosphere (TTS), corresponding to SSU channels 1–3, respectively (note that they are equivalent to channels 25–27 of the Nash SSU data set).
Comparisons of these two SSU data sets have been performed to evaluate the structural uncertainty in derived trends associated with methodological choices [Thompson et al., 2012; Wang et al., 2012]. They reveal striking differences between the NOAA SSU analysis and the Nash data set, both in global mean trends and in the latitudinal distribution of trends. In particular, whereas the long-term variability and trends in global-mean temperatures for the uppermost SSU channel (SSU channel 3) are relatively similar in both data sets, a significant trend difference as large as ~0.5 K/decade is revealed in channels 1 and 2. With respect to the latitudinal structure of trends, the NOAA SSU data set suggests that the largest stratospheric cooling occurred at tropical latitudes, whereas the Nash data set indicates a relatively uniform distribution with latitude. These remarkable discrepancies between the two SSU data sets raise a serious question: Which SSU data set is correct, or do both data sets have problems?
A common way to explore the root causes of these differences is to go through each data set's processing procedures step by step and then compare the outputs side by side. However, the methodology used to generate the Nash SSU data is not published in the peer-reviewed literature; thus, it is hard to reproduce the final outputs of the Nash SSU data set. Because of this, comparison of the SSU analyses with other independent stratospheric temperature observations can serve as an alternative way to assess SSU data quality. First, ground-based lidar systems can measure temperatures typically from 30 to 80 km with high temporal resolution [Hauchecorne and Chanin, 1980]. Under the framework of the Network for the Detection of Atmospheric Composition Change (NDACC) [Kurylo and Solomon, 1990], several lidar stations have continuous long-term observations that provide temperature time series since the late 1980s. Second, GPS satellites can provide all-weather, high-vertical-resolution refractivity profiles from which temperature profiles in the stratosphere are retrieved [Kursinski et al., 2000]. Third, launched on 15 July 2004, the Microwave Limb Sounder (MLS) on the EOS Aura satellite measures microwave thermal emission from the limb (edge) of the Earth's atmosphere to retrieve vertical profiles of stratospheric temperature. With high vertical resolution, all of these observations provide unique opportunities to assess the data quality of the SSU stratospheric temperature analysis. Similar comparisons have been carried out in previous studies, mainly concentrated on Nash's SSU data. For example, Randel et al. and Keckhut et al. compared Nash's monthly zonal mean SSU data with in situ lidar observations (clear nights only) from different stations. Both results indicate large discrepancies in temperature anomaly time series between the lidar measurements and Nash's SSU data set. Given the nature of data sampling in the two data sets, that is, single-point ground measurements versus satellite-based monthly zonal mean temperatures, the inconsistency arises mostly from spatial and temporal sampling differences; thus, it is hard to determine the consistency of the different data sets. Several studies have compared GPS radio occultation (RO) profiles with Advanced Microwave Sounding Unit (AMSU) stratospheric channels [Ho et al., 2007, 2009; Ladstädter et al., 2011], but so far no comparisons between GPS and SSU have been performed.
In this study, we focus on comparing the newly developed SSU stratospheric temperature data record with the above three independent stratospheric temperature observations. Although the primary objective of this study is to evaluate the data quality of the STAR SSU stratospheric temperature data set, differences between the three independent data sets and the SSU observations can provide fundamental information for linking the stratospheric temperature observations from SSU, a heritage satellite sensor, to current advanced instruments. To the best of our knowledge, this study is the first effort to compare SSU stratospheric temperature observations, which are based on infrared measurements in the CO2 absorption region, with three independent data sources that have different observation principles, that is, ground-based lidar systems, the satellite-based MLS, and space-based GPS RO observations. The article is organized as follows: Section 2 introduces the data sets. Section 3 describes the method. Section 4 presents the comparison results. Section 5 provides the conclusions.
2.1 NOAA SSU Data Set
Table 1 gives a brief description of the four data sets used in this study. The NOAA SSU data are composed of 2.5° × 2.5° pentad (5-day) brightness temperatures (BTs) from three SSU channels covering the period from 18 October 1978 to 1 May 2006. As described by Wang et al., the data processing procedures include (1) correcting the SSU BTs to correspond to identical weighting functions by removing the effects of the instrument CO2 cell gas leak and the atmospheric CO2 increase; (2) adjusting off-nadir SSU observations to nadir-like observations; (3) removing diurnal sampling biases due to satellite orbital drift as well as orbit differences; (4) gridding the adjusted SSU BTs; and (5) statistically merging the observations from different satellites. The corrections in steps 1–3 were made at the SSU pixel level through model simulations, whereas the merging of observations from different satellites was performed at the grid level. After the above steps, the original SSU measurements were reconstructed as gridded nadir-like SSU BT measurements corresponding to identical weighting functions with the same local observational time. Figure 1 shows the weighting functions of the three SSU channels, corresponding to the layer temperatures TMS, TUS, and TTS. The SSU data can be obtained from the National Environmental Satellite, Data, and Information Service (NESDIS)/STAR Web site at http://www.star.nesdis.noaa.gov/smcd/emb/mscat/mscatmain.htm.
Table 1. Stratospheric Temperature Data Sets Included in This Comparison Study
Data Type and Source
Layer-average temperatures in three channels developed at NOAA/NESDIS/STAR
 In principle, the Rayleigh temperature lidar system employs the backscattering properties of the air molecules and the hydrostatic and ideal gas properties of the atmosphere. In the absence of particles such as aerosols and clouds (typically >30 km), the number of photons received by the lidar telescope is proportional to the number of backscattering air molecules, which then allows the retrieval of atmospheric temperature between 30 and 80 km with high vertical resolution.
Two lidar stations were chosen for this study, that is, the Observatoire de Haute-Provence (OHP) in France (43.91°N, 5.71°E) [Keckhut et al., 1993] and the Hohenpeissenberg Observatory (HOH) in Germany (47.81°N, 11.01°E) [Werner et al., 1983]. Although long-term stratospheric temperature observations have been accumulated at several sites under the framework of NDACC, an intercomparison of lidar-measured stratospheric temperatures with AMSU satellite measurements indicates that the OHP and HOH sites have correlations with AMSU observations >0.7 [Funatsu et al., 2008, 2011], which supports the data quality of these two sites. The lidar profiles, which were downloaded from the NDACC website at http://www.ndsc.ncep.noaa.gov/, cover the period from 1987 to 2011 at HOH and from 1991 to 2009 at OHP.
2.3 GPS RO Data
The Challenging Mini-satellite Payload (CHAMP) GPS RO data are used in this study because of their temporal overlap with the SSU data set. Beginning in 2001, CHAMP GPS RO provided about 150–250 globally distributed, all-weather, high-resolution vertical profiles of atmospheric parameters per day within the height interval of 0–50 km [Wickert et al., 2001]. Unlike traditional multispectral passive radiometers (such as SSU or AMSU), the GPS RO limb sounding technique measures the phase and amplitude of radio signals propagated through the atmosphere between GPS satellites and GPS receivers, from which atmospheric refractivity, density, pressure, geopotential height, temperature, and humidity profiles are then derived [Kursinski et al., 2000]. Because the fundamental observable of GPS RO is a precise timing measurement, which can be traced to ultrastable atomic clocks on the Earth and is not affected by weather conditions, GPS RO observations have shown potential for use as a benchmark data set to validate other measurements over the range of 5–25 km [e.g., Ho et al., 2007, 2009; Wang et al., 2004].
The temperature profile retrievals from CHAMP GPS RO were processed using different implementations of initialization at four centers, that is, the GeoForschungsZentrum Potsdam (Germany), the NASA Jet Propulsion Laboratory (JPL; USA), the University Corporation for Atmospheric Research (UCAR; USA), and the Wegener Center of the University of Graz (Austria). Statistical comparisons of CHAMP GPS RO temperature profiles show that retrievals from the four centers exhibited similar changes over time without significant trend differences from 2001 to 2008 [Steiner et al., 2010]. In this study, we use CHAMP GPS RO dry temperature profiles from May 2001 to May 2006 that were processed at the UCAR Constellation Observing System for Meteorology, Ionosphere and Climate (COSMIC) Data Analysis and Archive Center (CDAAC) (http://cosmicio.cosmic.ucar.edu/cdaac/index.html).
2.4 MLS Data
 Launched on 15 July 2004, MLS on EOS Aura measures thermal microwave emission in five spectral regions from 115 GHz to 2.5 THz from the limb (edge) of the atmosphere to remotely sense vertical profiles of the stratospheric temperature [Schwartz et al., 2008]. MLS looks forward from the Aura spacecraft and scans the Earth's limb vertically from the ground to ~90 km every 24.7 s. The vertical scan rate varies with altitude, with a slower scan providing greater integration time in the lower regions (~0–25 km). The MLS vertical scans are synchronized to the Aura orbit so that vertical scans are made at essentially the same latitudes each orbit, with 240 scans performed per orbit (~3500 scans/day).
 The data used in this study are version 3.3 and available from the NASA Goddard Space Flight Center Earth Sciences (GES) Data and Information Services Center (DISC) (http://disc.gsfc.nasa.gov/Aura/MLS/index.shtml). The retrieved temperatures from MLS have a ~1 K bias with respect to model analysis and other observations in the troposphere and stratosphere [Schwartz et al., 2008]. The suggested useful range of MLS temperature profiles is from 261 to 0.001 hPa, although the temperatures below this range are also provided.
3.1 Conversion of Temperature Profiles
Because the SSU BTs represent layer-averaged temperatures weighted by the weighting function of each channel (Figure 1), the first step is to convert the vertical temperature profiles of the other observations into SSU-equivalent BTs. Figure 1 gives typical atmospheric temperature profiles from lidar at the HOH and OHP sites, GPS RO, and MLS, overlaid with the SSU weighting functions. Extending from the upper stratosphere to the middle mesosphere (30–80 km), a typical lidar temperature profile fully covers SSU channel 3 but lacks overlap with channels 1 and 2. Thus, the comparison between lidar and SSU is limited to SSU channel 3. A simple method, which uses the SSU weighting function to compute a weighted average of the lidar temperature profile, is used to convert lidar temperature profiles into SSU-equivalent BTs. Because the SSU weighting functions vary, albeit weakly, with atmospheric conditions, a strict method should rely on radiative transfer calculations. However, for the SSU and lidar comparison, we focus on evaluating the consistency of BT anomalies between the two data sets rather than their absolute values (see discussion below), so this method should not bias the comparison results.
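The weighted average described above for the lidar profiles can be sketched as follows. This is only an illustration of the idea, not the processing code used for the data set: the function name, the sample profile, and the weights are hypothetical, and the sketch assumes that the weighting function has already been sampled on the lidar levels with any layer-thickness factor folded into the weights.

```python
def ssu_equivalent_bt(temps_k, weights):
    """Weighting-function-weighted mean of a temperature profile.

    temps_k : temperatures (K) on the profile levels
    weights : weighting-function values on the same levels
              (hypothetical inputs; they need not sum to 1)
    """
    if len(temps_k) != len(weights):
        raise ValueError("profile and weighting function must share levels")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weighting function must have positive total weight")
    return sum(t * w for t, w in zip(temps_k, weights)) / total


# Illustrative values only, not taken from the SSU or lidar data.
profile = [230.0, 250.0, 265.0, 255.0, 235.0]
wf = [0.1, 0.3, 0.4, 0.15, 0.05]
bt = ssu_equivalent_bt(profile, wf)
print(round(bt, 3))
```

Normalizing by the sum of the weights keeps the result in kelvin even when the profile covers only part of the weighting function, which is one reason such a shortcut is better suited to anomaly comparisons than to absolute values.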
For the CHAMP GPS RO and MLS temperature profiles, we use the Community Radiative Transfer Model (CRTM) to convert them into SSU-equivalent BTs. Because the SSU was a pressure modulator radiometer that sensed radiation in the CO2 absorption region, instrument cell pressure values and the atmospheric CO2 concentration are needed to simulate SSU BTs accurately. Developed at the Joint Center for Satellite Data Assimilation, the CRTM SSU model accounts for the varying SSU cell pressures and atmospheric CO2 concentration [Chen et al., 2011]. The SSU data have been processed at a zero scan angle (corresponding to the nadir view) with a fixed atmospheric CO2 concentration of 330 ppmv and constant SSU instrument cell pressures of 110, 40, and 15 hPa for channels 1–3 [Wang et al., 2012]. Thus, to match the SSU data set, identical parameters are used for the simulation. The water vapor and ozone profiles are filled with climatological profiles that depend on season and location. A sensitivity test indicated that the SSU radiances are not sensitive to water vapor and ozone content [Chen et al., 2011]. As shown in Figure 1, a typical GPS RO temperature profile extends from the lower troposphere to 0.1–0.2 hPa. The missing part of the profile above its top level is filled with climatological profiles, which probably introduces uncertainties when simulating SSU channel 3.
3.2 Diurnal Correction
It is well known that stratospheric temperatures exhibit diurnal and semidiurnal tidal effects due to the absorption of solar radiation by ozone and water vapor [Lindzen, 1979]. When stratospheric temperatures from different data sets are compared, sampling biases can be introduced by diurnal variation in the temperature field because each type of observation has its own observational time. In particular, measurements from ground-based lidar were mainly obtained before midnight local time [Funatsu et al., 2008]. On the other hand, the SSUs onboard the NOAA polar orbiting environmental satellites (POES) passed over a fixed location twice per day, but their crossing time slowly changed due to orbital drift [see Fig. 2 of Wang et al., 2012]. Like the SSUs, the MLS on the NASA EOS Aura satellite is in a sun-synchronous orbit. However, unlike the historic NOAA POES series, the inclination of the NASA EOS satellites is adjusted every year or two to maintain a sun-synchronous orbit and avoid orbital drift. As a result, the MLS has relatively fixed equatorial crossing times of 0145 and 1345 LST. Finally, compared with the MLS and SSU, the GPS RO instrument is nearly unique because it makes limb-viewing measurements whose location and time depend on the orbit characteristics of the GPS satellites and the GPS receiver [Foelsche et al., 2009]. In general, observations from these different instruments must be adjusted to an identical observational time to avoid possible diurnal sampling biases.
For SSU observations, the adjustment has been made using a data set from the NASA Modern Era Retrospective Analysis for Research and Applications (MERRA) reanalysis [Rienecker et al., 2011; Wang et al., 2012]. By comparing measurements from Thermosphere-Ionosphere-Mesosphere-Energetics and Dynamics/Sounding of the Atmosphere using Broadband Emission Radiometry with different reanalysis data sets, Sakazaki et al. indicate that MERRA, along with two other recent reanalysis data sets, reproduces realistic diurnal tides well. Specifically, 10-year MERRA monthly 3-hourly atmospheric profiles from 2000 to 2010, with a grid resolution of 1.25° latitude × 1.25° longitude, are first averaged at each output hour to obtain multiyear mean 3-hourly profiles. These multiyear mean profiles are then used in the CRTM to obtain simulated SSU observations. The diurnal cycle anomalies for the three SSU channels as a function of local time, month, latitude, and longitude are then derived by interpolating the simulated 3-hourly SSU BTs to every hour of the day. The derived diurnal variation patterns are used to adjust all the SSU observations to an identical observational time of 1200 LST. Following a similar procedure, the SSU-equivalent BTs derived from ground-based lidar, GPS RO, and MLS profiles are also adjusted using the same data set. We have performed a sensitivity test based on SSU-GPS BT differences to examine the effects of the diurnal corrections and found that the BT differences were reduced after diurnal correction. Henceforward, except where noted below, the lidar, GPS RO, and MLS observations refer to the derived SSU-equivalent BTs with diurnal adjustment.
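A minimal sketch of such an adjustment, for one channel, month, and grid cell, is given below. It assumes that eight simulated BTs are available at 0, 3, ..., 21 h local time and that linear interpolation between them is acceptable; the interpolation scheme, the function names, and the sample diurnal cycle are assumptions made for illustration, not details taken from the processing described here.

```python
def diurnal_anomaly(bt_3hourly, local_hour):
    """Diurnal anomaly (K) at an arbitrary local hour.

    bt_3hourly holds simulated BTs at 0, 3, ..., 21 h local time.
    Values are linearly interpolated (an assumed scheme), the cycle
    wraps around at 24 h, and the daily mean is subtracted.
    """
    if len(bt_3hourly) != 8:
        raise ValueError("expected 8 values at 3-hour spacing")
    daily_mean = sum(bt_3hourly) / 8.0
    h = local_hour % 24.0
    i = int(h // 3)               # index of the 3-hourly value at or before h
    frac = (h - 3 * i) / 3.0      # fractional position toward the next value
    left = bt_3hourly[i]
    right = bt_3hourly[(i + 1) % 8]
    return left + frac * (right - left) - daily_mean


def adjust_to_reference(bt_obs, obs_hour, bt_3hourly, ref_hour=12.0):
    """Shift an observed BT from its local time to the reference local time."""
    return (bt_obs
            - diurnal_anomaly(bt_3hourly, obs_hour)
            + diurnal_anomaly(bt_3hourly, ref_hour))


# Hypothetical diurnal cycle for one grid cell (K), for illustration only.
cycle = [240.0, 239.0, 240.0, 242.0, 244.0, 243.0, 242.0, 241.0]
print(adjust_to_reference(250.0, 1.5, cycle))   # observation taken at 0130 LST
```

An observation already taken at the reference time is returned unchanged, which is a quick sanity check on the sign convention.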
3.3 Comparison Methods
Because the NOAA SSU data are composed of 2.5° × 2.5° pentad (5-day) average BTs, the SSU-equivalent BTs from lidar, GPS RO, and MLS profiles are processed into the same format as the SSU data. Specifically, the in situ lidar data are averaged into 5-day mean time series. The SSU grids closest to the two lidar stations are extracted for comparison. The GPS RO and MLS data are first binned into 2.5° × 2.5° grids to match the spatial resolution of the SSU data set. They are then averaged into 5-day mean BTs. Only SSU grids collocated with CHAMP or MLS are selected for comparison.
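The binning step can be illustrated with the short sketch below. The grid origin (90°S, 0°E), the definition of a pentad as consecutive 5-day blocks counted from a chosen start date, and all function names are assumptions made for the example; the actual SSU grid and pentad conventions may differ.

```python
from collections import defaultdict
from datetime import date


def grid_index(lat, lon, res=2.5):
    """Row and column of the res x res degree cell containing (lat, lon)."""
    n_rows = int(round(180.0 / res))
    n_cols = int(round(360.0 / res))
    row = min(int((lat + 90.0) // res), n_rows - 1)   # 90N goes in the top row
    col = int((lon % 360.0) // res) % n_cols
    return row, col


def pentad_index(day, start):
    """Number of whole 5-day blocks between start and day."""
    return (day - start).days // 5


def bin_profiles(records, start, res=2.5):
    """Average BTs per (pentad, row, col) cell.

    records : iterable of (date, lat, lon, bt) tuples
    returns : dict mapping cell key to (mean BT, number of profiles)
    """
    sums = defaultdict(lambda: [0.0, 0])
    for day, lat, lon, bt in records:
        key = (pentad_index(day, start),) + grid_index(lat, lon, res)
        sums[key][0] += bt
        sums[key][1] += 1
    return {k: (s / n, n) for k, (s, n) in sums.items()}


# Hypothetical profiles near one station, for illustration only.
records = [
    (date(2005, 6, 10), 47.81, 11.01, 250.0),
    (date(2005, 6, 14), 48.00, 12.00, 252.0),
    (date(2005, 6, 15), 47.81, 11.01, 260.0),
]
gridded = bin_profiles(records, start=date(2005, 6, 10))
```

Keeping the profile count alongside each mean makes it straightforward to inspect how many profiles contributed to each grid cell.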
Figure 2 gives an example of a comparison between SSU and CHAMP for the pentad from 10 June to 14 June 2005. Compared with the SSU data, which cover the entire globe, the CHAMP grids are scattered over the globe because of the unique sampling of GPS RO (as discussed above). However, the image from the GPS observations can still characterize the spatial pattern of the stratospheric temperature distribution to a certain extent. For example, both figures show high temperatures in the Arctic region, whereas the South Pole region is dominated by low temperatures. The scatter plots in the right panels show how well the SSU and CHAMP data agree with each other. To examine climate-zone-dependent features, the matched grids are further divided into three climate regions indicated by different colors. Overall, the SSU observations agree with the GPS RO data with a correlation coefficient of ~0.99; nevertheless, their difference in channel 3 (-1.762 K) is about three times as large as those in channels 1 and 2 (-0.587 and -0.566 K). This is most likely because the observational error of GPS RO increases steadily above 25 km owing to residual ionospheric noise and the use of ancillary (climatology) data in the noise reduction process [Ao et al., 2006; Kuo et al., 2004], which yields large GPS observational errors for channel 3 (near 45 km).
Figure 3 shows a comparison between SSU and MLS for the same 10–14 June 2005 period. In contrast to Figure 2, the 5-day gridded MLS image shows spatial distribution patterns almost identical to those of SSU because of its large number of samples (699 matched grids for GPS RO vs. 7958 for MLS). The mean differences between the two observations are -0.121, 0.139, and -0.466 K for SSU channels 1–3, respectively, which are much smaller than the SSU-GPS BT differences in the corresponding channels (see the numbers above). In terms of the standard deviation of the BT differences, the values for SSU-GPS are 1.683, 1.969, and 2.311 K for channels 1–3 compared with 0.942, 1.281, and 1.675 K for SSU-MLS. These differences can be attributed to various factors. First, the MLS profiles covering 100 to 0.1 hPa have a relatively constant accuracy of 0.6–1.0 K with altitude [Schwartz et al., 2008], whereas GPS RO observational errors increase above 25 km (to ~4% from ~1%) [Kuo et al., 2004]. Second, the MLS profiles can reach 0.001 hPa, whereas the GPS RO profiles usually stop at 0.2–0.3 hPa (~60 km; see Figure 1). The missing observations in GPS RO above 60 km, which contribute to SSU channel 3, were filled with model-provided atmospheric profiles during the simulations. This implementation can cause some uncertainties. Finally, data sampling in grid cells is quite different for the MLS and CHAMP GPS RO data, as is clearly illustrated in Figure 4, which shows the number of profiles in each grid cell (that are averaged to produce the 5-day average temperature). As seen, the BT values from GPS RO were calculated mostly from a single profile, in contrast to 1–5 profiles for the MLS data. This sampling difference can partly contribute to the differences in their biases relative to SSU.
 To address these differences in vertical coverage as well as spatial and temporal sampling among the four data sets, two different comparison strategies are employed. On one hand, given the fact that MLS observations have high spatial and temporal sampling resolution and good vertical coverage, it is possible to perform grid-by-grid comparison to assess the absolute values of SSU data set. The MLS observations beginning from August 2004 mainly overlapped with NOAA-14. For the SSU data set, NOAA-14 was selected as a reference satellite and the observations from other six satellites were adjusted to merge with those from NOAA-14 to remove intersatellite biases [Wang et al., 2012]. The data quality of the reference satellite is essential for the entire data set. Therefore, comparison with the observations from the reference satellite NOAA-14 can indirectly assess the data quality of the SSU data set. Second, we are planning to extend the SSU stratospheric temperature observations from 2006 to present by merging SSU data sets with AMSU observations. Because NOAA-14 is the only satellite that overlapped with AMSU, it is also important to understand the data quality of NOAA-14 SSU observation. On the other hand, we compare the SSU data set with lidar and GPS RO observations to check the relative consistency of time evolution characteristics among the three data sets, such as trend and temporal variability instead of their absolute differences. In particular, time series from each data set for matched pentad grids are derived, and then their own seasonal climatology is removed from each time series to reduce the systematic difference caused by vertical coverage as well as spatial and temporal sampling differences. Finally, temperature anomalies from different data sets are compared for overlap period.
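The anomaly calculation used in the second strategy can be sketched as follows: each time series has its own climatology, here the mean of every pentad-of-year slot, subtracted from it. The function name and the sample values are hypothetical, and the sketch assumes a series without missing values.

```python
from collections import defaultdict


def deseasonalize(values, pentad_of_year):
    """Return anomalies after removing the series' own seasonal climatology.

    values         : BT time series (K)
    pentad_of_year : seasonal slot of each value (e.g., 0..72)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, p in zip(values, pentad_of_year):
        sums[p] += v
        counts[p] += 1
    climatology = {p: sums[p] / counts[p] for p in sums}
    return [v - climatology[p] for v, p in zip(values, pentad_of_year)]


# Two years of a toy series with two seasonal slots, for illustration only.
bt = [250.0, 260.0, 252.0, 262.0]
slot = [0, 1, 0, 1]
print(deseasonalize(bt, slot))
```

Because each data set is deseasonalized against its own climatology, a constant offset between two data sets drops out of the anomaly comparison.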
4 Comparison Results
4.1 SSU Versus MLS
During the overlap period of MLS and SSU (from August 2004 to May 2006), a total of 124 pentads are generated (similar to Figure 3). All the paired SSU-MLS grids are extracted and combined into a scatter plot, as shown in Figure 5. The results indicate that the SSU and MLS data are in excellent agreement. In particular, the mean differences between SSU and MLS are −0.041, 0.169, and −0.447 K with standard deviations of 1.180, 1.485, and 1.715 K for SSU channels 1–3, respectively. The magnitudes of the bias and standard deviation increase with channel number (altitude). By comparing MLS temperature profiles with eight correlative data sets, Schwartz et al. found that temperature biases and scatter between MLS and the correlative data sets increase with altitude from the lower stratosphere to the lower mesosphere. Our results are consistent with their findings. In addition, the measurement noise of the three SSU channels increases with channel number (the noise equivalent differential radiance is 0.30, 0.40, and 1.00 mW/(m² sr cm⁻¹) for channels 1–3, respectively) [Kidwell, 1995]. This can partially explain the increase in the standard deviation of the SSU-MLS BT difference with channel number. To further check how SSU differs spatially from MLS, the mean and standard deviation of SSU-MLS at each grid are derived and shown in Figure 6. With respect to the mean of the SSU-MLS BT differences, channels 1 and 2 show a warm bias in the polar regions but a relatively small bias in the tropics. For channel 3, however, cold biases are dominant in the tropics, whereas cold and warm biases are mixed poleward of 30°N and 30°S. In terms of the standard deviation, which represents the spread of the SSU-MLS BT differences, all three channels show persistently large values in the high-latitude zones. This suggests that the SSU-MLS BT differences have a large spread in high-latitude regions. In addition, the standard deviation in the same region increases with channel number, which is consistent with Figure 5.
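For reference, the summary statistics quoted in this kind of paired comparison (mean difference, standard deviation of the differences, and correlation coefficient) can be computed as in the sketch below. The use of the sample (n − 1) standard deviation is an assumption, since the text does not state which estimator was used, and the function names and sample values are hypothetical.

```python
import math


def difference_stats(a, b):
    """Mean and sample standard deviation of the paired differences a - b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    if n < 2:
        raise ValueError("need at least two matched pairs")
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)   # assumed n - 1 estimator
    return mean, math.sqrt(var)


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    if saa == 0 or sbb == 0:
        raise ValueError("correlation undefined for a constant series")
    return sab / math.sqrt(saa * sbb)


# Toy matched-grid BTs (K), for illustration only.
ssu = [241.0, 242.0, 243.0, 244.0]
other = [241.0, 241.0, 243.0, 243.0]
bias, spread = difference_stats(ssu, other)
r = pearson_r(ssu, other)
```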
Because NOAA-14, which operated for 11 years from 1995 to 2006, was selected as the reference satellite during construction of the SSU data set, any potential issue in the NOAA-14 SSU calibration could significantly affect the quality of the whole data set. Considering that the NOAA-14 SSU was 9 years old after 2004 whereas MLS was a newly launched instrument during the comparison period, the good agreement between MLS and SSU is encouraging and lends confidence to the SSU data set.
4.2 SSU Versus Lidar
The good agreement between SSU and MLS merits further investigation of the consistency of the time evolution of SSU with the other two data sets. Figures 7 and 8 give the comparison results between lidar and SSU at the HOH and OHP sites, respectively. First, a comparison of the annual cycles of SSU channel 3 and the lidar measurements (Figures 7d and 8d) shows that the SSU BTs are about 4–5 K lower than the lidar measurements at both sites. Various factors could have contributed to this difference. For example, the fixed SSU weighting functions used to convert the lidar profiles to SSU BTs were calculated based on a standard atmosphere, which may not accurately match the seasonally changing midlatitude lidar profiles. In addition, the diurnal corrections may not be accurate enough to adjust nighttime lidar observations to a fixed local time of 1200 LST. However, similar seasonal variation patterns are still observed for the two data sets. For example, both data sets feature sudden temperature increases in winter, albeit with different amplitudes, which could be related to the evolution of stratospheric sudden warmings [Charlton and Polvani, 2007].
The anomaly time series, which are derived by subtracting the annual cycles from the BT time series, reduce the bias in BT values caused by the above factors and are given in Figures 7a and 8a. We thus focus on comparing the BT anomalies. The temporal variability of lidar and SSU is generally consistent at both sites, although the lidar anomalies display a larger spread than those of SSU because of the difference in data sampling. To show the correlation of the BT anomalies for the two data sets, scatter plots are given in Figures 7c and 8c. The correlation coefficient is 0.839 at HOH compared with 0.725 at OHP for a similar number of samples, suggesting that the lidar observations at HOH agree better with the SSU observations than those at OHP. This finding is consistent with the previous lidar-AMSU comparison study by Funatsu et al., in which lidar and AMSU measurements typically had correlations higher than 0.7, but the correlation decreased to 0.4–0.5 in summer at OHP. One possible reason for the difference between the two sites is the sampling of the lidar measurements. Lidar operation requires clear nights, and routine measurements depend on maintenance resources. As shown in Figures 7a and 8a (indicated by green lines), the numbers of lidar profiles used to generate pentads differ between the two sites. Specifically, whereas OHP has more profiles per pentad, HOH has a more regular measurement rate (also shown in Figure 1 of Keckhut et al.), which may result in better agreement with SSU at HOH.
Because each time series represents temperature fluctuations at the same location, differencing the lidar and SSU BT anomaly time series for the same period removes atmospheric variability that is common to both data sets, which facilitates the identification of trend differences caused by data sampling, processing methods, and instrument differences between the two data sets. The SSU-lidar BT difference time series are given in Figures 7b and 8b, superimposed with linear trend lines. Two-sigma confidence intervals (95% confidence level) are determined with an adjustment for autocorrelation effects [Santer et al., 2000]. The results show that the BT anomaly difference time series between SSU and lidar have trends of 0.035 ± 0.042 K/year at HOH and 0.080 ± 0.067 K/year at OHP. Statistically, there is no significant trend difference between the two data sets at HOH, but a difference can be seen at OHP. Figure 8c also shows some outliers around 2001 at OHP. After these outliers are removed, the correlation between lidar and SSU at OHP increases from 0.728 to 0.793, but the trend of the SSU-lidar BT difference also changes from 0.080 ± 0.067 to 0.084 ± 0.052 K/year. Further understanding of the quality of the lidar measurements at OHP is still needed in future studies.
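One common way to make such an autocorrelation adjustment is to fit an ordinary least-squares trend and then shrink the sample size according to the lag-1 autocorrelation of the residuals, n_eff = n(1 − r1)/(1 + r1), before computing the standard error of the slope. The sketch below follows that idea; capping n_eff at n when r1 is negative, the function name, and the sample series are assumptions made for the example rather than details given in the text.

```python
import math


def trend_with_adjusted_ci(t, y):
    """OLS slope and two-sigma half-width with a lag-1 autocorrelation adjustment."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

    # Lag-1 autocorrelation of the regression residuals.
    sse = sum(e * e for e in resid)
    r1 = sum(resid[i] * resid[i + 1] for i in range(n - 1)) / sse if sse > 0 else 0.0

    # Effective sample size; capped at n (an assumed, conservative choice).
    n_eff = min(n * (1.0 - r1) / (1.0 + r1), float(n))
    if n_eff <= 2.0:
        raise ValueError("too few effective samples for a trend uncertainty")

    std_err = math.sqrt(sse / (n_eff - 2.0) / sxx)
    return slope, 2.0 * std_err


# Toy difference series (K) against time (years), for illustration only.
years = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
diff = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
slope, half_width = trend_with_adjusted_ci(years, diff)
```

A trend is then judged significant at roughly the 95% level when the absolute slope exceeds the returned half-width, which is the sense in which 0.035 ± 0.042 K/year is not significant and 0.080 ± 0.067 K/year is.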
4.3 SSU Versus GPS RO
 Because in situ measurements are more often than not affected by local geophysical conditions, as discussed above, a comparison of 5 years of GPS RO and SSU measurements can assess the time evolution of the SSU data set (against GPS RO observations) globally and comprehensively (for all channels). Following the above method, the BT anomaly time series of SSU and CHAMP GPS RO measurements, as well as their differences, from May 2001 to May 2006 are given in Figure 9. Note that each individual time series is deseasonalized using its own annual cycle derived from the whole time period, as shown in Figure 10. Latitudinal cosine weighting has been applied to the global time series; thus, the values from the three climate regions may represent different area sizes. The number of GPS-SSU matched grids is shown at the top of each panel in Figure 9.
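 The latitudinal cosine weighting mentioned above can be illustrated with a short sketch. It assumes a regular 2.5° grid stored as a (latitude, longitude) array with NaN in cells that have no GPS-SSU match; the function name `cos_weighted_mean` and the grid layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def cos_weighted_mean(field, lat_centers):
    """Area-weighted mean of a (nlat, nlon) field.

    Each cell is weighted by the cosine of its latitude; NaN cells
    (for example, grids with no matched observations) are skipped.
    """
    field = np.asarray(field, dtype=float)
    lat_centers = np.asarray(lat_centers, dtype=float)
    weights = np.cos(np.deg2rad(lat_centers))[:, None] * np.ones_like(field)
    valid = np.isfinite(field)
    if not valid.any():
        return np.nan
    return np.sum(field[valid] * weights[valid]) / np.sum(weights[valid])


# Assumed grid layout: centres of 2.5-degree latitude bands.
lat = np.arange(-88.75, 90.0, 2.5)   # 72 latitude centres
lon = np.arange(1.25, 360.0, 2.5)    # 144 longitude centres
```

 A regional mean can be obtained with the same function by passing only the rows of interest, for example `field[np.abs(lat) <= 20]` together with `lat[np.abs(lat) <= 20]` for the tropics.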
 The global mean time series of BT anomalies in Figure 9a show large variations before 2003, which are mostly caused by the relatively small number of matched grids. After the number of CHAMP GPS RO profiles increased sharply in mid-2003, the variation was noticeably reduced. Over the whole period, however, the SSU and GPS RO BT anomalies generally agree with each other for all three channels. In particular, the time series of SSU-GPS BT anomaly differences (blue lines) stay close to the zero line over time, whereas the amplitude of the fluctuations increases from channel 1 to channel 3. To compare the two time series quantitatively, correlation coefficients are calculated and listed in Table 2. The results clearly show that the correlation between the SSU and GPS RO time series decreases with altitude (from channel 1 to channel 3), which is mostly caused by the error structure of GPS RO soundings. The studies of GPS RO profile accuracy by Kuo et al. and Ao et al. have shown that RO soundings have the highest accuracy from ~5 to 25 km, whereas the observational errors increase both toward the surface and above 25 km. As altitude increases above 25 km, the accuracy of GPS RO degrades, resulting in reduced correlation between GPS RO and SSU.
Table 2. Correlation Coefficients Between GPS RO and SSU BT Anomalies
 To check the spatial consistency of SSU and GPS RO, the GPS and SSU BT anomalies are divided into three climate regions, that is, the tropics (20°S to 20°N), the 20°N to 90°N zone (referred to as the Northern Hemisphere), and the 20°S to 90°S zone (referred to as the Southern Hemisphere), as shown in Figures 9b–d. In all three climate regions, the SSU time series show larger variability, whereas those from GPS are smoother for all three channels. For example, planetary waves propagate during winter (from December to March) in the Northern Hemisphere with a large horizontal and vertical extent, generating stratospheric sudden warmings. As shown in Figure 9d, the large BT anomaly variations from December to March are well captured by the SSU data but not by GPS RO (probably because of its sparse spatial sampling, as shown in Figure 2). Similar to the global time series, the correlation between GPS RO and SSU decreases with channel number in all three climate zones. On the other hand, the 20°S to 90°S and 20°N to 90°N zones have larger correlations than the tropics. This may be because the 20°S to 90°S and 20°N to 90°N zones are dominated by a strong seasonal cycle (Figure 10).
 To check the trend consistency between the data sets, the linear tendencies of each data set and of their differences are summarized in Table 3. Two-sigma confidence intervals (95% confidence level), adjusted for autocorrelation effects, are also listed. For the global time series, both the SSU and GPS RO BT anomaly time series show cooling tendencies in 2001–2006, but the tendency differences are 0.044 ± 0.019, 0.041 ± 0.030, and 0.097 ± 0.040 K/year for channels 1–3, respectively. In other words, the SSU data set shows stronger cooling tendencies than the GPS RO observations. The largest difference is found in channel 3, which corresponds to the temperature of the top stratosphere and lower mesosphere. For the other climate regions, the tendency uncertainties are larger than the tendency values, so it is hard to detect tendency differences between the two data sets. Overall, the temporal evolution of the SSU and GPS RO data sets is most consistent for channel 1, and the consistency decreases with channel (altitude) for both the global time series and those in the different climate regions.
Table 3. Trends and 95% Confidence Level of GPS, SSU BT Anomalies, and Their Differences From May 2001 to May 2006
Column headers (repeated three times): Tendencies (K/yr), 95%
 Finally, Figure 10 gives the BT seasonal cycles of SSU and CHAMP GPS RO for the three SSU channels, for the globe and the three climate zones. The phases of the seasonal cycles of the two data sets are consistent with each other, whereas differences in magnitude can be clearly seen in channel 3.
 It is generally difficult to assess the accuracy and quality of the SSU data set by comparing it with only one set of observations because each data set has its own advantages and drawbacks. Therefore, the comparison of the SSU data set with the three independent data sets should take advantage of their strengths and avoid their shortcomings. Specifically, the microwave limb sounding instrument has global coverage and high vertical resolution, but its record is not long enough for climate-related studies. A grid-by-grid comparison of SSU and MLS indicates that the mean differences between SSU and MLS are <0.5 K over their overlap period for all three SSU channels. This accuracy level is generally what can be reached by heritage satellite instruments and thus suggests that the SSU absolute accuracy is within expectation and acceptable. The GPS RO technique, available from 2001, can provide globally distributed, vertically resolved stratospheric temperature profiles using precise timing measurements, but its accuracy degrades in the upper and top stratosphere. Because the GPS RO observations have relatively high accuracy in the lower and middle stratosphere, the good agreement between SSU and GPS RO in channel 1 (the mid-stratosphere channel) suggests that the variability of the SSU observations is as reliable as that of GPS RO. The lidar observations, which are based on an active remote sensing technique, have long-term temporal coverage and high vertical resolution but are confined to several ground-based stations and are only suitable for top stratospheric temperatures (channel 3). For this channel, the BT anomalies of SSU and lidar are highly correlated, with correlation coefficients of 0.839 at HOH and 0.725 at OHP, suggesting that the SSU observations have long-term trends and variability comparable to those of the lidar observations.
 Overall, the comparison results do not show that the newly developed SSU data set is significantly different from any of the three independent data sets based on known limitations and advantages of these data sets.
 Consistent long-term stratospheric temperature observations are a crucial part of global change studies. In this study, we examined the consistency of the SSU time evolution with long-term in situ lidar and CHAMP GPS RO observations, and of its absolute values with MLS data. The vertical temperature profiles are converted into SSU-equivalent layer temperatures, and a diurnal correction is then applied to adjust the observations to an identical observation time. The comparison is carried out on pentad grids with a 2.5° × 2.5° resolution. The following conclusions are drawn from this study:
 SSU channel 3 BT anomalies are highly correlated with lidar observations, with correlation coefficients of 0.839 at HOH and 0.725 at OHP. The SSU-lidar BT anomaly difference time series have trends of 0.035 ± 0.042 K/year at HOH and 0.080 ± 0.067 K/year at OHP.
 For the global time series, the correlations between GPS RO and SSU are 0.943, 0.877, and 0.699 for channels 1–3, respectively. The correlation decreases with altitude (channel) because the accuracy of GPS RO soundings degrades with altitude above 25 km. Similar features are also found for the different climate zones. The trend differences between the SSU and GPS RO global time series are 0.044 ± 0.019, 0.041 ± 0.030, and 0.097 ± 0.040 K/year for channels 1–3, respectively.
 The mean differences between SSU and MLS from August 2004 to May 2006 are -0.041, 0.169, and -0.447 K, with standard deviations of 1.180, 1.485, and 1.715 K, for SSU channels 1–3, respectively. The increase in standard deviation from channel 1 to channel 3 may be caused by the fact that the noise of the SSU measurements increases with channel number. Compared with MLS, SSU shows a warm bias in the polar regions but a nearly zero bias in the tropics in channels 1 and 2. In contrast, channel 3 is dominated by a cold bias in the tropics but shows mixed biases in the polar regions. The spatial patterns of the standard deviation show persistently large values in the high-latitude regions for all three channels, suggesting a large spread of SSU-MLS BT differences in the polar regions.
 Overall, the comparison results do not show that the newly developed SSU data set is significantly different from the other three data sets, given the known limitations of the three comparison data sets.
 The work is supported by NOAA grant NESDISPO20092001589 (SDS0915). L.W. is also partially supported by NOAA grant NA09NES4400006 (Cooperative Institute for Climate and Satellites) at the Earth System Science Interdisciplinary Center/University of Maryland. The lidar data used in this study were obtained as part of the NDACC and are publicly available at http://www.ndacc.org. The CHAMP GPS RO soundings were downloaded from the CDAAC at NCAR. The MLS data were acquired from the NASA GES DISC. The views, opinions, and findings contained in this report are those of the authors and should not be construed as an official NOAA or U.S. Government position, policy, or decision.