Water-vapor-weighted atmospheric mean temperature, Tm, is a key parameter in the retrieval of atmospheric precipitable water (PW) from ground-based Global Positioning System (GPS) measurements of zenith path delay (ZPD), as the accuracy of the GPS-derived PW is proportional to the accuracy of Tm. We compare and analyze global estimates of Tm from three different data sets from 1997 to 2002: the European Centre for Medium-Range Weather Forecasts (ECMWF) 40-year reanalysis (ERA-40), the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis, and the newly released Integrated Global Radiosonde Archive (IGRA) data set. Temperature and humidity profiles from both the ERA-40 and NCEP/NCAR reanalyses produce reasonable Tm estimates compared with those from the IGRA soundings. The ERA-40, however, is a better option for global Tm estimation because of its better performance and its higher spatial resolution. Tm is found to increase from below 255 K in polar regions to 295–300 K in the tropics, with small longitudinal variations. Tm has an annual range of ∼2–4 K in the tropics and 20–35 K over much of Eurasia and northern North America. The day-to-day Tm variations are 1–3 K over most low latitudes and 4–7 K (2–4 K) in winter (summer) Northern Hemispheric land areas. Diurnal variations of Tm are generally small, with mean-to-peak amplitudes less than 0.5 K over most oceans and 0.5–1.5 K over most land areas and a local time of maximum around 16–20 LST. The commonly used Tm-Ts relationship from Bevis et al. (1992) is evaluated using the ERA-40 data. Tm derived from this relationship (referred to as Tmb) has a cold bias in the tropics and subtropics (−1 ∼ −6 K, largest in marine stratiform cloud regions) and a warm bias in the middle and high latitudes (2–5 K, largest over mountain regions). The random error in Tmb is much smaller than the bias. 
A serious problem in Tmb is its erroneously large diurnal cycle, owing to the diurnally invariant Tm-Ts relationship and large Ts diurnal variations, which could result in a spurious diurnal cycle in GPS-derived PW and cause 1–2% day-night biases in GPS-based PW.
1. Introduction
 The GPS comprises a network of 24 satellites in six orbital planes inclined at 55°, with four satellites in each plane. Each satellite travels in a 12-hour, circular orbit 20,200 kilometers above the Earth and transmits radio signals to ground-based GPS receivers around the globe. The radio signals from the GPS satellites to ground receivers are delayed, in part, by atmospheric water vapor (referred to as wet delay). The total delay along the zenith path is called the zenith path delay (ZPD). The ZPD can be partitioned into two parts: the zenith hydrostatic delay (ZHD), which depends only on surface air pressure (Ps) [Elgered et al., 1991], and the zenith wet delay (ZWD), which is a function of the atmospheric water vapor profile. The ZWD can be derived by subtracting ZHD from ZPD. Since ZWD is a function of atmospheric water vapor and temperature, PW can be calculated from it if the water-vapor-weighted mean temperature of the atmosphere (Tm) can be estimated [Elgered et al., 1991; Bevis et al., 1992, 1994]. The RMS (root mean square) error of GPS-derived PW ranges from less than 2 mm in North America [e.g., Bevis et al., 1992, 1994; Li et al., 2003], Europe [e.g., Emardson et al., 1998; Gendt et al., 2004], and Australia [Tregoning et al., 1998; Dietrich et al., 2004] to 2.2 mm in Taiwan [Liou et al., 2001], 2.6 mm at global IGS sites [Deblonde et al., 2005] and 3.7 mm in Japan [Ohtani and Naito, 2000]. The PW accuracy is proportional to the accuracy of Tm. An uncertainty of 5 K in Tm corresponds to 1.6–2.1% uncertainty in PW.
 Although there have been many regional applications of ground-based GPS data [see Dai et al., 2002], there have been few efforts to take advantage of the growing network of IGS stations around the globe. One of the challenges in deriving PW from global GPS path delay data is estimating Tm over the globe. Exact calculations of Tm require profiles of atmospheric temperature and water vapor, which are usually unavailable. Instead, Tm is commonly estimated using either (1) station data of surface air temperature (Ts) and its empirical linear or more complicated relationship with Tm (the so-called Tm-Ts relationship) [e.g., Bevis et al., 1992; Ross and Rosenfeld, 1997; Emardson and Derks, 2000] or (2) numerical weather prediction model output or atmospheric reanalysis products [Bevis et al., 1994; Hagemann et al., 2003]. The Tm-Ts relationship can be site-dependent and may vary seasonally and diurnally [Ross and Rosenfeld, 1997]; it produces Tm estimates with an RMS error of ∼2–5 K [e.g., Davies and Watson, 1998; Ingold et al., 1998; Mendes et al., 2000; Liou et al., 2001; Baltink et al., 2002; Bokoye et al., 2003]. Hence the second method may be a better option for global Tm estimation. However, model and reanalysis data have their own uncertainties, and there have been no studies to evaluate Tm estimates from reanalysis data on a global scale.
 The main goals of this study are (1) to evaluate global estimates of Tm from two currently available reanalysis data sets (ERA-40 and NCEP/NCAR reanalysis) by comparing them with Tm calculated from a new global radiosonde data set, (2) to characterize in detail Tm spatial and temporal variability, (3) to evaluate the most commonly used Tm-Ts relationship from Bevis et al. [1992] on a global scale, and (4) to determine the best approach to estimate Tm for deriving global PW from ground-based GPS measurements. The Tm definition and different methods for estimating Tm are given in section 2. Data sets and analysis methods are described in section 3. Section 4 compares the Tm estimates from the three different data sets. The spatial and temporal variability of Tm is discussed in section 5. In section 6, we evaluate the commonly used Tm-Ts relationship from Bevis et al. [1992]. Section 7 presents a summary and a recommendation for global Tm estimates.
2. Tm Definition, Sensitivity and Estimation Methods
 The water-vapor-weighted mean temperature of the atmosphere (represented by N levels), Tm, is defined and approximated as [Davis et al., 1985]:
$$T_m = \frac{\int (P_v/T)\,dz}{\int (P_v/T^2)\,dz} \approx \frac{\sum_{i=1}^{N} (P_{v_i}/T_i)\,\Delta z_i}{\sum_{i=1}^{N} (P_{v_i}/T_i^2)\,\Delta z_i} \qquad (1)$$
where Pv is the partial pressure (in hPa) of water vapor and T is the atmospheric temperature (in Kelvin). Tm is a function of the vertical profiles of atmospheric temperature and humidity. GPS-estimated PW is related to ZWD via a dimensionless parameter Π:
$$PW = \Pi \cdot ZWD \qquad (2)$$
$$\Pi = \frac{10^6}{\rho R_v \left[ (k_3/T_m) + k'_2 \right]} \qquad (3)$$
where ρ is the density of liquid water, Rv is the specific gas constant for water vapor, and k3 and k′2 are physical constants given by Bevis et al. Using equations (2) and (3), the relative error of PW due to errors in Tm can be derived as
$$\frac{\Delta PW}{PW} = \frac{\Delta \Pi}{\Pi} = \frac{1}{1 + (k'_2/k_3)\,T_m}\,\frac{\Delta T_m}{T_m} \approx \frac{\Delta T_m}{T_m} \qquad (4)$$
since k′2/k3 is small (∼5.9 × 10−5 K−1). Hence the relative error of PW approximately equals that of Tm, which is also shown by Bevis et al. On the basis of equation (4), for Tm from 240 K to 300 K, the 1% and 2% accuracies in PW require errors in Tm less than 2.74 K and 5.48 K on average, respectively.
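The sensitivity figures quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: the values of k′2, k3, ρ and Rv are assumptions (commonly quoted values, consistent with the ratio k′2/k3 ≈ 5.9 × 10−5 K−1 stated in the text), and the function names are hypothetical.

```python
# Assumed constants; the text only gives the ratio k2'/k3 (about 5.9e-5 per K).
K2_PRIME = 22.1    # K/hPa (assumed value)
K3 = 3.739e5       # K^2/hPa (assumed value)
RHO_W = 1000.0     # density of liquid water, kg/m^3
R_V = 461.5        # specific gas constant for water vapor, J/(kg K)


def pi_factor(tm):
    """Dimensionless Pi = 1e6 / (rho * Rv * (k3/Tm + k2'))."""
    # Convert the constants from per-hPa to per-Pa so that all units are SI.
    return 1.0e6 / (RHO_W * R_V * (K3 / 100.0 / tm + K2_PRIME / 100.0))


def relative_pw_error(delta_tm, tm):
    """Relative PW error caused by an error delta_tm in Tm (equation (4))."""
    return delta_tm / tm / (1.0 + (K2_PRIME / K3) * tm)


def tm_error_for_pw_accuracy(accuracy, tm):
    """Tm error that produces a given relative PW error (equation (4) inverted)."""
    return accuracy * tm * (1.0 + (K2_PRIME / K3) * tm)


if __name__ == "__main__":
    tms = [240.0 + i for i in range(61)]  # 240 K to 300 K in 1 K steps
    mean_dtm = sum(tm_error_for_pw_accuracy(0.01, tm) for tm in tms) / len(tms)
    print(f"Pi at 270 K: {pi_factor(270.0):.3f}")
    print(f"mean Tm error for 1% PW accuracy: {mean_dtm:.2f} K")
    print(f"PW error for 5 K at 270 K: {100 * relative_pw_error(5.0, 270.0):.2f}%")
```

Under these assumed constants, the averaged Tm error for 1% PW accuracy comes out close to the 2.74 K quoted in the text.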
 Atmospheric temperature and humidity profiles, such as those from soundings, can be used to calculate Tm. However, these profiles are usually unavailable at the high temporal resolution of the ZPD data (e.g., 2-hourly) and are usually not collocated with ground-based GPS stations. Hence Tm is often estimated using alternative methods.
 A commonly used method for estimating Tm is to use the strong relationship between Tm and Ts (surface air temperature), since Ts can be obtained from either surface observations at GPS stations or synoptic reports at nearby weather stations. The RMS error in the Tm estimated by the Tm-Ts relationship ranges from ∼2 K in the UK [Davies and Watson, 1998] and Taiwan [Liou et al., 2001] to ∼5 K for Arctic air masses [Bokoye et al., 2003]. On the basis of equation (4), the 2 K and 5 K RMS errors in Tm result in an uncertainty of ∼0.73% and 1.83% in PW for Tm within 240 K to 300 K, respectively. However, the application of the Tm-Ts relationship is hampered because the relationship varies with space, time and weather conditions. The most commonly used Tm-Ts relationship is from Bevis et al. [1992], which was derived from radiosonde data at 13 U.S. sites over a 2-year period and gives an RMS error of ∼4.74 K. Ross and Rosenfeld [1997] estimated Tm using radiosonde data from 53 globally distributed stations. They show that both Tm and Tm-Ts relationships exhibit geographic and seasonal variations, and that only weak correlations exist between Tm and Ts at tropical stations. Unfortunately, a coding error was discovered in this global analysis, so the regression slopes and intercepts have to be corrected by multiplying them by a factor of 1.03443 [Ross and Rosenfeld, 1999]. Other studies developed site-specific Tm-Ts relationships using local radiosonde data, e.g., Liou et al. [2001] for Taiwan, Baltink et al. [2002] for the Netherlands, and Bokoye et al. [2003] for Canada and Alaska. It remains unclear, however, whether the Tm-Ts relationship has diurnal variations. Several studies have shown diurnal variations in differences between GPS-estimated and radiosonde-measured PW [e.g., Van Baelen et al., 2005; S. Gutman, personal communication, 2005]. The common practice of using a diurnally and seasonally independent and geographically invariant Tm-Ts relationship has probably contributed to the discrepancy between GPS and radiosonde PW data. It is impractical to derive site-, time- or even weather-dependent Tm-Ts relationships, as high-resolution soundings of atmospheric temperature and humidity are unavailable at most GPS sites, especially on a global scale.
 Another method for estimating Tm is to use numerical weather prediction (NWP) model output or reanalysis products, which provide the three-dimensional distribution of temperature and humidity every six hours [e.g., Bevis et al., 1994; Hagemann et al., 2003]. This method, especially when based on reanalysis products, is a good option for estimating Tm on a global scale because of its global coverage and 6-hourly availability. To our knowledge, Bevis et al. [1994] is the only study that quantifies the error in Tm derived from NWP models; they show that an RMS relative error of 1% in Tm can be achieved using NWP models. Recent reanalyses of multidecadal atmospheric observations using modern data assimilation techniques have produced three-dimensional atmospheric fields that are usually much improved over NWP outputs. These reanalysis data have been widely used in climate and atmospheric research. However, there have been no published studies that use the reanalysis data to derive Tm and quantify the associated errors.
3. Data and Analysis Method
3.1. Radiosonde Data
 The Integrated Global Radiosonde Archive (IGRA) is a newly released radiosonde data set from NOAA's National Climatic Data Center (NCDC) [Durre et al., 2005]. The data set consists of daily (1–4 times per day) radiosonde observations at more than 1,000 globally distributed stations for the period 1938 to the present. Observations include pressure, temperature, geopotential height, dew point depression, wind direction, and wind speed at surface, tropopause, standard pressure and significant levels. The data set was created by merging data from 11 different sources and a suite of quality control procedures was applied during the process.
 We calculated Tm using equation (1) and temperature and humidity profiles for 1997–2002 from the IGRA data set. These Tm values were considered as the truth for evaluating the Tm derived from reanalysis data. The radiosonde humidity data are less frequent and less accurate in cold and high-altitude conditions because of the poor performance of humidity sensors at cold temperatures and the temperature cutoff for humidity reports applied by many countries [e.g., Wang et al., 2003]. The sensitivity of Tm to the vertical resolution of the radiosonde data and to the top of the sounding profile (minimum pressure) was studied using the high-resolution (6s) radiosonde data at Fairbanks and Miami stations and the sounding profiles obtained from NCAR [Shea et al., 1994], respectively. Tm calculated from data at standard pressure levels differs from that calculated at all levels (6s) by less than 0.3 K. The sensitivity analysis showed that Tm is very sensitive to temperature and humidity data at the surface and within the boundary layer. Therefore we require that each IGRA sounding has data (both temperature and humidity) available at the surface and at least five (four) standard pressure levels above the surface for stations below (above) 1000 hPa. Note that we do not terminate the Tm integrals in (1) at 500 hPa, which is different from Ross and Rosenfeld [1997]. We found that missing data above 500 hPa introduces a warm bias of ∼1.2 K in Tm, which is consistent with the warm bias of 1.5 K from Ross and Rosenfeld [1997]. The global mean top of the soundings after excluding missing data is ∼221 hPa.
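As an illustration of how equation (1) might be evaluated from a single sounding, the sketch below integrates Pv/T and Pv/T² over height with the trapezoidal rule. It is not the authors' implementation: the Magnus-type vapor pressure formula, the trapezoidal discretization and the function names are assumptions made here for the example.

```python
import math


def vapor_pressure_hpa(t_k, dew_point_depression_k):
    """Vapor pressure (hPa) from temperature and dew point depression.

    Uses a Magnus-type saturation formula over water evaluated at the dew
    point (an assumed choice; the text does not specify the formula).
    """
    td_c = (t_k - dew_point_depression_k) - 273.15
    return 6.112 * math.exp(17.62 * td_c / (243.12 + td_c))


def weighted_mean_temperature(z_m, t_k, e_hpa):
    """Tm = integral(Pv/T dz) / integral(Pv/T^2 dz), trapezoidal rule.

    z_m, t_k and e_hpa are equal-length sequences ordered from the surface
    upward: height (m), temperature (K) and vapor pressure (hPa).
    """
    num = 0.0
    den = 0.0
    for i in range(len(z_m) - 1):
        dz = z_m[i + 1] - z_m[i]
        num += 0.5 * (e_hpa[i] / t_k[i] + e_hpa[i + 1] / t_k[i + 1]) * dz
        den += 0.5 * (e_hpa[i] / t_k[i] ** 2 + e_hpa[i + 1] / t_k[i + 1] ** 2) * dz
    return num / den


if __name__ == "__main__":
    # Idealized profile: 6.5 K/km cooling and a fixed 5 K dew point depression.
    z = [0.0, 1000.0, 2000.0, 4000.0, 6000.0, 9000.0]
    t = [288.0 - 6.5 * h / 1000.0 for h in z]
    e = [vapor_pressure_hpa(tk, 5.0) for tk in t]
    print(f"Tm = {weighted_mean_temperature(z, t, e):.1f} K")
```

Because vapor pressure falls off rapidly with height, the lowest levels dominate both integrals, which is in line with the sensitivity to surface and boundary layer data noted above.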
3.2. ERA-40 and NCEP/NCAR Reanalysis Data
 The European Centre for Medium-Range Weather Forecasts (ECMWF) 40-year reanalysis (ERA-40) from 1957 to 2002 is based on the ECMWF three-dimensional variational assimilation system and makes use of both conventional and satellite observations [Uppala et al., 2005]. In this study we use the 6-year (1997–2002) ERA-40 full resolution analysis data set obtained from the NCAR Scientific Computation Division (SCD) data archive (http://dss.ucar.edu/pub/era40/). The data set has a spectral resolution of TL159 (equivalent to a 1.125° × 1.125° grid) and 60 hybrid vertical levels, and is available at 0000, 0600, 1200 and 1800 UTC each day. The geopotential height, temperature and specific humidity profiles from ERA-40 and equation (1) are used to calculate Tm. The ECMWF topography data set (available on a 1.125° × 1.125° grid) is also employed in this study for the vertical interpolation between the model and station surface heights (see section 3.3 for details).
 The National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) global reanalysis products are available from 1948 to present at 6 hour intervals [Kalnay et al., 1996]. The resolution of this reanalysis is T62 (∼1.875° × 1.875°) with 28 hybrid vertical levels. The 6-hourly geopotential height, temperature and relative humidity profiles from 1997 to 2002 are used to calculate Tm.
3.3. Comparison Methods
 Tm comparisons are made by matching two data sets both in space and time through interpolation and extrapolation. ERA-40 and NCEP/NCAR reanalysis (NNR) have the same temporal resolution (6-hourly) but different horizontal resolutions. NNR data on T62 (192 × 96) Gaussian grids are interpolated to ERA-40's TL159 (320 × 160) grids for comparisons between them. The comparisons between the reanalysis and IGRA data are more complicated and are described below.
 Three steps are taken to match reanalysis (gridded and 6-hourly) and IGRA data (point and 1–4 times per day). Figure 1 illustrates the procedure for the comparisons. First, the reanalysis data within three hours of the radiosonde launch time were chosen to compare with the IGRA soundings. Second, the reanalysis temperature and humidity data were vertically extrapolated from the model surface height (hm) to the IGRA station height (hs) if hs is lower than hm. The ERA-40 surface heights in general agree with those from NNR, with differences less than 100 m except in Antarctica and the Himalayas; a surface height difference of 100 m induces a Tm difference of less than 0.5 K. Therefore no vertical extrapolation was applied for comparisons between the two reanalysis data sets. The surface height differences between ERA-40 and IGRA range from 0 to 1.2 km with a mean of ∼125 m, and 75% of the stations have surface heights within 200 m of model surface heights. The vertical extrapolation is described in detail in the next paragraph. Note that if hs ≥ hm, Tm is directly calculated by integrating reanalysis data from hs to the top of the profiles in equation (1). Finally, the Tm calculated at each reanalysis grid box was linearly interpolated using four nearby grid values to the IGRA station location. In summary, the Tm comparisons between reanalysis and IGRA were conducted at radiosonde launch times and locations. This procedure is akin to estimating Tm at a given GPS station site.
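The time matching and horizontal interpolation steps can be sketched as follows. This is an illustrative example only: it assumes plain bilinear weighting of the four surrounding grid values and a nearest-analysis-time rule, and the function names are hypothetical rather than taken from the study.

```python
def nearest_analysis_hour(launch_hour_utc):
    """Return the 6-hourly analysis time (0, 6, 12 or 18 UTC) nearest to a launch.

    Every launch time lies within three hours of the returned analysis time;
    an exact tie (e.g. 03 UTC) is assigned to the later analysis.
    """
    return int((launch_hour_utc + 3.0) // 6.0) * 6 % 24


def bilinear(x, y, x0, x1, y0, y1, v00, v10, v01, v11):
    """Bilinear interpolation of four grid values to the point (x, y).

    v00 is the value at (x0, y0), v10 at (x1, y0), v01 at (x0, y1) and
    v11 at (x1, y1); x is longitude and y is latitude in this context.
    """
    fx = (x - x0) / (x1 - x0)
    fy = (y - y0) / (y1 - y0)
    bottom = v00 * (1.0 - fx) + v10 * fx
    top = v01 * (1.0 - fx) + v11 * fx
    return bottom * (1.0 - fy) + top * fy


if __name__ == "__main__":
    # Station at the centre of a 1.125 degree grid cell, launch at 11:30 UTC.
    tm_station = bilinear(10.5625, 45.5625, 10.0, 11.125, 45.0, 46.125,
                          280.0, 282.0, 284.0, 286.0)
    print(nearest_analysis_hour(11.5), round(tm_station, 2))
```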
 If hs < hm, the reanalysis temperatures at hs over the four grid boxes surrounding a radiosonde station were derived at 50 m resolution by using the mean lapse rate (Γm) for the lowest three model layers and temperatures at hm (see Figure 1). The calculated lapse rate is sometimes smaller than −10 K/km or positive, which could result in erroneous T(hs). Therefore the typical lapse rate for moist adiabatic conditions, −6.5 K/km, is used in these two cases; it is a good approximation of the tropospheric mean lapse rate, which our calculation and that of Yang and Smith found to be approximately −5 to −6 K/km from 70°S to 70°N. The reanalysis vapor pressure profile between hs and hm was calculated from the temperature profile by assuming constant relative humidity (the mean of the lowest two model layers), following Hagemann et al. [2003]. Finally, the Tm for the four surrounding grid boxes was calculated by integrating equation (1) from hs to the top of the model layers.
Figure 2 illustrates the effects of the vertical extrapolation on the Tm difference (ΔTm) between ERA-40 and IGRA. Without any adjustment, ERA-40 Tm is colder than IGRA Tm over most areas, and by >2 K over high terrain (Figure 2a), a pattern that corresponds well with the surface height differences (Figure 2c). After the ERA-40 profiles were vertically extrapolated to IGRA station heights, ERA-40 Tm was recomputed; it agrees with IGRA Tm very well at all stations except for several stations with ∣ΔTm∣ > 2 K (Figure 2b). The impact of horizontal interpolation is small (not shown). In the comparisons discussed below, the reanalysis Tm has been adjusted on the basis of the procedures shown in Figure 1.
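A minimal sketch of the below-model-surface extrapolation described above is given here. It follows the stated rules (50 m steps, fallback to −6.5 K/km when the computed lapse rate is below −10 K/km or positive, constant relative humidity), but the Magnus-type saturation formula and all names are assumptions made for this example.

```python
import math

DEFAULT_LAPSE_K_PER_KM = -6.5  # fallback dT/dz used for unrealistic lapse rates


def saturation_vapor_pressure_hpa(t_k):
    """Saturation vapor pressure over water (hPa), Magnus-type formula (assumed)."""
    t_c = t_k - 273.15
    return 6.112 * math.exp(17.62 * t_c / (243.12 + t_c))


def extrapolate_to_station(h_s, h_m, t_at_hm, rel_humidity, lapse_k_per_km,
                           step_m=50.0):
    """Extend a model profile from the model surface h_m down to a station at h_s.

    lapse_k_per_km is dT/dz from the lowest model layers (negative when
    temperature decreases with height). Returns a list of
    (height_m, temperature_K, vapor_pressure_hPa) from h_s up to h_m.
    """
    if lapse_k_per_km < -10.0 or lapse_k_per_km > 0.0:
        lapse_k_per_km = DEFAULT_LAPSE_K_PER_KM
    n_steps = int(math.ceil((h_m - h_s) / step_m))
    levels = []
    for k in range(n_steps + 1):
        z = min(h_s + k * step_m, h_m)
        t = t_at_hm + lapse_k_per_km * (z - h_m) / 1000.0
        e = rel_humidity * saturation_vapor_pressure_hpa(t)  # constant RH
        levels.append((z, t, e))
    return levels


if __name__ == "__main__":
    profile = extrapolate_to_station(h_s=200.0, h_m=600.0, t_at_hm=280.0,
                                     rel_humidity=0.7, lapse_k_per_km=2.0)
    print(profile[0])  # station level, computed with the fallback lapse rate
```

The returned levels can then be prepended to the model profile before integrating equation (1) from hs upward.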
4. Tm Comparisons
Table 1 summarizes the global statistics of Tm comparisons among the three data sets using 6 years (1997–2002) of data. Note that the Tm difference (ΔTm) was computed at individual sounding or reanalysis times, and the mean and its standard deviation (SD) of the individual ΔTm were computed for each month using the 6-year data, e.g., 6 months of January from 1997–2002 were used to compute the January mean and SD of the ΔTm. The global mean number of samples or data points over the 6-year period for each month and location is shown in Table 1.
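The monthly pooling described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: it assumes the sample standard deviation, takes the relative difference with respect to the reference Tm, and uses a 2 K threshold for the exceedance percentage.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def monthly_difference_stats(samples, threshold_k=2.0):
    """Pool Tm differences by calendar month across all years.

    samples is an iterable of (month, tm_test, tm_reference) in K, one entry
    per matched sounding or analysis time. Returns a dict keyed by month.
    """
    by_month = defaultdict(list)
    for month, tm_test, tm_ref in samples:
        by_month[month].append((tm_test - tm_ref, tm_ref))
    stats = {}
    for month, pairs in by_month.items():
        diffs = [d for d, _ in pairs]
        rel_sq = [(d / ref) ** 2 for d, ref in pairs]
        stats[month] = {
            "n": len(diffs),
            "mean": mean(diffs),
            "sd": stdev(diffs) if len(diffs) > 1 else 0.0,
            "rms_rel_pct": 100.0 * math.sqrt(mean(rel_sq)),
            "pct_above": 100.0 * sum(abs(d) > threshold_k for d in diffs) / len(diffs),
        }
    return stats


if __name__ == "__main__":
    january = [(1, 271.0, 270.0), (1, 269.0, 270.0),
               (1, 273.0, 270.0), (1, 267.0, 270.0)]
    print(monthly_difference_stats(january)[1])
```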
Table 1. Global Statistics of Tm Comparisons Derived From ERA-40, NNR and IGRA Using 6 Years (1997–2002) of Data
Comparisons (columns): ERA-40 − NNR; ERA-40 − IGRA; NNR − IGRA
Statistics (rows): mean total number of data points for each month (N) at each location; global, annual average of RMS relative differences of ΔTm (%); maximum of annual average of RMS relative differences of ΔTm (%); percentage of data points with ∣ΔTm∣ > 2 K; percentage of grid boxes/stations with 6-year annual mean ∣ΔTm∣ > 2 K
Note: The standard deviation (SD) was calculated using all data points (N) for each month combining all the years and then averaged over the months and locations.
 On a global and annual basis, the mean Tm differences among the data sets are negligible (<0.23 K) with a mean SD of less than 1.3 K. NNR Tm tends to be warmer by ∼0.2 K than the Tm from both IGRA and ERA-40. The RMS relative difference of Tm between ERA-40/NNR and IGRA is less than 0.5% on average and reaches maxima of 1.37% and 1.84% for ERA-40 and NNR, respectively. Hence the relative errors in PW (cf. equation (4)) induced by Tm derived from reanalysis temperature and humidity profiles (with the vertical extrapolation and horizontal interpolation shown in Figure 1) are likely to be around 0.5%. An absolute error of ≤2 K for Tm may be considered acceptable in order to achieve 1% accuracy in PW based on equation (4). The percentage of matched data points with annual mean ∣ΔTm∣ > 2 K is less than 16% for NNR and only ∼10% for ERA-40 compared with IGRA. At nearly 99% of the total ∼900 IGRA stations, ERA-40 and IGRA have 6-year annual mean Tm within 2 K, while this number decreases to 96.5% for the NNR vs. IGRA comparison. These global statistics suggest that, while both are acceptable, Tm based on ERA-40 is better than that based on the NCEP/NCAR reanalysis, which is consistent with a recent assessment of surface air temperatures in the reanalyses [Simmons et al., 2004].
 Tm differences between ERA-40 and NNR are within 2 K over most of the globe except in high-elevation regions such as Antarctica and the Himalayas, which is mainly due to discrepancies in the model surface heights used by the two data sets. As shown in section 3, the vertical extrapolation illustrated in Figure 1 to account for model surface height differences would significantly reduce absolute values of ΔTm, but it was not implemented in this comparison.
 Global maps of annual and seasonal mean Tm differences between ERA-40 and IGRA are shown in Figure 3. The differences are insignificant (<2 K) at most of the stations for all seasons, except for Siberia and the Himalayas in winter, where ERA-40 Tm is warmer than IGRA Tm by more than 2 K. The RMS relative differences are less than 1% except for a few mountain stations where they reach 1.5%. The comparison between NNR and IGRA reveals similar spatial features but larger magnitudes, confirming that ERA-40 Tm is generally better than NNR Tm.
5. Spatial and Temporal Variations of Tm
 Geographic variations of annual mean Tm from ERA-40 and IGRA are shown in Figure 4 (NNR Tm has patterns similar to ERA-40 Tm). Annual mean Tm has large latitudinal variations, ranging from ∼225 K over Antarctica to ∼295 K over the equatorial oceans, while longitudinal variations are small. As expected, Tm is colder over high terrain, such as the Himalayas, Rocky Mountains, Andes and Antarctica, than over the surrounding regions. The reanalysis-based Tm reveals more detailed features than the IGRA radiosonde data set and Ross and Rosenfeld [1997]. In general, the Tm distribution follows that of surface air temperature [Peixoto and Oort, 1992].
 The amplitude of the annual cycle of monthly Tm is shown in Figure 5. Tm seasonal variations are relatively small in the tropics (<4 K) and extratropical oceans (<10 K). As expected, the Tm annual cycle is larger over land than over ocean, and larger in the Northern Hemisphere (NH) than in the Southern Hemisphere (SH), where ocean predominates. The annual range of Tm exceeds 15 K over land in the NH and has maximum values over Siberia (30–35 K) and Northern Canada (25–30 K). The geographic distribution of Tm annual amplitude is similar to that of surface air temperature [Peixoto and Oort, 1992, Figure 7.4], but with smaller magnitudes.
 The standard deviation of daily mean Tm for individual months, a good measure of day-to-day variability, was averaged from 1997 to 2002 and is shown for January and July in Figure 6 for ERA-40 Tm. Large variations (SD ≈ 4–7 K) exist over the northern mid- and high-latitude continents in winter, associated with vigorous cyclonic activity, while the tropics have small variations (<2 K), consistent with the annual amplitude patterns (cf. Figure 5) but with smaller magnitudes. Summer Tm has much smaller variations than winter Tm in the NH, while the seasonal difference is relatively small in the SH. The NNR Tm shows SD patterns similar to the ERA-40 Tm, but with stronger seasonal contrasts.
 The mean amplitude and phase (local solar time (LST) of maximum) of the Tm diurnal cycle calculated from 6-hourly reanalysis data are shown in Figure 7 for June-August (JJA). The mean-to-peak amplitude is smaller than 1 K over most of the globe except over high terrain. The amplitude is larger over land than over ocean and generally larger in summer than in winter. The diurnal cycle over land peaks in the afternoon (∼1400–1800 LST); the small amplitude over oceans makes the estimates of phase unreliable, although the cycle generally peaks in the afternoon. NNR has a stronger diurnal cycle in Tm than ERA-40. The Tm diurnal cycle has a phase similar to that of air temperature at the surface but much smaller amplitudes (surface amplitudes are ∼1–6 K over land and 0.3–0.8 K over oceans; see Figure 10 and Dai and Trenberth [2004]), and its amplitudes are slightly larger than those at 850 hPa [Seidel et al., 2005]. Although the mean Tm diurnal variations are relatively small and may be ignored over the oceans, Tm diurnal variations for individual days can be much larger (e.g., with peak-to-peak amplitude >3 K), especially over high terrain. The small Tm diurnal variation in comparison to Ts is due to (1) the larger contributions to Tm of temperature profiles above surface temperature inversion layers at night and (2) the drastic decrease in the amplitude of the temperature diurnal cycle from the surface (1–4 K) to 850 hPa (<1 K) [Seidel et al., 2005].
 The Tm diurnal variation shown in Figure 7 cannot be validated by the IGRA data set since it only has data available at 00 and 12 UTC at most stations. Thus we evaluate Tm diurnal variations using 3-hourly radiosonde data collected at the Atmospheric Radiation Measurement (ARM) Program Southern Great Plains central facility near Norman, OK in the period 1994–2000. Figure 8 depicts comparisons of 7-year averaged Tm diurnal anomalies (departures from daily mean values) at Norman in JJA calculated from radiosonde data, ERA-40 and NNR. There is very good agreement in both amplitude and phase of the Tm diurnal cycle among radiosonde data, ERA-40 and NNR (Figure 8).
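One way to obtain a mean-to-peak amplitude and a local time of maximum from four 6-hourly values is to fit the 24-hour harmonic exactly. The text does not state how the amplitude and phase were computed, so the sketch below is an assumed approach with hypothetical names, not a description of the authors' method.

```python
import math


def diurnal_harmonic(x00, x06, x12, x18, lon_deg_east):
    """Amplitude and local solar time of maximum of the 24-hour harmonic.

    Fits x(t) = mean + a*cos(w*t) + b*sin(w*t), w = 2*pi/24 h, to values at
    00, 06, 12 and 18 UTC. Returns (mean-to-peak amplitude, LST of maximum
    in hours). The semidiurnal component cannot be resolved with four
    samples per day.
    """
    a = (x00 - x12) / 2.0
    b = (x06 - x18) / 2.0
    amplitude = math.hypot(a, b)
    t_max_utc = (math.atan2(b, a) * 24.0 / (2.0 * math.pi)) % 24.0
    t_max_lst = (t_max_utc + lon_deg_east / 15.0) % 24.0
    return amplitude, t_max_lst


if __name__ == "__main__":
    # Synthetic Tm with a 1 K amplitude peaking at 16 UTC, sampled 6-hourly.
    w = 2.0 * math.pi / 24.0
    values = [270.0 + math.cos(w * (t - 16.0)) for t in (0.0, 6.0, 12.0, 18.0)]
    print(diurnal_harmonic(*values, lon_deg_east=0.0))
```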
6. Evaluation of Tm-Ts Relationship
 The commonly used method for estimating Tm is to use the relationship between Tm and Ts because of the better availability of Ts data. As discussed in section 2, however, site-, time- and even weather-dependent Tm-Ts relationships are necessary for reliable Tm estimates, but such Tm-Ts relationships are unavailable on a global scale. The empirical Tm-Ts relationship from Bevis et al. [1992], Tm = 70.2 + 0.72*Ts, has been widely used, so it deserves validation, especially over regions beyond the contiguous United States. Ross and Rosenfeld [1997] used twice-daily radiosonde data at 53 globally distributed stations to evaluate the Bevis Tm-Ts relationship and found relative errors of ∼1–5% in Tm. However, the radiosonde data used by Ross and Rosenfeld [1997] were insufficient for studying the diurnal variation and have poor coverage over the oceans, the Southern Hemisphere, and mountainous and polar regions. Here we use ERA-40 data, which have been shown above to be quite accurate for global Tm estimates (see section 4), to validate Tm estimated from the Bevis Tm-Ts relationship. We focus on global and large-scale patterns and diurnal variations. Tm calculated using ERA-40 surface temperature and the Bevis Tm-Ts relationship (referred to as Tmb) is compared with that calculated using ERA-40 temperature and humidity profiles (Tme), which is considered as the reference for this evaluation.
 Global maps of annual, January and July mean Tmb − Tme differences averaged from 1997 to 2002 are shown in Figure 9. The mean bias in Tmb has a range of ±10 K, which corresponds to a relative bias of ±3.5%. Tmb has a cold bias of 1–6 K in the tropics and subtropics; this cold bias tends to be larger and more widespread in July than in January in both hemispheres. The maximum bias occurs in marine stratus cloud regions, which is likely due to the low-level temperature inversion commonly associated with stratus clouds [Klein and Hartmann, 1993] that could invalidate the Bevis Tm-Ts relationship. Baltink et al. [2002] and Bokoye et al. [2003] also found poor Tm-Ts relationships in the presence of temperature inversions. The seasonal contrast over marine stratus regions is more pronounced than in other regions because of the large amounts of marine stratus clouds during boreal summer [Klein and Hartmann, 1993]. Tmb generally has a warm bias at middle and high latitudes, with the largest magnitude over mountainous regions. The RMS relative error in Tmb varies from 1% to ∼4% globally and has the same geographic and seasonal variations as in Figure 9, which suggests that the RMS error is dominated by the mean bias shown in Figure 9 rather than by the random error. After removing the mean bias, the relative random error is less than 1% over most of the globe and varies from 1% to 2% in mountainous and polar regions and over the Sahara desert.
 The mean diurnal amplitudes of Tmb, Tme and ERA-40 Ts averaged over 1997–2002 are shown in Figure 10. Tmb shows large diurnal variations (mean-to-peak amplitude ≈ 1–5 K) over land and has the same spatial patterns as Ts because of the diurnally invariant Tm-Ts relationship. In contrast, the diurnal amplitude of Tme is less than 1.5 K. The artificial diurnal cycle in Tmb introduces diurnally varying biases, which are shown as annual mean diurnal anomalies (departures from daily mean values) of Tmb − Tme at 00, 06, 12, 18 UTC in Figure 11. It is not surprising that a warm bias of 1–4 K (relative to daily mean) in the afternoon (12–18 LST) and a cold bias of ∼1–4 K in the early morning (00–06 LST) occur over land, with the largest biases over the Sahara desert.
Figure 8 shows the mean diurnal curves of Ts, Tmb and Tm calculated from 3-hourly ARM sounding data (referred to as Tmr) at Lamont in JJA. The diurnal cycles of Tmb and Ts are very similar, and both are much stronger than that of Tmr. Tmb is within 1 K of Tmr during daytime, but has a cold bias of ∼3–6 K at night (20–23 LST) and in the early morning (2–6 LST), which would result in a dry bias of ∼1–2% in PW. This cold bias in Tm is likely to be the main contributor to the smaller PW derived from GPS ZPD data than from radiosonde data at night found by several studies (e.g., S. Gutman, personal communication, 2005). Figure 8 suggests that the Bevis Tm-Ts relationship is more closely representative of daytime conditions at Norman. This is confirmed by the Tm-Ts relationship determined from the ARM data in Figure 12. At Norman, the Bevis Tm-Ts regression line is a good fit to the daytime data, but has a systematic offset from the nighttime data and introduces significant cold biases at night (Figure 12), which is similar to what Van Baelen et al. found in Toulouse, France. The analysis of data in DJF (December–January–February) shows results similar to those in JJA.
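Because the relationship Tm = 70.2 + 0.72*Ts is linear and fixed in time, any diurnal cycle in Ts is passed to Tmb scaled by the slope 0.72. The short sketch below illustrates this with made-up Ts values; the function names are hypothetical and the numbers are not taken from ERA-40.

```python
def bevis_tm(ts_k):
    """Tm (K) from surface air temperature Ts (K): Tm = 70.2 + 0.72*Ts."""
    return 70.2 + 0.72 * ts_k


def tmb_diurnal_anomalies(ts_samples_k):
    """Departures of Tmb from its daily mean for Ts samples spanning one day."""
    tmb = [bevis_tm(ts) for ts in ts_samples_k]
    daily_mean = sum(tmb) / len(tmb)
    return [value - daily_mean for value in tmb]


if __name__ == "__main__":
    # Illustrative land point: Ts at four 6-hourly times, mean-to-peak amplitude 5 K.
    ts = [290.0, 295.0, 300.0, 295.0]
    print([round(a, 2) for a in tmb_diurnal_anomalies(ts)])
```

With a 5 K mean-to-peak amplitude in Ts, the implied Tmb amplitude is 0.72 × 5 K = 3.6 K, which falls within the 1–5 K range reported for Tmb over land.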
7. Summary and Recommendation
 The water-vapor-weighted mean temperature of the atmosphere, Tm, was estimated using temperature and humidity profile data from the ERA-40 and NCEP/NCAR (NNR) reanalyses and a new global radiosonde data set (IGRA), which was regarded as the truth for evaluating the reanalysis-based Tm. Global estimates of Tm from the three data sets were intercompared by matching them in space and time. We found that temperature and humidity profiles from both the ERA-40 and NNR data sets produce reasonable Tm estimates. However, for the following reasons, we conclude that the ERA-40 is a better choice for global Tm estimation:
 1. Both ERA-40 and NNR data have good agreements with IGRA in geographic and seasonal variability, but the difference between NNR and IGRA is slightly larger than that between ERA-40 and IGRA.
 2. ERA-40 data have higher spatial resolution (1.125° × 1.125°, 60 vertical levels) than NNR (1.875° × 1.875°, 28 vertical levels), which is an advantage for producing local Tm for individual GPS sites.
 3. Simmons et al. [2004] found better quality of surface air temperature from the late 1970s onward in ERA-40 than in NNR. It should be kept in mind that Tm strongly depends on surface air temperature.
 Tm increases from <250 K in the polar regions to 290–300 K in the tropics, with small longitudinal variations except over mountain regions where it has lower values. Tm has an annual range of about 2–4 K in the tropics and 20–35 K over much of Eurasia and northern North America. The day-to-day Tm variations are ∼1–3 K over most low latitudes and ∼4–7 K (3–4 K) in winter (summer) Northern Hemispheric land areas. Diurnal variations of Tm are generally small, with mean-to-peak amplitudes less than 0.5 K over most oceans and 0.5–1.5 K over most land areas and a local time of maximum around 16–20 LST. The Tm diurnal cycle is much smaller than the surface air temperature one [Dai and Trenberth, 2004] but larger than the 850 hPa temperature one [Seidel et al., 2005]. Thus, on average the diurnal variations in Tm can only introduce less than 1% errors in GPS-derived PW and may be ignored in many applications, especially over oceans.
 The Tm-Ts relationship from Bevis et al.  has been widely used to derive Tm from Ts, but it has limitations. We found that monthly and annual mean Tm derived from the Bevis Tm-Ts relationship (Tmb) has a cold bias in the tropics and subtropics (−1 ∼ −6 K, largest over subtropical marine stratus cloud regions), and a warm bias in the middle and high latitudes (2–5 K, largest over mountainous regions). Random errors in Tmb are smaller than the mean biases. Another serious problem in Tmb is its erroneous large diurnal cycle resulting from a large Ts diurnal cycle and diurnally invariant Tm-Ts relationship. This artificial Tm diurnal cycle can induce a warm bias (1–4 K) in the afternoon (12–18 LST) and a cold bias (−1∼ −4 K) in the early morning (00–06 LST) over land, and thus 1–2% biases in PW. This diurnal bias could be important for diurnal analyses [Dai et al., 2002] and for understanding the diurnal differences of PW derived from GPS and radiosonde or other data [e.g., Van Baelen et al., 2005]. We conclude that the most common practice for estimating Tm, i.e., using Tm-Ts relationship, is not suitable for global Tm estimation because of this diurnal bias and the difficulties in obtaining site-, time- and even weather-dependent Tm-Ts relationships.
 Our results show that, in the absence of local profile data of atmospheric temperature and humidity, the best option to estimate Tm is to calculate it using 6-hourly ERA-40 temperature and humidity profile data, adjusted to GPS station heights and observation times using the matching method developed in this study (Figure 1). However, ERA-40 data are only available from 1957 to 2002, and it is not clear now when the data after 2002 will be released. Therefore we might have to use NNR to calculate Tm for the whole period for consistency, since it is available from 1948 to the present and is updated frequently.
 The technique of global Tm estimation presented here will help us derive a global, 2-hourly PW data set from ground-based GPS ZPD data from global IGS stations (∼360). This PW data set will be applied to study the diurnal variations in PW over the globe, to estimate the diurnal sampling errors in twice-daily radiosonde humidity, and to quantify spatial and temporal inhomogeneity and biases in global radiosonde PW data.
 This work is supported by NCAR Director Office's Opportunity Fund and partially by NCAR Water Cycle Across-Scale Initiative. We thank Joey Comeaus (NCAR/SCD) and Dennis Shea (NCAR/CGD) for helping us with ECMWF reanalysis data and Teresa Van Hove (UCAR) for providing the program to convert GPS ellipsoid height to MSL height and other useful discussions. The National Center for Atmospheric Research is sponsored by the National Science Foundation.