A 2-hourly data set of atmospheric precipitable water (PW) has been produced from the zenith path delay (ZPD) derived from ground-based Global Positioning System (GPS) measurements. The PW data are available every 2 hours from 80 to 268 International GNSS Service (IGS, formerly International GPS Service) ground stations from 1997 to 2004. The accuracy of the IGS ZPD product is roughly 4 mm. An analysis technique is developed to convert ZPD to PW on a global scale. Special efforts are made on deriving surface pressure (Ps) and water-vapor-weighted atmospheric mean temperature (Tm), which are two key parameters for converting ZPD to PW. Ps is derived from global, 3-hourly surface synoptic observations with temporal, vertical and horizontal adjustments. Tm is calculated from NCEP/NCAR reanalysis with temporal, vertical and horizontal interpolations. The derived Ps and Tm at the GPS location and height have root-mean-square (rms) errors of 1.65 hPa and 1.3 K, respectively. A theoretical error analysis concludes that typical PW error associated with the errors in ZPD, Tm and Ps is on the order of 1.5 mm. The PW data set is compared with radiosonde, microwave radiometer (MWR) and satellite data. The GPS and radiosonde PW comparisons at 98 stations around the globe show a mean difference of 1.08 mm (drier for radiosonde data) with a standard deviation of differences of 2.68 mm, which corresponds to mean percentage difference and standard deviation of 5.5% and 10.6%, respectively. The bias is primarily due to known dry biases in the Vaisala radiosonde data. The RMS difference between GPS and radiosonde/MWR data ranges from 1.2 mm to 2.83 mm. The latitudinal and seasonal variations of PW derived from the GPS data agree well with those from International Satellite Cloud Climatology Project (ISCCP) data if the ISCCP data are sampled only at grid boxes containing GPS stations.
The large difference between GPS and ISCCP data in the subtropics is interesting, but is not easily explained. The comparisons reveal no systematic bias in GPS PW data and show that an RMS difference of less than 3 mm between GPS-derived PW and other data sets is achieved. The comparison study also illustrates the value of GPS-estimated PW for examining the quality of other data sets, such as those from radiosondes and MWR. Preliminary analysis of this data set shows interesting and significant diurnal variations in PW in four different regions.
 Atmospheric water vapor plays a crucial role in Earth's energy and water cycles through absorbing solar and infrared radiation, releasing latent heat, transporting water, and forming clouds and precipitation. Water vapor is the most abundant and also the most important greenhouse gas in the atmosphere, thus it plays an important role in global climate change. Precipitable water (PW), which is also referred to as total column or integrated water vapor, is the total water vapor contained in an air column from the Earth's surface to the top of the atmosphere. About 45–65% of the PW is included in the surface-850 hPa layer [Ross and Elliott, 1996].
 Global radiosonde data represent an increasingly valuable resource for studies of climate change. Unfortunately, the usefulness of radiosonde data for long-term climate monitoring is limited by errors and biases associated with instrument and data processing procedures and by radiosonde changes among stations and with time. One of the scientific objectives of creating a global PW data set from GPS measurements is to take advantage of the increasing volume and maturity of GPS data and more importantly its long-term stability, and use it to monitor the quality of global radiosonde data and potentially improve the long-term radiosonde climate records.
 There exist substantial diurnal variations in atmospheric water vapor, both column-integrated values (i.e., PW) and vertical profiles [e.g., Dai et al., 2002; Wang et al., 2002a]. The water vapor diurnal variations affect surface and atmospheric longwave radiation and atmospheric absorption of solar radiation. They are closely related to many other processes, such as diurnal variations in moist convection and precipitation [Dai et al., 1999], surface wind convergence [Dai and Deser, 1999] and surface evapotranspiration. The diurnal cycle of water vapor also provides a test bed for many aspects of the physical parameterizations in weather and climate models. Unfortunately, there is a lack of data with high temporal resolution for studying the diurnal cycle of water vapor on the global scale. Therefore, one of the scientific objectives for creating a global water vapor data set using high-temporal-resolution GPS measurements is to analyze the data to document and understand water vapor diurnal variations, and to validate the representation of the water vapor diurnal cycle in climate and weather models.
 The goal of this study is to (1) develop an analysis technique to derive PW using the ZPD derived from the existing ground-based GPS measurements on a global scale; (2) apply this technique to produce a near-global, 2-hourly PW data set; (3) compare the PW data set with other measurements, such as those from radiosondes, microwave radiometers (MWR), and satellites; and (4) use the PW data for various scientific applications, including documenting PW diurnal variations and quantifying time- and space-dependent biases in global radiosonde humidity records. This paper describes the procedure to create the GPS PW data set, shows comparisons with other measurements, and presents a few preliminary results of the scientific applications of the derived PW data set. The various data sets used in this study are described in section 2. In section 3, we detail the analysis technique and the final GPS PW data set along with an error analysis on PW. In section 4, we compare the GPS-derived PW data set with other data sets and briefly mention the scientific applications of the data set. Preliminary results on PW diurnal variations in four regions are presented in section 5. Conclusions and future work are summarized in section 6.
2.1. Global GPS ZPD Data
 The GPS system consists of a constellation of 30 operational satellites orbiting 20,200 kilometers above the Earth. They are evenly distributed in six orbital planes inclined at 55 degrees and complete a full revolution roughly every 12 hours, such that up to 12 satellites are visible from anywhere on the globe at any time. The radio signals transmitted to the ground-based GPS receivers by these satellites include information on timing, satellite navigation and system parameters, which allow real-time high-accuracy timekeeping, positioning, and navigation. The current global IGS network consists of 382 receivers (or stations) as of 10 February 2006 (Figure 1), and provides continuous GPS orbit tracking, as well as other high-quality navigation products in near real time. Note that Figure 1 shows only IGS sites over the continental United States, although much denser GPS networks exist there, including SuomiNet, NOAA/FSL, and other sites that provide real-time atmospheric sensing [Ware et al., 2003]. Likewise, many European countries operate denser networks for their own purposes but have also grouped them within a Europe-wide project for meteorological applications, which was started under the auspices of the EC COST-716 project (http://www.oso.chalmers.se/geo/cost716.html) [e.g., Haase et al., 2001; Huang et al., 2003; Gendt et al., 2004] and is now continued under the EUMETNET E-GVap project (http://egvap.dmi.dk). The IGS network has grown steadily from ∼100 stations in February 1997 to 382 stations in February 2006.
 When traveling from the GPS satellites to the ground-based GPS receivers, the radio (microwave) signals are delayed by the ionosphere and by the neutral atmosphere; the latter delay is usually referred to as the total atmospheric delay or tropospheric delay. The ionospheric delay is frequency-dependent and can be removed using data from dual-frequency GPS receivers. The atmosphere delays the microwave transmissions by slowing the wave propagation and bending the raypath. The delay in signal arrival time can be expressed as an equivalent increase in travel path length. This excess path length, or total atmospheric delay, is given by [Bevis et al., 1992]
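In the notation defined immediately below, and assuming the standard form given by Bevis et al. [1992], equation (1) can be written as follows. The second equality follows from adding and subtracting the geometric length S of the curved path.

```latex
\Delta L \;=\; \int_{L} n(s)\,ds \;-\; G
        \;=\; 10^{-6}\int_{L} N(s)\,ds \;+\; (S - G)
\tag{1}
```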
where n(s) is the refractive index as a function of position s along the curved raypath L, G is the straight-line geometrical path length through the atmosphere (the path that would occur if the atmosphere were replaced by a vacuum), S is the geometrical path length along L, and N(s) = 10^6 (n(s) − 1) is the atmospheric refractivity. The total atmospheric delay is computed from the observed travel time between the GPS receiver and the GPS satellite using GPS software (e.g., the GAMIT, GIPSY, and Bernese packages) and can be partitioned into two parts: the hydrostatic delay, which is mainly a function of the surface pressure at the GPS receiver, and the wet delay, which depends strongly on the total amount of water vapor along the wave trajectory and weakly on the atmospheric temperature [e.g., Davis et al., 1985]. At any given time a GPS receiver can receive signals from 6–12 GPS satellites. The signals follow slant paths depending on the azimuth and elevation of each satellite. Total delay along the zenith, called zenith path delay (ZPD), is estimated by mapping slant delays to zenith-equivalent values with mapping functions [e.g., Niell, 1996]. The ZPD is the sum of the zenith hydrostatic delay (ZHD) and the zenith wet delay (ZWD). The derivation of PW from ZPD is described in detail in section 3.
 The ZPD is one of the IGS data products derived from a subset of the IGS network of the ground-based GPS receivers [Gendt, 1998]. The final IGS ZPD is obtained by combining the ZPD solutions from seven IGS data analysis centers (AC) and has a temporal resolution of 2 hours (Table 1) [Gendt, 1998]. The quality of the combined ZPD is represented by the internal consistency among ACs, and is at the level of 4–6 mm corresponding to ≤1 mm in PW [Gendt, 1998; Byun et al., 2005]. Each AC estimates its own ZPD solution using its own subset of IGS stations, its own strategy and its own final orbits. Two important parameters for the calculation of ZPD are the elevation cutoff angle and the mapping function. Historically, four ACs have used an elevation cutoff angle of 15°, while the three others have used cutoff angles of 7°, 10° or even 20° [Gendt, 1998]. Five ACs implement the Niell mapping function, while two ACs have applied the Lanyi and the Saastamoinen mapping functions [Gendt, 1998]. Most ACs use a piecewise continuous model to estimate the 2-hourly ZPD and take into account AC-dependent biases to avoid jumps caused by missing data. The combined 2-hourly ZPD is the piecewise continuous weighted mean of the ZPD values from all available ACs [Byun et al., 2005]. More than 70% of the stations report ZPD solutions from three or more ACs, thus allowing good quality combined ZPD values. The standard deviation of ZPD in the final product is the weighted combination of submitted standard deviations of the AC solutions and serves as an indicator of overall agreement. In this study, we rejected ZPD data with standard deviations greater than 15 mm. This simple criterion proved very efficient in removing most of the outliers in the ZPD data. Note that data with zero standard deviation pass the test, but might not represent good ZPD values.
The 2-hourly ZPD data are available from 1997 at ∼100 stations each month to 2004 at ∼335 stations each month, are centered at odd UTC hours (0100, 0300, …, 2300 UTC) (Figure 2), and can be downloaded online from three IGS data archive centers with a delay of about 2–4 weeks from real time.
Table 1. Characteristics of Data Sets Used in This Study
 Starting from October 2000, 5-min ZPD data at all IGS sites are also produced using the precise point positioning approach [Byun et al., 2005; Humphreys et al., 2005]. Note that we still use the 2-hourly legacy product for the whole period and for all analyses conducted in this study. The new GPS product is superior to the 2-hourly legacy product, with higher temporal resolution and more stations, and claims higher accuracy [Byun et al., 2005]. The limitations of the legacy ZPD product are summarized by Byun et al. [2005]. For long-term climate applications, there is a particular concern about the lack of consistency over time in the legacy product resulting from occasional changes made by individual ACs in their GPS data handling (e.g., different elevation cutoff angles, revised antenna phase maps) and their ZPD estimation algorithms (e.g., new mapping functions, different constraint schemes on the analysis parameters). To minimize the impacts of such changes upon the corresponding time series, we have applied quality controls on the final PW data (see details in section 3).
2.2. Auxiliary Data
 A series of auxiliary data (Table 1) are used for converting ZPD to PW described in section 3 and comparing with the GPS-PW data in section 4.
 Three-hourly surface synoptic observations of surface air pressure (Ps), temperature (Ts) and other meteorological variables are available from over 15,000 stations around the globe from 1997 to present [Dai and Wang, 1999]. The Ps and Ts data are used to calculate the Ps at GPS stations with temporal, horizontal and vertical interpolations (see section 3). The synoptic data are screened using a range check and an outlier check. The range limits are −80°C to 50°C for Ts and 550 hPa to 1100 hPa for Ps. The range check removes ∼0.012% of data points. The outlier check removes data points outside the range of annual mean plus/minus three standard deviations at each weather station, which excludes ∼0.6% of data points.
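The two screening steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the range limits are those quoted in the text, and the outlier check is assumed to operate on one year of values from a single station.

```python
from statistics import mean, pstdev

# Range limits quoted in the text.
TS_RANGE_C = (-80.0, 50.0)      # surface air temperature, deg C
PS_RANGE_HPA = (550.0, 1100.0)  # surface pressure, hPa


def range_check(values, limits):
    """Keep only the values that fall inside the closed interval `limits`."""
    low, high = limits
    return [v for v in values if low <= v <= high]


def outlier_check(values, n_sigma=3.0):
    """Keep values within mean +/- n_sigma standard deviations.

    `values` is assumed to hold one year of data from one station.
    """
    if len(values) < 2:
        return list(values)
    center = mean(values)
    spread = pstdev(values)
    return [v for v in values if abs(v - center) <= n_sigma * spread]


def screen_pressure(ps_values_hpa):
    """Apply the range check first, then the outlier check."""
    return outlier_check(range_check(ps_values_hpa, PS_RANGE_HPA))
```

Applying the range check first keeps physically impossible values from inflating the standard deviation used in the outlier check.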
 The National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) global reanalysis products are available from 1948 to present at 6 hour intervals and T62 (∼1.875° × 1.875°) horizontal resolution with 28 hybrid vertical levels (Table 1) [Kalnay et al., 1996]. The 6-hourly geopotential height, temperature and relative humidity profiles from the NCEP/NCAR reanalysis are used to calculate Tm with some adjustments (see section 3).
 The Integrated Global Radiosonde Archive (IGRA) is a newly released radiosonde data set from NOAA's National Climatic Data Center (NCDC) (Table 1) [Durre et al., 2006]. The data set consists of 1–4 radiosonde observations per day at more than 1,000 globally distributed stations for the period 1938 to present [Wang et al., 2005]. The PW data from the GPS and radiosonde are compared at colocated stations in section 4.
 The GPS-derived PW data are also compared with the PW from microwave radiometers (MWR) at Darwin, Australia, Toulouse, France, and Onsala, Sweden (see Table 1 for details). The MWR PW data at Darwin were collected at the Atmospheric Radiation Measurement (ARM) Darwin site, which is located about 53 km northwest of the Darwin GPS station (DARW). A two-channel (23.8 GHz and 31.4 GHz) MWR has been operating at the ARM Darwin site since April of 2002. The 2-month (22 August to 21 October 2002) MWR data in Toulouse, France are from Van Baelen et al. , where a Radiometrics Co. 12-channel microwave radiometer profiler was operated. A MWR collocated with the GPS receiver at Onsala measures the sky emission at 21.0 GHz and 31.4 GHz and has been in operation in a continuous sky-scanning mode since 1993 [Elgered and Jarlemark, 1998]. We obtained the equivalent zenith wet delay data from this MWR from G. Elgered (personal communication, 2005) and converted them to PW using Emardson et al. [1998, equation (3)].
 Eight-year (1997–2004) monthly mean PW data from the International Satellite Cloud Climatology Project (ISCCP) [Rossow and Schiffer, 1999] were downloaded from http://isccp.giss.nasa.gov/products/browsed2.html. The ISCCP water vapor data set is produced from operational TOVS (TIROS Operational Vertical Sounder) products with relayering, regridding and filling with a climatological PW data set [Zhang et al., 2004]. Zhang et al. [2004] show that the global mean PW difference between ISCCP and other data sets is less than 3 mm over land and up to 3.5 mm over ocean. The latitudinal and seasonal variations of PW from GPS are compared with those from the ISCCP data set.
3. Analysis Technique and GPS PW Data Set
3.1. Analysis Technique
 The ZPD can be partitioned into two parts, the ZHD and ZWD. The ZHD can be estimated from the surface pressure with good accuracy [Saastamoinen, 1972; Askne and Nordius, 1987; Elgered et al., 1991]. The ZWD can be obtained by subtracting ZHD from ZPD. Since the ZWD is a function of atmospheric water vapor and temperature, this allows PW to be calculated if the water vapor pressure-weighted mean temperature of the atmosphere (Tm) can be estimated [Elgered et al., 1991; Bevis et al., 1992, 1994]. We follow Bevis et al. [1992, 1994] for deriving PW from ZPD. In summary, Ps and Tm are required to convert ZPD to PW. The analysis technique is summarized in Figure 3. Ps and Tm calculations are described in detail below. Gutman et al.  presented a method to derive Ps from surface synoptic observations and Tm from a mesoscale numerical model output, which is similar to a certain extent to our method explained below, but has some fundamental differences.
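The conversion chain can be illustrated with a short sketch. The paper's own equations are in Figure 3 and are not reproduced here, so the formulas below are assumptions: the Saastamoinen-type hydrostatic delay in the form used by Elgered et al. [1991], and the dimensionless factor Π with refractivity constants as commonly quoted from Bevis et al. [1994]. Function and constant names are hypothetical.

```python
import math

RHO_W = 1000.0    # density of liquid water, kg m^-3
R_V = 461.5       # specific gas constant of water vapor, J kg^-1 K^-1
K2_PRIME = 0.221  # K Pa^-1 (22.1 K hPa^-1), assumed value
K3 = 3739.0       # K^2 Pa^-1 (3.739e5 K^2 hPa^-1), assumed value


def zenith_hydrostatic_delay_mm(ps_hpa, lat_deg, height_km):
    """ZHD in mm from surface pressure, latitude and station height."""
    f = (1.0
         - 0.00266 * math.cos(2.0 * math.radians(lat_deg))
         - 0.00028 * height_km)
    return 2.2768 * ps_hpa / f


def pi_factor(tm_k):
    """Dimensionless factor Pi that maps ZWD to PW (about 0.15)."""
    return 1.0e6 / (RHO_W * R_V * (K3 / tm_k + K2_PRIME))


def zpd_to_pw_mm(zpd_mm, ps_hpa, tm_k, lat_deg, height_km):
    """PW in mm: subtract ZHD from ZPD to get ZWD, then scale by Pi."""
    zwd_mm = zpd_mm - zenith_hydrostatic_delay_mm(ps_hpa, lat_deg, height_km)
    return pi_factor(tm_k) * zwd_mm
```

For example, with ZPD = 2400 mm, Ps = 1000 hPa, Tm = 270 K at 45° latitude and sea level, the sketch gives a ZWD of about 123 mm and a PW of about 19 mm.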
 Only about 70 IGS stations provide surface meteorological data including Ps. In addition, on the basis of our evaluation, the GPS surface meteorological data are very noisy and cannot be used without careful examination and quality control [Wang et al., 2006]. Therefore 2-hourly Ps at the GPS station was derived from the 3-hourly surface synoptic observations through spatial and temporal interpolation. The steps for such calculation are listed in Figure 4 and summarized below:
 1. For a given GPS station, a 50-km radius is drawn to select nearby synoptic stations. Gutman et al.  concluded that the synoptic stations within 50 km of a GPS station can be used to derive Ps at the GPS site with about 0.5 hPa bias.
 2. For each of these stations, the hydrostatic and ideal gas equations are used to adjust Ps from the synoptic station height (hs), referred to as Ps(hs), to that at the GPS station height (hg), referred to as Ps(hg) in Figure 4. The temperature profile from hs to hg is constructed by assuming a typical lapse rate for moist adiabatic conditions, −6.5 K/km. As discussed by Wang et al. , the −6.5 K/km lapse rate is a good approximation of the tropospheric mean lapse rate.
 3. The adjusted pressure at the ith selected station, Psi(hg), is averaged over all selected stations, using the inverse of each station's distance to the GPS station as the weight, to obtain Ps at the GPS station height and location.
 4. The averaged Ps is linearly interpolated in time to the 2-hourly resolution at 0100, 0300, 0500, …, 2300 UTC.
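The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the height adjustment uses the closed-form solution of the hydrostatic and ideal gas equations for a constant lapse rate of 6.5 K/km, the synoptic Ts is taken as the temperature at the synoptic station height, and all function names are hypothetical.

```python
G = 9.80665     # gravitational acceleration, m s^-2
R_D = 287.05    # specific gas constant of dry air, J kg^-1 K^-1
LAPSE = 0.0065  # temperature decrease with height, K m^-1


def adjust_pressure(ps_hpa, ts_k, hs_m, hg_m):
    """Step 2: move Ps from synoptic height hs to GPS height hg."""
    t_at_gps = ts_k - LAPSE * (hg_m - hs_m)
    return ps_hpa * (t_at_gps / ts_k) ** (G / (R_D * LAPSE))


def inverse_distance_mean(values, distances_km):
    """Step 3: average the adjusted pressures with 1/distance weights."""
    # A small floor avoids division by zero for a colocated station.
    weights = [1.0 / max(d, 1.0e-3) for d in distances_km]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def interpolate_linear(t, t0, v0, t1, v1):
    """Step 4: linear interpolation in time between two synoptic hours."""
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

For instance, a 0100 UTC value would be interpolated from the 0000 and 0300 UTC synoptic values with `interpolate_linear(1.0, 0.0, ps_00, 3.0, ps_03)`.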
 Among these steps, the vertical interpolation (step 2) is the most important one. For example, Figure 5 shows the size of the correction for Ps at the IGS station Graz, Austria (GRAZ) and a mountain station at Laguna Mountains, California (MONP) derived from two and one nearby synoptic stations, respectively, when compared to the corresponding GPS system surface meteorological data. At GRAZ, the scatter of the synoptic Ps from the two stations is significantly reduced after adjustments. For the mountain station, the synoptic station is 1072 m above the GPS station. After adjustments, the mean Ps difference between synoptic and GPS met data is reduced to −0.93 hPa from 106.42 hPa. However, the adjusted values show a larger dispersion primarily due to the fact that there can be temporary departures from the constant temperature lapse rate used over such an altitude difference that can affect the adjustment procedure. The comparisons of Ps at 48 GPS sites derived here with that from the GPS surface meteorological data show averaged RMS difference of 1.65 hPa.
 The accuracy of GPS-estimated PW is directly related to the accuracy of Tm and Ps [Bevis et al., 1994]. Tm can be either calculated from temperature and humidity profiles or estimated from the surface temperature, Ts. Our evaluations of various Tm estimates on a global scale show that, in the absence of local temperature and humidity profile data, the best option is to calculate Tm using 6-hourly ERA-40 temperature and humidity profiles with adjustment to GPS station heights and observation times [Wang et al., 2005]. However, ERA-40 data are currently available from 1957 to 2002 only, and it is unclear whether and when the data after 2002 will be available. Therefore we chose the next best option, the NCEP/NCAR reanalysis, to calculate Tm for the whole period (1997–2004) for consistency, as the NCEP/NCAR reanalysis is available from 1948 to present. Note that both ERA-40 and NCEP/NCAR reanalyses produced reasonable estimates of Tm over the globe [Wang et al., 2005]. The global, annual average of the monthly mean Tm difference between ERA-40 and NCEP/NCAR reanalysis is −0.206 K with a standard deviation of 1.297 K [Wang et al., 2005].
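For reference, a profile-based Tm calculation can be sketched as below. It assumes the usual definition of the water-vapor-weighted mean temperature, Tm = ∫(e/T)dz / ∫(e/T²)dz, integrated with the trapezoidal rule from the station height upward. The Magnus-type saturation vapor pressure formula and the function names are assumptions made for this illustration, and the temporal and horizontal interpolation of the reanalysis fields is not shown.

```python
import math


def saturation_vapor_pressure_hpa(t_k):
    """Magnus-type approximation over water (assumed formula)."""
    t_c = t_k - 273.15
    return 6.112 * math.exp(17.62 * t_c / (243.12 + t_c))


def mean_temperature_k(heights_m, temps_k, rh_percent):
    """Tm = integral(e/T dz) / integral(e/T^2 dz), trapezoidal rule."""
    vapor = [rh / 100.0 * saturation_vapor_pressure_hpa(t)
             for t, rh in zip(temps_k, rh_percent)]
    numerator = 0.0
    denominator = 0.0
    for i in range(len(heights_m) - 1):
        dz = heights_m[i + 1] - heights_m[i]
        a0, a1 = vapor[i] / temps_k[i], vapor[i + 1] / temps_k[i + 1]
        b0, b1 = a0 / temps_k[i], a1 / temps_k[i + 1]
        numerator += 0.5 * (a0 + a1) * dz
        denominator += 0.5 * (b0 + b1) * dz
    return numerator / denominator
```

Because the weights are proportional to vapor pressure, Tm is dominated by the warm, moist lower troposphere and lies between the coldest and warmest temperatures in the profile.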
3.2. Global GPS PW Data Set
 The analysis technique was applied to the 2-hourly ZPD data to create a global, 2-hourly PW data set. In spite of quality checks of all the input data in Figure 3 (ZPD, Ps and Tm), the derived PW may still contain outliers. We applied an outlier check on the PW data. Mean and standard deviation of PW are calculated for each month at each station. Then we rejected PW values differing from the mean by greater than four standard deviations. The rejected data points are less than 0.1% of total data points. For those rejected data points, 57% of them have ZPD values beyond the range of mean ±3*SD (standard deviation). Only 5% and 3% of them are due to extreme Tm and Ps values (beyond mean ±3*SD range), respectively. The rest of them could be real values or due to other unknown reasons.
 The number of stations for each month in the PW data set (Figure 2) is about 7 to 70 less than that in the original ZPD data set because nearby synoptic Ps data are unavailable at these stations. Our final product is a 2-hourly PW data set at 0100, 0300, 0500, …, 2300 UTC from 1 February 1997 to 31 December 2004 at 80 to 268 stations around the globe (Figure 1). The product also includes 2-hourly surface pressure derived from the synoptic observations, Tm from the NCEP/NCAR reanalysis, original ZPD, and calculated ZHD and ZWD. The data set is available to other investigators and can be obtained from the leading author (J. Wang) of this paper.
3.3. Error Analysis
 On the basis of the analysis procedure to derive PW in Figure 3, GPS-derived PW bears errors due to errors in ZPD, Ps and ∏. The relative error in ∏ is approximately (to a very good degree) equal to the relative error in Tm [Bevis et al., 1994; Wang et al., 2005]. Therefore we only discuss the errors in PW due to that in ZPD, Ps and Tm.
 Sources of ZPD errors include measurement errors (receiver noise, site-dependent multipath and antenna phase delays), GPS satellite orbit error, ionospheric corrections, elevation cutoff angles, mapping functions and the lengths of the baselines in the GPS solutions [e.g., Bevis et al., 1992; Tregoning et al., 1998; Gutman et al., 2004b]. Increasing elevation cutoff angles raises the random error of ZPD and also introduces a negative bias to ZPD [Emardson et al., 1998; Tregoning et al., 1998; Liljegren et al., 1999]. The ZPD error due to elevation cutoff angles is difficult to quantify because different ACs use different angles. Five of the seven ACs use the mapping function from Niell [1996], which is not very accurate in the Southern Hemisphere and does not include mapping function variations on timescales less than 1 year. As a result, the IGS ZPD product is likely less accurate in the Southern Hemisphere. In addition, the diurnally invariant mapping function can introduce diurnal errors in ZPD and thus in PW [Humphreys et al., 2005].
 The absolute and relative errors in PW associated with errors in ZPD, Ps and Tm can be estimated by taking the partial derivative of the equations presented in Figure 3:
where ΔPW(ZPD), ΔPW(Tm) and ΔPW(Ps) are PW errors due to errors in ZPD, Tm and Ps (ΔZPD, ΔTm and ΔPs), respectively. Using an approach similar to Deblonde et al. [2005], we did an error analysis of GPS PW using equations (2)–(4) given errors in ZPD, Tm and Ps and summarized the results in Table 2. The minimum, maximum and mean values of PW along with corresponding ZPD, ZHD, ZWD, Tm and Ps values are first computed from the 2004 PW data at all stations and then are used to compute the absolute and relative errors in PW for dry (minimum PW), moist (maximum PW) and mean (mean PW) atmosphere conditions. We use the claimed 4 mm precision in the IGS ZPD product, the RMS error of 1.3 K in Tm calculated from the NCEP/NCAR reanalysis based on comparisons with radiosonde data [Wang et al., 2005, Table 1], and the averaged RMS error of 1.65 hPa in Ps based on comparisons with GPS surface meteorological data. Individual PW errors are all less than 1 mm for all three conditions (Table 2). Total PW errors are 1.25 mm, 1.44 mm and 1.32 mm for the dry, moist and mean atmosphere conditions, respectively (Table 2).
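As a rough numerical illustration of this error budget, the sketch below uses first-order approximations that are assumed here rather than taken from the paper's equations (2)–(4): ΔPW(ZPD) ≈ Π ΔZPD, ΔPW(Ps) ≈ Π × 2.2768 mm/hPa × ΔPs, and ΔPW(Tm) ≈ PW ΔTm/Tm. The value Π ≈ 0.15, the example PW and Tm, and the two ways of combining the terms are all illustrative assumptions, so the output is not expected to reproduce Table 2 exactly.

```python
import math


def pw_error_terms_mm(pw_mm, tm_k, pi_factor=0.15,
                      d_zpd_mm=4.0, d_tm_k=1.3, d_ps_hpa=1.65,
                      zhd_mm_per_hpa=2.2768):
    """Return the (ZPD, Tm, Ps) contributions to the PW error in mm."""
    err_zpd = pi_factor * d_zpd_mm                   # delay error scaled by Pi
    err_tm = pw_mm * d_tm_k / tm_k                   # relative Tm error times PW
    err_ps = pi_factor * zhd_mm_per_hpa * d_ps_hpa   # pressure error via ZHD
    return err_zpd, err_tm, err_ps


def combine(terms):
    """Two common ways to combine terms: plain sum and root-sum-square."""
    linear_sum = sum(abs(t) for t in terms)
    root_sum_square = math.sqrt(sum(t * t for t in terms))
    return linear_sum, root_sum_square
```

With PW = 20 mm and Tm = 270 K, the three terms are about 0.60 mm, 0.10 mm and 0.56 mm. Each is below 1 mm, and their plain sum is about 1.26 mm.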
Table 2. Maximum, Minimum, and Mean of PW Using 2004 Data at All Stations, Corresponding Values of ZPD, ZHD, ZWD, Tm and Ps, GPS PW Absolute and Relative Errors Computed From Equations (2), (3) and (4) as a Result of Given Errors in ZPD, Ps and Tm for Dry (Minimum PW), Moist (Maximum PW) and Mean (Mean PW) Atmosphere Conditions, and Total PW Errors
PW Errors, mm/%
Total PW error, mm/%
4. Comparisons With Radiosonde, MWR, and Satellite Data
 On the basis of the error analysis in section 3.3, total PW error due to errors in ZPD, Ps and Tm varies from 1.2 mm to 1.5 mm. Careful procedures have been developed to calculate Ps and Tm at GPS station height and location (see section 3 and Wang et al. ). Additional quality checks have been applied to ZPD, Ps and Tm. They all help minimize the error in PW. In this section the 2-hourly GPS-derived PW data are evaluated through comparisons with the PW data from a global radiosonde data set (IGRA), MWR measurements at three locations, and the ISCCP satellite observations. It is always challenging to validate one product without a ground truth. The PW differences between GPS and other data presented below will be examined carefully and explained on the basis of the best knowledge of known errors in all data sets. The disagreement between GPS and other data sets can originate from errors in either data set or both of them.
4.1. Comparisons With Radiosonde Data
 In the GPS PW and IGRA data sets for 2003 and 2004, we found 102 stations where GPS and radiosonde stations are located within 50 km and have elevation differences less than 100 m. The GPS PW values within an hour of radiosonde launch times are compared with those from radiosondes at the 102 stations. Radiosonde temperature and humidity profiles are required to reach at least 300 hPa for the calculation, and have data available at the surface and at least five (four) standard pressure levels above the surface for stations below (above) 1000 hPa. The analysis of how sensitive PW is to missing data above the tropopause, 300 hPa, 500 hPa and 700 hPa shows that missing data above 500 hPa and 300 hPa would introduce a dry bias of 2.44% and 0.61% in PW, respectively. After enforcing the requirement on the radiosonde data, four stations have fewer than 50 pairs of matched data points and thus are removed from the comparisons. Figure 6 shows the mean and standard deviation of the GPS – IGRA PW difference from all the 98 stations, which are grouped and separated by radiosonde types. The mean difference is 1.08 mm (drier in the radiosonde data), which results primarily from the dry biases at 76 stations launching Vaisala sondes [cf. Wang et al., 2002b]. MRZ/Mars (Russian) and IM-MK3 (Indian) radiosondes show systematic moist bias (Figure 6). All three sites launching IM-MK3 radiosondes have the largest standard deviation of the difference (larger than 6.5 mm). The RMS difference computed from all data points at all 98 stations is 2.83 mm, which is consistent with the 2–3 mm RMS error of GPS-derived PW reported by previous studies [e.g., Gendt et al., 2004; Tregoning et al., 1998; Dietrich et al., 2004; Deblonde et al., 2005]. The RMS error of 2.83 mm based on comparisons with radiosonde data is larger than that of <1.5 mm estimated from the theoretical calculation in section 3.3.
This is not surprising because of uncertainties in radiosonde PW data, spatial and temporal separations between GPS and radiosonde data and the fact that GPS observations by viewing satellites in all directions sample a different volume of air than a radiosonde [Liou et al., 2001; Braun, 2004].
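The comparison statistics quoted in this section (mean difference, standard deviation of the differences, and RMS difference) can be computed from matched pairs as in the sketch below. The function name is hypothetical, and the population form of the standard deviation is an assumption, since the text does not say which form was used.

```python
import math


def difference_stats(gps_pw_mm, sonde_pw_mm):
    """Return (mean difference, standard deviation, RMS) of GPS minus sonde."""
    diffs = [g - s for g, s in zip(gps_pw_mm, sonde_pw_mm)]
    n = len(diffs)
    bias = sum(diffs) / n
    std = math.sqrt(sum((d - bias) ** 2 for d in diffs) / n)
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, std, rms
```

When all three statistics are computed from the same set of pairs, they satisfy RMS² = bias² + std², so a nonzero mean difference always makes the RMS larger than the standard deviation.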
Figure 7 compares the GPS and radiosonde PW at six individual stations. The GPS PW at HERT (U.K.) and NYA1 (Norway) agrees well with the radiosonde data (Figure 7). At the two Brazilian stations (Figure 7, middle) where Vaisala RS80-A radiosondes were launched, radiosonde PW is systematically drier than GPS PW because of the known dry bias in Vaisala radiosonde data [Wang et al., 2002b]. Radiosonde PW at a Russian station (KHAJ) is systematically larger than the GPS PW by 2.64 mm on average, which is possibly due to the slow response of the Russian radiosonde's humidity sensor (goldbeater's skin). The comparison at an Indian station (BAN2) has the largest scatter and shows wetter radiosonde data at PW < ∼30 mm. The dry and moist biases of upper tropospheric water vapor in capacitive and goldbeater's skin humidity sensors, respectively, are also revealed by Soden and Lanzante  in comparison to GOES satellite data. The use of the GPS PW data to identify errors and biases in radiosonde humidity data will be explored in detail in the future.
4.2. Comparisons With MWR Data
 For comparison with the 2-hourly GPS PW, the high-resolution MWR data (Table 1) were interpolated to the median times of the GPS data (0100, 0300, 0500, …, 2300 UTC). A second-order polynomial fit was applied to all available MWR data points within the GPS 2-hour time window (e.g., 0000–0200 UTC for 0100 UTC) to estimate the values at the times of the GPS data. All MWR measurements contaminated by rain were removed from the comparison by using the wet window flag in the data.
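The time matching described above can be sketched with a least-squares quadratic evaluated at the GPS epoch. This is an illustrative, standard-library-only version with hypothetical function names. Times are assumed to be in decimal hours, and at least three MWR samples with distinct times are assumed to fall inside the 2-hour window.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def quadratic_fit_at(times_h, values, t0_h):
    """Least-squares second-order polynomial fit, evaluated at t0_h."""
    if len(times_h) < 3:
        raise ValueError("need at least three samples in the window")
    # Center the times on t0 so the fitted value at t0 is the constant term.
    x = [t - t0_h for t in times_h]
    s = [sum(xi ** k for xi in x) for k in range(5)]
    b = [sum(v * xi ** k for xi, v in zip(x, values)) for k in range(3)]
    # Normal equations: sum_j s[i + j] * c[j] = b[i], for i = 0, 1, 2.
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    # Cramer's rule for c[0]: replace the first column with b.
    a0 = [[b[i]] + a[i][1:] for i in range(3)]
    return _det3(a0) / _det3(a)
```

For the 0100 UTC epoch, `times_h` would hold the MWR sample times between 0.0 and 2.0 hours and `t0_h` would be 1.0.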
 The comparison between the GPS PW at the Darwin GPS site and the MWR PW at the ARM Darwin site for 2002–2004 is shown in Figure 8. It can be seen that (1) MWR PW has a systematic positive bias (>20 mm) in 2002 (from July to September), (2) there are some very large MWR PW values (>70 mm) in 2002 and 2003 (only two in 2004), and (3) the 3-year data show that MWR PW is systematically higher by ∼3.17 mm after removing the outliers described in points 1 and 2. Since the Darwin GPS site sits 95 m higher than the MWR site, the PW from the MWR site is expected to be higher. MWR data are also compared with colocated radiosonde data at the ARM Darwin site (not shown) and show the same behavior as in Figure 8 but with smaller scatter. This suggests that the discrepancies between the GPS PW at the Darwin GPS site and the MWR PW at the ARM Darwin site are mainly attributable to problems in the MWR data, which are discussed in the ARM data quality reports (J. Liljegren, personal communication, 2005). The bias in MWR PW data from July to September in 2002 resulted from elevated sky brightness temperatures for an unknown reason and this was corrected later. The very large PW values in 2003 (also some in 2002) are a result of an incorrect wet window flag during times of rain and should be removed. Besides the location-related PW difference, the systematic moist bias in the MWR data is also due to limitations in the Rosenkranz-based retrieval coefficients [Liljegren et al., 2005]. Since 30 June 2005, the new MONORTM-based coefficients have been used, and the PW was reduced by about 3%. This bias in the earlier data can be corrected by multiplying by a factor of 0.9695. Since the GPS station is 95 m higher than the MWR site, we applied a correction to the MWR data by removing the contribution of the humidity in the lowest 95 m to the PW using radiosonde data at the ARM Darwin site.
The height correction is a second-order polynomial fit, derived from the radiosonde data, of the PW between the surface and 95 m above ground level (AGL) as a function of the PW from the surface to the top of the atmosphere. After these two corrections, the GPS-MWR mean difference for 2004 decreases from −3.17 mm to −0.23 mm, with a standard deviation of the differences of 4.18 mm (Figure 9, top). The comparison at Darwin illustrates the usefulness of the GPS PW data for evaluating the quality of other measurements, in this case the MWR.
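A minimal sketch of how the two MWR corrections could be chained is given below. The function names are hypothetical, and two details are assumptions because the text does not specify them: the retrieval scaling is applied before the height correction, and the layer polynomial is evaluated at the scaled PW.

```python
import numpy as np

RETRIEVAL_SCALE = 0.9695  # factor quoted in the text for the earlier retrievals

def fit_layer_model(sonde_pw_total, sonde_pw_layer):
    """Quadratic fit of lowest-layer PW (surface to 95 m) versus total-column PW.

    Both inputs are radiosonde-derived PW values in mm.
    """
    return np.polyfit(sonde_pw_total, sonde_pw_layer, deg=2)

def correct_mwr_pw(mwr_pw, layer_coeffs):
    """Scale the MWR PW, then subtract the estimated lowest-layer contribution.

    The order of the two steps is an assumption of this sketch.
    """
    scaled = RETRIEVAL_SCALE * np.asarray(mwr_pw, dtype=float)
    return scaled - np.polyval(layer_coeffs, scaled)
```

The corrected series would then be differenced against the GPS PW to obtain the mean and standard deviation reported for 2004.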
 The comparisons between GPS and MWR PW at Toulouse (IGS station TLSE) and Onsala (IGS station ONSA) are also shown in Figure 9. At the Toulouse site, GPS PW values are consistently drier than those from the MWR by a mean of 1.66 mm, which cannot be explained by any known factors. Van Baelen et al. compared the Toulouse MWR data with GPS data at CNRM (station code TLMF, not an IGS station) and did not find the systematic bias shown in Figure 9. In that case study, the GPS PW values were slightly wetter than the MWR estimates, with a mean bias of 0.02 mm. The MWR site is about 1 km due north of the TLSE GPS station and about 7.5 km southeast of the TLMF station, so the difference in location should not be an issue. The PW data at TLSE and TLMF were derived using different Ps and Tm data in this study and in Van Baelen et al. The differences in Ps and Tm between the two sites are small and have the wrong sign to explain the drier GPS PW. Dry biases in GPS PW at other locations were also reported in previous studies [e.g., Rocken et al., 1993; Tregoning et al., 1998; Liljegren et al., 1999; Ohtani and Naito, 2000; Emardson et al., 1998]. Liljegren et al. and Emardson et al. found that lowering the cutoff elevation angle by several degrees can increase the GPS PW by 0–2 mm and thus substantially reduce the difference between MWR and GPS measurements. Note that a 15° cutoff elevation angle was used by four ACs, and 7°, 10° and 20° angles were used by the other three ACs, respectively [Gendt, 1998]. Usually only one IGS AC contributed to the TLSE ZPD estimate, but its cutoff elevation angle is unknown. The TLMF site used a 10° cutoff elevation angle in the work by Van Baelen et al.
 The 158 coincident PW measurements obtained by GPS and MWR in 1997, 1999 and 2001 at Onsala show no systematic differences between them (Figure 9). The MWR and GPS antenna at Onsala were less than 10 m apart horizontally and less than 1 m apart in elevation [Emardson et al., 1998]. The standard deviation of the difference between GPS and MWR is 1.22 mm at both Toulouse and Onsala. The large standard deviation at Darwin (4.18 mm) is likely due to the large (53 km) lateral separation of the two instruments.
4.3. Comparisons of Latitudinal and Seasonal Variations With ISCCP Data
 The latitudinal variations of the GPS PW are compared with those of the ISCCP PW for winter, summer and the annual mean in Figure 10. The 8-year (1997–2004) averaged seasonal mean PW is computed from both ISCCP and GPS data. In this calculation, we used the ISCCP data only at those 2.5° × 2.5° grid boxes containing one or more GPS stations, and both GPS and ISCCP data were averaged over each 10° latitude zone. In general, the latitudinal variations in the GPS and ISCCP PW agree with each other: both show a maximum of 40–50 mm near the equator, corresponding to the intertropical convergence zone (ITCZ), and decrease poleward in each hemisphere, with a steeper gradient in the Southern Hemisphere (S.H.). A large difference occurs at 10–30°N in both the summer months JJA (June–July–August) and the winter months DJF (December–January–February). It could be partially due to the large number of GPS mountain stations (more than 30%) in these zones, because smaller PW is expected at a point mountain station than over a 2.5° × 2.5° grid box that contains both mountains and plains. Using all ISCCP data did not significantly alter the latitudinal patterns shown in Figure 10, which suggests that the GPS network is sufficient for sampling the zonal mean PW.
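The station-based subsampling and 10° zonal averaging can be illustrated with the sketch below. The grid layout (rows counted northward from 90°S, columns eastward from 0°E) and both function names are assumptions for illustration only; the actual ISCCP file layout may differ.

```python
import numpy as np

def box_index(lat, lon, size=2.5):
    """Return (row, col) of the size-degree grid box containing a point.

    Assumed layout: row 0 starts at 90S, column 0 starts at 0E.
    """
    n_rows = int(round(180.0 / size))
    n_cols = int(round(360.0 / size))
    row = int(np.floor((lat + 90.0) / size))
    col = int(np.floor((lon % 360.0) / size))
    # Clamp so that lat = +90 and rounding at lon = 360 stay inside the grid.
    return min(row, n_rows - 1), min(col, n_cols - 1)

def zonal_mean_at_stations(grid_pw, stations, size=2.5, band=10.0):
    """Average gridded PW over boxes that contain stations, per latitude band.

    grid_pw: 2-D array of PW (mm); stations: iterable of (lat, lon) in degrees.
    A box holding several stations is counted once.
    """
    boxes = {box_index(lat, lon, size) for lat, lon in stations}
    by_band = {}
    for row, col in boxes:
        value = grid_pw[row, col]
        if np.isnan(value):
            continue  # skip boxes with no satellite value
        center_lat = -90.0 + (row + 0.5) * size
        band_start = float(np.floor(center_lat / band) * band)
        by_band.setdefault(band_start, []).append(value)
    return {b: float(np.mean(v)) for b, v in sorted(by_band.items())}
```

The returned dictionary is keyed by the southern edge of each latitude band, so `10.0` denotes the 10–20°N zone.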
Figure 11 shows the seasonal variations of global and hemispheric area-weighted averages of PW from the GPS and ISCCP data sets. Monthly mean PW values (averaged over 1997–2004) were calculated from the full ISCCP data set and from the subset of grid boxes containing GPS stations. Figure 11 indicates that the seasonal PW variations are generally comparable between the GPS and ISCCP data sets, with peak (minimum) values in July and August for the Northern (Southern) Hemisphere. Seasonal changes of PW (larger values in summer) are more pronounced in the Northern Hemisphere (N.H.) than in the S.H. at all latitudes and in the hemispheric means (Figures 10 and 11). However, compared with the GPS data, the full ISCCP data set shows higher PW values (by ∼5.5 mm) for the N.H. mean and slightly lower values for the S.H. mean. These differences result mainly from limited sampling by the GPS stations (e.g., few stations over N.H. oceans, where PW is higher than over land; see Figure 1), because the resampled subset of the ISCCP data matches the GPS PW much better (but note the large positive biases in February and March for the S.H.). Both Figures 10 and 11 suggest that the differences between GPS and ISCCP data are more pronounced in the S.H. than in the N.H.
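The text does not state how the area weights were formed. A common choice, assumed in this sketch, is to weight each latitude band or grid box by the cosine of its central latitude; the function name is hypothetical.

```python
import numpy as np

def area_weighted_mean(pw, lat_deg):
    """Cosine-latitude weighted mean of PW values (mm), ignoring NaNs.

    pw and lat_deg are same-length sequences, one entry per band or grid box.
    Cosine weighting is an assumption, not a detail taken from the source.
    """
    pw = np.asarray(pw, dtype=float)
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    valid = ~np.isnan(pw)
    return float(np.sum(pw[valid] * weights[valid]) / np.sum(weights[valid]))
```

Restricting the inputs to northern or southern latitudes gives the corresponding hemispheric mean.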
5. PW Diurnal Variations
 The diurnal cycle is one of the most important climate signals, but there is a lack of data with sufficient temporal resolution to study diurnal variations of water vapor. The 2-hourly GPS-derived PW can fill this gap. As examples of studying diurnal PW variations with our 2-hourly PW data set, seasonal mean diurnal anomalies are presented for four regions: Europe, 30–70°S, N.H. Mountains and Darwin (Figure 12). The diurnal anomaly at each station is computed by removing the daily mean value for each day; the seasonal mean diurnal anomaly at each station is then obtained and averaged over all stations in each region. Europe has the densest GPS network, with 110 stations. The PW diurnal cycle in Europe is strongest in summer, with an amplitude of ∼0.6 mm; it is slightly weaker in fall and winter and negligible in spring. The PW in Europe peaks at noon in winter, in late afternoon (1600–1800 LST) in fall and in early evening (2000–2200 LST) in summer. The PW diurnal cycle in 30–70°S, N.H. Mountains and Darwin has a similar phase in all four seasons but different amplitudes. The 30–70°S region has the smallest diurnal cycle, while Darwin has the largest, with a peak-to-peak amplitude exceeding 2 mm. In the mountain region, PW peaks from late afternoon to early evening and has the largest diurnal cycle in summer. The PW diurnal cycle is controlled by precipitation, large-scale vertical air motion, surface evapotranspiration, wind direction (for coastal areas) and other factors [Dai et al., 2002].
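The anomaly computation described above can be sketched as follows. The array layout (one row per day, 12 two-hourly values per row) and the rule of using only days with all 12 values are assumptions of this sketch, as are the function names.

```python
import numpy as np

def seasonal_diurnal_anomaly(pw_days):
    """Mean diurnal anomaly from an (n_days, 12) array of 2-hourly PW (mm).

    Each day's mean is removed first; the anomalies are then averaged over
    all days of the season. Days with any missing value are skipped, which
    is an assumed rule rather than one stated in the source.
    """
    pw = np.asarray(pw_days, dtype=float)
    complete = ~np.isnan(pw).any(axis=1)
    daily_mean = pw[complete].mean(axis=1, keepdims=True)
    return (pw[complete] - daily_mean).mean(axis=0)

def regional_diurnal_anomaly(station_arrays):
    """Average the per-station seasonal anomalies over all stations in a region."""
    return np.mean([seasonal_diurnal_anomaly(a) for a in station_arrays], axis=0)
```

Because the daily mean is removed before averaging, day-to-day changes in the PW level do not leak into the diurnal signal, and each station's anomaly sums to zero over the 12 time slots.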
 We have developed an analysis method to convert the ZPD derived from ground-based GPS measurements to PW on a global scale. The method focuses on deriving reliable surface pressure (Ps) and water-vapor-weighted atmospheric mean temperature (Tm), two key parameters to estimate PW from ZPD. Ps was derived from global, 3-hourly surface weather observations with horizontal and vertical adjustments and temporal interpolation. Tm was calculated from 6-hourly NCEP/NCAR reanalysis profiles with temporal, vertical and horizontal interpolations.
 This method was applied to the global, 2-hourly ZPD data product from the IGS data analysis centers to produce a 2-hourly PW data set at the 80 to 268 IGS stations available from February 1997 to December 2004. The total PW error associated with errors in ZPD (4 mm), Tm (1.3 K) and Ps (1.65 hPa) is less than 1.5 mm. The GPS-estimated PW was compared with radiosonde, MWR and satellite PW data. The comparisons of PW at 98 colocated GPS and radiosonde stations around the globe show a mean difference of 1.08 mm (drier in the radiosonde data) and a standard deviation of the differences of 2.68 mm. The mean difference results primarily from known dry biases in the Vaisala radiosonde humidity data. The GPS-estimated PW compares well with the MWR data at three different stations, with a mean bias of less than 2 mm, a standard deviation of 1.22 mm at the two stations where the GPS receiver and MWR are less than 2 km apart, and a large standard deviation at the Darwin site, where the GPS receiver and MWR are 53 km apart. The GPS PW shows latitudinal and seasonal variations comparable to those from a subset of the ISCCP data set sampled only at grid boxes containing one or more GPS stations. Poor sampling of the N.H. oceans and other regions by the GPS stations may explain the underestimation of the hemispheric-mean PW in the N.H., but the comparison also reveals an interesting and significant discrepancy in the S.H.
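To illustrate how the quoted input errors translate into a PW error, the sketch below uses the widely used Saastamoinen hydrostatic delay and the Bevis et al. conversion factor. These formulas and constants are assumptions here, since this section does not restate the paper's equations, and the root-sum-square combination assumes the three errors are independent. Function names are hypothetical.

```python
import math

RHO_W = 1000.0    # density of liquid water, kg m^-3
R_V = 461.5       # specific gas constant of water vapor, J kg^-1 K^-1
K2_PRIME = 22.1   # refractivity constant, K hPa^-1 (assumed value)
K3 = 3.739e5      # refractivity constant, K^2 hPa^-1 (assumed value)

def zpd_to_pw(zpd_mm, ps_hpa, tm_k, lat_deg, height_km):
    """Convert zenith path delay (mm) to precipitable water (mm)."""
    # Saastamoinen-type zenith hydrostatic delay in mm (Ps in hPa).
    f = 1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg)) - 0.00028 * height_km
    zhd_mm = 2.2768 * ps_hpa / f
    zwd_mm = zpd_mm - zhd_mm
    # Dimensionless factor; the /100 converts the constants from per hPa to per Pa.
    pi_factor = 1.0e6 / (RHO_W * R_V * (K3 / tm_k + K2_PRIME) / 100.0)
    return pi_factor * zwd_mm

def pw_error_budget(zpd_mm, ps_hpa, tm_k, lat_deg, height_km,
                    sig_zpd=4.0, sig_ps=1.65, sig_tm=1.3):
    """Perturb each input by its error and combine the PW changes in quadrature."""
    base = zpd_to_pw(zpd_mm, ps_hpa, tm_k, lat_deg, height_km)
    d_zpd = zpd_to_pw(zpd_mm + sig_zpd, ps_hpa, tm_k, lat_deg, height_km) - base
    d_ps = zpd_to_pw(zpd_mm, ps_hpa + sig_ps, tm_k, lat_deg, height_km) - base
    d_tm = zpd_to_pw(zpd_mm, ps_hpa, tm_k + sig_tm, lat_deg, height_km) - base
    total = math.sqrt(d_zpd ** 2 + d_ps ** 2 + d_tm ** 2)
    return {"zpd": abs(d_zpd), "ps": abs(d_ps), "tm": abs(d_tm), "total": total}
```

Under these assumptions, the ZPD and Ps terms are nearly independent of the moisture amount, while the Tm term scales with PW, so the combined error grows slowly from dry to moist conditions.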
 The 8-year, 2-hourly GPS PW data set provides a new source of data for atmospheric water vapor. It complements existing data sets derived from radiosonde and satellite measurements with its high temporal resolution, long-term stability and low cost. Preliminary results on the scientific applications of this data set are presented, including identifying systematic biases in global radiosonde humidity data, identifying an interesting discrepancy with ISCCP satellite observations in the S.H., and quantifying the regional variation in the diurnal moisture anomaly. We encourage other researchers to use this data set, which can be acquired from the authors. We plan to update the data set continuously and release it to the public through the Web.
 This work was supported by NCAR Director Office's Opportunity Fund and NOAA fund (NA06OAR4310117). J. Wang would like to acknowledge the support from NCAR TIIMES Water Cycle Program. We thank Todd Humphreys (Cornell University) and John Braun (UCAR) for constructive discussions, Imke Durre (NOAA) for providing the IGRA data, Dennis Shea (CGD/NCAR) for helping us with NCL programming, M.-E. Gimonet from ONERA for operating and providing the data of the Toulouse microwave radiometer, and Gunnar Elgered (Chalmers University of Technology, Sweden) for providing the MWR data at Onsala. We are grateful for the useful comments from three reviewers. The National Center for Atmospheric Research is sponsored by the U.S. National Science Foundation.