The lack of global soil moisture data has spurred research in the field of microwave remote sensing. Both passive (radiometers) and active (scatterometer) microwave data are very sensitive to the moisture content of the surface soil layer. To retrieve soil moisture, the effects of vegetation, surface roughness, and heterogeneous land cover must be taken into account. Field experiments have shown that passive microwave data at long wavelengths (L-band) are best suited for soil moisture retrieval. Nevertheless, the first global, multiannual soil moisture data set (1992–2000) has been derived from active microwave data acquired by the European Remote Sensing Satellites (ERS) ERS-1 and ERS-2 scatterometer (C-band). The retrieval algorithm is based on a change detection approach that naturally accounts for surface roughness and heterogeneous land cover. In this paper the scatterometer-derived soil moisture data are compared to gridded precipitation data and soil moisture modeled by a global vegetation and water balance model. The correlation between soil moisture and rainfall anomalies is observed to be best over areas with a dense rainfall gauge network. Also, the scatterometer-derived and modeled soil moisture agree reasonably well over tropical and temperate climates. The fact that the algorithm performs equally well for regions with summer rain and Mediterranean areas indicates that dynamic vegetation effects are correctly represented in the retrieval. More research is needed to better understand the backscattering behavior over dry (steppe, deserts) and cold (boreal zone, tundra) climatic regions. The scatterometer-derived soil moisture data are available to other research groups at http://www.ipf.tuwien.ac.at/radar/ers-scat/home.htm.
 The term soil moisture refers to the water stored in the pores of the soils. Soil moisture controls the partitioning of rainfall into infiltration and surface runoff, limits evaporation and transpiration, and is the water reservoir crucial for plant growth. According to Henderson-Sellers  “the importance of soil moisture to many, diverse communities has resulted in large number of numerical models all of which simulate soil moisture.” Climate change modelers, meteorologists, hydrologists, agronomists, and many others have developed different land-surface schemes and soil moisture models that suit their special requirements. Also, within a certain group of models, a range of land-surface schemes may be employed. For example, Cramer et al.  noted that global models of terrestrial net primary productivity use, and are very sensitive to, different modeling strategies for the water balance.
 Quantitative comparisons with ground observations have shown that large differences in simulated soil moisture may occur even in situations where high-quality forcing data are available [Shao and Henderson-Sellers, 1996]. Robock et al.  showed that neither the amplitude nor the interannual variation of soil moisture modeled with 30 different atmospheric general circulation models was correctly simulated, while the phase of the seasonal cycle was generally well captured. They noted that the more complicated models (multiple soil levels, immediate and subsurface runoff, interception, and reevaporation by vegetation) did not perform better than simple bucket models. Similarly, land-surface models investigated within the framework of the Global Soil Wetness Project (GSWP) did a fairly good job of reproducing the seasonal cycle of soil moisture but exhibited a variable bias with respect to the field measurements [Entin et al., 1999]. According to Western et al.  the performance of water budget models appears to be quite poor particularly in high-latitude areas and in the description of interannual variability. These problems stem not only from algorithm deficiencies but are also related to the accuracy of the atmospheric forcing. Similarly, runoff estimates from large-scale hydrological models tend to be too low especially in high latitudes, which is probably also attributable to uncertainties in the climatic input data [Döll et al., 2003; D. Gerten et al., Terrestrial vegetation and water balance—Hydrological evaluation of a dynamic global vegetation model, submitted to Journal of Hydrology, 2003, hereinafter referred to as Gerten et al., submitted manuscript, 2003].
 Land-surface schemes often use a large number of parameters. The ultimate goal of achieving a “correct” physical description of land-surface processes often adds to the complexity of the models. On the other hand, the amount of observed data that can be used for model validation, calibration, or input is often limited. This problem is particularly felt for global-scale applications. As a result, there are often too many model parameters and not enough data. In this situation, neither the model structure nor the model parameters can be assessed properly [Grayson et al., 2002]. According to Western et al.  there remain three major challenges for large-scale modeling efforts: (1) to develop a process description appropriate for the considered scale; (2) to develop appropriate parameter sets describing the hydrologic and thermodynamic properties of the land surface; and (3) to develop appropriate observational data sets for model testing and parameter identification, and to reconcile the fundamental differences between the model state variable and the measurements. Any measurement of hydrological state variables is useful in this context, but the lack of global soil moisture observations in particular has long been felt [Dirmeyer, 1995].
 The need for soil moisture observations and the limited availability of field measurements have prompted much research in the field of remote sensing. Among the various remote sensing techniques, approaches based on microwave measurements offer the most direct means of retrieving soil moisture [Engman, 2000]. Therefore, in the next section a short review of microwave techniques for global soil moisture retrieval is presented. This includes a discussion of our method to retrieve soil moisture from C-band scatterometer data. The sensor has been flown on the European Remote Sensing Satellites (ERS) ERS-1 and ERS-2. Then, the motivation and the methods for comparing the remotely sensed soil moisture data with modeled soil moisture data and precipitation data are discussed. The simulated data stem from the Lund-Potsdam-Jena (LPJ) dynamic global vegetation model forced with observed meteorological data. The gridded precipitation data are from the Global Precipitation Climatology Centre (GPCC). This is followed by a presentation of the results and discussion, in which the results are analyzed from a remote sensing perspective.
2. State of the Art
 For global monitoring of soil moisture, microwave radiometers and scatterometers can be used. Like the more widely known Synthetic Aperture Radars (SARs), scatterometers are side-looking radars that transmit an electromagnetic pulse and measure the energy scattered back from the Earth's surface. For a microwave radiometer, the energy source is the target itself, and the instrument is merely a passive receiver [Ulaby et al., 1981]. Radiometers measure the intensity of the emission of the Earth's surface, which is related to the physical temperature of the emitting layer and the emissivity of the surface. Despite the different measurement processes, active and passive methods are closely linked through Kirchhoff's law which, applied to the problem of remote sensing of the Earth's surface, states that the emissivity is one minus the reflectivity [Schanda, 1986]. Therefore in both scatterometry and radiometry one deals in principle with the same physical phenomena, though in practice, the models required for the retrieval of geophysical parameters may differ significantly.
2.1. Passive Microwaves
 Since 1978 a series of microwave radiometers has been launched by the United States: the Scanning Multichannel Microwave Radiometer (SMMR) was operated between 1978 and 1987 followed by the Special Sensor Microwave Imager (SSM/I) in 1987. With respect to the issue of soil moisture retrieval it is important to note that while SMMR had a number of low-frequency channels (6.6, 10.7, and 18.0 GHz), the lowest frequency of SSM/I is 19.4 GHz. The latest generation of radiometers is the Advanced Microwave Scanning Radiometers (AMSRs), AMSR and AMSR-E, which were launched in 2002 on the ADEOS II and AQUA satellites, respectively. These instruments receive at six frequencies ranging from 6.9 to 89.0 GHz, with a spatial resolution ranging between 56 km (6.9 GHz) and 5.4 km (89.0 GHz).
 Over land surfaces the emitted radiation depends on the soil moisture content because the emissivity of a bare soil surface changes from about 0.95 when dry to 0.6 when wet [Schmugge et al., 1986]. According to Jackson and Schmugge  soil moisture retrieval from passive microwave data involves consideration of land cover heterogeneity, correction of atmospheric effects, determination of the emissivity by dividing the observed brightness temperature by the physical temperature of the emitting soil layer, separation of soil dielectric, vegetation, and surface roughness effects on the emissivity, and finally the use of dielectric mixing models to estimate the surface soil moisture content from the retrieved dielectric constant.
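The relation between brightness temperature, physical temperature, and emissivity described above can be illustrated with a minimal Python sketch. The 300 K soil temperature is a hypothetical value chosen for illustration; the emissivities 0.95 and 0.6 are the dry and wet bare-soil values quoted in the text. Atmospheric, vegetation, and roughness effects are deliberately ignored here.

```python
def emissivity(brightness_temp_k, physical_temp_k):
    """Emissivity as observed brightness temperature over physical temperature."""
    if physical_temp_k <= 0:
        raise ValueError("physical temperature must be positive (kelvin)")
    return brightness_temp_k / physical_temp_k


def reflectivity(emis):
    """Kirchhoff's law as stated in the text: emissivity = 1 - reflectivity."""
    return 1.0 - emis


# Hypothetical bare soil at 300 K (assumed value, not from the paper).
soil_temp_k = 300.0
tb_dry = 0.95 * soil_temp_k  # brightness temperature of dry soil
tb_wet = 0.60 * soil_temp_k  # brightness temperature of wet soil

# Dynamic range of the brightness temperature between dry and wet soil.
print(f"dry-wet contrast: {tb_dry - tb_wet:.0f} K")
```

Under these assumptions the contrast between dry and wet soil is on the order of 100 K, which is why the emission is so sensitive to the surface soil moisture content.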
 Numerous studies that used the 6.6-GHz channel of the SMMR for soil moisture retrieval have reported very encouraging results [e.g., van de Griend and Owe, 1994]. In a recent study, Njoku and Li  were capable of simultaneously retrieving soil moisture, vegetation water content, and surface temperature over western Africa. Their retrieval method is based on a radiative transfer model for land-surface and atmospheric emission, with model coefficients that are tuned over specific calibration regions and applied globally. Also, for the SSM/I, whose lowest frequency, 19.4 GHz, is generally not deemed well suited for soil moisture retrieval, successful demonstration studies have been reported. De Ridder  employed a physical model combined with a time series processing scheme to retrieve surface soil moisture from the SSM/I 19.4-GHz data. This approach does not require any ground-based data, employing solely information in the satellite signal itself. The utility of both the 6.6- and 18-GHz channels for soil moisture retrieval was demonstrated by Vinnikov et al. . In a study of SMMR data over Illinois they observed a satisfactory correlation of the microwave emissivity and the polarization difference at frequencies ≤18 GHz to in situ soil moisture measurements of the top 10-cm layer. Recently, de Jeu  retrieved soil moisture from the SMMR 6.6- and 18-GHz channels over five test areas worldwide (northern and southern Illinois, Iowa, Turkmenistan, and Mongolia), which compared well with field observations.
 In the 1990s many large-scale experiments (Monsoon '90, Washita '92 and '94, Hapex-Sahel, Southern Great Plains '97 and '99, etc.) were carried out [Schmugge, 1998; Jackson and Hsu, 2001]. These experiments clearly showed that longer wavelengths are beneficial for soil moisture retrieval due to their greater penetration depth into vegetation. This led to the requirement that a satellite mission dedicated to measuring soil moisture should be operated at L-band. Also, influential researchers in the field began to conclude that retrieval from active measurements is more confounded by roughness, topographic features, and vegetation than the retrieval from passive data [Engman and Chauhan, 1995]. Therefore recent proposals for dedicated satellite missions to monitor soil moisture from space have relied on passive microwave concepts in L-band. In 1999 the European Space Agency (ESA) selected the Soil Moisture and Ocean Salinity (SMOS) mission as the second Earth Explorer Opportunity mission [Mégie and Readings, 2000]. The launch is currently foreseen for 2006. SMOS is a microwave radiometer operating at L-band (1.4 GHz, 21 cm), which will employ a complex two-dimensional interferometer technique to achieve a ground resolution on the order of 50 km [Kerr et al., 2001]. The goal is to obtain surface soil moisture data with an accuracy of 0.04 m3 m−3 or better.
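To put the frequencies mentioned so far on a common footing, the following short Python sketch converts them to free-space wavelengths using λ = c/f. The function name is chosen here for illustration only.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_cm(frequency_ghz):
    """Free-space wavelength in centimeters for a frequency given in GHz."""
    if frequency_ghz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_M_S / (frequency_ghz * 1e9) * 100.0


# Frequencies quoted in the text: SMOS, SMMR, AMSR, and SSM/I lowest channels.
for label, f_ghz in [("SMOS L-band", 1.4), ("SMMR", 6.6), ("AMSR", 6.9), ("SSM/I", 19.4)]:
    print(f"{label}: {f_ghz} GHz -> {wavelength_cm(f_ghz):.1f} cm")
```

The L-band wavelength is roughly five times longer than that of the lowest SMMR channel, which is the basis for its greater penetration depth into vegetation.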
 Despite the advances made in passive microwave remote sensing of soil moisture, no global products are available as yet. Even for the SMMR, whose 6.6-GHz channel is not too far away from the optimum frequency band (L-band), no global soil moisture data sets have been produced. The first global soil moisture data sets derived from passive data will probably come from AMSR and AMSR-E, for which algorithm development efforts are under way [Koike et al., 2000]. For validation purposes extensive field campaigns (SMEX02, SMEX03, SMEX05) have been or will be carried out. The key objective of the validation will be to assess whether and under what conditions the AMSR-derived soil moisture products meet the accuracy goal of 0.06 g·cm−3 (about 0.08 m3 m−3 for a bulk density of 1300 kg·m−3) at a 60-km spatial scale [NSIDC, 2003].
2.2. Active Microwaves
 Research on the use of spaceborne radars for soil moisture estimation has focused on the use of SAR systems, which offer a spatial resolution suited to map highly variable soil moisture patterns [Dobson and Ulaby, 1998]. However, comparatively little research has been carried out using spaceborne scatterometers, which have a spatial resolution similar to passive radiometers. Consequently, much of what we know today about radars stems from the experiences made with SARs. At the spatial scale of SARs (tens of meters), a major problem appears to be the statistical description of the roughness of natural surfaces and the limited range of validity of currently available bare soil backscatter models [Davidson et al., 2000]. Because roughness effects make it difficult to extract soil moisture information, Schmugge et al.  do not consider radars as a promising technology in their recent review of hydrologic applications of remote sensing.
 Owing to the strong sensitivity of the backscattering coefficient σ0 to surface roughness, spatial roughness patterns leave a strong imprint on radar images. Also, temporal roughness changes due to agricultural farming practices may cause significant changes in σ0 [Moran et al., 2002]. Nevertheless, several studies have demonstrated the utility of multitemporal spaceborne SAR acquisitions for monitoring soil moisture conditions. On the basis of 32 ERS SAR images acquired over the Orgeval watershed in France, Quesney et al.  developed a methodology for retrieving soil moisture. The algorithm is based on a selection of “sensitive targets” for which vegetation and surface roughness effects can be easily estimated and removed if needed. Their results suggested that at the watershed scale the mean effect induced by different mixed roughness states is almost constant during the year. In a study of eight ERS-2 SAR images (C-band) at three test areas in the Upper San Pedro River Basin in southeast Arizona, Moran et al.  observed a weak correlation between the backscattering coefficient σ0 and the surface soil moisture. Yet for the same data set, the correlation was strong (R2 = 0.93) after σdry0 taken from a dry reference image was subtracted from the other σ0 images. These and similar studies [Tansey et al., 1999; Moeremans and Dautrebande, 2000; Le Hégarat-Mascle et al., 2002] suggest that practical change detection approaches for retrieving soil moisture from SAR series can successfully account for surface roughness effects and to some extent for low vegetation cover.
 The scatterometer on board the two ERS satellites was the first radar to acquire a global multiyear data set. The instrument is operated in C-band (5.3 GHz, 5.7 cm) and has a spatial resolution of 50 km. Similar to SARs, semiempirical backscatter models have been used to retrieve vegetation and soil parameters from ERS scatterometer data [Pulliainen et al., 1998; Magagi and Kerr, 2001; Jarlan et al., 2002]. Typically, these models use simple bare soil backscattering models like the one from the work of Oh et al.  and use vegetation models similar in form to the Cloud Model [Attema and Ulaby, 1978]. Most retrieval studies were confined to one climatic region. Recently, Grippa and Woodhouse [2002a] applied their semiempirical model to three study sites situated in different climatic regions (boreal forest, wet-dry tropical, and wet equatorial). The method simultaneously retrieves a roughness parameter, the soil dielectric constant, and the single scattering albedo and optical depth of vegetation. Grippa and Woodhouse [2002a] note the difficulty of modeling the measurement process and point out that scaling processes need to be further investigated. As one step toward a better physical understanding of large-scale radar measurements, Grippa and Woodhouse [2002b] investigate the effect of variable surface roughness conditions within one ERS scatterometer footprint.
 Our method for retrieving soil moisture from ERS scatterometer data is from its conception a change detection method [Wagner et al., 1999a]. As in the work of Moran et al.  a reference backscatter value σdry0 representing backscatter from the vegetated land surface under dry soil conditions is subtracted from the actual σ0 measurements to account for roughness and heterogeneous land cover. In the work of Wagner et al. [1999b] the method has been refined to account for the effects of plant growth and decay by exploiting the multiincidence capabilities of the ERS scatterometer. As a result, time series of the topsoil moisture content ms (<5 cm) are obtained. This is a relative quantity ranging between 0 and 1 (or 0–100%), scaled between zero soil moisture and saturation.
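The change detection idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the operational processing chain: the scaling is assumed to be linear in decibels between a dry and a wet reference, values outside the references are clipped, and all numbers are hypothetical.

```python
def surface_soil_moisture(sigma0_db, sigma_dry_db, sigma_wet_db):
    """Scale a backscatter value (dB) between dry and wet reference values.

    Returns a relative soil moisture value between 0 (dry) and 1 (saturated).
    Linear scaling in dB and clipping are assumptions of this sketch.
    """
    dynamic_range = sigma_wet_db - sigma_dry_db
    if dynamic_range <= 0:
        raise ValueError("wet reference must exceed dry reference")
    ms = (sigma0_db - sigma_dry_db) / dynamic_range
    return min(1.0, max(0.0, ms))


# Hypothetical values: dry reference -14 dB, wet reference -9 dB.
ms = surface_soil_moisture(-11.0, sigma_dry_db=-14.0, sigma_wet_db=-9.0)
print(f"relative surface soil moisture: {ms:.2f}")
```

Because only the difference to the dry reference enters the calculation, time-invariant contributions such as surface roughness and land cover heterogeneity cancel out, provided they affect the measurement and the reference equally.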
 Although our model uses different parameters, it is similar in functionality to simple radiative transfer models like the Cloud Model [Ulaby et al., 1982]. In these latter models, the effect of vegetation is to a large extent controlled by the optical depth, which weights the relative contributions of volume and surface scattering to total backscatter. When vegetation grows, the optical depth increases, and the volume scattering term becomes more important. However, this does not necessarily mean that backscatter increases. In situations where the reduced contribution from the underlying ground is more important than the enhanced volume scattering, σ0 decreases. An increase in σ0 due to vegetation growth is typically encountered at high incidence angles and dry soil conditions and a decrease at low incidence angles and wet soil conditions. This implies that, depending on the soil moisture content, there is an incidence angle where the effect of vegetation growth is minimal [Wagner, 1998]. In our model, this is taken into account by assuming that the effect of vegetation is negligible at the so-called “cross over” angles, θdry and θwet, which differ for dry and wet soil conditions [Wagner et al., 1999b]. As a result, when the ERS scatterometer observes a flattening of the backscatter curve due to vegetation growth, σ0 decreases at incidence angles lower than the cross over angles and increases at higher incidence angles.
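The opposing effects of vegetation growth on dry and wet soil can be reproduced with a simplified cloud-model form. The sketch below assumes a total backscatter (linear units) of A·cos θ·(1 − γ²) + γ²·σsoil with two-way attenuation γ² = exp(−2τ/cos θ). The coefficient A and all numerical values are hypothetical and chosen only to show the sign change; this is not the retrieval model described in this paper.

```python
import math


def cloud_model_backscatter(sigma_soil, optical_depth, theta_deg, veg_coeff):
    """Total backscatter (linear units) from a simplified cloud-model form."""
    cos_t = math.cos(math.radians(theta_deg))
    attenuation = math.exp(-2.0 * optical_depth / cos_t)  # two-way, gamma^2
    volume_term = veg_coeff * cos_t * (1.0 - attenuation)
    surface_term = attenuation * sigma_soil
    return volume_term + surface_term


# Hypothetical parameters: vegetation coefficient and two soil states.
A = 0.1
for label, sigma_soil in [("dry soil", 0.02), ("wet soil", 0.20)]:
    sparse = cloud_model_backscatter(sigma_soil, 0.1, 40.0, A)
    dense = cloud_model_backscatter(sigma_soil, 0.5, 40.0, A)
    trend = "increases" if dense > sparse else "decreases"
    print(f"{label}: backscatter {trend} as optical depth grows")
```

In this form the sign of the vegetation effect depends only on whether the soil backscatter lies below or above A·cos θ, which illustrates why a soil-moisture-dependent angle with minimal vegetation effect can exist.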
 In order to allow a comparison with soil moisture measurements over a greater depth (up to about 1 m) a two-layer water balance model, which only considers the exchange of soil water between the topmost remotely sensed layer and the “reservoir” below, was used to establish a relationship between the ms series and the profile soil moisture content [Wagner et al., 1999c]. Solving the differential water balance equation showed that the water content in the reservoir at time t is related to the measurements of ms at times ti < t, whereby the influence of ms(ti) decreases with increasing time lag t − ti. An indicator of the water content in the reservoir layer is obtained by convolving the ms time series with an exponential function:
$$\mathrm{SWI}(t) = \frac{\sum_i m_s(t_i)\, e^{-(t - t_i)/T}}{\sum_i e^{-(t - t_i)/T}} \quad \text{for } t_i \le t \qquad (1)$$
 The resulting quantity was called the Soil Water Index (SWI), and again it ranges between 0 and 1. The parameter T is a characteristic time length, which controls the degree of smoothing of the ms series and the response time to changes in the surface wetness conditions. In the Ukrainian study SWI was found to be correlated with gravimetric field measurements of soil moisture in the 0–20 and 0–100 cm layers. Typically, the correlation coefficient R was around 0.3–0.5, with maximum values up to 0.78. These are moderate values, but it must be considered that particularly the small-scale variability of the soil moisture field, and to some extent also field measurement errors, introduces large uncertainties. This could be demonstrated by computing the correlation of soil moisture measurements taken over nearby fields, which was typically in the range between about 0.5 and 0.7. Further, it was shown that SWI can be directly related to the plant available water content PAW through the following:
$$\mathrm{PAW}(t) = \mathrm{SWI}(t) \left( \frac{\mathrm{FC} + \mathrm{TWC}}{2} - \mathrm{WL} \right) \qquad (2)$$
where WL is the wilting level, FC is the field capacity, and TWC is the total water capacity (porosity) of the considered layer expressed in volumetric soil moisture units. The comparison with gravimetric field data showed that the RMS error of PAW calculated with equation (2) was about 0.06 m3 m−3 for the 0–20 cm layer and about 0.05 m3 m−3 for the 0–100 cm layer [Wagner et al., 1999c].
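As a worked illustration of the conversion from SWI to plant available water, the following Python sketch assumes the linear form PAW = SWI·((FC + TWC)/2 − WL), in which the upper bound of the index is mapped to the midpoint between field capacity and total water capacity. Both this form and the soil properties used below are assumptions of the sketch; the soil values are hypothetical.

```python
def plant_available_water(swi, wilting_level, field_capacity, total_water_capacity):
    """Plant available water (m3 m-3) from SWI under the assumed linear form."""
    if not 0.0 <= swi <= 1.0:
        raise ValueError("SWI must lie between 0 and 1")
    # Assumed upper bound: midpoint between field capacity and porosity.
    w_max = 0.5 * (field_capacity + total_water_capacity)
    return swi * (w_max - wilting_level)


# Hypothetical soil: WL = 0.10, FC = 0.30, TWC = 0.45 (volumetric units).
paw = plant_available_water(0.5, 0.10, 0.30, 0.45)
print(f"PAW: {paw:.4f} m3 m-3")
```

For this hypothetical soil the full range of PAW is 0.275 m3 m−3, so the reported RMS errors of 0.05 to 0.06 m3 m−3 correspond to roughly one fifth of the range.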
3. Study Approach
 Any retrieval method that aims to derive geophysical parameters from spaceborne scatterometers or radiometers is faced with numerous questions related to the physical appropriateness of the model at the considered scale. One crucial aspect related to this issue is the question of the transferability of the method. If a model cannot be transferred to areas outside the region where the method was originally developed, one may be content with restricting the validity range of the model. On the other hand, this is an indication that the model structure and the parameters are not fully appropriate. Like the semiempirical model used by Grippa and Woodhouse [2002a], our method of retrieving soil moisture from ERS scatterometer data has also been applied with some success over different climatic regions, namely the Canadian Prairies [Wagner et al., 1999a], the Iberian Peninsula [Wagner et al., 1999b], Ukraine [Wagner et al., 1999c], and western Africa [Wagner and Scipal, 2000]. This has motivated us to apply the method on a global basis to produce the first global remotely sensed soil moisture data set [Scipal et al., 2002].
 For the validation of the remotely sensed soil moisture data, independent data sets are required. These may either be derived from ground observations, models, and/or other remote sensing techniques. None of these data sources can be regarded as absolute ground truth: Field observations of soil moisture are point measurements which may or may not be representative of the soil moisture conditions at a scale of 50 km; the difficulties of modeling soil moisture at large scales were discussed in the introduction; equally, the accuracy of other remotely sensed soil moisture data sets is often not known quantitatively. Therefore it is hardly possible to determine the accuracy of the scatterometer soil moisture data in absolute terms based on the comparison with one independent data set. However, strengths and weaknesses of the various data sets are often well understood and it should be possible to draw “intelligent conclusions” in each case. By comparing the scatterometer data with several, independent data sets one should eventually arrive at a solid assessment of the quality and accuracy of the retrieval method.
 In this paper we compare the soil moisture data derived from the ERS scatterometer with monthly soil moisture computed by the LPJ dynamic global vegetation model, and with observed monthly precipitation data from GPCC. The results of an extensive comparison with field measurements from Illinois, Ukraine, Russia, China, and India will be reported in another paper.
4. Description of Data Sets
4.1. Remotely Sensed Soil Moisture
 The ERS-1 scatterometer regularly acquired data between August 1991 and May 1996 and the ERS-2 scatterometer between March 1996 and January 2001, when due to a failure of a gyroscope all ERS-2 instruments were temporarily switched off. ERS-2 has continued to acquire scatterometer data in gyroless mode after January 2001, but ESA has not yet released these data due to calibration problems. The scatterometer achieves a daily global coverage of up to about 41%, but often it is much less due to operational conflicts with the ERS SAR. Therefore the temporal sampling rate is highly irregular depending on latitude and the frequency of SAR acquisitions. The instrument has three antennas, which look at the surface at incidence angles ranging between 18° and 59° from three different azimuthal directions (sideward and 45° forward and backward). For each point on the Earth's surface, three values of the backscattering coefficient are provided together with other relevant orbit and instrument information such as incidence and azimuth angles.
 Data processing is completed systematically and is described in detail in the work of Wagner et al. [1999a, 1999b, 1999c] and Scipal . Like the SSM/I retrieval method of De Ridder , it is a time series processing approach, which essentially employs information from the satellite data itself. In a preprocessing step, the scatterometer data are rearranged from an image format to a time series format without altering the data. In this way, multiyear time series of scatterometer measurements are built up over a predefined global grid. The further processing steps use as input all available scatterometer data (August 1991 to January 2001). The processing steps are as follows:
 1. Estimating the standard deviation (SD) of σ0 due to instrument noise, speckle, and azimuthal effects based on measurements of the forward- and backward-looking antennas.
 2. Determining the incidence angle behavior of σ0 by making use of the fact that the scatterometer provides instantaneous measurements at two different incidence angles.
 3. Extrapolating all σ0 taken over the entire incidence angle range (18°–59°) to a reference angle of 40° and calculating the average σ0(40) based on the backscatter triplet.
 4. Selecting values for the cross over angles θdry and θwet where the effects of vegetation growth on σ0 are assumed to be negligible.
 5. Modeling the effect of vegetation phenology and determining the exact positions of the dry and wet soil backscatter reference curves, σdry0(t) and σwet0(t) by fitting the curves to the σ0(40) time series.
 6. Calculating the surface soil moisture series ms by comparing σ0(40) to the dry and wet reference curves.
 7. Masking measurements affected by snow and/or frozen soil. The information contained in the backscatter time series is principally useful for monitoring freeze/thawing processes [Wismann, 2000; Scipal and Wagner, 1998]. However, currently, only a simple masking procedure, which is mainly based on mean monthly temperature data extracted from a climate database prepared by Leemans and Cramer , is employed.
 8. Calculating SWI for a selected value of the characteristic time length T. All measurements taken within a period 3T are considered if at least 4 measurements have been recorded within the most recent time period T and if the ground was snow free and not frozen. In cold climates, the calculation of SWI is in principle not possible at the beginning of the thawing period due to the lack of physically meaningful ms data in the previous few weeks. However, to enable the calculation of SWI right from the onset of thawing, the preceding ms values are set equal to 1. The underlying assumption is that the soil moisture status is high at the end of the winter season due to snowmelt. This assumption is not valid for dry cold climates and will be revised in the future.
 9. If soil hydrologic properties are known (wilting level, field capacity, and porosity) the plant available water content can be calculated.
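Step 3 of the list above can be sketched as follows. The sketch assumes that the incidence angle behavior of σ0 (in dB) is described by a slope and a curvature at the reference angle, so that each measurement is shifted to 40° with a second-order expansion. The expansion form, the parameter names, and the numerical values are assumptions made for illustration.

```python
def normalize_to_reference(sigma0_db, theta_deg, slope, curvature=0.0, theta_ref_deg=40.0):
    """Shift one backscatter value (dB) from theta_deg to the reference angle.

    slope (dB/deg) and curvature (dB/deg^2) describe the assumed incidence
    angle behavior at the reference angle.
    """
    d_theta = theta_deg - theta_ref_deg
    return sigma0_db - slope * d_theta - 0.5 * curvature * d_theta ** 2


def mean_sigma40(triplet, slope, curvature=0.0):
    """Average the normalized values of a (sigma0_db, theta_deg) triplet."""
    values = [normalize_to_reference(s, th, slope, curvature) for s, th in triplet]
    return sum(values) / len(values)


# Hypothetical triplet: fore and aft beams at 50 deg, mid beam at 35 deg.
triplet = [(-12.2, 50.0), (-10.4, 35.0), (-12.2, 50.0)]
sigma40 = mean_sigma40(triplet, slope=-0.12)
print(f"sigma0(40) = {sigma40:.2f} dB")
```

Averaging the three normalized values reduces the influence of instrument noise and speckle on the resulting σ0(40) time series.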
 The data processing can in principle be completed fully automatically, but the operator has the opportunity to manually select a number of parameters. In step 4 one may select θdry and θwet based on a visual comparison of the σ0(40) series with the corresponding dry and wet reference curves. Changing θdry and θwet has the effect of enhancing or lessening the vegetation correction. Given that the values which were used in the previous studies over the Iberian Peninsula, Ukraine, and western Africa (θdry = 20°, θwet = 40°) also produced reasonable results in other areas, it was decided to use these values globally.
 The operator may also influence the determination of the dry and wet reference curves (step 5). In dry climates there may never be enough rainfall to thoroughly wet the soil surface layer [Wagner and Scipal, 2000]. For this situation, an empirical correction approach was developed which estimates σwet0(t) corresponding to a wet, saturated soil surface [Scipal, 2002]. Figure 1 shows those areas where this correction was applied. Correspondingly, in wet climates the soil surface layer may never dry out completely. In high-latitude climates ms values representing frozen soil are used instead because the dielectric properties of frozen and dry soil are similar [Hallikainen et al., 1984].
 In step 8 the characteristic time length T can be selected to calculate SWI. Typically, the temporal variability of soil moisture decreases with increasing layer thickness. Therefore, given that the parameter T controls the degree of smoothing of the ms series, higher T values are representative of deeper layers. In the Ukrainian study, the best agreement of SWI with field measurements was observed for T = 15 days for the 0–20 cm layer and for T = 20 days for the 0–100 cm layer [Wagner et al., 1999c]. For the global processing T was kept constant at 20 days. Given that the temporal variability of the soil moisture field is dependent on climate and soils [Entin et al., 2000], this implies that the SWI data may represent layers of variable thickness in different parts of the world.
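The SWI calculation with the sampling rule of step 8 can be sketched in Python as an exponentially weighted average. The sketch uses T = 20 days, considers measurements within the last 3T, and requires at least four measurements within the most recent period T. The snow and frozen-soil conditions are not modeled, and the function name and the sample series are hypothetical.

```python
import math


def soil_water_index(times, ms_values, t, T=20.0, min_recent=4):
    """SWI at time t (days) from surface soil moisture samples.

    Returns None when fewer than min_recent samples fall within the most
    recent period T, mirroring the sampling rule of step 8.
    """
    samples = [(ti, ms) for ti, ms in zip(times, ms_values) if t - 3.0 * T <= ti <= t]
    recent = [ti for ti, _ in samples if ti >= t - T]
    if len(recent) < min_recent:
        return None
    # Exponentially decaying weights: older measurements count less.
    weights = [math.exp(-(t - ti) / T) for ti, _ in samples]
    weighted_sum = sum(w * ms for w, (_, ms) in zip(weights, samples))
    return weighted_sum / sum(weights)


# Hypothetical irregular series: a dry period followed by a wetting event.
days = [0, 10, 20, 30, 40, 50, 55, 60]
ms_series = [0.2, 0.2, 0.3, 0.2, 0.2, 0.3, 0.9, 0.8]
swi = soil_water_index(days, ms_series, t=60)
print(f"SWI on day 60: {swi:.2f}")
```

The result lies between the dry-period values and the most recent surface values, which shows the smoothing and delayed response controlled by T.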
 The backscattering coefficient measured by C-band radars saturates over forests with about 20–30 t/ha aboveground biomass [Le Toan et al., 2001]. For the case of the ERS scatterometer this means that the measured signal is relatively stable if a significant portion of the radar footprint is covered by forests. In fact, tropical rain forest in South America has been used as a distributed reference target for the calibration of the ERS scatterometer [Lecomte and Wagner, 1998]. Since in such a situation there are not enough soil moisture sensitive areas (e.g., grassland or agriculture) within one ERS footprint, soil moisture retrieval is not possible. To identify these regions, pixels where the difference σwet0 − σdry0 is less than 2 dB were masked. The resulting forest mask covers an area of roughly 9.6 × 10⁶ km². It is confined to the equatorial rain forest belt (Figure 1). Also, areas where σ0 is strongly dependent on the azimuth angle have been masked. The masking criterion is based on the estimated standard deviation of σ0 calculated in step 1 of the process. If this parameter was above 1 dB and if the reasons for this high noise could be attributed to azimuthal effects, e.g., caused by sand dunes, then the corresponding pixel was masked. The masked areas cover about 4.7 × 10⁶ km² and can mainly be found in the Sahara, the Rub' al Khali, and the Takla Makan (Figure 1).
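The two masking criteria can be expressed as a simple per-pixel rule. The sketch below uses the thresholds given in the text (2 dB sensitivity, 1 dB noise); the flag indicating that high noise is attributable to azimuthal effects is assumed to come from a separate assessment, and the function name and return labels are hypothetical.

```python
def mask_reason(sigma_dry_db, sigma_wet_db, noise_sd_db, azimuthal=False,
                min_sensitivity_db=2.0, max_noise_db=1.0):
    """Return why a pixel is masked, or None if retrieval is attempted."""
    # Too little dynamic range between dry and wet references (e.g., forest).
    if sigma_wet_db - sigma_dry_db < min_sensitivity_db:
        return "low_sensitivity"
    # High noise that is attributed to azimuthal effects (e.g., sand dunes).
    if noise_sd_db > max_noise_db and azimuthal:
        return "azimuthal"
    return None


# Hypothetical pixels: rain forest, dune field, and agricultural area.
print(mask_reason(-8.0, -7.0, 0.3))
print(mask_reason(-20.0, -12.0, 1.5, azimuthal=True))
print(mask_reason(-14.0, -9.0, 0.4))
```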
4.2. Modeled Soil Moisture
 Soil moisture is intrinsically coupled to the carbon assimilation dynamics of vegetation growing on the soil. It is therefore important that a model used to predict soil moisture computes the fully coupled water-carbon biogeochemistry of each pixel. Simulated monthly soil moisture fields are here taken from LPJ, a nonequilibrium biogeography-biogeochemistry model that combines process-based representations of terrestrial vegetation dynamics and land-atmosphere carbon and water exchanges in a single framework [Sitch et al., 2003].
 The LPJ dynamic global vegetation model explicitly simulates major ecosystem processes, including vegetation growth, mortality, and resource competition of 10 defined plant functional types, and carbon allocation to several pools for producing new tissue. Representation of these processes is of intermediate complexity to allow global applications (here, on a 0.5° × 0.5° grid). The presence and fractional cover of plant functional types within a grid cell is determined yearly according to bioclimatic, physiological, morphological, and fire-resistance features. The model captures the biogeographical distribution of the Earth's major vegetation types, and simulates well the global magnitude of terrestrial carbon pools, including their change in response to climatic variations [Cramer et al., 2001; Lucht et al., 2002; Sitch et al., 2003].
 Water is a fundamental factor determining vegetation composition and distribution around the world. In turn, vegetation exerts significant influences on runoff generation, evapotranspiration, and soil moisture. Because the terrestrial water and carbon cycles are intrinsically coupled, LPJ considers the interrelatedness of vegetation growth, carbon cycling, and soil water content by mechanistically linking the associated processes. CO2 uptake from the atmosphere and transpiration by plants, both of which occur simultaneously through the same physical pathway, the stomata, are simulated concurrently using a coupled photosynthesis-water balance scheme. Both transpiration and carbon uptake/photosynthesis are limited by soil water content and are also controlled by atmospheric CO2 concentration [Sitch et al., 2003; Gerten et al., submitted manuscript, 2003]. Moreover, leaf phenology is determined depending not only on temperature but also on water stress thresholds. Decomposition of litter and soil organic matter is also driven by soil water content. In turn, simulated runoff, evapotranspiration, and soil water content are decisively influenced by vegetation attributes such as interception storage capacity, seasonal phenology, and rooting depth.
 LPJ computes the most important fluxes and pools of water at daily time steps for each grid cell, based on disaggregated monthly climate input data (air temperature, precipitation amount, number of wet days, and cloudiness; New et al., 2000). The water balance computations are described in detail elsewhere (Gerten et al., submitted manuscript, 2003), thus only a short overview is provided here. At temperatures below zero, all precipitation is assumed to accumulate as snow; snowpack melts and infiltrates into the soil at >0°C, using a simple degree-day method. Part of precipitation is lost as interception from the canopies; the remainder enters the soil column, which is divided into two layers of, respectively, 50 and 100 cm thickness. The amount of water in excess of each layer's field capacity, which is defined based on nine soil texture types derived from a global database [FAO, 1991], does not infiltrate and is accounted for as surface and subsurface runoff, respectively. Water evaporates from bare soil (i.e., from the part of the grid cell not covered by vegetation) at a rate that relates potential evapotranspiration to the water content of the upper 20 cm. Transpiration from plant-covered area is calculated as the lesser of atmospheric demand, representing potential transpiration in the absence of water limitation, and water supply from the soil-plant system. The latter reaches a vegetation-specific maximum when the soil is saturated and declines linearly with decreasing soil moisture [Sitch et al., 2003; Gerten et al., submitted manuscript, 2003].
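The sequence of daily computations can be made concrete with a heavily simplified sketch. It is not the LPJ code: it uses a single soil layer instead of two, omits interception and bare-soil evaporation, and the parameter values `melt_rate` and `supply_max` are hypothetical. It only illustrates the snow rule, the degree-day melt, runoff as the excess over field capacity, and transpiration as the lesser of demand and a supply that declines linearly with soil moisture.

```python
def daily_step(snow, soil, temp_c, precip, demand, capacity,
               melt_rate=3.0, supply_max=5.0):
    """Advance a one-layer bucket by one day (stores and fluxes in mm).

    melt_rate  : degree-day factor in mm per degree C per day (assumed value)
    supply_max : maximum water supply at full storage in mm per day (assumed value)
    """
    if temp_c < 0.0:
        snow += precip              # all precipitation accumulates as snow
        rain, melt = 0.0, 0.0
    else:
        rain = precip
        melt = min(snow, melt_rate * temp_c)  # degree-day snowmelt
        snow -= melt
    soil += rain + melt
    runoff = max(0.0, soil - capacity)        # water in excess of capacity
    soil -= runoff
    supply = supply_max * soil / capacity     # declines linearly as soil dries
    transpiration = min(demand, supply, soil)
    soil -= transpiration
    return snow, soil, runoff, transpiration

# A wet day on an already full bucket: the excess leaves as runoff.
state = daily_step(snow=0.0, soil=100.0, temp_c=5.0, precip=20.0,
                   demand=3.0, capacity=100.0)
```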
 Reliable simulation and validation of the water budget, and of soil moisture in particular, is of paramount importance for the overall performance of biosphere models. A recent study of the annual and monthly fields of runoff and evapotranspiration computed by LPJ (Gerten et al., submitted manuscript, 2003) found these components of the water balance to be captured well in many parts of the world, although performance in regions where climate input data are inaccurate remains weak, as is also the case where human impacts alter the natural water cycle and land cover (LPJ currently considers potential natural vegetation only). Soil moisture simulated by LPJ in turn has not been previously validated, except for some sites in Eurasia [Sitch et al., 2003]. For the present comparison with satellite-derived data, we use simulated monthly soil moisture fields from the upper 50 cm for the period 1992–1998. The model was driven by monthly climate data from the CRU05 database [New et al., 2000].
4.3. Precipitation Data
 Gridded precipitation data sets from the GPCC have been used for comparison with the scatterometer and model results. The GPCC collects in situ observed precipitation data worldwide and maintains gridded data sets of monthly total precipitation, covering the Earth's land surface (Full Data Product available from GPCC at http://gpcc.dwd.de).
 Conventionally, measured data from rain gauge networks are considered to be the most reliable information to obtain area-averaged precipitation of the land surface. Nevertheless, the accuracy and usefulness of the gridded products strongly depend on the availability and quality of the observed gauge data. Hence, data should be treated with care especially in data-poor regions [Rudolf et al., 1994, 2003]. The entire GPCC database includes monthly precipitation totals of approximately 50,000 stations so far. The number of stations gradually decreases from 42,000 in 1986–1987 down to 7000 real-time reporting stations in 2002 due to the time needed for data acquisition and reprocessing. All gauge data selected for gridding at GPCC are quality-controlled using an expensive high-level quality control system with several automatic and visual/manual components.
 For gridding (here to a 1° × 1° grid) the gauge data have been interpolated with a special version of the cartographic method SPHEREMAP of Willmott et al. . The interpolation is based on the empirical weighting scheme of Sheppard , which uses different distance-weighting rules in order to avoid an overweight of station ensembles in an unbalanced station site distribution. The results do not represent grid point estimates but grid-cell-area-related precipitation totals. The gridded CRU data set used for the LPJ model differs from the GPCC data by the interpolation method, the quality-control level, and, in particular, by the density of the observed database: for 1992, 3759 CRU stations versus 33,503 GPCC stations and for 1994, 2777 CRU stations versus 29,644 GPCC stations.
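The principle of distance weighting can be shown with a minimal sketch. It implements plain inverse-distance weighting in planar coordinates only; the spherical geometry and the additional rules that SPHEREMAP uses to avoid overweighting clustered stations are not reproduced. The function name `idw_estimate` and the exponent `power` are assumptions made for illustration.

```python
import numpy as np

def idw_estimate(station_xy, station_values, target_xy, power=2.0):
    """Inverse-distance-weighted estimate at target_xy from station values."""
    xy = np.asarray(station_xy, dtype=float)
    values = np.asarray(station_values, dtype=float)
    d = np.linalg.norm(xy - np.asarray(target_xy, dtype=float), axis=1)
    if np.any(d == 0.0):
        # A station coincides with the target: use its value directly.
        return float(values[d == 0.0].mean())
    w = 1.0 / d ** power
    return float(np.sum(w * values) / np.sum(w))

# Two stations at equal distance contribute equally.
estimate = idw_estimate([[0.0, 0.0], [2.0, 0.0]], [10.0, 30.0], [1.0, 0.0])
```

With few stations per grid cell, a single gauge dominates the estimate, which is one reason why the gridded values are less reliable in data-poor regions.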
 We compared time series of scatterometer-derived SWI with monthly rainfall and modeled soil moisture of the 0–50 cm layer. The selected study periods correspond to the maximum overlap periods (full years) between the different data sets, i.e., 1992–2000 for the comparison with the rainfall data and 1992–1998 for the model data. While the SWI data are available on an irregular grid with 28-km pixel spacing (corresponding to about half the spatial resolution), both the rainfall and model data are given on a regular grid in the geographic coordinate system (1° × 1° and 0.5° × 0.5°, respectively). As our interest was in evaluating the scatterometer results, we searched for each scatterometer pixel the nearest grid point of each of the other two data sets and used the unaltered time series for comparison (nearest-neighbor method). This means that the same rainfall and model time series were used for evaluating several scatterometer pixels (about 16 and four, respectively, around the equator, and fewer with increasing latitude).
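The nearest-neighbor assignment and the number of scatterometer pixels that share one grid cell can be sketched as follows. The grid layout (cell values centered within cells, origin at 90°S, 180°W) and the value of 111.32 km per degree are assumptions made for illustration; the function names are hypothetical.

```python
import math

def nearest_cell(lat, lon, res_deg):
    """Row and column of the regular-grid cell whose center is nearest to
    (lat, lon), for a grid starting at 90 S, 180 W (assumed layout)."""
    n_rows = int(round(180.0 / res_deg))
    n_cols = int(round(360.0 / res_deg))
    row = min(int(math.floor((lat + 90.0) / res_deg)), n_rows - 1)
    col = min(int(math.floor((lon + 180.0) / res_deg)), n_cols - 1)
    return row, col

def pixels_per_cell(res_deg, lat_deg, spacing_km=28.0, km_per_deg=111.32):
    """Approximate number of scatterometer pixels falling into one grid cell."""
    north_south = res_deg * km_per_deg / spacing_km
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south * east_west

rain_cell = nearest_cell(0.3, 10.7, 1.0)    # 1 degree rainfall grid
model_cell = nearest_cell(0.3, 10.7, 0.5)   # 0.5 degree model grid
```

At the equator this gives roughly 16 pixels per 1° cell and 4 pixels per 0.5° cell, and the count decreases with the cosine of latitude.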
 Rainfall and soil moisture are not directly comparable, but as the former is a driver of the latter, rainfall anomalies should be observable in the remotely sensed soil moisture data. For this reason, anomaly data were calculated for both data sets by determining the mean monthly value based on the years 1992–2000 and by subtracting it from the monthly values. The rainfall data represent monthly rainfall totals (in millimeters) and the SWI data end-of-month conditions (SWI was calculated for 31 January, 28 or 29 February, etc.). The results of the correlation analysis are presented in section 6.
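The anomaly calculation amounts to removing the mean seasonal cycle from each series. A minimal sketch, assuming a gap-tolerant series of complete calendar years stored month by month (the function name and the synthetic data are illustrative):

```python
import numpy as np

def monthly_anomalies(series, n_years):
    """Subtract the mean value of each calendar month from a monthly series
    of length 12 * n_years; missing values (NaN) are ignored in the mean."""
    x = np.asarray(series, dtype=float).reshape(n_years, 12)
    climatology = np.nanmean(x, axis=0)   # mean January, mean February, ...
    return (x - climatology).ravel()

# Nine years (1992-2000) of a purely seasonal signal have zero anomalies.
seasonal = np.tile(np.arange(12.0), 9)
anom = monthly_anomalies(seasonal, n_years=9)
```

The correlation coefficient between two anomaly series can then be obtained with `np.corrcoef(a, b)[0, 1]` after removing months in which either series is missing.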
 For the comparison of the scatterometer and the LPJ soil moisture data, mean monthly values were used. In the case of LPJ, these were obtained by averaging the daily values of the 0–50 cm soil layer over 1 month. In the case of the scatterometer, the mean monthly value was determined by calculating the SWI for the tenth, twentieth, and the last day of each month and by averaging these three values. The soil moisture values from LPJ are given in millimeters and vary between a minimum value representing wilting point and a maximum value representing field capacity. In order to allow a direct comparison with the scatterometer data in the same units, the LPJ soil moisture data, denoted by WLPJ, were converted to relative units using:
$$\mathrm{SWI}_{\mathrm{LPJ}} = \frac{W_{\mathrm{LPJ}}}{(FC - WL)_{\mathrm{LPJ}}} \qquad (3)$$
where the denominator (FC − WL)LPJ is the plant available field capacity used for the LPJ simulations. The obtained quantity is denoted by SWILPJ to express the fact that it is an SWI-equivalent model parameter. The following statistical parameters were calculated based on the monthly time series for the years 1992–1998: the linear Pearson correlation coefficient R, the bias, and the standard deviation. The bias is calculated as follows:
$$\mathrm{bias} = \frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{SWI}_i - \mathrm{SWI}_{\mathrm{LPJ},i} \right) \qquad (4)$$
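where N is the number of months for which data from both the scatterometer and the LPJ model were available (N ≤ 84 depending on the number of available scatterometer data and duration of winter frost). The standard deviation is given by
$$\mathrm{SD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{SWI}_i - \mathrm{SWI}_{\mathrm{LPJ},i} - \mathrm{bias} \right)^2} \qquad (5)$$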
The same statistical parameters were calculated for the respective anomalies (against the 1992–1998 mean). The results are presented in section 7.
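The conversion to relative units and the three statistics can be sketched as follows. Several points are assumptions rather than statements about the original processing: the model water content is taken to be counted from the wilting point, the result is expressed in percent, the bias is defined as scatterometer minus model, and the normalization of the standard deviation (`ddof=0`, i.e., division by N) is a free choice here. All function names are hypothetical.

```python
import numpy as np

def to_relative(w_lpj_mm, available_capacity_mm):
    """Model soil water (mm above wilting point, assumed) as percent of the
    plant available field capacity (FC - WL)."""
    return 100.0 * np.asarray(w_lpj_mm, dtype=float) / available_capacity_mm

def compare(swi, swi_lpj, ddof=0):
    """Return (R, bias, SD, N) for months where both series are available."""
    swi = np.asarray(swi, dtype=float)
    swi_lpj = np.asarray(swi_lpj, dtype=float)
    ok = ~(np.isnan(swi) | np.isnan(swi_lpj))   # e.g. frozen months are NaN
    diff = swi[ok] - swi_lpj[ok]
    r = float(np.corrcoef(swi[ok], swi_lpj[ok])[0, 1])
    return r, float(diff.mean()), float(diff.std(ddof=ddof)), int(ok.sum())

# Illustrative values: the last month has no scatterometer data.
r, bias, sd, n = compare([10.0, 20.0, 30.0, np.nan], [5.0, 15.0, 25.0, 40.0])
```

Because N is at most 84, the choice between division by N and by N − 1 changes the standard deviation by less than 1% of its value.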
 To assess the transferability of the soil moisture retrieval method over different hydrologic regimes and vegetation conditions, summary statistics will be provided for major climate groups and subgroups of the Köppen Climate System [Köppen, 1923]. The Köppen climate classification used within this study is based on the 1° × 1° climate database prepared by Leemans and Cramer . Two kinds of statistics are presented: (1) a “pixel-based” statistic per climate zone, which first calculates R, the bias, and the SD for each pixel and then determines the mean values for the climate zone; (2) a “global” statistic for each climate zone, which pools all data pairs from all pixels within one climate zone together and then calculates one value for each of the statistical parameters.
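The difference between the two kinds of statistics is easiest to see on a small constructed example. The sketch below uses hypothetical function names and synthetic data for two pixels whose time series are uncorrelated within each pixel but differ strongly in their mean level; it shows the correlation coefficient only.

```python
import numpy as np

def pixel_based_r(x, y):
    """Mean of the per-pixel correlation coefficients.
    x, y have shape (n_pixels, n_months)."""
    return float(np.mean([np.corrcoef(a, b)[0, 1] for a, b in zip(x, y)]))

def global_r(x, y):
    """Correlation coefficient of all data pairs pooled over the zone."""
    return float(np.corrcoef(np.ravel(x), np.ravel(y))[0, 1])

# Pixel 1 is dry, pixel 2 is wet; within each pixel the series are unrelated.
x = np.array([[10.0, 12.0, 11.0, 13.0], [60.0, 62.0, 61.0, 63.0]])
y = np.array([[11.0, 10.0, 13.0, 12.0], [61.0, 60.0, 63.0, 62.0]])

r_pixel = pixel_based_r(x, y)   # temporal agreement only
r_global = global_r(x, y)       # also rewards agreement in the spatial pattern
```

The pooled statistic is high here because both data sets agree on which pixel is wetter, even though neither tracks the other in time. This is why the two methods are reported side by side.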
6. Comparison With Precipitation Anomalies
 In Figure 2a the correlation coefficient R between the SWI anomalies and the GPCC rainfall anomalies can be seen for the Earth's land surfaces (with the exception of tropical forest and sand deserts). In general, R is positive over the major part of the land surface with maximum values around 0.7–0.8. Over 67.1% of the land surface R is greater than 0.2, over 29.2% the correlation is not significant (−0.2 < R < 0.2), and over 3.7% the correlation is negative (R < −0.2). The spatial distribution of R does not appear to be related to any known climatic zoning or biogeophysical parameter. Also, it exhibits some unexpected patterns. For example, over Europe there are large acquisition gaps due to frequent SAR operations, which was expected to lower the correlation. In practice, however, R is high over Europe compared to most other regions of the world. Also, high correlations were expected over western Africa due to the distinct rainfall patterns in this region, but R is comparatively low there.
 These findings prompted us to analyze the influence of the number of meteorological stations used by GPCC to derive the global gridded rainfall data. GPCC provides for each month a map showing the number of meteorological stations located within the grid elements. Because the number of stations is variable over time, we calculated the average number of stations ñ over the 108-month-long data record (Figure 2b). The comparison of the two maps in Figure 2 demonstrates that over data-rich areas R is high compared to data-poor areas. Even the lack of rainfall stations over individual countries like Angola or Somalia can be clearly recognized.
 The histograms shown in Figure 3a allow a more quantitative discussion of the impact of the station density on R. The four histogram curves correspond to different groups of rainfall data stratified according to the average number of stations ñ: GPCC grids without a station (ñ = 0); partial data record (0 < ñ < 1); at least one station on average (1 ≤ ñ < 2); and at least two stations on average (ñ ≥ 2). While the maximum values of R are always in the range from 0.7 to 0.8, the center value of R increases, and the width of the distributions decreases with increasing data support ñ. One can also observe that the histograms are slightly skewed to the left. The median values and standard deviations of R with increasing rainfall data support are 0.25 ± 0.24, 0.36 ± 0.21, 0.43 ± 0.19, and 0.52 ± 0.12. As there is no reason to believe that the quality of the scatterometer data depends on the density of meteorological stations, this shows the limits of gridded precipitation products over data-sparse areas.
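The stratification can be sketched as below. The class limits follow the four groups named in the text, with the third class taken as 1 ≤ ñ < 2 as used for Figure 2c; the function name and the sample values are hypothetical and serve only to show the grouping.

```python
import numpy as np

def stratify_by_station_density(r, n_mean):
    """Median and standard deviation of R per station-density class.

    r      : correlation coefficient per grid cell
    n_mean : average number of stations per grid cell over the record
    """
    r = np.asarray(r, dtype=float)
    n_mean = np.asarray(n_mean, dtype=float)
    classes = {
        "n = 0": n_mean == 0,
        "0 < n < 1": (n_mean > 0) & (n_mean < 1),
        "1 <= n < 2": (n_mean >= 1) & (n_mean < 2),
        "n >= 2": n_mean >= 2,
    }
    return {name: (float(np.median(r[m])), float(np.std(r[m])))
            for name, m in classes.items() if m.any()}

summary = stratify_by_station_density(
    r=[0.1, 0.3, 0.5, 0.6, 0.7],
    n_mean=[0.0, 0.5, 1.0, 1.5, 3.0],
)
```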
 Although soil moisture and rainfall are two different variables which, strictly speaking, cannot be directly compared to each other, these correlations appear to be on the low side of what can be expected, probably because of retrieval errors in the scatterometer data and remaining errors in the monthly gridded rainfall products. Therefore, for comparison, the correlation between the soil moisture anomalies simulated by the LPJ model and the GPCC rainfall anomalies was also calculated. Given that the LPJ results are based on gauge measurements, it was expected that R would be higher and more spatially homogeneous. However, the results turned out to be similar. As can be seen in Figure 3b, the R histograms show a distinct behavior with changing data support. The median values and the standard deviations of the histograms are 0.23 ± 0.19, 0.31 ± 0.18, 0.38 ± 0.17, and 0.52 ± 0.14, respectively. The fact that the maximum R values of the histograms in Figures 3a and 3b are comparable suggests that the low correlations are mainly attributable to the different nature of soil moisture and rainfall and consequently that soil moisture anomalies are well captured by the ERS Scatterometer. The observation that a dependency on rainfall station density is found even for the modeled soil moisture data is most likely due to differences between the GPCC and CRU data sets.
 Because our interest is in evaluating the quality of the remotely sensed soil moisture products, we identify potential problem areas by locating the position of pixels with comparatively low R values on the map. The red points in Figure 2c represent grid points with an R value belonging to the lowest 5% of the histograms with at least one (1 ≤ ñ < 2) or two (ñ ≥ 2) meteorological stations on average. One can see that potential problem areas are deserts, mountain areas, and high-latitude regions.
7. Comparison With Modeled Soil Moisture
7.1. Absolute Soil Moisture
 The results of the comparison between the remotely sensed SWI and the modeled soil moisture content of the 0–50 cm layer, given by equation (3), are shown in Figure 4. Table 1 presents the summary statistics for different climate zones according to the pixel-based and global calculation methods. One can see that the correlation coefficient R and the standard deviation SD are smaller for the pixel-based method. The difference is normally relatively small for R (<0.2), but it may be significant for SD (up to 10%). The bias is comparable for both approaches.
Table 1. Summary Statistics of the Comparison of Remotely Sensed and Modeled Soil Moisture (0–50 cm) for Köppen Climate Zones (up to Second Letter)a
Columns: Köppen Climate Classes; Number of Points; R, SD, and Bias (pixel based); R, SD, and Bias (global).
Köppen climate classes listed: Af, No dry season; Am, Monsoon type; Aw, Distinct dry season; Cf, No dry season; Cs, Dry summer; Cw, Dry winter; Df, No dry season; Dw, Dry winter.
a The standard deviation and the bias are given in percentage of relative soil moisture. The parameters in columns three to five were calculated by first calculating R, SD, and bias for each pixel and then averaging the results over the entire climate zone (pixel based). The results in columns six to eight were obtained by pooling all data from the climate zones into one diagram and calculating one set of parameters (global).
 With the exception of deserts, high-latitude, and mountain areas, the correlation R is generally relatively high with maximum values around 0.9 (Figure 4a). For some unknown reason R is low over the eastern United States compared to similar climatic regions. On average, R is highest for the tropical climate zone (0.69/0.71) followed by the temperate zone (0.61/0.66) and the steppe climate (0.52). For the other climate types (BW, D, and E) R is generally below 0.5.
 The standard deviation is lowest for dry climates with values in the range 11–19%, depending on the calculation method. For the A, C, and D climates it is in the range between 15 and 24%. The upper limit is about 25% (Figure 4). Because model deficiencies also introduce errors of unknown quantity, it is not possible to determine the RMS error between the remotely sensed soil moisture data and the “true” areal value based on this comparison. Only if the RMS error of the model is known can the RMS error of the retrieved soil moisture data be calculated. As an example, let us assume that the random error is equally high for the scatterometer and model data. In this case, the RMS error is obtained by dividing the SD by √2. For an SD equal to 25%, the RMS error of the SWI data would then be equal to 18%. This corresponds to 0.036 m³ m⁻³ for a soil with a wilting point of 0.1 m³ m⁻³ and a field capacity of 0.3 m³ m⁻³.
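The arithmetic of this example can be checked directly. If the scatterometer and model errors are independent and have the same variance σ², the variance of their difference is 2σ², so the RMS error of either data set is SD/√2. The conversion to volumetric units multiplies the relative value by the difference between field capacity and wilting point. The function names in this short sketch are illustrative.

```python
import math

def rms_from_sd(sd_percent):
    """RMS error of one data set, assuming equal and independent errors."""
    return sd_percent / math.sqrt(2.0)

def to_volumetric(rel_percent, wilting, field_capacity):
    """Convert relative soil moisture (percent) to volumetric units (m3 m-3)."""
    return rel_percent / 100.0 * (field_capacity - wilting)

rms = rms_from_sd(25.0)                           # about 17.7, i.e. roughly 18%
vol = to_volumetric(round(rms), 0.1, 0.3)         # about 0.036 m3 m-3
```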
 The bias is generally in the range ±10% over the majority of areas in the A, B, and C climate zones. The tendency is that over tropical and dry climates the scatterometer-derived data are slightly larger (0.5–2.4%) than the modeled data, while over temperate zones they tend to be somewhat lower (−6%). While these results are within reasonable limits, the bias observed over cold and polar climates is significant. On average, SWI exceeds the model data in these areas by about 15%, but differences up to about 50% are observed.
7.2. Soil Moisture Anomalies
 The results of the anomaly analysis showed that the global patterns of the correlation R between the remotely sensed and modeled soil moisture anomalies are very similar to the correlation map shown in Figure 2a (and are therefore not shown here). This is an indication of the strong influence of the station density not only on the interpolated precipitation data but also on modeled soil moisture anomalies. This is confirmed by Figure 5, which shows histograms of R between SWI and SWILPJ anomalies stratified according to station density. As in Figure 3, the center of the histograms increases, and the width decreases, with increasing station density. The median values and the standard deviations of the histograms are 0.22 ± 0.29, 0.32 ± 0.25, 0.37 ± 0.26, and 0.43 ± 0.21, respectively.
 Unlike the standard deviation of SWI − SWILPJ shown in Figure 4b, the SD of the respective anomalies does not exhibit distinct spatial patterns. A great majority of the values fall in the range between 10 and 12%, which corresponds to 0.02–0.024 m³ m⁻³ for a soil with a wilting level of 0.1 m³ m⁻³ and a field capacity of 0.3 m³ m⁻³.
7.3. Discussion for A and C Climates
 As expected, the patterns of seasonal rainfall influence the results of the correlation analysis between the remotely sensed and modeled soil moisture data. The correlation is best over areas with distinct seasonal patterns of rainfall, which cause large variations in soil moisture (Am, Aw, Cs, and Cw). In these regions about 35–55% of the variability of one data set is explained by the other. In climate zones with more equally distributed rainfall (Af, Cf) R² is about 25%. The latter result is comparable to the correlation of the anomaly data. This is deemed a reasonable result given that both errors in the retrieval and the model simulations (due to model deficiencies and inaccuracies of the input data) contribute to differences between the two data sets.
 The results obtained for the Cs climate are of particular importance for judging the functioning of the soil moisture retrieval method. This is because the Cs climate has a wet winter and a dry summer. Consequently, soil moisture and vegetation behave quasi-anticyclic, while in the other A or C climate types vegetation and soil moisture are often highly correlated. Under the latter conditions, imperfections in the backscatter model may be hidden because vegetation may act as proxy indicator for soil moisture or vice versa. On the other hand, over the Cs climate such imperfections of the retrieval method would yield inconsistent results. For example, let us assume that our algorithm depends on vegetation as proxy indicator for soil moisture. For the Cs climate the consequence would be that retrieved soil moisture values are too low in winter and too high in summer. The fact that the results for Cs are consistent with the other A and C climate types adds confidence with respect to the physical appropriateness of our retrieval method.
7.4. Discussion for B Climates
 Over many dry climatic regions, particularly over sandy deserts like the Takla Makan, strong azimuthal effects are observed. As it is presently assumed that the Earth's surface is an isotropic scatterer, these areas are masked out. While over steppe climates (BS) an average correlation with modeled soil moisture on the order of 0.5 is observed, it is much lower over desert areas (BW). In some areas even negative correlations occur. The reasons for this behavior will be investigated in future studies, but part of it may be attributable to unexpected seasonal variations in the backscattering coefficient on the order of up to 1 or 1.5 dB. Fortunately, while these effects are certainly spurious, their magnitude is relatively small. The standard deviation between scatterometer and model soil moisture data is below 20%, and the bias is less than about ±4%. For sandy soils with a low water retention capacity (0.1 m³ m⁻³) these values correspond to about 0.02 and ±0.004 m³ m⁻³ in volumetric units.
7.5. Discussion for D and E Climates
 The biggest discrepancies between the different data sets occur in high-latitude regions (D and E climates). Correlations are low, and over many boreal and tundra regions the scatterometer-derived soil moisture values suggest much wetter conditions than the LPJ model. Given that the current version of the LPJ water balance model is too simplistic to characterize the complex hydrology in these areas (e.g., permafrost or wetlands are not modeled), and given the sparse input data and the short snow- and frost-free period in the summer months, the results do not allow conclusions to be drawn about the quality of the scatterometer-derived soil moisture data. However, numerous phenomena will need to be addressed in more detail in follow-up studies. These include the effects of small lakes, temporary water surfaces, wetlands, and the freezing of soil and vegetation during winter on the scatterometer-derived soil moisture data.
 In this paper the first global, multiannual soil moisture data set (1992–2000) derived from spaceborne microwave measurements is presented. Backscatter measurements acquired with the C-band scatterometer on board ERS-1/2 were used to calculate surface soil moisture time series. On the basis of these series, a trend indicator for the soil moisture content in deeper soil layers (<1 m) is calculated. Monthly SWI data were compared to globally gridded precipitation products and modeled soil moisture of the 0–50 cm layer. Given that the accuracy of the rainfall and model data is not known, it has not been possible to determine the quality of the scatterometer data globally. Nevertheless, in situations where scatterometer data compare well to the other data sets it is very likely that all data sets are of good quality (it is extremely unlikely that the same errors are present in the data sets given the very different acquisition and interpretation processes). In such situations it is also possible to state upper limits of the accuracy of the scatterometer data.
 The comparison of SWI anomalies (compared to the 1992–2000 monthly means) with rainfall anomalies has shown that a reasonable agreement (R² = 25%) can only be reliably achieved over areas with a relatively dense station network. The same observation applies to modeled soil moisture anomalies. The comparison with the modeled anomalies further suggests that the standard deviation (respectively, the relative RMS error) is better than about 10–12%. Considering that the employed soil moisture retrieval method is essentially a change detection method and that the station density plays no role in the retrieval, this is evidence that the remotely sensed soil moisture anomalies are suited to improve our capabilities to monitor extreme soil moisture conditions (droughts, floods), particularly in data-sparse areas.
 Since it is argued that the transferability of a retrieval algorithm is an important indication of its physical appropriateness, statistics were prepared for different climatic regions based on the Köppen Climate Classification. Over tropical (A) and temperate (C) regions the scatterometer and the modeled soil moisture data compare well. Over some areas correlations of up to about 0.9 are observed. Average correlations in the subregions (Af, Am, Aw, Cf, Cs, and Cw) are in the range from about 0.45 to 0.7, with the lower range corresponding to climates with evenly distributed rainfall and higher values to climates with distinct seasonal rainfall patterns. The fact that the retrieval method could be transferred over different climatic zones is an indication that the soil moisture retrieval algorithm successfully accounts for vegetation effects. The upper limit of the average RMS error of the remotely sensed soil moisture data is about 25%, which corresponds to about 0.03–0.07 m³ m⁻³ depending on soil type. This is in agreement with the results of an accuracy assessment using field measurements in Ukraine [Wagner et al., 1999c].
 Over steppe and desert climates spurious effects are observed in the scatterometer data, which will be subject to further studies. Also, the present study did not allow us to draw conclusions with respect to the capability of the scatterometer to monitor seasonal soil moisture patterns over high-latitude climates (D and E) and mountainous regions.
 The present study represents a first step toward a better understanding of the quality of the scatterometer-derived global soil moisture data. The results are encouraging, and modelers may consider using these data for model validation, calibration, or input (e.g., within the framework of an assimilation scheme). The present comparison with soil moisture simulated by a global vegetation model is an exemplary and important contribution to global biogeochemical and hydrological modeling. The attractiveness of the data lies in the fact that they represent a completely new source of soil moisture information, which, under careful consideration of its limitations, may allow the investigation of some shortcomings of state-of-the-art land-surface schemes and macroscale hydrological models used by climate change modelers, meteorologists, hydrologists, agronomists, and many others.
 The retrieved soil moisture data have been made available on a World Wide Web site (http://www.ipf.tuwien.ac.at/radar/ers-scat/home.htm) and can be obtained in digital format on request. The Web site also allows a visual comparison with meteorological data (rainfall, temperature, and snow) and in situ soil moisture measurements.
 The study has been made possible thanks to funding from the following organizations: Austrian Science Fund (SHARCKS project, P14002-TEC), European Space Agency (Data User Programme 2001, CLIMSCAT project, ESRIN Co. 155566/02/I-LG), European Commission (Framework Programme 5, SIBERIA II project), and the German Ministry for Education and Research (Climate Research Programme DEKLIM, project Climate, Vegetation and Carbon). IFREMER CERSAT kindly provided the ERS scatterometer data on CD-ROM. We also would like to thank our colleagues Marco Trommler, Richard Kidd, and Kathleen Naumann for help with the data processing.