4.1. Remotely Sensed Soil Moisture
 The ERS-1 scatterometer regularly acquired data between August 1991 and May 1996 and the ERS-2 scatterometer between March 1996 and January 2001, when due to a failure of a gyroscope all ERS-2 instruments were temporarily switched off. ERS-2 has continued to acquire scatterometer data in gyroless mode after January 2001, but ESA has not yet released these data due to calibration problems. The scatterometer achieves a daily global coverage of up to about 41%, but often it is much less due to operational conflicts with the ERS SAR. Therefore the temporal sampling rate is highly irregular depending on latitude and the frequency of SAR acquisitions. The instrument has three antennas, which look at the surface at incidence angles ranging between 18° and 59° from three different azimuthal directions (sideward and 45° forward and backward). For each point on the Earth's surface, three values of the backscattering coefficient are provided together with other relevant orbit and instrument information such as incidence and azimuth angles.
 Data processing is completed systematically and is described in detail in the work of Wagner et al. [1999a, 1999b, 1999c] and Scipal. Like the SSM/I retrieval method of De Ridder, it is a time series processing approach, which essentially employs information from the satellite data itself. In a preprocessing step, the scatterometer data are rearranged from an image format to a time series format without altering the data. In this way, multiyear time series of scatterometer measurements are built up over a predefined global grid. The further processing steps use as input all available scatterometer data (August 1991 to January 2001). The processing steps are as follows:
 1. Estimating the standard deviation (SD) of σ⁰ due to instrument noise, speckle, and azimuthal effects based on measurements of the forward- and backward-looking antennas.
 2. Determining the incidence angle behavior of σ⁰ by making use of the fact that the scatterometer provides instantaneous measurements at two different incidence angles.
 3. Extrapolating all σ⁰ values taken over the entire incidence angle range (18°–59°) to a reference angle of 40° and calculating the average σ⁰(40) based on the backscatter triplet.
 4. Selecting values for the crossover angles θ_dry and θ_wet where the effects of vegetation growth on σ⁰ are assumed to be negligible.
 5. Modeling the effect of vegetation phenology and determining the exact positions of the dry and wet soil backscatter reference curves, σ⁰_dry(t) and σ⁰_wet(t), by fitting the curves to the σ⁰(40) time series.
 6. Calculating the surface soil moisture series m_s by comparing σ⁰(40) to the dry and wet reference curves.
 7. Masking measurements affected by snow and/or frozen soil. The information contained in the backscatter time series is principally useful for monitoring freeze/thawing processes [Wismann, 2000; Scipal and Wagner, 1998]. However, currently, only a simple masking procedure, which is mainly based on mean monthly temperature data extracted from a climate database prepared by Leemans and Cramer, is employed.
 8. Calculating SWI for a selected value of the characteristic time length T. All measurements taken within a period 3T are considered if at least 4 measurements have been recorded within the most recent time period T and if the ground was snow free and not frozen. In cold climates, the calculation of SWI is in principle not possible at the beginning of the thawing period due to the lack of physically meaningful m_s data in the previous few weeks. However, to enable the calculation of SWI right from the onset of thawing, the preceding m_s values are set equal to 1. The underlying assumption is that the soil moisture status is high at the end of the winter season due to snowmelt. This assumption is not valid for dry cold climates and will be revised in the future.
 9. If soil hydrologic properties are known (wilting level, field capacity, and porosity) the plant available water content can be calculated.
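 Step 6 of the list above can be illustrated with a minimal sketch. It assumes the linear scaling between the dry and wet reference values described in the cited work of Wagner et al.; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
def surface_soil_moisture(sigma40, sigma_dry, sigma_wet):
    """Relative surface soil moisture m_s (0 = dry reference, 1 = wet reference).

    All inputs are backscatter values in dB at the 40 degree reference angle,
    taken at the same point in time. Linear scaling is an assumption here.
    """
    sensitivity = sigma_wet - sigma_dry
    if sensitivity <= 0:
        raise ValueError("wet reference must exceed dry reference")
    return (sigma40 - sigma_dry) / sensitivity


# Hypothetical values in dB, chosen only for illustration.
m_s = surface_soil_moisture(sigma40=-11.0, sigma_dry=-14.0, sigma_wet=-8.0)
print(m_s)  # 0.5
```

 In this sketch the denominator is the same wet-minus-dry difference that is later used as a masking criterion for densely forested pixels.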
 The data processing can in principle be completed fully automatically, but the operator has the opportunity to manually select a number of parameters. In step 4 one may select θ_dry and θ_wet based on a visual comparison of the σ⁰(40) series with the corresponding dry and wet reference curves. Changing θ_dry and θ_wet has the effect of enhancing or lessening the vegetation correction. Given that the values which were used in the previous studies over the Iberian Peninsula, Ukraine, and western Africa (θ_dry = 20°, θ_wet = 40°) also produced reasonable results in other areas, it was decided to use these values globally.
 The operator may also influence the determination of the dry and wet reference curves (step 5). In dry climates there may never be enough rainfall to thoroughly wet the soil surface layer [Wagner and Scipal, 2000]. For this situation, an empirical correction approach was developed which estimates σ⁰_wet(t) corresponding to a wet, saturated soil surface [Scipal, 2002]. Figure 1 shows those areas where this correction was applied. Correspondingly, in wet climates the soil surface layer may never dry out completely. In high-latitude climates m_s values representing frozen soil are used instead because the dielectric properties of frozen and dry soil are similar [Hallikainen et al., 1984].
Figure 1. Areas where no soil moisture information is provided due to dense forest cover (dark shaded areas) and strong azimuthal effects (black areas). Also indicated are those areas where the wet backscatter reference value was empirically corrected. The different shading tones represent the magnitude of the correction in decibels.
 In step 8 the characteristic time length T can be selected to calculate SWI. Typically, the temporal variability of soil moisture decreases with increasing layer thickness. Therefore, given that the parameter T controls the degree of smoothing of the m_s series, higher T values are representative of deeper layers. In the Ukrainian study, SWI agreed best with field measurements of the 0–20 cm layer for T = 15 days and with those of the 0–100 cm layer for T = 20 days [Wagner et al., 1999c]. For the global processing T was kept constant at 20 days. Given that the temporal variability of the soil moisture field is dependent on climate and soils [Entin et al., 2000], this implies that the SWI data may represent layers of variable thickness in different parts of the world.
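 The smoothing role of T can be sketched as follows. The sketch assumes that SWI is the exponentially weighted average of past surface soil moisture values, as in the cited work of Wagner et al. [1999c], and applies the 3T window and the four-measurement condition of step 8. The function name and arguments are hypothetical, and the snow and frozen-ground handling is left out.

```python
import math


def swi(times, ms_values, t, T=20.0, min_recent=4):
    """Soil Water Index at day t, or None if too few recent measurements.

    times are in days; ms_values are relative surface soil moisture (0..1).
    Assumes exponential weights exp(-(t - ti) / T) for measurements ti <= t.
    """
    # Keep only measurements within the period 3T before t.
    window = [(ti, m) for ti, m in zip(times, ms_values) if t - 3 * T < ti <= t]
    # Require enough measurements within the most recent period T.
    recent = sum(1 for ti, _ in window if ti > t - T)
    if recent < min_recent:
        return None
    weights = [math.exp(-(t - ti) / T) for ti, _ in window]
    return sum(w * m for w, (_, m) in zip(weights, window)) / sum(weights)


# Hypothetical daily series: dry for 30 days, then wet for 10 days.
days = list(range(40))
ms = [0.0] * 30 + [1.0] * 10
fast = swi(days, ms, t=39, T=5.0)   # responds quickly to the wetting
slow = swi(days, ms, t=39, T=20.0)  # smoother, retains more of the dry period
```

 With these inputs the short T tracks the recent wet period more closely than the long T, which is the behavior that links higher T values to deeper, more slowly varying layers.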
 The backscattering coefficient measured by C-band radars saturates over forests with about 20–30 t/ha aboveground biomass [Le Toan et al., 2001]. For the case of the ERS scatterometer this means that the measured signal is relatively stable if a significant portion of the radar footprint is covered by forests. In fact, tropical rain forest in South America has been used as a distributed reference target for the calibration of the ERS scatterometer [Lecomte and Wagner, 1998]. Since in such a situation there are not enough soil moisture sensitive areas (e.g., grassland or agriculture) within one ERS footprint, soil moisture retrieval is not possible. To identify these regions, pixels where the difference σ⁰_wet − σ⁰_dry is less than 2 dB were masked. The resulting forest mask covers an area of roughly 9.6 × 10⁶ km². It occurs only in the equatorial rain forest belt (Figure 1). Also, areas where σ⁰ is strongly dependent on the azimuth angle have been masked. The masking criterion is based on the estimated standard deviation of σ⁰ calculated in step 1 of the process. If this parameter was above 1 dB and if the reasons for this high noise could be attributed to azimuthal effects, e.g., caused by sand dunes, then the corresponding pixel was masked. The masked areas cover an area of about 4.7 × 10⁶ km² and can mainly be found in the Sahara, the Rub' al Khali, and the Takla Makan (Figure 1).
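 The two masking rules can be written down compactly. The following is a minimal sketch using the 2 dB and 1 dB thresholds quoted above; the function name, the labels, and the `azimuthal` flag (standing in for the judgment that high noise is caused by azimuthal effects) are hypothetical.

```python
def classify_pixel(sigma_dry, sigma_wet, noise_sd, azimuthal=False,
                   min_sensitivity_db=2.0, max_noise_db=1.0):
    """Return 'forest_mask', 'azimuth_mask', or 'valid' for one pixel.

    sigma_dry and sigma_wet are the dry and wet reference values in dB;
    noise_sd is the estimated standard deviation of the backscatter in dB.
    """
    # Too little difference between wet and dry reference: dense forest.
    if sigma_wet - sigma_dry < min_sensitivity_db:
        return "forest_mask"
    # High noise counts only if it is attributed to azimuthal effects.
    if noise_sd > max_noise_db and azimuthal:
        return "azimuth_mask"
    return "valid"


print(classify_pixel(-12.0, -11.0, 0.3))                  # forest_mask
print(classify_pixel(-16.0, -10.0, 1.4, azimuthal=True))  # azimuth_mask
print(classify_pixel(-16.0, -10.0, 0.4))                  # valid
```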
4.2. Modeled Soil Moisture
 Soil moisture is intrinsically coupled to the carbon assimilation dynamics of vegetation growing on the soil. It is therefore important that a model used to predict soil moisture computes the fully coupled water-carbon biogeochemistry of each pixel. Simulated monthly soil moisture fields are here taken from LPJ, a nonequilibrium biogeography-biogeochemistry model that combines process-based representations of terrestrial vegetation dynamics and land-atmosphere carbon and water exchanges in a single framework [Sitch et al., 2003].
 The LPJ dynamic global vegetation model explicitly simulates major ecosystem processes, including vegetation growth, mortality, and resource competition of 10 defined plant functional types, and carbon allocation to several pools for producing new tissue. Representation of these processes is of intermediate complexity to allow global applications (here, on a 0.5° × 0.5° grid). The presence and fractional cover of plant functional types within a grid cell is determined yearly according to bioclimatic, physiological, morphological, and fire-resistance features. The model captures the biogeographical distribution of the Earth's major vegetation types, and simulates well the global magnitude of terrestrial carbon pools, including their change in response to climatic variations [Cramer et al., 2001; Lucht et al., 2002; Sitch et al., 2003].
 Water is a fundamental factor determining vegetation composition and distribution around the world. In turn, vegetation exerts significant influences on runoff generation, evapotranspiration, and soil moisture. Because the terrestrial water and carbon cycles are intrinsically coupled, LPJ considers the interrelatedness of vegetation growth, carbon cycling, and soil water content by mechanistically linking the associated processes. CO2 uptake from the atmosphere and transpiration by plants, both of which occur simultaneously through the same physical pathway, the stomata, are simulated concurrently using a coupled photosynthesis-water balance scheme. Both transpiration and carbon uptake/photosynthesis are limited by soil water content and are also controlled by atmospheric CO2 concentration [Sitch et al., 2003; Gerten et al., submitted manuscript, 2003]. Moreover, leaf phenology is determined depending not only on temperature but also on water stress thresholds. Decomposition of litter and soil organic matter is also driven by soil water content. In turn, simulated runoff, evapotranspiration, and soil water content are decisively influenced by vegetation attributes such as interception storage capacity, seasonal phenology, and rooting depth.
 LPJ computes the most important fluxes and pools of water at daily time steps for each grid cell, based on disaggregated monthly climate input data (air temperature, precipitation amount, number of wet days, and cloudiness; New et al., 2000). The water balance computations are described in detail elsewhere (Gerten et al., submitted manuscript, 2003), thus only a short overview is provided here. At temperatures below 0°C, all precipitation is assumed to accumulate as snow; the snowpack melts and infiltrates into the soil at >0°C, using a simple degree-day method. Part of the precipitation is lost as interception from the canopies; the remainder enters the soil column, which is divided into an upper layer of 50 cm and a lower layer of 100 cm thickness. The amount of water in excess of each layer's field capacity, which is defined based on nine soil texture types derived from a global database [FAO, 1991], does not infiltrate and is accounted for as surface and subsurface runoff, respectively. Water evaporates from bare soil (i.e., from the part of the grid cell not covered by vegetation) at a rate that relates potential evapotranspiration to the water content of the upper 20 cm. Transpiration from plant-covered area is calculated as the lesser of atmospheric demand, representing potential transpiration in the absence of water limitation, and water supply from the soil-plant system. The latter reaches a vegetation-specific maximum when the soil is saturated and declines linearly with decreasing soil moisture [Sitch et al., 2003; Gerten et al., submitted manuscript, 2003].
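 Three of the rules in this overview can be illustrated with a simplified sketch. It is not the LPJ code: the function names, the degree-day melt rate of 3 mm per °C per day, and all numerical values are assumptions made for illustration, and interception, bare-soil evaporation, and the exchange between the two layers are omitted.

```python
def snow_and_melt(snowpack_mm, temp_c, precip_mm, melt_rate_mm_per_degc=3.0):
    """Return (new snowpack, liquid water reaching the soil) for one day."""
    if temp_c < 0.0:
        # Below freezing, all precipitation accumulates as snow.
        return snowpack_mm + precip_mm, 0.0
    # Degree-day melt, limited by the snow that is available (rate assumed).
    melt = min(snowpack_mm, melt_rate_mm_per_degc * temp_c)
    return snowpack_mm - melt, precip_mm + melt


def fill_layer(water_mm, inflow_mm, field_capacity_mm):
    """Add inflow to a soil layer; water above field capacity becomes runoff."""
    total = water_mm + inflow_mm
    runoff = max(0.0, total - field_capacity_mm)
    return total - runoff, runoff


def transpiration(demand_mm, supply_max_mm, relative_moisture):
    """Lesser of atmospheric demand and a supply that scales linearly
    with relative soil moisture (0 = dry, 1 = saturated)."""
    return min(demand_mm, supply_max_mm * relative_moisture)


# One hypothetical day: 10 mm of snow on the ground, 2 degC, 5 mm of rain.
snowpack, liquid = snow_and_melt(10.0, 2.0, 5.0)  # (4.0, 11.0)
upper, runoff = fill_layer(40.0, liquid, 45.0)    # (45.0, 6.0)
print(transpiration(4.0, 5.0, 0.5))               # 2.5
```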
 Reliable simulation and validation of the water budget, and of soil moisture in particular, is of paramount importance for the overall performance of biosphere models. A recent study of the annual and monthly fields of runoff and evapotranspiration computed by LPJ (Gerten et al., submitted manuscript, 2003) found these components of the water balance to be captured well in many parts of the world, although performance in regions where climate input data are inaccurate remains weak, as is also the case where human impacts alter the natural water cycle and land cover (LPJ currently considers potential natural vegetation only). Soil moisture simulated by LPJ in turn has not been previously validated, except for some sites in Eurasia [Sitch et al., 2003]. For the present comparison with satellite-derived data, we use simulated monthly soil moisture fields from the upper 50 cm for the period 1992–1998. The model was driven by monthly climate data from the CRU05 database [New et al., 2000].
4.3. Precipitation Data
 Gridded precipitation data sets from the GPCC have been used for comparison with the scatterometer and model results. The GPCC collects in situ observed precipitation data worldwide and maintains gridded data sets of monthly total precipitation, covering the Earth's land surface (Full Data Product available from GPCC at http://gpcc.dwd.de).
 Conventionally, measured data from rain gauge networks are considered to be the most reliable information to obtain area-averaged precipitation of the land surface. Nevertheless, the accuracy and usefulness of the gridded products strongly depend on the availability and quality of the observed gauge data. Hence, data should be treated with care especially in data-poor regions [Rudolf et al., 1994, 2003]. The entire GPCC database includes monthly precipitation totals of approximately 50,000 stations so far. The number of stations gradually decreases from 42,000 in 1986–1987 down to 7000 real-time reporting stations in 2002 due to the time needed for data acquisition and reprocessing. All gauge data selected for gridding at GPCC are quality-controlled using an expensive high-level quality control system with several automatic and visual/manual components.
 For gridding (here to a 1° × 1° grid) the gauge data have been interpolated with a special version of the cartographic method SPHEREMAP of Willmott et al. The interpolation is based on the empirical weighting scheme of Sheppard, which uses different distance-weighting rules in order to avoid overweighting station clusters where the station distribution is unbalanced. The results do not represent grid point estimates but grid-cell-area-related precipitation totals. The gridded CRU data set used for the LPJ model differs from the GPCC data in the interpolation method, the quality-control level, and, in particular, the density of the observed database: for 1992, 3759 CRU stations versus 33,503 GPCC stations, and for 1994, 2777 CRU stations versus 29,644 GPCC stations.
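 The idea of distance weighting on a sphere can be shown with a much simplified sketch. It uses plain inverse-distance weights with great-circle distances and returns a point estimate; it does not reproduce SPHEREMAP or the GPCC procedure, which additionally corrects for clustered stations and yields grid-cell-area totals. The function names, the power of 2, and the station values are assumptions.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def idw_estimate(lat, lon, stations, power=2.0):
    """Inverse-distance-weighted value at (lat, lon).

    stations is a list of (lat, lon, value) tuples.
    """
    num = den = 0.0
    for s_lat, s_lon, value in stations:
        d = great_circle_km(lat, lon, s_lat, s_lon)
        if d < 1e-6:
            # The target coincides with a station: use its value directly.
            return value
        w = d ** -power
        num += w * value
        den += w
    return num / den


# Hypothetical monthly totals (mm) at two stations equally far from the target.
stations = [(0.0, -1.0, 10.0), (0.0, 1.0, 30.0)]
estimate = idw_estimate(0.0, 0.0, stations)  # close to the mean of 10 and 30
```

 Because both stations are equally distant, the estimate reduces to their mean; with an unbalanced station distribution a clustered group would dominate this plain scheme, which is the effect the additional weighting rules are meant to avoid.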