Evaluation of satellite-based and model re-analysis rainfall estimates for Uganda


  • We dedicate this paper to the expertise, commitment and passion for African rainfall and its importance to society by our co-author and friend Dr David Grimes who sadly passed away on 22 December 2011 and who will be sorely missed by a great many people.

R. I. Maidment, Department of Meteorology, University of Reading, Berkshire RG6 6BB, UK. E-mail: r.i.maidment@pgr.reading.ac.uk


The dependence of much of Africa on rainfed agriculture leads to a high vulnerability to fluctuations in rainfall amount. Hence, accurate monitoring of near-real-time rainfall is particularly useful, for example in forewarning of possible crop shortfalls in drought-prone areas. Unfortunately, ground-based observations are often inadequate. Rainfall estimates from satellite-based algorithms and numerical model outputs can fill this data gap; however, rigorous assessment of such estimates is required. In this study, three satellite-based products (NOAA-RFE 2.0, GPCP-1DD and TAMSAT) and two numerical model outputs (ERA-40 and ERA-Interim) have been evaluated for Uganda in East Africa using a network of 27 rain gauges. The study focuses on the years 2001–2005 and considers the main rainy season (February to June). All data sets were converted to the same temporal and spatial scales. Kriging was used for the spatial interpolation of the gauge data. All three satellite products showed similar characteristics and had a high level of skill that exceeded both model outputs. ERA-Interim had a tendency to overestimate whilst ERA-40 consistently underestimated the Ugandan rainfall.

1. Introduction

The economy of most African countries is dependent on rainfed agriculture and hence the African continent is highly sensitive to variations in rainfall amount. It follows that accurate measurement of rainfall should be a high priority in Africa. In spite of this, Africa has the lowest density of raingauges for any continent apart from Antarctica and operational radar installations are almost non-existent (Washington et al., 2006). Given the lack of ground based observations, indirectly calculated rainfall estimates from satellite imagery and Numerical Weather Prediction (NWP) model outputs assume greater importance. There are many satellite-based rainfall algorithms and many model products. Rigorous evaluation of these methods is necessary with a view both to operational uses of rainfall data and also to improving knowledge of the African rainfall climate. Perversely, the sparseness of ground based observations makes evaluation more difficult than elsewhere and comparisons must be carried out using methods which take due account of this problem.

Unfortunately, only a few such evaluations have been performed for Africa. Studies of satellite algorithms include Thorne et al. (2001) for southern Africa, Ali et al. (2005) and Jobard et al. (2007) for the west African Sahel, and Dinku et al. (2007, 2008) for Ethiopia. Results from such studies show large differences in algorithm performance depending on season and local climate. For example, the NOAA-RFE 2.0 algorithm performs well in west Africa (Jobard et al., 2007) but poorly in Ethiopia (Dinku et al., 2007). Conversely, the CMORPH algorithm shows good agreement with gauge data in Ethiopia but strongly underestimates rainfall amount in the Sahel. The variable performance may be explained by the strong spatial and temporal variations in climate related to the passage of the Inter-Tropical Convergence Zone (ITCZ), the influence of other large-scale climate features and the effects of oceans and topography. Algorithms which show consistent performance over a number of regions tend to be those which rely on local calibration, such as the TAMSAT algorithm (described in Section '3. Data and data preparation').

Evaluation of NWP model rainfall over Africa includes Poccard et al. (2000), Funk and Verdin (2003) and Diro et al. (2009). In general, these studies show that model products tend to be better when averaged over large areas (>10⁶ km²) and time steps (monthly and longer), and perform worse at finer scales. The model rainfall estimates also tend to be less reliable in the tropics than in mid-latitudes. Where model products have been compared to satellite estimates, the satellite usually gives a more accurate representation of the rainfall relative to raingauge measurements.

The study described in this paper has been carried out as part of the FOODSEC action (http://mars.jrc.ec.europa.eu/mars/About-us/FOODSEC) of the European Commission Joint Research Centre (JRC) which produces agrometeorological bulletins in various countries of Sub-Saharan Africa with a focus on the Horn of Africa. The aim of this paper is to compare satellite and model-based methods for providing quasi-real-time rainfall estimates for Uganda during the main crop-growing season which runs from February to June. Three satellite algorithms (TAMSAT, NOAA-RFE 2.0 and GPCP-1DD) and two re-analysis model products (ERA-40 and ERA-Interim) from the European Centre for Medium Range Weather Forecasts (ECMWF) have been evaluated by comparison against raingauge observations provided by the Ugandan Meteorological Service. The time and space scales of the comparison are dekad and 0.5° respectively. (There are three dekads in each calendar month. The first two dekads have 10 days each and the third dekad has between 8 and 11 days depending on the length of the month.) These scales are appropriate to agricultural applications such as agricultural monitoring and crop-yield prediction (Challinor et al., 2003).
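The dekad convention described above can be sketched in a few lines of Python. This is only an illustration of the calendar rule; the function names `dekad_of` and `dekad_length` are hypothetical and are not taken from any of the products evaluated here.

```python
import calendar
from datetime import date


def dekad_of(d: date) -> int:
    """Return the dekad (1, 2 or 3) of the calendar month containing d."""
    # Days 1-10 -> dekad 1, days 11-20 -> dekad 2, day 21 onwards -> dekad 3.
    return min((d.day - 1) // 10 + 1, 3)


def dekad_length(year: int, month: int, dekad: int) -> int:
    """Return the number of days in the given dekad of the given month."""
    if dekad in (1, 2):
        return 10
    # The third dekad absorbs whatever is left of the month (8 to 11 days).
    days_in_month = calendar.monthrange(year, month)[1]
    return days_in_month - 20
```

For the February to June season considered here, the third dekad therefore has 8 or 9 days in February, 10 days in April and June, and 11 days in March and May.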

2. Rainfall climate of Uganda

Uganda is a landlocked country in Eastern Africa located on the Equator (Figure 1). The central part of the country is about 1000 m above sea level with lower-lying land to the northwest and mountainous areas in the east and southwest. Africa's largest lake, Lake Victoria, occupies the south-eastern corner of the country. Other significant bodies of water include Lake Albert in the west and Lake Kyoga in the centre. The rainfall in Uganda is largely determined by the passage of the ITCZ modulated by the complex topography and the presence of the lakes. Recent research has shown that the Indian Ocean, the Red Sea and the coastal waters off South Africa are also moisture sources for Uganda (Gimeno et al., 2010). Southern Uganda experiences two distinct rainfall seasons (February to June and August to November) coinciding with the northward and southward passage of the ITCZ. In the north of the country, these two seasons merge into one lasting from April to October. Highest totals are generally observed in the mountainous regions in the south, west and east and in the vicinity of Lake Victoria (see Figure 1) which is sufficiently large to create significant local variations in climate. The lowest rainfall totals are found in the north east of the country on the border with Kenya and Sudan where drought is a common occurrence (NARO, 2001). There is high uncertainty in future predictions of rainfall in this region of Africa, although current model projections tend to indicate an increase in annual mean precipitation in East Africa in general (IPCC, 2007).

Figure 1.

Topographic map of Uganda and location within Africa (inset)

3. Data and data preparation

3.1. Satellite data

Three satellite algorithms commonly used in African studies were included in this comparison. These were NOAA-RFE 2.0, GPCP-1DD and TAMSAT.

3.1.1. NOAA-RFE 2.0 (National Oceanic and Atmospheric Administration—African Rainfall Estimates Version 2.0)

The NOAA-RFE 2.0 algorithm (hereinafter referred to as RFE 2.0, for full description see http://www.cpc.ncep.noaa.gov/products/fews/RFE2.0_tech.pdf) combines satellite thermal infra-red (TIR) data from Meteosat with passive microwave (PMW) data from the AMSU and SSM/I satellite instruments and Global Telecommunication System (GTS) raingauge data. Initial rainfall estimates are calculated from the TIR data using the GOES Precipitation Index or GPI (Arkin and Meissner, 1987). The GPI algorithm uses the TIR imagery to identify clouds with tops colder than a threshold temperature of 235 K. Such clouds are designated as raining and a rain rate of 3 mm per hour is assumed. Estimates are also generated from the PMW data using the method of Ferraro and Marks (1994). The GPI and PMW estimates are then merged using weighting coefficients inversely related to the mean square difference between the satellite estimates and gauge data. In a final step, the estimates are adjusted to agree with the GTS raingauge data.
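The GPI step and the subsequent merging can be illustrated with a minimal Python sketch. The 235 K threshold and the 3 mm per hour rain rate come from the description above; everything else is an assumption made for illustration. In particular, the text only says that the weights are "inversely related" to the mean square difference, so the simple reciprocal weighting in `merge_inverse_error` is a hypothetical choice, and the function names are not those of the operational RFE 2.0 code.

```python
GPI_THRESHOLD_K = 235.0          # cloud-top temperature threshold (from the text)
GPI_RAIN_RATE_MM_PER_H = 3.0     # fixed rain rate for cold cloud (from the text)


def gpi_rainfall(brightness_temps_k, hours_per_image=1.0):
    """GPI rainfall (mm) for one pixel from a sequence of TIR temperatures.

    Each image colder than the threshold is treated as raining at a fixed
    rate for the time it represents (hours_per_image is an assumption).
    """
    cold_images = sum(1 for t in brightness_temps_k if t < GPI_THRESHOLD_K)
    return cold_images * hours_per_image * GPI_RAIN_RATE_MM_PER_H


def merge_inverse_error(estimates, mean_sq_diffs):
    """Weighted mean of estimates, with weights 1 / mean square difference.

    This reciprocal weighting is an assumed form of 'inversely related'.
    """
    weights = [1.0 / e for e in mean_sq_diffs]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, estimates)) / total
```

An estimate with a larger mean square difference against the gauges therefore contributes less to the merged value, which is the behaviour the description implies.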

3.1.2. GPCP-1DD (Global Precipitation Climatology Project One Degree Daily)

The Global Precipitation Climatology Project (GPCP) produces a monthly product (GPCP-V2.1) and a 1° daily product (GPCP-1DD) (Huffman et al., 2001). The GPCP-V2.1 monthly estimates are an amalgam of geostationary TIR, PMW imagery from polar orbiting satellites and raingauge data available from the GTS (Adler et al., 2003) and are widely used in evaluating climate model simulations (e.g. Allan et al., 2010). The GPCP-1DD product, as the name suggests, is generated at a daily time step with a spatial resolution of 1°. GPCP-1DD is similar in concept to the GPI but the temperature threshold is determined for each 1° square by comparison with rainfall images generated by the GPROF algorithm (Kummerow et al., 2001) from PMW data. A rain rate is then assigned to each raining TIR pixel so that the total monthly rainfall matches the GPCP-V2.1 monthly total.
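The final rescaling step, in which daily values are constrained to sum to the monthly total, can be sketched as follows. This is a deliberately simplified illustration and not the GPCP-1DD implementation: it assumes that each day is summarised by a single number, the fraction of cold (raining) TIR pixels in the 1° square, and that one conditional rain rate applies for the whole month. The function name and its inputs are hypothetical.

```python
def scale_daily_to_monthly(daily_cold_fraction, monthly_total_mm):
    """Distribute a monthly total over days in proportion to cold-cloud cover.

    daily_cold_fraction: one value per day, the (assumed) fraction of TIR
    pixels colder than the locally determined threshold.
    Returns daily rainfall (mm) that sums to monthly_total_mm.
    """
    total_fraction = sum(daily_cold_fraction)
    if total_fraction == 0:
        # No cold cloud at all: nothing to distribute the total over.
        return [0.0 for _ in daily_cold_fraction]
    # Single rain rate chosen so that the daily values sum to the monthly total.
    rate = monthly_total_mm / total_fraction
    return [rate * f for f in daily_cold_fraction]
```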

3.1.3. TAMSAT (Tropical Applications of Meteorology using SATellite data and ground based observations)

The TAMSAT methodology (Grimes et al., 1999) is also similar to the GPI in that it attempts to define a linear relationship between the number of hours for which pixel temperature is colder than a specified threshold (the cold cloud duration or CCD) and rainfall amount. It differs from GPI in that both the temperature threshold and the linear parameters are determined by comparison with raingauge data. Calibrations are carried out separately for each calendar month within empirically determined climate zones. The TAMSAT algorithm differs from the other satellite methods described above in that there is no merging with contemporaneous raingauge data. Calibrations are based on historical data and are assumed to be invariant over long time periods.
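A minimal sketch of the CCD approach is given below. It assumes a single calibration zone and month, uses ordinary least squares for the linear fit, and sets rainfall to zero where there is no cold cloud; these choices, the function names and any numerical values passed in are illustrative assumptions rather than the operational TAMSAT calibration procedure.

```python
def cold_cloud_duration(temps_k, threshold_k, hours_per_image=1.0):
    """Hours for which the pixel temperature is colder than the threshold."""
    return sum(hours_per_image for t in temps_k if t < threshold_k)


def fit_linear(ccd_hours, gauge_mm):
    """Least-squares fit of gauge rainfall against CCD; returns (a0, a1)."""
    n = len(ccd_hours)
    mean_x = sum(ccd_hours) / n
    mean_y = sum(gauge_mm) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ccd_hours, gauge_mm))
    sxx = sum((x - mean_x) ** 2 for x in ccd_hours)
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


def ccd_rainfall(ccd_hours, a0, a1):
    """Rainfall estimate (mm) from CCD using the calibrated linear relation."""
    # Assumption: no cold cloud means no rain, regardless of the intercept.
    return a0 + a1 * ccd_hours if ccd_hours > 0 else 0.0
```

Because the parameters are fitted once against historical gauge data, the estimate for any later dekad needs only the TIR imagery.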

3.2. Re-analysis data

ERA-40 is a re-analysis data set produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) (Uppala et al., 2005). This uses a fixed NWP model system combined with data-assimilation of observational data to generate a consistent set of model outputs covering the period from 1957 to 2002. However, since 2002, ERA-40 has been continually updated. The model uses 60 levels in the vertical and T159 representation of horizontal fields approximating to a horizontal resolution of 1.125° in the tropics.

ERA-Interim (Dee et al., 2011) is the updated version of ERA-40 with improved model formulation and data assimilation. It has finer horizontal resolution than ERA-40 (approximately 0.8° in the tropics) but the same vertical resolution. Dekadal rainfall totals were computed from the re-analysis data by summing the appropriate number of 6 hour forecasts.
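The accumulation of 6 hour forecasts into dekadal totals might look like the sketch below. It assumes that the input is a sequence of non-overlapping 6-hourly accumulations, each labelled with the start time of its interval; this labelling convention and the function names are assumptions, not a description of the ECMWF archive format.

```python
import calendar
from datetime import datetime


def dekad_bounds(year, month, dekad):
    """First and last day of the month belonging to the given dekad (1-3)."""
    last_day = calendar.monthrange(year, month)[1]
    start = 10 * (dekad - 1) + 1
    end = last_day if dekad == 3 else start + 9
    return start, end


def dekadal_total(six_hourly, year, month, dekad):
    """Sum 6-hourly accumulations (mm) whose timestamps fall in the dekad.

    six_hourly: iterable of (datetime, mm) pairs; each value is assumed to be
    the rainfall accumulated over the 6 h starting at that timestamp.
    """
    start, end = dekad_bounds(year, month, dekad)
    return sum(
        mm
        for t, mm in six_hourly
        if t.year == year and t.month == month and start <= t.day <= end
    )
```

A 10-day dekad thus draws on 40 forecast accumulations, and the third dekad on between 32 and 44.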

3.3. Raingauge data

Dekadal raingauge totals were provided by the Ugandan Met Service for 27 locations covering the months February to June between 2001 and 2005 (Figure 2). The data were subject to three levels of quality control. In the first step, manual checks for suspicious records were carried out; in the second step, CLICOM software (http://www.wmo.int/pages/prog/wcp/wcdmp/clicom/index_en.html) was used to automatically flag and cross check abnormal records. In the final step, further checks were performed including verification of station location, identification of repeated data, identification of outliers from expected values, comparative tests using neighbouring stations and investigation of suspicious zero values (i.e. missing data or zero rainfall).

Figure 2.

Dots denote gauge locations and shaded blocks denote the 0.5° × 0.5° grid-squares used in this study. If a gauge is located on a boundary between blocks, both blocks are included

3.4. Data processing

All data sets were converted to a regular 0.5° by 0.5° grid. For TAMSAT and RFE 2.0, this is simply a matter of averaging over the appropriate number of pixels. GPCP-1DD data were bi-linearly interpolated to this spatial resolution. For the ERA model outputs, simple inverse distance interpolation was used to re-grid the rainfall totals to the correct spacing. The raingauge data were converted to the same spatial support using block kriging.
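Two of the re-gridding operations mentioned above, pixel averaging and inverse distance interpolation, can be sketched in plain Python. The sketch assumes a regular grid whose pixel size divides evenly into the target block, planar distances, and a distance exponent of 2; the exponent, like the function names, is an assumption since only "simple inverse distance interpolation" is specified.

```python
import math


def block_average(field, factor):
    """Average a 2-D list over non-overlapping factor x factor pixel blocks."""
    n_rows, n_cols = len(field), len(field[0])
    out = []
    for r in range(0, n_rows - n_rows % factor, factor):
        row = []
        for c in range(0, n_cols - n_cols % factor, factor):
            cells = [field[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def idw(points, values, target, power=2.0):
    """Inverse distance weighted estimate at target from scattered values."""
    num = den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a data point
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den
```

Neither operation provides an error estimate for the re-gridded value, which is one practical difference from the block kriging applied to the gauge data.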

Many studies (e.g. Creutin and Obled, 1982; Tabios and Salas, 1985; Goovaerts, 2000) have shown that, when applied appropriately, kriging is a more accurate interpolator of rainfall than other methods. A crucial element of the kriging process is the calculation of a variogram which contains information on the variation with distance of the correlation between two points. Ideally a variogram should be calculated for each dekad based on the data included in that dekad. Given the shortage of data in this case, climatological variograms (Lebel et al., 1987; Grimes and Pardo-Iguzquiza, 2010) were computed by pooling all data for all years for each calendar month. The inherent assumption that rainfall events in a given month have similar spatial properties is not unreasonable given that almost all Ugandan rainfall results from local convection modulated by the annual cycle of the ITCZ.
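The pooling idea behind a climatological variogram can be illustrated with the classical empirical semivariance estimator, accumulated over all dekads of a calendar month. This is a simplified sketch: it uses planar distances and fixed-width distance bins, and it omits the normalisation and model fitting that a full climatological variogram analysis would include. The function name and data layout are hypothetical.

```python
import math
from collections import defaultdict


def pooled_variogram(coords, fields, bin_width):
    """Empirical semivariance per distance bin, pooled over several fields.

    coords: list of (x, y) gauge positions.
    fields: list of lists, one per dekad, giving the gauge values in the
            same order as coords (None marks a missing value).
    Returns {bin centre: semivariance}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for z in fields:
        for i in range(len(coords)):
            for j in range(i + 1, len(coords)):
                if z[i] is None or z[j] is None:
                    continue
                h = math.hypot(coords[i][0] - coords[j][0],
                               coords[i][1] - coords[j][1])
                b = int(h // bin_width)
                # Pairs are only formed within a dekad, never across dekads.
                sums[b] += (z[i] - z[j]) ** 2
                counts[b] += 1
    return {(b + 0.5) * bin_width: sums[b] / (2 * counts[b])
            for b in sorted(counts)}
```

Pooling increases the number of pairs in each distance bin, which is what makes the estimate usable with a sparse network.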

One problem with kriging rainfall is that rainfall occurrence and rainfall amount have different spatial correlation properties and therefore should be treated separately (Barancourt et al., 1992; Grimes and Pardo-Iguzquiza, 2010). Fortunately, for dekadal totals in Uganda from March to June, zero rainfall was sufficiently rare that there was no need for separate treatment. However, for February the rainfall was more intermittent (Figure 3) and in this case the double kriging approach of Barancourt et al. (1992) was applied in which indicator kriging is used to define a rainy area and ordinary kriging is used to calculate rainfall amount within the rainy area. Unfortunately, there were insufficient rainfall occurrences in February to generate a reliable amount variogram, hence the March variogram was used instead. This seems reasonable as the spatial structure of the rainfall fields is unlikely to differ significantly over the 2 months. Rainfall amount variograms used in the analysis are shown in Figure 4.

Figure 3.

Kriged rainfall (mm) given by blocks and gauge totals (mm) denoted by dots, February Dekad 1 2003

Figure 4.

Climatological rainfall amount variograms computed from the gauge data for (a) March (also used for February), (b) April, (c) May and (d) June

The importance of kriging the gauge data to the same spatial scale as the model and satellite rainfall estimates is demonstrated by Figure 5 which shows kriged grid square averages plotted against the individual gauge observations for all dekads. It can be seen that the effect of kriging is to reduce high values and increase low values. While it is impossible to verify that this is accurate without a dense array of gauges in each grid square, it is physically reasonable as high gauge observations will most likely correspond to a direct hit on the gauge by the most intense part of a storm, implying that the average over the grid square should be lower. Conversely, low gauge values are expected to be less than the grid square average. Results are consistent with those described by Flitcroft et al. (1989) and Grimes et al. (2003).

Figure 5.

Scatterplot of collocated kriged block estimates against observed dekadal gauge precipitation. Dashed line indicates one-to-one correspondence and solid line gives the linear regression best fit

4. Results

4.1. Grid square average rainfall estimates

To avoid problems connected with inaccuracy of kriged estimates far from gauges, only grid squares containing at least one gauge were used in the analysis. Those grid squares are shown shaded in Figure 2. Quantitative comparison of the five estimation approaches was based on calculation of bias, root mean square difference (RMSD) and coefficient of determination (r²) relative to the kriged gauge data and according to the formulae given below:

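\[ \mathrm{Bias} = \frac{1}{NM} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( R_{ij} - G_{ij} \right) \tag{1} \]

\[ \mathrm{RMSD} = \sqrt{ \frac{1}{NM} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( R_{ij} - G_{ij} \right)^{2} } \tag{2} \]

\[ r^{2} = \frac{ \left[ \sum_{i=1}^{M} \sum_{j=1}^{N} \left( R_{ij} - \bar{R} \right) \left( G_{ij} - \bar{G} \right) \right]^{2} }{ \sum_{i=1}^{M} \sum_{j=1}^{N} \left( R_{ij} - \bar{R} \right)^{2} \; \sum_{i=1}^{M} \sum_{j=1}^{N} \left( G_{ij} - \bar{G} \right)^{2} } \tag{3} \]

where $R_{ij}$, $G_{ij}$ represent respectively the rainfall estimate and kriged gauge value for grid-square $i$, dekad $j$; $\bar{R}$ and $\bar{G}$ are the corresponding means over all grid squares and dekads; $N$ is the total number of dekads and $M$ is the number of grid squares.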
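The three comparison statistics named above (bias, RMSD and the coefficient of determination) can be computed as in the following sketch, which takes the standard definitions with the estimate-minus-gauge sign convention, so that a negative bias means underestimation. The inputs are assumed to be flat lists of collocated values, one per grid square and dekad; the function names are illustrative.

```python
import math


def bias(est, obs):
    """Mean difference between estimates and kriged gauge values (mm)."""
    return sum(e - o for e, o in zip(est, obs)) / len(est)


def rmsd(est, obs):
    """Root mean square difference between estimates and gauge values (mm)."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(est))


def r_squared(est, obs):
    """Square of the Pearson correlation between estimates and gauge values."""
    n = len(est)
    mean_e = sum(est) / n
    mean_o = sum(obs) / n
    sxy = sum((e - mean_e) * (o - mean_o) for e, o in zip(est, obs))
    sxx = sum((e - mean_e) ** 2 for e in est)
    syy = sum((o - mean_o) ** 2 for o in obs)
    return sxy ** 2 / (sxx * syy)
```

Note that a product can have a high coefficient of determination and still be strongly biased, which is why the three statistics are reported together.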

Scatterplots for each product and month for all grid squares are shown in Figure 6. Spatially averaged values for each dekad are shown in Figure 7. A summary of the statistics according to Equations (1)–(3) is given in Tables 1 and 2 for Figures 6 and 7, respectively. In general, the satellite methods (TAMSAT, RFE 2.0 and GPCP-1DD) outperform the model outputs according to all statistical parameters. RFE 2.0 and GPCP-1DD show slightly higher correlation with the gauge data but TAMSAT scores slightly better on bias and RMSD. As for the re-analysis data, ERA-Interim has a strong tendency to overestimate in all months except June, whereas ERA-40 underestimates in all months.

Figure 6.

Comparison of kriged gauge rainfall measurements with model and satellite-based rainfall estimates using all grid squares for each estimation method and month considered in this study. Each cross denotes the rainfall for an individual 0.5° by 0.5° block

Figure 7.

Comparison of spatially averaged kriged gauge and rainfall estimates for (a) ERA-40, (b) ERA-Interim, (c) TAMSAT, (d) RFE 2.0 and (e) GPCP-1DD. Each cross denotes the average rainfall over all blocks for an individual dekad. The solid line is the least squares best fit; the dotted line represents one-to-one correspondence

Table 1. Comparison of gridded precipitation datasets with collocated, dekadal kriged gauge observations: shown are the number of dekads in each comparison (N), the mean bias, root mean squared difference (RMSD), coefficient of determination (R²) and the fractions of dekads in the comparison that are within one or two standard errors (s.e.) of the observations

Month        Product    N      Bias (mm)   RMSD (mm)   R²     < 1 s.e.   < 2 s.e.
             TAMSAT     177    −2.52       11.82       0.17   0.60       0.92
             RFE 2.0    177    3.33        17.26       0.20   0.62       0.85
March        ERA-40     180    −7.70       16.06       0.43   0.44       0.77
             RFE 2.0    180    8.83        22.71       0.54   0.33       0.67
April        ERA-40     120    −14.97      29.34       0.29   0.53       0.81
             RFE 2.0    120    5.68        22.29       0.60   0.56       0.87
May          ERA-40     300    −19.62      28.22       0.21   0.33       0.68
             TAMSAT     300    −2.52       18.43       0.38   0.62       0.86
             RFE 2.0    300    −7.63       20.24       0.46   0.54       0.87
June         ERA-40     297    −4.42       22.99       0.11   0.43       0.74
             TAMSAT     297    −2.78       13.84       0.27   0.55       0.85
             RFE 2.0    297    −8.18       16.88       0.34   0.32       0.75
             GPCP       297    −2.86       13.73       0.39   0.53       0.85
All months   ERA-40     1074   −10.02      23.25       0.27   0.43       0.75
             TAMSAT     1074   −0.97       15.99       0.55   0.57       0.85
             RFE 2.0    1074   −1.73       19.60       0.52   0.46       0.80

Note: Values in bold denote the most favourable comparison.

Table 2. As Table 1 but for an average over all collocated grid-boxes over the domain considered for each dekad

Product    N     Bias (mm)   RMSD (mm)   R²     < 1 s.e.   < 2 s.e.
ERA-40     54    −9.86       18.09       0.41   0.52       0.80
TAMSAT     54    −1.01       10.41       0.72   0.76       0.94
RFE 2.0    54    −1.73       11.00       0.74   0.67       0.96

The kriging process allows calculation of a standard kriging error for each interpolated value, reflecting the spread of possible rainfall amounts consistent with the gauge observations. Assuming a Gaussian distribution, for a perfect estimator one would expect to find 68% of estimates within 1 standard error (s.e.) of the kriged value and 95% within 2 s.e. The respective proportions of estimates within these limits are shown for each method in Tables 1 and 2. Again, the satellite methods are consistently better than the model outputs with TAMSAT generally closer to the predicted values than RFE 2.0 or GPCP-1DD.
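The standard-error check described above amounts to counting how often an estimate falls within k kriging standard errors of the kriged value and comparing that fraction with the Gaussian expectation. A minimal sketch, with hypothetical function names and flat lists of collocated values as input:

```python
import math


def fraction_within(est, kriged, std_err, k):
    """Fraction of estimates within k standard errors of the kriged value."""
    hits = sum(1 for e, g, s in zip(est, kriged, std_err)
               if abs(e - g) <= k * s)
    return hits / len(est)


def expected_gaussian_fraction(k):
    """Probability that a Gaussian variable lies within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))
```

`expected_gaussian_fraction(1)` and `expected_gaussian_fraction(2)` give the 68% and 95% reference values quoted in the text.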

4.2. Spatial pattern

Rainfall maps for all products, including the kriged estimates are plotted (Figure 8) to assess spatial patterns of the five estimation methods. The mean rainfall for each month, expressed in mm per dekad, was calculated using data from 2003 and 2004 for each dekad. These years were chosen as they were the only years in this study that had data from all months. There is more similarity between the satellite products than between the model products. The satellite estimation methods also show greater resemblance to the kriged output than the model products. They tend to exhibit greatest spatial variability which is more consistent with high spatial variability of convective rainfall. ERA-40 typically has smooth fields due to the coarse resolution of the original model.

Figure 8.

(a) Maps for each rainfall product and kriged estimates of grid squares included in the analysis calculated from the mean rainfall for 2003 and 2004 (mm per dekad). (b) Rainfall anomaly for each rainfall product with respect to the kriged estimates. Grid squares with a dashed outline indicate positive anomalies, whilst those with a solid outline indicate anomalies equal to or less than zero

The tendency for ERA-Interim to overestimate is clearly visible (Figure 8(a) and (b)), particularly during the first 3 months of the rainy season. ERA-40, despite underestimating throughout most of the season (Figure 8(b)), performs better than ERA-Interim in giving quantities closer to the kriged block estimates.

4.3. Seasonal cycle

In order to assess the success of the estimation methods in replicating the mean temporal pattern of the rainfall season, the mean rainfall over all grid squares was averaged for the years 2003 and 2004 for each dekad. The average time series is plotted in Figure 9. It can be seen that all products manage to capture the shape of the seasonal cycle. Standard errors on the kriged values were calculated allowing for the correlation between grid squares (Diro et al., 2009). The shading shows ± 1 and 2 s.e. on the kriged values. TAMSAT, RFE 2.0 and GPCP-1DD have approximately two-thirds of the data values within ± 1 s.e., consistent with expected statistics for accurate estimates. ERA-40 underestimates the seasonal pattern, especially during the peak dekads while ERA-Interim clearly has a large positive bias, particularly for the first half of the season, although it is better than ERA-40 at capturing the shape of the seasonal cycle.
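One way to allow for correlation between grid squares when attaching a standard error to an area mean is to sum the full error covariance rather than only the variances. The sketch below shows that general formula; it is not claimed to reproduce the procedure of Diro et al. (2009), and the correlation matrix is assumed to be supplied by the user, for example from the fitted variogram.

```python
import math


def se_of_mean(std_errs, corr):
    """Standard error of the mean of M correlated values.

    std_errs: kriging standard error for each grid square.
    corr: M x M matrix of assumed error correlations between grid squares
          (ones on the diagonal).
    """
    m = len(std_errs)
    variance = sum(std_errs[i] * std_errs[j] * corr[i][j]
                   for i in range(m) for j in range(m)) / m ** 2
    return math.sqrt(variance)
```

With uncorrelated errors the standard error of the mean falls as the square root of the number of grid squares, whereas with fully correlated errors averaging gives no reduction at all; real rainfall fields lie between these two limits.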

Figure 9.

Time series comparison of mean dekadal rainfall estimates and kriged gauge for 2003 and 2004; ERA-40 (thick dashed line), ERA-Interim (thin dashed line), TAMSAT (thick dotted line), RFE 2.0 (thin dotted line), GPCP-1DD (thin solid line) and kriged gauge (thick solid line). The dark grey shading represents the 68% confidence interval and light grey shading the 95% confidence interval on the kriged gauge values

5. Conclusion

Rainfall plays a crucial role in the livelihoods of most people in Africa, particularly in respect of the heavy reliance on rainfed agriculture and its influence in driving the economies of most nations. Hence, there is a pressing need for a timely and reliable supply of rainfall data with complete spatial coverage. Unfortunately, the lack of raingauges across the continent means alternative methods need to be considered. In this study, rainfall estimates derived from re-analysis model outputs and satellite observations have been assessed by comparing them to spatially interpolated gauge data from Uganda. The use of kriging ensured that the gauge data were interpolated to the correct spatial scale and provided an assessment of the uncertainty associated with the area-average values.

In general the satellite-based methods outperformed the re-analysis products in estimating the dekadal rainfall amounts. TAMSAT, RFE 2.0 and GPCP-1DD gave very similar results with biases less than 2 mm per dekad, RMSD less than 20 mm per dekad and r² greater than 0.5 for comparisons covering all months and grid squares. RFE 2.0 and GPCP-1DD showed a slightly higher correlation with the gauge data, whereas TAMSAT scored slightly better on bias and RMSD for most months. Similar statistics were found when the spatial average over all grid boxes for each dekad was considered, with biases less than 2 mm per dekad, RMSD less than 12 mm per dekad and r² greater than 0.7 for the satellite estimates. ERA-Interim showed a persistent tendency to overestimate (14 mm per dekad on average with higher bias at the start of the season). ERA-40 tended to underestimate throughout (10 mm per dekad on average).

The satellite products produced very similar spatial patterns and with fine detail consistent with the high spatial variability expected from convective rainfall. They also showed more resemblance to the kriged block estimates than the model outputs. The overestimation by ERA-Interim was particularly evident.

The comparison of the satellite algorithms is complicated by the fact that the GPCP-1DD and RFE 2.0 algorithms both make use of contemporaneous GTS raingauge data which may also be included in the validating data set. The fact that the TAMSAT method performs well over a number of years on a single calibration demonstrates that for convective rainfall a locally calibrated algorithm using only TIR imagery can perform at least as well as more sophisticated algorithms making use of multiple data sources. This is an encouraging result for African national meteorological services that have relatively easy access to Meteosat TIR data and may have many secondary raingauges which are not available internationally but can be used to improve the local calibration of the TAMSAT algorithm.