Twentieth-century summer precipitation in South Eastern South America: comparison of gridded and station data


  • P. L. M. Gonzalez,

    Corresponding author
    1. International Research Institute for Climate and Society, Earth Institute, Columbia University, Palisades, NY, USA
    • Correspondence to: P. L. M. Gonzalez, International Research Institute for Climate and Society, Earth Institute, Columbia University, P.O. Box 1000, Palisades, NY, USA. E-mail:

  • Lisa Goddard,

    1. International Research Institute for Climate and Society, Earth Institute, Columbia University, Palisades, NY, USA
  • Arthur M. Greene

    1. International Research Institute for Climate and Society, Earth Institute, Columbia University, Palisades, NY, USA


South Eastern South America (SESA) has experienced one of the largest regional wetting trends in the world according to gridded datasets that cover the 20th century, such as the Global Precipitation Climatology Centre monthly precipitation dataset, version 4 (GPCCv4). The trend is strongest in the warm season and covers central and northern Argentina, Uruguay, Paraguay and Southern Brazil. This article examines the consistency of the trend and variability in the gridded product with that of local experience as recorded in station data. This is relevant to the use of both types of datasets for long-term climate studies, since the region has a limited amount of available station data with records extending back to the first part of the century. Both the station and gridded datasets show good agreement on the temporal variability of summer precipitation and its spatial patterns for different timescales. The magnitude of the wetting trend in SESA observed for December, January and February in GPCCv4 during the 1901–2000 period is more than 10 mm in the season per decade and is highly consistent with the trend observed in the station-based SESA average. The percentages of variance explained by the trend, decadal and interannual timescales are also assessed for both GPCCv4 and station data. The fraction of SESA precipitation variance contained at the decadal timescale matches well between the station-based and the GPCCv4-based indexes, but the latter suggests a larger fraction of variance attributable to the trend and a lower fraction explained by interannual scales, as compared to the station-based index. For all the timescales analysed, there is good agreement between the spatial patterns observed for GPCCv4 and those suggested by the available station data.

1. Introduction

South Eastern South America (SESA) has experienced a strong positive trend in precipitation within a large region including central and northern Argentina, Uruguay, Paraguay and Southern Brazil throughout the 20th century (Castañeda and Barros, 1994, 2001; Barros et al., 2000, 2008; Liebmann et al., 2004). This increase in rainfall in recent decades has had an important economic impact in the region due to the subsequent expansion of its agricultural frontiers (Barros et al., 2008).

Gridded datasets that cover the 20th century show a century-long wetting that is strongest in summer, with higher-frequency variability superimposed on it. Station data suggest the same; however, station data are sparse throughout South America, in particular for the earliest part of the century (Castañeda and Barros, 1994; Seager et al., 2010). Given that gridded and station data are useful for complementary tasks, such as the evaluation of large-scale patterns of variability on the one hand and downscaling to the local level on the other, this work focuses on the consistency between a gridded dataset, the Global Precipitation Climatology Centre monthly precipitation dataset in its version 4 (GPCCv4), and a set of local stations with long records in the SESA region, most of which are likely to have contributed to GPCC. Particular attention is given to the comparison of the magnitudes of the 20th century trends, since such results would be of utmost importance for attribution studies.

2. Data

The WMO/DWD GPCCv4 (Schneider et al., 2010) was considered as a gridded 20th century dataset. It covers the period 1901–2007 and is available at spatial resolutions of 0.5, 1.0 and 2.5 degrees. The 0.5-degree version was used in this study.

Precipitation data were also gathered from stations corresponding to the following institutions: Servicio Meteorologico Nacional (SMN, Argentina), Instituto Nacional de Tecnologia Agropecuaria (INTA, Argentina), Direccion Nacional de Meteorologia (Uruguay), Instituto Nacional de Investigacion Agropecuaria (INIA, Uruguay), Instituto Nacional de Meteorologia (INMET, Brazil), and from farmers' organizations, especially within the Santa Fe province in Argentina. The latter records, which were collected by INTA, are most likely to be independent from the GPCCv4 dataset, since they are not part of national networks, and are therefore very relevant for this comparative study. Quality control of the data was provided by the collecting institutions. The stations were selected for having relatively complete records for most of the 20th century, which is particularly important for producing a reasonable estimation of the regional trend. Nevertheless, the best-recorded period reduces to 1930–2000, and it reaches a maximum of 66 stations within the region of interest. No precise information is provided in the metadata about which particular stations were used to construct GPCCv4. Nonetheless, given the existing documentation and the record of the number of stations per grid box, we can be certain that a large number of the stations incorporated had short (as short as 10 years) or interrupted records, and this might have affected the estimation of the regional trend and variability. GPCC provides the VASClimO product, specially adjusted to support long-term precipitation variability and trend analysis, but it covers only the period 1951–2000.

The regional domain selected to represent SESA was defined as 40 S–25 S, 65 W–50 W. Averages of monthly data for December, January and February (DJF) were calculated as representative of the region's warm season. Even though SESA is not within the South American Monsoon season, DJF is part of the rainy season and can be considered representative of it. Complete descriptions of the climatology of this region in the context of the South American Monsoon system can be found, for instance, in Nogues-Paegle et al. (2002), Vera et al. (2006) and Marengo et al. (2012).
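The DJF averaging and the box average described above can be sketched as follows. This is a minimal illustration, not the processing used in the study: the function names are hypothetical, labelling each season by the year of its January is an assumption (the text does not state the convention), and the cosine-of-latitude weighting is only a common proxy for grid-cell area.

```python
import math

# SESA box as defined in the text: 40 S-25 S, 65 W-50 W.
LAT_MIN, LAT_MAX = -40.0, -25.0
LON_MIN, LON_MAX = -65.0, -50.0


def in_sesa(lat, lon):
    """True when a point (degrees north, degrees east) lies inside the box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def djf_mean(monthly, year):
    """Mean of Dec(year-1), Jan(year) and Feb(year), in mm/month.

    `monthly` maps (year, month) -> precipitation. Labelling the season by
    the January year is an assumption of this sketch. Returns None when any
    of the three months is missing.
    """
    keys = [(year - 1, 12), (year, 1), (year, 2)]
    if any(k not in monthly for k in keys):
        return None
    return sum(monthly[k] for k in keys) / 3.0


def box_average(points, weighted=True):
    """Average the values of the points that fall inside the SESA box.

    `points` is a list of (lat, lon, value). With weighted=True each value
    is weighted by cos(latitude); with weighted=False a plain mean is used.
    """
    inside = [(lat, val) for lat, lon, val in points if in_sesa(lat, lon)]
    if not inside:
        return None
    weights = [math.cos(math.radians(lat)) if weighted else 1.0
               for lat, _ in inside]
    total = sum(w * val for w, (_, val) in zip(weights, inside))
    return total / sum(weights)
```

Returning None for incomplete seasons keeps gaps explicit, which matters here because many station records are interrupted.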

3. Analysis and results

The percentage of temporal coverage of the 20th century of monthly station data is presented in Figure 1, along with a box delimiting the SESA region. Only a few stations have data available for at least 90% of the period between 1901 and 2000 (circles in Figure 1), and most stations have data for less than 70% of the period. Though there are some stations that fall outside of the SESA region, they were kept in order to compare the local and regional patterns of variability. The year-to-year variability at each station is largely coherent with that observed for the GPCCv4 SESA average (Figure 1). Around half of the stations, located in the region between 36 S–26 S and 63 W–53 W, show correlations above 0.6. This subregion, where the station-based time series are highly coherent with the SESA average, will be considered the SESA core region.
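The two quantities mapped in Figure 1, temporal coverage and correlation with the regional average, can be sketched as below. This assumes a Pearson correlation computed only over years where both series have data; the text does not state how missing years were handled, and the function names are hypothetical.

```python
import math


def coverage_percent(series):
    """Percentage of entries that are not missing (None marks a gap)."""
    return 100.0 * sum(v is not None for v in series) / len(series)


def pearson_r(x, y):
    """Pearson correlation over the entries where both series have data."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None  # too few overlapping years to be meaningful
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0.0 or syy == 0.0:
        return None  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)


# Synthetic example (not data from the study): one station with a gap.
station = [100.0, None, 140.0, 90.0, 130.0]
regional = [110.0, 120.0, 150.0, 100.0, 135.0]
r = pearson_r(station, regional)
cover = coverage_percent(station)
```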

Figure 1.

Each symbol indicates the location of a precipitation station, and the colour scale indicates the correlation coefficient between DJF precipitation averages for each station and the South Eastern South America (SESA) spatial precipitation average from the GPCCv4 dataset for the period 1901–2000. The four types of symbols represent the percentage of monthly data available for that period. In addition, the box in dashed lines delimits the region defined as SESA.

Data from stations within the SESA box were averaged to create a regional index that is used to compare the station-based SESA precipitation with the GPCCv4 average (Figure 2(a)). Two different averages were created from station data: a standard mean and an area-weighted average. Although the period starting around 1930 is the most reliable one, since over 60 stations are available for the region (Figure 2(a)), no significant differences are seen between the gridded data and the two station-based SESA averages. Both are quite consistent with the GPCCv4 regional precipitation, even during the pre-1930 period. The temporal means of the three time series are not statistically different at the 99% confidence level. Station-based averages do show a somewhat larger variance, which is consistent with the facts that the GPCCv4 average spatially aggregates a larger number of time series, and that its variance might have been affected by the interpolation methodology involved in its design. The temporal variances are statistically different at the 95% confidence level, though not at the 99% level. Only in the period before 1940, when fewer stations are available, is there some disagreement in the year-to-year maxima and minima, which may also be due to the more limited spatial coverage of the station data for that period (Figure 1).

The methodology for estimating the trend is based on Greene et al. (2011). In this method, the ‘climate change signal’ is first represented by the WCRP/CMIP3 ensemble multi-model mean of global temperature. The regression of the observed data onto the ‘climate change signal’ then constitutes the trend in that observed data. A comparison of the trends for the three SESA time series (Figure 2(a), inset) shows that the trends for the station-based and GPCCv4 SESA averages are very similar, though the latter is around 6% larger.
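The regression step of the trend estimate can be sketched as an ordinary least-squares fit of the precipitation index onto the climate change signal. This is only an illustration of the idea: any preprocessing of the multi-model mean in Greene et al. (2011) is outside the sketch, the function name is hypothetical, and the numbers in the example are synthetic.

```python
def regress_on_signal(y, signal):
    """Least-squares fit y ~ intercept + slope * signal.

    Returns (slope, intercept, fitted). `fitted` is the part of y that is
    linearly related to the signal, i.e. the trend component; because the
    signal itself need not be linear in time, neither is the trend.
    """
    n = len(y)
    if n != len(signal) or n < 2:
        raise ValueError("y and signal must have the same length (>= 2)")
    ms = sum(signal) / n
    my = sum(y) / n
    sss = sum((s - ms) ** 2 for s in signal)
    if sss == 0.0:
        raise ValueError("signal must not be constant")
    slope = sum((s - ms) * (v - my) for s, v in zip(signal, y)) / sss
    intercept = my - slope * ms
    fitted = [intercept + slope * s for s in signal]
    return slope, intercept, fitted


# Synthetic example: a made-up temperature signal (degrees) and a
# precipitation index that depends on it exactly.
signal = [0.0, 0.1, 0.1, 0.3, 0.6]
precip = [100.0 + 50.0 * s for s in signal]
slope, intercept, trend = regress_on_signal(precip, signal)
residual = [p - t for p, t in zip(precip, trend)]
```

The residual series is what the later spectral and filtering steps would operate on.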

Figure 2.

(a) SESA regional DJF precipitation index from GPCCv4 (red line) compared with two different station-based indexes: a standard average (black line) and an area weighted average (green line). The dashed line corresponds to the right axis and indicates the number of available precipitation stations for each year. The inset in the bottom right corner presents the non-linear trends obtained for the same time series. (b) The bottom panel presents a comparison of Welch spectral density estimates for the three time series in the above plot, after removing the non-linear trends.

The features of the variability in these time series, with the non-linear trends removed, were further explored using a Welch power spectral estimate (Welch, 1967) that provides a reasonable estimate of the continuous spectral density when considering finite data (Figure 2(b); Mudelsee, 2010; Hartmann, 2012). There is a good agreement between the spectral peaks of variability for the gridded and station-based SESA precipitation index, though variance in the GPCCv4 time series is smaller, as was also shown in Figure 2(a).
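For readers unfamiliar with the Welch estimate, the sketch below averages the periodograms of overlapping, Hann-windowed segments using only the standard library. The segment length and overlap are assumptions; the text does not state the values used for Figure 2(b). The function name is hypothetical.

```python
import cmath
import math


def welch_psd(x, seg_len, overlap=0.5):
    """Welch spectral density estimate for an evenly sampled series.

    Returns (freqs, psd) with frequencies in cycles per time step
    (cycles per year for annual data), one-sided.
    """
    step = max(1, int(seg_len * (1.0 - overlap)))
    # Periodic Hann window.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / seg_len)
              for i in range(seg_len)]
    wpow = sum(w * w for w in window)
    nfreq = seg_len // 2 + 1
    psd = [0.0] * nfreq
    nseg = 0
    for start in range(0, len(x) - seg_len + 1, step):
        seg = x[start:start + seg_len]
        mean = sum(seg) / seg_len
        tapered = [(v - mean) * w for v, w in zip(seg, window)]
        for k in range(nfreq):
            # Direct discrete Fourier transform of the tapered segment.
            coef = sum(t * cmath.exp(-2j * math.pi * k * i / seg_len)
                       for i, t in enumerate(tapered))
            psd[k] += abs(coef) ** 2 / wpow
        nseg += 1
    if nseg == 0:
        raise ValueError("series is shorter than one segment")
    psd = [p / nseg for p in psd]
    # One-sided spectrum: double every bin except zero and Nyquist.
    for k in range(1, nfreq):
        if not (seg_len % 2 == 0 and k == nfreq - 1):
            psd[k] *= 2.0
    freqs = [k / seg_len for k in range(nfreq)]
    return freqs, psd


# Synthetic check: an 8-step oscillation should peak at 1/8 cycles per step.
series = [math.sin(2.0 * math.pi * t / 8.0) for t in range(96)]
freqs, psd = welch_psd(series, seg_len=32)
peak_freq = freqs[psd.index(max(psd))]
```

In practice a library routine such as `scipy.signal.welch` would normally be used; the explicit loop is kept here only to show what the estimate does.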

Over the period 1901–2000, the rate of change and the spatial patterns of the observed non-linear trend, estimated as described above, are consistent with a linear estimate of the trend. In Figure 3(a), station trends are compared against trends in the gridded data based on the linear estimates, a simplified approach that provides a constant rate of change based on the entire period. All the stations but three (Figure 3(a), red circled dots) show positive trends, with magnitudes in some cases larger and in some cases smaller than the ones for GPCCv4. The available stations suggest that some small-scale features of the regional trend in GPCCv4 may not be real, as seen for the region south of 25 S and east of 48 W, which shows negative trends in the gridded dataset, but the opposite sign for every station in that area. This inconsistency might be due to the inclusion of station data in GPCC with shorter or interrupted records that led to the representation of a regional negative trend, which would not have been present if only long-record stations were considered. A similar mismatch in small-scale features is suspected for the negative centre near 28 S and 60 W, though no station data are available there to confirm the disagreement. Regarding the magnitudes of the trend (Figure 3(a), inset), the histogram shows that the average magnitude of the SESA trend from GPCCv4 falls within the modal bin of the trends observed in the stations located within SESA. This indicates that even though the regional distribution of magnitudes for GPCCv4 might not be accurate as compared to the station data within SESA, the regional average is representative. In addition, the magnitude of the trend observed for the station-based SESA precipitation is included and is very close to that for GPCCv4.

Figure 3.

The maps overlay information from the GPCCv4 dataset (shading) and from station data (circles), both with the same colour scale. The variables presented are: (a) the magnitude of the linear trend for the 20th century in mm/month/year. The thick black line represents the level of null trend. The circles surrounded with a red line correspond to stations with negative trends. (b) Percentage of variance explained by the trend. The histograms in (a) and (b) present the magnitude of the corresponding variable for the stations (empty bars) and the grid points of GPCCv4 (grey bars) within SESA. The values obtained for the station-based and GPCCv4-based SESA indexes are included as the black and grey bars, respectively.

As a next step, the variability in both station and gridded data was decomposed into:

  • Trend, following the regression methodology explained above;
  • Decadal-scale variability, obtained using a Lanczos low-pass filter (Duchon, 1979) with a 7-year cut-off period, which is found to be a relative minimum in the spectral density;
  • Interannual-scale variability, obtained with a high-pass Lanczos filter with a 7-year cut-off period.
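The low-pass and high-pass split in the list above can be sketched with Lanczos weights in the form commonly attributed to Duchon (1979): an ideal low-pass response multiplied by a sigma factor. The window half-width of 10 years, the rescaling of the weights to sum to one, and taking the high-pass part as the series minus its low-pass part are assumptions of this sketch, not details given in the text.

```python
import math


def lanczos_lowpass_weights(half_width, cutoff_period):
    """Lanczos low-pass weights for lags -half_width..half_width.

    `cutoff_period` is in time steps (years for annual data). The weights
    are rescaled to sum to one so a constant series passes unchanged.
    """
    fc = 1.0 / cutoff_period
    n = half_width
    weights = []
    for k in range(-n, n + 1):
        if k == 0:
            w = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
            sigma = math.sin(math.pi * k / n) / (math.pi * k / n)
            w = ideal * sigma
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def split_decadal_interannual(series, half_width=10, cutoff_period=7.0):
    """Return (low, high) for the interior points where the window fits.

    `low` holds the low-pass (decadal) part and `high` the remainder
    (interannual). Both are shorter than the input by 2 * half_width.
    """
    w = lanczos_lowpass_weights(half_width, cutoff_period)
    low, high = [], []
    for i in range(half_width, len(series) - half_width):
        window = series[i - half_width:i + half_width + 1]
        smooth = sum(a * b for a, b in zip(w, window))
        low.append(smooth)
        high.append(series[i] - smooth)
    return low, high


# Synthetic check: a 2-step oscillation should end up almost entirely in
# the high-pass (interannual) component.
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(40)]
low_alt, high_alt = split_decadal_interannual(alternating)
```

The filter would be applied to the series after the trend has been removed, so that the three components (trend, decadal, interannual) add up to the original series over the interior years.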

For each of these portions of the variability, the percentage of explained variance is assessed. Throughout most of Southern South America, the trend explains less than 10% of the total variance in precipitation (Figure 3(b)). Nonetheless, the region south of 25 S and east of 68 W represents a regional maximum of variance explained by the trend, reaching values of up to 30%. Only a couple of other small regions in the world yield such a large percentage of 20th century precipitation variance explained by trends (Time Scales Maproom, Greene et al., 2011). Most of the station data present lower or equal explained variances compared to the gridded data, with some exceptions, mainly closer to the coast. This last point is consistent with the histogram in the bottom right corner, which shows that the trend of the GPCCv4 SESA average explains a greater percentage of variance than most stations and than the station-based average. The value for the station-based average does not necessarily have to coincide (and it does not) with the mean of the station values in the histogram, since the station-based trend is obtained by first averaging the station data and then calculating the trend of that time series. The differences between the indexes constructed from gridded and station data are larger in the case of the percentage of variance explained by the trend (Figure 3(b)) than in the magnitude of the trend (Figure 3(a)) because there is also a difference between the total variance in the datasets (Figure 2).
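A small synthetic example may help show why the explained variance of a regional average need not equal the mean of the station values. The sketch uses a straight-line fit in time as a stand-in for the regression onto the climate change signal, and the two "stations" are made-up series, not data from the study.

```python
def variance(x):
    """Population variance of a sequence."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def linear_fit(y):
    """Least-squares straight line through y against its index."""
    n = len(y)
    mt = (n - 1) / 2.0
    my = sum(y) / n
    stt = sum((i - mt) ** 2 for i in range(n))
    slope = sum((i - mt) * (v - my) for i, v in enumerate(y)) / stt
    return [my + slope * (i - mt) for i in range(n)]


def percent_explained_by_trend(y):
    """100 * var(fitted trend) / var(y)."""
    return 100.0 * variance(linear_fit(y)) / variance(y)


# Two synthetic stations share a trend of 0.1 units per step but carry
# opposite-signed year-to-year noise, which cancels in the average.
steps = range(20)
station_a = [0.1 * t + (1.0 if t % 2 == 0 else -1.0) for t in steps]
station_b = [0.1 * t - (1.0 if t % 2 == 0 else -1.0) for t in steps]
regional = [(a + b) / 2.0 for a, b in zip(station_a, station_b)]

pct_a = percent_explained_by_trend(station_a)
pct_b = percent_explained_by_trend(station_b)
pct_regional = percent_explained_by_trend(regional)
mean_of_station_pcts = (pct_a + pct_b) / 2.0
```

Averaging first removes part of the local noise, so the trend accounts for a larger share of the regional index than of either station. The same mechanism would make an index built from more series show a higher trend fraction.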

For the decadal-scale variability, the regional pattern in GPCCv4 reveals that SESA is a regional minimum, with explained variances between 10 and 20% (Figure 4(a)). The spatial distribution of the decadal variance fraction in the gridded data seems to be quite consistent with station data, with peripheral stations showing percentages above 30 and even 40%, and smaller explained variances south of 29 S and east of 65 W, within the SESA core region. In this case, the histogram reveals that although most stations show larger percentages than the average from the gridded data, the latter is closer to the mode of the individual station values than in the case of the trend (Figure 3(b)) and still seems representative of station data. In fact, decadal variability in the GPCCv4 average explains almost exactly the same variance fraction as seen for the station-based SESA average (Figure 4(a) inset, black vertical line).

Figure 4.

The maps overlay information from the GPCCv4 dataset (shading) and from station data (circles), both with the same colour scale. The variables presented are the percentage of variance explained by: (a) decadal variability (periods of more than 7 years) and (b) interannual variability (periods of less than 7 years). The histograms in (a) and (b) present the magnitude of the corresponding variable for the stations (empty bars) and the grid points of GPCCv4 (grey bars) within SESA. The values obtained for the station-based and GPCCv4-based SESA indexes are included as the black and grey bars, respectively.

Finally, GPCCv4 shows that SESA is a regional maximum of variance explained by interannual variability (Figure 4(b)), most of which is due to a strong ENSO teleconnection (e.g. Ropelewski and Halpert, 1996). In this case, most stations show explained variances in agreement with the gridded data, including the fact that the stations within the SESA core region show larger explained variances than the surrounding ones. The histogram reveals that more than half of the stations have larger explained variances at interannual timescales than the GPCCv4 SESA average (Figure 4(b) inset, grey vertical line). This value is smaller and outside of the modal bin, which actually contains the explained variance for the station-based SESA average (Figure 4(b) inset, black vertical line). This once again confirms that the GPCCv4 SESA average constitutes a reasonable representation of the station data within the region.

The histograms corresponding to station data and the grid point values in GPCCv4 (insets in Figures 3 and 4) do not exhibit significant differences, and the observed discrepancies are likely due to the sparseness and the spatial inhomogeneity of the available stations.

4. Summary and discussion

This work presented a comparison between a 20th century gridded precipitation dataset (GPCCv4) and station data from SESA (Argentina, Uruguay, Southern Paraguay and Southern Brazil) on different timescales. Given the limited number of stations with reasonable century-long coverage within SESA, the existence of this type of gridded dataset is very relevant, e.g. for the imminent need to contrast the new WCRP/CMIP5 historical runs with observations. Even though it is likely that most of the station data collected for this study are not independent from GPCCv4, two important points stress the relevance of the analysis:

  1. Only stations with long records were considered, which can have a strong impact in the assessment of long term trends and variability;
  2. The station data had independent components as well as added information (as judged by the station count provided by GPCCv4), especially for certain periods such as the pre-1930 decades, and those independent time series also have long records.

These facts allow presenting a comparison between the datasets with the aim of exploring the coherence across timescales and in the regional patterns of the variability.

Particular interest was given to the comparison of the century-long wetting trends observed in SESA, which is one of the few regions in the world with more than 20% of the 20th century precipitation variance explained by the trend. According to these results, the trends observed in station data and their regional distributions are well represented by the gridded dataset, with the exception of some smaller scale details in the 0.5-degree dataset. All but the outermost stations within the SESA region show wetting trends, with a mean magnitude highly consistent with the GPCCv4 SESA average, which is 0.34 mm/month/year, or an increase of over 10 mm in the season per decade. Since the average seasonal total for SESA is typically around 300 mm, this trend implies an increase of approximately 3% of the seasonal total per decade. Some could argue that it is relatively small; nevertheless, it is one of the largest precipitation trends observed in the world (e.g. IPCC AR4 (2007), Section http://www.ipcc.ch/publications_and_data/ar4/wg1/en/ch3s3-3-2-2.html).
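The unit conversion behind these figures can be checked with a few lines. The helper names are hypothetical; the inputs (0.34 mm/month/year, a three-month season, and a seasonal total of about 300 mm) are the values quoted in the text.

```python
def trend_per_season_per_decade(trend_mm_month_year, months_in_season=3, years=10):
    """Convert a trend in mm/month/year to mm per season per decade."""
    return trend_mm_month_year * months_in_season * years


def percent_of_seasonal_total(change_mm, seasonal_total_mm):
    """Express a change in mm as a percentage of the seasonal total."""
    return 100.0 * change_mm / seasonal_total_mm


change = trend_per_season_per_decade(0.34)          # about 10.2 mm per DJF per decade
percent = percent_of_seasonal_total(change, 300.0)  # about 3.4% per decade
```

Both results agree with the "over 10 mm" and "approximately 3%" stated above.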

The percentages of variance explained by the trend, decadal and interannual timescales were also compared. Agreement between stations and GPCCv4 is best with respect to the variance explained by decadal timescales, while GPCCv4 slightly overestimates the variance explained by the trend and underestimates the portion explained by interannual scales. Overall, GPCCv4 and the available station data are in reasonable agreement with respect to the temporal variability of summer precipitation and its spatial patterns within SESA. In addition, the GPCCv4 SESA average is very consistent with the station-based SESA precipitation index, and the discrepancies observed are likely due to the fact that the temporal and spatial coverage of station data is not homogeneous.


The authors wish to acknowledge the Brazilian institutions EMBRAPA and UFRGS for providing precipitation data from INMET stations. In addition, the authors thank Maria de los Milagros Skansi from Argentina's Servicio Meteorologico Nacional (SMN) for consistency-checking and sharing station data from SMN's network and Laura Gastaldi from Argentina's Instituto Nacional de Tecnologia Agropecuaria (INTA) for work on INTA's network data and other regional records over the Santa Fe province. Finally, the authors wish to thank two anonymous reviewers for their helpful comments. This research was funded by NSF grant # AGS 10-49066.