Development of probability density functions for future South American rainfall

Authors


Author for correspondence:
Tim E. Jupp
Tel: +44 01392 263642
Email: t.e.jupp@exeter.ac.uk

Summary

  • We estimate probability density functions (PDFs) for future rainfall in five regions of South America, by weighting the predictions of the 24 Coupled Model Intercomparison Project phase 3 (CMIP3) General Circulation Models (GCMs). The models are rated according to their relative abilities to reproduce the inter-annual variability in seasonal rainfall.
  • The relative weighting of the climate models is updated sequentially according to Bayes’ theorem, based on the biases in the mean of the predicted time-series and the distributional fit of the bias-corrected time-series.
  • Depending on the season and the region, we find very different rankings of the GCMs, with no single model doing well in all cases. However, in some regions and seasons, differential weighting of the models leads to significant shifts in the derived rainfall PDFs.
  • Using a combination of the relative model weightings for each season we have also derived a set of overall model weightings for each region that can be used to produce PDFs of forest biomass from the simulations of the Lund–Potsdam–Jena Dynamic Global Vegetation Model for managed land (LPJmL).

Introduction

The Amazonian rainforest plays a crucial role in the climate system. It helps to drive atmospheric circulations in the tropics by absorbing energy and by recycling about half of the rainfall that falls upon it. Furthermore, the region is estimated to contain c. 10% of the carbon stored in land ecosystems and to account for 10% of global net primary productivity (Melillo et al., 1993). Despite large-scale anthropogenic deforestation, it seems likely that the region is currently acting as a net sink for anthropogenic CO2 emissions (Tian et al., 2000; Phillips et al., 2009). The resilience of the forest to the combined pressures of deforestation and climate change is therefore of great concern, especially because at least one major climate model predicts a severe drying of Amazonia in the 21st century (Cox et al., 2000, 2004).

Rainfall in Amazonia is sensitive to seasonal, interannual and decadal variations in sea-surface temperatures (SSTs) (Fu et al., 2001; Liebmann & Marengo, 2001; Marengo, 2004). The warming of the tropical East Pacific during El Niño events suppresses wet-season rainfall through modification of the (East–West) Walker Circulation and via the Northern hemisphere extra-tropics (Nobre & Shukla, 1996). El Niño-like climate change (Meehl & Washington, 1996) has similarly been shown to influence annual mean rainfall over South America in General Circulation Model (GCM) climate change projections (Cox et al., 2004; Li et al., 2006). Variations in Amazonian precipitation are also known to be linked to SSTs in the tropical Atlantic (Liebmann & Marengo, 2001). A warming of the tropical north Atlantic relative to the south leads to a north-westward shift in the Intertropical Convergence Zone (ITCZ) and compensating atmospheric descent over Amazonia (Fu et al., 2001). For northeast Brazil the relationship between the north–south Atlantic SST gradient and rainfall is sufficiently strong to form the basis for a seasonal forecasting system (Folland et al., 2001). The variations in SSTs in the tropical Atlantic and Pacific contribute in different ways to rainfall variability in the regions of Amazonia.

Despite this developing understanding of the dynamics of tropical climate variability and change, the current generation of GCMs gives very different projections of future Amazon rainfall (Li et al., 2006), varying from significant increases in rainfall to potentially damaging drying (Cox et al., 2004). Fig. 1 compares the simulated 20th century rainfall with the trend predicted for the 21st century for each of the 24 climate models available in the archive of the Coupled Model Intercomparison Project phase 3 (CMIP3) and for the five regions of South America defined in Table 1. There is no clear consensus on rainfall change in any of the regions, with predicted trends in 21st century rainfall ranging from an increase of c. +1 mm d−1 century−1 (e.g. model o in Eastern Amazonia and Northeast Brazil) to a drying of −2 mm d−1 century−1 (e.g. model w in Eastern Amazonia). More importantly, there is no obvious relationship between the ability of a given model to simulate the annual mean 20th century rainfall and the sign of its predicted trend in the future. For example, models with a relatively realistic simulation of annual mean rainfall in Southern Amazonia (Fig. 1d) include the models with the largest increases and decreases in the 21st century (models r and w, respectively).

Figure 1.

 Annual means (20th century) and linear trends (21st century) in each of the climate models listed in Table 2. The vertical line shows observed annual mean rainfall in the 20th century. The horizontal line separates models with a positive trend from models with a negative trend.

Table 1.   Definitions of the regions referred to in this study
Region               Identifier   Longitude         Latitude
Eastern Amazonia     EA           55°W to 45°W      5°S to 2.5°N
Northwest Amazonia   NWA          72.5°W to 60°W    5°S to 5°N
Northeast Brazil     NEB          45°W to 35°W      15°S to 2.5°S
Southern Amazonia    SAz          65°W to 50°W      17.5°S to 10°S
Southern Brazil      SB           60°W to 45°W      35°S to 22.5°S

How can we help to inform decision-making given this uncertainty? One way is to weight the various model projections based on the ability of each model to reproduce key aspects of the observed climate. In this way we might hope to find more robust predictions by emphasizing the results from more realistic models and de-emphasizing the results produced by less realistic models. The method that we describe here is to construct a probabilistic prediction based on a weighted sum of the predictions of individual GCMs, using a Bayesian approach (Min et al., 2007; Tebaldi & Knutti, 2007; Tebaldi & Sansó, 2009). The weight assigned to each GCM will be referred to as the probability of the model and will generate a probability density function (PDF) over the set of models. Bayes’ theorem allows the model probabilities to be modified each time we consider the ability of the models to simulate some relevant aspect of current climate (such as seasonal rainfall) by comparing time series of past observations with time series of model simulations. In this study we weight models based on their ability to simulate both the mean state and the interannual variability (i.e. the statistical distribution) of the current climate. In other words, the aim is to downweight those models whose mean value is far from the observed mean, or whose interannual variability is a poor fit to the observed distribution, even when any bias in the mean value has been corrected.

The procedure can be summarized as follows.

  • Assign equal probability to all models – a uniform prior PDF.
  • Choose a climatic variable of interest (in this case, precipitation).
  • Update the model PDF based on the fit between model simulations and observations for this variable.
  • Use this posterior PDF to weight the predictions from individual models.

We make use of this procedure to estimate PDFs for future rainfall in each of the five regions of South America (Table 1; Fig. 2), using rainfall simulations produced by the 24 CMIP3 GCMs (Table 2). In the ‘Description’ section we outline the theory and data on which our approach is based, and in the Section ‘Results’ we discuss the PDFs for future rainfall that this procedure yields.

Figure 2.

 The regions defined in Table 1. EA, Eastern Amazonia; NEB, Northeast Brazil; NWA, Northwest Amazonia; SAz, Southern Amazonia; SB, Southern Brazil.

Table 2.   Labelling of the climate models referred to in this study
Model identifier   Model name           Model identifier   Model name
a                  bccr_bcm2_0          m                  ingv_echam4
b                  cccma_cgcm3_1        n                  inmcm3_0
c                  cccma_cgcm3_1_t63    o                  ipsl_cm4
d                  cnrm_cm3             p                  miroc3_2_hires
e                  csiro_mk3_0          q                  miroc3_2_medres
f                  csiro_mk3_5          r                  miub_echo_g
g                  gfdl_cm2_0           s                  mpi_echam5
h                  gfdl_cm2_1           t                  mri_cgcm2_3_2a
i                  giss_aom             u                  ncar_ccsm3_0
j                  giss_model_e_h       v                  ncar_pcm1
k                  giss_model_e_r       w                  ukmo_hadcm3
l                  iap_fgoals1_0_g      x                  ukmo_hadgem1

The models are those in the World Climate Research Programme's Coupled Model Intercomparison Project phase 3 (CMIP3) multimodel data set in the ‘Climate of the 20th Century’ experiment (https://esg.llnl.gov:8443).

Description

Assigning Bayesian probabilities to climate model projections

In this section, we describe formally the procedure adopted. We consider the case in which there are N = 24 climate models (Table 2), denoted alphabetically by the labels m1 = a to mN = x. For each of the five regions listed in Table 1, the aim is to assign a probability to the ith model based on its ability to simulate the seasonal precipitation observed in the 20th century. In the absence of any other information about the performance of the models it is natural to assign equal weight to each of them. In the language of Bayesian statistics, we therefore assign a uniform prior distribution to the models:

$$f(m_i) = \frac{1}{N}, \quad i = 1, \ldots, N$$ (Eqn 1)

In other words, the prior probability of the ith model is set to be 1/N. A naïve multimodel prediction would simply combine the predictions from individual models according to this uniform prior. The salient feature of our method is that predictions for the 21st century will be created by assigning different weights to different model predictions according to the models’ performance in the 20th century.

Having assigned a prior PDF, the next step is to assess the performance of each model over the historical period. This is accomplished by comparing time series from observations with model simulations over the historical period. For example, Fig. 3(a) compares observations of 20th century annual mean rainfall in Eastern Amazonia (solid line) with annual mean rainfall simulated by the 24 climate models (grey) listed in Table 2. Data are presented in the form of time-averages taken over the calendar year January to December (‘ann’). It follows that our measure of statistical variability is the interannual variability in annual mean rainfall.

Figure 3.

 Time series and associated cumulative distribution functions (CDFs) for the Eastern Amazonia region (Fig. 2). Solid lines represent observations {ot} from the data set of the Climatic Research Unit (CRU) at the University of East Anglia; grey lines represent (raw) climate model simulations {ri,t} from each of the climate models listed in Table 2; and dashed lines represent bias-corrected climate model simulations {bi,t} (Eqn 3) from each of the climate models. (a) Time series data. (b) Empirical CDFs corresponding to the time series shown in (a).

Several important points are illustrated in Fig. 3(a). No climate model is able to simulate exactly the observed year-to-year variability in rainfall. In other words, the peaks and troughs of the solid line (observations) do not coincide with the peaks and troughs of any of the grey lines (the raw simulations from the 24 models). This is a function of the chaotic nature of the climate system and is both unavoidable and entirely expected. The best we can demand from a climate model is that it should simulate well the observed statistical distribution of any climate variable over a period of a few decades (or, in this example, the 20th century).

It is clear from Fig. 3(a) that none of the models captures the observed distribution well. Consider first of all the century mean of all of the time series. It is apparent that most of the century means of the simulations 〈ri,t〉 are lower than the century mean of the observations 〈ot〉 = 6.05 mm d−1. (We use angle brackets to denote a temporal average over the 20th century.) This is an illustration of bias in the models. To remove this bias, it is standard practice to perform some sort of bias correction to the model simulations so that the long-term mean value of the simulated climate variable agrees with observations. The precise way in which model simulations are corrected for bias will be discussed further in the Section ‘Example: annual mean rainfall in Eastern Amazonia’. In the Section ‘A measure for bias: the climate prediction index C’, we will discuss the way in which models with greater bias will be assigned a lower weighting in the model PDF.

Fig. 3(a) illustrates that the century means of the bias-corrected simulations {〈bi,t〉} are all – as expected – closer than the raw simulations to the century mean of the observations 〈ot〉 = 6.05 mm d−1. It is still possible, however, to discriminate amongst the (bias-corrected) models by assessing how well the distributions of the bias-corrected simulations fit the distribution of the observations. This point is illustrated in Fig. 3(b). Here, the empirical cumulative distribution functions (CDFs) of the raw simulations {ri,t} (grey), bias-corrected simulations {bi,t} (dashed) and observations {ot} (solid) are compared. It is clear that the bias-corrected simulations (dashed) have CDFs that are ‘closer’ to the observations (solid) than are the CDFs of the raw simulations (grey). We will show in the Section ‘A measure for distributional fit: the Kolmogorov–Smirnov statistic D’ how models whose CDFs are ‘closest’ to the observations will receive the highest weighting in our model PDF.

In particular we wish to assign greater weight to those models that simulate well the observed interannual variability in seasonal rainfall. For each spatial region and for each season, the model PDF is updated in a two-stage process. First, the climate prediction index C, described in the Section ‘A measure for bias: the climate prediction index C’, is used to assess the degree to which the mean of the raw simulations of the ith model 〈ri,t〉 fits the mean of the observations 〈ot〉. Second, the Kolmogorov–Smirnov statistic D, described in the Section ‘A measure for distributional fit: the Kolmogorov–Smirnov statistic D’, is used to assess the similarity of the distribution of bias-corrected simulations of the ith model, {bi,t}, to the distribution of the observations {ot}.

In general terms, the sequential modification of the model PDF proceeds by considering the likelihood f(d|mi) of observed data d under the assumption that model mi is correct. The posterior PDF is calculated from the prior PDF using Bayes’ formula:

$$f(m_i \mid d) \propto f(d \mid m_i)\, f(m_i)$$ (Eqn 2)

with an appropriate normalization being applied so that $\sum_{i=1}^{N} f(m_i \mid d) = 1$.
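The update step can be sketched in a few lines of Python. This is a minimal illustration rather than the R code used for the study: the function name `bayes_update` and the three-model example values are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Return posterior weights proportional to prior * likelihood, normalised to sum to 1."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    if total <= 0.0:
        raise ValueError("all models have zero likelihood; posterior is undefined")
    return [u / total for u in unnormalised]

# Hypothetical three-model example (the study itself uses N = 24 models).
prior = [1.0 / 3] * 3            # uniform prior
likelihood = [0.2, 0.5, 0.3]     # illustrative values of f(d | m_i)
posterior = bayes_update(prior, likelihood)
```

With a uniform prior the posterior is simply the normalised likelihood; in later stages the posterior from one constraint serves as the prior for the next.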

In the next two sections we outline plausible forms for the likelihood function f(d|mi) to assess the bias of the raw simulations {ri,t} and the distributional fit to the data of the bias-corrected simulations {bi,t}.

As rainfall must be non-negative we apply a logarithmic transformation to obtain the bias-corrected rainfall simulations. Specifically, the bias-corrected rainfall simulations are constructed according to the following formula:

$$\ln b_{i,t} = \ln r_{i,t} - \langle \ln r_{i,t} \rangle + \langle \ln o_t \rangle$$ (Eqn 3)

where the angle brackets denote a temporal mean over the 20th century.
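A minimal sketch of a logarithmic bias correction follows. It assumes that the correction shifts the log-rainfall of a model so that its century mean matches the century mean of the log-observations; the function name and the short example series are hypothetical, and all values are assumed to be strictly positive.

```python
import math

def bias_correct_log(raw, obs):
    """Shift log(raw) so that its mean equals the mean of log(obs).

    Working in log space keeps every corrected value positive.
    Assumes all inputs are strictly positive.
    """
    mean_log_raw = sum(math.log(r) for r in raw) / len(raw)
    mean_log_obs = sum(math.log(o) for o in obs) / len(obs)
    shift = mean_log_obs - mean_log_raw
    return [math.exp(math.log(r) + shift) for r in raw]

# Hypothetical short series in mm per day (the study uses 99 annual values).
obs = [5.0, 6.0, 7.0]
raw = [2.0, 3.0, 4.5]
corrected = bias_correct_log(raw, obs)
```

Because the shift is additive in log space, it is a multiplicative rescaling of the raw series, so the ratios between years are preserved.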

A measure for bias: the climate prediction index C

In this section we consider how to weight the climate models according to the mean bias in the raw simulations {ri,t}. For this we compare the century mean of the observations 〈ot〉 with the century mean of the ith (raw) model simulation 〈ri,t〉. We construct the sample variance, σ2, of the century mean amongst the different models:

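$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( \langle r_{i,t} \rangle - \frac{1}{N} \sum_{j=1}^{N} \langle r_{j,t} \rangle \right)^2$$ (Eqn 4)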

Following Murphy et al. (2004), we then construct a climate-prediction index:

$$C_i = \sqrt{\frac{\left( \langle r_{i,t} \rangle - \langle o_t \rangle \right)^2}{\sigma^2}}$$ (Eqn 5)

as a measure of the bias of the ith model. The corresponding likelihood of the data, d (which in this case is the climate prediction index Ci), is then assumed (Murphy et al., 2004) to take the functional form:

$$f(d \mid m_i) \propto \exp\left( -\tfrac{1}{2} C_i^2 \right)$$ (Eqn 6)
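The bias weighting can be illustrated as below. The sketch assumes an N − 1 denominator for the sample variance and a Gaussian-shaped likelihood exp(−C²/2) in the spirit of Murphy et al. (2004); the function name and the four model means are hypothetical, while the observed mean of 6.05 mm d−1 is the Eastern Amazonia value quoted in the text.

```python
import math

def cpi_likelihoods(model_means, obs_mean):
    """Return (indices, likelihoods) for a list of model century means.

    index_i      = |model_mean_i - obs_mean| / sigma
    likelihood_i = exp(-0.5 * index_i**2)   (assumed functional form)
    sigma is the sample standard deviation of the model means.
    """
    n = len(model_means)
    ensemble_mean = sum(model_means) / n
    variance = sum((m - ensemble_mean) ** 2 for m in model_means) / (n - 1)
    sigma = math.sqrt(variance)
    indices = [abs(m - obs_mean) / sigma for m in model_means]
    likelihoods = [math.exp(-0.5 * c * c) for c in indices]
    return indices, likelihoods

# Hypothetical century means (mm per day) for four models.
model_means = [4.0, 5.5, 6.0, 7.0]
indices, likelihoods = cpi_likelihoods(model_means, obs_mean=6.05)
```

The model whose century mean lies closest to the observations receives the largest likelihood, and the penalty grows smoothly with the size of the bias measured in units of the inter-model spread.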

A measure for distributional fit: the Kolmogorov–Smirnov statistic D

Here we consider how to rate the climate models according to the shape of the distribution of the bias-corrected simulations {bi,t}. In order to compare the distributions of the bias-corrected simulations {bi,t} and the observations {ot} we consider empirical CDFs, as shown in Fig. 4(a). The CDF F(x) of a variable x is simply the proportion of the data whose value is less than or equal to x. Suppose that the observations consist of a time series of length n0, while the bias-corrected simulation from the ith model consists of a time series of length ni. (In the example that we present in the Section ‘Example: annual mean rainfall in Eastern Amazonia’, the data cover the years 1901 to 1999 and so n0 = ni = 99.) We construct empirical CDFs F0(x) and Fi(x) for the two time series and compare them. Clearly, a good model is one whose CDF Fi(x) is reasonably ‘close’ to the CDF of the observations F0(x). A standard measure of the closeness of two distributions, whose sampling distribution is easily calculated, is the Kolmogorov–Smirnov statistic, D, defined by:

$$D_i = \max_x \left| F_i(x) - F_0(x) \right|$$ (Eqn 7)

Figure 4.

 The Kolmogorov–Smirnov statistic as a measure of the difference between two cumulative distribution functions (CDFs). (a) The Kolmogorov–Smirnov statistic D is defined as the maximum difference between two CDFs, where the CDFs are derived from samples of size n0 and ni. The two CDFs shown here are for illustrative purposes only and do not correspond to the data discussed in the text. (b) The probability density function (PDF) of D in the case when n0 = ni = 99 and the two samples are drawn from identical distributions. K-S, Kolmogorov–Smirnov.

Thus, for each model mi we can regard Di as a measure of the difference between the CDF of the (bias-corrected) simulations of the ith model and the CDF of the observations. The distribution fKS(Di;n0,ni) of the Kolmogorov–Smirnov statistic D can be calculated under the null hypothesis that the observations and the simulation are drawn from the same distribution. This distribution is known as the Kolmogorov distribution and it is easily calculated using standard statistical software packages (given knowledge of the two sample sizes n0 and ni) as a function of Di. The PDF of the Kolmogorov distribution with n0 = ni = 99 is shown in Fig. 4(b).
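The statistic itself is straightforward to compute from two samples. The following is a minimal pure-Python sketch (the function name is hypothetical); it walks through the pooled sorted values and tracks the largest gap between the two empirical CDFs.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max over x of |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    ia = ib = 0          # number of values <= x seen so far in each sample
    d = 0.0
    while ia < na and ib < nb:
        x = min(a[ia], b[ib])
        # Advance both counters past every value equal to x (handles ties).
        while ia < na and a[ia] <= x:
            ia += 1
        while ib < nb and b[ib] <= x:
            ib += 1
        d = max(d, abs(ia / na - ib / nb))
    return d
```

Once one sample is exhausted its CDF equals 1 and the gap can only shrink, so the loop can stop at that point without missing the maximum.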

It follows that the likelihood of the data d (which in this case is the Kolmogorov–Smirnov statistic Di), under model mi, is:

$$f(d \mid m_i) = f_{KS}(D_i; n_0, n_i)$$ (Eqn 8)
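Where a statistical package is not to hand, the null density of D can be approximated from the large-sample Kolmogorov distribution. The sketch below is an assumption-laden stand-in for the exact finite-sample density used in the study: it differentiates the limiting CDF K(λ) = 1 − 2 Σ (−1)^(k−1) exp(−2k²λ²), with λ = D √(n0 ni / (n0 + ni)). The function name, the number of series terms and the small-λ cut-off are all illustrative choices.

```python
import math

def ks_density_asymptotic(d, n0, ni, terms=100):
    """Approximate null density of the two-sample K-S statistic at d.

    Uses the derivative of the limiting Kolmogorov CDF, so it is only an
    approximation for finite samples such as n0 = ni = 99.
    """
    scale = math.sqrt(n0 * ni / (n0 + ni))
    lam = scale * d
    if lam < 0.05:
        # The density is negligible here and the series converges too slowly.
        return 0.0
    series = sum((-1) ** (k - 1) * k * k * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, terms + 1))
    # dK/dlam = 8 * lam * series; the factor `scale` converts to a density in d.
    return max(0.0, 8.0 * lam * series * scale)
```

Evaluating this function at each model's Di gives a likelihood that is large when Di falls in the range expected for two samples from the same distribution and small when Di lies far in the tail.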

Data

Two types of data are used in this study – observational data for the 20th century alone and model-based data for the 20th and 21st centuries. The data used to represent the ‘true’ state of the climate are taken from the Climatic Research Unit (CRU) TS 3.0 archive (New et al., 1999, 2000). These data are available at http://www.cru.uea.ac.uk/cru/data/. General Circulation Model data are taken from the CMIP3 multimodel archive (Covey et al., 2003), in the Climate of the 20th Century experiment. There are 24 models, which are listed in Table 2. These data are available at https://esg.llnl.gov:8443/.

For assessment of the 20th century climate, raw data consist of monthly averages for the period January 1901 to December 1999 (this is the longest period for which data are available from all sources). Similarly, model predictions for the 21st century are considered at a monthly resolution for the period January 2001 to December 2098. For the analysis, five types of seasonal average were created by averaging over the periods January–December (denoted by ‘ann’), December–February (‘DJF’), March–May (‘MAM’), June–August (‘JJA’) and September–November (‘SON’). The final time series used in the analysis were then obtained by taking spatial averages of these seasonal data in a total of five spatial windows (Table 1).
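The seasonal averaging can be sketched as follows. The data layout (a dictionary keyed by year and month), the unweighted mean over months, and the convention that a DJF season takes its December from the preceding calendar year are all assumptions made for illustration; the text does not specify them.

```python
SEASONS = {
    "ann": tuple(range(1, 13)),
    "DJF": (12, 1, 2),
    "MAM": (3, 4, 5),
    "JJA": (6, 7, 8),
    "SON": (9, 10, 11),
}

def seasonal_mean(monthly, year, season):
    """Average monthly values over a season.

    monthly maps (year, month) -> rainfall. For DJF, December is taken
    from the previous calendar year (an assumed convention).
    """
    values = []
    for month in SEASONS[season]:
        y = year - 1 if (season == "DJF" and month == 12) else year
        values.append(monthly[(y, month)])
    return sum(values) / len(values)

# Synthetic data: the rainfall in each month equals the month number.
monthly = {(y, m): float(m) for y in (1901, 1902) for m in range(1, 13)}
djf_1902 = seasonal_mean(monthly, 1902, "DJF")
```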

Validation

It is important to assess whether or not the posterior weighting of the GCMs can be said to produce ‘better’ predictions of a climate variable, x, than simple uniform prior weighting via Eqn 1. To test this, we split the data from the 20th century into a training period covering the years 1901–1959 and a validation period covering the years 1960–1999. For a climate variable, x, model weights (both the uniform prior weights and the posterior weights obtained by considering the observations in the training period) can be used to produce a predicted CDF F(x) for the validation period. The difference between the predicted CDF F(x) and the observed CDF F0(x) in the validation period is then quantified using the root mean square error E:

$$E = \sqrt{\frac{1}{x_2 - x_1} \int_{x_1}^{x_2} \left[ F(x) - F_0(x) \right]^2 \, dx}$$ (Eqn 9)

Clearly, it is desirable for the values of E for posterior-weighted predictions to be less than those for prior-weighted predictions.
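A sketch of the validation metric is given below. It assumes that E is the root mean square difference between the weighted model CDF and the observed CDF over the interval from x1 = 0 to x2 = 25 mm d−1 (the limits quoted in the caption of Fig. 8), evaluated with a simple midpoint rule; the function names, grid resolution and example samples are hypothetical.

```python
import math

def empirical_cdf(sample, x):
    """Proportion of the sample with value <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def weighted_cdf(samples, weights, x):
    """Weighted average of the per-model empirical CDFs at x."""
    return sum(w * empirical_cdf(s, x) for s, w in zip(samples, weights))

def cdf_rmse(samples, weights, obs, x1=0.0, x2=25.0, steps=2500):
    """Root mean square difference between predicted and observed CDFs on [x1, x2]."""
    dx = (x2 - x1) / steps
    total = 0.0
    for k in range(steps):
        x = x1 + (k + 0.5) * dx   # midpoint of the k-th grid cell
        diff = weighted_cdf(samples, weights, x) - empirical_cdf(obs, x)
        total += diff * diff * dx
    return math.sqrt(total / (x2 - x1))

# Hypothetical validation-period data: model 0 matches the observations, model 1 is too dry.
obs = [5.0, 6.0, 7.0]
samples = [[5.0, 6.0, 7.0], [2.0, 3.0, 4.0]]
e_prior = cdf_rmse(samples, [0.5, 0.5], obs)
e_posterior = cdf_rmse(samples, [1.0, 0.0], obs)
```

In this toy case the weights that favour the better model give the smaller error, which is the behaviour the validation is designed to detect.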

Results

In this section we present detailed results for one illustrative region and season (see the section entitled ‘Example: annual mean rainfall in Eastern Amazonia’) before summarizing our results for the remaining cases (see the section entitled ‘Results for other regions and seasons’). All calculations reported here were performed using the statistical package R. We chose this software because it is both powerful and freely available for download (available at http://cran.r-project.org/). The results reported here were produced using R code that we wrote specifically for the purpose of Bayesian reweighting of climate model predictions.

Example: annual mean rainfall in Eastern Amazonia

We report the steps below sequentially but stress that the same final result would be obtained if the data constraints were considered in a different order. (The insensitivity to ordering comes from the fact that at each stage the PDF is modified by a multiplication, and, of course, a multistage multiplication can be performed in any order.)
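The order-independence can be written out explicitly. Here d_1 and d_2 are labels introduced for illustration (for instance the bias index and the Kolmogorov–Smirnov statistic), and the two constraints are assumed to contribute separate likelihood factors:

```latex
\begin{aligned}
f(m_i \mid d_1, d_2)
  &\propto f(d_2 \mid m_i)\,\bigl[\, f(d_1 \mid m_i)\, f(m_i) \,\bigr] \\
  &= f(d_1 \mid m_i)\,\bigl[\, f(d_2 \mid m_i)\, f(m_i) \,\bigr]
\end{aligned}
```

The bracketed term is the intermediate posterior after the first update. Whichever constraint is applied first, the product of the three factors is the same, and so is the normalized result.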

By definition, our initial model PDF for the N = 24 models is the uniform prior (Eqn 1). We will now modify this PDF according to the models’ ability to simulate annual mean rainfall in Eastern Amazonia over the 20th century.

Bias in the raw simulations  Raw simulations of annual mean rainfall {ri,t} in Eastern Amazonia are shown in grey in Fig. 3(a). The century mean rainfall simulated by the ith model 〈ri,t〉 is compared with the century mean of the observations 〈ot〉 = 6.05 mm d−1 via Eqns 4 and 5. Eqn 6 then yields the likelihood of the data under each of the models. This likelihood is shown for each model in Fig. 5(a). The likelihood is then combined with the (uniform) prior via Eqn 2 to yield the model PDF shown in Fig. 5(b). The dashed horizontal line in this and in subsequent figures for model PDFs denotes the uniform PDF for reference.

Figure 5.

 Steps in the calculation of a probability density function (PDF) across the N = 24 models, shown here for the illustrative case of annual mean rainfall in Eastern Amazonia (Fig. 3). Initially, a uniform prior (Eqn 1) is assigned across the models. (a) The likelihood of each model, calculated from Eqn 6, is a measure of each model's ability to reproduce the mean of the observed time series. (b) Updated model PDF, incorporating the likelihood information in (a). Dashed horizontal line indicates prior probability 1/N initially assigned to each model. (c) The model PDF shown in (b), with models sorted into ascending order of probability. (d) The likelihood of each model, calculated from Eqn 8, is a measure of each model's ability (after bias-correction) to reproduce the distributional shape of the observed time series. (e) Updated model PDF, incorporating the likelihood information in (d). (f) The model PDF shown in (e), with models sorted into ascending order of probability.

Distributional shape of the bias-corrected simulations  The next stage of the process is to modify the current model PDF (in Fig. 5b) according to the models’ ability to simulate the distribution of annual mean rainfall once the simulations have been bias corrected using Eqn 3. The distribution of the bias-corrected simulations (Fig. 3b, dashed lines) is compared with the distribution of the observations (Fig. 3b, solid line) using the Kolmogorov–Smirnov statistic (Eqn 7). Finally, Eqn 8 yields the likelihood of the data under each of the models, as shown in Fig. 5(d). This likelihood is combined with a prior weight taken from the previous calculation (i.e. the model PDF in Fig. 5b) via Bayes’ theorem (Eqn 2) to yield the updated model PDF shown in Fig. 5(e,f). It is clear that the simulated interannual variability discriminates much more clearly between different models than the simulated mean rainfall, such that the final model PDF is dominated by this stage of the procedure.

PDF for future rainfall  We are now in a position to calculate a probability distribution for future rainfall by weighting the predictions of individual models. It is important to note that some models predict a downward trend in rainfall while others predict little trend, or indeed an upward trend (Li et al., 2006). Our final estimate of the trend in 21st century rainfall will, of course, depend on how the model PDF (Fig. 5e,f) distributes probability weight between models with upward and downward trends.

Fig. 6(a) shows model predictions of annual mean rainfall in the early part of the 21st century (2001–2031). The predictions from individual climate models are shown as grey lines. These curves are CDFs of the (bias-corrected) rainfall predicted by each of the N = 24 CMIP3 climate models. These individual predictions have then been combined using the model PDF of Fig. 5(e,f) to give an overall distribution for rainfall that is a weighted average across models. This distribution is shown in black and represents our final probabilistic prediction based on the criteria that we outlined earlier.

Figure 6.

 Predictions of 21st century annual mean rainfall in Eastern Amazonia (Li et al., 2006). (a) Cumulative distribution functions (CDFs) of predicted rainfall in the period 2001–2031. The grey lines represent the predictions of each of the N = 24 models. The solid black line represents the combined prediction, obtained by weighting each model with the model probabilities in Fig. 5(e). (b) CDFs of predicted rainfall in the period 2068–2098. The grey lines represent the predictions of each of the N = 24 models. The dashed black line represents the combined prediction, obtained by weighting each model with the model probabilities in Fig. 5(e). (c) Comparison of the weighted predictions for the early and late 21st century (probability density functions (PDFs) corresponding to these CDFs are shown in Fig. 7a). The solid line represents the predicted distribution in the early 21st century. The dashed line represents the predicted distribution in the late 21st century.

We now consider how the predicted rainfall changes from the beginning to the end of the 21st century. Fig. 6(b) shows model predictions of climate in the last 30 yr of the 21st century. Again, the predictions of the individual climate models are shown as grey lines and a weighted average, using the model PDF of Fig. 5(e,f), is shown in black.

Fig. 6(c) illustrates the change in model-weighted rainfall predictions between the period 2001–2031 and the period 2068–2098. It is clear that the spread of the probabilistic prediction increases over the 21st century. This is a consequence of the prediction being an average across all models. Over the 21st century, some of the models predict increased rainfall, whereas others predict decreased rainfall (Li et al., 2006). Thus, unless all models of one ‘sign’ are very significantly downweighted in the model PDF, the weighted average rainfall prediction must assign some probability to increased rainfall and some probability to decreased rainfall. We can essentially discount the possibility of ‘very high’ or ‘very low’ rainfall in the early 21st century (Fig. 6a) because there are no models that predict these extreme values. For the late 21st century, however, we cannot rule out ‘very high’ or ‘very low’ rainfall (Fig. 6b) because: some models predict high rainfall and some models predict low rainfall; and the evidence does not lead to significant downweighting of all ‘low’ models or all ‘high’ models.

Fig. 6(c) contains CDFs. For ease of interpretation these functions may be differentiated to obtain PDFs for future rainfall. The PDFs for all regions are shown in Fig. 7 and show the change in predicted rainfall PDF between the period 2001–2031 and the period 2068–2098. In the case of Eastern Amazonia (Fig. 7a), our results suggest that a ‘low’ annual mean rainfall of c. 3 mm d−1 is much more likely to occur at the end, than at the beginning, of the 21st century.
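Differentiating a CDF numerically can be sketched with finite differences on a regular grid. The function name, grid and the uniform test distribution below are hypothetical; a CDF built from a finite sample is a step function, so in practice the bin width (or some other smoothing) controls how rough the resulting PDF looks.

```python
def cdf_to_pdf(cdf, x_min, x_max, bins):
    """Finite-difference density: (F(right) - F(left)) / width for each bin.

    Returns (bin centres, density values).
    """
    width = (x_max - x_min) / bins
    centres, density = [], []
    for k in range(bins):
        left = x_min + k * width
        right = left + width
        centres.append(0.5 * (left + right))
        density.append((cdf(right) - cdf(left)) / width)
    return centres, density

# Illustrative CDF: rainfall uniformly distributed between 2 and 8 mm per day.
def uniform_cdf(x):
    return min(1.0, max(0.0, (x - 2.0) / 6.0))

centres, density = cdf_to_pdf(uniform_cdf, 0.0, 10.0, 20)
```

The density values integrate to the total probability captured by the grid, here 1, and equal 1/6 inside the interval where the illustrative distribution has support.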

Figure 7.

 Changes in modelled rainfall probability density functions (PDFs) between the early 21st century (2001–2031) and the late 21st century (2068–2098) for the regions listed in Table 1. The grey line represents the observed distribution in the 20th century. The solid line represents the predicted distribution in the early 21st century. The dashed line represents the predicted distribution in the late 21st century. General Circulation Model (GCM) predictions were weighted according to the appropriate posterior distribution in Table 3. EA, Eastern Amazonia; NEB, Northeast Brazil; NWA, Northwest Amazonia; SAz, Southern Amazonia; SB, Southern Brazil.

Results for other regions and seasons

We compared observed and simulated annual mean rainfall in the 20th century for the five study regions. In general, there are systematic errors in rainfall, with climate models tending to overestimate rainfall in Northeast Brazil but to underestimate rainfall in the other four regions.

Fig. 8 contains the results of the validation procedure described in the Section entitled ‘Validation’. The posterior-weighted predictions perform better than the prior-weighted predictions in most cases and perform only slightly worse in the remainder.

Figure 8.

 Comparison of root mean square error (RMSE) E in rainfall cumulative distribution function (CDF) for prior- and posterior-weighted predictions. Training period 1901–1959; validation period 1960–1999 (Eqn 9 with x1 = 0 and x2 = 25 mm d−1). One data point is given for each season and each region. The dashed line has a slope of 1.

We repeat the Bayesian weighting procedure for each of the five regions in Table 1. In each case we make use of both the bias in the mean rainfall (via the index C), and the Kolmogorov–Smirnov statistic of the bias-corrected rainfall (via the index D), to downweight the models. Table 3 shows the relative model weightings derived for each region that result from considering annual mean rainfall. These overall weightings were subsequently used to produce PDFs of biomass change from the forest projections produced using the Lund–Potsdam–Jena (LPJ) model (Rammig et al., 2010).

Table 3.   Posterior probabilities (expressed as percentages) assigned to models of Table 2 in the regions listed in Table 1
Model   EA      NEB     NWA     SAz     SB
a        0.87    0.07    1.77    2.49    0.04
b        9.91    0.39    5.60    7.68    0.09
c        9.94    0.71    5.61    7.53    0.00
d        4.15    0.07    5.14    7.58    0.12
e        1.44    0.95    0.17    3.87    0.20
f        0.00    1.00    0.00    0.00    1.20
g        0.19    3.33    0.00    0.07    1.28
h        0.16   21.56    0.00    0.00   21.17
i        1.35    0.00    3.77    2.11    0.17
j        2.67    0.05    5.55    2.04    0.73
k        1.92    0.00    1.88    0.65    6.13
l        2.71    2.36    3.80    7.50    8.06
m        3.66    0.34    4.23    7.60    1.34
n        4.03    2.16    3.85    0.88    2.33
o        0.90    7.42    0.96    0.11   16.82
p       13.63    2.78   14.08    3.42    0.08
q       13.61    4.45    1.75    1.27    1.07
r        9.93    0.47   10.24    2.02    0.52
s        0.12   14.07   14.10   10.64   21.10
t        5.44   17.08    3.33    7.54   13.90
u        9.90    0.00    2.73   10.25    0.47
v        0.03    0.06    5.05   10.44    1.61
w        1.77    8.11    2.68    2.85    1.47
x        1.67   12.58    3.72    1.45    0.09

EA, Eastern Amazonia; NEB, Northeast Brazil; NWA, Northwest Amazonia; SAz, Southern Amazonia; SB, Southern Brazil.

Discussion

It is clear that the relative ranking of GCMs varies significantly with region and season. In any one region it is also unusual for a given model to simulate rainfall accurately in all four seasons. As a result, models that simulate each season well tend to dominate the overall weighting (e.g. models p and q in Eastern Amazonia, model h in Northeast Brazil, models p and s in Northwest Amazonia, and models h and s in Southern Brazil).

As an indication of the risk of drought, the probability of annual rainfall being < 3 mm d−1 was also calculated for each of the five regions. The results are summarized in Table 4 and indicate an estimated six-fold increase by the end of the 21st century in the likelihood of drought-like conditions for Southern Brazil, and smaller increases for Eastern and Southern Amazonia.
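A threshold probability of this kind can be read off the weighted ensemble directly. The sketch below assumes that each model contributes the fraction of its (bias-corrected) annual values falling below the threshold, combined using the model weights; the function name and the two short example series are hypothetical.

```python
def prob_below(samples, weights, threshold=3.0):
    """Weighted probability that rainfall is below the threshold (mm per day)."""
    total = 0.0
    for sample, w in zip(samples, weights):
        fraction = sum(1 for v in sample if v < threshold) / len(sample)
        total += w * fraction
    return total

# Hypothetical annual rainfall values (mm per day) from two models.
samples = [[2.0, 4.0, 5.0, 6.0], [1.0, 2.0, 2.5, 7.0]]
p_uniform = prob_below(samples, [0.5, 0.5])   # equal (prior) weights
p_first = prob_below(samples, [1.0, 0.0])     # all weight on the first model
```

Comparing the value obtained with uniform weights against the value obtained with posterior weights shows how much the reweighting changes the estimated drought risk.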

Table 4.   Probability of annual rainfall being < 3 mm d−1 for each of the five study regions of Amazonia
Region                   Prior 2001–2031   Prior 2068–2098   Posterior 2001–2031   Posterior 2068–2098
Eastern Amazonia (%)     0.6               2.7               0                     0.7
Northwest Amazonia (%)   0                 0                 0                     0
Northeast Brazil (%)     86                80                80                    76
Southern Amazonia (%)    0.14              0.7               0                     0.1
Southern Brazil (%)      0.4               1.7               1.1                   6.8

In each case probabilities are shown for the two periods 2001–2031 and 2068–2098, and for the uniform prior distribution as well as for the Bayesian posterior distribution. The posterior distributions represent ‘best’ estimates based on information currently available.

To summarize, we have estimated PDFs for future rainfall in five regions of South America, by weighting the predictions of the 24 CMIP3 GCMs according to their relative abilities to reproduce the mean and variability of the observed rainfall in each season. The relative weighting of the climate models was updated sequentially according to Bayes’ theorem, based on the biases in the mean rainfall and the distributional fit of the bias-corrected time series as measured using the Kolmogorov–Smirnov statistic, D. Using a combination of the relative model weightings for each season, we also derived a set of overall model weightings for use by the LPJ group (Rammig et al., 2010).

Depending on the season and region, we find very different rankings of the GCMs, with no single model doing well in all cases. However, in some regions, posterior weighting of the models leads to significant shifts in the derived rainfall PDFs between the beginning and the end of the 21st century, including a significant increase in the risk of annual mean rainfall below 3 mm d−1 in Southern Brazil. Compared with a method in which models are simply weighted equally, the Bayesian approach adopted here provides an estimate of future rainfall in Amazonia that makes greater use of the information available in the historical record. There are still, however, very significant uncertainties associated with deficiencies in GCM rainfall simulation in this region. In the future, the Bayesian methodology described here could be adapted to incorporate statistical descriptions of the uncertainty present in the historical record and a multivariate assessment of model performance. It could also be used to assess GCMs based on their ability to reproduce other variables known to be climatically significant, such as regional SSTs.

Acknowledgements

This work was funded by the World Bank Grant ‘Assessment of the prospects and identification of the implications of Amazon dieback induced by climate change’ (contract no. 7146402). We thank Jose Marengo, Walter Vergara, Sebastian Scholz and Alejandro Deeb for fruitful discussions, and three anonymous referees for improving the quality of the manuscript.
