Improved spatial prediction: A combinatorial approach

Authors


Abstract

[1] This paper presents a combinatorial approach for improving spatial predictions. First, copulas are used to interpolate a spatially distributed point rainfall field to a uniform spatial grid. The results vary substantially depending on the parameters chosen for interpolation, which leads to the hypothesis that it may be advantageous to estimate copula parameters locally or to combine local and global copula predictions. By modifying the method of forecast combinations, prediction errors in the spatial interpolation of rainfall can be reduced. Although this method of combining predictions is applied here to rainfall interpolation using local and global copula predictions, it can be used with other spatial variables and interpolation methods.

1. Introduction

[2] Various interpolation methods have been applied to rainfall with many papers comparing interpolation methods [Hwang et al., 2011]. Recently, the use of copulas for interpolation of spatial data has come to prominence [Bárdossy and Li, 2008; Kazianka and Pilz, 2009, 2011]. Copulas are multivariate distributions with uniform marginal distributions which are used to describe dependence between a set of variables. In the context of spatial interpolation, this dependence is modeled as a function of the distance separating points and expressed in the form of rank correlations which are independent of the marginal distributions. The interpolated value is then calculated as an expected value of the copula density at the unknown location.

[3] The advantage of using copulas over traditional interpolation methods such as kriging is that copulas can model different dependence structures for different quantiles of a variable [Bárdossy, 2006]. As a result, this makes copulas an attractive prospect for interpolating rainfall as high precipitation events tend to be associated with convective rainfall which is more localized in nature than lower and more widespread events associated with stratiform rainfall. Interpolation using copulas has been shown to outperform kriging when applied to groundwater quality data [Bárdossy and Li, 2008], radiation data [Kazianka, 2012], and rainfall [Bárdossy and Pegram, 2012].

[4] The use of copulas for interpolation, however, requires assumptions about the parameters describing the marginal distribution of the data set, the shape of the correlation function as it varies with distance, and the copula itself. This prompts the following question: as these assumptions are made for an entire spatial domain, could the estimation of the parameters that describe these relationships be improved by using a local neighborhood? Initial experiments suggested that although local estimation of parameters can improve estimation at unknown locations, this improvement is not consistent. At some locations an improvement is observed, while at others prediction errors increase.

[5] Model forecast combinations have been previously used to improve temporal prediction [see, for example, Chowdhury and Sharma, 2009, 2011]. In this paper, we present a novel approach for combining predictions in a spatial context by modifying the forecast combination methodology of Bates and Granger [1969]. In particular, this methodology is demonstrated by combining global and local copula interpolation of rainfall at unknown sites.

[6] The remainder of this paper is structured as follows. Section 2 introduces the test data, while section 3 outlines the proposed methodology. Section 4 presents the results of applying the methodology to several data sets. Finally, section 5 discusses the success of combining forecasts.

2. Data

[7] In order to investigate whether the proposed method is robust, 1 year of daily rainfall data from part of New South Wales was used. Rainfall data were sourced from the Australian Bureau of Meteorology for the year 2008. Two spatial extents were used. The small data set includes all the stations bounded by longitudes 150° to 151.4° and latitudes −33.1° to −34.5°. This region represents the greater Sydney region (Australia) and its surrounds. The large data set includes all the stations bounded by longitudes 145° to 154° and latitudes −31° to −35.8°. This represents Sydney, its surrounds, and a large portion of New South Wales. The area is largely temperate with fairly uniform monthly rainfall, with most rainfall occurring as the result of easterly troughs and frontal convection [Linacre and Geerts, 1997]. The data bounds with the topography underlain are shown in Figure 1, with wetter areas generally being closer to the coast and at higher altitudes. Days on which no rainfall was recorded at any gauge within the study site were omitted from the analysis. Approximately 200 rainfall gauges were used in the small data sets and 1000 in the large data sets.

Figure 1. Location of rainfall gauges for both large and small data sets with topography underlain.

3. Methodology

3.1. Copula Interpolation

[8] The general procedure for copula interpolation is presented by Bárdossy and Li [2008] and Kazianka [2012], and the reader is referred to these texts for further details. The following is a summary of the interpolation procedure using the notation presented by Kazianka [2012].

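[9] Assume that we have a set of spatial data $Z(x_1), \ldots, Z(x_n)$ at locations $x_1, \ldots, x_n$. An n-dimensional copula $C_{\theta,\lambda}$ can be defined as

$$P\big(Z(x_1) \le z_1, \ldots, Z(x_n) \le z_n\big) = C_{\theta,\lambda}\big(F_\eta(z_1), \ldots, F_\eta(z_n)\big) \qquad (1)$$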

[10] The parameters θ and λ control the dependence structure of the copula. The parameter set λ contains the copula-specific parameters that control the shape of the multivariate distribution used for the copula (for example, the degrees of freedom for a Student t or χ² copula), and the parameter set θ contains the parameters that describe the rank correlation between the marginal distributions. In the present case, θ parameterizes the spatial correlation function describing the correlation of data points with distance. The function Fη is a univariate distribution with parameters η. Note that the parameters are constant and do not vary with location. If a Gaussian copula is assumed, there are no copula-specific parameters that require estimation. The parameters are estimated by maximizing the following likelihood function [Kazianka, 2012]:

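$$l\big(\theta, \eta; z(x_1), \ldots, z(x_n)\big) = c_\theta\big(F_\eta(z(x_1)), \ldots, F_\eta(z(x_n))\big) \prod_{i=1}^{n} f_\eta\big(z(x_i)\big) \qquad (2)$$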

where cθ is the copula density and fη is the marginal density function. The prediction at an unknown location x0 is then the expected value of the predictive distribution:

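$$\hat{Z}(x_0) = \int_0^1 F_\eta^{-1}(u)\, c_\theta\big(u \mid F_\eta(z(x_1)), \ldots, F_\eta(z(x_n))\big)\, du \qquad (3)$$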

[11] This algorithm is implemented in the spatialCopula toolbox [Kazianka, 2012] in Matlab, which in this paper has been modified to allow for estimation of the above parameters using both the entire data set $Z(x_1), \ldots, Z(x_n)$ and subsets of this data set based on local neighborhoods. In words, the algorithm is as follows:

[12] (1) Assume or fit a univariate distribution to the spatial data set.

[13] (2) Assume or fit a correlation model to the rank correlations which describes how the dependence varies with distance and parameterizes the copula.

[14] (3) Estimate or optimize the parameters of the univariate distribution and copula using maximum likelihood.

[15] (4) Perform prediction at the unknown location through integration of the conditional copula to find its expected value.

[16] The above is a very brief introduction to the algorithm for copula interpolation. However, it is important in the context of this paper to emphasize that there are two primary sets of parameters that are assumed to be stationary in space, that is, those that describe the data correlation structure with distance and those that describe the univariate distribution of the data. For computational ease, a Gaussian copula using a Gaussian univariate distribution is assumed. An exponential correlation structure is used throughout.
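The four steps above can be illustrated with a minimal Python sketch for the special case adopted in this paper (Gaussian copula, Gaussian marginal, exponential correlation). In that case the expected value of the predictive distribution reduces to Gaussian conditioning on the normal scores. This is not the spatialCopula toolbox: the function names, the use of sample moments for the marginal, and the fixed correlation range `corr_range` are assumptions made for illustration, whereas the paper estimates the parameters by maximum likelihood.

```python
import numpy as np

def exponential_correlation(dist, corr_range):
    """Exponential correlation model: rho(d) = exp(-d / corr_range)."""
    return np.exp(-dist / corr_range)

def gaussian_copula_predict(coords, values, target, corr_range):
    """Predict at `target` under a Gaussian copula with a Gaussian marginal.

    Hypothetical helper for illustration only.
    Returns the predictive mean and variance.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)

    # Marginal parameters from sample moments (assumption: the paper
    # estimates them together with the correlation by maximum likelihood).
    mu = values.mean()
    sigma = values.std(ddof=1)
    scores = (values - mu) / sigma  # standard normal scores

    # Correlation among observations and between observations and target.
    d_obs = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_tgt = np.linalg.norm(coords - target, axis=-1)
    R = exponential_correlation(d_obs, corr_range)
    r0 = exponential_correlation(d_tgt, corr_range)

    # Conditional mean and variance of the normal score at the target.
    lam = np.linalg.solve(R, r0)
    cond_mean = lam @ scores
    cond_var = 1.0 - lam @ r0

    # Back-transform through the Gaussian marginal.
    return mu + sigma * cond_mean, sigma**2 * cond_var

# Usage with made-up gauge coordinates (km) and rainfall (mm).
gauges = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
rain = np.array([12.0, 8.0, 15.0, 5.0])
mean, var = gaussian_copula_predict(gauges, rain, [4.0, 6.0], corr_range=20.0)
```

A local variant would call the same function with only the k nearest gauges and a correlation range fitted to that neighborhood.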

3.2. Forecast Combinations

[17] Given two unbiased predictions y1 and y2 with error variances $\sigma_1^2$ and $\sigma_2^2$ and error correlation ρ, the following linear combination of the two forecasts will minimize the variance of the prediction error [Bates and Granger, 1969]:

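$$y_c = w\,y_1 + (1 - w)\,y_2 \qquad (4)$$

$$w = \frac{\sigma_2^2 - \rho\,\sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1 \sigma_2} \qquad (5)$$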
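As a small illustration, the minimum-variance weight of Bates and Granger [1969] can be computed as in the following Python sketch. The function names and the fallback of 0.5 for the degenerate case of a zero denominator are assumptions made for this example.

```python
import math

def bates_granger_weight(var1, var2, rho):
    """Weight on prediction 1 that minimizes the combined error variance."""
    cov = rho * math.sqrt(var1 * var2)
    denom = var1 + var2 - 2.0 * cov  # variance of the error difference
    if denom <= 0.0:
        # Degenerate case (assumption): the two errors are indistinguishable.
        return 0.5
    return (var2 - cov) / denom

def combine(y1, y2, w):
    """Linear combination of two predictions with weight w on the first."""
    return w * y1 + (1.0 - w) * y2

# With uncorrelated errors the less variable prediction gets the larger weight.
w = bates_granger_weight(1.0, 3.0, 0.0)
y_combined = combine(10.0, 20.0, w)
```

For uncorrelated errors with variances 1 and 3 the weight on the first prediction is 0.75, and the combined error variance, 0.75² × 1 + 0.25² × 3 = 0.75, is lower than that of either prediction alone.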

[18] Bates and Granger [1969] recommend that the weight w be estimated using the errors from a window of previous time steps in order to combine the predictions for a future time step. In the spatial context, it is proposed that the weight be calculated using the errors from an appropriate spatial neighborhood, consisting of the k nearest stations. In this paper, we combine predictions based on two different copula models. One assumes the same parameter sets for the entire data set, while the second estimates these parameters using a local neighborhood. The methodology used in this paper is as follows:

[19] (1) Fit global copula parameters using all available data.

[20] (2) Using the global model as a starting point for the optimization, at each station location, fit local copula parameters using the k nearest neighboring stations (excluding the target station). If no model can be fitted through optimization (for example, if all the data values in the neighborhood are zero), assume the global model.

[21] (3) For each station location, undertake copula prediction using both the globally and locally estimated parameters. In order to save computation time, prediction is also undertaken using the k nearest neighbors.

[22] (4) Calculate the error at each location by subtracting the predicted value from the measured value.

[23] (5) At each unknown location, calculate the weight to be used for combining the global and local predictions.

[24] (6) At each unknown location, fit global and local copula parameters.

[25] (7) Using k nearest neighbors, for each unknown location, undertake prediction using both the globally and locally estimated parameters.

[26] (8) Combine the global and local predictions using the weight calculated in step 5.

[27] As the weights are estimated in a spatial context, there are several ways in which the weights can be determined. Three different methods are employed here:

[28] (1) Global σ² and ρ: both the variance and correlation of the errors are estimated using all the errors available from the data. This gives a single combination weight at all unknown locations.

[29] (2) Local σ² and ρ: both the variance and correlation of the errors are estimated using the errors in a local neighborhood of k stations. This gives a different combination weight at each unknown location.

[30] (3) Local σ² and global ρ: the error variance is estimated from a local neighborhood of k stations, but the correlation is estimated from the entire data set. The justification for such a method is that the correlation is highly variable. This method allows for different combination weights at each location while maintaining greater stability in the combination weights.

[31] The combination weights are restricted to between zero and one. Note that there is no need to have a constant neighborhood k for the copula interpolation, the parameter estimation, and the calculation of the combination weights. The use of identical neighborhood sizes in this paper is for the purpose of simplifying the analysis and interpretation of results.
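The three ways of estimating the combination weights described above can be sketched in Python as follows. The sketch assumes that the errors of the global and local predictions at the gauged stations are already available, treats the errors as unbiased so that variances and the correlation are taken about zero, and uses hypothetical function and argument names. The weight applies to the global prediction, so a weight of zero selects the local prediction.

```python
import numpy as np

def error_stats(err_g, err_l):
    """Error variances and correlation, taken about zero (unbiased errors assumed)."""
    var_g = np.mean(err_g**2)
    var_l = np.mean(err_l**2)
    scale = np.sqrt(var_g * var_l)
    rho = np.mean(err_g * err_l) / scale if scale > 0.0 else 0.0
    return var_g, var_l, rho

def clipped_weight(var_g, var_l, rho):
    """Bates-Granger weight on the global prediction, restricted to [0, 1]."""
    cov = rho * np.sqrt(var_g * var_l)
    denom = var_g + var_l - 2.0 * cov
    if denom <= 0.0:
        return 0.5  # assumption for the degenerate case
    return float(np.clip((var_l - cov) / denom, 0.0, 1.0))

def combination_weights(station_xy, err_g, err_l, target_xy, k,
                        method="local_var_global_rho"):
    """Weights at each target for 'global', 'local' or 'local_var_global_rho'."""
    station_xy = np.asarray(station_xy, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    err_g = np.asarray(err_g, dtype=float)
    err_l = np.asarray(err_l, dtype=float)

    g_var_g, g_var_l, g_rho = error_stats(err_g, err_l)
    weights = np.empty(len(target_xy))
    for i, xy in enumerate(target_xy):
        if method == "global":
            var_g, var_l, rho = g_var_g, g_var_l, g_rho
        else:
            # Errors at the k stations nearest to the target location.
            dist = np.linalg.norm(station_xy - xy, axis=1)
            nearest = np.argsort(dist)[:k]
            var_g, var_l, rho = error_stats(err_g[nearest], err_l[nearest])
            if method == "local_var_global_rho":
                rho = g_rho
        weights[i] = clipped_weight(var_g, var_l, rho)
    return weights

# Made-up example: the local model is better in the west, the global in the east.
stations = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
e_global = [2.0, -2.0, 0.5, -0.5]
e_local = [0.5, 0.5, 2.0, 2.0]
targets = [[0.0, 0.5], [10.0, 0.5]]
w_global = combination_weights(stations, e_global, e_local, targets, k=2, method="global")
w_local = combination_weights(stations, e_global, e_local, targets, k=2, method="local")
```

In this constructed example the global scheme returns the same weight at both targets, while the local scheme favors the local prediction in the west and the global prediction in the east.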

3.3. Analysis of Results

[32] Results have been compared using a variety of metrics listed below:

[33] (1) MSE: mean square error.

[34] (2) MAE: mean absolute error.

[35] (3) LEPS: linear error in probability space [Ward and Folland, 1991], which can be thought of as the difference between the cumulative distribution functions of the measured and predicted data.

[36] (4) PERC: percentage error, calculated as prediction error divided by the measured value. To avoid dividing by zero, in the instance where a measured zero is not predicted, the error is given a value of 100%.

[37] (5) KS: the two-sample Kolmogorov-Smirnov goodness-of-fit test statistic.

[38] Larger rainfall values will generally have larger errors and therefore have a large influence on the mean square error and mean absolute error criteria. As rainfall is often a highly skewed distribution with most values of a low magnitude, criteria such as LEPS, PERC, and KS are included as they are less likely to be influenced by the magnitude of the variable.
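One possible reading of the five criteria is sketched below in Python. The exact formulas are assumptions where the text leaves room for interpretation: LEPS is taken as the mean absolute difference between the empirical cumulative distribution function of the measured data evaluated at the predicted and at the measured values, PERC as the mean absolute error relative to the measured value, and KS as the maximum distance between the empirical distribution functions of the two samples. The function names are hypothetical.

```python
import numpy as np

def ecdf(sample, x):
    """Empirical cumulative distribution function of `sample` evaluated at x."""
    ordered = np.sort(sample)
    return np.searchsorted(ordered, x, side="right") / ordered.size

def error_metrics(obs, pred):
    """Return MSE, MAE, LEPS, PERC and KS for measured and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred

    mse = np.mean(err**2)
    mae = np.mean(np.abs(err))

    # LEPS: error measured in the probability space of the measured data.
    leps = np.mean(np.abs(ecdf(obs, pred) - ecdf(obs, obs)))

    # PERC: relative error; a measured zero that is not predicted counts as 100%.
    perc = np.ones_like(obs)
    nonzero = obs != 0
    perc[nonzero] = np.abs(err[nonzero]) / obs[nonzero]
    perc[~nonzero & (pred == 0)] = 0.0

    # KS: largest gap between the two empirical distribution functions.
    pooled = np.concatenate([obs, pred])
    ks = np.max(np.abs(ecdf(obs, pooled) - ecdf(pred, pooled)))

    return {"MSE": mse, "MAE": mae, "LEPS": leps,
            "PERC": np.mean(perc), "KS": ks}

# Usage with made-up values (mm).
scores = error_metrics([0.0, 2.0, 4.0], [0.0, 1.0, 4.0])
```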

4. Results

4.1. Detailed Examples

[39] As an introduction, we present detailed results for 1 day of rainfall, 19 January 2008. This event was chosen for no other reason than it was the first substantial rainfall event for this year. To simplify the analysis, any gauges with zero rainfall were removed from this data set for the purposes of demonstrating the methodology. The recorded rainfall is presented in Figure 2a. In particular, there appears to be a cluster of high rainfall around the region of Katoomba, while Sydney and the immediate region southwest of Sydney experienced low rainfall on this particular day. Figure 2b presents the interpolation error when each point in the data set is removed in turn, and the parameters are estimated from the entire data set. The error varies spatially. In general, larger errors occur where there is higher rainfall; however, apart from this correlation with rainfall magnitude there is little trend in the errors. Next, copula interpolation is performed using parameters estimated from a local neighborhood (k) of the 15 nearest stations. The choice of the number of stations to use is based on Lall and Sharma [1996], who recommended using $\sqrt{n}$ neighbors with a data set of size n. Although this choice of k gives reasonable results, the optimal neighborhood will vary depending on the data set, and sensitivity testing or optimization of k is prudent, as with any application that involves local neighborhoods.

Figure 2. For 19 January 2008. (a) Recorded rainfall. (b) Error from interpolation using a copula with globally estimated parameters. (c) The change in error when using a copula with locally estimated parameters, compared to copula interpolation using globally estimated parameters. The blue dots represent an improvement, and the green dots represent a deterioration. (d) Combination weights for the global and local copula interpolation predictions. The weights were calculated using 15 nearest neighbors for the variance and all the available predictions for the correlation.

[40] Figure 2c shows whether the error has been reduced at each station. The blue dots represent an improvement, whereas the green dots represent a deterioration. It could be hypothesized that local prediction methods will improve prediction error in areas where there is a large density of gauges and high heterogeneity in the data; if so, a relationship could be established between prediction error and variables such as the density of the gauge network, the distance from the nearest neighbor, the magnitude of the observation, the change in local mean, and the change in local variance. However, when Figure 2c is compared to Figures 2a and 2b, no consistent relationship between such variables and the improvement in prediction is visible. Although this example involves only a single day, this appears to be a general conclusion. For this reason, the method of forecast combinations is proposed, and Figure 2d presents the local σ² and global ρ combination weights. The weights have been restricted to range between zero and one, where a weight of zero represents choosing the local prediction, and a weight of one represents choosing the global prediction. What is most promising is that the weights in Figure 2d correlate well with the improvements in prediction in Figure 2c. Regions where prediction is improved by local copula estimation (shown in blue) generally have weights that are zero or close to zero (also shown in blue). Table 1 presents the prediction error for several criteria. The local copula estimation outperforms the global copula prediction; however, the combination of the two prediction methods gives even further improvement in terms of MSE and MAE, though not necessarily the other metrics. The code used to generate this table is available online (http://hydrology.unsw.edu.au/downloads/software).

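Table 1. Prediction Error for 19 January 2008

| Prediction Method | MSE | MAE | LEPS | PERC | KS |
| --- | --- | --- | --- | --- | --- |
| Global | 53.94 | 4.76 | 0.050 | 0.382 | 0.151 |
| Local | 52.95 | 4.74 | 0.040 | 0.368 | 0.112 |
| Combined (global σ² and ρ) | 52.32 | 4.69 | 0.045 | 0.371 | 0.132 |
| Combined (local σ² and ρ) | 48.95 | 4.51 | 0.045 | 0.361 | 0.137 |
| Combined (local σ² and global ρ) | 48.88 | 4.51 | 0.045 | 0.360 | 0.137 |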

[41] Another example, for 17 January 2008, is presented in Figure 3. The analysis is identical, but in this instance the recorded rainfall is of a lower magnitude with several gauges recording no rainfall. Note that in prediction, any gauge with rainfall predicted to be less than 0.1 mm (half the magnitude of the lowest recorded value) is assigned zero rainfall.

Figure 3. For 17 January 2008. (a) Recorded rainfall. (b) Error from interpolation using a copula with globally estimated parameters. (c) The change in error when using a copula with locally estimated parameters, compared to copula interpolation using globally estimated parameters. The blue dots represent an improvement, and the green dots represent a deterioration. (d) Combination weights for the global and local copula interpolation predictions. The weights were calculated using 15 nearest neighbors for the variance and all the available predictions for the correlation.

[42] Figure 3a presents the data, while Figure 3b presents the error of copula interpolation using parameters estimated globally. Again, higher rainfall values appear to have larger errors. Figure 3c presents the improvement achieved by using a local copula. Stations with improvements (blue circles) tend to be clustered together. However, as with the example presented in Figure 2, it is not clear which variables this improvement should be associated with. Figure 3d presents the local σ² and global ρ combination weights. In this instance, the correspondence between the improvements due to local copula estimation (Figure 3c) and the weights (Figure 3d) appears to be poor. Prediction error is presented in Table 2. The local copula prediction does not improve the prediction error when compared to the global prediction for any of the error metrics presented. However, for each of the combined predictions the MSE reduces, albeit only slightly. This suggests that even when the local predictions are overall worse than the global predictions, they may still be of some merit when individual results are combined.

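Table 2. Prediction Error for 17 January 2008

| Prediction Method | MSE | MAE | LEPS | PERC | KS |
| --- | --- | --- | --- | --- | --- |
| Global | 31.69 | 2.98 | 0.026 | 0.855 | 0.137 |
| Local | 33.19 | 3.15 | 0.031 | 0.959 | 0.155 |
| Combined (global σ² and ρ) | 31.63 | 2.98 | 0.027 | 0.861 | 0.142 |
| Combined (local σ² and ρ) | 31.61 | 2.98 | 0.028 | 0.924 | 0.146 |
| Combined (local σ² and global ρ) | 31.66 | 2.98 | 0.028 | 0.922 | 0.146 |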

[43] In the above examples, the true values were known, and the error at each data point could therefore be calculated and used to determine the combination weights. The question that needs to be answered is: though it appears that the combination of the two predictions has merit, are these improvements consistent when we have no knowledge of the errors at the location of prediction?

4.2. Analysis on Small Data Sets

[44] A total of 363 days of data were used, with 3 days removed as no rainfall was recorded. Each day was analyzed independently with a new set of parameters estimated for each data set. For each day of data a proportion of data points were randomly removed and predictions combined at these locations. In the first instance, 20% of each daily data set was removed and prediction undertaken at these points using the remaining data. Predictions and combination weights were calculated for the removed points using only the data points that were not removed. Global and local copula parameters were estimated only using the remaining 80% of data. As different data points are removed for each day (due to randomization), this process was repeated 10 times to ensure the results were valid. A constant local neighborhood of 15 was used throughout. To investigate the merit of combining predictions, results are compared relative to global copula interpolation. Table 3 presents the proportion of days for which local copula interpolation and the three combination types produce results that are better or equal to global copula prediction.

Table 3. Proportion of Days for Which the Results Are Equal to or Better Than Those Predicted Using Global Copula Interpolation When 20% of the Data Are Blindly Interpolated^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | 53 | 56 | 70 | 61 | 78 |
| Combined (global σ² and ρ) | 81 | 83 | 84 | 84 | 92 |
| Combined (local σ² and ρ) | 71 | 74 | 79 | 76 | 88 |
| Combined (local σ² and global ρ) | 70 | 73 | 78 | 76 | 88 |

^a The results are the averages for 10 trials using the small data sets for an entire year.
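The blind-interpolation experiment described in this section relies on repeatedly removing a random fraction of the stations. A minimal Python sketch of that resampling step is given below. The function name, the seed, and the rounding of the number of removed stations are assumptions made for illustration.

```python
import numpy as np

def holdout_splits(n_stations, fraction=0.2, n_trials=10, seed=0):
    """Yield (retained, removed) station indices for repeated random removal."""
    rng = np.random.default_rng(seed)
    n_removed = int(round(fraction * n_stations))
    for _ in range(n_trials):
        order = rng.permutation(n_stations)
        # The first n_removed shuffled stations are withheld for prediction;
        # parameters and combination weights use only the retained stations.
        yield order[n_removed:], order[:n_removed]

# Usage: 10 trials removing 20% of roughly 200 gauges.
splits = list(holdout_splits(200, fraction=0.2, n_trials=10, seed=1))
```

Each trial would then fit the global and local models on the retained stations, compute the combination weights from the errors at those stations, and evaluate the predictions at the removed stations.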

[45] In terms of MSE, the use of local parameters in prediction is equal to or better than the use of global parameters only about 50% of the time. However, this percentage improves to 70% or more when the global and local predictions are combined. In terms of LEPS, the results indicate that on approximately 80% of the days the combined results are equal to or outperform global prediction. Similar proportions are observed for the other error metrics, and combined predictions appear to improve prediction error more consistently than local prediction alone. However, these results need to be interpreted with caution. For many of the data sets the predictions of all methods will be identical as all the values predicted are zero. Moreover, it is possible that the gains when the prediction is improved do not outweigh the losses when there is no improvement in prediction. Table 4 presents the average percentage change in each of the error criteria only for days when more than 5 mm average rainfall has been recorded (54 days in total). On average, using a local copula only outperforms the global copula in terms of LEPS, PERC, and KS and is poorer in terms of MSE and MAE. However, combining the global and local predictions improves the results in terms of all the error criteria. Combining the results using the local σ² and global ρ combination weights on average improves the LEPS score by 2.5%, PERC by 1.6%, and KS by 2.2%. This improvement is consistently observed over each of the 10 repetitions.

Table 4. Percentage Change in Error Criteria Compared to Global Copula Interpolation for Days of Average Rainfall Greater Than 5 mm When 20% of the Data Are Blindly Interpolated^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | −2.0 | −0.6 | 1.9 | 0.6 | 1.0 |
| Combined (global σ² and ρ) | 0.3 | 0.2 | 0.3 | 0.5 | 0.0 |
| Combined (local σ² and ρ) | 0.2 | 0.4 | 2.4 | 1.6 | 1.8 |
| Combined (local σ² and global ρ) | 0.3 | 0.4 | 2.5 | 1.6 | 2.2 |

^a The results are the averages for 10 trials using the small data sets for an entire year.
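The two summary statistics reported in these tables can be sketched in Python as follows, given one error value per day for a method and for global copula interpolation. The sign convention (positive values indicate an improvement) follows the tables, but averaging the daily percentage changes rather than comparing the averaged errors is an assumption, as are the function names.

```python
import numpy as np

def proportion_equal_or_better(err_method, err_global):
    """Percentage of days on which the method's error does not exceed the global error."""
    err_method = np.asarray(err_method, dtype=float)
    err_global = np.asarray(err_global, dtype=float)
    return 100.0 * np.mean(err_method <= err_global)

def mean_percentage_change(err_method, err_global):
    """Average daily percentage reduction in error relative to the global model.

    Assumes the global error is nonzero on every day considered.
    """
    err_method = np.asarray(err_method, dtype=float)
    err_global = np.asarray(err_global, dtype=float)
    return 100.0 * np.mean((err_global - err_method) / err_global)

# Usage with made-up daily MSE values for four days.
share = proportion_equal_or_better([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
change = mean_percentage_change([9.0, 18.0], [10.0, 20.0])
```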

[46] A similar set of results was also developed for the process of randomly removing 40% of data points from each of the daily data sets. Table 5 presents the proportion of days for which local copula interpolation and the combination methods produce results that are equal to or better than the prediction using a global copula. The proportions presented in Table 5 are similar to those in Table 3. On average, when the local and global predictions are combined, 70%–90% of the days have the same or an improved MSE, MAE, LEPS, PERC, and KS. For the local copula prediction, the MSE and MAE are similar to or better than the global prediction on only 54% of the days. Table 6 presents the average percentage change in each of the error criteria for days with more than 5 mm average rainfall. On average, using the local σ² and global ρ combination weights improves the LEPS score by 1.6%, PERC by 1.6%, and KS by 1.4%. These improvements are slightly lower than for the case when 20% of the data were removed; however, the improvements are still consistent for the 10 trials performed. The MSE and MAE on average also display an improvement for each combination method even though the local copula prediction alone does not consistently improve prediction.

Table 5. Proportion of Days for Which the Results Are Equal to or Better Than Those Predicted Using Global Copula Interpolation When 40% of the Data Are Blindly Interpolated^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | 54 | 54 | 67 | 60 | 75 |
| Combined (global σ² and ρ) | 84 | 85 | 84 | 85 | 90 |
| Combined (local σ² and ρ) | 71 | 74 | 76 | 74 | 84 |
| Combined (local σ² and global ρ) | 70 | 73 | 76 | 74 | 85 |

^a The results are the averages for 10 trials using the small data sets for an entire year.

Table 6. Percentage Change in Error Compared to Global Copula Interpolation for Days of Average Rainfall Greater Than 5 mm When 40% of the Data Are Blindly Interpolated^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | −1.5 | −0.5 | 0.1 | 0.4 | 0.0 |
| Combined (global σ² and ρ) | 0.4 | 0.2 | −0.2 | 0.6 | −0.6 |
| Combined (local σ² and ρ) | 0.6 | 0.6 | 1.5 | 1.6 | 1.4 |
| Combined (local σ² and global ρ) | 0.9 | 0.7 | 1.6 | 1.6 | 1.4 |

^a The results are the averages for 10 trials using the small data sets for an entire year.

4.3. Analysis on Large Data Sets

[47] The results for the small data sets appear to show a consistent improvement in prediction errors when predictions are combined, even when local predictions alone offer no consistent gain. The improvements, however, are small in magnitude, especially in terms of MSE and MAE. To further test the spatial method of forecast combinations, the same analysis was performed on the large data sets for the first 10 days of February 2008. These days were chosen because they represent the first rainfall event of the year with consistently high rainfall in the region. A constant local neighborhood of 30 stations was used. Of the data points, 20% were randomly removed from each of the data sets, and the analysis was repeated 10 times. The proportion of cases where results are improved or similar to global copula interpolation is presented in Table 7. When the local copula interpolation is applied, the MSE is improved 47% of the time, and the MAE is improved 57% of the time. However, when forecasts are combined using local σ² and global ρ, the MSE is improved 64% of the time and the MAE 76% of the time. In fact, apart from the LEPS score, all metrics on average improve due to forecast combination. Table 8 presents the average percentage change in each of the error criteria. For all error metrics, apart from the LEPS score in this case, the combination of forecasts improves results proportionally more than using the local copula alone. For example, the MSE on average is improved by 0.9% when only local copula interpolation is used, but by 2.4% when the local σ² and global ρ combination is used.

Table 7. Proportion of Days for Which the Results Are Equal to or Better Than Those Predicted Using Global Copula Interpolation When 20% of the Data Are Blindly Interpolated^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | 47 | 57 | 83 | 51 | 55 |
| Combined (global σ² and ρ) | 85 | 89 | 80 | 76 | 61 |
| Combined (local σ² and ρ) | 68 | 76 | 87 | 72 | 65 |
| Combined (local σ² and global ρ) | 64 | 76 | 85 | 73 | 70 |

^a The results are the averages for 10 trials using the large data sets for a stormy period of 10 days.

Table 8. Percentage Change in Error Criteria Compared to Global Copula Interpolation When 20% of the Data Are Blindly Interpolated for the Large Data^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | 0.9 | 2.2 | 9.8 | 1.7 | 1.1 |
| Combined (global σ² and ρ) | 2.7 | 2.6 | 4.8 | 2.0 | −0.7 |
| Combined (local σ² and ρ) | 2.2 | 2.6 | 8.1 | 3.5 | 2.9 |
| Combined (local σ² and global ρ) | 2.4 | 2.7 | 8.2 | 2.4 | −1.0 |

^a The results are the averages for 10 trials using the large data sets for a stormy period of 10 days.

[48] The above analysis was repeated for the same data sets with 40% of the points randomly removed. Table 9 presents the proportion of days for which results are similar to or better than global copula interpolation. For the local σ² and global ρ combination, every error metric shows an improvement on more days than local copula interpolation alone. Specifically, the proportion of days showing improvement increased from 39% to 62% for MSE and from 55% to 79% for MAE. Table 10 presents the percentage change for each of the error criteria. On average the MSE improves by approximately 2% for the combined prediction, whereas local prediction by itself offers no improvement over global copula interpolation.

Table 9. Proportion of Days for Which the Results Are Equal to or Better Than Those Predicted Using Global Copula Interpolation When 40% of the Data Are Blindly Interpolated^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | 39 | 55 | 70 | 69 | 69 |
| Combined (global σ² and ρ) | 73 | 86 | 73 | 83 | 72 |
| Combined (local σ² and ρ) | 60 | 79 | 79 | 76 | 67 |
| Combined (local σ² and global ρ) | 62 | 79 | 81 | 77 | 74 |

^a The results are the averages for 10 trials using the large data sets for a stormy period of 10 days.

Table 10. Percentage Change in Error Criteria Compared to Global Copula Interpolation When 40% of the Data Are Blindly Interpolated for the Large Data^a

| Prediction Method | MSE (%) | MAE (%) | LEPS (%) | PERC (%) | KS (%) |
| --- | --- | --- | --- | --- | --- |
| Local | 0.0 | 1.4 | 6.1 | 3.1 | 5.2 |
| Combined (global σ² and ρ) | 1.9 | 0.1 | 2.9 | 1.9 | 0.4 |
| Combined (local σ² and ρ) | 1.8 | 2.3 | 5.4 | 4.0 | 4.0 |
| Combined (local σ² and global ρ) | 1.9 | 2.3 | 5.5 | 4.0 | 4.4 |

^a The results are the averages for 10 trials using the large data sets for a stormy period of 10 days.

5. Discussion and Conclusions

[49] There are three main factors that affect the performance of the method of combining forecasts: the merit of the two models used to interpolate the data, the accuracy of estimating the error variance and error correlation at the unknown gauges, and the choice of the copula used.

[50] Using the smaller and larger data sets in the analysis above demonstrates how the relative merit of the model affects the performance of the forecast combination methodology. For a small data set it is unlikely that using a local model will have much utility compared to using a global model, and hence, as the local model on average may be considered worse than the global model, the combination of the two gives only small improvements over the global model. On the other hand, for the larger data sets, the local model is likely to have greater utility. In fact, the results show that on average the local and global models give similar results in terms of MSE, but the combination of the two model predictions consistently yields reduced prediction error. It can be concluded that if one model consistently outperforms another, then the combination of forecasts is unlikely to yield great improvement. However, if both models have merit, then the combination of forecasts will generally outdo the individual models. Note that it has been demonstrated that even when one of the models performs worse on average, a combination of predictions can still improve the overall prediction error.

[51] Estimation of the error variance and correlation at the unknown locations is also a limiting factor. Three methods of estimating these quantities were explored, and while the results differed, all three methods led to improved predictions. A sensitivity analysis was performed. Combination was performed for each of the small data sets; however, the method was modified to include knowledge of the error at each unknown location. On days when the rainfall on average was greater than 5 mm, the average change in MSE for local copula interpolation was −3%. Using forecast combinations changed this to an improvement of about 5%. On average, over 90% of days analyzed had an improvement in MSE. These results are not presented in detail because in practice errors are not known. However, they demonstrate that the ability to accurately predict the error variance at the unknown location is critical for the success of the combination of predictions.

[52] It may be possible to further improve the interpolation through the use of different assumptions on the form of the copula [Bárdossy, 2006; Bárdossy and Li, 2008; Kazianka, 2012; Bárdossy and Pegram, 2012]. For example, Bárdossy and Li [2008] introduced a v-transformed multivariate normal copula and showed this copula to be more appropriate for highly skewed distributions than the standard Gaussian copula. Moreover, Bárdossy and Pegram [2012] show that truncated Gaussian copulas may be more appropriate than the v-transformed copulas for rainfall where a large number of zero values can skew the distribution. However, in this paper, the Gaussian copula was chosen as it is computationally easier to implement, and the primary focus is not to demonstrate a comparison between interpolation methods but to demonstrate a method for combining forecasts in a spatial setting. Visual inspection of the interpolation results showed that the Gaussian assumption performed satisfactorily.

[53] The method of combining forecasts is not limited to use in combining results from global and local Gaussian copula interpolation. Spatial predictions could be combined for any range of methods as long as an estimate of the error variance and correlation can be made. In this paper, the error variance and correlation from local and global copula interpolation of rainfall data were used to derive weights for combining spatial predictions. The result was a consistent improvement in prediction. It may be possible that the combination of spatial predictions for other data sets with less variability, or with different interpolation methods, may prove to be even more beneficial, and this work is planned.

Acknowledgments

[54] The authors are grateful for funding support from the Australian Research Council for this project. The authors would also like to thank the two anonymous reviewers, the Associate Editor, and Editor for their time in reviewing this paper and their constructive feedback which improved the quality of this manuscript.