Downscaling of surface temperature for lake catchment in an arid region in India using linear multiple regression and neural networks

Authors

  • Manish Kumar Goyal,

    Corresponding author
    1. Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, Canada
    2. Department of Civil Engineering, Indian Institute of Technology, Roorkee, India
    • Dept. of Civil and Environmental Engineering, University of Waterloo, Waterloo, N2L 3G1, Canada.
  • C. S. P. Ojha

    1. Department of Civil Engineering, Indian Institute of Technology, Roorkee, India

Abstract

In this paper, downscaling models are developed using Linear Multiple Regression (LMR) and Artificial Neural Networks (ANNs) for obtaining projections of mean monthly maximum and minimum temperatures (Tmax and Tmin) at the lake-basin scale in an arid region in India. The effectiveness of these techniques is demonstrated through application to downscaling of the predictands for the Pichola lake region in Rajasthan State in India, which is considered to be a climatically sensitive region. The predictor variables are extracted from: (i) the National Centers for Environmental Prediction (NCEP) reanalysis dataset for the period 1948–2000; and (ii) the simulations from the third-generation Canadian Coupled Global Climate Model (CGCM3) for emission scenarios A1B, A2, B1, and COMMIT for the period 2001–2100. Scatter plots and cross-correlations are used for verifying the reliability of the simulation of the predictor variables by CGCM3 and for studying the predictor–predictand relationships. The performance of the LMR and ANN models was evaluated based on several statistical performance indicators. The ANN-based models are found to be superior to the LMR-based models, and subsequently the ANN-based model is applied to obtain future climate projections of the predictands. An increasing trend is observed for Tmax and Tmin for the A1B, A2, and B1 scenarios, whereas no trend is discerned under the COMMIT scenario. Copyright © 2011 Royal Meteorological Society

1. Introduction

Global climate models (GCMs) are numerical models that represent the large-scale physical processes of the earth-atmosphere-ocean system. They have been designed to simulate the past, present, and future climate. These climate models have been evolving steadily over the past several decades. Recently, fully coupled Atmosphere–Ocean GCMs (AOGCMs), along with transient methods of forcing the concentration of greenhouse gases, have brought considerable improvement in the climate model results. A complete review of GCMs used in climate variability and change can be found in Meehl et al. (2007). Water resources are inextricably linked with climate, so the prospect of global climate change has serious implications for water resources and regional development (Intergovernmental Panel on Climate Change (IPCC), 2001).

More recently, downscaling has found wide application in hydroclimatology for scenario construction and simulation/prediction of (i) low-frequency rainfall events (Wilby, 1998); (ii) mean temperature (Benestad, 2001); (iii) potential evaporation rates (Weisse and Oestreicher, 2001); (iv) daily Tmax and Tmin (Schoof and Pryor, 2001); (v) daily Tmax and Tmin (Wilby et al., 2002); and transpiration (Misson et al., 2002); (vi) streamflows (Cannon and Whitfield, 2002); (vii) runoff (Arnell et al., 2003); (viii) soil erosion and crop yield (Zhang et al., 2004); (ix) mean, minimum and maximum air temperatures (Kettle and Thompson, 2004); (x) precipitation (Tripathi et al., 2006); (xi) streamflow (Ghosh and Mujumdar, 2008); and (xii) Tmax and Tmin (Anandhi et al., 2009).

Temperature is an important parameter for climate change impact studies, and a proper assessment of probable future temperature and its variability must be made for various hydro-climatological scenarios. In a transient simulation, anthropogenic forcings, which are mostly decided based on IPCC climate scenarios, are changed gradually in a realistic pattern. GCMs are able to simulate reliably the most important mean features of the global climate at planetary scales. However, GCMs are usually run at coarse grid resolution and, as a result, are inherently unable to represent sub-grid-scale features like orography, land use, and the dynamics of mesoscale processes (Huth, 1999; Mearns et al., 2003; Dibike and Coulibaly, 2007). Consequently, outputs from these models cannot be used directly for climate impact assessment at a local scale. This makes them unsuitable for many impact modelers, particularly hydrologists interested in regional-scale hydrological variability. Hence, in the past decade, several downscaling methodologies have been developed to transfer the GCM-simulated information to the local scale (e.g. Carter et al., 1994; Anandhi et al., 2009).

Artificial Neural Networks (ANNs) are used in this application to derive relationships between the large-scale circulation and the response of the local climatic variables. ANNs provide a powerful base learner, with advantages such as nonlinear mapping and noise tolerance, and are increasingly used in the Data Mining (DM) and Machine Learning (ML) fields owing to their good predictive performance (e.g. Rumelhart et al., 1995). ANNs are analogous in application to multiple regression, with the added advantage that they are inherently non-linear and particularly robust at finding and representing relationships in the presence of noisy data. Applications of ANNs for downscaling may be found in Hewitson and Crane (1994), Sailor et al. (2000), and Schoof and Pryor (2001). ANNs have proved particularly effective in downscaling temperature and precipitation, where there are significant non-linear relationships that more traditional techniques such as regression do not capture well.

The objective of this study is to assess the effectiveness of neural networks in downscaling mean monthly maximum temperature (Tmax) and minimum temperature (Tmin), by comparison with linear multiple regression (LMR), for a lake catchment in an arid region, from simulations of CGCM3 for the latest IPCC scenarios. The scenarios studied in this paper are those relevant to the IPCC's fourth assessment report (AR4), released in 2007 (IPCC, 2007).

The remainder of this paper is structured as follows: Section 2 provides a description of the study region and reasons for its selection. Section 3 provides details of various data used in the study. Section 4 describes how the various predictor variables behave for the different scenarios, and the reasons for selection of the probable predictor variables for downscaling. Section 5 explains the proposed methodology for development of the regression-based and ANN-based models for downscaling Tmax and Tmin to the lake basin. Section 6 presents the results and discussion. Finally, Section 7 provides the conclusions drawn from the study.

2. Study region

The study area of this research is the Pichola lake catchment in Rajasthan State in India, situated within 72.5°E to 77.5°E and 22.5°N to 27.5°N. The mean monthly Tmax in the catchment varies from 19 to 39.5 °C, and the mean annual Tmax is 30.6 °C. The mean monthly Tmin ranges from 3.4 to 29.8 °C, based on decadal (1990–2000) observed values. The observed mean monthly Tmax and Tmin for the year 2000 are shown in Figure 1.

Figure 1.

Observed maximum and minimum temperature in the study region for the year 2000. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

The Pichola lake basin is one of the major sources of water supply for this arid region. During the past several decades, the streamflow regime in the catchment has changed considerably, resulting in water scarcity, low agricultural yield, and degradation of the ecosystem in the study area. Regions with arid and semi-arid climates can be sensitive even to insignificant changes in climatic characteristics (Linz et al., 1990). Temperature affects evapotranspiration (Jessie et al., 1996), evaporation, and desertification processes, and is also considered an indicator of environmental degradation and climate change. Understanding the relationships among the hydrologic regime, climate factors, and anthropogenic effects is important for the sustainable management of water resources in the entire catchment; the study area was chosen for these reasons. The location map of the study region is shown in Figure 2.

Figure 2.

Location map of the study region in Rajasthan State of India with NCEP grid. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

3. Data extraction

3.1. Reanalysis data

The monthly mean atmospheric variables were derived from the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR, hereafter called NCEP) reanalysis dataset (Kalnay et al., 1996) for the period January 1948 to December 2000. The data have a horizontal resolution of 2.5° latitude × 2.5° longitude and seventeen constant-pressure levels in the vertical. The atmospheric variables are extracted for nine grid points whose latitudes range from 22.5 to 27.5°N and longitudes from 72.5 to 77.5°E, at a spatial resolution of 2.5°.

3.2. Meteorological data

Tmax and Tmin are used at the monthly time scale from records available for Pichola Lake, which is located in Udaipur at 24°34′N latitude and 73°40′E longitude. The data are available for the period January 1990 to December 2000 (Khobragade, 2009).

3.3. GCM data

The Canadian Center for Climate Modeling and Analysis (CCCma) (http://www.cccma.bc.ec.gc.ca/) provides GCM data for a number of surface and atmospheric variables for the CGCM3 T47 version, which has a horizontal resolution of roughly 3.75° latitude by 3.75° longitude and a vertical resolution of 31 levels. CGCM3 is the third version of the CCCma Coupled Global Climate Model; it makes use of a significantly updated atmospheric component, AGCM3, and uses the same ocean component as CGCM2. The data comprise present-day (20C3M) and future simulations forced by four emission scenarios, namely A1B, A2, B1, and COMMIT. Data from the CGCM3 'climate of the 20th century' (20C3M) experiment were used in this study.

The nine grid points surrounding the study region are selected as the spatial domain of the predictors, to adequately cover the various circulation domains of the predictors considered in this study. The GCM data are re-gridded to a common 2.5° grid using the inverse-square interpolation technique (Willmott et al., 1985). The utility of this interpolation algorithm has been examined in previous downscaling studies (Hewitson and Crane, 1994; Shannon and Hewitson, 1996; Crane and Hewitson, 1998; Tripathi et al., 2006; Ghosh and Mujumdar, 2008; Goyal and Ojha, 2010a, 2010b).
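The inverse-square re-gridding step amounts to inverse-distance weighting with a power of 2. The sketch below is illustrative only; the grid coordinates and values are hypothetical, not taken from the CGCM3 dataset:

```python
def idw_regrid(points, target):
    """Interpolate a value at `target` (lat, lon) from known grid nodes
    using inverse-distance weighting with squared distances."""
    num = den = 0.0
    for lat, lon, value in points:
        d2 = (lat - target[0]) ** 2 + (lon - target[1]) ** 2
        if d2 == 0.0:
            return value  # target coincides with a source node
        w = 1.0 / d2      # inverse-square weight
        num += w * value
        den += w
    return num / den

# Hypothetical example: re-grid one GCM value to a 2.5-degree NCEP node
# from four surrounding GCM (3.75-degree) nodes.
gcm_nodes = [(22.5, 72.5, 301.2), (26.25, 72.5, 300.1),
             (22.5, 76.25, 302.0), (26.25, 76.25, 300.8)]
print(idw_regrid(gcm_nodes, (25.0, 75.0)))
```

The interpolated value is a weighted average of the surrounding nodes, so it always lies within their range; a node that coincides with the target dominates completely.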

The development of downscaling models for each of the predictand variables Tmax and Tmin, begins with selection of potential predictors, followed by training and validation of the LMR and ANN downscaling models. The developed model is then used to obtain projections of Tmax and Tmin from simulations of CGCM3.

4. Selection of predictors

For downscaling predictands, the selection of appropriate predictors is one of the most important steps in a downscaling exercise. The predictors are chosen by the following criteria: (i) they should be skillful in representing the large-scale variability that is simulated by the GCMs; (ii) they should be statistically significant contributors to the variability in the predictand, or they should represent important physical processes in the context of the enhanced greenhouse effect; and (iii) they should not be strongly correlated with each other (Hewitson and Crane, 1994, 1996; Cecilia et al., 2001; Cavazos and Hewitson, 2005; Goyal and Ojha, 2010c). Several studies, such as Dibike and Coulibaly (2006) and Anandhi et al. (2009), have used large-scale atmospheric variables, namely air temperature and zonal and meridional wind velocities at various pressure levels, as predictors for downscaling GCM output to mean monthly maximum and minimum temperatures.

As suggested by Wilby et al. (2004), predictors have to be selected based both on their relevance to the downscaled predictands and on their ability to be accurately represented by the GCMs. The most favourable predictors are strongly correlated with the predictand, are physically sensible, and have the ability to capture the climate-change signal. Scatter plots and cross-correlations are used to select predictors and to detect nonlinearity/linearity in the dependence structure (Dibike and Coulibaly, 2006; Anandhi et al., 2009; Goyal and Ojha, 2010d). Scatter plots and cross-correlations (Table I) between each of the predictor variables in the NCEP and GCM datasets are useful to verify whether the predictor variables are realistically simulated by the GCM. Scatter plots are prepared and cross-correlations are computed between the predictor variables in the NCEP and GCM datasets (Figures 3 and 4). The cross-correlations are estimated using three measures of dependence, namely the product-moment correlation (Pearson, 1896), Spearman's rank correlation (Spearman, 1904a, 1904b), and Kendall's tau (Kendall, 1951). The scatter plots and cross-correlations between the predictor variables in the NCEP dataset and each of the predictands (Figure 5) are useful to verify whether predictor and predictand are well correlated.
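The three dependence measures named above can be computed without a statistics package; the following is a minimal illustrative sketch (the rank computation uses the usual average-rank convention for ties, and the Kendall's tau variant shown ignores tie corrections):

```python
from statistics import mean

def pearson(x, y):
    """Product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def _ranks(x):
    """1-based ranks, averaging over ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(_ranks(x), _ranks(y))

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs."""
    n, conc, disc = len(x), 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                conc += 1
            elif s < 0:
                disc += 1
    return (conc - disc) / (n * (n - 1) / 2)
```

All three measures lie in [−1, 1]; a perfectly monotone increasing predictor–predictand pair gives 1 under each measure.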

Figure 3.

Scatter plots prepared to investigate dependence structure between probable predictor variables in NCEP and GCM datasets. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 4.

Bar plots for cross-correlation computed between probable predictors in NCEP and GCM datasets. P, S and K represent product moment correlation, Spearman's rank correlation and Kendall's tau respectively

Figure 5.

Scatter plots prepared to investigate dependence structure between probable predictor variables in NCEP data and the observed Tmax and Tmin. (a) denotes plots for the predictand Tmax, while (b) denotes plots for the predictand Tmin. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Table I. Cross-correlation computed between probable predictors in NCEP and GCM datasets. P, S, and K represent product moment correlation, Spearman's rank correlation and Kendall's tau, respectively
      Ta925    Ua925    Va925    Va200    Ta200    Ua200    Ta500
P     0.83     0.79     0.67     −0.18    0.66     0.23     0.81
S     0.68     0.56     0.43     −0.14    0.46     0.57     0.64
K     0.87     0.76     0.61     −0.20    0.68     0.73     0.85

5. Development of downscaling model

In order to relate the large-scale weather patterns to the local scale, downscaling is necessary. The relationships between these scales can be determined by a number of methods, including regression (Kilsby et al., 1998), canonical correlation analysis (Heyen et al., 1996; Xoplaki et al., 2000), and artificial neural networks (Hewitson and Crane, 1994; Gardner and Dorling, 1998; Cannon and Lord, 2000; Schoof and Pryor, 2001).

In this study, linear multiple regression and artificial neural networks (ANNs) are used to downscale mean monthly maximum (Tmax) and minimum (Tmin) temperature. The data of the potential predictors are first standardized. Standardization is widely used prior to statistical downscaling to reduce bias (if any) in the mean and the variance of GCM predictors with respect to those of the NCEP-reanalysis data (Wilby et al., 2004). Figure 2 shows the grid points superposed on the map of Rajasthan State of India. In this study, standardization is done for a baseline period of 1948–2000, because this period is long enough to establish a reliable climatology, yet neither so long nor so contemporary as to include a strong global-change signal (Wilby et al., 2004). The dimension of the extracted GCM output dataset is 9 × 3 = 27 (air temperature (925 hPa), zonal wind (925 hPa), and meridional wind (925 hPa) at each of the nine grid points).
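The standardization step scales each predictor series by the mean and standard deviation of the baseline period. A minimal sketch follows; the series values are hypothetical, purely for illustration:

```python
from statistics import mean, pstdev

def standardize(series, baseline):
    """Scale a predictor series using the mean and standard deviation of a
    baseline period, as is common before statistical downscaling
    (cf. Wilby et al., 2004)."""
    mu, sigma = mean(baseline), pstdev(baseline)
    return [(v - mu) / sigma for v in series]

# Hypothetical 925 hPa air-temperature values (degrees C)
ncep_baseline = [29.1, 30.4, 31.0, 28.7, 30.2]
gcm_series = [29.8, 31.2, 30.5]
print(standardize(gcm_series, ncep_baseline))
```

Standardizing the GCM series against the NCEP baseline statistics (rather than its own) is what removes the GCM's mean and variance biases relative to the reanalysis.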

The multi-dimensionality of the predictor set may lead to a computationally complicated, large model with high multi-collinearity (high correlation between the explanatory variables/regressors). To reduce the dimensionality of the explanatory dataset, Principal Component Analysis (PCA) is performed. Using principal components (PCs) as input to a downscaling model makes the model more stable and at the same time reduces its computational burden. The standardized NCEP predictor variables are therefore processed using PCA to extract PCs that are orthogonal and preserve more than 98% of the variance originally present in the data. A feature vector is formed for each month of the record using the PCs. The feature vector is the input to the linear multiple regression and ANN models, and the contemporaneous value of the predictand is the output.
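A minimal sketch of the PCA step, retaining the leading components that preserve at least 98% of the variance; it assumes a months × predictors matrix of standardized values and is not the authors' code:

```python
import numpy as np

def pca_reduce(X, var_threshold=0.98):
    """Project predictors X (months x predictors) onto the leading
    principal components that together explain >= var_threshold of the
    total variance. Returns (scores, explained fraction)."""
    Xc = X - X.mean(axis=0)                 # centre each predictor
    cov = np.cov(Xc, rowvar=False)          # predictor covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition
    idx = np.argsort(eigvals)[::-1]         # sort descending by variance
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, var_threshold)) + 1
    return Xc @ eigvecs[:, :k], float(explained[k - 1])
```

The returned scores are the orthogonal feature-vector columns; in the study, four leading PCs sufficed to reach the 98% threshold from the original 27 dimensions.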

To develop the linear multiple regression and ANN downscaling models, the feature vectors prepared from the NCEP record are partitioned into a training set and a test set. Feature vectors in the training set are used for calibrating the model, and those in the test set are used for validation. The 11-year observed temperature series was split into a calibration period (1990–1995) and a validation period (1996–2000). Various error criteria are used as indices to assess the performance of the models. Based on the latest IPCC scenarios, a total of 10 models were constructed for the predictands using the two approaches: for linear multiple regression there are two models, one per predictand, while for the ANN approach there are eight models, one per scenario for each predictand. These models for mean monthly Tmax and Tmin were evaluated based on the accuracy of the predictions for the training and testing datasets. Table II shows the values of the regression coefficients of the regression models, while Table III gives details of the different ANN downscaling models.

Table II. Description of regression models, input values and model forms. The predictors in the regression equations (PC#) indicate principal component
Model     Predictand    Equation
LMRM1     Tmax          Tmax = −0.1801 − 0.2171 PC1 + 0.2159 PC2 − 0.1201 PC3
LMRM2     Tmin          Tmin = −0.2191 + 0.0398 PC1 + 0.1527 PC2 + 0.0040 PC3
Table III. Different ANN downscaling model variants used in the study for obtaining projections of predictands Tmax and Tmin
Predictand    Period of downscaling    Length of the record    Scenario     Model
Tmax          1990–2100                1990–2000               SRES A1B     ANNM1
Tmin          1990–2100                1990–2000               SRES A1B     ANNM2
Tmax          1990–2100                1990–2000               SRES A2      ANNM3
Tmin          1990–2100                1990–2000               SRES A2      ANNM4
Tmax          1990–2100                1990–2000               SRES B1      ANNM5
Tmin          1990–2100                1990–2000               SRES B1      ANNM6
Tmax          1990–2100                1990–2000               COMMIT       ANNM7
Tmin          1990–2100                1990–2000               COMMIT       ANNM8

6. Results and discussion

Downscaling models are developed following the methodology described in Sections 4 and 5. The results and discussion are presented in this section.

6.1. Potential predictor selection

The most relevant probable predictor variables for developing the ANN downscaling model are identified using scatter plots and the three measures of dependence, following the procedure described in Section 4. The scatter plots and cross-correlations enable verifying the reliability of the simulation of the predictor variables by the GCM and studying the predictor–predictand relationships. The computed cross-correlations are shown in Table I. It is clear from Table I that the predictors at the 925 hPa pressure level performed better than those at any other pressure level investigated in this study. Furthermore, the scatter plots between the probable predictor variables in the NCEP and GCM datasets are shown in Figure 3, while the cross-correlations computed between the same are shown in Figure 4. In general, the predictor variables are realistically simulated by the GCM. Air temperature at 925 hPa (Ta 925) is the most realistically simulated variable, with a cross-correlation (CC) greater than 0.8, while meridional wind at 925 hPa (Va 925) is the least correlated variable between the NCEP and GCM datasets (CC = 0.67; Figures 3 and 4). These figures represent how well the predictors simulated by NCEP and the GCM are correlated. Generally, the correlations are not very high, owing to differences among GCM simulations (e.g. for different runs) and possible errors in the NCEP reanalysis. In addition, inherent errors introduced by re-gridding from the GCM scale to the NCEP scale also contribute to the low correlations.

To investigate the relationship between the probable predictors and the predictands, scatter plots and cross-correlation bar plots between the probable predictor variables in the NCEP data and each of the predictands (Tmax and Tmin) are presented in Figures 5 and 6, respectively. From a perusal of the scatter plots, the linear dependence structure between predictor variables and predictands appears weaker for Tmax than for Tmin. From the two figures, it can be observed that Ta 925 and Ua 925 have high correlations with both predictands, while Va 925 has less correlation with Tmax. Of the two predictands, Tmin is more strongly correlated with the predictors. These results give an overall picture of the relationships between predictors and predictands over all nine grid points considered.

Figure 6.

Bar plots for cross-correlation computed between probable predictors in NCEP data and observed Tmax and Tmin. (a) denotes plots for the predictand Tmax, while (b) denotes plots for the predictand Tmin. P, S and K represent product moment correlation, Spearman's rank correlation and Kendall's tau, respectively

6.2. Downscaling and performance of GCM models

Three predictor variables, namely air temperature (925 hPa), zonal wind (925 hPa), and meridional wind (925 hPa) at the 9 NCEP grid points (a dimensionality of 27), are used; these are highly correlated with each other. Multiple linear regression was first performed on this full dataset. As expected, the results of the performance indicators were very poor (Mean Square Error in the range 2715–4768 and MAE in the range −5.9 to −15.2). PCA (Hughes et al., 1993; Ghosh and Mujumdar, 2006) is performed to transform the set of correlated N-dimensional predictors (N = 27) into another set of N-dimensional uncorrelated vectors (called principal components) by linear combination, such that most of the information content of the original dataset is stored in the first few dimensions of the new set. The four leading principal components (PCs) are observed to explain about 98% of the information content (or variability) of the original predictors. Hence, these PCs are extracted to form feature vectors from the standardized data of the potential predictors. The feature vectors are provided as input to the linear multiple regression and ANN downscaling models.

The different statistical parameters of each model are adjusted during calibration to obtain the best statistical agreement between observed and simulated meteorological variables. For this purpose, various statistical performance measures, such as the Coefficient of Correlation (CR), Sum of Squared Errors (SSE), Mean Square Error (MSE), Root Mean Square Error (RMSE), Normalized Mean Square Error (NMSE), Nash–Sutcliffe efficiency index, and Mean Absolute Error (MAE), were used to measure the performance of the various models. These measures are defined below.

  • A. Coefficient of correlation: The CR can be defined as

    CR = \frac{\sum_{i=1}^{N} (Y_{o,i} - \bar{Y}_o)(Y_{c,i} - \bar{Y}_c)}{\sqrt{\sum_{i=1}^{N} (Y_{o,i} - \bar{Y}_o)^2 \, \sum_{i=1}^{N} (Y_{c,i} - \bar{Y}_c)^2}}    (1)

  • B. Sum of squared errors: The SSE can be defined as

    SSE = \sum_{i=1}^{N} (Y_{o,i} - Y_{c,i})^2    (2)

  • C. Mean Square Error: The MSE between observed and computed outputs can be defined as

    MSE = \frac{1}{N} \sum_{i=1}^{N} (Y_{o,i} - Y_{c,i})^2    (3)

  • D. Root Mean Square Error: The RMSE between observed and computed outputs can be defined as

    RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (Y_{o,i} - Y_{c,i})^2}    (4)

  • E. Normalized Mean Square Error: The NMSE between observed and computed outputs can be defined as (Zhang and Govindaraju, 2000)

    NMSE = \frac{\sum_{i=1}^{N} (Y_{o,i} - Y_{c,i})^2}{N \, \sigma_{obs}^2}    (5)

  • F. Nash–Sutcliffe efficiency index: The Nash–Sutcliffe efficiency index (\eta_1) can be defined as (Nash and Sutcliffe, 1970)

    \eta_1 = 1 - \frac{\sum_{i=1}^{N} (Y_{o,i} - Y_{c,i})^2}{\sum_{i=1}^{N} (Y_{o,i} - \bar{Y}_o)^2}    (6)

  • G. Mean absolute error: The MAE can be defined as follows (Johnson et al., 2003)

    MAE = \frac{1}{N} \sum_{i=1}^{N} |Y_{o,i} - Y_{c,i}|    (7)

where N represents the number of feature vectors prepared from the NCEP record, Yo and Yc denote the observed and simulated values of the predictand respectively, Ȳo and Ȳc are their means, and σobs is the standard deviation of the observed predictand.
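The seven measures can be collected into one helper; this is an illustrative implementation of the definitions above (using the population standard deviation for σobs), not the authors' code:

```python
import math

def metrics(y_obs, y_sim):
    """Compute CR, SSE, MSE, RMSE, NMSE, Nash-Sutcliffe index, and MAE
    between observed and simulated series of equal length."""
    n = len(y_obs)
    mo = sum(y_obs) / n
    ms = sum(y_sim) / n
    sse = sum((o - s) ** 2 for o, s in zip(y_obs, y_sim))
    mse = sse / n
    var_o = sum((o - mo) ** 2 for o in y_obs)   # equals N * sigma_obs^2
    cr = (sum((o - mo) * (s - ms) for o, s in zip(y_obs, y_sim))
          / math.sqrt(var_o * sum((s - ms) ** 2 for s in y_sim)))
    return {
        "CR": cr,
        "SSE": sse,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "NMSE": sse / var_o,        # normalised by N * sigma_obs^2
        "NSE": 1.0 - sse / var_o,   # Nash-Sutcliffe efficiency index
        "MAE": sum(abs(o - s) for o, s in zip(y_obs, y_sim)) / n,
    }
```

Note that with this normalisation NSE = 1 − NMSE, which is consistent with the paired NMSE and N-S Index values reported in Tables IV–VIII.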

The results of the various statistics for the linear multiple regression models are presented in Table IV. It can be inferred from Table IV that both linear multiple regression models performed poorly in terms of all performance indicators. The architecture of the ANN is decided by a trial-and-error procedure: a comprehensive search over ANN architectures is done by varying the number of nodes in the hidden layer(s). The network is trained using a back-propagation algorithm. A tan-sigmoid activation function is used in the hidden layer(s), whereas a linear activation function is used in the output layer. The network error is computed by comparing the network output with the target (desired) output, with mean square error used as the error function. Results of the different models (ANNM1 to ANNM8, as listed in Table III) are tabulated in Tables V–VIII. It can be observed from Tables V–VIII that the performance of the ANNs for mean monthly Tmax and Tmin is clearly superior to that of the LMR-based models (Table IV): all statistical performance indicators between predicted and observed values are better for the ANN models.

Table IV. Model evaluation statistics for regression models
Model     CR                  SSE                     MSE                 RMSE
          Train     Valid     Train       Valid       Train     Valid     Train    Valid
LMRM1     0.30      0.38      2207.09     1680.52     30.65     28.01     5.54     5.29
LMRM2     −0.31     −0.33     5626.34     5175.53     78.14     86.26     8.84     9.29

Model     NMSE                N-S Index             MAE
          Train     Valid     Train      Valid      Train     Valid
LMRM1     1.42      1.35      −0.44      −0.37      −0.25     −0.25
LMRM2     1.42      1.40      −0.44      −0.42      −0.16     −0.19
Table V. Various performance statistics for SRES A1B scenario
Model     Hidden nodes    CR                 SSE                   MSE
                          Train     Valid    Train      Valid      Train    Valid
ANNM1     5               0.98      0.94     59.31      157.58     0.82     2.63
ANNM2     5               0.99      0.96     66.21      319.47     0.92     5.32

Model     RMSE               NMSE               N-S Index          MAE
          Train    Valid     Train    Valid     Train    Valid     Train    Valid
ANNM1     0.91     1.62      0.04     0.13      0.96     0.87      0.83     0.64
ANNM2     0.96     2.31      0.02     0.09      0.98     0.91      0.89     0.79
Table VI. Various performance statistics for SRES A2 scenario
Model     Hidden nodes    CR                 SSE                   MSE
                          Train     Valid    Train      Valid      Train    Valid
ANNM3     4               0.98      0.95     52.35      153.96     0.73     2.57
ANNM4     5               0.99      0.97     76.54      231.00     1.06     3.85

Model     RMSE               NMSE               N-S Index          MAE
          Train    Valid     Train    Valid     Train    Valid     Train    Valid
ANNM3     0.85     1.60      0.03     0.12      0.97     0.87      0.84     0.65
ANNM4     1.03     1.96      0.02     0.06      0.98     0.94      0.88     0.81
Table VII. Various performance statistics for SRES B1 scenario
Model     Hidden nodes    CR                 SSE                   MSE
                          Train     Valid    Train      Valid      Train    Valid
ANNM5     4               0.98      0.94     70.10      175.66     0.97     2.93
ANNM6     5               0.99      0.97     95.59      210.41     1.33     3.51

Model     RMSE               NMSE               N-S Index          MAE
          Train    Valid     Train    Valid     Train    Valid     Train    Valid
ANNM5     0.99     1.71      0.05     0.14      0.95     0.86      0.82     0.63
ANNM6     1.15     1.87      0.02     0.06      0.98     0.94      0.87     0.81
Table VIII. Various performance statistics for COMMIT scenario
Model     Hidden nodes    CR                 SSE                   MSE
                          Train     Valid    Train      Valid      Train    Valid
ANNM7     6               0.98      0.95     56.53      155.63     0.79     2.59
ANNM8     5               0.99      0.96     68.09      318.67     0.95     5.31

Model     RMSE               NMSE               N-S Index          MAE
          Train    Valid     Train    Valid     Train    Valid     Train    Valid
ANNM7     0.89     1.61      0.04     0.13      0.96     0.87      0.84     0.66
ANNM8     0.97     2.30      0.02     0.09      0.98     0.91      0.88     0.78
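The ANN training procedure described above (one tan-sigmoid hidden layer, linear output, back-propagation on mean square error) can be sketched as follows. This is a minimal illustrative implementation, not the authors' code; the learning rate, epoch count, and initialization are assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def train_ann(X, y, hidden=5, lr=0.1, epochs=3000):
    """Train a tiny 1-hidden-layer network (tanh hidden units, linear
    output) with full-batch gradient descent on the MSE loss."""
    n_in = X.shape[1]
    W1 = rng.normal(0, 0.5, (n_in, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, 1));    b2 = np.zeros(1)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # hidden activations
        out = H @ W2 + b2                 # linear output layer
        err = out - y                     # error signal, shape (N, 1)
        # back-propagate the error
        gW2 = H.T @ err / len(X); gb2 = err.mean(axis=0)
        dH = (err @ W2.T) * (1.0 - H ** 2)   # tanh derivative
        gW1 = X.T @ dH / len(X); gb1 = dH.mean(axis=0)
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2
```

In the study, the feature vectors of principal components play the role of X and the standardized predictand that of y; the number of hidden nodes (4–6 in Tables V–VIII) is the quantity varied in the trial-and-error architecture search.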

Once the downscaling models have been calibrated and validated, the next step is to use them to downscale the control scenario simulated by the GCM. The GCM simulations are run through the calibrated and validated ANN downscaling models to obtain future simulations of the predictands. The patterns of the predictands (viz. Tmax and Tmin) are analysed with box plots for 20-year time slices. The middle line of the box gives the median, whereas the upper and lower edges give the 75th and 25th percentiles of the dataset, respectively. The difference between the 75th and 25th percentiles is known as the Inter-Quartile Range (IQR). The whiskers extend from the box to the most extreme values lying within 1.5 × IQR below the first quartile and 1.5 × IQR above the third quartile (or to the minimum and maximum values, whichever are nearer). Typical results of the downscaled predictands (Tmax and Tmin) obtained from the predictors are presented in Figures 7 and 8. In part (i) of these figures, the Tmax and Tmin downscaled using the NCEP and GCM datasets are compared with the observed Tmax and Tmin for the study region using box plots. The projected Tmax and Tmin for 2001–2020, 2021–2040, 2041–2060, 2061–2080, and 2081–2100 under the four scenarios A1B, A2, B1, and COMMIT are shown in parts (ii), (iii), (iv), and (v), respectively.
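The box-plot statistics described above (quartiles, IQR, and whiskers clipped at 1.5 × IQR beyond the quartiles, i.e. Tukey's rule) can be computed as a simple sketch:

```python
from statistics import quantiles

def box_stats(data):
    """Summary statistics behind a box plot: quartiles, IQR, and whisker
    positions clipped at 1.5 * IQR beyond the quartiles."""
    q1, q2, q3 = quantiles(data, n=4)   # 25th, 50th, 75th percentiles
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    lower = min(v for v in data if v >= lo_fence)  # lowest non-outlier
    upper = max(v for v in data if v <= hi_fence)  # highest non-outlier
    return {"Q1": q1, "median": q2, "Q3": q3, "IQR": iqr,
            "lower_whisker": lower, "upper_whisker": upper}
```

Applied to each 20-year slice of downscaled Tmax or Tmin, this yields the box edges, median line, and whisker bounds plotted in Figures 7 and 8; values beyond the whiskers would be drawn as outliers.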

Figure 7.

Box-plot results from the ANN-based downscaling model for the predictand Tmax. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 8.

Box-plot results from the ANN-based downscaling model for the predictand Tmin. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

From the box plots of the downscaled predictands (Figures 7 and 8), it can be observed that Tmax and Tmin are projected to increase in the future for the A1B, A2, and B1 scenarios, whereas no trend is discerned under the COMMIT scenario. A comparison of mean monthly observed Tmax and Tmin with those simulated using the various ANN downscaling models is shown in Figures 9 to 16 for the calibration period (1990–1995) and the validation period (1996–2000).

Figure 9.

Typical results for comparison of the monthly observed Tmax with Tmax simulated using ANN downscaling model 1 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 10.

Typical results for comparison of the monthly observed Tmin with Tmin simulated using ANN downscaling model 2 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 11.

Typical results for comparison of the monthly observed Tmax with Tmax simulated using ANN downscaling model 3 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 12.

Typical results for comparison of the monthly observed Tmin with Tmin simulated using ANN downscaling model 4 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 13.

Typical results for comparison of the monthly observed Tmax with Tmax simulated using ANN downscaling model 5 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 14.

Typical results for comparison of the monthly observed Tmin with Tmin simulated using ANN downscaling model 6 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 15.

Typical results for comparison of the monthly observed Tmax with Tmax simulated using ANN downscaling model 7 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

Figure 16.

Typical results for comparison of the monthly observed Tmin with Tmin simulated using ANN downscaling model 8 for NCEP data. In the figure, calibration period is from 1990 to 1995, and the rest is validation period. This figure is available in colour online at wileyonlinelibrary.com/journal/joc

7. Conclusion

This paper investigates the applicability of linear multiple regression and neural networks for downscaling mean monthly maximum temperature (Tmax) and minimum temperature (Tmin) from GCM output to the local scale. The proposed neural network model is shown to be statistically superior to the linear multiple regression-based downscaling model. The effectiveness of this model is demonstrated through application to a lake catchment in an arid region in India. The predictands are downscaled from simulations of CGCM3 for four IPCC scenarios, namely SRES A1B, A2, B1, and COMMIT. Scatter plots and cross-correlations were used to assess the reliability of the predictor variables simulated by the GCM.

The results of the downscaling models show that Tmax and Tmin are projected to increase in the future for the A1B, A2, and B1 scenarios, whereas no trend is discerned under the COMMIT scenario. These results are in agreement with the temperature projections obtained by Anandhi et al. (2009) for another river basin in India.

Table APPENDIX. Weights and biases for NN model ANNM1 using back-propagation algorithm

Input-to-hidden weights
            h11         h12         h13         h14         h15
i1       −0.2403      0.6344      0.0888     −0.2749      0.2639
i2        0.1038     −0.2630     −0.8860     −0.0962      0.5937
i3        0.4344     −0.3251      0.3251      0.1820      0.2802
i4       −0.7984      0.5854      0.4896     −1.0193      0.3919

Hidden-layer biases
            b11         b12         b13         b14         b15
         −2.3658     −0.9981     −0.4867      2.3997      2.5190

Hidden-to-output weights
               O1
H21         0.8937
H22        −0.4407
H23         0.5850
H24         0.8870
H25        −0.5144

Output bias bo1: 0.3671

Architecture: input layer 4 nodes, hidden layer 5 nodes, output layer 1 node
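For illustration, the appendix weights define a complete 4-5-1 forward pass. The sketch below assumes the tan-sigmoid hidden / linear output structure described in Section 6 and a standardized 4-component feature vector as input:

```python
import numpy as np

# Input-to-hidden weights (4 inputs x 5 hidden nodes) and hidden biases,
# transcribed from the appendix table for model ANNM1.
W1 = np.array([[-0.2403,  0.6344,  0.0888, -0.2749,  0.2639],
               [ 0.1038, -0.2630, -0.8860, -0.0962,  0.5937],
               [ 0.4344, -0.3251,  0.3251,  0.1820,  0.2802],
               [-0.7984,  0.5854,  0.4896, -1.0193,  0.3919]])
b1 = np.array([-2.3658, -0.9981, -0.4867, 2.3997, 2.5190])
# Hidden-to-output weights and output bias.
W2 = np.array([0.8937, -0.4407, 0.5850, 0.8870, -0.5144])
b2 = 0.3671

def annm1_forward(pc):
    """Forward pass of ANNM1 (tan-sigmoid hidden layer, linear output)
    for one feature vector of 4 principal components."""
    return float(np.tanh(pc @ W1 + b1) @ W2 + b2)
```

The output is in the standardized units of the predictand; recovering a temperature in °C would require reversing the standardization applied during model development.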
