Prediction of air temperature is of primary importance for land evaluation and characterization systems as well as for hydrological and ecological models (Benavides et al., 2007). In such models, air temperature is used as an input parameter to derive other processes such as evapotranspiration, soil decomposition and plant productivity (Dodson and Marks, 1997). Accurate forecasting of this parameter is also needed for determining site suitability for agricultural and forest crops, predicting the soil surface temperature and avoiding the hazardous effects of temperature variations (Hudson and Wackernagel, 1994; George, 2001; Ustaoglu et al., 2008).
In recent years, global warming has attracted considerable attention from scientists. Global warming refers to an average increase in the temperature of the Earth's surface and lower atmosphere, which in turn causes climate change. Increasing Earth surface temperature may lead to changes in rainfall patterns, a rise in sea level, and a wide range of impacts on plants, wildlife and humans. For this reason, the importance of temperature prediction has increased all over the world (Bilgili and Sahin, 2010).
So far, a number of attempts have been made to model air temperature variations (Kiraly and Janosi, 2002; Bartos and Janosi, 2006; Gyure et al., 2007; Guan et al., 2009), which have emphasized the need for accurate estimation of air temperature in various aspects of meteorology, hydrology and agro-hydrology. Therefore, there is a clear need for more accurate models that can address the nonlinearity of the air temperature variation process.
In the recent past, Artificial Intelligence (AI) approaches [e.g. Artificial Neural Networks (ANN), Adaptive Neuro-Fuzzy Inference System (ANFIS), etc.] have been successfully used in a wide range of scientific applications including water resources engineering, agro-hydrology and agro-meteorology. A complete review of such applications is beyond the scope of this paper and only some relevant literature is discussed here.
Tatli and Sen (1999) introduced a fuzzy modelling approach for predicting air temperature. Abdel-Aal (2004) applied an abductive neural network approach to forecast hourly air temperature. Smith et al. (2005) developed an enhanced ANN for air temperature prediction by including information on seasonality and modifying parameters of an existing ANN model. Shank et al. (2008) applied neural networks for predicting dew-point temperature. Partal and Kisi (2007) introduced a new wavelet-neuro-fuzzy conjunction model for precipitation forecasting. Bilgili and Sahin (2010) used ANN for predicting long-term monthly temperature and rainfall in Turkey. Kisi and Shiri (2011) introduced new hybrid wavelet-AI models for precipitation forecasting. Shiri et al. (2011) applied ANFIS for estimating daily pan evaporation values from weather data at at-station as well as cross-station scales. Kisi et al. (2012) introduced a generalized ANFIS model of daily pan evaporation estimation using weather data.
In this study, the applicability of ANFIS and ANN models was investigated for predicting long-term monthly air temperature using geographical input data. The data from 30 weather stations in Iran were used for training and testing of the ANFIS and ANN models.
2. Materials and methods
2.1. Study area and data analysis
In this study, data from 30 weather stations in Iran were used. Figure 1 shows a map of the studied weather stations, and the corresponding geographical positions of the stations are given in Table 1. Figure 2 displays the long-term average air temperature values. These values were obtained by averaging the air temperature over all the studied weather stations. A linear variation of temperature between the minimum and maximum temperature values was assumed for calculating the average air temperature. The data cover the long-term averaged temperatures for the period 1986–2000. It is clear from Figure 2 that the long-term monthly temperature in Iran varies dramatically during a year. The long-term monthly temperature ranges from as low as −2.91 °C in January (Zanjan station) to as high as 38.21 °C in July (Ahwaz station).
Table 1. Summary of the geographical information of the studied weather stations
aT(°C), average air temperature during the study period.
2.2. Artificial neural networks
An ANN has one or more hidden layers, whose computation nodes are correspondingly called hidden neurons or hidden units. The hidden neurons intervene between the external input and the output in some useful manner. By adding one or more hidden layers, the network is enabled to extract higher order statistics. In a rather loose sense, the ANN acquires a global perspective despite its local connectivity, owing to the extra set of synaptic connections and the extra dimension of network interconnections.
The ANN was trained here using the Levenberg–Marquardt (LM) technique because it is more powerful and faster than the conventional gradient descent technique (Hagan and Menhaj, 1994; Kisi, 2007). Back propagation with gradient descent is a steepest descent algorithm, while the LM algorithm is an approximation to Newton's method (Marquardt, 1963). If a function V(x) is to be minimized with respect to the parameter vector x, then Newton's method would be
$$\Delta x = -\left[\nabla^2 V(x)\right]^{-1} \nabla V(x) \tag{1}$$
where $\nabla^2 V(x)$ is the Hessian matrix and $\nabla V(x)$ is the gradient. Let us assume that V(x) is a sum of square functions
$$V(x) = \sum_{i=1}^{N} e_i^2(x) \tag{2}$$
then it can be shown that
$$\nabla V(x) = 2 J^T(x)\, e(x) \tag{3}$$
$$\nabla^2 V(x) = 2 J^T(x) J(x) + 2 S(x) \tag{4}$$
where J(x) is the Jacobian matrix of the error vector e(x) and
$$S(x) = \sum_{i=1}^{N} e_i(x)\, \nabla^2 e_i(x) \tag{5}$$
For the Gauss–Newton method, it is assumed that S(x) ≈ 0, and the update of Equation (1) becomes
$$\Delta x = -\left[J^T(x) J(x)\right]^{-1} J^T(x)\, e(x) \tag{6}$$
The LM modification to the Gauss–Newton method is
$$\Delta x = -\left[J^T(x) J(x) + \mu I\right]^{-1} J^T(x)\, e(x) \tag{7}$$
where I is the identity matrix. The parameter μ is multiplied by some factor (β) whenever a step would increase V(x). When a step reduces V(x), μ is divided by β. When μ is large the algorithm becomes steepest descent (with a step proportional to 1/μ), while for small μ it becomes Gauss–Newton. The LM algorithm can be considered a trust-region modification to Gauss–Newton. The computation of the Jacobian matrix is the key step in this algorithm. The terms in the Jacobian matrix can be computed by a simple modification to the back propagation algorithm for the neural network-mapping problem (Hagan and Menhaj, 1994).
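The damping rule described above can be illustrated with a minimal NumPy sketch. It is not the training code used in the study: the function name `levenberg_marquardt`, the default values of `mu` and `beta`, and the toy exponential-fitting problem are all assumptions chosen for illustration, and the Jacobian is supplied analytically rather than obtained through back propagation.

```python
import numpy as np


def levenberg_marquardt(residual, jacobian, x0, mu=0.01, beta=10.0,
                        max_iter=100, tol=1e-10):
    """Minimise V(x) = sum(e_i(x)**2) with a damped Gauss-Newton update."""
    x = np.asarray(x0, dtype=float)
    e = residual(x)
    v = float(e @ e)
    for _ in range(max_iter):
        J = jacobian(x)
        g = J.T @ e                      # J^T e, proportional to the gradient of V
        if np.linalg.norm(g) < tol:      # stationary point reached
            break
        # Solve (J^T J + mu I) dx = -J^T e for the step dx.
        dx = np.linalg.solve(J.T @ J + mu * np.eye(x.size), -g)
        e_new = residual(x + dx)
        v_new = float(e_new @ e_new)
        if v_new < v:                    # V decreased: accept, shift towards Gauss-Newton
            x, e, v = x + dx, e_new, v_new
            mu /= beta
        else:                            # V increased: reject, shift towards steepest descent
            mu *= beta
    return x


# Hypothetical toy problem: recover a and b in y = a * exp(b * t).
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)


def residual(x):
    return x[0] * np.exp(x[1] * t) - y


def jacobian(x):
    return np.column_stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)])


x_fit = levenberg_marquardt(residual, jacobian, x0=[1.0, 0.0])
```

In this sketch a rejected step leaves x unchanged and only increases μ, so V(x) never increases from one accepted iterate to the next.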
2.3. Adaptive neuro-fuzzy inference system
An ANFIS is a combination of an adaptive ANN and a fuzzy inference system (FIS). The parameters of the FIS are determined by the neural network learning algorithms. Since the system is based on an FIS, an important aspect is that it should always remain interpretable in terms of fuzzy IF-THEN rules. ANFIS is capable of approximating any real continuous function on a compact set to any degree of accuracy (Jang et al., 1997). ANFIS identifies its parameters through a hybrid learning rule that combines back propagation gradient descent with a least-squares method. There are mainly two approaches for fuzzy inference systems, namely those of Mamdani (Mamdani and Assilian, 1975) and Sugeno (Takagi and Sugeno, 1985). The difference between the two approaches lies in the consequent part: Mamdani's approach uses fuzzy membership functions, while Sugeno's approach uses linear or constant functions. The neuro-fuzzy model used in this study implements Sugeno's fuzzy approach with the geographical information of each station as input variables and air temperature values as the output variable.
As a simple example, an FIS with two inputs x and y and one output z is assumed. Here, x and y may be considered as latitude (ϕ) and longitude (λ), whereas the output z represents the air temperature (TA). Suppose that the rule base contains two fuzzy IF-THEN rules:
$$\text{Rule 1: IF } x \text{ is } A_1 \text{ and } y \text{ is } B_1 \text{ THEN } z_1 = p_1 x + q_1 y + r_1 \tag{8}$$
$$\text{Rule 2: IF } x \text{ is } A_2 \text{ and } y \text{ is } B_2 \text{ THEN } z_2 = p_2 x + q_2 y + r_2 \tag{9}$$
where $A_i$ and $B_i$ are fuzzy sets defined on the inputs. The IF (antecedent) part is fuzzy in nature, while the THEN (consequent) part is a crisp function of the antecedent variables (as a rule, a linear equation). For the above example, Equations (8) and (9) can be written as:
$$\text{Rule 1: IF } \phi \text{ is } A_1 \text{ and } \lambda \text{ is } B_1 \text{ THEN } T_{A,1} = p_1 \phi + q_1 \lambda + r_1$$
$$\text{Rule 2: IF } \phi \text{ is } A_2 \text{ and } \lambda \text{ is } B_2 \text{ THEN } T_{A,2} = p_2 \phi + q_2 \lambda + r_2$$
where $p_i$, $q_i$ and $r_i$ are parameters with i = 1, 2, 3, …, n corresponding to Rule 1, Rule 2, Rule 3, …, Rule n. In a Type 3 Sugeno fuzzy model, the output of each rule is a linear combination of the input variables plus a constant term, and the final output z is the weighted average of the rule outputs. More information about ANFIS theory can be found in Jang (1993) and Jang et al. (1997). For a given input–output dataset (for example, predicting air temperature or precipitation from chronological or geographical data), various Sugeno models may be developed by using different identification methods (i.e. grid partitioning and subtractive clustering); the commonly used grid partitioning identification method was used in this study. Grid partitioning partitions each antecedent variable independently by defining the membership functions of all antecedent variables.
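The weighted-average inference of a two-rule Sugeno model can be sketched as follows. This is only an illustration: the Gaussian membership parameters, the consequent coefficients and the use of the product to combine the two antecedent grades are assumptions, and none of the numbers come from the fitted model.

```python
import numpy as np


def gauss_mf(value, centre, sigma):
    """Gaussian membership grade of `value` in a fuzzy set."""
    return np.exp(-0.5 * ((value - centre) / sigma) ** 2)


def sugeno_output(x, y, rules):
    """Weighted average of first-order Sugeno rule outputs."""
    weights, outputs = [], []
    for rule in rules:
        # Firing strength: product of the two antecedent grades (assumed operator).
        w = gauss_mf(x, *rule["A"]) * gauss_mf(y, *rule["B"])
        p, q, r = rule["pqr"]
        weights.append(w)
        outputs.append(p * x + q * y + r)   # crisp linear consequent
    weights = np.array(weights)
    return float(weights @ np.array(outputs) / weights.sum())


# Hypothetical rule base: (centre, sigma) for each fuzzy set, (p, q, r) for each consequent.
rules = [
    {"A": (30.0, 5.0), "B": (50.0, 5.0), "pqr": (0.0, 0.0, 10.0)},
    {"A": (40.0, 5.0), "B": (60.0, 5.0), "pqr": (0.0, 0.0, 20.0)},
]

z_mid = sugeno_output(35.0, 55.0, rules)         # equally close to both rules
z_near_rule1 = sugeno_output(30.0, 50.0, rules)  # dominated by Rule 1
```

At the point halfway between the two rule centres both rules fire equally, so the output is the plain mean of the two consequents. Closer to the centre of Rule 1, the output moves towards that rule's consequent.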
2.4. Performance evaluation parameters
Three statistical evaluation criteria were used to assess the models' performance. The first is the coefficient of determination (R2):
$$R^2 = \frac{\left[\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})\right]^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Since the Pearson correlation coefficient (R) and the coefficient of determination (R2) only provide information on the linear dependence between observations and the corresponding simulations, they should not be applied alone as performance indicators (Legates and McCabe, 1999). Therefore, other statistical measures such as MAE (a linear scoring rule that describes only the average magnitude of the errors, ignoring their direction) and RMSE (which describes the average magnitude of the errors while giving more weight to large errors) should also be applied to evaluate the models' performance. These scores are defined as:
Root mean-squared error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2}$$
Mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - y_i \right|$$
where $x_i$ and $y_i$ denote the observed and corresponding simulated values at the ith time step, respectively, and n is the number of time steps. Also, $\bar{x}$ and $\bar{y}$ represent the mean values of the observed and simulated values, respectively.
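The three criteria can be computed in a few lines of NumPy. The sketch below assumes R2 is taken as the squared Pearson correlation between observations and simulations; the function names and sample values are hypothetical.

```python
import numpy as np


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    dx = obs - obs.mean()
    dy = sim - sim.mean()
    return float((dx @ dy) ** 2 / ((dx @ dx) * (dy @ dy)))


def rmse(obs, sim):
    """Root mean-squared error."""
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def mae(obs, sim):
    """Mean absolute error."""
    return float(np.mean(np.abs(obs - sim)))


# Hypothetical monthly temperatures (degrees C): the simulation is biased by +1.5.
obs = np.array([2.0, 10.0, 20.0, 30.0])
sim = obs + 1.5

scores = {"R2": r_squared(obs, sim), "RMSE": rmse(obs, sim), "MAE": mae(obs, sim)}
```

A constant bias leaves R2 at 1 while RMSE and MAE both equal the bias. This is why the correlation-based measure is reported together with the two error measures.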
3. Results and discussions
This paper aims at estimating long-term monthly averaged temperature values at 30 weather stations in Iran by using ANFIS and ANN techniques. To this end, the month number and the station latitude, longitude and altitude were used as input parameters to the ANN and ANFIS models. Monthly data of 20 weather stations (20 stations × 12 months = 240 data) were used for training and the data of the remaining 10 stations (10 stations × 12 months = 120 data) were used for testing. The stations were randomly assigned to the training and testing sets. The stations used for testing are Bandar-e-Abbas, Birjand, Bojnurd, Bushehr, Kerman, Mashhad, Semnan, Shiraz, Yazd and Zahedan. Before applying the ANN models to the data, the training input and output values were normalized using the following equation
$$x_{norm} = a \, \frac{x - x_{min}}{x_{max} - x_{min}} + b$$
where xmin and xmax are the minimum and maximum of the training dataset. In this study, a and b were taken as 0.6 and 0.2, so that the training data were normalized into the range [0.2, 0.8] following the suggestion of Cigizoglu (2003), who showed that scaling input data between 0.2 and 0.8 gives the ANNs the flexibility to predict beyond the training range. The LM algorithm was used for calculating the ANN weights because this technique is more powerful and faster than the conventional gradient descent technique (Hagan and Menhaj, 1994; Kisi, 2007). A difficult task with ANNs is choosing the number of hidden nodes. Here, an ANN with one hidden layer was used and the number of hidden nodes was determined by trial and error. The tangent sigmoid activation function was used for the hidden and output nodes. The ANN training was stopped after 100 epochs because the error changed very little beyond that point.
For the ANFIS model, Gaussian membership functions and 200 iterations were used. Several types of membership functions (MFs) can be used when implementing fuzzy logic. However, recent studies have shown that the type of MF does not fundamentally affect the results (Vernieuwe et al., 2005). Different numbers of membership functions were tested and the number that gave the minimum mean square error (MSE) was selected, which was 3 MFs for each variable.
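The scaling into [0.2, 0.8] and its inverse can be sketched as below. The function names and the sample temperatures are hypothetical; only a = 0.6 and b = 0.2 are taken from the text.

```python
import numpy as np


def normalise(x, x_min, x_max, a=0.6, b=0.2):
    """Map values so that [x_min, x_max] lands on [b, a + b]."""
    return a * (x - x_min) / (x_max - x_min) + b


def denormalise(x_norm, x_min, x_max, a=0.6, b=0.2):
    """Invert normalise() to recover values in the original units."""
    return (x_norm - b) / a * (x_max - x_min) + x_min


# Hypothetical training temperatures (degrees C).
train = np.array([-2.0, 5.0, 15.0, 30.0])
x_min, x_max = train.min(), train.max()

scaled = normalise(train, x_min, x_max)
restored = denormalise(scaled, x_min, x_max)

# A test value above the training maximum maps above 0.8 but stays below 1.
beyond = normalise(35.0, x_min, x_max)
```

The limits x_min and x_max are taken from the training data only and reused for the test data. Test values somewhat outside the training range then fall in the margin between 0.8 and 1 (or 0 and 0.2), which a tangent sigmoid output node can still reach.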
The ANN and ANFIS models are compared for the training stations in Table 2. The ANN model has the lowest RMSE (0.68 °C) and MAE (0.54 °C) for the Rasht station. The worst ANN estimates belong to the Ardabil station, with an RMSE of 2.72 °C and an MAE of 2.35 °C. In the case of ANFIS, the best and worst results were obtained for the Yasuj (RMSE = 0.29 °C, MAE = 0.25 °C) and Tabriz (RMSE = 2.19 °C, MAE = 1.91 °C) stations, respectively. It can be seen from Table 2 that the ANFIS model (RMSE range 0.29–2.19 °C) is better than the ANN model (RMSE range 0.68–2.72 °C) in the training period.
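As a rough indication of model size, the number of rules and adjustable parameters implied by grid partitioning can be counted directly. The sketch assumes a full grid over the four inputs with three Gaussian MFs each (two parameters per MF) and first-order consequents. It is a count under these assumptions only, not a description of the exact configuration used in the study.

```python
from itertools import product

inputs = ["month", "latitude", "longitude", "altitude"]
mfs_per_input = 3

# Grid partitioning: one rule per combination of membership functions.
rules = list(product(range(mfs_per_input), repeat=len(inputs)))
n_rules = len(rules)

premise_params = len(inputs) * mfs_per_input * 2   # centre and width of each Gaussian MF
consequent_params = n_rules * (len(inputs) + 1)    # linear coefficients plus a constant term
n_training_samples = 20 * 12                       # 20 stations x 12 months
```

Under these assumptions the grid contains 81 rules with 405 consequent and 24 premise parameters, compared with 240 training samples.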
Table 2. Summary of the training process of ANN and ANFIS models
Testing results of the ANN and ANFIS models for each station are given in Table 3. It is clear from the table that the RMSE values range from 1.53 to 4.20 °C for the ANN model, while they range from 1.18 to 9.25 °C for the ANFIS model. For the ANN model, the maximum RMSE (4.20 °C) and MAE (3.79 °C) values were found for the Zahedan station. For the ANFIS model, however, the maximum RMSE and MAE values were found to be 9.25 and 7.91 °C for the Bandar-e-Abbas station. The best ANN (RMSE = 1.53 °C, MAE = 1.27 °C) and ANFIS (RMSE = 1.18 °C, MAE = 0.82 °C) results were found for the Yazd station. It is seen from Table 3 that the performance values of the ANN model are generally better than those of the ANFIS model in long-term monthly temperature prediction. In seven of ten stations, the ANN model performs better than the ANFIS model. The maximum R2 values between the observed and predicted values for the ANN and ANFIS models were found to be 0.995 and 0.999 for the Semnan and Shiraz meteorological stations, respectively. The minimum R2 values were found to be 0.921 and 0.876 for the ANN and ANFIS models, respectively, both for the Bandar-e-Abbas station. Both the ANN and ANFIS models give poor estimates for the coastal stations (Bandar-e-Abbas and Bushehr). The reason may be that most of the training data come from inland stations, so the training samples may not be sufficient for learning the behaviour of the coastal stations.
Table 3. Summary of the testing process of ANN and ANFIS models
The test results of the ANN and ANFIS models are compared in Figures 3–12. It can be seen from the figures that the ANN predictions are generally closer to the corresponding observed temperatures than those of the ANFIS model. ANFIS seems to perform better than the ANN for the Semnan, Birjand, Kerman and Yazd stations. Both the ANN and ANFIS models overestimate the observed monthly temperatures of the Bandar-e-Abbas, Birjand, Bushehr, Kerman, Shiraz and Zahedan stations. For the Bojnurd, Mashhad and Yazd stations, the ANN model generally overestimates while the ANFIS model underestimates. For the Semnan station, however, the ANN model underestimates while the ANFIS model overestimates the observed temperatures.
Knowledge of air temperature values is of great importance for irrigation scheduling and hydrological management as well as for soil- and plant-related studies and agro-hydrologic fields. In this article, the abilities of ANFIS and ANN models to predict long-term monthly air temperature values from geographical input data were investigated. The data from 30 weather stations in Iran were used for training and testing of the introduced models. The ANFIS and ANN models were compared with each other with respect to root mean-squared error, mean absolute error and determination coefficient statistics. The ANFIS model was found to be better than the ANN in the training period. In the test period, however, the ANN model performed better than the ANFIS model in seven of ten stations. For the ANN and ANFIS models, the maximum determination coefficient values were found to be 0.995 and 0.999 for the Semnan and Shiraz meteorological stations, respectively. The minimum determination coefficient values were found to be 0.921 and 0.876 for the ANN and ANFIS models, respectively, both for the Bandar-e-Abbas station. In conclusion, the ANN technique can be successfully used to predict the long-term monthly temperatures at a site with no measurements, based on the temperature data and geographical variables of the neighbouring stations.
This study applied ANN and ANFIS techniques for modelling long-term air temperature values by using geographical information. Further investigations may be carried out with other techniques and data management scenarios to generalize the obtained results. In addition, the applicability of the techniques may be examined for other important climatologic variables (e.g. rainfall, snow, avalanches, etc.).