Modelling daily soil temperature at different depths via the classical and hybrid models

Soil temperature (ST) is one of the crucial variables of soil and it plays a fundamental role in different research scopes such as underground soil physical and agricultural applications. The study explores the modelling performance of a time series‐based model (i.e. bi‐linear, BL), and an artificial intelligence‐based approach including adaptive neuro‐fuzzy inference system (ANFIS), for modelling the daily ST of different soil depths (5, 10, 50 and 100 cm). The study also develops and proposes two diverse types of the hybrid models through coupling the ANFIS with the BL and wavelet analysis (W) to improve the accuracy of the ST modelling. Two stations in Iran (i.e. Isfahan and Urmia) were selected as the study locations. The results demonstrated that the ANFIS generally presented better results than the BL. Furthermore, the hybrid models (i.e. W‐ANFIS and ANFIS‐BL) gave superior performances than the classical ANFIS and BL for modelling the daily ST of the studied areas at various soil depths. In addition to the local evaluation of the ANFIS (i.e. modelling the ST at a specific depth by using the original ST data at that depth), an external analysis was also conducted. In doing so, the daily ST data at a 5 cm depth were modelled via the corresponding ST data at a 10 cm depth, and vice versa. The results denoted the applicability of the ST data at another depth for modelling the ST of each specific/target depth.


| INTRODUCTION
Soil temperature (ST) is one of the meteorological and climatological variables involved in land surface hydrological processes, land-atmosphere interactions (Hu and Feng, 2003;Bilgili et al., 2013), agricultural meteorology (Hillel, 1998;Mavi and Tupper, 2004) and different ecosystems (Jebamalar et al., 2012), and it is an important feature of plant growth, based upon plentiful agricultural research.
In addition, the ST affects the physical and chemical responses in soil (George, 2001). Moreover, it is a vital indicator of climate change and a crucial variable in numerical weather forecasting and climate predicting (Holmes et al., 2008;Qian et al., 2011). The ST has various spatial and temporal conducts at different depths of soil. Thermal flux inside the soil is affected by different parameters consisting of air temperature, solar radiation, wind speed, season and the physical properties of soil (Behmanesh and Mehdizadeh, 2017). Therefore, estimating the ST is moderately demanding, particularly close the surface of the ground where the ST fluctuations are relatively high than the deeper soil layers (Mihalakakou, 2002).
Measuring the ST at different depths of soil is usually performed by a variety of thermometers and sensors installed at various depths. The measurement error for the ST thermometers is 0.1 C (Mehdizadeh et al., 2020a). However, the ST sensors are expensive, and their application requires special expertise. Furthermore, continuous monitoring of this variable needs a couple of sensors at different depths. Therefore, it is desirable to develop an alternative approach for direct ST measurements.
For example, the ST can be estimated by artificial intelligence (AI)-based models, such as adaptive neuro-fuzzy inference systems (ANFIS), artificial neural networks (ANN), multivariate adaptive regression splines (MARS), M5 model tree (M5T), gene expression programming (GEP) and support vector machine (SVM) (Nahvi et al., 2016;Citakoglu, 2017;Sanikhani et al., 2018). Besides the AIbased models, time-series analysis (TSA) approaches could also be appropriate tools for the ST estimation (Mehdizadeh et al., 2020a). The linear autoregressive (AR), autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), fractionally autoregressive integrated moving average (FARIMA), as well as non-linear bi-linear (BL), autoregressive conditional heteroscedasticity (ARCH) and self-exciting threshold autoregressive (SETAR) models belong to the TSA-based techniques. Overall, the AI-based models have been applied more than the TSA approaches for the estimation of ST. Besides the applications of standalone AI and TSA-based models in climatological and soil sciences, remarkable progress has been made to implement hybrid models to improve the performances of the standalone models. In this context, the hybrid models created via coupling the wavelet analysis (W) and TSA paradigms with the AI-based models can be recommended as alternatives to the standalone AI models.
The ST at different depths at Adana meteorological station, Turkey, was predicted by Bilgili (2010) using linear regression (LR), non-linear regression (NLR) and ANN methods. The obtained results revealed that the ANN outperformed the LR and NLR models. Estimation of the daily ST at six different depths in an arid region of Iran, namely Isfahan, was conducted by Tabari et al. (2011) through the applications of ANN and multivariate linear regression (MLR) techniques. The results indicated that the ANN had a better estimation accuracy than the MLR model. Wu et al. (2013) modelled the monthly ST at a depth of 10 cm by applying the ANN in southwestern China. The proposed ANN models showed relative improvement over classical linear models. Kim and Singh (2014) modelled the daily ST at two different depths (i.e. 10 and 20 cm) using data-driven models, including the ANFIS and multilayer perceptron (MLP), at Springfield and Champaign stations in Illinois. They found out that the MLP illustrated superior performance compared with the ANFIS. They also reported that the ANFIS and MLP had better results than an MLP-based spatial distribution. Nahvi et al. (2016) estimated the daily ST at different depths of ground using an extreme-learning machine (ELM) and self-adaptive evolutionary extremelearning machine (SaE-ELM). Their findings showed that the performances of both the ELM and SaE-ELM models were good at all depths. However, the hybrid model showed a better modelling accuracy of the ST than the classical ELM. To model the monthly ST at different depths (i.e. 10, 50 and 100 cm), Kisi et al. (2017) used the ANFIS, ANN and genetic programming (GP) models. Two stations in Turkey at Adana and Mersin were chosen as a case study. Comparing the efficiencies of the models showed that the GP outperformed the other applied models. Behmanesh and Mehdizadeh (2017) estimated the daily ST at various depths using the GEP, ANN and MLR in the semiarid region of Sanandaj, western Iran. The results illustrated that all three methods were capable of estimating the ST well, and among these methods, the ANN had superior results. Samadianfard et al. (2018) used wavelet neural networks (WNN), ANN and GEP models for estimating the daily ST at Tabriz station, Iran. Their results indicated that the WNN had superior outcomes compared with the ANN and GEP. They also found that by increasing the depth of soil, the estimation accuracy and influence of meteorological parameters decreased quickly. The monthly ST at several depths from two stations, namely Mersin and Adana, was modelled by Sanikhani et al. (2018). They applied different machine-learning methods, including the ELM, ANN and M5T. Their results revealed that the best estimates of the ST were found by the ELM. Feng et al. (2019) estimated the half-hour ST from meteorological data using different machine-learning methods at four depths in the Loess Plateau, China, namely: back-propagation neural networks (BPNN), ELM, generalized regression neural networks (GRNN) and random forests (RF). The ELM was found to provide better performance than the other models. Novel hybrid models were proposed by Mehdizadeh et al. (2020a) for modelling the daily ST at different depths. The authors used a time-series-based model called FARIMA, two machine-learning-based models, including feed-forward back propagation neural networks (FFBPNN) and GEP, as well as the hybrid FFBPNN-FARIMA and GEP-FARIMA. It was concluded that hybrid models had more accurate results than the standalone models. Mehdizadeh et al. (2020b) proposed bio-inspired meta-heuristic optimization algorithms for estimating the daily ST of different depths by using an AI-based model, namely Elman neural network (ENN) along with its hybrid with the gravitational search algorithm and ant colony optimization (i.e. ENN-GSA and ENN-ACO) at two meteorological stations in Iran. Their findings demonstrated that the developed hybrid models presented superior performances compared with the conventional ENN.
Given the importance of the ST in agricultural and climatological studies, especially for soil science, estimating this important variable through a proper technique is an essential task. Therefore, the current study seeks to model the daily ST time series using the standalone and hybrid models. To this end, two stations in Iran, including Isfahan and Urmia, were considered as the study sites. The AI-based ANFIS and a TSA-based BL are used as the standalone models. Furthermore, the study aims to improve the modelling performance of the daily ST by implementing the hybrid models. In this context, the standalone ANFIS was hybridized with the W and BL. Besides the local evaluation of the ANFIS, an external evaluation of this model was also performed for 5 and 10 cm soil depths (i.e. surface layers) and 50 and 100 cm soil depths (i.e. deep layers), which is worthwhile when the ST data at a certain depth are not available. To the best of the authors' knowledge, evaluating the modelling efficiency of the standalone BL time-series model, as well as the hybrid ANFIS-BL and W-ANFIS, along with an external assessment of the AI-based ANFIS have not yet been addressed in previous studies for modelling the ST.

| Study area and data used description
To model the ST, two stations in Iran, namely Isfahan and Urmia, were considered as a case study. Isfahan station is situated at a latitude of 32 30 0 N and longitude of 51 42 0 E, while Urmia station is located at a latitude of 37 42 0 N and longitude of 45 06 0 E. The geographical positions of the study regions are illustrated in Figure 1. The required daily ST data at different soil depths (i.e. 5, 10, 50, and 100 cm) recorded during the periods 1995-2017 for Isfahan station and 1993-2017 for Urmia station were compiled from the Iran Meteorological Organization (IMO). Figures 2 and 3, respectively, depict the time-series graphs of the observed daily ST data at Isfahan and Urmia stations.
The daily ST data recorded at the selected stations were divided into two data sets: training and testing. In the training phase, the ST data recorded for the periods 1995-2011 and 1993-2011, respectively, were used for Isfahan and Urmia stations. The remaining data for 2012-2017 were used for the testing phase at both stations.
The statistical parameters of the daily ST data at the studied locations are presented in Table 1, where x min , x max , x mean , x sd and x cv denote the minimum, maximum, mean, standard deviation and co-efficient of variation of the ST time series, respectively. According to Table 1, the STs at different depths are greater at Isfahan station than at Urmia station. At both stations, the minimum STs were recorded at a 5 cm depth; by increasing the depth of the soil, the STs increase, with the maximum STs recorded at 100 cm. Additionally, the x sd 's indicated that by increasing the soil depth, the deviation from the mean ST decreases.

| Bi-linear (BL) model
The BL model is classified in the group of non-linear TSA-based models, which was initially proposed by Granger and Andersen (1978). This model is extended on the basis of the ARMA family models. The BL model is extracted from Taylor's second-order expansion and can be indicated as a BL (p, q, r, s) model (Fan and Yao, 2008). It can be mathematically formulated as follows: where Z t is a standardized time series; (p, q, r, s) are the positive integers indicating the BL model order; (φ, θ, β) are the model co-efficients; and ϵ t is a standardized stochastic series. The multiplication of two variables including Z t and ϵ t (i.e. P r has also leads to the BL model becoming a non-linear form and it can be used to model non-linear phenomena (Ainkaran, 2004). Fitting of the non-linear BL used in the study consists of two steps: (1) Determination of model order. In the current study, the corrected Akaike's information criterion (AICC) was applied to determine the most proper BL model order. The AICC was computed for the various BL orders through the following equation, and each order with the lowest AICC was selected as the model order: where n is the number of data; p is the model order in the autocorrelated section; q is the model order in moving average section, andσ 2 is the variance of errors.
(2) Computation of model co-efficients. The standard method of maximum likelihood estimation function is used to estimate the model co-efficients.

| Adaptive neuro-fuzzy inference system (ANFIS)
The ANFIS as a powerful tool for modelling purposes was firstly introduced by Jang (1993). It is capable of analysing non-linear phenomena and surveying the relationships between inputs and outputs in multi-variable systems. The fuzzy part of the ANFIS establishes a link between the input and output variables, which is called the membership function (MF). In general, there are two types of fuzzy logic systems including: (1) the Mamdani inference system (MIS); and (2) the Sugeno inference system (SIS). The main difference between them is in the F I G U R E 1 Locations of the considered stations in Iran MF, so that the output MF in the Mamdani is as nonlinear form, but it is linear (constant) in the Sugeno type.
To classify the data in the ANFIS, two paradigms consisting of grid partitioning (GP) and subtractive clustering (SC) can be applied. In an ANFIS-GP technique, the space of input data is classified in a number of fuzzy local areas, while each data point is considered as a potential cluster centre in an ANFIS-SC.
The ANFIS-GP was used in the current study when modelling the daily ST time series. Note that the modelling performance of this model is highly affected by the type and number of MFs. The optimal type and number of MFs must therefore be achieved through a trial-anderror procedure.

| Wavelet analysis (W)
The basics of signal analysis was introduced by Fourier in the 17 th century. In fact, the Fourier's theory was the basis for the invention of W. Grossmann, and Morlet (1984) initially introduced wavelet theory. It was developed by integrating engineering and mathematical theories and has attracted the attention of many scholars due to the simplicity of its application. As is apparent, the word "wavelet" means a small wave. This small wave must include a limited number of oscillations, a rapid return to zero in both positive and negative directions of its domain, and a mean of zero. These attributes are called admissibility conditions.
One benefit of a wavelet transform is its capability of filtering, in which the data are divided into two categories including approximations and details. The approximations include large-scale components (low frequency) and details consist of small-scale components (high frequency). The decomposition process of the waves involves one or more steps (Miner, 1998). In multi-step decomposition, the decomposition procedure can be iterated by continuous decomposition of the approximations. A wave is therefore divided into subsets, which is called wavelet decomposition tree (Polikar 1996).
The wavelet function contains two important characteristics of being oscillatory and short term. φ(x) is a wavelet function if its Fourier transform satisfies: F I G U R E 2 Time series of the observed daily soil temperature (ST) data at different depths during the studied period for Isfahan station This is known as the admissibility condition for φ(x) wavelet. φ(x) is the mother wavelet function in which the functions used in the analysis change in size and location via two mathematical operations of transmission and scale within the signal to be investigated: Finally, the wavelet co-efficients can be calculated at any point of the signal (b) and for any scale (a) (Mallat, 1998):  The T is calculated for the different values of a and b, and it shows the most compliance when T has the highest positive value. There is no compliance when T = 0. There are several types of the mother wavelets, which Daubechies4 (i.e. db4) is used in the current study to decompose the ST data.
One of the important steps in applying wavelet functions is to select an appropriate decomposition level for the analysis of the intended signal. The following equation was used to identify the allowed decomposition levels (Wang and Ding, 2003;Mehdizadeh et al., 2020c): where L is the number of decomposition levels; N is the total number of data series; and Int is an integer operator. Hybrid models were developed after selecting the appropriate decomposition level.

| Models development
The AI-based classical ANFIS and the time-series-based BL models were used to achieve the aims of the present study (i.e. modelling the daily ST data at different soil depths). Moreover, to improve the modelling accuracy of the ST time series, the ANFIS was hybridized with the BL and W techniques to create the hybrid W-ANFIS and ANFIS-BL models. The development steps of single and hybrid models are presented below.

| Local evaluation
Single BL model Before developing each model, initial calculations were needed. All the applied data must be standardized to fit the BL model to the target data. Therefore, the daily ST data recorded at both the studied stations were standardized as follows: where ST s and ST denote the standardized and original observed soil temperature, respectively; and ST and σ ST represent the average and standard deviation of the observed ST data recorded, respectively. The standardized ST data sets were also employed when modelling the ST through the single ANFIS.
After standardizing the original ST data at all depths (i.e. 5, 10, 50 and 100 cm), the BL models with various orders were fitted to the standardized ST data. The best BL models at different depths were then selected based on the least AICC.

Single ANFIS model
Besides a time-series-based BL, the local performance of an AI-based ANFIS was also investigated. For this purpose, the daily STs for the present day (ST t ) at different soil depths were modelled using the previous day's ST (ST t-1 ) (i.e. the one day antecedent ST data) as inputs. The local development of the ANFIS is mathematically described as follows: Hybrid models Five given steps were followed to implement the hybrid ANFIS-BL and W-ANFIS models. First, the single ANFIS was employed to model the deterministic constituent of the daily ST data recorded at both stations. Second, the most suitable BL models were fitted to the standardized ST data at different depths and then selected for each station. Third, the stochastic constituent (S t ) gained from the BL model was combined with the deterministic constituent (D t ) of the ST data, which was obtained from the single ANFIS. Fourth, the following equation was applied to estimate the ST via the hybrid ANFIS-BL models (i.e. Y t ) as: Finally, the hybrid W-ANFIS models were developed at different depths by coupling the W on the ANFIS. To this end, db4 was used as the mother wavelet. Additionally, different decomposition levels were considered and tested depending on the total numbers of the ST data at the studied locations.
It is necessary to mention that whole the modelled data via the single and hybrid models are in the standardized form. The de-standardization process was therefore used to transform the modelled STs to real data with a unit of C.

| External evaluation
Besides the local evaluation, the performance of the ANFIS was also evaluated under external analysis. The time-series-based BL model was not considered in this analysis because its structure is not conducive to external evaluation. To develop the AI-based ANFIS, the STs recorded at 5 and 50 cm depths were modelled using the STs at 10 and 100 cm as inputs, respectively, and vice versa. Equation 10 represents the scenarios considered for the external evaluation of the ANFIS applied:

| Performance evaluation metrics
Three statistical metrics, including the root mean square error (RMSE), mean absolute error (MAE) and Kling-Gupta efficiency (KGE), were employed to evaluate the performance of the classical BL and ANFIS, as well as the hybrid W-ANFIS and ANFIS-BL models in modelling the daily ST time series of various depths at both the studied stations: where ST o, i , ST m, i , ST o and N indicate the ith observed daily soil temperature, the ith modelled daily soil temperature, the mean of the observed soil temperature data, and the total number of observations for each of the training and testing data sets, respectively. Besides, CC,α and β in the KGE index illustrate the correlation co-efficient, the standard deviation ratio, and the average ratio of the observed and modelled ST data, respectively. A model with the lowest RMSE and MAE and the highest KGE is selected and proposed as an appropriate model when modelling the daily ST.

| Performance evaluation of the classical and hybrid models under local analysis
First, the optimal time-series-based BL models were fitted to the ST data at different soil depths in their standardized forms. In doing so, different BL models containing the various orders were examined, and the optimal BL models were chosen focusing on the least AICC. The error statistics including the RMSE, MAE and KGE computed for the best BL models at Isfahan and Urmia stations for both the training and testing phases are tabulated in Table 2. As can clearly be seen, the BL models fitted to the daily ST data provide the lowest performance when modelling the daily ST of the surface layers (5 and 10 cm), particularly for a 5 cm depth. Generally, the accuracy of the BL models developed at the studied locations was improved by increasing the soil depth; the BL models yielded the high-performance modelling of the daily ST for the deep layers, specifically at a 100 cm depth. For example, the BL model implemented at the depth of 100 cm had an RMSE = 0.346 C, MAE = 0.196 C and KGE = 0.998 for the training period, and an RMSE = 0.353 C, MAE = 0.200 C and KGE = 0.998 for the testing stage at

Station
Depth ( Tables 3 and 4, respectively. Note that the best ANFIS models at various soil depths were obtained via a trialand-error process. In this regard, different membership functions and also different numbers for the membership functions were tested and the optimal ANFIS models were then selected by considering the lowest error rates for the classical ANFIS. Based on the results of Tables 3  and 4, an ANFIS approach demonstrated the potential for the daily ST modelling at different soil depths through the application of the antecedent ST data as the inputs. Similar to the results concluded for the BL models, the ANFIS illustrated higher estimation accuracy at deeper layers (i.e. 50 and 100 cm) compared with the surface soil layers (i.e. 5 and 10 cm).
Afterwards, two various types of the hybrid models were developed via coupling the ANFIS with the W and time-series-based BL techniques in order to improve the modelling accuracy of the daily ST. Consequently, the hybrid W-ANFIS and ANFIS-BL models were developed and proposed. Note that the hybrid W-ANFIS model is named as W(Á)-ANFIS so that the number in parentheses indicates the decomposition level used in the modelling procedure. The numbers of decomposition levels allowed for developing the hybrid W-ANFIS at both Isfahan and Urmia stations are three since L = Int [log 8395] = 3 at Isfahan; and Int[log 9125] = 3 at Urmia. The statistical indicators of the RMSE, MAE, and KGE obtained for the hybrid W-ANFIS and ANFIS-BL models for the Isfahan and Urmia stations are listed in Tables 3 T A B L E 3 Error statistics of the classical adaptive neuro-fuzzy inference system (ANFIS), and the hybrid wavelet analysis-adaptive neuro-fuzzy inference system (W-ANFIS) and ANFIS-bi-linear (ANFIS-BL) models for training and testing periods under local analysis at Isfahan station and 4, respectively. Regarding the hybrid W-ANFIS models developed at different soil depths, it is apparent that coupling the W on the ANFIS leads to improve the daily ST estimation in comparison to the classical ANFIS. In addition, the hybrid W-ANFIS models developed under two and three decomposition levels (i.e. W(2)-ANFIS and W(3)-ANFIS) generally represent better results than the hybrid W(1)-ANFIS one. Superior performance of the hybrid W-ANFIS compared with the classical ANFIS can be verified considering the point that, the W provides appropriately the required information to the ANFIS by decomposing the inputs. Moreover, it can be clearly seen that hybridizing the BL and the AI-based ANFIS illustrated better estimates of the daily ST at various soil depths compared with the classical BL and ANFIS models. As mentioned, the hybrid ANFIS-BL model employs both the stochastic and deterministic terms when modelling the daily ST; however, the classical ANFIS and BL are capable of modelling the deterministic and stochastic terms, respectively. This feature is the most important reason for the higher accuracy of the hybrid W-ANFIS than the classical ANFIS.
Similarly, both the hybrid models (i.e. W-ANFIS and ANFIS-BL) also showed higher modelling accuracy at the deep layers than the surface layers. Nevertheless, the best-performing hybrid model of W-ANFIS for both the studied sites is generally observed at the 100 cm depth. On the other hand, the hybrid ANFIS-BL models developed at Isfahan and Urmia stations demonstrate the best performance at the 50 and 100 cm depths, respectively. Likewise, Tabari et al. (2015) reported better performance of an AI-based model including the MLP using the antecedent ST data for modelling the ST at deeper layers than the surface layers. Furthermore, the results of the current study verify the outcomes of previous studies such as Mehdizadeh (2018b), Mehdizadeh and Kozekalani Sales (2018), Mehdizadeh et al. (2017cMehdizadeh et al. ( , 2018bMehdizadeh et al. ( , 2019, and Fathian et al. (2019). The authors developed different types of the hybrid models through hybridization of the different time-series-and AI-based models for improving the modelling efficiency of classical models in modelling hydrological variables.
Here the performances of all the models developed, consisting of the classical BL and ANFIS, as well as the T A B L E 4 Error statistics of the classical adaptive neuro-fuzzy inference system (ANFIS), and the hybrid wavelet analysis-adaptive neuro-fuzzy inference system (W-ANFIS) and ANFIS-bi-linear (ANFIS-BL) models at Urmia station for training and testing periods under local analysis hybrid W-ANFIS and ANFIS-BL, were investigated and compared with each other. A performance assessment of the classical models indicated that the AI-based ANFIS models implemented at different soil depths performed better than the BL models at the studied locations, specifically for the surface layers. As already mentioned, the developed hybrid W-ANFIS and ANFIS-BL models, however, outperformed all the classical ANFIS and BL models in modelling the daily ST data at different soil depths of the studied stations. Among the hybrid models, the ANFIS-BL models generally yielded better modelling performances compared with the hybrid W-ANFIS ones. Figure 4 depicts the radar plots for the RMSE criterion during the test phase of all the classical BL and ANFIS, as well as the hybrid W-ANFIS and ANFIS-BL models. Note that the optimal W-ANFIS models for different depths at the studied stations were selected. For both Isfahan and Urmia stations, the hybrid models provided the lowest RMSE than the single models. This result is clearly visible for surface layers than for deep layers. The hybrid ANFIS-BL is also the best-performing model at different depths. Conversely, the classical BL illustrated the highest RMSE, verifying the poor performance of this model when modelling the daily ST at various soil depths.

| Performance evaluation of the classical ANFIS models developed under external analysis
As already mentioned, the classical ANFIS was developed under two evaluation types: local and external. The statistical results obtained for the external analysis of the F I G U R E 4 Radar plots of the classical and hybrid models using the root mean square errors (RMSEs) during the testing period T A B L E 5 Error statistics of the classical adaptive neuro-fuzzy inference system (ANFIS) for training and testing periods under external analysis at Isfahan and Urmia stations ANFIS at Isfahan and Urmia stations are tabulated in Table 5. The modelling results demonstrated that the ST data at a 5 cm soil depth can be used as the inputs for modelling the ST at a target soil depth (i.e. 10 cm), and vice versa. This result was also concluded for the deep soil layers (i.e. 50 and 100 cm). For both studied sites, the external performances of the ANFIS models developed at 5 and 10 cm depths were much better than the corresponding ANFIS performances under a local analysis, because there was a high correlation between the ST data of surface layers for any particular day. For deeper soil layers, the external performance of the ANFIS was worse than the results achieved for the local analysis. However, it can be seen that the accuracy of the ANFIS under external analysis was reasonable for the ST modelling at the depth of 50 cm using the corresponding ST data at 100 cm soil depth, and vice versa. The greatest improvements in the external performance of the ANFIS, compared with the local analysis, were observed for both the studied stations at the 5 cm surface layer. For an example, the local evaluation of the ANFIS for the daily ST modelling of a 5 cm depth at Isfahan station yielded an RMSE = 1.457 C, MAE = 1.044 C and KGE = 0.989 for the training period, and an RMSE = 1.497 C, MAE = 1.069 C and KGE = 0.987 for the testing period (Table 3). Alternatively, using the external analysis improved these statistical metrics to RMSE = 0.871 C, MAE = 0.630 C and KGE = 0.991 for the training period, and to RMSE = 1.028 C, MAE = 0.781 C and KGE = 0.959 for the testing period (Table 5) for modelling the daily ST of a 5 cm depth at Isfahan station. As a general conclusion, it can be said that the ST data of another depth can be used for the ST modelling of a specific/target depth when the ST data are not available at that soil depth.

| CONCLUSIONS
The daily time series of soil temperature (ST) at different depths (i.e. 5, 10, 50 and 100 cm), which were recorded at two stations in Iran at Isfahan and Urmia, were modelled through the classical adaptive neuro-fuzzy inference system (ANFIS) and bi-linear (BL) models, as well as hybrid models developed by coupling the ANFIS with the wavelet (W) and BL (i.e. W-ANFIS and ANFIS-BL). All the models were developed under a local analysis. However, the external performance of the ANFIS was also investigated for the ST estimation at different depths. Three error criteria including the root mean square error (RMSE), mean absolute error (MAE) and Kling-Gupta efficiency (KGE) analysed the local and external performances of the developed models. Notably, assessing the performance of a standalone form of the BL and its hybridized form with an artificial intelligence (AI)-based ANFIS, as well as an external analysis of the ANFIS for estimating the ST data at different soil depths, had not been done previously.
Generally, the local performance of the ANFIS was better than the BL when modelling the daily ST at different depths. The modelling accuracy of the classical ANFIS was then improved by hybridizing the ANFIS with the W and BL techniques (i.e. the hybrid W-ANFIS and ANFIS-BL models). It was concluded that the hybrid models outperformed the classical ones at various soil depths. Based on the results obtained, the local performance of the ST modelling via the classical BL and ANFIS, as well as the hybrid W-ANFIS and ANFIS-BL, was better at deep layers (50 and 100 cm) than that of the surface layers (5 and 10 cm). Besides the local analysis of an AI-based ANFIS, an external evaluation of this model was also performed. To that end, the daily ST data at depths of 5 and 50 cm were estimated by the corresponded ST data at 10 and 100 cm depths, respectively, and vice versa. The results illustrate that the local performance of the ANFIS was better than the external performance for deep depths (50 and 100 cm); however, the external analysis presented superior modelling results compared with the local analysis for the surface depths (5 and 10 cm).
The literature review verifies that the AI-based models have been increasingly used to model the ST data in comparison with the time-series-based approaches. Therefore, future studies could further explore the application of time-series-based models for modelling the ST. Specifically, different types of these techniques, including autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) and fractionally ARIMA (FARIMA), can be recommended. On the other hand, as concluded, the hybrid W-ANFIS and ANFIS-BL models led to the better estimates of the ST compared with the classical BL and ANFIS ones. Future research could therefore explore the development of other hybrid models by hybridizing the various types of AI-based approaches with the different time-series models and W analysis for the accurate modelling of the ST.