## 1. Introduction

Long-term average values of solar radiation (Rs) are required in many solar energy applications. However, only a limited number of meteorological stations record Rs, owing to the cost and difficulty of maintaining and calibrating the measurement equipment (Hunt *et al.*, 1998). When measured records of Rs are not available, a common practice is to estimate them from other observed meteorological variables using different approaches. For practical applications, the empirical approach is generally preferred over other methods (Badescu, 1997; Liu and Scott, 2001; Li *et al.*, 2010). However, many factors influence Rs, such as astronomical and geographical factors, cloud cover, albedo of the underlying surface, atmospheric turbidity, and absorption and scattering, so the development or validation of empirical models requires prior knowledge of the relationships between these variables and Rs.

In the last decade, many studies have reported the utility and efficiency of artificial neural networks (ANNs) in estimating Rs in distinct places around the world (e.g. Al-Alawi and Al-Hinai, 1998; Toǧrul and Onat, 1999; Mohandes *et al.*, 1998; Sözen, 2004; Lam *et al.*, 2008; Benghanem *et al.*, 2009; Rehman and Mohandes, 2008; Tymvios *et al.*, 2005). Recently, the support vector machine (SVM), a machine-learning algorithm originally developed by Vapnik (1998), has been widely applied in many fields traditionally dominated by ANNs, such as pattern recognition, signal processing, and time series analysis (e.g. Asefa *et al.*, 2006; Osowski and Garanty, 2007; Liu *et al.*, 2008; Chen *et al.*, 2011). Studies have demonstrated that SVM is superior to traditional ANNs in dealing with classification and regression problems owing to its good generalisation ability (Vapnik, 1998; Schölkopf and Smola, 2002). Rooted in statistical learning theory and the structural risk minimisation (SRM) principle, SVM uses a hypothesis space of linear functions in a higher-dimensional feature space and is less vulnerable to overfitting (Vapnik, 1998). Unlike most traditional neural networks, which are based on empirical risk minimisation, SVM is guaranteed to find a global minimum by striking the right balance between the quality of the approximation and the complexity of the approximating function (Vapnik, 1998). Accordingly, the learning problem in SVM is formulated as a quadratic programming (QP) problem with linear constraints, using kernel functions. An excellent introduction to SVM can be found in Vapnik (1998).
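To make the QP formulation mentioned above concrete, the standard ε-insensitive support vector regression problem is sketched here. This is the textbook form and is not necessarily the exact variant used in this study. The symbols are conventional ones introduced for illustration: training pairs $(x_i, y_i)$, $i = 1, \dots, n$, feature map $\varphi$, weight vector $w$, bias $b$, slack variables $\xi_i, \xi_i^{*}$, tube width $\varepsilon$, and regularisation constant $C$.

$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{subject to} \quad & y_i - w^{\top}\varphi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{\top}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
$$

The term $\frac{1}{2}\lVert w \rVert^{2}$ penalises the complexity of the approximating function, while the slack terms penalise deviations larger than $\varepsilon$, so $C$ sets the balance between the two. Because the objective is a convex quadratic and the constraints are linear, any local minimum is also a global one. In the dual form the estimate depends on the data only through a kernel function $K(x_i, x) = \varphi(x_i)^{\top}\varphi(x)$:

$$
f(x) = \sum_{i=1}^{n} \left( \alpha_i - \alpha_i^{*} \right) K(x_i, x) + b ,
$$

where $\alpha_i, \alpha_i^{*} \in [0, C]$ are the Lagrange multipliers of the two inequality constraints.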

Given the advantages and rising popularity of SVM, this study explores its potential for estimating long-term monthly Rs. Air temperatures are used as the input parameters for SVM because these records are readily available at most meteorological stations. The objectives of the current study are: (1) to develop long-term monthly Rs models using SVM and air temperatures; (2) to compare the SVM models with empirical temperature-based equations; and (3) to assess the predictive ability of the models on a regional scale.