A hybrid prediction model for photovoltaic power generation based on information entropy

Photovoltaic power is affected by various random and coupled meteorological factors, and its changing trend reflects the non-linear effects of these factors. Based on quantitative analysis results, a statistical prediction model is proposed to accurately predict the power, which is of great significance to the safe and efficient use of solar energy. In this study, the authors first use grey relation analysis to select four main meteorological factors affecting photovoltaic power. Further, they combine grey relation analysis with information entropy and apply grey relation entropy to similar day analysis. On this basis, they use grey relation analysis to optimise the extreme learning machine model, establishing the grey relation analysis-extreme learning machine model, and use similar day analysis to optimise the firefly algorithm, establishing the similar day analysis-firefly algorithm model. By combining the two sub-models with information entropy, a hybrid prediction model for photovoltaic power generation based on information entropy is proposed. The experimental results show that in various weather conditions, the values of mean absolute percentage error (MAPE), root mean square error (RMSE) and standard deviation of error (SDE) are 2.8425%, 2.5675 and 2.2642, respectively. Compared with BP, SVM, GPR, ESE, WPS and GAS, the MAPE, RMSE and SDE values decrease by 66–75%, 71– and 65–88%, respectively. The results show that the hybrid model proposed in this work has better prediction performance.

ment of a photovoltaic power generation prediction model with high accuracy and stability has become a major issue that has to be broken through.
In terms of characteristics and performance, photovoltaic power prediction methods can be classified into physical methods and statistical methods. Physical methods are mainly aimed at solar irradiance prediction and cloud detection. They consist of three sub-models: numerical weather prediction (NWP), sky imagery and satellite-imaging models [2]. NWP models are essentially based on the numerical integration of coupled differential equations describing the radiation transport mechanisms [3]. These models are widely used to predict the atmospheric state up to 15 days ahead. Meanwhile, they are biased by less than 50 W/m² for measured clear conditions [4]. In [5], a short-term forecasting system for hourly electrical energy production is proposed. The system includes a global NWP module corresponding to the Global Forecasting System (GFS) model, a meso-scale NWP module corresponding to the Meso-scale Model 5 (MM5) and an energy forecasting model based on artificial intelligence. The forecasting horizon ranges from 1 to 39 h, covering all of the following day. With forecasting horizons from 16 to 39 h, the root mean square error (RMSE) represents 11.79% of the rated power of the PV plant. A sky imager is a digital camera used for cloud detection, measurement of cloud height above ground, and cloud motion determination. Researchers can use a sky imager to obtain sky images and recognize clouds. On the basis of the established cloud motion vectors, they can predict cloud cover, irradiance and photovoltaic power. In [6], a solar irradiance prediction methodology for all sky conditions using the TSI-880 model is presented. The results indicate the average normalized root mean square error (nRMSE) value of beam irradiance, diffuse irradiance and global irradiance is 25.44% in cloudless skies, 11.6% in partially cloudy skies and 11.17% in overcast skies. Satellite imaging is another way to obtain sky image data.
The satellite-based sensors can take visible and infrared images to determine cloud patterns. In [7], a satellite-derived, ground data model is developed, which improves the intra-day solar prediction and neural network (NN) model. The results indicate the model increases the forecast time scales from 15.47% to 22.17% for the Co-Pozo Izquierdo station and from 25.15% to 34.09% for the C1-Las Palmas station. The disadvantage of physical methods is the need for accurate topographic maps, coordinate maps, photovoltaic array power curves and other relevant data. At the same time, the local anti-interference ability is poor and the overall robustness is weak [8]. The advantage of physical methods is that they do not require a large amount of historical data, save calculation time, and enhance the prediction ability when the weather conditions change.
Statistical prediction is a direct way to forecast the photovoltaic power by considering some key influencing factors. Statistical methods usually analyse historical data to obtain the prediction trend. Further, they take the changing trend as a basis for photovoltaic power generation prediction. The commonly used statistical prediction methods are the multi-factor linear regression method [9], the time series forecasting method [10] and the Markov chain method [11]. In [9], a non-linear regression model known as multivariate adaptive regression splines (MARS) is proposed to forecast the solar power of a grid-connected 2.1 kW PV system. The comparative experiments show the MARS model can provide more reliable forecast performance. In [12], the hourly photovoltaic power generation in Chile is estimated by using multiple linear regression models. The results show the RMSE value and mean bias error (MBE) value of the proposed model are reduced by 11% and 10% compared with the comparison model. Statistical prediction also includes artificial intelligence prediction. The commonly used artificial intelligence prediction methods mainly include the back propagation (BP) neural network, the support vector machine (SVM) and the artificial neural network [13]. In [14], a photovoltaic prediction model based on an artificial neural network is proposed, which predicts the photovoltaic power by training a self-organizing map (SOM) on solar radiation intensity, relative air humidity and temperature. In [15], a BP neural network model is proposed to forecast the output power of a PV system located in Ashland 24 hours ahead. The best performance is obtained with a BP neural network structure of 28-20-11, while the MAPE is within 8%. Different from physical methods, statistical methods have no requirements for the geographical location and construction parameters of photovoltaic power stations.
Therefore, compared with physical methods, they have the advantages of simple modelling and strong universality. However, a large amount of historical data and meteorological data are needed for model training to ensure the prediction accuracy. Besides, the accuracy of statistical methods depends on the data quality to a large extent. Although data screening and false data elimination can improve the prediction accuracy, they extend the calculation time, so the statistical methods are not suitable for ultra-short-term photovoltaic prediction.
Since photovoltaic power generation is affected by a variety of dynamically changing environmental factors, all influencing factors should be scientifically analysed according to actual working conditions. At the same time, a single prediction model cannot meet the requirements, so it is particularly important to develop an accurate hybrid model based on different single models. In [16], a hybrid model to forecast the power produced by small-scale grid-connected photovoltaic (GCPV) plants is introduced, which combines the seasonal auto-regressive integrated moving average method (SARIMA) and SVM. The experimental results indicate that the developed hybrid model performs better than both SARIMA and SVM. In [17], a hybrid short-term prediction model combining improved K-means clustering, GRA and the Elman neural network is proposed. The results show that the average daily RMSE, MAPE and R² of the model are 4.3210 kW, 2.8366% and 0.9953, respectively. In [18], a short-term combination forecasting model for PV power based on similar days and cross entropy theory is proposed. Compared with combination models based on the sum of squared errors and correlation coefficient, the model has lower prediction error and better prediction performance.
There is uncertainty in the results of photovoltaic power prediction models, and the degree of distortion is expressed by their internal information entropy. The higher the information entropy, the lower the credibility. Therefore, to reduce the information entropy of the system and improve the prediction accuracy, this work uses grey relation analysis (GRA) to optimise the extreme learning machine (ELM) model to establish GRA-ELM, and uses similar day analysis (SDA) to optimise the firefly algorithm (FA) to establish SDA-FA. On this basis, GRA-ELM and SDA-FA are integrated to establish a hybrid prediction model based on information entropy. The superiority of the model is verified by comparing it with existing single prediction models and hybrid prediction models in accuracy and stability. The innovations and contributions of this work are as follows:
1. GRA is applied to select four main influencing factors from seven meteorological indicators as the input of ELM, so as to establish the GRA-ELM model;
2. SDA is used to select similar days from historical days by introducing grey relation entropy; further, the photovoltaic power of similar days is taken as the input of FA to establish the SDA-FA model;
3. A hybrid prediction model for photovoltaic power generation based on information entropy is proposed, its prediction accuracy and stability are compared, and its running speed and convergence effect are analysed;
4. The proposed hybrid model is compared with existing single prediction models and hybrid prediction models in terms of MAPE, RMSE and SDE to verify its superior prediction performance [19]. The single prediction models include BP, SVM and Gaussian process regression (GPR), while the hybrid prediction models include EMD-SCA-ELM (ESE) [20], wavelet-PSO-SVM (WPS) [21] and the genetic algorithm-based support vector machine (GAS).
The rest of the work is organised as follows: Sections 2 and 3 introduce the construction of the GRA-ELM model and the SDA-FA model, respectively. A hybrid prediction model based on information entropy is proposed in Section 4. Section 5 tests the hybrid prediction model and makes a comparative analysis. Section 6 summarizes the whole paper and puts forward the research objectives of the next stage.

Meteorological factors sorting based on GRA
As the photovoltaic power generation prediction is a time-delay, non-linear, multi-parameter process, affected by weather temperature, weather relative humidity, wind speed and other factors, it is difficult to establish an accurate mathematical model. Therefore, it is necessary to use GRA to analyse and rank multiple meteorological factors according to the historical data, thus establishing the basis for the subsequent photovoltaic prediction models.
In GRA, the major and minor factors leading to the development of the system are determined by analysing the relationship between factors. For the photovoltaic prediction, this work takes the photovoltaic power as the reference data column, denoted by $X_0$. The recorded data come from the DKASC [22]. The research period of the recorded data is 08:00-17:00. The output power and meteorological indicators are measured every 5 min. Since sensor failure causes errors in the recorded data, pre-processing must be done to exclude all error values in the original data and ensure the reliability of the analysis results. To describe the characteristics of the natural environment in more detail, this work selects seven indicators affecting the photovoltaic power generation as the influence factor array (expressed by $X_1$, $X_2$, $X_3$, $X_4$, $X_5$, $X_6$ and $X_7$, respectively). The seven indicators are explained in Table 1. With the continuous change of weather conditions, the coupling relationship between the photovoltaic power and meteorological factors also changes. Therefore, the sorting results of meteorological factors need to be refreshed dynamically every 30 min. In this work, to highlight the effectiveness of GRA, we select a rainy day (09.30.2017 14:00-14:25, Rainy) with strong randomness and unpredictability as the training sample to describe the specific process of GRA. The values of the seven indicators are shown in Table 2. It is difficult to compare the meteorological factors directly and get accurate results because they have different dimensions. Therefore, to ensure the unity in quantity, the above data must be made dimensionless according to Equation (1), with i = 0, 1, …, m and j = 1, 2, …, k, where m represents the number of meteorological factors and takes 7, and k takes 6. The dimensionless matrix is given in Table 3.
After the dimensionless reference data column $X_0^*$ and the dimensionless meteorological factor data columns $X_i^*$ are determined, the correlation coefficient between meteorological factors and photovoltaic power is defined as

$$\xi_i(j) = \frac{\min_i \min_j \left| X_0^*(j) - X_i^*(j) \right| + \rho \max_i \max_j \left| X_0^*(j) - X_i^*(j) \right|}{\left| X_0^*(j) - X_i^*(j) \right| + \rho \max_i \max_j \left| X_0^*(j) - X_i^*(j) \right|}, \quad (2)$$

where $\rho$ is the resolution coefficient and takes a value in [0, 1]. The correlation coefficient matrix is shown in Table 4. Finally, the meteorological factors are analysed according to Table 4, and the average value of the correlation coefficients is calculated and defined as the correlation degree

$$r_i = \frac{1}{k} \sum_{j=1}^{k} \xi_i(j), \quad (3)$$

where $r_i \in (0, 1)$, which indicates that no meteorological factor is strictly independent of photovoltaic power, nor is any of them the only determining factor. If $l, j \in \{1, 2, \ldots, m\}$ and $r_l \geq r_j$, the correlation of $X_l$ is stronger than that of $X_j$. Similarly, 09.17.2017 (10:30-10:55, Sunny) and 10.20.2017 (11:00-11:25, Cloudy) are selected as training samples. The correlation degrees of all meteorological factors are summarized in Table 5. The correlation degree between the system reference column and each meteorological factor column is sorted in descending order. The results are as follows: the four meteorological factors mainly affecting photovoltaic power on sunny days are weather temperature, weather relative humidity, wind direction and global horizontal radiation; on cloudy days, they are global horizontal radiation, weather temperature, wind speed and wind direction; on rainy days, they are global horizontal radiation, diffuse horizontal radiation, wind direction and weather temperature.
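The correlation coefficient and correlation degree described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `grey_relational_degrees`, the default resolution coefficient of 0.5 and the mean-based normalisation are assumptions, and the sample readings are hypothetical rather than taken from Table 2.

```python
from statistics import mean

def grey_relational_degrees(reference, factors, rho=0.5):
    """Return one grey correlation degree per factor series.

    reference: list of k power readings (the reference column X0).
    factors:   list of m series, each of length k (X1..Xm).
    rho:       resolution coefficient in [0, 1] (0.5 is an assumed default).
    """
    # Assumption: dividing by the series mean makes each column dimensionless.
    def normalise(series):
        mu = mean(series)
        return [v / mu for v in series]

    x0 = normalise(reference)
    xs = [normalise(f) for f in factors]

    # Absolute differences |x0*(j) - xi*(j)| for every factor i and time j.
    deltas = [[abs(a - b) for a, b in zip(x0, xi)] for xi in xs]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)

    degrees = []
    for row in deltas:
        # Correlation coefficient at each time step, then its average.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(mean(coeffs))
    return degrees

# Hypothetical readings for illustration only (six 5-min samples).
power = [10.0, 12.0, 15.0, 14.0, 11.0, 9.0]
radiation = [100.0, 121.0, 149.0, 141.0, 109.0, 92.0]  # follows power closely
wind = [3.0, 1.0, 4.0, 1.0, 5.0, 2.0]                  # unrelated to power
degrees = grey_relational_degrees(power, [radiation, wind])
```

With these made-up inputs the radiation series receives a higher correlation degree than the wind series, which is the ranking behaviour the sorting step relies on. The sketch does not guard against the degenerate case where every difference is zero.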
When the weather type of the forecast day cannot be judged from the weather forecast, this work introduces $p_1$, $p_2$ and $p_3$ as the preference coefficients of sunny days, cloudy days and rainy days, respectively, with $\sum_{i=1}^{3} p_i = 1$, in order to reasonably evaluate the impact of each meteorological factor on photovoltaic power. On this basis, the comprehensive correlation degree of meteorological factor j is defined as the preference-weighted sum

$$\sum_{i=1}^{3} p_i X_{ij}, \quad (4)$$

where j = 1, 2, …, 7 and $X_{ij}$ represents the correlation degree of meteorological factor j in weather condition i.
According to the recorded data from 04.01.2016 to 04.01.2018, we found that in 730 samples, 399 samples are sunny days, accounting for 54.66%; 171 samples are cloudy days, accounting for 23.42%; 160 samples are rainy days, accounting for 21.92%. Therefore, p 1 , p 2 and p 3 can be set to 0.5466, 0.2342 and 0.2192, respectively.
Calculate the comprehensive correlation degree between each meteorological factor and photovoltaic power as shown in Table 5. Sort them in descending order (X 1 > X 3 > X 2 > X 4 > X 7 > X 6 > X 5 ). We can conclude that according to the recorded data from 04.01.2016 to 04.01.2018, weather temperature, global horizontal radiation, weather relative humidity and diffuse horizontal radiation can be selected as the four main meteorological factors.
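The preference coefficients follow directly from the day counts reported above, and the comprehensive correlation degree is their weighted sum. The short sketch below reproduces that arithmetic; the helper name `comprehensive_degree` and the per-weather correlation degrees in `example` are hypothetical and are not values from Table 5.

```python
# Day counts reported for 04.01.2016 to 04.01.2018.
counts = {"sunny": 399, "cloudy": 171, "rainy": 160}
total = sum(counts.values())

# Preference coefficients p1, p2, p3: share of each weather type.
prefs = {weather: n / total for weather, n in counts.items()}

def comprehensive_degree(per_weather, prefs):
    """Preference-weighted sum of one factor's correlation degrees."""
    return sum(prefs[w] * per_weather[w] for w in prefs)

# Hypothetical correlation degrees of a single factor (illustration only).
example = {"sunny": 0.80, "cloudy": 0.70, "rainy": 0.60}
score = comprehensive_degree(example, prefs)
```

Rounded to four decimals, `prefs` gives 0.5466, 0.2342 and 0.2192, matching the coefficients stated in the text.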
According to the calculation results in Table 5, the correlation degree histograms are drawn in Figure 1. In summary, we can analyse two prediction scenarios separately. (1) If we have accurately judged the weather type of the forecast day (such as sunny, cloudy or rainy), we can select the corresponding main meteorological factors; (2) If we cannot judge the weather type of the forecast day, we can use the comprehensive correlation degree for photovoltaic forecasting, which is slightly less accurate than the first prediction scenario. It can be seen that GRA can provide a reliable basis for the selection of meteorological information.

Construction of GRA-ELM model
In this chapter, ELM machine learning algorithm is combined with GRA to predict photovoltaic power. Take the four main meteorological factors in GRA as the input of ELM, to establish GRA-ELM model. ELM is a single hidden layer feedforward neural network learning algorithm. The traditional neural network learning algorithm (such as BP neural network algorithm) needs a lot of parameters in the training model, and it is easy to fall into the local optimal solution. By contrast, ELM only needs to set the number of neurons in the hidden layer, and the algorithm does not need to iterate the hidden layer. The output weight is determined by calculating the generalized inverse of the output matrix of the hidden layer. ELM not only guarantees the good generalization performance of the network, but also greatly improves the learning speed of the feedforward neural network. It has the advantages of high training speed, and can effectively overcome the inherent defects of BP neural network and learning vector quantization neural network.
There are seven meteorological factors affecting the photovoltaic power generation, and each factor has a certain coupling with the output power. For different prediction time, the coupling strength of each meteorological factor is different. Take the photovoltaic power before the prediction time as the training sample and use GRA to select four main influencing factors (such as weather temperature, global horizontal radiation, weather relative humidity and diffuse horizontal radiation). Further, take the above factors as the input layer node of ELM and the photovoltaic power as the output layer node, thereby establishing GRA-ELM photovoltaic power prediction model as shown in Figure 2.
GRA-ELM requires a large number of samples, which must first be made dimensionless. The samples are divided into training samples and test samples. Suppose the training samples are $[x_i, y_i]$, the number of neurons in the hidden layer is k, and the infinitely differentiable excitation function is g(x) (g(x) can be the sigmoid, sine or radial basis function). The output model can be calculated as follows:

$$\sum_{i=1}^{k} \beta_i \, g(W_i \cdot X_j + b_i) = O_j, \quad j = 1, 2, \ldots, N, \quad (5)$$

where $X_j$ represents the input matrix, which is composed of the main meteorological factors (weather temperature $X_1$, global horizontal radiation $X_2$, weather relative humidity $X_3$ and diffuse horizontal radiation $X_4$); $O_j$ represents the output matrix, which is the predicted value of photovoltaic power; b represents the bias matrix of the hidden layer; and N is the number of training samples.
In (5), W represents the connection weight matrix between the input layer and the hidden layer, and $\beta$ represents the connection weight matrix between the hidden layer and the output layer. The purpose of training the GRA-ELM model is to make the prediction error close to 0, that is

$$\sum_{j=1}^{N} \lVert O_j - T_j \rVert = 0,$$

where $T_j$ represents the target value of GRA-ELM, which is the actual value of photovoltaic power. In other words, there are $W_i$, $b_i$ and $\beta_i$ satisfying (9) and (10):

$$\sum_{i=1}^{k} \beta_i \, g(W_i \cdot X_j + b_i) = T_j, \quad j = 1, 2, \ldots, N, \quad (9)$$

$$H \beta = T, \quad (10)$$

where H is the output matrix of the hidden layer, whose entry in row j and column i is $g(W_i \cdot X_j + b_i)$. According to (10), the training of a single hidden layer neural network is transformed into calculating the least square norm solution of the weight matrix as follows:

$$\hat{\beta} = H^{+} T,$$

where $H^{+}$ is the Moore-Penrose generalized inverse matrix of H.
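To make the training step concrete, the following is a minimal standard-library sketch of an ELM with a sigmoid hidden layer. It is an illustration under stated assumptions, not the authors' code: the input weights and biases are drawn uniformly from [-1, 1], the output weights are obtained from the normal equations (which coincide with the Moore-Penrose solution only when the hidden-layer matrix has full column rank), and the four-input training data are synthetic. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_output(X, W, b):
    """Hidden-layer matrix H: one row per sample, one column per neuron."""
    return [[sigmoid(sum(w * x for w, x in zip(W[i], row)) + b[i])
             for i in range(len(W))] for row in X]

def solve(A, y):
    """Solve A v = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [A[r][:] + [y[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    v = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][j] * v[j] for j in range(r + 1, n))
        v[r] = (M[r][n] - tail) / M[r][r]
    return v

def train_elm(X, T, k, rng):
    # Random input weights and biases are fixed, never iterated.
    W = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(k)]
    b = [rng.uniform(-1, 1) for _ in range(k)]
    H = hidden_output(X, W, b)
    # Normal equations (H^T H) beta = H^T T give the least-squares beta.
    HtH = [[sum(h[i] * h[j] for h in H) for j in range(k)] for i in range(k)]
    HtT = [sum(h[i] * t for h, t in zip(H, T)) for i in range(k)]
    return W, b, solve(HtH, HtT)

def predict(X, W, b, beta):
    return [sum(bi * hi for bi, hi in zip(beta, h))
            for h in hidden_output(X, W, b)]

# Synthetic data: 60 samples of four normalised inputs (illustration only).
rng = random.Random(0)
X = [[rng.random() for _ in range(4)] for _ in range(60)]
T = [0.2 * x[0] + 0.6 * x[1] - 0.1 * x[2] + 0.05 * x[3] for x in X]
W, b, beta = train_elm(X, T, k=6, rng=rng)
pred = predict(X, W, b, beta)
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, T)) / len(T))
```

Only the output weights are fitted, which is why training reduces to a single linear solve. A production implementation would more likely use a numerical library's pseudo-inverse routine, which also handles a rank-deficient hidden-layer matrix.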
To study the influence of the hidden layer excitation function on the prediction accuracy of the model, considering the availability and validity of the data, this work selects a sample of the recorded data and evaluates the model with the root mean square error

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - y'_i \right)^2},$$

where $y_i$ represents the actual value of photovoltaic power, $y'_i$ represents the forecast value, and N represents the number of samples. When different excitation functions are applied to GRA-ELM, the evaluation results are shown in Table 6. Table 6 indicates that when the excitation function is sigmoid, the RMSE values of the training samples and test samples are the lowest, so the prediction accuracy is the highest. Therefore, this work selects sigmoid as the hidden layer excitation function to achieve the best prediction results.
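The error indices used throughout the paper (RMSE here, and MAPE and SDE in the later comparisons) can be computed as in the sketch below. The RMSE follows the definition in terms of the actual value, forecast value and sample count; the MAPE and SDE formulas are the conventional ones and are assumed here, since their exact definitions are not given in this section. The sample values are hypothetical.

```python
import math

def rmse(actual, forecast):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumed definition)."""
    n = len(actual)
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / n

def sde(actual, forecast):
    """Standard deviation of the error (assumed: population form)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    mean_err = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean_err) ** 2 for e in errors) / len(errors))

# Hypothetical power values in kW (illustration only).
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
scores = (rmse(actual, forecast), mape(actual, forecast), sde(actual, forecast))
```

MAPE is undefined when an actual value is zero, which is one practical reason to restrict the evaluation to daylight hours.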
After determining the excitation function of the hidden layer, this work changes the number of neurons in the hidden layer and analyses its impact on the prediction accuracy. Set the number of neurons to 3-8, respectively, and summarise the evaluation results as shown in Table 7. Table 7 indicates that as the number of neurons in the hidden layer increases, the RMSE values of training samples and test samples gradually decrease and tend to be stable. Since the increase of the number of neurons will lead to a significant reduction in the prediction speed, this work balances the prediction accuracy and algorithm learning speed, and sets the number of hidden layer neurons to 6. At the same time, it is necessary to compare the forecast value with the actual value, so as to get the prediction error and make a judgment. If the error is greater than 10%, to ensure the prediction accuracy, the number of hidden layer neurons must be increased to 7 in the next forecast cycle. So far, Section 2 has successfully constructed the GRA-ELM model with sigmoid as the excitation function of the hidden layer, which has six hidden layer neurons.

SDA-FA
SDA-FA is a combination of SDA and FA. SDA takes the main meteorological factors to form the meteorological characteristic vector, then selects similar days from historical days by introducing grey relation entropy; FA takes the photovoltaic power of similar days as initial solution vectors and obtains the optimal solution through multiple iterations.

SDA based on grey relation entropy
Grey relation entropy is a combination of GRA and information entropy (IE), which can be used to quantify the similarity between different sequences [18]. In this section, grey relation entropy is used to select the historical days with high similarity to the forecast day in terms of meteorological characteristics. The greater the grey relation entropy value, the higher the similarity between the historical day and the forecast day. The basic steps of SDA based on grey relation entropy are as follows:
Step 1: Based on the GRA results in Table 5, select four main meteorological characteristic variables to form the meteorological characteristic vector;
Step 2: Select the samples of historical days with the same season and weather conditions; the sample size can be dynamically adjusted from 5 to 30;
Step 3: Calculate the probability of occurrence of the meteorological characteristic variable j on historical day i by using (14)

$$P_i(j) = \frac{\xi_i(j)}{\sum_{j=1}^{J} \xi_i(j)}, \quad (14)$$

where $\xi_i(j)$ is the correlation coefficient between the forecast day and the historical day i, and J is the total number of meteorological characteristic variables and takes 4;
Step 4: The grey relation entropy of historical day i is defined as:

$$H_i = -\sum_{j=1}^{J} P_i(j) \ln P_i(j).$$

For example, if 09.30.2017 is selected as the forecast day (assuming the weather type cannot be judged), we can take the 8 days before the forecast day as historical days (09.22.2017 to 09.29.2017), then select the 4 days with the highest grey relation entropy as the similar days. Calculate the correlation
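The similar-day selection in Steps 3 and 4 can be sketched as follows. This is an illustration only: it assumes the usual grey relation entropy form, in which each day's correlation coefficients are normalised into a distribution and scored by Shannon entropy with the natural logarithm. The function names and the eight sets of coefficients are hypothetical.

```python
import math

def grey_relation_entropy(coeffs):
    """Entropy of one historical day's J correlation coefficients."""
    total = sum(coeffs)
    probs = [c / total for c in coeffs]          # Step 3: distribution over j
    return -sum(p * math.log(p) for p in probs if p > 0)   # Step 4

def select_similar_days(coeff_by_day, n_similar=4):
    """Return the n_similar days with the highest grey relation entropy."""
    ranked = sorted(coeff_by_day,
                    key=lambda day: grey_relation_entropy(coeff_by_day[day]),
                    reverse=True)
    return ranked[:n_similar]

# Hypothetical correlation coefficients for 8 historical days, J = 4.
coeff_by_day = {
    "09.22": [0.90, 0.88, 0.91, 0.89],
    "09.23": [0.95, 0.40, 0.35, 0.30],
    "09.24": [0.80, 0.80, 0.80, 0.80],
    "09.25": [0.99, 0.10, 0.10, 0.10],
    "09.26": [0.70, 0.65, 0.72, 0.68],
    "09.27": [0.85, 0.60, 0.55, 0.50],
    "09.28": [0.75, 0.74, 0.76, 0.73],
    "09.29": [0.90, 0.30, 0.80, 0.20],
}
similar = select_similar_days(coeff_by_day)
```

Under this definition the entropy is largest, at ln J, when the four coefficients are equal, so days whose similarity to the forecast day is evenly spread across all four factors rank highest.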

Construction of SDA-FA model
In this chapter, the SDA-FA model takes similar days photovoltaic power obtained in SDA as initial solution vectors of the FA intelligent optimization algorithm. On this basis, it outputs the optimal solution after the steps of exploration, exploitation, location updating and solution evaluation. FA is a heuristic algorithm of swarm intelligence optimization, which is derived by simulating the natural phenomenon that fireflies swarm and glow at night. In FA, each firefly is regarded as a solution of the objective function, which is randomly distributed in the solution space. Each firefly carries fluorescein, and has its own sensing domain and decision domain. The fluorescein value of the firefly is related to the objective function of its position. Every firefly is attracted to the one having comparatively greater brightness and its velocity is based on attractiveness [23]. The size of the firefly decision domain will be affected by the number of individuals in the domain. The smaller the number of fireflies in the decision domain, the larger the decision domain, to find more peers. On the contrary, the radius of the firefly decision domain will be reduced. If the brightness of adjacent individuals is the same, the firefly will move randomly. After multiple moves, all individuals will gather at the location of the firefly with the highest brightness, to achieve optimization. Because of the similarity of weather conditions, SDA-FA will obtain the optimal solution faster than basic FA. The SDA-FA model schematic diagram is shown in Figure 3.
In Figure 3, fireflies i, j, k and l represent initial solution vectors, that is, the photovoltaic power of similar days (09. 23 …). In the brightness function, $I_0$ is the fluorescein value at the light source, that is, the maximum fluorescent brightness emitted by the firefly itself; $\gamma$ is the absorption coefficient of the light intensity, assumed to be constant; and $r_{ij}$ is the distance between fireflies i and j. In the attractiveness function, $\beta_0$ is the attractiveness at the light source, that is, the maximum attractiveness of the firefly itself. Mathematically, the spatial distance between firefly i and firefly j can be expressed by the Cartesian distance

$$r_{ij} = \lVert x_i(t) - x_j(t) \rVert = \sqrt{\sum_{s=1}^{d} \left( x_{i,s}(t) - x_{j,s}(t) \right)^2},$$

where $x_i(t)$ and $x_j(t)$ represent the positions of fireflies i and j at time t, while d represents the spatial dimension. SDA-FA is a combination of SDA and FA, and the algorithm flowchart is shown in Figure 4. The basic steps of SDA have been described in detail in Section 3.1. Further, the basic steps of FA based on SDA are as follows (Step 1-Step 8):
Step 1: There are N fireflies distributed in the solution space. Each firefly is defined by its position $x_i(t) = [x_1(t), x_2(t), \ldots, x_n(t)]$ and fluorescein $z_i(t) = [z_1(t), z_2(t), \ldots, z_n(t)]$. According to the prediction requirements, the population size, spatial dimension and iteration number in the algorithm must be initialised. For photovoltaic power prediction, the firefly position represents the forecast value. Combined with the analysis results in SDA, the initial position of a firefly is determined by the corresponding similar day photovoltaic power;
Step 2: Calculate the objective function $f(x_i(t))$ corresponding to the position $x_i(t)$ of firefly i, which is a better solution than the prediction result i;
Step 3: In the exploration phase, firefly i looks for all individuals with higher brightness than itself in its decision domain to form a neighbour set, where $r_d^i(t)$ is the decision domain radius of firefly i;
Step 4: In the exploitation phase, firefly i is attracted by firefly j in its neighbour set and approaches it. The position update of firefly i uses a step factor in [0, 1] and a random number vector in [0, 1] generated by a Gaussian distribution, uniform distribution or other distribution;
Step 5: The firefly with the brightest luminosity moves randomly, and its position is updated around $x_{best}^k$, the optimal solution of generation k, that is, the predicted photovoltaic power most similar to the actual photovoltaic power;
Step 6: The fluorescein value of the firefly in the new position is recalculated using the volatilization coefficient of fluorescein;
Step 7: Dynamically adjust the decision domain radius of firefly i using the change coefficient of the neighbour set and the threshold $n_t$ of the neighbour set;
Step 8: Determine whether the number of iterations reaches the upper limit $iter_{max}$. If the upper limit is reached, output the optimal solution of photovoltaic power prediction. Otherwise, return to Step 2 for the next iteration. At this point, the complete SDA-FA model has been constructed.
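The sketch below shows the general idea of seeding a firefly search with similar-day values. It is not the SDA-FA procedure itself: it uses the widely known basic firefly update (attractiveness decaying as exp(-gamma * r^2) plus a small random step) and omits the decision domain, neighbour set and fluorescein update of Steps 3, 6 and 7. The objective function, the parameter values and the four seed powers are hypothetical.

```python
import math
import random

def firefly_minimise(objective, init_positions, iterations=50,
                     beta0=1.0, gamma=0.01, alpha=0.05, seed=0):
    """Basic firefly search; lower objective means brighter firefly."""
    rng = random.Random(seed)
    swarm = [list(p) for p in init_positions]
    cost = [objective(p) for p in swarm]
    best_i = min(range(len(swarm)), key=cost.__getitem__)
    best, best_cost = list(swarm[best_i]), cost[best_i]
    for _ in range(iterations):
        for i in range(len(swarm)):
            for j in range(len(swarm)):
                if cost[j] < cost[i]:  # firefly j is brighter than i
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    # Move i towards j, plus a small random perturbation.
                    swarm[i] = [a + beta * (b - a)
                                + alpha * (rng.random() - 0.5)
                                for a, b in zip(swarm[i], swarm[j])]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:  # keep the best solution seen
                        best, best_cost = list(swarm[i]), cost[i]
    return best, best_cost

# Hypothetical similar-day powers (kW) used as initial positions.
similar_day_power = [[98.0], [104.0], [95.5], [101.2]]

# Hypothetical objective: squared distance to a reference value of 100 kW.
def objective(position):
    return (position[0] - 100.0) ** 2

best, best_cost = firefly_minimise(objective, similar_day_power)
```

Because the initial swarm already lies near the reference value, the search starts from a small region, which reflects the argument that similar-day seeding lets SDA-FA reach its solution faster than a randomly initialised FA.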
SDA-FA has a strong local search ability and can obtain the optimal solution of photovoltaic power in a small region. The parameters of the algorithm are few and have little influence on the algorithm. However, the algorithm requires excellent individuals in the sensing domain to provide information, otherwise, the search will stop. This search method relies too much on excellent individuals, thus reducing the convergence rate.

Information entropy description
As a measure of uncertainty, information entropy is an effective method to evaluate the performance of information fusion [24]. Information entropy can measure the uncertainty information in a single random variable. The larger the value, the higher the uncertainty of the variable. The output photovoltaic power is random, and the characteristics of the output power curves differ across weather conditions. Since a single model cannot achieve ideal prediction accuracy, a hybrid prediction model based on the information entropy principle is established, which combines GRA-ELM and SDA-FA. The weight of a sub-model reflects its contribution to the prediction results, and the hybrid model determines the weight according to the degree of variation of each sub-model's prediction error. Therefore, the predicted photovoltaic power of the hybrid model is closer to the actual value.

Establishment of hybrid prediction model
In certain weather conditions, n independent photovoltaic power prediction models are used to predict the power at a total of m times per day. The absolute errors form an m × n matrix whose entry $e_{t,i}$ represents the absolute error of model i at time t.
The relative error is then calculated from the absolute error and its upper limit, and the information entropy and weight coefficient of each sub-model are derived from the relative errors. The predicted photovoltaic power of the hybrid model based on the information entropy principle is computed as the weighted sum of the sub-model forecasts, $\sum_{i=1}^{n} w_i x_{t,i}$, where $w_i$ is the weight coefficient of model i and $x_{t,i}$ represents the forecast value of the independent photovoltaic power prediction model i at time t. In different weather conditions, the prediction results of the sub-models are different, which leads to changes in both information entropy and weight coefficient. For stronger regulation ability and robustness, Section 4.2 applies the information entropy principle to the prediction model construction, thereby establishing the hybrid prediction model by combining GRA-ELM and SDA-FA, as shown in Figure 5. The hybrid model can dynamically adjust its weight coefficients according to the degree of distortion. Therefore, the hybrid prediction model has strong adaptability and high prediction accuracy under a variety of weather conditions, even extreme weather conditions.
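One common way to turn sub-model errors into entropy-based weights is sketched below. It is an assumed formulation from the combination-forecasting literature and may differ from the equations used in this work: each model's relative errors are normalised into a distribution over time, the normalised entropy is computed, the variation degree is one minus that entropy, and a model with a larger variation degree receives a smaller weight. The function names and all numbers are hypothetical.

```python
import math

def entropy_weights(rel_errors):
    """rel_errors[i][t]: non-negative relative error of model i at time t."""
    m = len(rel_errors[0])   # number of prediction times
    n = len(rel_errors)      # number of sub-models (must be at least 2)
    variation = []
    for errs in rel_errors:
        total = sum(errs)    # assumed non-zero in this sketch
        probs = [e / total for e in errs]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        variation.append(1.0 - entropy)   # variation degree of the errors
    v_sum = sum(variation)   # assumed non-zero in this sketch
    return [(1.0 - v / v_sum) / (n - 1) for v in variation]

def combine(forecasts, weights):
    """Weighted sum of sub-model forecasts at each time step."""
    steps = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(steps)]

# Hypothetical relative errors: model A is steady, model B has one spike.
errors_a = [0.02, 0.03, 0.02, 0.03]
errors_b = [0.01, 0.01, 0.30, 0.01]
weights = entropy_weights([errors_a, errors_b])

# Hypothetical forecasts in kW from the two sub-models.
forecast_a = [50.0, 60.0, 70.0, 65.0]
forecast_b = [49.0, 61.0, 90.0, 64.0]
hybrid = combine([forecast_a, forecast_b], weights)
```

In this example the model with the error spike gets the smaller weight, so the combined forecast stays close to the steadier model. With two sub-models each weight reduces to the other model's share of the total variation degree.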

Data description
The experimental data of this study comes from the DKASC [22]. DKASC is a demonstration facility for a range of solar technologies, which operates in the arid climate of Alice Springs, Central Australia. There are 38 solar power stations in Alice Springs and the total installed capacity is 263.0 kW. The photovoltaic array of DKASC must be thoroughly and professionally cleaned once or twice a year to ensure a good level of accuracy. The research period of photovoltaic power generation is 08:00-17:00, and the photovoltaic power and meteorological data are recorded every 5 min. Due to the weak sunshine intensity in other periods, they are not within the statistical range. The map is shown in Figure 6.

Numerical results
Take 01.12.2017-01. 16 Figure 7a, while the relative error is shown in Figure 7b. In Figure 7, the errors of GRA-ELM and SDA-FA on the sunny day are mainly concentrated in the early stage of prediction. The initial prediction errors of GRA-ELM and SDA-FA are about 6% and 32%, respectively. In 0-30 min, the prediction error of the sub-models gradually decreases and stabilizes within [−7%, 7%], and the initial prediction error of the hybrid model is about 4%. In 195-205 min, a sudden change of weather conditions results in a large fluctuation of photovoltaic power. When the weather conditions (such as radiation intensity, weather temperature and weather relative humidity) change greatly in a short time, the relative error of the test sample increases. At 200 min, the relative error is −34% for SDA-FA, −8% for GRA-ELM and −4% for the hybrid model. In 305-390 min, the prediction error of SDA-FA fluctuates within [−3%, 3%]. The reason is that when the prediction result is extremely close to the optimal solution, the step is larger than the distance between them, which causes fluctuations.
The relative error histograms on the sunny day are shown in Figure 8. They show that the relative error of GRA-ELM is mainly distributed within [−5%, 5%], accounting for 98.67%; the relative error of SDA-FA is mainly distributed within [...]
[...] Figure 9a, while the relative error is shown in Figure 9b.
On the cloudy day, the errors of GRA-ELM and SDA-FA are concentrated mainly at the times when the weather changes drastically. When the weather conditions are stable, the relative error of GRA-ELM is within [−10%, 10%], the relative error of SDA-FA is within [−12%, 12%], and the relative error of the hybrid model is within [−3%, 3%]. When the weather conditions change, the relative error of SDA-FA exceeds 30%. By contrast, the relative error of GRA-ELM increases less and decreases faster. At 455 min, the relative error of GRA-ELM is −9.83%. Owing to the timely adjustment of meteorological factors based on GRA, the error decreases rapidly to 2.59% at 460 min.
The relative error histograms on the cloudy day are shown in Figure 10.
[...] The weather type is rainy and the main meteorological factors are global horizontal radiation, diffuse horizontal radiation, wind direction and weather temperature. The predicted power of the hybrid model and sub-models (GRA-ELM and SDA-FA) is shown in Figure 11a, while the relative error is shown in Figure 11b. In Figure 11a, the photovoltaic power on the rainy day reaches a peak value of 125.86 kW at 155 min and then decreases continuously. At this peak, the relative errors of GRA-ELM and SDA-FA are −9.37% and 20.67%, respectively. Combined with Figures 7 and 9, this indicates that the prediction accuracy and stability of SDA-FA on cloudy and rainy days are poor. [...]
The prediction errors of the sub-models are now compared with those of the hybrid model. When the weather conditions suddenly change (as shown in Figure 9, 295 min), GRA-ELM takes 5 min to reduce the relative error to within 2%, SDA-FA takes 7 min, and the hybrid model takes 3 min. The hybrid model therefore takes 40% and 57.14% less time than GRA-ELM and SDA-FA, respectively. The results show that the proposed model recovers from sudden weather changes faster than either sub-model.
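The time savings quoted above follow directly from the three recovery times. The short check below reproduces them; the helper name `percent_reduction` is introduced only for this illustration.

```python
def percent_reduction(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Minutes needed to bring the relative error back within 2% (values from the text).
recovery_minutes = {"GRA-ELM": 5, "SDA-FA": 7, "hybrid": 3}

for name in ("GRA-ELM", "SDA-FA"):
    saving = percent_reduction(recovery_minutes[name], recovery_minutes["hybrid"])
    print(f"hybrid vs {name}: {saving:.2f}% less time")
# prints 40.00% for GRA-ELM and 57.14% for SDA-FA
```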
In multiple regression analysis, the decision coefficient R² is often used to evaluate how well the regression model fits the sample data; it ranges from 0 to 1. The closer R² is to 1, the stronger the ability of the prediction model to explain the sample data; when R² is 1, the sample data are completely predictable. The closer R² is to 0, the weaker this ability; when R² is 0, the sample data are completely unpredictable. The decision coefficient is calculated for the prediction results of GRA-ELM, SDA-FA and the hybrid model and summarized in Table 10.
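As an illustration, the decision coefficient can be computed as in the sketch below. It uses the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares, which is assumed here to match the definition used in this work; the function name `r_squared` is hypothetical.

```python
def r_squared(actual, predicted):
    """Standard coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares between measurements and predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total sum of squares of the measurements about their mean.
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect prediction
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # always predicting the mean
```

Under this definition a perfect prediction gives 1 and predicting the sample mean gives 0. A model that does worse than the mean would give a negative value, which falls outside the 0 to 1 range assumed in the text.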
According to the analysis of Figures 7-12 and Table 10, the following conclusions can be obtained:

Performance analysis
This work uses the benchmark functions in Table 11, in 30 dimensions, to verify the performance of the proposed model. Some commonly used single models (BP, SVM and GPR) and hybrid models (ESE, WPS and GAS) serve as comparison models. All simulations are run on the same equipment to make the experimental results more reliable. The algorithms are coded in MATLAB R2016a on a PC with an Intel Core i5 CPU, 12 GB RAM and a Windows 10.1 system. The parameters and test results are summarised in Tables 12 and 13, respectively. Table 12 shows that the population size is 30 and the maximum number of iterations is 1000 [20]. The numbers of input layer nodes, hidden layer nodes and output layer nodes of the [...] [25]. Table 13 shows that the single models run faster than the hybrid models. For example, the average running time of the single models on F2 is 19.54 s, which is 56.07% less than that of the hybrid models. However, the optimization ability of the single models is poor. For example, the best value of GPR on F4 is 1.77e-07, which is much higher than the worst value of WPS. SVM is the fastest of the seven models, and each optimization run of SVM takes the least time. However, the comparison shows that SVM converges poorly and cannot reach the global optimum on any of the five benchmark functions.
The proposed hybrid model largely balances running speed and convergence. It runs faster than all the other hybrid models listed. Its running times on F3 and F4 are 17.98 s and 14.06 s, respectively, which are 22.83% and 42.02% less than the average time of the other hybrid models. In addition, on F5 the model runs even faster than GPR. At the same time, the model converges to the global optimum on all five benchmark functions, which demonstrates its superior optimization ability.

Results assessment and comparison
The prediction of photovoltaic power is influenced by both season and weather. Therefore, to improve the credibility, we use sunny, cloudy and rainy days in spring, summer, fall and winter of 2017 as samples [...]
The hybrid model proposed in this work has the smallest standard deviation of error among all models. On 01.11 (Rainy), the SDE value of the proposed model is 85.24%, 86.66% and 84.38% lower than that of BP, SVM and GPR, respectively. On 07.09 (Rainy), the SDE value of the proposed model is 65.41%, 74.61% and 70.12% lower than that of ESE, WPS and GAS, respectively. This stability indicates that the proposed model has good adaptability and can dynamically update its parameters according to the weather conditions to achieve the optimal prediction results.
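For reference, the three error measures used in this comparison (MAPE, RMSE and SDE) can be computed as in the sketch below. It relies on their standard definitions, with SDE taken as the population standard deviation of the prediction error; whether this work divides by N or N − 1 is not stated here, so that choice is an assumption, as are the function names.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)


def sde(actual, predicted):
    """Standard deviation of the error (population form, dividing by N)."""
    errors = [y - p for y, p in zip(actual, predicted)]
    mean_error = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean_error) ** 2 for e in errors) / len(errors))


# Illustrative numbers only, not DKASC data.
measured = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(measured, forecast), rmse(measured, forecast), sde(measured, forecast))
```

MAPE measures relative accuracy, RMSE penalises large deviations, and SDE measures how much the error fluctuates around its mean, which is why it is used above as the indicator of stability.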

CONCLUSIONS
To optimize the prediction of photovoltaic power generation, this work summarizes various meteorological factors affecting photovoltaic power and analyses them with the method of GRA. Based on the intelligent algorithm, a hybrid prediction model for photovoltaic power generation based on information entropy is proposed. The superior performance of the model is verified by multiple comparative experiments using DKASC data. The main contributions of this work are as follows: [...]
The hybrid prediction model for photovoltaic power generation proposed in this work only considers short-term prediction. Therefore, the next step is to build a multi-time scale photovoltaic power prediction model with both short-term and long-term prediction, to further improve its market value and engineering significance.