Study on photovoltaic power forecasting model based on peak sunshine hours and sunshine duration

Accurate prediction of photovoltaic power generation is a critical technical problem in utilizing solar energy. To address the difficulty of obtaining model parameters when applying photovoltaic power prediction methods, this paper uses long‐term monitoring data of the output power of photovoltaic modules, various meteorological data, and solar irradiation intensity. Pearson correlation analysis is used to establish the functional relationship between the output power of photovoltaic modules and the irradiation intensity. By deriving the distribution relationship of irradiation intensity, a prediction model of irradiation intensity based on peak sunshine hours and sunshine duration is constructed, together with a query database of peak sunshine hours and sunshine duration covering 340 sites across the country over 64 years. On sunny days, the theoretical value of the prediction model is close to the measured value (R² = 0.918–0.985). On rainy days, the solar radiation intensity is weak and the prediction accuracy is lower (R² = 0.838–0.930). Under sunny conditions in each quarter, the relative errors of the sunshine duration and the peak sunshine hours in the database are less than 4.55% and 4.79%, respectively, indicating that the accuracy of the database meets practical needs.

improvement of the photovoltaic power consumption rate [6]. In addition to the traditional regression method and time series method [7], artificial neural networks have been widely used for photovoltaic power generation prediction in recent years [8]. Among them, fuzzy theory [9], gray system [10], random forest [11], support vector machine [12,13], and other intelligent prediction methods are widely used. The similar-day method selects days whose meteorological conditions resemble the current ones from the historical data, using parameters such as irradiation intensity, ambient temperature, and relative humidity, and establishes the prediction model with the data of those similar days [14,15].
Photovoltaic power generation is affected by meteorological factors such as radiation intensity, ambient temperature, relative humidity, wind speed, and cloud cover, and shows significant spatial and temporal differences [16,17]. Zhen et al. [18] considered that the intensity of solar radiation was the decisive factor affecting the output of photovoltaic power generation. The sunshine time, radiation intensity, and ambient temperature differ among weather types. Lu et al. [19] used the trend surface analysis method to fit the relationship between the temperature of photovoltaic panels and the amount of radiation in each season. Lu et al. [20] established a prediction model with temperature as the input variable under various weather types; based on the weather forecast, the model can predict the power generation of horizontally placed photovoltaic modules over the next few days. Wang et al. [21] proposed a breadth power prediction method based on the clearness coefficient and multilevel matching, but the accuracy of the prediction model needs to be improved for practical engineering.
Many scholars have used horizontal irradiation intensity to calculate the amount of solar radiation at any inclination angle [22][23][24], but the model calculation is complicated. Because the solar elevation angle and azimuth angle change constantly, many parameters must be input hourly, and the estimation is highly cumbersome. The input parameters of the above prediction models are difficult to obtain for photovoltaic power generation systems in remote areas, which makes the existing photovoltaic power prediction models difficult to popularize. A simple and accurate photovoltaic power prediction model is urgently needed.
Using years of real-time monitoring data, this study analyzed the influence of various meteorological data and irradiance on photovoltaic power, established a simplified model of photovoltaic power based on peak sunshine hours and sunshine duration, and verified the prediction accuracy of the model in different seasons and under different weather types. With the simplified prediction model, photovoltaic power generation can be predicted with fewer parameters, which provides a scientific basis and technical guidance for constructing and applying photovoltaic systems.

| Testing apparatus
The experiment was carried out in the experimental building of the Water Saving Research Institute (108°4′28″ E, 34°16′56″ N). The test device is shown in Figure 1 and is mainly composed of photovoltaic modules, a battery assembly, solar controllers, an AV6592 portable solar cell tester, and a personal computer (PC) data monitoring system.
The photovoltaic module was a CS5M32-260 single-crystal silicon cell. Its peak power and area under the standard test condition (STC: air mass AM1.5, solar radiation intensity 1000 W/m², photovoltaic cell temperature 25°C) were 260 W and 1.5 m², respectively. The maximum power tracking solar controller was connected to the computer through the interface to collect and record data automatically. The AV6592 portable solar cell tester mainly included the tester host and the AV87110 data acquisition probe box. The host and probe can be connected by Bluetooth, which provides wireless communication over up to 100 m, and the tester includes a standard STC correction program. The probe box was parallel to the photovoltaic module and connected to the host through the RS232 serial port. The measurable parameters included output current, voltage, power, solar radiation intensity, and so forth. Meteorological data were obtained from a weather station near the experiment site.

| Design of experiment
The test ran from April 2016 to July 2023, 24 h a day. A set of data was automatically collected every 10 min and written to a CSV file, which was automatically saved to the PC data monitoring system. In addition, to limit the effect of dust and other deposits on the power generation of the solar panels, natural cleaning by rain and snow was supplemented by manual cleaning every 10 days.

| Test indicators and methods
The data indicators recorded by the test device include photovoltaic module output power, ambient temperature, irradiation, wind speed, relative humidity, rainfall, atmospheric pressure, and wind direction. The output power of the photovoltaic module and the real-time irradiation intensity it received were measured with the AV6592 portable solar cell tester and displayed on the PC. The readings were automatically corrected to STC, while the wind speed, relative humidity, rainfall, atmospheric pressure, and wind direction at the same time were obtained from a small weather station in the laboratory.

| Model evaluation index
According to the evaluation indicators commonly used internationally for new energy prediction, the coefficient of determination R², root mean square error (RMSE), and mean absolute error (MAE) were selected as the model evaluation indicators. The corresponding formulas are as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \quad (1)$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \quad (2)$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \quad (3)$$

Forecast skill (FS) is an evaluation indicator that compares the model with the persistence model. The higher this value, the stronger the model's prediction ability. FS is computed as one minus the ratio of the RMSE of the model to that of the persistence model:

$$FS = 1 - \frac{RMSE_{model}}{RMSE_{persistence}} \quad (4)$$

This paper takes the measurements on Day n − 1 as the persistence forecast for Day n. The persistence model is frequently used as a baseline [25,26]. In the formulas, $y_i$ is the measured value; $\bar{y}$ is the average of the measured values; $\hat{y}_i$ is the predicted value; and $n$ is the number of samples.
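The four indicators can be sketched in a few lines of plain Python. This is a minimal illustration using the standard textbook definitions of R², RMSE, and MAE and the FS definition given in the text; the function names and the sample numbers are hypothetical and are not taken from the measured data set.

```python
import math

def r2_score(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

def rmse(measured, predicted):
    """Root mean square error."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)

def mae(measured, predicted):
    """Mean absolute error."""
    n = len(measured)
    return sum(abs(y - p) for y, p in zip(measured, predicted)) / n

def forecast_skill(measured, predicted, persistence):
    """One minus the ratio of model RMSE to persistence RMSE.

    `persistence` holds the previous day's measurements at the same times,
    following the Day n - 1 convention described in the text.
    """
    return 1.0 - rmse(measured, predicted) / rmse(measured, persistence)

# Illustrative irradiance values in W/m^2 (made up for this sketch).
measured = [100.0, 300.0, 500.0, 300.0, 100.0]
predicted = [120.0, 280.0, 520.0, 310.0, 90.0]
previous_day = [80.0, 350.0, 450.0, 250.0, 150.0]

print(r2_score(measured, predicted))
print(rmse(measured, predicted))
print(mae(measured, predicted))
print(forecast_skill(measured, predicted, previous_day))
```

A positive FS means the model beats the persistence baseline; a value of zero means it does no better than repeating the previous day's measurements.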

| Analysis of the influence of irradiation intensity and various meteorological data on the output power of photovoltaic modules
Pearson correlation analysis of the measured meteorological data was conducted using SPSS software. The results are shown in Table 1. From Table 1, it can be seen that the Pearson correlation coefficient between the output power of the photovoltaic module $P_{PV}$ and the irradiation $G_t$ is 0.987, which is a very strong correlation. The output power of photovoltaic modules $P_{PV}$ had a moderate correlation with ambient temperature $T$ and relative humidity $RH$. In addition, the Pearson correlation coefficients between $P_{PV}$ and wind speed $v_s$, atmospheric pressure $P_{atm}$, rainfall $P_r$, and wind direction $v_d$ were all less than 0.2, showing very weak or no correlation.
To further simplify the model, the relationship between output power and irradiation intensity was established by ignoring the influence of ambient temperature, wind speed, atmospheric pressure, rainfall, wind direction, and relative humidity [14,15]. The variation of output power per unit area of photovoltaic modules with irradiation intensity was analyzed using 82,398 test groups with irradiance greater than 5 W/m² from April 2016 to July 2023. The results are shown in Figure 2.
It can be seen from Figure 2 that the output power of photovoltaic modules per unit area had a strongly positive linear relationship with the irradiation intensity. Equation (5) is the fitted formula, with a coefficient of determination R² = 0.968.

$$P_{PV} = 0.131\,G_t + 5.107 \quad (5)$$
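The correlation screening can be illustrated without SPSS. The sketch below computes a Pearson coefficient in plain Python and labels its strength with the bands given in the note to Table 1; the function names are hypothetical, and assigning each boundary value to the stronger band is an assumption of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlation_strength(r):
    """Label |r| with the bands from the Table 1 note.

    Boundary values (0.2, 0.4, 0.6, 0.8) go to the stronger band here,
    which is a choice made for this sketch.
    """
    magnitude = abs(r)
    if magnitude >= 0.8:
        return "very strong"
    if magnitude >= 0.6:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    if magnitude >= 0.2:
        return "weak"
    return "very weak or none"

# The coefficient reported for output power versus irradiation.
print(correlation_strength(0.987))
```

Under this labeling, a factor is a candidate for removal from the model when its band is "very weak or none", which is how wind speed, atmospheric pressure, rainfall, and wind direction are treated in the text.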
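As a worked illustration, the sketch below reads Equation (5) as P = 0.131 G + 5.107, with P the output power per unit area in W/m² and G the irradiation intensity in W/m², and scales it by the 1.5 m² module area given in the apparatus description. The function names are hypothetical, and the ordinary least-squares helper only shows how such a line could be fitted; it is not the authors' fitting procedure.

```python
SLOPE = 0.131           # fitted slope read from Equation (5)
INTERCEPT = 5.107       # fitted intercept read from Equation (5), W/m^2
MODULE_AREA_M2 = 1.5    # module area stated for the CS5M32-260 module

def power_per_area(irradiance):
    """Output power per unit area (W/m^2) for a given irradiance (W/m^2)."""
    return SLOPE * irradiance + INTERCEPT

def module_power(irradiance, area=MODULE_AREA_M2):
    """Output power (W) of one module of the given area."""
    return power_per_area(irradiance) * area

def fit_line(g_values, p_values):
    """Ordinary least-squares fit p = slope * g + intercept."""
    n = len(g_values)
    mean_g = sum(g_values) / n
    mean_p = sum(p_values) / n
    sxx = sum((g - mean_g) ** 2 for g in g_values)
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(g_values, p_values))
    slope = sxy / sxx
    return slope, mean_p - slope * mean_g

# At 1000 W/m^2 the line gives about 136.1 W/m^2, or about 204.2 W per module.
print(power_per_area(1000.0), module_power(1000.0))
```

The fit in the text used only records with irradiance above 5 W/m², so applying the line near zero irradiance (where the intercept dominates) is outside the range it was derived from.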

| Establish the distribution function model of hourly irradiation intensity
Accurate solar radiation estimation is significant for designing and optimizing photovoltaic power generation systems [23][24]. Through theoretical analysis, this paper established an hourly irradiance model based on peak sunshine hours and sunshine duration, and further summarized the peak sunshine hours and sunshine duration in various regions of the country by season. Combined with Equation (5), this allows hourly prediction of photovoltaic power generation in different areas and quarters of the country. Considering that hourly monitoring of solar radiation intensity is demanding, time-consuming, and lagging, the hourly radiation intensity was calculated from a daily radiation intensity distribution model. To facilitate the analysis, the radiation intensity distribution function model based on peak sunshine hours and sunshine duration is established as Equation (6) [27]. In Equation (6), $G_t$ is the irradiance on the photovoltaic panel at time $t$, W/m²; $G_{max}$ is the maximum irradiance in a day, W/m²; $t$ is the time point in the sunshine period; and $T_H$ is the average sunshine duration of each quarter, h.
The total solar radiation in a day is obtained by integrating the distribution function, which gives Equation (7), where $Q$ is the total daily radiation received by the PV module per unit area, Wh/m². The peak sunshine hours $T_m$ is the equivalent number of hours at the standard test condition (radiation intensity of 1000 W/m²) that would deliver the total solar radiation received by the photovoltaic array per unit area. In designing and optimizing a photovoltaic power generation system, Equation (8) can convert the total daily radiation into peak sunshine hours, from which the photovoltaic power generation is then calculated. The calculation diagram is shown in Figure 3.
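Equations (6) to (9) are not reproduced in this text. A widely used distribution function that is consistent with the symbols defined above is the half-sine profile $G_t = G_{max}\sin(\pi t / T_H)$ for $0 \le t \le T_H$; integrating it over the sunshine period gives $Q = 2 T_H G_{max}/\pi$, so that $T_m = Q/1000$ and $G_{max} = 1000\,\pi T_m / (2 T_H)$. The sketch below assumes this half-sine form; it is an illustration and may differ from the paper's exact equations, and all function names are hypothetical.

```python
import math

G_STC = 1000.0  # standard test condition irradiance, W/m^2

def g_max_from_peak_hours(t_m, t_h):
    """Daily maximum irradiance (W/m^2) under the assumed half-sine profile."""
    return math.pi * G_STC * t_m / (2.0 * t_h)

def irradiance(t, t_m, t_h):
    """Irradiance (W/m^2) at t hours after sunrise.

    t_m: peak sunshine hours (h); t_h: sunshine duration (h).
    Outside the sunshine period the irradiance is taken as zero.
    """
    if t < 0.0 or t > t_h:
        return 0.0
    return g_max_from_peak_hours(t_m, t_h) * math.sin(math.pi * t / t_h)

def daily_total(t_m, t_h, steps=1000):
    """Midpoint-rule integral of the profile (Wh/m^2), a numerical check."""
    dt = t_h / steps
    return sum(irradiance((i + 0.5) * dt, t_m, t_h) for i in range(steps)) * dt

# Illustrative inputs: 5 peak sunshine hours spread over a 10 h sunshine period.
print(g_max_from_peak_hours(5.0, 10.0))
print(daily_total(5.0, 10.0))
```

The numerical integral returns approximately 1000 × T_m Wh/m², which confirms that the assumed profile is normalized so that the daily total matches the peak sunshine hours.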

| Model validation
The irradiance data measured from April 2016 to July 2023 in Yangling, Shaanxi Province, with the device in Figure 1 were used after eliminating erroneous and incomplete data caused by machine failure. The available test data in Spring (March-May), Summer (June-August), Autumn (September-November), and Winter (December-February) covered 640, 630, 295, and 409 days, respectively. The data were preprocessed, and the average sunshine duration $T_H$ was 11, 11, 10.5, and 9 h, respectively. Six days were randomly selected from each of the three weather types (sunny, cloudy, and rainy) in each quarter, for a total of 18 days in each quarter. The test device records the irradiation intensity every 10 min, that is, 144 records per day, and the total radiation of the day can be calculated by Equation (10). After the peak sunshine hours of the day are obtained by Equation (8), the theoretical value of irradiation intensity is calculated by Equation (9). The results are shown in Figure 4. It can be seen from the figure that the irradiation intensity on sunny days was generally higher than that on rainy days. On sunny days, the theoretical value of hourly irradiation intensity was very close to the measured value, and the prediction accuracy was the highest. The irradiance on rainy days was weak, less than 200 W/m², and the goodness of fit of the irradiance model was lower than that on sunny days. On cloudy days, the solar irradiance still showed a trend of increasing first and then decreasing, but there was no quantitative change rule, and the irradiance prediction model performed worst.
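The step from 10-min records to peak sunshine hours can be sketched as follows. Equation (10) is not reproduced in this text, so the sketch assumes the simplest form, a rectangular sum in which each of the 144 daily records represents 10 min; the function names are hypothetical, and the sample day is made up.

```python
SAMPLES_PER_DAY = 144      # 24 h of records at a 10 min interval
STEP_HOURS = 10.0 / 60.0   # duration represented by one record, h
G_STC = 1000.0             # standard test condition irradiance, W/m^2

def daily_radiation(samples):
    """Total daily radiation (Wh/m^2) from 10-min irradiance records (W/m^2).

    Assumes a rectangular sum; a trapezoidal rule would be an equally
    plausible reading of Equation (10).
    """
    if len(samples) != SAMPLES_PER_DAY:
        raise ValueError("expected 144 records for a complete day")
    return sum(samples) * STEP_HOURS

def peak_sunshine_hours(samples):
    """Peak sunshine hours (h): daily radiation divided by 1000 W/m^2."""
    return daily_radiation(samples) / G_STC

# Made-up day: 10 h at a constant 600 W/m^2, dark otherwise.
day = [0.0] * 42 + [600.0] * 60 + [0.0] * 42
print(daily_radiation(day), peak_sunshine_hours(day))
```

Rejecting days with fewer than 144 records mirrors the removal of incomplete data described above; a production pipeline might instead interpolate short gaps, which this sketch does not attempt.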
To quantitatively analyze the accuracy of the radiation intensity model, the performance metrics for the Figure 4 data in different seasons are shown in Table 2. Because the radiation intensity changes irregularly and unpredictably on cloudy days, the model error was the largest for that weather type. The coefficient of determination was the lowest on cloudy days (R² = 0.639–0.796) and the highest on sunny days (R² = 0.918–0.985), while that on rainy days was between those of sunny and cloudy days (R² = 0.838–0.930). The forecast skill (FS) on sunny, cloudy, and rainy days across the four seasons ranged over 0.523–0.576, 0.253–0.326, and 0.406–0.458, respectively. This shows that the improvement of the model over the persistence forecast is greatest on sunny days.
In summary, the local predicted values and the measured values fit closely, and the model effectively predicted the trend of photovoltaic power generation, which indicates a good prediction effect. Overall, the prediction was best in summer and winter. It was poorer in spring and autumn, possibly because the seasonal variation of solar irradiance causes significant fluctuation and uncertainty of photovoltaic power in these seasons, which considerably increases the difficulty of prediction.
However, the prediction model of photovoltaic power generation based on peak sunshine hours and sunshine duration established above had a better prediction effect on sunny days. In contrast, the prediction effect on cloudy days was poor. The prediction model is therefore more suitable for predicting photovoltaic power on sunny days.

| Sensitivity analysis with the number of samples
In this paper, 6 days were selected as samples for each of the sunny, cloudy, and rainy weather types in each quarter. A sensitivity analysis of the number of samples was carried out to show that this number is reasonable. The optimal number of samples was determined from the change in model accuracy with the number of samples. The analysis results are as follows.
It can be seen from Figure 5 that as the number of sample days increases, the accuracy of the prediction model established in this paper also gradually increases. Once the number of sample days reaches 6, the model's accuracy hardly changes, indicating that 6 days per weather type in each quarter are sufficient to represent the model's accuracy. This shows that the number of samples chosen in this study was reasonable.

| Construction of prediction model database
If the peak sunshine hours and sunshine duration of a day can be obtained directly, the hourly photovoltaic power generation can be calculated directly by Equations (5) and (9). These two parameters are often difficult to obtain in time, yet the photovoltaic power generation must be evaluated in the early stage of photovoltaic design. This study therefore established a database of sunshine duration and peak sunshine hours so that users can query the local sunshine duration and peak sunshine hours. Because of the significant regional, seasonal, and weather differences, a complete database would be complicated. The following discussion is based only on the sunny days of each season across the country.

| Construction of database
In this study, the sunshine duration and total horizontal radiation data of 340 stations in the China Meteorological Data Network from 1958 to 2023, a total of 66 years, were downloaded and collated. From the multi-year sunshine duration data, each station's monthly average sunshine duration on sunny days was calculated, and the database of sunny-day sunshine duration in each quarter was established. From the total radiation data of the horizontal plane, the average daily total radiation of sunny days in each quarter of each region was obtained. According to Equation (7), the database of sunny-day peak sunshine hours in the horizontal plane for the 340 stations was obtained.
The radiation on the tilted photovoltaic module is 1.05–1.15 times the horizontal radiation, so in practical application the horizontal value should be converted to the peak sunshine hours on the corresponding tilted surface [28]. Then, the local hourly irradiation intensity and power generation are estimated by Equations (9) and (5), respectively. Taking the peak sunshine hours of sunny days as an example, Figure 6 compares the peak sunshine hours of sunny days in different regions and seasons of the country, mapped with the ArcGIS Kriging interpolation method. It can be concluded that the peak sunshine hours of sunny days in the four seasons are 8.19, 9.13, 6.59, and 5.68 h, respectively.
The error analysis of the daily measured value and 1.1 times the query value of the sunshine duration and peak sunshine hours is shown in Table 3. For sunshine duration, the root mean square error in each season was not more than 1.36 h and the mean absolute error was not more than 1.28 h. For peak sunshine hours, the root mean square error was not more than 2.26 h and the mean absolute error was not more than 1.76 h. The maximum relative error of sunshine duration in each quarter was 4.55%, and the relative error of peak sunshine hours was less than 4.79%, which shows that the sunshine duration and peak sunshine hours database has definite reference value for photovoltaic system design.
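The tilt conversion and the relative error check can be sketched with the Yangling values quoted in the database verification. The sketch reads 4.86, 5.63, 3.96, and 4.26 h as the horizontal query values and 5.23, 6.02, 4.23, and 4.47 h as the measured values, which is an interpretation of the text; taking the measured value as the denominator of the relative error is also an assumption, so the printed errors need not reproduce Table 3. The function names are hypothetical.

```python
TILT_FACTOR = 1.1  # coefficient stated for the 45 degree inclined plane

def to_tilted(horizontal_hours, factor=TILT_FACTOR):
    """Convert horizontal-plane peak sunshine hours to the tilted plane."""
    return factor * horizontal_hours

def relative_error(estimate, measured):
    """Absolute relative error with the measured value as reference."""
    return abs(estimate - measured) / measured

seasons = ["Spring", "Summer", "Autumn", "Winter"]
query_horizontal = [4.86, 5.63, 3.96, 4.26]   # read as database values, h
measured_tilted = [5.23, 6.02, 4.23, 4.47]    # read as measured values, h

for season, query, measured in zip(seasons, query_horizontal, measured_tilted):
    tilted = to_tilted(query)
    error = relative_error(tilted, measured)
    print(f"{season}: tilted {tilted:.2f} h, relative error {error:.2%}")
```

The converted values round to 5.35, 6.19, 4.36, and 4.69 h, matching the inclined-plane figures reported in the text.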

| Comparison with the existing literature
In addition, to evaluate the rationality and novelty of the proposed method more objectively, several recent research cases and methods were collected and analyzed. Comparison with the literature in Table 4 shows that the method proposed in this paper is reasonable and promising, as discussed in detail below.
First, we compared the proposed model with prediction models of irradiation intensity or photovoltaic power generation established by deep learning and other methods. The regression-based model in this paper performed better in predicting photovoltaic power on sunny days. Zenan, Reikard, Qing, et al. adopted deep learning to construct models that predict the power generation of photovoltaic power stations [30][31]. For such models, the more historical data, the more accurate the model; they contain many parameters and layers, which makes them highly complex. Their internal working principle is difficult to understand, so models established by deep learning are less interpretable and harder to analyze and debug. In addition, their prediction effect is better on complex cloudy days. On sunny days, however, the large number of parameters makes them prone to over-fitting, resulting in low prediction accuracy, and they are not easy to apply to areas lacking historical data. By combining the peak sunshine hours and sunshine duration database, the model established in this paper can easily predict the photovoltaic power in each region and therefore has high universality. The model predicts well on sunny days in various regions, but its prediction effect on cloudy days is relatively poor, which is a deficiency of the proposed model.
However, photovoltaic technology is now used in many sectors, among which photovoltaic water pumping is widely used in agricultural irrigation. In agricultural irrigation, many areas need artificial irrigation because of continuous sunny days and little rainfall, and photovoltaic water pumping irrigation systems are well suited to these conditions. Cloudy and rainy weather generally requires less irrigation, so the limitation of the model on such days matters less in this application. Moreover, in areas where a photovoltaic water pumping irrigation system is to be installed but historical irradiation data are lacking, the model proposed in this study can be used to match the growth characteristics of crops well [32,33]. This shows that the proposed prediction method is reasonable and adequate, and helpful for the promotion and application of photovoltaic technology.

| CONCLUSIONS
In this paper, we proposed a new prediction model based on peak sunshine hours and sunshine duration, which can be used for regional PV technology construction in China. Four performance indicators were used for objective evaluation. The case study results showed that, under sunny conditions, the proposed model had higher accuracy than other prediction models and a wider range of adaptability. The following conclusions were obtained:
1. Through long-term test data, the functional relationship between irradiation and output power of photovoltaic modules per unit area was obtained (Pearson correlation coefficient 0.987; R² = 0.968 for the linear fit). Pearson correlation analysis was used to screen out weakly correlated meteorological factors, which simplified the relationship to a function of irradiation intensity alone.
2. A prediction model of hourly irradiance based on peak sunshine hours and sunshine duration was established through theoretical derivation. The meteorological data of Yangling in Shaanxi Province were used to verify that the theoretical value of hourly irradiance of this model on sunny days was very close to the measured value, and the prediction accuracy was the highest (R² = 0.918–0.985). The irradiance was weak on rainy days, and the prediction accuracy was lower than on sunny days (R² = 0.838–0.930). On cloudy days, the solar irradiance still showed a trend of increasing first and then decreasing, with no quantitative change rule and the worst prediction (R² = 0.639–0.796).
3. The radiation and sunshine duration data of 340 stations in the China Meteorological Data Network from 1958 to 2023, a total of 66 years, were sorted out, and the sunshine duration and peak sunshine hours databases for different regions of the country were compiled. The relative errors in each season were not higher than 4.55% and 4.79%, respectively, indicating that the accuracy of the database meets the demand of a photovoltaic power generation prediction model based on peak sunshine hours and sunshine duration.

FIGURE 1 Schematic diagram of the structure of the photovoltaic power monitoring device.

TABLE 1 The correlation between the output power and irradiation intensity of photovoltaic modules per unit area and meteorological factors. Note: An absolute value of the Pearson correlation coefficient between 0.8 and 1 indicates a very strong correlation; between 0.6 and 0.8, a strong correlation; between 0.4 and 0.6, a moderate correlation; between 0.2 and 0.4, a weak correlation; and between 0 and 0.2, a very weak correlation or no correlation.

FIGURE 2 Schematic diagram of photovoltaic module output power per unit area changing with the irradiation.

FIGURE 3 Peak sunshine hours and sunshine duration calculation diagram.

In Table 2, the RMSE had a maximum value of 172 W/m² on cloudy autumn days and a minimum value of 15 W/m² on rainy winter days. The MAE range was 19–135 W/m².

FIGURE 4 Schematic diagram of comparison between theoretical and measured values of radiation intensity model under different weather types in four seasons. (A) Spring. (B) Summer. (C) Autumn. (D) Winter.

TABLE 2 Performance metrics in different seasons.

FIGURE 5 The influence of sample selection days on model accuracy. (A) Spring. (B) Summer. (C) Autumn. (D) Winter.

3.3.2 | Verify the database
Based on the measured radiation intensity values in the Yangling area of Shaanxi Province from April 2016 to July 2023, the practical value of the database of sunshine duration and peak sunshine hours was analyzed and verified. From Section 3.2.2, it can be seen that the sunshine duration in Yangling of Shaanxi Province in Spring, Summer, Autumn, and Winter was 11, 11, 10.5, and 9 h, and the peak sunshine hours were 5.23, 6.02, 4.23, and 4.47 h, respectively. The sunshine hours of local sunny days were 11.2, 10.9, 10.3, and 9.3 h; the peak sunshine hours were 4.86, 5.63, 3.96, and 4.26 h. When the photovoltaic module was on a 45° inclined plane, the peak sunshine hours coefficient was 1.1, and the peak sunshine hours on the inclined plane were 5.35, 6.19, 4.36, and 4.69 h, respectively.

FIGURE 6 Distribution diagram of sunny peak sunshine hours in each quarter of the country. (A) Spring. (B) Summer. (C) Autumn. (D) Winter.
TABLE 3 Prediction error analysis table of sunny sunshine duration and peak sunshine hours.

TABLE 4 Comparison with other prediction methods in the existing literature.