Future prospects research on offshore wind power scale in China based on signal decomposition and extreme learning machine optimized by principal component analysis

In recent years, China has promoted many new energy projects to meet the growing demand for electricity, and its offshore wind power installed capacity has grown rapidly. China has a long coastline and abundant offshore wind energy resources. Offshore wind power is an important area of renewable energy development that can promote advances in wind power technology and the adjustment of the energy structure. Effective research on and forecasting of the cumulative installed capacity of China's offshore wind power will therefore help the government plan deployment rationally and reduce the risk of investment in offshore wind power. To predict the future prospects of offshore wind power in China accurately, this paper first constructs a set of influencing factors and uses gray correlation analysis to screen the main ones. It then proposes a novel forecasting model named e‐VMD‐PCA‐RELM. The model builds on the traditional RELM (robust extreme learning machine) algorithm: the PCA (principal component analysis) algorithm processes noisy information and extracts the principal features of the RELM hidden layer to reduce information redundancy. At the same time, the e‐VMD (variational mode decomposition optimized by entropy) algorithm decomposes the original time series into multiple components. Comparison with other forecasting algorithms shows that the proposed model has strong generalization ability and achieves good prediction results. Finally, the e‐VMD‐PCA‐RELM model is used to predict the scale of offshore wind farms in China from 2019 to 2035. We find that the cumulative installed capacity of China's offshore wind power will exceed 60 GW in 2035 and will increase year by year. In 2030, there will be a large increase, with a relative growth rate of 20%.


| INTRODUCTION
Wind energy has become one of the most mature, most widely used, and fastest-growing new energy sources. The increasing installed capacity of wind power worldwide has proved that it can be applied on a large scale. During the "Twelfth Five-Year Plan" period in China, the newly added wind power installed capacity was 210 million kW, with an average annual growth rate of 18.4%, making wind power the fastest-growing clean energy resource. 1 In China, offshore wind speed is more than 20% higher than onshore wind speed, and the power generation per unit area can be more than 70% higher. In recent years, with the increase in energy demand and the deterioration of the global environment, wind power generation should become a key energy project. In 2017, the global offshore wind power market added a record 4334 MW of installed capacity, an increase of 95% compared to 2016. The cumulative installed capacity of offshore wind power in the world reached 18 814 MW, a 30% increase from 2016 (14 384 MW). Europe continues to maintain its position as the world's largest offshore wind power market, with nearly 84% (15 780 MW) of offshore wind power facilities located in the offshore waters of 11 European countries. China's offshore wind power is making breakthrough progress, with a total of 11.6 MW of new installed capacity and a cumulative installed capacity of 27.9 MW. According to the National Development and Reform Commission's Energy Institute, the potential offshore wind power installed capacity in China can reach 100-200 GW. The development of offshore wind power projects will therefore be a hot spot for the use of renewable clean energy. 2 According to the sustainable development strategy and the deployment of China's energy plan, the planned construction of offshore wind power in 2020 needs to exceed 30 GW, and offshore wind power will become a core project in renewable energy.
Promoting the construction and grid access of China's offshore wind power projects is a key part of current energy planning work. It is significant for expanding offshore wind power installed capacity, optimizing the energy structure, achieving low-carbon emission reduction targets, and improving the level of sustainable development. After years of steady development, China's offshore wind power has basically met the conditions for large-scale development in terms of resources, technology, and policy.
To predict the cumulative installed capacity of offshore wind power in China accurately, this paper proposes a combined forecasting model based on e-VMD (variational mode decomposition optimized by information entropy) and PCA-RELM (robust extreme learning machine optimized by principal component analysis). The research innovations of this paper are as follows:
1. The VMD algorithm is optimized by entropy, which addresses the problem that the historical time series characteristics of the cumulative installed capacity of offshore wind power are weak and difficult to extract. Entropy reflects the validity of the signal decomposition, so the minimum-entropy principle is used to determine the value of k and to optimize α in VMD.
2. The RELM algorithm is improved by PCA. PCA extracts the principal component features of the hidden layer and reduces the influence of the number of hidden layer nodes on the accuracy of the model. Because the number of hidden layer nodes can be set quickly, the operation flow is simplified.
3. This paper uses a combined prediction algorithm to predict the scale of offshore wind power in China. Eighteen secondary influencing factors are selected according to energy resources, industrial technology, investment, and macroeconomics and are screened by gray correlation analysis, which supports the accuracy of the overall combined algorithm.
The main contents of the paper are as follows: Section 2 summarizes the research status of offshore wind power and the development of prediction algorithms; Section 3 introduces the principles of VMD, RELM, and the related algorithms; Section 4 summarizes the overall flow of the e-VMD-PCA-RELM model; Section 5 applies the proposed e-VMD-PCA-RELM model to forecast the cumulative installed capacity of offshore wind power in China. A comparison with the RELM, LSSVM, PCA-RELM, and VMD-PCA-RELM prediction results demonstrates the practicability and accuracy of the proposed model, which then outputs the cumulative installed capacity of China's offshore wind power from 2019 to 2035; Section 6 presents forward-looking conclusions.

| LITERATURE REVIEW
In recent years, many scholars have studied the status of offshore wind energy resources in various countries and evaluated them from multiple perspectives. The most active research area is wind power construction technology, 3-20 such as tower structure, unit aerodynamic performance, and wind turbine blade optimization design. Ma et al 3 established a time-domain coupling calculation model of a floating wind turbine and proposed a combined optimization design method for wind turbine blades. Jalbi et al 4 studied the role of vibration modes in the design of WTG (wind turbine generator) support structures and proposed symmetric (equilateral) and asymmetric (isosceles) triangular arrangements, which can help the designer optimize the basic layout. Pokhrel et al 5 simulated different degrees of structural damage to offshore wind turbines (OWT) caused by different natural climates and disasters and conducted a vulnerability analysis in the form of fragility curves. To improve the output characteristics of offshore wind power and enhance the wind power regulation capability of the southern coast of China, Yao et al 6 proposed an optimized selection method for seawater pumped storage power stations (SPSP). Castro et al 15 proposed an evaluation model for the life cycle cost assessment of a combined or hybrid floating marine renewable energy system to better understand the technical solution. Shiau et al 16 studied the social impact assessment of offshore wind power. Salvador et al 17 studied the legal constraints imposed by the protection of the marine environment and other interests to determine areas with high wind power density (WPD) and less conflict.
Wind power resource forecasting is an emerging research topic that is important for capacity layout and planning. References 21-31 studied the calculation and prediction of wind power resources. Rusu 21 used a multi-climate-model comparison based on the Rossby Center Regional Atmospheric Model (RCA4) to enable reliable prediction of wind energy resources in the Black Sea coastal environment. Akdag and Dinler 22 developed a new method, called the power density (PD) method, for estimation in wind energy applications. Ganea et al 23 analyzed the wind and wave conditions along the coast of Europe, including transportation, as well as various ports and offshore operations. Onea and Rusu 24 analyzed 11 years of historical data and evaluated the feasibility of implementing wind power on the northwest side of the Black Sea using the classical logarithmic transformation law. Kim et al 25 evaluated the wind energy resources of the 2.5 GW offshore wind power project in southwestern Korea from onshore and island meteorological tower measurements by using WindSim (a flow model based on computational fluid dynamics and statistical dynamic downscaling). These studies showed that the environment, climatology, detection technology, policy, and law have a strong impact on the calculation and prediction of offshore wind power resources.
Combined prediction algorithms, which pair signal decomposition with machine intelligence algorithms, have become increasingly mainstream in energy resource prediction. Predicting the cumulative installed capacity of offshore wind farms is a complex nonlinear problem. So far, many scholars have proposed prediction models for wind energy installation in various countries. 32-44 Vahidzadeh and Markfort 32 predicted the turbine power output curve by using a high-resolution wind measurement power curve, including turbulence, yaw error, air density, wind direction, and shear; Huan et al 33 proposed a combined prediction model based on Ensemble Empirical Mode Decomposition (EEMD) and Least-Squares Support Vector Machine (LSSVM) to improve the accuracy and effectiveness of dissolved oxygen (DO) prediction; Kim and Hur 34 proposed a stochastic prediction of wind power resources for the Jeju Island wind farm in South Korea using an enhanced ensemble model, in which the Kriging method (a spatial interpolation method) is applied to estimate the wind speed at a potential location; Niu et al 35 developed a multi-objective grasshopper optimization algorithm based on the combination of empirical mode decomposition with adaptive noise and no negative constraint theory to achieve accurate short-term wind speed predictions. Li et al 36

| Robust extreme learning machine
RELM is a single-hidden-layer feedforward neural network proposed in recent years. Unlike other neural networks, RELM randomly generates the input weights and thresholds when data are input, and it obtains the output weights through a generalized inverse matrix without iteratively adjusting parameters. As a result, RELM learns faster and generalizes better. The algorithm is described as follows.
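Suppose there are $N$ distinct samples $(x_i, y_i)$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})^T \in R^n$ and $y_i = (y_{i1}, y_{i2}, \ldots, y_{im})^T \in R^m$, and the network has $k$ hidden layer nodes. Then the RELM network output is

$$o_i = \sum_{j=1}^{k} \beta_j \, g(W_j \cdot x_i + b_j), \quad i = 1, 2, \ldots, N \tag{1}$$

In the formula, $W_j = (w_{j1}, w_{j2}, \ldots, w_{jn})^T$ is the weight vector between the $j$th hidden layer neuron and the input layer neurons, $\beta_j = (\beta_{j1}, \beta_{j2}, \ldots, \beta_{jm})^T$ is the weight vector between the $j$th hidden layer neuron and the output layer neurons, and $b_j$ is the threshold of the $j$th hidden layer neuron.
A RELM with $k$ hidden neurons and an activation function $g(x)$ can make the error on the samples approximately equal to 0, which gives the following equation.

$$\sum_{j=1}^{k} \beta_j \, g(W_j \cdot x_i + b_j) = y_i, \quad i = 1, 2, \ldots, N \tag{2}$$

The above equations can be summarized as

$$H\beta = Y \tag{3}$$

where $H \in R^{N \times k}$ is the output matrix of the hidden layer with entries $H_{ij} = g(W_j \cdot x_i + b_j)$, $\beta = [\beta_1, \beta_2, \ldots, \beta_k]^T \in R^{k \times m}$, and $Y = [y_1, y_2, \ldots, y_N]^T \in R^{N \times m}$. The input weights and thresholds are determined randomly, and the output weights are calculated as the least-squares solution

$$\hat{\beta} = H^{+} Y \tag{4}$$

where $H^{+}$ is the generalized inverse of the matrix $H$.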
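The training procedure described in this subsection can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the sigmoid activation, the uniform range of the random weights, the toy data, and all function names are assumptions introduced here.

```python
import numpy as np

def sigmoid(z):
    """Activation function g(x); the sigmoid is an assumed choice."""
    return 1.0 / (1.0 + np.exp(-z))

def hidden_output(X, W, b):
    """Hidden layer output matrix H with H[i, j] = g(W_j . x_i + b_j)."""
    return sigmoid(X @ W + b)

def train_elm(X, Y, k, rng):
    """Draw random input weights and thresholds, then solve beta = H^+ Y."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], k))  # column j holds W_j
    b = rng.uniform(-1.0, 1.0, size=k)                # thresholds b_j
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ Y  # Moore-Penrose generalized inverse
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Network output for new inputs using the fixed random hidden layer."""
    return hidden_output(X, W, b) @ beta

# Toy data (hypothetical): 30 samples, 3 inputs, 1 output.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(30, 3))
Y_train = np.sin(X_train.sum(axis=1, keepdims=True))
W, b, beta = train_elm(X_train, Y_train, k=10, rng=rng)
Y_fit = predict_elm(X_train, W, b, beta)
```

Only the output weights are fitted; the hidden layer stays at its random initial values, which is why training reduces to a single pseudo-inverse.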

| Principal component analysis algorithm
Principal component analysis (PCA) is a basic method of multivariate statistics. 41 The basic idea of PCA is to find a new set of variables that replace the original variables. The new variables are linear combinations of the original variables, are mutually uncorrelated, and carry as much of the useful information in the original variables as possible. The principal components extracted by PCA reflect the core information of the research object and remove the linear relationships and redundant information among high-dimensional data variables, thereby reducing the data dimension and improving the efficiency of analysis while preserving modeling accuracy.
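Suppose a data set consists of $m$ variables and $n$ sets of values. This paper arranges the data set into a matrix $X \in R^{n \times m}$ (with mean-centered columns), as shown in formula (5).

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{5}$$

The loading vectors are the stationary points of the following optimization problem.

$$\max_{v \neq 0} \frac{v^T X^T X v}{v^T v} \tag{6}$$

where $v \in R^m$. The stationary points of formula (6) can be calculated by singular value decomposition.

$$\frac{1}{\sqrt{n-1}} X = U \Sigma V^T \tag{7}$$

where $U \in R^{n \times n}$ and $V \in R^{m \times m}$ are orthogonal matrices, and the matrix $\Sigma \in R^{n \times m}$ contains the non-negative singular values in descending order along its main diagonal ($\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_{\min(m,n)} \geq 0$), with all other elements equal to 0. The loading vectors are the orthonormal column vectors of the matrix $V$, and the variance of the projection of the historical data set $X$ along the $i$th column of $V$ equals $\sigma_i^2$.
Formula (7) is equivalent to formula (8), the eigenvalue decomposition of the sample covariance matrix $S$.

$$S = \frac{1}{n-1} X^T X = V \Lambda V^T \tag{8}$$

The diagonal matrix $\Lambda = \Sigma^T \Sigma \in R^{m \times m}$ contains the eigenvalues in descending order, where the $i$th eigenvalue equals the square of the $i$th singular value ($\lambda_i = \sigma_i^2$).
The loading vectors corresponding to the larger singular values are retained according to the cumulative variance contribution rate

$$\eta_a = \frac{\sum_{i=1}^{a} \lambda_i}{\sum_{i=1}^{m} \lambda_i}$$

The $a$ retained columns of $V$ form the loading matrix $P \in R^{m \times a}$, and the score matrix is

$$T = XP$$

Therefore, the matrix $T \in R^{n \times a}$ contains the projection of the data set $X$ in a low-dimensional space.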
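The PCA procedure of this subsection can be illustrated with a short NumPy sketch. The function name, the toy data, and the 0.85 cumulative variance threshold are assumptions made for illustration; the paper does not state the threshold it uses.

```python
import numpy as np

def pca_svd(X, variance_threshold):
    """PCA of a mean-centered n x m matrix X via singular value decomposition.

    Returns the loading matrix P (m x a), the score matrix T (n x a),
    and all eigenvalues of the sample covariance matrix in descending order.
    """
    n = X.shape[0]
    # Scaling by 1/sqrt(n-1) makes the squared singular values equal to
    # the eigenvalues of S = X^T X / (n - 1).
    _, s, Vt = np.linalg.svd(X / np.sqrt(n - 1), full_matrices=False)
    eigvals = s ** 2
    ratio = np.cumsum(eigvals) / eigvals.sum()  # cumulative variance contribution
    # Smallest a whose cumulative contribution reaches the threshold.
    a = min(int(np.searchsorted(ratio, variance_threshold)) + 1, len(s))
    P = Vt[:a].T  # retained loading vectors as columns
    T = X @ P     # projection onto the low-dimensional space
    return P, T, eigvals

# Toy data (hypothetical): 50 samples, 4 variables, two of them correlated.
rng = np.random.default_rng(1)
raw = rng.normal(size=(50, 4))
raw[:, 1] = 2.0 * raw[:, 0] + 0.1 * raw[:, 1]
X = raw - raw.mean(axis=0)
P, T, eigvals = pca_svd(X, variance_threshold=0.85)
```

Using the SVD rather than an explicit eigendecomposition of the covariance matrix avoids forming the product of X with its transpose, which is the usual numerical reason for preferring it.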

| RELM optimized by PCA
The change in installed capacity of offshore wind power is a complex system. The traditional RELM cannot effectively model the data of offshore wind power installation. Aiming at this problem, this paper proposes a novel robust extreme learning machine based on principal component extraction. By performing PCA on the RELM hidden layer, the principal component features of the data are extracted, and the linear correlation between variables is removed to simplify the research problem.
PCA can effectively process high-dimensional noisy data. Principal feature extraction is performed on the RELM hidden layer to reduce the data dimension and remove redundant information between data. Training the output layer on the extracted principal component features reduces the influence of the number of hidden layer nodes on accuracy and allows the number of nodes in the RELM hidden layer to be set quickly, which makes RELM robust.
Compared with the traditional ELM, the PCA-RELM model has the advantages of simple design, good robustness, and high precision, and it can guide the prediction of the installed capacity of offshore wind power. The example shows that the novel model requires neither extensive empirical data nor repeated training to tune its parameters, so that higher-precision prediction results can be obtained while the machine learning design remains simple and quick.
The PCA-RELM network structure diagram is shown in Figure 1.
Given a network training sample set U that includes I groups of samples, a PCA-RELM model is established based on the training sample set U. The steps are as follows.
Step 1: Data normalization. The experimental data are normalized to a uniform dimension.

$$Y_i' = \frac{Y_i - Y_{\min}}{Y_{\max} - Y_{\min}}$$

where $Y_{\min}$ and $Y_{\max}$ are the minimum and maximum values in the output vector $Y_i$, respectively.
Step 2: Randomly generate the input weights $W_1, W_2, \ldots, W_k$, where $k$ is the number of hidden layer nodes.
Step 3: Randomly generate the hidden layer neuron thresholds $b_1, b_2, \ldots, b_k$, and calculate the hidden layer output matrix $H$ according to formula (3).
Step 4: Extract the principal components of the hidden layer output matrix $H$. PCA is applied to the matrix $H$ to obtain a new sample set.
Step 5: Calculate the output weight matrix according to formula (4).
Step 6: Train and test the PCA-RELM model.
Step 6: Train and test the PCA-RELM model.
Based on the above steps, the general flow of the PCA-RELM model proposed in this paper is shown in Figure 2.
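Steps 1 to 6 can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the sigmoid activation, the 0.99 variance threshold, the centering of the hidden layer output before PCA (with the target mean added back at prediction time), and all names are assumptions.

```python
import numpy as np

def _hidden(X, W, b):
    """Hidden layer output matrix with an assumed sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def fit_pca_relm(X, Y, k, var_threshold, rng):
    # Step 1: min-max normalization of the outputs.
    y_min, y_max = Y.min(axis=0), Y.max(axis=0)
    Yn = (Y - y_min) / (y_max - y_min)
    # Steps 2-3: random input weights and thresholds, then H.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], k))
    b = rng.uniform(-1.0, 1.0, size=k)
    H = _hidden(X, W, b)
    # Step 4: PCA on H (centered first, an assumption of this sketch).
    h_mean = H.mean(axis=0)
    _, s, Vt = np.linalg.svd(H - h_mean, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    a = min(int(np.searchsorted(ratio, var_threshold)) + 1, len(s))
    P = Vt[:a].T
    T = (H - h_mean) @ P  # new sample set: principal component scores
    # Step 5: output weights from the generalized inverse of the scores.
    yn_mean = Yn.mean(axis=0)
    beta = np.linalg.pinv(T) @ (Yn - yn_mean)
    return {"W": W, "b": b, "h_mean": h_mean, "P": P, "beta": beta,
            "yn_mean": yn_mean, "y_min": y_min, "y_max": y_max}

def predict_pca_relm(model, X):
    """Step 6: map inputs through the hidden layer, PCA, and output weights."""
    T = (_hidden(X, model["W"], model["b"]) - model["h_mean"]) @ model["P"]
    Yn = T @ model["beta"] + model["yn_mean"]
    return Yn * (model["y_max"] - model["y_min"]) + model["y_min"]

# Toy data (hypothetical): 40 samples, 3 inputs, 1 output.
rng = np.random.default_rng(2)
X_toy = rng.uniform(-1.0, 1.0, size=(40, 3))
Y_toy = (X_toy ** 2).sum(axis=1, keepdims=True)
model = fit_pca_relm(X_toy, Y_toy, k=15, var_threshold=0.99, rng=rng)
Y_hat = predict_pca_relm(model, X_toy)
```

Because the regression is carried out on the retained scores rather than on all hidden nodes, the fitted model depends on the number of retained components instead of on the exact value chosen for k.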

| Variational mode decomposition
Variational mode decomposition (VMD) is a new adaptive signal processing method based on Wiener filtering. In the VMD algorithm, each intrinsic mode function (IMF) is defined as an AM-FM signal.

$$u_k(t) = A_k(t) \cos(\phi_k(t)) \tag{11}$$

where $A_k(t)$ is the instantaneous amplitude of the signal, $\phi_k(t)$ is its phase, and $u_k(t)$ can be regarded as a harmonic signal. Assuming that each mode has a finite bandwidth around a center frequency, and that the center frequency and bandwidth are continuously updated during the decomposition, the variational problem can be expressed as seeking $k$ modal functions $u_k(t)$ such that the sum of the estimated bandwidths of all modal functions is minimal, subject to the constraint that the sum of the modes equals the input signal $f$. The specific steps are as follows.
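Step 1: Apply the Hilbert transform to each modal function $u_k(t)$ to obtain its analytic signal.

$$\left[ \delta(t) + \frac{j}{\pi t} \right] * u_k(t)$$

Step 2: Mix each analytic signal with an exponential tuned to its estimated center frequency $\omega_k$, which shifts the spectrum of each mode to the baseband.

$$\left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t}$$

Step 3: Estimate the bandwidth of the frequency-shifted signal through its $H^1$ Gaussian smoothness, that is, the squared $L^2$ norm of its gradient, resulting in the following constrained variational problem.

$$\min_{\{u_k\}, \{\omega_k\}} \left\{ \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_k u_k = f$$

where $\{u_k\}$ are the modes obtained by the decomposition and $\{\omega_k\}$ are the center frequencies of the components.
Step 4: Solve the above variational problem.
A quadratic penalty factor $\alpha$ and a Lagrange multiplier $\lambda(t)$ are introduced. $\alpha$ ensures the reconstruction accuracy in the presence of Gaussian noise, and $\lambda(t)$ ensures that the constraint is strictly enforced. Through repeated iterations, the "saddle point" of the augmented Lagrangian is found, which solves the minimization problem. The augmented Lagrangian is as follows.

$$L(\{u_k\}, \{\omega_k\}, \lambda) = \alpha \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2 + \left\| f(t) - \sum_k u_k(t) \right\|_2^2 + \left\langle \lambda(t), f(t) - \sum_k u_k(t) \right\rangle$$

where the optimal solution consists of the intrinsic mode functions $\{u_k\}$ and their respective center frequencies $\{\omega_k\}$.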
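The bandwidth measure that VMD minimizes (Steps 1 to 3) can be illustrated numerically. The sketch below is not a full VMD solver: it only builds the analytic signal, shifts it to baseband, and measures the mean squared time derivative for one mode. The function names, the sampling rate, and the test frequencies are assumptions chosen for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Step 1: analytic signal of a real sequence, built from a one-sided FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * h)

def demodulated_bandwidth(u, omega, fs):
    """Steps 2-3: shift the mode to baseband at angular frequency omega (rad/s)
    and return the mean squared magnitude of its time derivative."""
    t = np.arange(len(u)) / fs
    baseband = analytic_signal(u) * np.exp(-1j * omega * t)
    derivative = np.diff(baseband) * fs  # finite-difference time derivative
    return float(np.mean(np.abs(derivative) ** 2))

# A pure 50 Hz mode sampled at 1 kHz for one second (hypothetical values).
fs = 1000.0
t = np.arange(1000) / fs
mode = np.cos(2.0 * np.pi * 50.0 * t)
bw_matched = demodulated_bandwidth(mode, 2.0 * np.pi * 50.0, fs)
bw_mismatched = demodulated_bandwidth(mode, 2.0 * np.pi * 60.0, fs)
```

When the assumed center frequency matches the mode, the baseband signal is nearly constant and the measure is close to zero; a mismatched center frequency leaves a residual oscillation and a much larger value. This is the quantity the alternating updates of Step 4 drive down for every mode.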

| VMD parameter optimization process based on entropy value
To address the problem that the historical time series characteristics of the cumulative installed capacity of offshore wind power are weak and difficult to extract, a signal feature extraction method based on entropy theory is proposed.
Because the historical change of the cumulative installed capacity of offshore wind power can be decomposed into a variety of signals, it can be considered, following Ref. 36, that the more ordered a signal is, the smaller its entropy. Entropy can therefore be used as a measure of the effectiveness of signal decomposition: when the information entropy reaches its minimum, the VMD decomposition can be considered effective. On this basis, this paper proposes a method to determine the VMD parameters based on the principle of entropy.
The proposed method aims to extract, from the original signal, the signal components that contain the fluctuation characteristics of the cumulative installed capacity of offshore wind power, so that weak changes can be captured. First, the original signal is subjected to VMD to obtain a set of IMF components. When optimizing the parameters, the three IMFs with the highest entropy values are removed, and the IMF components with smaller entropy values are selected as the effective IMF components. Envelope demodulation analysis is then carried out to extract the characteristic fluctuation frequency of the historical time series of the cumulative installed capacity of offshore wind power. The flow chart is shown in Figure 3.
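The selection rule described above (drop the three components with the highest entropy) can be sketched as follows. The paper does not spell out which entropy definition it uses, so the Shannon entropy of the normalized energy distribution is an assumption here, as are the function names and the toy components.

```python
import math

def shannon_entropy(component):
    """Shannon entropy of the normalized energy distribution of one component
    (an assumed definition; other entropy measures are possible)."""
    energy = [v * v for v in component]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)

def select_effective_imfs(imfs, n_drop=3):
    """Return the indices of the components kept after removing the n_drop
    components with the largest entropy, together with all entropy values."""
    entropies = [shannon_entropy(c) for c in imfs]
    ranked = sorted(range(len(imfs)), key=lambda i: entropies[i], reverse=True)
    dropped = set(ranked[:n_drop])
    kept = [i for i in range(len(imfs)) if i not in dropped]
    return kept, entropies

# Five toy components (hypothetical), from evenly spread to concentrated energy.
toy_imfs = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [2.0, 1.0, 1.0, 1.0],
]
kept, entropies = select_effective_imfs(toy_imfs)
```

The same entropy value could serve as the objective when searching over the VMD parameters k and α, with the parameter pair giving the smallest entropy being retained.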

| e-VMD-PCA-RELM FORECASTING MODEL
Because historical data on the cumulative installed capacity of offshore wind power in China are insufficient, it is difficult to meet the data volume requirements of intelligent algorithms, and the original data of the influencing factors tend to be homogeneous. Moreover, because the input data carry information at different scales, traditional machine intelligence algorithms cannot fully discover the time-frequency characteristics of the time series, which greatly affects the performance of the forecasting model and limits the accuracy of the RELM algorithm. Therefore, this paper first decomposes the original data by e-VMD and extracts the intrinsic mode components that contain the volatility characteristics of the cumulative installed capacity of offshore wind power, thereby enlarging the amount of data and meeting the data requirements of the machine learning algorithm. The prediction steps of the overall e-VMD-PCA-RELM model are as follows.
Step 1: Time series data decomposition. e-VMD is used to decompose x(t) into n IMF components and one residual r n .
Step 2: Based on the principle of entropy, the three IMF components with the largest entropy values are eliminated, leaving n − 3 IMF components and one residual r n .
Step 3: Build training and test sample sets for each IMF component and for r n . The input and the output of each component are constructed.
Step 4: Train a PCA-RELM model for each component. In the PCA-RELM algorithm, an arbitrary number of hidden layer nodes can be selected. The PCA-RELM model is established to obtain the prediction results for each component.
Step 5: Output prediction results. The predicted results of all IMF components and r n are summed to obtain the final predicted cumulative installed capacity of offshore wind power in China.
In summary, the overall forecasting flow chart of scale of offshore wind power in China is shown in Figure 4.
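The decompose, predict, and sum structure of Steps 3 to 5 can be expressed compactly. In the sketch below the per-component predictor is a deliberately simple drift extrapolation that stands in for a trained PCA-RELM model; that stand-in, the function names, and the toy components are all hypothetical.

```python
def forecast_by_components(components, fit_predict):
    """Forecast every component separately and sum the forecasts (Step 5).

    components  -- list of equal-length series (retained IMFs plus residual)
    fit_predict -- callable mapping one series to a list of future values
    """
    forecasts = [fit_predict(series) for series in components]
    horizon = len(forecasts[0])
    return [sum(f[h] for f in forecasts) for h in range(horizon)]

def naive_drift(series, horizon=2):
    """Hypothetical stand-in for a per-component model: extend the last slope."""
    slope = series[-1] - series[-2]
    return [series[-1] + slope * (h + 1) for h in range(horizon)]

# Two toy components (hypothetical): a trend and a constant residual.
toy_components = [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]
combined = forecast_by_components(toy_components, naive_drift)
```

Passing the predictor in as a function keeps the summation logic independent of the model used for each component.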

| Data processing
A total of 18 impact indicators, such as electricity consumption, are used as preliminary indicators; they are shown in Table 1. The classification and specific data of the preliminary indicators are taken from the China Statistical Yearbook and the official website of the China Energy Administration. Among them, the offshore wind energy resource data are derived from statistical data at a height of 50 m in the China Sea. Industrial technology and investment data are derived from China's Shanghai offshore wind power statistics. Since the cumulative installed capacity of offshore wind power is a complex nonlinear problem with many influencing factors, the relationship between each index and the cumulative installed capacity is vague. Therefore, this paper uses the gray correlation method to screen the influencing factors. Effective screening of the preliminary impact indicator set improves the running speed and accuracy of the forecasting model.

| Normalization of indicators
All corrected data are normalized by using the Z-score standardization method.

$$y_i = \frac{x_i - \bar{x}}{\sigma}$$

where $y_i$ is the normalized data, $x_i$ is the original data, $\bar{x}$ is the average of the original series, and $\sigma$ is the standard deviation of the original series.
Since the average water area roughness and the average investment payback period are negative indicators, the inverse-indicator forward processing method is adopted to convert them into positive indicators. Here X i denotes the data after forward processing, max is the maximum value of the normalized data, and min is the minimum value of the normalized data.
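The two preprocessing operations can be sketched with the standard library. The Z-score function follows the usual definition; the forward processing function uses a reflection within the range of the data, which is only one common choice and is an assumption here because the paper's exact formula is not available. The population standard deviation and all names are likewise assumptions.

```python
import statistics

def z_score(values):
    """Z-score standardization: subtract the mean, divide by the standard
    deviation (population form assumed; the sample form is another option)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def forward_negative_indicator(values):
    """Assumed forward processing for a negative indicator: reflect each value
    within [min, max] so that smaller raw values become larger."""
    hi, lo = max(values), min(values)
    return [hi + lo - v for v in values]

# Toy indicator series (hypothetical values).
raw_series = [2.0, 4.0, 6.0, 8.0]
normalized = z_score(raw_series)
forwarded = forward_negative_indicator([1.0, 2.0, 4.0])
```

The reflection keeps the scale of the normalized data unchanged and only reverses the ordering, so positive and negative indicators can be compared in the same direction.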

| Gray correlation method
In this paper, the normalized and forward-processed data of the preliminary impact indicators from 2000 to 2018 are input to the gray correlation algorithm. The gray correlation degrees are shown in Table 2. The four indicators with a gray correlation degree below 0.5500 are excluded, 33 namely the average water area roughness, the sea abandonment rate, the average investment payback period, and the average investment return rate, which yields the final influencing factors of the installed capacity of offshore wind power in China. The normalized index values (2000-2018) of the main influencing factors are shown in Table A1 in Appendix A.
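The screening step can be illustrated with a small implementation of the gray relational grade in its common (Deng) form. The resolution coefficient of 0.5, the function names, and the toy series are assumptions; only the 0.55 cut-off comes from the text.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Gray relational grade of each factor series against the reference series.

    reference -- normalized target series (here: cumulative installed capacity)
    factors   -- dict mapping indicator name to a normalized series
    rho       -- resolution coefficient (0.5 is a conventional, assumed value)
    """
    deltas = {name: [abs(r - v) for r, v in zip(reference, series)]
              for name, series in factors.items()}
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        return {name: 1.0 for name in deltas}
    return {name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in ds) / len(ds)
            for name, ds in deltas.items()}

def screen_indicators(grades, threshold=0.55):
    """Keep the indicators whose grade is at least the threshold."""
    return [name for name, grade in grades.items() if grade >= threshold]

# Toy series (hypothetical): one indicator tracks the reference, one does not.
reference = [0.0, 0.5, 1.0]
factors = {"close": [0.0, 0.5, 1.0], "far": [1.0, 1.0, 0.0]}
grades = grey_relational_grades(reference, factors)
selected = screen_indicators(grades)
```

In this toy case the indicator that follows the reference exactly receives a grade of 1, and the other falls below the cut-off and is excluded.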

| e-VMD-PCA-RELM model
The original historical sequence is first decomposed by e-VMD. In the combined forecasting algorithm, the time series of the cumulative installed capacity of China's offshore wind power from 1990 to 2018 is input into the e-VMD algorithm, and 4 IMFs and one residual are obtained. The decomposition results are shown in Figure 5.
From Figure 5, it can be seen that each IMF shows clear periodic behavior. IMF1 has the highest frequency and the shortest wavelength, and the frequency decreases in order from IMF2 to the residual. The time series of the cumulative installed capacity has obvious multi-scale features, and the 4 IMF components carry high- and low-frequency variations at different fluctuation scales. Among them, the high-frequency IMF1 reflects the random noise in the original time series, while the residual has a low frequency and changes steadily; in effect, it represents the average trend of the original data sequence of the cumulative installed capacity of offshore wind farms in China. The components are independent of each other, and the individual IMF components are mutually orthogonal.
The influencing factor indicator set and each component are used as input of the training set and testing set of the PCA-RELM model.
The training parameters unified in this paper are shown in Table 3.
This paper reconstructs the prediction series to obtain the prediction results and relative errors of the components and the residual from the time series decomposition of the cumulative installed capacity of offshore wind farms in China since 1990. The predicted results for each series are shown in Figure 6.
Figure 6 shows that the proposed e-VMD-PCA-RELM prediction model has high prediction accuracy for each IMF and for the residual. The average relative errors of the components are −3.368%, −0.789%, −2.211%, 1.895%, and 2.211%, and the total average relative error is −0.453%. Among them, the average relative prediction error of the residual series does not exceed 2.211%, indicating that the residual has a low frequency and changes steadily, reflecting the trend information of the cumulative installed capacity of China's offshore wind power. These results support the accuracy and applicability of the proposed e-VMD-PCA-RELM prediction model.

| Predictive effect analysis
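To assess the superiority of the proposed model, this paper also inputs the data of the main influencing factors into the LSSVM, RELM, PCA-RELM, and VMD-PCA-RELM models. The results of the five prediction models are plotted together in the accompanying figure. To compare the accuracy of the models objectively, three common statistical indicators are adopted, namely the root mean square error (RMSE), the coefficient of determination ($R^2$), and the mean relative error (MRE), calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (q_i - \hat{q}_i)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (q_i - \hat{q}_i)^2}{\sum_{i=1}^{n} (q_i - \bar{q})^2}$$

$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|q_i - \hat{q}_i|}{q_i}$$

where $q_i$ is the actual value, $\hat{q}_i$ is the predicted value, $\bar{q}$ is the sample mean, and $n$ is the number of samples. The calculation results for the five forecasting models are compared in Table 4.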
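The three indicators named above can be computed as follows, assuming their usual definitions (in particular, the mean relative error is taken here with absolute values, which is an assumption). The function names and the toy numbers are illustrative only.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mre(actual, predicted):
    """Mean relative error, using absolute deviations relative to the actual values."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Toy values (hypothetical).
actual = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
scores = (rmse(actual, predicted), r_squared(actual, predicted), mre(actual, predicted))
```

Lower RMSE and MRE and an R² closer to 1 indicate a better fit, which is how the five models would be ranked.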
According to the calculation results of the statistical indicators, we conclude the following points:
1. The accuracy of the signal decomposition prediction algorithm is higher than that of the single algorithm.
2. The prediction accuracy of the model under the combined algorithm is higher than that of the single algorithm.
3. The prediction effect of the RELM algorithm itself is not as good as that of the LSSVM, but the combination of algorithms can greatly improve its prediction effect.

The values of the statistical indicators show the good performance of the e-VMD-PCA-RELM model, indicating that signal decomposition plays an important role in the combined prediction algorithm.

| Final prediction output
The paper applied GM(1,1) to predict the values of the main influencing factors from 2019 to 2035, which are used as input data of the e-VMD-PCA-RELM forecasting model. Finally, the paper obtained the cumulative installed capacity of China's offshore wind power from 2019 to 2035, which is shown in Figure 8.
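For readers unfamiliar with GM(1,1), the standard form of the model can be sketched as follows: accumulate the series, fit the development coefficient a and the gray input b by least squares on the background values, and difference the fitted accumulated series to obtain forecasts. The function name and the toy history are assumptions, and the sketch does not handle the degenerate case a = 0.

```python
import math

def gm11_forecast(x0, horizon):
    """Forecast the next `horizon` values of a positive series with GM(1,1)."""
    n = len(x0)
    # Accumulated generating operation (1-AGO).
    x1 = []
    running = 0.0
    for value in x0:
        running += value
        x1.append(running)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[i] + x1[i - 1]) for i in range(1, n)]
    y = x0[1:]
    # Least squares for y = -a * z + b (simple linear regression in closed form).
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    a = -slope

    def x1_hat(k):
        # Time response of the accumulated series; k = 0 is the first observation.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse accumulation: difference consecutive fitted accumulated values.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + horizon)]

# Toy history (hypothetical): 10% growth per period.
history = [100.0 * 1.1 ** i for i in range(8)]
future = gm11_forecast(history, horizon=2)
```

GM(1,1) is suited to short, roughly exponential series, which is why it is a natural choice for extrapolating indicators with only a few annual observations.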

| CONCLUSION
Promoting the construction and grid access of China's offshore wind power projects is a key part of current energy planning work. It is significant for expanding offshore wind power installed capacity, updating the energy structure, and decreasing carbon emissions. This paper uses the e-VMD-PCA-RELM forecasting model to predict the future cumulative installed capacity of offshore wind power in China. The compound growth rate reaches 16.90%, and the capacity shows a trend of increasing year by year. Comparison and verification across various models show that the combined prediction model used in this paper has the highest prediction accuracy. The forecasting results show that by the end of 2035, the cumulative installed capacity of China's offshore wind power will reach more than 60.924 GW. Therefore, it can be preliminarily concluded that by 2035, the proportion of China's offshore wind
1. According to GM(1,1), this paper has obtained a gray forecasting value for offshore power generation costs. In the next 15 years, the comprehensive life rate of unit life will reach 19 297.37 hours, and the comprehensive average internal rate of return will reach 9.54%. The increase in the internal rate of return is mainly due to the fact that the costs of wind power affiliated facilities and management will gradually decrease as technology and management experience accumulate. Although the cost of offshore wind power generation is higher than that of onshore wind power generation, it is falling much faster than that of onshore wind power, reflecting the stronger development prospects of offshore wind power.
2. While promoting the absorption of offshore wind power, the government needs to subsidize the on-grid tariff of offshore wind power to ensure that offshore wind power has priority for power generation. The corresponding price policy can provide price support for future offshore wind power generation, which helps attract investors into the offshore wind power industry.