Method to describe the distribution of wind velocity and its application in wind resource evaluation

A convenient and effective method based on vector analysis is proposed in this paper to quantitatively describe the shape of the wind speed distribution curve. The method describes the steepness of a single‐peak curve with two parameters, the concentration degree (CD) and the concentration period (CP). Through the analysis of wind speed data in China over the past 40 years, this paper found that regions with larger CD had poorer wind resources, while regions with larger CP had richer wind resources. In addition, there was an evident cubic relationship between the CP and the average wind power density. These two parameters can reflect the richness of wind energy and enable comparisons of wind resources across time and spatial scales. In comparison with conventional approaches, this method is simpler and avoids the fitting process, which gives it broad application prospects in the field of power grids.


| INTRODUCTION
Since the start of the 21st century, the climate has undergone drastic changes due to excessive emissions of greenhouse gases. To protect the ecological environment and ensure energy supply, countries around the world have put forward energy transition goals, and the transition process has accelerated significantly. 1,2 Wind power is a major source of renewable energy, with the potential to generate more than 40 times the world's annual electricity consumption. 3 The use of wind power helps meet the rapidly growing demand for energy. With the gradual maturing of wind power technology and the support of relevant policies, wind power has developed rapidly worldwide 4,5 and has become an important part of grid operation in most countries with large investments. 6 At the same time, the randomness and volatility of wind power pose severe challenges to large-scale grid connection. 6,7 Accurate evaluation of wind resource characteristics is the basis for wind power prediction and for the further development of the wind power industry. 3,8 Wind power can be determined by statistical analysis, in which case wind speed variations are characterized by a probability distribution function. 9 At present, most existing methods for describing wind speed distribution patterns use fitting functions to construct statistical models, such as the two-parameter Weibull distribution model, the log-normal distribution model, and the Rayleigh distribution model. [10][11][12] These methods fit the wind speed distribution well, but some real information in the data is smoothed out during the fitting process, and the calculation is complicated.
The wind speed distribution pattern varies greatly from place to place, and numerous studies have shown that it is almost impossible to identify a single distribution that fits universally. 9,[13][14][15][16] If only one distribution function is used to fit all wind speed curves, the quality of the results is hard to guarantee. To achieve a refined calculation of wind resources, researchers have to process the wind speed curves of each place one by one and find their best-fit models separately; the cost of such "precise calculations" is massive. 14 Therefore, fitting methods are most commonly used for precise calculations at a single station or over a small area and a short period. 17 More importantly, cross-regional wind energy comparisons become difficult when the criteria for wind energy assessment are inconsistent, which remains a pressing challenge. 13 To address these issues, this paper adopts a meteorological statistical method, with certain adjustments based on the characteristics of the wind speed distribution. With it, the wind speed distribution can be described conveniently and used for wind resource evaluation and comparison across regions and times.

| Method introduction
Zhang et al. 18 proposed a method to quantitatively describe the temporal distribution pattern of precipitation. This method condenses the concentration degree (CD) and the concentration month of precipitation into two values, making it possible to detect spatial features of the interannual variation in the precipitation distribution pattern. The method has been widely and successfully used in meteorology to describe the annual distribution pattern of hail, snow, runoff, and so forth. [19][20][21][22] Both the original and the related articles use this method to find the temporal distribution properties of a weather phenomenon. Essentially, however, the statistical significance of the method lies in describing the shape of a single-peaked curve, which is exactly what is needed to describe the shape of the wind speed distribution curve in wind energy assessment. On this basis, this paper shifts the statistical perspective of the original method from time to wind speed and delineates the statistical intervals based on the wind speed distribution characteristics, thus linking the method to the wind speed distribution curve and introducing it into the field of wind resource assessment.
The basic principle for calculating the CD and the concentration period (CP) is based on vector analysis. The histogram of wind speed distribution during the statistical period can be viewed as a pie chart, where the statistical numbers within each wind speed range represent the length of the vector (Figure 1).
A complete wind speed distribution range, regarded as a circle (360°), is divided into N parts, and the direction angle of the i-th wind speed range is $\theta_i = i \times 360^\circ / N$. On this basis, the vectors representing each wind speed interval are decomposed into horizontal (X) and vertical (Y) components, and the sums of the components along the X and Y axes are the total projections $R_x$ and $R_y$ of the wind speed distribution pattern. The ratio between the length of the synthesized vector $(R_x, R_y)$ and the total statistical number $R$ is defined as the CD of the wind speed distribution pattern, ranging from 0 to 1: $\mathrm{CD} = \sqrt{R_x^2 + R_y^2} / R$. CP represents the azimuth of the synthesized vector, from which the main distribution interval of wind speed can be calculated.
From the formulas above, CD and CP quantitatively reflect the distribution pattern of a curve in a simple form. To clarify the physical meaning of CD and CP, two limiting cases are given below. CD equals 0 when every wind speed interval contains the same number of records. Conversely, CD equals 1 when the wind speed falls in a single interval at all times, in which case the concentrated interval can be calculated from the azimuth angle of the synthesized vector. Generally speaking, this method is suitable for describing the distribution pattern of a single-peak curve; when the curve is double-peaked or has multiple inflection points, the method is not applicable.
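As a minimal sketch of the vector synthesis described above, the Python function below computes CD and CP from the counts in each wind speed interval and checks the two limiting cases. It assumes the convention of the original precipitation method, in which the X projection uses sin θ_i and the Y projection uses cos θ_i, so that CP is an azimuth measured from the Y axis. The function name `concentration` and this sign convention are illustrative assumptions, not the authors' code.

```python
import math

def concentration(counts):
    """Return (CD, CP in degrees) for interval counts r_1..r_N.

    Assumed convention: theta_i = i * 360 / N,
    R_x = sum(r_i * sin(theta_i)), R_y = sum(r_i * cos(theta_i)),
    and CP is the azimuth of (R_x, R_y) measured from the Y axis.
    """
    n = len(counts)
    total = sum(counts)  # R, the total statistical number
    if n == 0 or total <= 0:
        raise ValueError("counts must contain at least one positive value")
    rx = ry = 0.0
    for i, r in enumerate(counts, start=1):
        theta = math.radians(i * 360.0 / n)
        rx += r * math.sin(theta)
        ry += r * math.cos(theta)
    cd = math.hypot(rx, ry) / total
    cp = math.degrees(math.atan2(rx, ry)) % 360.0
    return cd, cp

# Limiting case 1: every interval holds the same count, so the vectors cancel.
cd_uniform, _ = concentration([292] * 30)

# Limiting case 2: all 8760 hours fall in the third of 30 intervals.
cd_single, cp_single = concentration([0, 0, 8760] + [0] * 27)

print(round(cd_uniform, 6), round(cd_single, 6), round(cp_single, 6))
```

In the first case CD is numerically zero and the azimuth carries no information. In the second case CD is 1 and CP equals the direction angle of the occupied interval (3 × 360°/30 = 36°), which maps back to that interval.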
In wind resource assessment, the probability distribution curve of wind speed is unimodal. The roles of CD and CP are similar to those of the shape parameter k and the scale parameter c in the two-parameter Weibull fitting model, but the calculation is simpler and no real data are lost to a fitting process. CD and CP can therefore be expected to serve as valid tools for reflecting the real wind concentration level and the prevailing wind speed within the study area, respectively.

| Data
The ERA5 global meteorological dataset provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) is used in this study. ERA5 combines vast amounts of historical observations into global estimates using advanced modeling and data assimilation systems. Hourly 10 m wind, pressure, and temperature data with a resolution of about 30 km are used to analyze the distribution of wind resources in China from 1981 to 2020.

| Wind resources exhibited by CD and CP
The wind speed interval selected in this paper is 1 m/s, and the unit research period is 1 year. Considering the temporal resolution of the ERA5 database, hourly data are taken as the minimum statistical unit. On this basis, the number of hours in each wind speed interval in a year is counted, and the CD and CP of each grid point in China are then obtained according to Equations (3) and (4). Table 1 gives the azimuth angle corresponding to each wind speed range.
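The counting step above can be sketched in Python as follows. The example derives the 10 m wind speed from its eastward and northward components, which is how ERA5 distributes 10 m wind, and counts the hours falling in each 1 m/s interval. The names `u10`, `v10`, and `hourly_counts`, the use of 30 intervals covering 0-30 m/s, and the choice to place any speed above the upper limit in the last interval are assumptions made for illustration.

```python
import math

def hourly_counts(u10, v10, width=1.0, n_bins=30):
    """Count hours per wind speed interval [k*width, (k+1)*width)."""
    counts = [0] * n_bins
    for u, v in zip(u10, v10):
        speed = math.hypot(u, v)  # wind speed from its two components
        # Speeds beyond the last interval are clamped into it (assumption).
        k = min(int(speed // width), n_bins - 1)
        counts[k] += 1
    return counts

# Four hypothetical hourly samples in m/s, not real ERA5 values.
u10 = [0.3, 3.0, -1.2, 6.0]
v10 = [0.4, 4.2, 0.5, 8.3]

counts = hourly_counts(u10, v10)
total = sum(counts)                        # R, the number of hours counted
probability = [r / total for r in counts]  # r_i / R for each interval
print(counts[:11])
print(probability[:11])
```

For a full year the two input lists would each hold 8760 (or 8784) hourly values for one grid point, and `counts` would then be the input to the CD and CP calculation.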
With these settings, N equals 30 and R equals the total number of hours in a year, a constant equal to 365 × 24 (or 366 × 24 in a leap year), so the expression for CD can be written in terms of the ratios $r_i / R$. The statistical count $r_i$ (number of hours) of each wind speed interval is thereby converted into the wind speed probability density $r_i / R$, which is closely related to the wind power calculation. In other words, the description of wind speed distribution patterns can directly reflect the richness of wind resources. Figure 2A,B shows the spatial distribution of the perennial mean state of CD and CP in China from 1981 to 2020. It can be seen from Figure 2A … the wind speed in the southwestern region is mostly concentrated at 0-1 m/s. Overall, CD and CP values in the same region show an obvious inverse correlation. This negative correlation derives mainly from the inherent features of the wind speed probability distribution curve. The growth of wind speed is a continuous and smooth process, so the probability distribution curve does not rise abruptly in the high wind speed range but transitions smoothly from the low wind speed range to the high wind speed range. These features keep the proportion of high wind speeds relatively modest, and the curves of wind-rich areas tend to be "short and fat," which results in a higher CP and a lower CD. From 1981 to 2020, the CD value decreased in the Inner Mongolia Plateau and northeast China but increased in other regions, with a significant increase on the Qinghai-Tibet Plateau (Figure 2C). There was no obvious inverse correlation between the CP trend and the CD trend. CP values showed an increasing trend in Inner Mongolia, Northeast China, Xinjiang, and southwest China, but a significant decreasing trend on the Qinghai-Tibet Plateau and a slight decrease in East and South China (Figure 2D).
To treat the wind speed data within each region as one data set, extract the regional wind speed probability distribution curves, and discuss the relationship between curve shape and the CP and CD values as well as their interdecadal variations, CD-average, CP-average, CD-trend, and CP-trend are used as the features for clustering the grid points. The k-means clustering method is adopted to classify the grid points in China, and five major divisions are finally obtained (not detailed here). The extent of each area is shown in Figure 3. Based on this regional classification, the wind speed data in each region can be considered a homogeneous data set, and all grid points in the region are included in the statistical sample to extract the CP and CD values and their trends on a regional basis, as shown in Table 2. The results obtained by recomputing over the five typical regions do not differ much from those in Figure 2. Area A, with the largest CD, has the smallest CP value, while Area D, with the smallest CD, has the largest CP value. Meanwhile, Regions D and E show a large interdecadal trend.
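Because the clustering is not detailed in the text, the following is only a sketch of how grid points could be grouped on the four features (CD-average, CP-average, CD-trend, CP-trend) with a plain k-means (Lloyd's algorithm). The z-score standardization, the random initialization, the function names, and the feature values are all assumptions made for illustration. The example uses two clusters for brevity, whereas the paper reports five areas.

```python
import math
import random

def standardize(rows):
    """Z-score each feature column so that no feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead (assumption).
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds))
            for row in rows]

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers (assumption)
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center by Euclidean distance.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:     # assignments stable: converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:              # an empty cluster keeps its old center
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels

# Hypothetical (CD-average, CP-average, CD-trend, CP-trend) rows.
features = [
    (0.80, 20.0, 0.001, -0.02),
    (0.78, 22.0, 0.002, -0.01),
    (0.55, 60.0, -0.001, 0.03),
    (0.57, 58.0, -0.002, 0.02),
]
print(kmeans(standardize(features), k=2))
```

Standardizing first matters here because CD lies between 0 and 1 while CP and the trends have very different magnitudes; without it the feature with the largest numeric range would control the grouping.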
To show how well CD and CP describe the shape of the wind speed distribution curve, the wind speed probability distribution curves of the five typical wind resource areas are plotted in Figure 4. The curve in an area with higher CD is steeper, and its peak corresponds to a smaller wind speed (i.e., a smaller CP value), while the curve in an area with lower CD is gentler, and the wind speed concentration range is relatively larger. This confirms the earlier inference about the cause of the negative correlation between CP and CD. In summary, CD and CP can effectively reflect the shape of the wind speed probability distribution curve and enable quantitative comparisons.

| Relationship with average wind power density
Wind power density is the most valuable reference quantity for measuring the wind energy reserves of a region. It refers to the energy of the airflow passing perpendicularly through a unit cross-sectional area in unit time. The average wind power density can be calculated by the following formula: $\bar{w} = \frac{1}{2} \rho \sum_i v_i^3 P(v_i)$, where $P(v_i)$ is the probability corresponding to a certain wind speed range $v_i$ and $\rho$ is the air density, a parameter determined by the local air pressure and temperature. The value of $\rho$ differs from area to area and is calculated from the local pressure and temperature. Based on the formulas above, the annual average values of CP, CD, and average wind power density at each grid point in China were calculated to find the correspondence among the three. The results are shown in Figure 5. The scatter diagrams indicate that CD has a quadratic negative correlation with the average wind power density, while CP has a cubic positive correlation with the average wind power density.
FIGURE 3 Schematic diagram of five typical areas.
TABLE 2 Regional means for CD, CP, and their interannual trend in five typical areas.
Consistent with the results in Section 3.1, CD and CP have a significant negative correlation. In physical terms, CP essentially represents the dominant wind speed interval of the wind speed probability distribution curve, with a unit of m/s. Since the air density varies very little from region to region, it can be approximated as a constant, so the wind power density is equivalent to the weighted average of $v_i^3$. It can be inferred that the cubic relationship between CP and $\bar{w}$ is stable. The CD value, in contrast, is tied to the division of the wind speed intervals but has no direct relationship with the wind speed itself. Therefore, we infer that the quadratic relationship between CD and $\bar{w}$ will change with the division standard of the wind speed intervals. In summary, a point with a higher average wind power density corresponds to a smaller CD value and a larger CP value. Therefore, the availability of wind resources can be judged and compared by the CD and CP values of the wind speed distribution curve.
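The calculation of average wind power density in this section can be illustrated with a short Python sketch. Since the air density equation is not reproduced in the text, the example assumes the ideal gas law for dry air, ρ = p / (R_d T) with R_d = 287.05 J/(kg·K), and it takes the midpoint of each 1 m/s interval as the representative speed v_i. Both choices, and the function names, are assumptions rather than the authors' stated procedure.

```python
def air_density(pressure_pa, temperature_k, gas_constant=287.05):
    """Dry-air density in kg/m^3 from the ideal gas law (assumed form)."""
    return pressure_pa / (gas_constant * temperature_k)

def mean_power_density(probabilities, rho, width=1.0):
    """Average wind power density in W/m^2: 0.5 * rho * sum(v_i**3 * P(v_i)).

    probabilities[k] is the share of hours in interval k, and v_i is taken
    as the interval midpoint (k + 0.5) * width, which is an assumption.
    """
    return 0.5 * rho * sum(
        ((k + 0.5) * width) ** 3 * p for k, p in enumerate(probabilities)
    )

# Hypothetical probabilities for 30 intervals: most hours at low wind speed.
probabilities = [0.30, 0.35, 0.20, 0.10, 0.05] + [0.0] * 25

rho = air_density(pressure_pa=101325.0, temperature_k=288.15)
print(rho, mean_power_density(probabilities, rho))
```

Because v_i enters as a cube, a small share of hours in the high wind speed intervals contributes disproportionately to the result, which is consistent with the cubic relationship between CP and the average wind power density discussed above.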
In this part, the effect of changes in CD and CP on the average wind power density is assessed from the interannual variations of the average wind power density, CP, and CD in the five typical regions. According to the analysis above, CP is negatively correlated with CD and has a cubic relationship with $\bar{w}$. This relationship is also reflected in the interannual variations of the three quantities. It can be seen from Figure 6 that the CP and average wind power density curves vary consistently, while the CD curve varies in the opposite direction. Both CP and average wind power density in Regions A-D show a decreasing trend over the last 30 years, with a significant decrease in Region D, while Region E is the only region where the average wind power density has increased. Generally, interdecadal fluctuations in the decadal curves can reflect climate trends. From the analysis above, we can infer that if CD has a positive trend and CP has a negative trend in a place, the wind resources there will decrease further, and vice versa. During the research period, such a trend appears in most of the regions, especially Region D, from which it can be predicted that the wind energy resources of the Tibetan Plateau and the coastline will be further attenuated if this climate trend continues.
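The sign rule stated above can be sketched in Python. The text does not say how the trends are estimated, so the example assumes an ordinary least-squares slope of the annual values against the year. The function names and the sample series are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

def resource_outlook(cd_trend, cp_trend):
    """Apply the sign rule: rising CD with falling CP means a decline."""
    if cd_trend > 0 and cp_trend < 0:
        return "decreasing"
    if cd_trend < 0 and cp_trend > 0:
        return "increasing"
    return "undetermined"

# Hypothetical annual series for one region, not values from the paper.
years = [2016, 2017, 2018, 2019, 2020]
cd_series = [0.60, 0.61, 0.61, 0.63, 0.64]
cp_series = [48.0, 47.5, 47.8, 46.9, 46.0]

cd_trend = linear_trend(years, cd_series)
cp_trend = linear_trend(years, cp_series)
print(cd_trend, cp_trend, resource_outlook(cd_trend, cp_trend))
```

The "undetermined" branch covers the cases where CD and CP move in the same direction, for which the text gives no rule.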

| DISCUSSION AND CONCLUSION
This article introduced a simple statistical method that can describe the shape of a curve. The method describes the steepness and peak interval of a single-peak curve through two indicators, CD and CP. It simplifies the calculation while retaining the descriptive ability of the two indicators. By comparing the CD and CP values across China, this paper found that the CP value of the Mongolian and Tibetan Plateaus is higher and the CD value is lower, from which it can be inferred that these regions are relatively rich in wind energy resources. The CP values and average wind power density of the Qinghai-Tibet Plateau and the eastern coastline have been decreasing rapidly over the last 30 years, and if the climate trend continues, the wind energy resources in these regions will diminish further.
CP and CD can also intuitively reflect the average wind power density. By constructing scatter plots of CP, CD, and average wind power density, this paper found that CP and average wind power density have an obvious cubic relationship and that their interannual trends are largely synchronized. In general, a larger CP value and a smaller CD value correspond to a larger average wind power density, that is, to more abundant wind energy resources. Through this relationship, the average wind power density at a given place can be estimated from its CD and CP values, which enables the comparison of wind resources across regions. Similarly, CP and CD values can be related to other wind resource indicators, such as average wind speed, for a more comprehensive estimation of wind resources.
Compared with traditional fitting methods, the method presented in this paper applies to the shape description of all single-peak curves and does not require finding the best-fit model for each site, which is valuable for large-scale wind resource assessment. The method is based entirely on real data and uses a common calculation procedure; these features greatly facilitate comparisons between different wind curves across time and at regional scales. However, because CD and CP are poorly suited to analytical derivation, it is difficult to derive a series of statistical eigenvalues from them or to construct a direct functional relationship with average wind power density. Therefore, this method focuses on the comparison of wind resources over large temporal and spatial scales rather than on fine-grained calculations. In fact, the two approaches are not opposed and can be used in conjunction. For example, when dealing with large amounts of wind speed data, the data can first be screened with this statistical method to select valuable wind speed data by comparison, and the selected curves can then be refined with a fitting method. In this way, computational resources are saved and computational accuracy is ensured.
During the analysis, the wind speed probability distribution curve was calculated with 1 year as the statistical unit. In practice, the statistical period can be modified flexibly according to actual needs, which means that this method can be used to evaluate wind resources over any length of time. It should be noted that the definition of the wind speed interval corresponding to each azimuth angle has a certain influence on the statistical results; once the intervals are fixed, the relationship between CP, CD, and average wind power density is determined. However, how to calculate the optimal interval width remains an open question, and this paper only takes a wind speed interval of 1 m/s as an example. For further use, reporting standardized intervals would greatly facilitate intercomparison between locations and times. Establishing such a standard requires extensive experiments and evaluation, and the related work will be carried out in future studies.
In addition to the wind power density estimation and the quantitative description of the wind speed probability distribution curve discussed in this article, this method can be extended to any field that requires the description of a single-peak curve. Examples include describing the seasonal distribution pattern of renewable resources (the original application field of this method), evaluating the siting of renewable energy stations, and quantifying the diurnal distribution of photovoltaic resources. [23][24][25] In general, this method has broad application prospects in the field of power grids.