Minute-scale detection and probabilistic prediction of offshore wind turbine power ramps using dual-Doppler radar

Funding information: H2020 Marie Skłodowska-Curie Actions, Grant/Award Number: 642108

Abstract

Predicting the occurrence of strong and sudden variations in wind power, so-called ramp events, has become one of the main challenges for the operation of power systems with large shares of wind power. In this paper, we investigate 14 ramp events of different magnitudes and minute-scale durations observed by a dual-Doppler radar system at the Westermost Rough offshore wind farm. The identified ramps are characterised using radar observations, turbine data and data from the Weather Research and Forecasting (WRF) model. A remote sensing-based forecasting methodology that propagates wind speeds upstream of wake-free turbines is extended here to the whole farm, by including corrections for wake effects. The methodology aims to probabilistically forecast the wind turbines' power in the form of density forecasts. The ability to predict ramp events of different magnitudes is evaluated and compared with probabilistic statistical and physical benchmarks. During the observed ramp events, the remote sensing-based forecasting model strongly outperforms the benchmarks. We show here that remote sensing observations such as radar data can significantly enhance very short-term forecasts of wind power.

Reliability Council of Texas (ERCOT) on 26 February 2008 when an unexpected strong and fast ramp-down event forced ERCOT to call for an Emergency Electric Curtailment Plan. 14 Ramp events also cause higher management costs of wind farms and can result in expensive penalties for electricity market traders (as a result of inaccurate bids). 15 Despite these negative consequences, forecasters tend to underestimate the impact of ramp events as state-of-the-art forecasting models focus on minimising the overall forecasting error. 16 Currently, there is no commonly accepted definition of ramp events nor a straightforward method to classify them. 17 In a review of the recent history of wind power ramp forecasting, Gallego-Castillo et al. 18 include a list of the literature-reported power gradient thresholds for different timescales used to define a ramp event. Reported amplitude thresholds range from 10% to 75% of the installed capacity and window lengths extend from 5 min to 6 h. As pointed out by Gallego-Castillo et al., 18 every detection threshold should be chosen according to a specific end-user application (e.g., grid operator, energy market trader and wind farm operator) and the temporal and/or spatial scale where the ramp event occurs.
Most works on forecasting ramp events are based on numerical weather prediction (NWP) models using either a deterministic approach 5,19,20 or ensemble members. 11,21 These models have lead times of one to several hours and can predict ramp events controlled by quasi-horizontal processes, such as large-scale fronts and mesoscale circulations. However, NWP systems have difficulties in predicting ramp events associated with vertical processes, such as turbulent mixing or convective processes, 7 particularly their arrival time. Moreover, the ability of NWP to reproduce and capture ramp events strongly depends on the resolution of the model 22 and its parameterisation schemes. 23 A recent case study of a ramp event in a cluster of offshore wind farms in the United Kingdom 11 has shown the importance of using forecasting models with a high temporal and spatial resolution. The system operator's forecasting model did not capture a strong local ramping event, which led to a dramatic increase in the system balancing cost. This was attributed to the poor resolution of the model (averaged hourly for all wind farms), and the authors speculated that the model's output was an average of different forecasting ensembles. Although they were able to anticipate the ramp event using a high-resolution multi-ensemble NWP model, there was a consistent 2-h phase error in the prediction.
Studies on the predictability of ramp events on shorter lead times (minutes to a few hours) are built on statistical methods such as data mining, 24 artificial neural networks, 25 hidden Markov models 26 or autoregressive logit models. 27 The main shortcoming of statistical models is their inability to predict the correct timing or arrival of transient events, unless those are based on variables measured upstream or variables that can anticipate strong changes in wind speed. Indeed, it has been demonstrated that incorporating atmospheric information in statistical models can improve ramp forecasting, as shown in Gallego-Castillo. 28 As the flexibility of the trading market is increasing, with gate closure times as short as 5 min, 29 and the economic dispatch is processed at time resolutions of 5 to 15 min, minute-scale forecasts of wind power are becoming more important. The question that arises here is how to generate forecasts that are able to capture unexpected strong ramp events. Given the limitations of the aforementioned physical and statistical models, it is clear that upstream observations close to the forecasting location could provide the required information to improve the prediction of ramp events. As local on-site measurements with a met mast are not necessarily representative of the inflow of a wind farm, the use of long-range lidars or radars to predict ramp events is a promising and not so 'unrealistic' option, given their optimal trade-off between temporal and spatial resolution and their extended range. This alternative was first pointed out by Zack 7 in 2007. For instance, some precipitation fields associated with severe offshore wind speed and power fluctuations have been captured by weather radars in the Horns Rev offshore wind farm. 30 Additionally, Hirth et al. 31 documented a ramp event at an onshore wind farm using a dual-Doppler (DD) radar system. 
A recent workshop conducted by the IEA Wind Task 32 and 36 on 'Very short-term forecasting of wind power' has called for further development and investigation of minute-scale forecasts of wind power based on remote sensing measurements. 32 Following this line of thought, remote sensing systems such as scanning lidars or Doppler radars have been proven to be suitable for minute-scale forecasting applications, as demonstrated by several authors. [33][34][35][36] We recently introduced and discussed how to use DD radar observations to probabilistically predict the power generated by wake-free wind turbines in an offshore wind farm with lead times of 5 min. 35 The overall goal of this work is to extend this methodology to waked turbines in order to generate predictions for the whole wind farm. The main focus is on evaluating the performance of the predictions during the occurrence of 14 ramp events.
Our paper is organised as follows: Section 2 introduces the wind farm and the different datasets. In Section 3, the definition of a ramp event is given, and 4 months of wind farm SCADA data are investigated. A significant part of this section characterises the meteorological conditions during the ramps, using simulations from the Weather Research and Forecasting (WRF) model and both SCADA and radar observations. Additionally, we evaluate the radar data's skill in ramp event detection at the turbine and farm level. Section 4 outlines a probabilistic-based methodology for forecasting wind turbine power based on DD radar data, with lead times of 5 min. Subsequently, in Section 5, the model's predictions are discussed and compared with the benchmark persistence and with forecasts from a multi-ensemble NWP model. Conclusions and a brief outlook on the application of DD radar technology for ramp event forecasting are included in Section 6.

DATA DESCRIPTION
In this study, we use wind speed and direction observations collected during the BEACon research project. 37 The DD radar system 31,38 collected three-dimensional wind field measurements with a temporal resolution of 1 min. Each radar scanned a 60° sector at 13 different elevation tilts, ranging from 0.2° to 1.4°, which constitutes a volumetric measurement. The maximum range of each radar is 32 km, and the spatial resolution is 0.5° in the azimuth dimension, given by the beam width, and 15 m along the beam direction. The two horizontal wind speed components were retrieved by using the line-of-sight (radial) velocity measurements from the two radars and interpolating them into a three-dimensional Cartesian grid over the overlapping region measured by the radars. The final resolution of the interpolated measurements is 50 m horizontally and 25 m vertically. The area considered in this analysis is depicted in Figure 1B. Further information about the Doppler radar technology can be found in Hirth et al. 39 and Vignaroli et al. 40 For this study, 2512 1-min DD wind fields (41.86 h) were considered. The samples were collected continuously by the radar system during nine periods of different duration between November 2016 and January 2017.
For our analysis, we focus on wind speed and direction observations at the height of 100 m. A complete list of the variables employed to analyse and characterise the ramps is provided in Table 1. Following the work of Hirth et al., 31 we estimate a turbulence intensity (TI) at the height of 100 m based on the DD data (TI DD ), defined as the ratio between the standard deviation of the wind speed of all wind vectors at that height, collected within a 10-min segment, and the mean value of the same segment.
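The TI definition above can be illustrated with a minimal sketch. The function name, the sample values and the use of the population standard deviation are assumptions made for illustration; the text does not state which estimator was used.

```python
from statistics import mean, pstdev


def turbulence_intensity(wind_speeds):
    """Ratio of the standard deviation to the mean of one segment of wind speeds."""
    mu = mean(wind_speeds)
    if mu <= 0:
        raise ValueError("mean wind speed must be positive")
    # Assumption: population standard deviation; a sample estimator would also be plausible.
    return pstdev(wind_speeds) / mu


# Hypothetical 100 m wind speeds (m/s) pooled from all DD vectors of one 10-min segment.
segment = [9.0, 10.0, 11.0, 10.0]
ti_dd = turbulence_intensity(segment)
```

The same function could be applied to the 10-min SCADA wind speed statistics to obtain a turbine-based TI, provided the inputs are free of missing values.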
Together with the DD wind fields data, four continuous months of wind speed, wind direction and power output from the wind farm's SCADA system were made available for this work. The SCADA data were averaged over 1 min and temporally synchronised with the DD data. All periods during start-ups, shutdowns and abnormal performance were removed. The 10-min statistics of the wind speed are used to determine a wind turbine turbulence intensity (TI wt ).
To investigate the meteorological mechanisms behind the identified ramp events, WRF model 42 data from the mesoscale production run of the New European Wind Atlas (NEWA) 43 was used. NEWA covers large parts of Europe for a 30-year period from 1988 to 2017 with 3-km spatial resolution. These data were publicly released in June 2019. The WRF model configuration for the NEWA production run is based on the evaluation of a large ensemble of WRF simulations with different model set-ups and compared with measurement data from tall met masts. We extracted time series of horizontal wind speed, wind direction, Monin-Obukhov length and temperature at several heights for selected days from the WRF grid point nearest to the position of the Westermost Rough wind farm.
To assess the capabilities of radar-based predictions, we compare them with forecasts from a high-resolution NWP model with 75 ensemble members. The multischeme ensemble prediction system (MSEPS) was run by the forecast provider Weather & Energy PROGnoses (WEPROG). 44 For this study, hourly wind speed and direction predictions at the height of 100 m are employed, which were selected from NWP model runs initialised 5 to 7 h ahead of the studied periods.

Ramp definition
As mentioned before, there are no established criteria to define a ramp event, but generally speaking, it represents a strong and sudden variation of power in a wind farm or a cluster of wind farms. A ramp event is normally parameterised by its starting time ($t_s$), its duration ($\Delta t$) and the ramp magnitude ($\Delta P$). In Figure 2, three ramps (two ramp downs and one ramp up) of different magnitude and durations can be distinguished.
Most works on ramp event detection use binary criteria to classify the wind power time series into ramp events and nonramp events. 18 Thus, given a wind power time series $P_n(t)$, an indicator ramp function $R(t)$ can be used to classify the time intervals by using a criterion function $C(t)$ and a threshold $C_0$:

$$R(t) = \begin{cases} 1, & |C(t)| \geq C_0 \\ 0, & \text{otherwise} \end{cases}$$

There are different criterion functions employed to define a ramp event, with the fixed-time interval method being the most commonly used. 17 This applies a moving time window $t_w$, where the change in power $C(t)$ is evaluated as

$$C(t) = P_n(t + t_w) - P_n(t)$$

If a ramp event is found within the time window $t_w$, all the points of the time series belonging to that interval are considered a ramp event.
Once a time point has been defined as a ramp, its status cannot change when evaluating other intervals. With this method, ramp events can be easily classified into ramp-up events ($C(t) \geq C_0$) and ramp-down events ($C(t) \leq -C_0$). The drawback of this is that several ramp events of opposite direction might overlap, and strong ramp events of a shorter duration than $t_w$ could be missed. To overcome these issues, we use the minimum-maximum method, 17,45 where $C(t)$ is defined as

$$C(t) = \max_{t \leq \tau \leq t + t_w} P_n(\tau) - \min_{t \leq \tau \leq t + t_w} P_n(\tau)$$

With the minimum-maximum method, only the time points in the time window $t_w$ between the maximum and minimum values are treated as a ramp. In the case that more than a single pair of time points meets the threshold criteria within the moving window, only the shortest period is considered. Thus, with this method, ramps can have a duration shorter than or equal to the time window $t_w$ within a single window, and longer once consecutive windows are merged. As this criterion will only give absolute values of the ramp magnitude, the relative time position of the maximum and minimum points needs to be considered when assigning the direction of the ramp.
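The two criterion functions can be sketched as follows. This is a simplified illustration and not the implementation used in this work: the function names and sample values are hypothetical, the window length is given in samples, and the minimum-maximum variant uses only the global extremes of the window rather than searching for the shortest qualifying pair.

```python
def fixed_interval_change(p, t, w):
    """Fixed-time interval criterion: power at t + w minus power at t."""
    return p[t + w] - p[t]


def min_max_ramp(p, t, w, c0):
    """Minimum-maximum criterion on the window p[t : t + w + 1].

    Returns (start, end, direction, magnitude), or None if the swing is below c0.
    """
    window = p[t:t + w + 1]
    i_min = min(range(len(window)), key=window.__getitem__)
    i_max = max(range(len(window)), key=window.__getitem__)
    magnitude = window[i_max] - window[i_min]
    if magnitude < c0:
        return None
    # The direction follows from the relative position of the two extremes.
    direction = "up" if i_min < i_max else "down"
    start, end = sorted((t + i_min, t + i_max))
    return start, end, direction, magnitude


# Hypothetical 1-min normalised power samples.
power = [0.50, 0.52, 0.40, 0.45, 0.70, 0.68]
change = fixed_interval_change(power, 0, 5)
ramp = min_max_ramp(power, 0, 5, 0.10)
```

In this example, the fixed-interval change over the whole window is smaller than the swing between the minimum and the maximum, which illustrates why a short, strong ramp can be underestimated by the fixed-interval criterion.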
Most related works express the threshold for detecting ramp events as a percentage of the capacity of the wind farm or the cluster of wind farms. Existing thresholds used in detection of ramp events can be found in Gallego-Castillo et al. 18 Still, there are a few studies that aim to detect and/or forecast ramp events in a minute scale. For instance, Cutler et al. 22 used a 150-MW threshold for a time interval of 5 min in a wind farm with an installed capacity of 868 MW. This corresponds to a threshold of 17.28% of the nominal capacity. They also employed a threshold of 54%  of changes in power. The distribution has a median value close to zero and presents heavier tails than a Gaussian distribution, with some extreme values close to 50% of the wind farm's capacity. This non-Gaussian behaviour is a characteristic of power increments and has been extensively studied. 47,48 Following the work of Wan on the ERCOT system, 46 we set the threshold $C_0$ to $3\sigma$, where $\sigma$ is an estimate of the standard deviation of the distribution of 5-min changes in wind farm normalised power. This threshold corresponds to 10% of the nominal power of the wind farm.
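A threshold of this kind could be derived as in the sketch below. The function name and the sample series are hypothetical, and the population standard deviation is an assumption, since the text only refers to an estimate of the standard deviation.

```python
from statistics import pstdev


def ramp_threshold(p_norm, lag=5, k=3.0):
    """k times the standard deviation of the lag-step changes in normalised power."""
    changes = [p_norm[i + lag] - p_norm[i] for i in range(len(p_norm) - lag)]
    # Assumption: plain population standard deviation; a robust estimator may be preferable
    # for the heavy-tailed distributions typical of power increments.
    return k * pstdev(changes)


# Hypothetical 1-min normalised power series, so lag=5 corresponds to 5-min changes.
series = [0.5, 0.5, 0.5, 0.5, 0.5, 0.6, 0.4, 0.6, 0.4]
c0 = ramp_threshold(series)
```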
The two vertical dashed lines in Figure 3 (left) are the boundaries set in this work for the occurrence of ramp events in the wind farm. In addition, Figure 3 (right) shows the magnitude of changes in power for different minute-scale time windows. For all timescales, the distribution shows similar frequencies of ramp ups and ramp downs. As the timescale grows, the ramps' magnitude increases. Although 90% of the values do not exceed 8%, 12% and 15% for 10, 20 and 30 min, extreme ramps of 60%, 77% and 86%, respectively, are observed. These extreme ramps are clearly stronger for ramp ups than for ramp downs, which can be associated with the strong cold fronts that are characteristic of winter months.

SCADA data ramp event identification
In this section, a summary of the ramps detected in the SCADA data based on the minimum-maximum method is presented. From now on, we only focus on time periods where DD data were made available ( Table 2). With a threshold of 10% of the normalised power in a time interval of 5 min, a total of eight upward ramps and six downward ramps were identified. Table 2 gives a summary of the number of ramps in both directions, together with the power swing and duration of the largest ramps. Figure 2 illustrates the three ramp events (two ramp down and one ramp up) identified on 18 November 2016.

Characterisation of ramp events observed by DD radar
A complete characterisation of the meteorological conditions behind the ramp events is out of the scope of this paper. However, we made use of the available information from the DD system, the SCADA system and the WRF data to understand and describe the meteorology behind them.
First, we compare the 10-min averages of nacelle wind speed (averaged over all wind turbines) with the DD mean wind speed at 100 m and the WRF wind speeds at the same height. The same approach is applied to the wind direction. We also compare the TI wt (averaged over all machines) and the TI DD , previously described in Section 2. To analyse the atmospheric stability, we relate the potential temperature of the air (100 m) to the potential sea temperature extracted from the WRF model, and we also check the Monin-Obukhov length.

Ramps connected to a low-pressure system on 17 and 18 November 2016
On 17 and 18 November 2016, an area of low pressure, accompanied by strong precipitation, was observed over the Westermost Rough wind farm. 49 During those days, the wind farm experienced several ramps, which we associate with the passage of an occluded weather front on 17   Figure 4C compares the wind direction. For the first pair of ramp events, an almost constant wind direction is observed. Here, there is a good agreement between the SCADA and the DD data, and the differences with WRF are not so large. However, during the passage of the front, a strong change in wind direction can be observed, which is not reproduced by WRF. The fact that WRF sometimes fails in simulating the timing and magnitude of wind ramps has already been pointed out in previous studies. 13 Although the lower temporal and spatial resolution of WRF does not allow conclusions to be drawn on the exact state of the atmospheric boundary layer, it provides us with some clues about the atmospheric conditions. Figure 4D shows the potential sea surface and air temperature extracted from WRF. A convective situation (the sea surface is warmer than the air) can be identified. This is also confirmed by the Monin-Obukhov length (see Figure 4E). The period illustrated by WRF indicates changes in stability towards a more neutral atmosphere, followed later by a decrease in stability. In Figure 4F, the turbulence intensities derived from the SCADA and the DD data are compared. There is generally a good agreement between both variables, except at 13:50. We attribute this to the lower DD data availability used to calculate the TI during this period, as shown in Figure 5K. Although there are some differences in both variables, a general increase in turbulence is observed during the ramp downs. During the front, this is much stronger. For the ramp-up events, a decrease in TI is seen.
Figure 5G, as some of the upstream wind turbines (the southern ones) are seeing a south-south-westerly wind, whereas WT7 (first row in magenta) is yawed to the west. In Figure 5I, a higher wind speed flow from the north-west is observed, which propagates into the wind farm as seen in Figure 5J-L. This leads to the second ramp up in the wind farm. Interestingly, this flow seems to merge with the high wind speed flow observed in the lower left part of the radar image, which seems to remain unaltered during the passage of the front. Figure 6 depicts two DD wind speed vertical cross-sections (X-X ′ and Y-Y ′ ), which are shown by the dashed black lines in Figure 5. The cross-section X-X ′ has been defined as a plane parallel to the wind farm rows and perpendicular to the prevailing south-westerly wind direction during the ramp-down event. Y-Y ′ is perpendicular to X-X ′ and therefore parallel to the flow causing the ramp down. Figure 6 (right) shows how the low wind speed front enters and passes through the wind farm (from Y to Y ′ ) over the different time steps. The relaxation or 'recovery' after the low wind speed front can be seen in Figure 6J-L as the high wind speed flow propagates into the wind farm. Figure 6 (left) also depicts how the low wind speed flow is passing through the plane X-X ′ , and it is later replaced by the high wind speed flow from the north-west, which induced the second ramp up.

Ramps during a stable boundary layer on 8 December 2016
On the early morning of 8 December 2016, several ramps with power swings not greater than 30% were experienced in the Westermost Rough ( Figure 7A). Figure 7B shows the numerous ups and downs in the wind speed. There is a good agreement between the SCADA and the DD wind speeds, whereas WRF overestimates the wind speed by 2 m/s on average from 7:00 on. Whereas WRF predicts a westward shift in wind direction, the SCADA and DD indicate a less significant shift to the south ( Figure 7C). The meteorological conditions reproduced by WRF clearly indicate a stable atmospheric boundary layer ( Figure 7D,E), that is, the air temperature is several degrees warmer than the sea surface. Figure 7F shows a lower TI than during 17 November, which is consistent with a stable boundary layer. Again, there is a good agreement between the TI DD and the TI wt . For the strongest ramps (between 8:40 and 9:40), an increase in TI can be seen during the downward ramp, followed by a decrease during the ramp up.
Ramps during stable boundary layers are associated with changes in turbulent mixing, which can be caused by the nocturnal cooling of the atmosphere, low-level jets or thunderstorms. These events are difficult for physical models to predict, as modification of atmospheric boundary layer parameterisation schemes is required. 50 In our case, meteorological charts indicated a high-pressure system over major parts of Europe including the United Kingdom. 51 However, an embedded weak low-pressure system 51 accompanied by heavy rain 52 was observed crossing the Westermost Rough. We infer that this dynamic situation generated a downward mixing of high and low wind speeds, which caused the numerous wind ramps.
The change in the turbulent mixing during the latest ramp down is further investigated in Figure 8. At 08:47, a stable flow with a strong shear and a wind speed near 10 m/s is observed ( Figure 8A). The strongly stratified atmosphere can be seen in the vertical cross-section and in the corresponding wind speed profile. In the next images ( Figure 8B-D), a transient wind speed lull enters the wind farm from the south-south-west.

A cold weather front on 1 January 2017
On the first day of 2017, an increase of almost 78% of the normalised power in only 23 min was experienced in the Westermost Rough ( Figure 9A).
This strong ramp was generated by the passage of a large-scale cold weather front, as indicated by meteorological charts. 53 The DD wind speed data captured the increase in wind speed, which was accompanied by a wind direction shift of nearly 150°. Again, there is a good agreement between the DD and the SCADA wind speed data, but WRF underestimates the ramp magnitude and predicts its arrival too early. This is clearly seen in the time series of the wind direction ( Figure 9C), as WRF simulated a strong change of almost 120°, about 2 h early. An increase in the TI is observed before the ramp-up event starts. After the ramp, the TI drops. Regarding the meteorological conditions, WRF predicted a more convective situation between 04:30 and 06:30, as shown in both time series of potential temperature and Monin-Obukhov length. However, from 06:30 onward, both variables indicated a more neutral situation. The fact that WRF predicted the ramp some hours before its arrival does not allow us to draw conclusions about the specific meteorological conditions at the wind farm during the ramp. The DD horizontal wind fields during the passage of the cold front are shown in Figure 10. On 1 January 2017 at 05:20, a south-westerly low wind speed was observed by the radars. Ten minutes later, Figure 10B shows a slight change in wind direction and a decrease in wind speed entering from the north, possibly associated with a prefrontal trough. During the following images, one can observe how the wind turbines progressively yaw to the north. Furthermore, the wind farm power output decreases by nearly 25% during the next 30 min. At 06:10, a distinct high wind speed flow enters the radar domain. Over the next 20 min, the high wind speed flow reaches all wind turbines and the wind farm produces at nominal load ( Figure 10H). In the radar observations before 05:40, one can see a quite stable flow and strong wind farm wake effects.
After the ramp passes, clear turbulent structures can be observed. However, the derived TI DD shown in Figure 9F is much lower after the ramp. This could be related not only to the fact that TI averages the fluctuations with the wind speed, rather than the flow being more or less turbulent, but also to the fact that we analyse here a TI based on non-free-flow observations.

Evaluation of DD ramp event detection
After the characterisation of the ramp events, the ability to capture wind farm ramps using DD radar data is assessed. We first derive a power curve based on DD wind speeds measured 2.5D upstream of the rotor and the filtered SCADA power. We later estimate a DD representative wind turbine power output using the DD wind speeds measured at the same distance and the derived power curve. Implicitly, we assume that the wind turbines are instantaneously yawing into the wind direction. We then compute the aggregated power from the wind farm and apply again the threshold criteria defined in Section 3.1 to detect wind farm ramp events. Finally, the detected DD ramp events are compared with the identified events in the SCADA data. and WT21 (now upstream wind turbines) experience the ramp-up event earlier than WT10, WT25 and WT33.
To evaluate the radars' ramp-detection skill, we compare the number of ramp events detected by the DD radar data with the events identified in the SCADA data. Here, our analysis is based on the aggregation of wind power. To compute the aggregated power from the wind farm, we consider only the wind turbines that are operating, as indicated by the SCADA signal. We also fill in the DD missing values with a linear interpolation. Ramp events are detected using the threshold criteria defined in Section 3.1. To assess the radars' detection skill, we use a binary classification (detected/nondetected). Finally, we assign three possible outcomes for each pair detected (DD)/observed (SCADA) ramp:
• True positive (TP): A ramp is detected by the DD data and observed in the SCADA data.
• False positive (FP): A period is classified as a ramp in the DD data but not observed in the SCADA (false alarm).
• False negative (FN): A ramp event is observed (SCADA) but missed in the DD data.
To evaluate the success of detection of the DD data, we use the metrics detection accuracy (DA) and ramp capture (RC), defined as follows:   were particularly less abrupt than on the other days, and some of the ramp magnitudes were very close to the imposed threshold. Therefore, small differences in the wind speed observations result in the radar data missing two of the ramp events and detecting one ramp that was not seen in the SCADA data.
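The two detection metrics can be sketched as below. The formulas are an assumption based on the definitions commonly used in the ramp forecasting literature (accuracy as the share of detections that are true ramps, capture as the share of observed ramps that are detected). The function name and the counts are hypothetical and do not reproduce the results of this study.

```python
def detection_scores(tp, fp, fn):
    """Return (DA, RC) from counts of true positives, false positives and false negatives."""
    # Assumed definitions: DA = TP / (TP + FP), RC = TP / (TP + FN).
    da = tp / (tp + fp) if (tp + fp) else float("nan")
    rc = tp / (tp + fn) if (tp + fn) else float("nan")
    return da, rc


# Hypothetical counts for illustration only.
da, rc = detection_scores(tp=8, fp=2, fn=2)
```

Under these definitions, a false alarm lowers DA without affecting RC, whereas a missed ramp lowers RC without affecting DA.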
We also evaluate the detection skill by investigating the magnitude error C , timing error T and duration error D between the DD-radar detected ramps and the SCADA-observed ramps. The errors have been defined as follows: with $t_s$ being the starting time of the ramp, $\Delta t$ being the ramp duration and $\Delta P$ being the ramp swing. Table 4 shows statistics on the three types of errors. The mean absolute magnitude error is 3.70%, and the magnitude error does not exceed 8.5%. In terms of timing, 75% of the ramps are detected within 1 min of delay/anticipation, and the maximum timing error is 3.18 min. The duration errors range from −5.00 to 4.03 min, and the mean absolute duration error is 1.86 min. In Figure 12 (bottom), the time series of aggregated normalised power is also shown for SCADA and DD-derived power. The starting and ending times of the ramps are indicated by vertical solid and dashed lines, respectively. In general, timing errors of less than 3 min are observed, with all starting errors being around 1 min.
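The three errors can be computed per matched pair of ramps as sketched below. The sign convention (DD minus SCADA) is an assumption, as are the class, the function name and the sample values.

```python
from dataclasses import dataclass


@dataclass
class Ramp:
    start_min: float     # starting time of the ramp, in minutes
    duration_min: float  # ramp duration, in minutes
    swing_pct: float     # ramp swing, in % of nominal power


def ramp_errors(dd, scada):
    """Return (magnitude, timing, duration) errors for one DD/SCADA pair of ramps."""
    # Assumed convention: positive values mean the DD-derived ramp is larger, later or longer.
    return (
        dd.swing_pct - scada.swing_pct,
        dd.start_min - scada.start_min,
        dd.duration_min - scada.duration_min,
    )


# Hypothetical pair of ramps for illustration.
errors = ramp_errors(Ramp(101.0, 6.0, 32.5), Ramp(100.0, 8.0, 30.0))
```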

REMOTE SENSING-BASED WIND POWER PROBABILISTIC FORECASTING TOOL
This section extends the remote sensing-based forecasting (RF) methodology we previously introduced and applied to wake-free wind turbines in Valldecabres et al. 35 to the level of a wind farm. RF is a probabilistic wind power forecasting model based on the propagation of upstream wind field vectors measured by a remote sensing system. For a given minute-scale horizon, the method assumes there is a persistence of the wind field vectors in their trajectories, or in other words, there is a perfect spatio-temporal correlation between the upstream observations and their future downstream occurrences. First, we describe the RF model for single wake-free wind turbines. Second, we introduce how to correct the predictions for wake effects. Figure 13 outlines the methodology for the RF forecast. The part plotted in grey was previously introduced in Valldecabres et al. 35 Given a wind turbine and upstream remote sensing wind field observations, the idea of the model is to get a probabilistic preview at time t of the wind speed observations that will arrive at the wind turbine at the forecast time t + k. The proposed methodology is not only foreseen for radar data but could also be applied to other remote sensing data, such as long-range lidar measurements.

Single wind turbine forecast
In the RF model, wind field vectors are propagated with their horizontal wind speeds and directions during the defined forecasting horizon k. Here, it is assumed that between t and t + k, there is no change in the trajectories of the wind vectors. We therefore infer that, by adopting a probabilistic approach, the uncertainty related to the propagation of the wind vectors is included. Our model is intended to be used in the minute scale (e.g., 1-15 min). To predict a density forecast of power $P^{RF}_i$ for the wind turbine i at a future time t + k, an estimate of the wind speeds representative of the wind turbine is required. This is not intended to model any rotor-equivalent wind speed but a distribution of the expected hub-height wind speed. Thus, the group of wind field vectors to be found at time t + k within a spatio-temporal window at each wind turbine forms the predictive wind speed distribution $ws_i$. The spatial window is a circle centred at the wind turbine, which extends horizontally outward to one rotor diameter; the temporal window covers 1 min, centred at the forecasting time. The predictive wind speed density is corrected for wind turbine induction effects ($ws^{RF}_i$) using the thrust curve data from the manufacturer. For more details, we refer the reader to Valldecabres et al. 35 An example of the propagation of the wind field vectors is given in Figure 14. There, two DD radar wind fields that are about 5 min apart are displayed. Figure 14A shows the wind field at the time that the forecast is issued (t). Figure 14C illustrates the wind field at the forecasting time (t + k). Predictions for selected wind turbines in the first and second row (highlighted and surrounded by coloured circles) are depicted in Figure 14B, which is a zoomed-in image of Figure 14A. The wind field vectors to be found within each wind turbine's spatio-temporal window at time t + k are surrounded by a circle with the same colour as the wind turbine.
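The propagation step can be sketched as follows. This is a strongly simplified illustration under stated assumptions: vectors travel along straight lines with constant velocity, only the spatial window is applied (the 1-min temporal window and the induction correction are omitted), and all names, coordinates and the window radius are hypothetical.

```python
import math


def propagate(vectors, lead_s):
    """Advect (x, y, u, v) wind vectors along straight lines for lead_s seconds."""
    return [(x + u * lead_s, y + v * lead_s, u, v) for x, y, u, v in vectors]


def predictive_wind_speeds(vectors, turbine_xy, radius_m, lead_s):
    """Horizontal speeds of the vectors expected within radius_m of the turbine at t + k."""
    tx, ty = turbine_xy
    speeds = []
    for x, y, u, v in propagate(vectors, lead_s):
        if math.hypot(x - tx, y - ty) <= radius_m:
            speeds.append(math.hypot(u, v))
    return speeds


# Hypothetical upstream vectors: position in metres, velocity components in m/s.
field = [
    (-3000, 0, 10, 0),     # arrives at the turbine after 300 s
    (-2700, 100, 9, 0),    # arrives 100 m to the side, inside the window
    (-3000, 0, 8, 0),      # too slow, still 600 m upstream after 300 s
    (-2400, -1800, 8, 6),  # arrives from the south-west
]
ws_i = predictive_wind_speeds(field, turbine_xy=(0, 0), radius_m=150, lead_s=300)
```

The resulting list of speeds plays the role of the empirical predictive wind speed distribution for one turbine; an empty list would indicate that no upstream observation reaches the turbine within the chosen horizon.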
An arrow indicates the main direction of the vectors. For brevity, we do not look deeply here into the performance of the wind speed predictions. In Valldecabres et al., 35 the mean of the wind speed distributions was evaluated against the wind speeds in front of the rotor at the time t + k and compared with the benchmark persistence. Persistence assumes that the prediction at time t + k equals the observation at time t and is one of the most commonly used forecasting models in the minute scale. 32 Results of the comparison showed that the correlation between the predictions and the observations was higher for the RF model than for persistence. Finally, we estimate the predictive densities of power by applying a probabilistic wind turbine power curve as a transfer function to the wind speed densities. 35 The power curve is based on 1-min DD wind speed observations, as shown in Figure 11 (left), which allows us to have a good calibration between the model and the real observations.

FIGURE 13 Scheme of the remote sensing-based wind farm power density forecasting (RF) model. The blue part indicates this paper's contribution to the RF model presented in Valldecabres et al. 35

Correction for wake effects
The RF methodology was introduced to predict the power of free-flow wind turbines. As we intend here to forecast the power of all wind turbines (both wake-free and waked), corrections for the waked wind turbines need to be considered. In this work, we do not apply any wake modelling, but we correct our power predictions for waked wind turbines based on a directional turbine efficiency. For each wind turbine, the predictive density of power $P_i^{RF}$ is corrected only if (i) the predictive density of wind speeds originates from outside the wind farm domain, (ii) the target wind turbine is not a free-flow wind turbine and (iii) the mean wind speed of the distribution is less than a reference wind speed near the rated wind speed.
If the three mentioned conditions are fulfilled, the predictive power density is corrected with the directional wind turbine efficiency factor, using its P50 value.
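The three conditions can be written as a single guard. The sketch below is an illustration only: the function and argument names are hypothetical, and applying the P50 efficiency factor as a multiplicative scaling of the power samples is an assumption, since the text does not state the exact form of the correction.

```python
def correct_for_wake(power_samples, mean_wind_speed, origin_outside_farm,
                     is_free_flow, efficiency_p50, reference_wind_speed):
    """Apply the directional efficiency factor only when all three
    conditions hold; otherwise return the samples unchanged.

    Assumption: the correction is a multiplicative scaling by the P50
    value of the directional turbine efficiency (a value in (0, 1]).
    """
    needs_correction = (
        origin_outside_farm                        # condition (i)
        and not is_free_flow                       # condition (ii)
        and mean_wind_speed < reference_wind_speed  # condition (iii)
    )
    if needs_correction:
        return [p * efficiency_p50 for p in power_samples]
    return list(power_samples)
```

Condition (iii) reflects that, close to or above rated wind speed, a waked turbine can still reach rated power, so scaling the prediction down would introduce a bias.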

RESULTS AND DISCUSSION
This section evaluates the methodology proposed to derive density forecasts of wind turbine power in a 5-min-ahead horizon, based on remote sensing wind speed and direction observations. Special emphasis is given to the prediction of ramps. Finally, we discuss the advantages and limitations of the methodology and the research required to further improve the forecasts.

Evaluation of the RF model as a wind farm power density forecast
Although the aggregated power forecast is of higher relevance for the end-user, the analysis of the prognoses for the individual turbines discloses important insights into the performance of the RF model and the dataset used. Figure 16 (left) displays the estimated density forecasts generated by the RF model for WT12 on 17 November 2016. For the whole depicted period, WT12 is in a wake situation, as shown in the radar images in Figure 5. All four ramps observed during the convective day of 17 November are predicted by the RF model. However, there are some minutes of delay in the predictions of the ramps, which we associate with the broad range of wind speeds upstream of the wind turbine. Figure 16 (right) shows the predictions for WT19 on 1 January 2017. As seen in Figure 10, WT19 is in a wake situation before the strong ramp starts. During that time, strong wake effects are seen and observations sometimes fall outside the 90% prediction intervals. The extreme ramp-up is, however, very well predicted by the RF model. Unlike on 17 November, there is a well-defined northerly wind direction, and the range of wind speeds observed in the spatio-temporal window of WT19 is much more uniform.
To evaluate the skill of a density forecast, its sharpness under the constraint of calibration needs to be assessed. 54 Sharpness refers to the spread of the density forecast and is a property of the forecasts only. Calibration, however, is a joint property of the forecasts and observations and indicates the statistical consistency between the predictive distributions and the observations. We benchmark the RF model with the probabilistic version of persistence, which we build using the persistence point forecast and the 19 most recent consecutive observed values of the persistence error, as described in Gneiting et al. 54 To assess the forecasting model, we use the complete dataset (N = 2512), which includes samples from both ramp and non-ramp periods.
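The probabilistic persistence benchmark can be sketched as follows. This is one reading of the description, not a verified reproduction of the benchmark: the function name is hypothetical, and the predictive sample is assumed to be the persistence point forecast dressed with each of the 19 most recent k-step persistence errors.

```python
def persistence_ensemble(observations, horizon, n_errors=19):
    """Build a predictive sample for time t + horizon.

    observations: values at consecutive time steps, the last one at the
    issue time t. horizon: lead time k in time steps.
    Assumption: the persistence error at step s is
    observations[s] - observations[s - horizon].
    """
    n = len(observations)
    if n < n_errors + horizon:
        raise ValueError("not enough history for the requested errors")
    point_forecast = observations[-1]  # persistence: value at t
    recent_errors = [
        observations[s] - observations[s - horizon]
        for s in range(n - n_errors, n)
    ]
    # Dress the point forecast with the recent errors.
    return sorted(point_forecast + e for e in recent_errors)
```

For power forecasts, the members would additionally have to be limited to the physical range between zero and rated power. That step is omitted here because the text does not describe it.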
The overall skill of the forecasts is assessed with the average continuous ranked probability score (CRPS). The continuous ranked probability score of a single forecast (crps) evaluates the spread of the predictive density in regard to the observation 54 and is given by
$$\mathrm{crps}(F, x_0) = \int_{-\infty}^{\infty} \left[ F(x) - H(x - x_0) \right]^2 \, dx,$$
where $F$ is the cumulative distribution function of the predictive density, $x_0$ is the observation and $H(x - x_0)$ is the Heaviside step function, which takes the value 1 when $x \geq x_0$ and 0 otherwise. For a single-point forecast, the crps reduces to the absolute error. Hence, the lower the crps, the better the density forecast. To benchmark the RF model, we use the CRPS in its discretised version, which is given by
$$\mathrm{CRPS} = \frac{1}{T} \sum_{t=1}^{T} \mathrm{crps}(F_t, x_{0,t}),$$
where $T$ refers to the number of samples analysed. Figure 17 shows the results of the CRPS for RF and persistence for all individual wind turbines, averaged over all periods. To illustrate the effect of correcting for the wake effects with the wind turbine efficiency factor, the CRPS before applying the correction is also shown (RF-nwc).
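When a predictive density is available only as a finite sample, the crps of its empirical distribution can be evaluated without numerical integration, using the known identity crps = E|X - x_0| - 0.5 E|X - X'|, where X and X' are independent draws from the forecast. The sketch below uses this identity; the function names are illustrative and the sample-based representation is an assumption, not necessarily how the scores in this paper were computed.

```python
def crps_sample(samples, observation):
    """crps of the empirical distribution of `samples` for one observation."""
    n = len(samples)
    # Mean absolute difference between forecast members and observation.
    term_obs = sum(abs(s - observation) for s in samples) / n
    # Half the mean absolute difference between all pairs of members.
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return term_obs - term_spread

def mean_crps(forecast_samples, observations):
    """Average crps over T forecast/observation pairs."""
    scores = [crps_sample(s, x0)
              for s, x0 in zip(forecast_samples, observations)]
    return sum(scores) / len(scores)
```

With a single member the spread term vanishes and the score equals the absolute error, which is why the CRPS of a density forecast can be compared directly with the mean absolute error of a point forecast.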
When evaluating the CRPS only during ramp periods (Section 3.2), the RF model outperforms persistence. For 29 out of the 35 wind turbines (Figure 17A), there is an improvement over persistence in the CRPS. The performance when averaged over all periods (Figure 17B) is less convincing than during the ramp events only. However, the impact of the wake correction can be analysed. For wind turbines WT1-WT7, an improvement over persistence is observed. Given that the prevailing wind direction during the measurement campaign was south-south-westerly, the wake effects on these turbines were not so prevalent, as indicated by the small difference between the CRPS of RF and RF-nwc. For the remaining turbines, the results are less satisfactory. Compared with persistence, the RF model has a higher CRPS for the second row of wind turbines (WT8-WT14). For these turbines, the wake corrections strongly improve the results. However, for WT8-WT14, most of the wind speed observations (during south-westerly winds) originate from outside the wind farm domain. This strongly influences the predictions for these wind turbines and indicates the need for better wake modelling. For the rest of the wind turbines, the CRPS is slightly higher than for persistence, and the wake corrections have less influence. The overall results for the wind farm are shown in Table 5. Overall, the RF model has a smaller CRPS than persistence for the ramp periods, whereas the opposite is the case when considering all periods.
To assess the statistical consistency between the prediction intervals and the observations, we use the reliability diagram. 54 In a calibrated probabilistic forecast model, x% of the observations should be below the xth percentile of the forecast distributions. We compare here the reliability diagrams of the RF model before the wake correction (RF-nwc), the RF model and persistence. The reliability diagrams displayed in Figure 18 have been built using 5% quantiles. Figure

Evaluation of the RF model for ramp event forecasting
In this section, we evaluate the DD radar-based forecasting model for the aggregated power of the wind farm during ramp events. To aggregate the marginal predictive densities of power, one should first model the spatial dependency among the predictive distributions of each wind turbine.
This can be done using a copula approach as shown by Gilbert et al. 55 Second, to detect wind farm power ramps in a probabilistic frame, the temporal structure between each time-step distribution needs to be modelled. Based on this, wind farm power trajectories can be generated, as explained in Pinson et al. 56 Due to data limitations and the fact that we do not have a uniform number of wind turbine marginal densities at each time step (mainly caused by the radar data's limited range and availability), we omit these two steps. Instead, we evaluate the aggregated deterministic power and compare it against the benchmark persistence. The aggregated power is obtained by adding up the mean values of all wind turbines' predictive densities. We then run the ramp event detection algorithm for both SCADA and RF aggregated power, using concurrent wind turbine data and time steps. Given that the benchmark persistence would perfectly 'predict' the ramps, but 5 min late, we compute the improvement over persistence in the root-mean-square error. This step is only carried out for the ramp event intervals detected in the SCADA data. Finally, forecasts of wind speed and direction at 100-m height from a multischeme ensemble NWP model are also considered. Given that the temporal resolution of the NWP is hourly (instantaneous values), we make a rough evaluation of the NWP forecasts by analysing the variability of wind speed and direction of the 75 ensemble members during the ramps.
Note: Mean absolute values of the magnitude error (C), the timing error (T) and the duration error (D) are also shown. Improvement in the root-mean-square error over persistence (Imp PER) is averaged for the ramps in each period.
January 2017, the RF model could anticipate the strong ramp with only 2 min of timing error and with a very low magnitude error of 2.5%. In this case, the improvement over persistence is 86%.
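The deterministic aggregation and the improvement over persistence can be sketched as below. The function names are hypothetical, and the improvement is assumed to follow the usual skill-score form 1 - RMSE_RF / RMSE_PER; the text does not give the exact expression.

```python
import math

def aggregate_mean_power(turbine_samples):
    """Wind farm power at one time step: sum of the mean of each
    turbine's predictive sample (spatial dependence is ignored)."""
    return sum(sum(s) / len(s) for s in turbine_samples)

def rmse(predictions, observations):
    """Root-mean-square error between two equally long series."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_over_persistence(rf, persistence, observations):
    """Assumed skill-score form: 1 - RMSE_RF / RMSE_PER.
    A value of 0.86 would correspond to an 86% improvement."""
    return 1.0 - rmse(rf, observations) / rmse(persistence, observations)
```

In this sketch the persistence series is the observed aggregated power shifted by the forecast horizon, which is why it reproduces every ramp exactly but 5 min late.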
This wind speed range is significantly higher than the SCADA-registered wind speeds (see blue lines in the graphs). Around 1 h later, the NWP system predicts lower wind speeds. In addition, a change in wind direction towards the west, as experienced in the wind farm, is also forecasted later. During 18 November 2016 (Figure 19, right), a relatively uniform wind speed forecast is observed in the NWP model, with wind speeds ranging from 10.45 to 13 m/s. The NWP forecasts strongly overestimate the wind speed observed between 15:00 and 19:00. 12.15 m/s. In this case, the observed ramps could presumably be captured by the NWP model, as we do not have higher frequency data. For the strong ramp on 1 January 2017, the NWP model clearly forecasts the event with good performance, as shown in Figure 20 (right). In addition, the strong change in wind direction caused by the front is relatively well predicted by all ensemble members, although it is forecasted somewhat early.

Discussion
As shown by our results, remote sensing observations can be used to generate skilful minute-scale density forecasts of wind turbine power.
When considering all events, the CRPS of the benchmark persistence was better than that of the RF model for most of the wind turbines. However, the results during ramp events showed that the RF model is able to generate more skilful forecasts during ramps than persistence. This is because the RF model can anticipate strong changes that are hard to predict without upstream information. On 17 November 2016, the NWP model presented a 2-h timing error when predicting the wind direction change. On 18 November 2016, the ensemble wind speeds showed a strong uniformity, and none of the strong meteorological events were resolved. The limitations of NWP when forecasting non-horizontal processes have already been indicated in Zack. 7 By contrast, we have demonstrated that local ramp events, not resolved by lower resolution models, can be forecasted using horizontal DD wind speed data at the 100-m level. In our RF model, we only used data at the 100-m level (near hub height), as the data availability below 100 m was limited. Although including wind speed data at other heights should improve the predictions of the ramps, we have shown that we can anticipate those situations using only wind data at hub height.
One of the main limitations of our model in its present form can be observed during the period between 5:20 and 5:45 on 1 January 2017 (Figure 16, right). There, strong wake effects led to a lower wind farm production than what the RF model could predict using the SCADA-derived wake corrections. Although the density forecasts cover the occurrence of these situations, improved wake modelling could enhance and better calibrate the predictions. In addition, systematic errors between the predictions and the observations, like the one discussed here in relation to the wake corrections, could be reduced in real time by considering a moving-average forecasting error term.
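A moving-average error term of the kind suggested here could look like the following sketch. It is a hypothetical illustration rather than a method evaluated in this work: the function name, the additive form of the correction and the window length are all assumptions.

```python
def moving_average_correction(forecast, past_forecasts, past_observations,
                              window=10):
    """Shift a new forecast by the mean of the most recent errors.

    Errors are defined as observation minus forecast, so a model that
    has recently overpredicted is shifted downwards.
    """
    errors = [o - f for f, o in zip(past_forecasts, past_observations)]
    recent = errors[-window:]
    if not recent:
        return forecast  # no history yet: leave the forecast unchanged
    return forecast + sum(recent) / len(recent)
```

Applied to a density forecast, the same shift could be added to every member of the predictive sample. In a real-time setting, only errors of forecasts that have already been verified at the issue time may enter the window.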
The results presented in this work show the importance of using remote sensing observations to generate, or at least to complement, state-of-the-art forecasting models. The work presented here has been limited to lead times of 5 min due to technical reasons. For the analysed wind directions, the area covered by the DD measurements extends only over a short inflow distance due to the employed radar scanning strategy. However, DD radar measurements such as the ones employed here can cover further distances, as shown in Figure 1B. Those extended measurements could extend the forecasting horizon and are expected to contribute to reducing uncertainty in the prediction of offshore wind power ramps.

SUMMARY AND CONCLUSIONS
In recent years, there has been an increased interest in forecasting minute-scale events, as they have a strong impact on the electrical grid.
In this paper, we used observations from a DD radar system to detect, characterise and forecast strong minute-scale ramp events at the Westermost Rough offshore wind farm. We employed the horizontal information at different heights provided by the radar to investigate the meteorological conditions which induced the minute-scale ramp events. By comparing SCADA, radar and WRF data, we concluded that high-resolution data, such as DD radar observations, are required to understand the meteorological conditions which induce 'local' ramp events.
Before generating minute-scale forecasts of wind power, we conducted a comparison between SCADA power and DD-derived power. For that purpose, we used a site-specific power curve based on DD measurements, and we extracted DD wind speed data in front of the rotor of each wind turbine. During the occurrence of the ramps observed in the SCADA data, the radar was able to capture 86% of all detected ramps with very small timing and magnitude errors. However, the radar's low data availability during some of the periods evaluated led to some ramps not being detected accurately.
Finally, we tested a remote sensing-based forecasting model, which propagates observations of wind speed and direction upstream of the wind turbines to generate density forecasts of wind power. We extended the model to waked wind turbines. To do so, directional corrections for wake effects were applied among the wind turbines. Results indicate that remote sensing observations can be used to generate sharp and calibrated density forecasts of individual wind turbine power. However, further research should be directed towards improving the modelling of the wake effects (e.g., during different atmospheric conditions). During ramp events, the remote sensing-based forecasting model strongly outperformed the different statistical and physical benchmarks. This demonstrated the benefits of using remote sensing observations upstream of the target wind turbine.
A future line of investigation should be directed towards the aggregation of marginal predictive densities of wind power to generate probabilistic wind farm forecasts. Research is also needed to account for changes in the wind field while travelling to the wind turbines. The propagation of the wind field vectors with a persistent trajectory is a simple approximation with limited validity, and the real uncertainty should be considered.
Finally, further work should also focus on extending the forecasting horizon and generating forecasts for different minute-scale horizons.