Wind observations from hot‐air balloons and the application in an NWP model

In this paper, we report on a wind observation method based on the movement of hot‐air balloons (HABs). A quality assessment was carried out by comparing against wind observations at the meteorological tower of Cabauw in the Netherlands during May–September 2018, and the obtained standard deviations in error were $$ {\sigma}_u=0.65\;{\mathrm{m\,s}}^{-1} $$ and $$ {\sigma}_v=0.69\;{\mathrm{m\,s}}^{-1} $$ for the measured zonal and meridional wind components, respectively. Subsequent comparison against short‐term forecasts of the HARMONIE‐AROME model showed a standard deviation of 2.5 m s−1 for the wind vector difference. From the HAB observation set, a case was selected with a rapidly changing wind field belonging to a small intensifying depression. The HAB wind observation was applied in data assimilation as a proof of principle for a single‐observation experiment. It is shown that in a complex baroclinic situation, the model state is slightly improved.


| INTRODUCTION
Mesoscale numerical weather prediction (NWP) models need observational data with a high temporal resolution for analysis (initial state) and verification. As a result, there is a great demand for high-resolution observations. The atmospheric boundary layer (ABL) is not frequently sampled, and it is desirable to collect a large amount of data. Wind data derived from hot-air balloon (HAB) tracks provide useful information, as was shown by De Bruijn et al. (2016).
In order to achieve accurate forecasts, it is essential to effectively incorporate both planetary scales and small-scale phenomena into the NWP model (Gustafsson et al., 2018). Planetary scales enter a limited-area model via lateral boundaries, which are obtained from a global model in which the limited-area model is embedded. In data assimilation (DA), only resolved processes are treated; subgrid-scale processes cannot be analysed and are generated by the physical parameterizations. Morning and evening transitions are flow regimes that are not fully understood, and more measurements are needed (Lothon et al., 2014). Wind lidars (Knoop et al., 2021) can provide wind profiles at one location, whereas unmanned aircraft (Lappin et al., 2022; Rautenberg et al., 2018) deliver ABL data on a larger spatial scale. Air traffic provides a huge amount of wind data (De Haan, 2011; Petersen, 2016). Such data have good time resolution, but they are concentrated in flight corridors, and the ABL is sampled only in the vicinity of airports. Satellites, for instance, Aeolus (Rennie et al., 2021), provide wind information on a global scale, which can be regarded as a curtain of observations along the satellite track. Aeolus data can also be used in kilometre-scale models (Hagelin et al., 2021), where they have been shown to have an impact on the analyses but not on the forecasts.
A large number of observations are needed for a better understanding of ABL turbulence. During evening transition, the convective ABL transforms into a neutral and subsequently into a stably stratified ABL; during morning transition, this process is reversed. Scaling variables, for example, the buoyancy flux, are key in ABL parameterizations. The scaling variables change and the turbulence scheme cannot adequately handle the different flow regimes; for instance, too much mixing will not represent sharp gradients of low-level jets (Bosveld et al., 2014). Sun et al. (2022) and Nielsen-Gammon et al. (2007) have shown that local ABL wind observations have a positive impact on the analysis of mesoscale models. So far, wind measurements from HAB tracks have never been assimilated into NWP models, and a feasibility study is recommended.
In this paper, we aim to assess the quality of HAB observations. We also want to know if they can help detect and solve model deficiencies. Finally, we start with a feasibility study to assess the impact of HAB observations in a DA system. We investigate whether a wind observation by HAB can push an NWP model in the right direction in a rather complex situation.
Weather balloons can obviously be used to obtain a large number of observations, but they are expensive, are launched only infrequently at sparse locations and remain in the ABL for a relatively short time. In contrast, a HAB provides low-cost observations and remains in the ABL throughout its flight. HAB-derived wind data are typical crowd-sourced data because HABs are not primarily launched for gathering wind information. In fact, ballooning is a leisure activity that can also provide useful wind data.
In Section 2, we describe the high-quality Cabauw mast wind measurements and the HAB wind observations and compare them. We study the HAB wind error and investigate how this error behaves as a function of distance to the Cabauw tower. Subsequently, we give an overview of the HARMONIE-AROME model in Section 3 and explain the pre-processing of the HAB data, which is necessary for comparison with an NWP model. In Section 4, we first validate the HARMONIE-AROME model with Cabauw wind mast data, which provide the reference. Then we repeat the procedure with HAB wind data. In Section 5, we present the study of a case that took place during conditions prior to approaching severe weather. The impact of a single HAB wind observation is studied in Section 6 with a DA experiment. Subsequently, in Section 7, we discuss all the obtained results of this paper. Finally, conclusions are drawn in Section 8.

| Cabauw wind observations
The Cabauw meteorological tower is located in the western part of the Netherlands (51.971° N, 4.927° E) in a predominantly rural area. In the north, there are scattered farmhouses; in the east, there is the village of Lopik; and the other sectors comprise open fields and the river Lek. The average roughness length is 0.15 m (Verkaik & Holtslag, 2007). Cup anemometers and wind vanes are mounted at heights of 10, 20, 40, 80, 140 and 200 m. The accuracy of the cup anemometer is 1% for wind speeds (or 0.1 m s−1 for low wind speeds), and the accuracy of the wind vane is 3° for wind directions. Precautions are taken to avoid large flow obstruction from the 213-m-tall mast and the main building. The response length of the cup anemometer is 3 m, which means that air has to travel 3 m before the anemometer has adapted to 63% of a stepwise wind change. In the Cabauw data set, we have selected timeslots that correspond to the start and end time of the HAB trajectories. The wind tower data are available as 5-min averages, which corresponds to the processed time resolution of HAB wind data (De Bruijn et al., 2020).

| HAB wind observations
Our data set consists of 90 HAB flights during the months of May-September 2018. We sent an email to the balloonists with a request for HAB flight data from the surroundings of the Cabauw mast. Their responses are the basis of our data set. Basically, the HAB data are the global navigation satellite system (GNSS) data from the pilot's navigator (Figure 1). Two successive positions in combination with the time interval yield the velocity of the balloon, which is taken as the velocity of the ambient air (De Bruijn et al., 2016). The accuracy of the measured position depends mainly on the constellation of the GNSS satellites. If we average in time (5 min), the typical values for the standard deviation in the horizontal and vertical planes become 2.5 and 30.0 m, respectively (De Bruijn et al., 2020). Note that we neglect the altitude difference between the GNSS receiver in the gondola and the centre of mass of the aircraft. An HAB is a large body with substantial inertia and does not respond immediately to a changing wind. In De Bruijn et al. (2016), the response length for an HAB has been derived; for an average-sized HAB, the response length is approximately 100 m, which means that air has to travel 100 m along with the HAB before the HAB has adapted to 63% of a stepwise wind change. With an initial difference of 2 m s−1, this takes about 5 min; see De Bruijn et al. (2016).
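The derivation of a wind vector from two successive GNSS positions can be illustrated with a minimal sketch. It is not the processing chain of De Bruijn et al. (2016): the function name `wind_from_fixes`, the spherical-Earth radius and the local tangent-plane approximation are assumptions made here for illustration, and the example coordinates are invented.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # spherical Earth, an assumption of this sketch


def wind_from_fixes(lat1, lon1, t1, lat2, lon2, t2):
    """Return (u, v) in m/s from two GNSS fixes (degrees, seconds)."""
    dt = t2 - t1
    if dt <= 0:
        raise ValueError("the second fix must be later than the first")
    lat_mid = math.radians(0.5 * (lat1 + lat2))
    # Eastward and northward displacement on a local tangent plane
    dx = math.radians(lon2 - lon1) * math.cos(lat_mid) * EARTH_RADIUS_M
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return dx / dt, dy / dt


# Synthetic example: two fixes 300 s apart (values invented for illustration)
u, v = wind_from_fixes(51.9710, 4.9270, 0.0, 51.9737, 4.9314, 300.0)
speed = math.hypot(u, v)
```

Over the short distances covered in 5 min, the tangent-plane approximation should be adequate; a geodesic library would be the safer choice for longer legs.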
For the present study, a subset is defined by a circular area with a 30-km radius centered at the location of the Cabauw mast. The start location can be inside or outside this zone and is determined by the balloonist. A part of the flight should be in the circular area. A good start location should have no tall obstacles and an undisturbed wind. The pilot makes an estimate of where he could possibly land using predicted winds, payload and the amount of fuel. A favourable landing place is an uninhabited area far away from power lines, (rail)roads and inland waters. The flights commence in the evening at around 18:00 UTC, and the duration of the flight is on average between 60 and 90 min.
In Figure 2, we see that HAB flights do not occur every day, and gaps can be recognized. HAB flights can take place only when the weather conditions are favourable, for example, a stabilizing ABL with light winds, no wind gusts, clouds or rain. Note that most of the flights take place in July and August, typical months characterized by calm weather and a period of long daylight. Ideally, the HAB flights should have passed over the Cabauw site.

| Assessment of the HAB error
Now, we compare the wind components measured by HAB and Cabauw and investigate how the deviation between the HAB and the Cabauw mast wind data behaves as a function of distance to the Cabauw mast. As a first step, wind observations of the mast are vertically interpolated to the HAB elevation. We have selected those HAB observations that are in the height range of the 213-m-tall mast. In an area with a radius of 15 km, the standard deviations of the difference between the zonal (u) and meridional (v) wind components measured by the Cabauw mast and HAB are rather small, $$ {\sigma}_u=0.40\;{\mathrm{m\,s}}^{-1} $$ and $$ {\sigma}_v=0.45\;{\mathrm{m\,s}}^{-1} $$, but the sample size is also small (N = 440). In order to increase the sample size, we eventually decided to enlarge the radius to 30 km; the standard deviations then increase slightly to $$ {\sigma}_u=0.65\;{\mathrm{m\,s}}^{-1} $$ and $$ {\sigma}_v=0.69\;{\mathrm{m\,s}}^{-1} $$, but the number of observations roughly doubles (N = 927). In Figure 3, the frequency diagram shows the distribution of the differences in wind components between HAB and Cabauw. These differences are small in general, and the largest difference is just above 3 m s−1. In Figure 4, the HAB and Cabauw wind observations are presented in a scatter diagram. The cloud of points is close to the 1-to-1 line, which means that the uncertainty in both observational systems is small. Both systems are sampling a neutral-stable ABL with a rather homogeneous wind field. Further, the maximum u, v components are not beyond 6 m s−1, which confirms the light wind regime. Table 1 shows that part of the wind error can be attributed to the distance between HAB and the Cabauw mast, and the error increases with increasing distance. As the uncertainty in the Cabauw cup anemometer wind data is substantially smaller, the estimated errors in the first bin (0-15 km distance) provide the best estimate of the HAB error; for larger distances, the total error is the summation of the HAB error and wind variations.
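The two steps of this comparison, vertical interpolation of the mast winds to the HAB elevation and the statistics of the differences, can be sketched as follows. The interpolation method is not specified in the text; linear interpolation between the instrumented levels is an assumption of this sketch, as are the function names and the synthetic profile.

```python
import bisect
import statistics

MAST_LEVELS_M = [10, 20, 40, 80, 140, 200]  # instrumented Cabauw levels from the text


def interpolate_mast(values, height_m):
    """Linearly interpolate one value per mast level to height_m (assumed method)."""
    if not MAST_LEVELS_M[0] <= height_m <= MAST_LEVELS_M[-1]:
        raise ValueError("height outside the instrumented mast range")
    i = bisect.bisect_right(MAST_LEVELS_M, height_m)
    i = min(max(i, 1), len(MAST_LEVELS_M) - 1)  # keep a valid bracketing pair
    z0, z1 = MAST_LEVELS_M[i - 1], MAST_LEVELS_M[i]
    w = (height_m - z0) / (z1 - z0)
    return (1 - w) * values[i - 1] + w * values[i]


def difference_stats(hab, mast):
    """Mean and (population) standard deviation of HAB minus mast."""
    diffs = [h - m for h, m in zip(hab, mast)]
    return statistics.fmean(diffs), statistics.pstdev(diffs)


# Synthetic u-profile on the mast levels (invented values, m/s)
u_profile = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
u_at_30m = interpolate_mast(u_profile, 30.0)
```

Applying `difference_stats` separately to the samples within 15 km and within 30 km of the mast would give distance-dependent values of the kind summarized in Table 1.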

| HARMONIE-AROME
The main characteristics of the HARMONIE-AROME model (Bengtsson et al., 2017) are summarized in Table 2. A 3D-Var DA scheme is used to assimilate conventional observations from synops, buoys, ships and radiosondes as well as from aircraft and satellites, which are available in the model domain. In a 3D-Var DA scheme, it is assumed that all observations have been measured at the analysis time. This is generally true for conventional observations. However, aircraft and satellite observations are asynoptic, introducing a time shift between observation and the model background state (Marseille & Stoffelen, 2017). This timing error can be mitigated by choosing a narrow time window or resolved by using a 4D-Var DA scheme, which is currently available in research mode. Note that HAB observations are also asynoptic.
HARMONIE-AROME is embedded in the global ECMWF (European Centre for Medium-Range Weather Forecasts) model, and it receives large-scale information via the lateral boundaries. As such, the model benefits indirectly from the worldwide satellite observations assimilated in the ECMWF model (Bauer et al., 2015). HARMONIE-AROME has a boundary-layer scheme that is based on the evolution of the turbulent kinetic energy (TKE) equation (Lenderink & Holtslag, 2004). In 2.5-km models, such as the non-hydrostatic HARMONIE-AROME model, the spatial scales smaller than about 7 times the model grid size, that is, roughly 20 km, are not resolved (Mile et al., 2021; Skamarock, 2004), and they have to be parameterized; ideally, the model departure should not contain scales smaller than 20 km.
In this paper, we use only the model background state, which is the forecast of the previous assimilation cycle with a lead time equal to the cycling time or assimilation window length. Statistics of the observations minus background, denoted as (O-B), are an important diagnostic for NWP models to check for model and/or observation errors. Biased observations are detrimental to data assimilation and should be removed. We have chosen a model set-up with a cycling time of 3 h, which means that an analysis is made every 3 h, in which the observations and the +03 h forecast are optimally combined using a variational technique; the resulting analysis is the initial state for the next cycle. We have used a hind-cast experiment, which means that all observations are available for data assimilation and forecasting. We have used background information (+03 h forecast) to validate the observations and to trace back biases. Note that the HAB and Cabauw observations in our study are not assimilated in the HARMONIE-AROME model, which allows an independent comparison between model and observations.
FIGURE 4 HAB flights as depicted in Figure 1, but now cross-validated with mast observations at Cabauw (The Netherlands) during May-September 2018.
TABLE 1 HAB errors as function of distance to the Cabauw mast.

| Pre-processing of the HAB data
The HAB data have to be processed before they can be applied in the NWP model. We describe the steps to be taken. HAB observations can be considered as a sequence of point observations and consist of three-dimensional coordinates and a timestamp. The coordinates are referenced to a spheroid of a geographical coordinate system. A spheroidal height is a geometric quantity and does not have a physical base and may fall above or below the actual earth's surface. Therefore, the spheroidal heights have to be converted to gravity-related elevations. This is usually done in the balloonist's navigator. Subsequently, the elevations have to be merged into the hybrid coordinate system of HARMONIE-AROME (Bengtsson et al., 2017). The model levels are defined by the a and b coefficients and the surface pressure. The vertical plane in the hybrid coordinate system is defined as follows:
$$ P_i = a_i + b_i\,P_s \qquad (1) $$
where $P_i$ is the pressure at model level i; $a_i$, $b_i$ are the coefficients that determine the closeness of the system to the σ-coordinates ($a_i = 0$) or the p-coordinates ($b_i = 0$); and $P_s$ is the surface pressure. The coordinate system is nonorthogonal and terrain-following, and the vertical spacing is defined with 15 levels below an elevation of 2000 m. The model levels, which are expressed in pressure coordinates using Equation (1), have to be transformed to z-coordinates. To achieve this, we assume a temperature and humidity profile of the standard atmosphere as proposed by Holton (1967) and integrate the thickness equation to obtain the required elevation. In this equation, R is the gas constant, T the average layer temperature, q the specific humidity, $g_0$ the gravitational acceleration at surface level, P the pressure and $P_s$ the surface pressure. The advantage of using the standard atmosphere is that observations and NWP output are not mixed, thus avoiding correlation errors.
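The conversion from hybrid model levels to elevations can be sketched in a few lines. The thickness equation itself is not preserved in this text; the sketch assumes the common hypsometric form with a virtual-temperature correction,
$$ \Delta z = \frac{R\,T\,(1+0.61\,q)}{g_0}\,\ln\left(\frac{P_s}{P}\right), $$
which uses the symbols listed above but may differ in detail from the expression actually used. The constants, function names and coefficient values below are likewise illustrative assumptions, not HARMONIE-AROME settings.

```python
import math

R_DRY = 287.05  # gas constant for dry air, J kg-1 K-1 (assumed value)
G0 = 9.80665    # standard gravity, m s-2 (assumed value)


def level_pressure(a_i, b_i, p_s):
    """Pressure at a hybrid model level: P_i = a_i + b_i * P_s."""
    return a_i + b_i * p_s


def layer_thickness(p_bottom, p_top, t_mean, q_mean=0.0):
    """Thickness (m) of the layer between two pressures (assumed hypsometric form)."""
    t_virtual = t_mean * (1.0 + 0.61 * q_mean)
    return R_DRY * t_virtual / G0 * math.log(p_bottom / p_top)


# Illustrative level with invented coefficients (a_i = 0 Pa, b_i = 0.99)
p_surface = 101325.0
p_level = level_pressure(0.0, 0.99, p_surface)
z_level = layer_thickness(p_surface, p_level, t_mean=288.0)
```

For a full profile, the thicknesses of successive layers would be accumulated from the surface upward, each with its own layer-mean temperature and humidity from the standard atmosphere.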
Alternatively, profiles of temperature and humidity from the NWP model could have been used, which would give a better estimate of elevation, accepting the possibility of correlated errors. The HAB data have a high temporal resolution. In our data set, there are flights with a sampling rate of 4 s. To reduce the noise of GNSS positions, the data are averaged over 5-min intervals; see De Bruijn et al. (2016). Subsequently, the model fields are interpolated to the locations of the HAB observations. The pre-processing is completed with the elimination of the measurements below 10 m. Observations below this level are unreliable because an HAB usually comes to a stop by being dragged over the ground.
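The averaging and height-screening steps can be illustrated with a short sketch. The use of non-overlapping 5-min blocks and the application of the 10-m threshold to the averaged height are assumptions made here; the text does not specify either detail, and the function name and synthetic track are hypothetical.

```python
from collections import defaultdict


def block_average(samples, window_s=300.0, min_height_m=10.0):
    """Average (t, lat, lon, height) samples in fixed windows; drop low blocks."""
    bins = defaultdict(list)
    for t, lat, lon, z in samples:
        bins[int(t // window_s)].append((t, lat, lon, z))
    averaged = []
    for key in sorted(bins):
        rows = bins[key]
        n = len(rows)
        mean = tuple(sum(col) / n for col in zip(*rows))
        if mean[3] >= min_height_m:  # screen out near-surface blocks
            averaged.append(mean)
    return averaged


# Synthetic track: 4-s sampling for 10 min at a constant height of 100 m
track = [(float(t), 51.97, 4.93, 100.0) for t in range(0, 600, 4)]
blocks = block_average(track)
```

Feeding consecutive averaged positions into a displacement-over-time calculation then yields one wind vector per 5-min interval.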

| VALIDATION OF HARMONIE-AROME APPLYING CABAUW MAST AND HAB WIND DATA
First, the Cabauw tower observations are compared with HARMONIE-AROME, and subsequently, the exercise is repeated with the HAB wind data. We recall that the verification period is May-September 2018 and the timestamps are identical to the HAB observation data set. The Cabauw validation can be regarded as a reference validation. In Figure 5, we compare the first guess model state (+03 h forecast) with the mast observations. At the lowest levels (10 and 20 m), the bias of the u-component is negative; at higher levels, it becomes almost zero. The v-component shows a positive bias, which decreases with height, becoming negative above 40 m. This can be explained by the fact that the mast observations are not representative for a 2.5-km grid, especially at the lowest levels. Next, we focus on the HAB data. The HAB winds have varying coordinates and are present only during a flexible time slot of about 90 min. HAB flights usually occur when the atmosphere is changing from unstable to stable, and vice versa. In Figure 6, the +03 h forecast (background) is compared with the HAB wind data in the same vertical range as the Cabauw observational tower. The data are binned in vertical intervals of 5 m, and the bias and standard deviation per bin are calculated. The number of observations per bin is presented in the right panel. Because the HAB observations are from a moving platform, the land surface characteristics vary along the trajectory, for instance, land use, albedo and roughness. These heterogeneities are also defined in the NWP model context, but because of the limited grid box size, not every detail is described. This has an impact on the wind, and this deviation is the so-called representation error. Also in Figure 6, we find a slight positive bias in the v-component and a negligible bias in the u-component. The mean bias of the wind vector is 0.5 m s−1, and the mean standard deviation (σ) is 2.5 m s−1.
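The binning procedure behind such bias and standard deviation profiles can be sketched as follows. This is an illustration only: the function name, the use of the population standard deviation and the synthetic numbers are assumptions, not the verification code used for the figures.

```python
import statistics
from collections import defaultdict


def omb_profile(heights_m, obs, background, bin_m=5.0):
    """Return {bin lower edge: (bias, std, count)} of observation minus background."""
    bins = defaultdict(list)
    for z, o, b in zip(heights_m, obs, background):
        bins[int(z // bin_m)].append(o - b)
    profile = {}
    for k in sorted(bins):
        d = bins[k]
        profile[k * bin_m] = (statistics.fmean(d), statistics.pstdev(d), len(d))
    return profile


# Synthetic example: three samples in the 10-15 m bin, one in the 50-55 m bin
profile = omb_profile([11, 12, 13, 52], [3, 4, 5, 2], [2, 2, 2, 1])
```

Reporting the count per bin alongside the statistics, as in the right panel of the figures, helps to judge how much weight the values in sparsely populated bins deserve.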
It is encouraging to see that other wind observing systems show similar errors. For example, De Haan (2011, 2016) showed that the accuracy of wind observations derived from an air traffic control surveillance radar (Mode-S EHS) was around 2.5 m s−1 when compared with radiosonde and [...]
In Figure 7, we present the complete HAB data set, including both the observations at heights above the Cabauw tower and at distances more than 30 km from the Cabauw tower. The maximum data density per bin is at 120 m because this is the cruising height at which the balloon is safe from obstacles such as trees and power lines and where the passengers in the gondola can still enjoy the scenery below.
The height-averaged bias is 0.5 m s−1, and the height-averaged standard deviation (σ) is 2.5 m s−1. We also recognize a varying bias with height, which was also present in the profile of the restricted area (see Figure 6). This variation of bias with height was also noted by De Rooy and de Vries (2017). They discovered that the TKE scheme underperforms in weakly stable conditions. The wind speeds were overestimated, and this could be resolved by allowing more mixing.

| CASE STUDY: SMALL LOW-PRESSURE AREA WITH RAPIDLY CHANGING WIND FIELDS
The overall statistics as presented in the previous section may hide the characteristics of extreme events. In this section, we focus on a case with pre-conditions of adverse weather to demonstrate the potential of HAB wind observations. On 7 August 2018, in the late afternoon, there was a small low-pressure system in the southern part of the North Sea, which deepened and moved in a north-easterly direction over the Netherlands. The wind changed rapidly in strength and direction during the HAB flight. The HAB took off in Buren (36 km from Cabauw) in quiet conditions; see Figure 8. Immediately after take-off, there was considerable wind shear in terms of wind direction. This was probably caused by local conditions and by the fact that wind usually veers with height (De Bruijn et al., 2016). Note that, as shown in Figure 9, initially the HAB speed was less than the predicted wind speed, and the HAB moved westerly with a velocity of 2 m s−1 as it climbed. Later during the flight, the wind started to increase, which was not predicted by the model.
In Figure 9, we show the HARMONIE-AROME wind data interpolated to the HAB trajectory. For the model data, between 18:38 and 18:52 UTC, there is a small increase in u and a large decrease in v, which implies that the wind direction has backed. A similar pattern (increase in u, decrease in v) is visible in the HAB data from 18:48 to 18:52 UTC. Both sets of data therefore indicate a change in direction. Of course, the HAB is descending at this time, so this is exactly what would be expected because of the surface roughness/drag (even without any change in the geostrophic wind direction). The change in direction is also visible in Figure 8. The balloonist was forced to land in the outskirts of Culemborg (25 km from Cabauw), fearing a further increase of the wind speed. At the Cabauw mast, at 200 m height at 21:00 UTC, a wind speed of more than 10 m s−1 was measured, which confirmed that the pilot had made the right decision. It is clear that HARMONIE-AROME was not able to pick up the right position of this depression, resulting in erroneous wind fields, which was already indicated by the large (O-B) values of this case.

| STUDY OF ANALYSIS IMPACT
A question that arises is whether assimilation of observations from HABs can bring the model state closer to the actual atmospheric conditions. From data assimilation theory, the 3D-Var analysis equation reads as
$$ J(\vec{X}) = (\vec{X}-\vec{X}_b)^{T}\,\mathbf{B}^{-1}\,(\vec{X}-\vec{X}_b) + (\vec{Y}-H(\vec{X}))^{T}\,\mathbf{R}^{-1}\,(\vec{Y}-H(\vec{X})) $$
where $\vec{X}$ is the model state in terms of the state variables (ps, u, v, t and q), $\vec{Y}$ are the observations, $\mathbf{B}$ is the background error covariance matrix, $\mathbf{R}$ is the observation error covariance matrix and $H(\vec{X})$ is the observation operator. The 'a priori' information is the model forecast valid at the analysis time, the model background. The challenge in data assimilation is to find the optimal analysis field $\vec{X}_a$ that minimizes a (scalar) cost function, which is defined as the distance between $\vec{X}$ and the background $\vec{X}_b$, weighted by the inverse of the background error covariance $\mathbf{B}$, plus the distance to the observations $\vec{Y}$, weighted by the inverse of the observation error covariance $\mathbf{R}$. The minimum variance solution is
$$ \vec{X}_a = \vec{X}_b + \mathbf{K}\,(\vec{Y}-H(\vec{X}_b)) $$
where $\mathbf{K}$ is defined as
$$ \mathbf{K} = \mathbf{B}\mathbf{H}^{T}\,(\mathbf{H}\mathbf{B}\mathbf{H}^{T}+\mathbf{R})^{-1} $$
with $\mathbf{H}$ the linearized observation operator. $\mathbf{K}$ is the Kalman gain matrix that determines the spatial structure of the increment and the relative weight given to the observation and background in the analysis. A complete data assimilation experiment requires a large sample to demonstrate the statistical significance of the results. This is outside the scope of this paper. Instead, we will focus on the impact of a single wind vector observation from HAB on the model state and discuss the need for a more extensive impact study.
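How the gain matrix spreads a single observation over the model state can be illustrated with a toy example on a one-dimensional row of grid points. This is not the HARMONIE-AROME 3D-Var system: the Gaussian shape of the background error correlations, the error magnitudes, the grid and the function names are all assumptions chosen for illustration. With one directly observed grid point, the term to be inverted reduces to a scalar.

```python
import math


def gaussian_b(positions_km, sigma_b, length_km):
    """Background error covariance with Gaussian correlations (assumed shape)."""
    return [[sigma_b**2 * math.exp(-0.5 * ((xi - xj) / length_km) ** 2)
             for xj in positions_km] for xi in positions_km]


def single_obs_analysis(x_b, b_matrix, obs_index, y, sigma_o):
    """Analysis for one observation of the state at obs_index."""
    innovation = y - x_b[obs_index]                      # observation minus background
    denom = b_matrix[obs_index][obs_index] + sigma_o**2  # H B H^T + R, a scalar here
    gain = [row[obs_index] / denom for row in b_matrix]  # column of K
    return [xb + k * innovation for xb, k in zip(x_b, gain)]


# Toy set-up: five grid points 20 km apart, uniform background wind of 3 m/s
grid_km = [0.0, 20.0, 40.0, 60.0, 80.0]
B = gaussian_b(grid_km, sigma_b=2.0, length_km=20.0)
x_a = single_obs_analysis([3.0] * 5, B, obs_index=2, y=5.0, sigma_o=0.7)
increments = [xa - 3.0 for xa in x_a]
```

The increment is largest at the observed point, stays below the full innovation because the observation error is non-zero, and decays symmetrically with distance according to the assumed background error correlations.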
For the single wind vector experiment, we used the three-dimensional variational (3D-Var) assimilation system, operational at KNMI. We focus on the case study presented in Section 5. In a 3D-Var assimilation system, all observations are assumed to be measured at the analysis time, that is, 18:00 UTC in our case. As Figure 8 shows, many HAB wind vector observations are available near the analysis time, and we select the one at the exact analysis time. At KNMI, we run 3D-Var eight times per day, i.e., in 3-hourly cycles. In 3D-Var, we start the 18:00 UTC analysis from the 3-h forecast from the previous analysis at 15:00 UTC, the so-called background or first guess ($\vec{X}_b$). Figures 10 and 11 show the two-dimensional increments of the zonal (u) and meridional (v) wind components for the selected case and the HAB wind vector observation at 18:00 UTC. It is good to note that the structure of the increment is mainly determined by the background error covariance matrix, which is part of the Kalman gain matrix. For single-observation experiments, the increment structure is isotropic (concentric) by construction, with the maximum amplitude at the location of the observation. Assimilation of the complete wind vector is not a single-observation experiment, which explains the non-isotropic structures of the increments.
FIGURE 11 As in Figure 10, but now (A-B) for the v-component increment.
The static nature of the used background error covariance matrix does not guarantee a similar positive effect on the model state away from the observation, in particular for complex atmospheric conditions as for this typical case. The use of additional observations along the balloon track can further improve the simulated atmospheric state. This requires the correct use of the time of the additional observations: in other words, a four-dimensional variational (4D-Var) assimilation system. 4D-Var is currently in an experimental stage and not yet operational at KNMI.

| DISCUSSION
In this paper, we have seen that HAB wind observations are a unique data source for sampling the ABL, in particular for the beginning of a nocturnal boundary layer, and to diagnose the initial conditions. However, there are still some issues that have to be addressed. First of all, the data set of HAB data is rather limited, but there are potentially more data available. Real-time collection can be used if an appropriate infrastructure is available. Offline collection of data gives access to an abundance of data, because balloonists tend to store their flight parameters. In the future, the metadata of the HAB should be collected because this might be useful for the processing of the data. Knowledge of the call sign of the HAB would give access to typical balloon parameters such as volume, shape (balloon type) and mass.
HAB data can be collected using smartphones (De Bruijn et al., 2016); alternatively, transponders can be used. More and more HABs are equipped with transponders, so they are under the surveillance of Air Traffic Control. These data are also used by www.luchtballonradar.nl, a website where HABs can be tracked in real time. Interestingly, this website offers an archive for completed flights as well.
Another issue is on what scale it is still meaningful to assimilate information. What scales are observable and what scales are described by the model? The next step would be to assimilate all available HAB observations to improve the initial stages of typical ABL phenomena in the model. Depending on the atmospheric scales to adapt in the analysis, one could choose to assimilate all HAB observations along a trajectory, but with reduced weight to avoid overfitting and the introduction of observed spatial scales in the analysis, which the model cannot resolve (Skamarock, 2004).
The high-resolution HARMONIE-AROME is a promising model and offers numerous opportunities for improvement. The asynchronous HAB observations will probably have more impact as soon as the 4D-Var assimilation becomes available. The impact should also be considered relative to other observations. In this study, the focus was on the validation of the +03 h forecasts. Clearly, HAB wind observations can also be applied for longer forecast periods, and this is a subject for future research. It should be realized that verification in terms of RMSE is not sufficient to validate the high-resolution model outcome. More advanced verification, such as neighborhood methods, is necessary to reveal the benefit of high-resolution models and to mitigate the double penalty problem (Van der Plas et al., 2017).

| CONCLUSIONS
This study showed that HAB flights can provide valuable wind information in the ABL that is in agreement with other observations. In an area with a 30-km radius, the HAB wind data deviate only slightly from the high-quality Cabauw wind mast observations during neutral-stable conditions. The standard deviations for HAB-measured u and v wind components relative to those from the Cabauw mast are $$ {\sigma}_u=0.65\;{\mathrm{m\,s}}^{-1} $$ and $$ {\sigma}_v=0.69\;{\mathrm{m\,s}}^{-1} $$, respectively.
Comparison against the background state of the HARMONIE-AROME model revealed a standard deviation of 2.5 m s−1 for the wind vector error, which is in the same range as aircraft measurements and radiosondes. HAB flights could provide data from local flows, which are interesting phenomena, especially when such phenomena are not captured by an NWP model or by the regular observational network. We have shown that HAB observations can be ingested by the data assimilation module of HARMONIE-AROME and that they have the potential to push the NWP model in the right direction even in complex baroclinic conditions.
To conclude, these crowd-sourced observations are a welcome addition to the existing observation network, can be used for a better understanding and forecasting of the ABL and can be applied in NWP models.