- Top of page
- 1. Introduction
- 2. Methods
- 3. Results
- 4. Discussion
Experimental weather radars are being developed that could enhance the severe weather warning process by providing higher resolution data sensed closer to the ground and with faster update rates. Because wind speed is an important criterion in the issuance of severe thunderstorm warnings, this research investigates the impact of adding these new data to the forecaster decision-making process. In a static case review setting, 30 National Weather Service (NWS) forecasters evaluated six convective weather cases under two conditions: (1) using conventional WSR-88D weather radar data, and (2) using both WSR-88D and additional data from an experimental four-radar network. Forecasters' predictions of ground level wind gusts, 2–5 min into the future, were compared to measurements from ground-based wind sensors. When provided with the additional radar data, participants significantly improved the accuracy of their wind speed assessments (absolute error reduced from 5.9 m s−1 to 4.0 m s−1; p < 0.001), increased their assessment confidence ratings (p < 0.001), forecasted significantly greater wind speeds (20.4 m s−1 as opposed to 17.1 m s−1; p < 0.001), and increased the number of affirmative decisions to warn from 15 to 35 (p = 0.001). While the addition of high resolution, low altitude, rapidly updating radar data is shown to have both qualitative and quantitative benefits, training and warning policy implications for the incorporation of new technology must also be carefully considered, as increased accuracy, confidence and higher wind speed estimates may lead to more warnings. Copyright © 2011 Royal Meteorological Society
Severe weather such as hail, high winds, flooding and tornadoes threatens lives and property nearly every day. In 2008 alone, hail caused $464 million in property damage, high winds from thunderstorms caused 28 fatalities and $1.26 million in property damage, and flash floods caused 58 fatalities and $1267 million in property damage (NOAA, 2009). Losses from weather hazards can be reduced if the public is given sufficient warning to take protective action.
Forecasters use remote sensing technologies including radar, satellite and ground sensors to assess and predict hazards, and then to issue and cancel related weather warnings. In the United States, the National Weather Service (NWS) operates 159 Doppler weather radars called Weather Surveillance Radar 1988 Doppler or WSR-88D (Klazura and Imy, 1993) in order to supply data including reflectivity (which relates to precipitation rate) and velocity (which indicates radial wind speed). When operating in a severe storm environment, WSR-88D radars generally perform a complete multiple vertical tilt scan every 4–5 min with a spatial resolution of 1–4 km (Klazura and Imy, 1993). These radars generate both reflectivity and velocity products out to a 230 km radius in range but have more sparse coverage below 2 km (above ground level or AGL (Maddox et al., 2002)).
NWS forecasters make weather hazard assessments and warning decisions using weather products and procedures that help them to maintain a ‘big picture’ awareness, to build conceptual models and to update them with small scale details from radar product interpretation (Andra et al., 2002). Forecasters primarily rely on WSR-88D radar products for real-time weather hazard assessment (Quoetone and Huckabee, 1995; Andra et al., 2002). For example, a forecaster can determine whether a storm is severe based solely on radar products. In the United States, a storm is considered severe when at least one of three conditions is met: surface wind gusts exceed 25.7 m s−1 (50 kt) (determined via interpreting and integrating velocity data), hail exceeds 1.9 cm (3/4 in.) in diameter (determined via interpreting and integrating reflectivity data), or a tornado is produced (determined via interpreting and integrating reflectivity and velocity data plus developing a mental picture of storm structure and evolution) (Galway, 1989).
Ideally, forecasters use conceptual models to identify precursors in the radar data to provide proactive warnings. These conceptual models, along with past experience and knowledge of storm physics, allow a forecaster to project winds ‘seen’ in radar data down to the surface and into the near future. Using radar to assess severe surface winds can be challenging because of inherent limitations in data availability and precision. Due to sampling methods and radar spacing, the data are not available uniformly in space and, in some cases, not at all. Radar beams travel in a straight line, which, because of the curvature of the Earth, limits the coverage area of a radar system to objects above its horizon. Radar beams are pointed at angles (tilts) above the horizon: therefore, the atmosphere low to the ground and far from the radar is not sampled. The radar beam spreads out as it travels, resulting in lower spatial resolution with increased distance from the radar. With respect to velocity, Doppler radars can only detect wind speed by the motion of water droplets or other airborne objects moving along the radar beam. Thus, velocity data show radial wind speed, which depends on the wind-to-beam intersection angle. Radial wind speed is reported as negative for motion towards the radar and positive for motion away from it, while winds travelling perpendicular to the radar beam are not detected.
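The dependence of radial velocity on the wind-to-beam intersection angle can be illustrated with a minimal sketch. The function name `radial_velocity`, its parameters and the 20 m s−1 wind are hypothetical, and the beam is assumed to be horizontal (elevation angle ignored); this is not how any operational radar processor is implemented.

```python
import math

def radial_velocity(wind_speed, wind_toward_deg, beam_azimuth_deg):
    """Return the wind component along a horizontal radar beam.

    wind_toward_deg: direction the wind blows toward, clockwise from north.
    beam_azimuth_deg: direction the beam points away from the radar.
    Positive values mean motion away from the radar, negative values
    mean motion towards it (beam elevation is ignored in this sketch).
    """
    angle = math.radians(wind_toward_deg - beam_azimuth_deg)
    return wind_speed * math.cos(angle)

# Hypothetical 20 m/s wind blowing toward the east (90 degrees),
# viewed along beams pointing east, north-east, north and west.
for azimuth in (90, 45, 0, 270):
    print(azimuth, round(radial_velocity(20.0, 90.0, azimuth), 1))
```

Under these assumptions the wind appears at full strength along a beam aligned with it, is reduced at oblique angles, and vanishes for a beam perpendicular to it.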
Technological advances, such as the introduction of WSR-88D itself, have already made positive impacts on the probability of detection (POD), the false alarm rate (FAR), the critical success index (CSI), and on lead times (Polger et al., 1994; Bieringer and Ray, 1996). Outcome measures such as these are also influenced by procedures and definitions within the NWS. For example, severe thunderstorm warnings, which may cover multiple county areas, may be verified as accurate by a single point report of hail or wind anywhere within the warning area. Also, verification reports coming from the public are more likely to occur in heavily populated areas. Therefore, it can be a challenge to determine whether a warning is accurate.
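For readers unfamiliar with these outcome measures, the sketch below computes them from a warning contingency table. It assumes the definitions commonly used in forecast verification, with FAR taken as the false alarm ratio (false alarms divided by all warnings issued); the function name `warning_scores` and the counts are hypothetical and are not taken from the studies cited.

```python
def warning_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from warning verification counts.

    hits: events that were warned
    misses: events that were not warned
    false_alarms: warnings with no verifying event
    FAR is computed here as the false alarm ratio (an assumption).
    """
    if hits + misses == 0 or hits + false_alarms == 0:
        raise ValueError("need at least one event and one warning")
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi

# Hypothetical counts for illustration only.
pod, far, csi = warning_scores(hits=30, misses=10, false_alarms=20)
print(f"POD={pod:.2f} FAR={far:.2f} CSI={csi:.2f}")
```

Because a single point report anywhere in the warning area counts as a hit, the counts fed to such a calculation already reflect the verification procedures described above.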
New approaches to radar design and deployment, and new data dissemination techniques, could enhance the warning process by providing more precise data that are also more accurate. For example, increases in processing power should allow for more effective signal processing, which can create higher quality data with lower cost transmitters. The recently deployed WSR-88D ‘Super-Resolution’ provides a two- to four-fold improvement in output resolution without a change in transmitter (National Weather Service Radar Operations Center, 2009). Smaller antenna designs and low cost transmitters can allow for multiple radar nodes to overlap coverage of an area, thereby helping to fill gaps in coverage and determine true wind velocities (McLaughlin et al., 2009). Also, phased-array antenna technology creates electronically directed beams with few or no moving parts, allowing for faster scanning and therefore higher data update rates (Heinselman et al., 2008).
While such advances have the potential to improve the weather hazard assessment and warning process, their exact impacts should be quantified in order to influence training, decision support tool design, normative decision making processes and procedures as well as policy. With respect to resolution, Brown et al. (2005) indicate that radars with greater spatial resolution will report radial wind velocities with greater (absolute) magnitudes for the same volume of the atmosphere and will depict severe storm signatures more clearly than their lower resolution counterparts. With respect to lower troposphere observations, new weather features, such as misocyclones, downbursts, or rear inflow jets can be observed (Bluestein et al., 2007; Brotzge et al., 2010). This combination of changes in sampling may lead to higher forecaster wind speed assessments overall and differences in the number of wind-related warnings.
Systematic studies have not yet been designed to determine the impact of lower troposphere observations on NWS forecaster decision-making, let alone to analyse in detail how specific weather features that develop and form in the lower atmosphere would impact the warning process. Thus, there is a need for radar system analyses that investigate the quantitative impact of improved design features such as spatial resolution and lower troposphere observations on forecaster decision making (Heideman et al., 1993; Doswell, 2004).
To evaluate the impact on the forecaster decision making process, quantitative outcome and process measures should be considered. For example, to evaluate hazard assessments, judgements can be compared to ground truth where available to determine accuracy. Qualitative measures, such as confidence (Murphy and Winkler, 1984; Nadav-Greenberg and Joslyn, 2009), can also provide insight into how data are affecting a forecaster's decision process. While accuracy is a straightforward measure, previous studies have shown that additional information does not always increase skill (Stewart et al., 1992; Heideman et al., 1993). Also, many researchers have demonstrated overconfidence in self-ratings (Oskamp, 1965; Einhorn and Hogarth, 1978; Fischhoff and MacGregor, 1982), but this has been shown to be a very complex issue (Klayman et al., 1999) and there is some evidence that forecasters are a group of experts who are better calibrated than most (Murphy and Winkler, 1977).
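Comparing judgements with ground truth can be sketched as a mean absolute error calculation. The function name `mean_absolute_error` and the wind speeds below are hypothetical and serve only to illustrate the accuracy measure; they are not data from this study.

```python
def mean_absolute_error(assessments, observations):
    """Mean of |assessment - observation| over paired values (m/s)."""
    if len(assessments) != len(observations) or not assessments:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(a - o) for a, o in zip(assessments, observations)]
    return sum(errors) / len(errors)

# Hypothetical forecaster assessments and sensor observations (m/s).
assessed = [18.0, 22.0, 15.0]
observed = [20.0, 20.0, 20.0]
print(mean_absolute_error(assessed, observed))
```

Taking absolute values means that over- and under-estimates do not cancel, so the measure reflects the size of the error rather than its direction.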
The Engineering Research Center for the Collaborative Adaptive Sensing of the Atmosphere (CASA) is creating a new paradigm for radar systems based on dense networks of low-cost Doppler radars (McLaughlin et al., 2009). CASA radars are designed with a shorter range than WSR-88D (40 vs 230 km) and they can be deployed with overlapping regions of coverage (30 km spacing). When compared to WSR-88D, these technological changes result in increased spatial resolution (median 0.5 vs 2.5 km), temporal resolution (update rates of 60 s vs 4–5 min), and more complete coverage at lower elevations (100% coverage below 1 km AGL vs 35% (McLaughlin et al., 2009)). To address challenges related to velocity determination, radars can be deployed closer together, thereby creating conditions where multiple radars can scan the same portion of the atmosphere. In addition to physical design and layout, a network of CASA radars automatically detects weather features, generates scanning priorities, and allocates sensor resources across the coverage domain (Pepyne et al., 2008; Zink et al., 2008). The dense network of sensors concept from CASA increases the opportunity for a variety of wind-to-beam intersection angles, further improving velocity detections.
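Why multiple viewing angles help can be shown with a simplified two-radar retrieval. The sketch assumes horizontal beams, a purely horizontal wind and exact measurements; the function name `wind_from_two_radars` and all values are hypothetical, and this is not a description of the CASA processing chain.

```python
import math

def wind_from_two_radars(vr1, az1_deg, vr2, az2_deg):
    """Recover (u, v) wind components from two radial velocities.

    Each radial velocity is modelled as u*sin(az) + v*cos(az), where az is
    the beam azimuth (clockwise from north) at the sampled point, u is the
    eastward and v the northward wind component. Positive = away from radar.
    """
    a1, a2 = math.radians(az1_deg), math.radians(az2_deg)
    det = math.sin(a1 - a2)  # determinant of the 2x2 system
    if abs(det) < 1e-6:
        raise ValueError("beams are parallel; the wind cannot be resolved")
    u = (vr1 * math.cos(a2) - vr2 * math.cos(a1)) / det
    v = (vr2 * math.sin(a1) - vr1 * math.sin(a2)) / det
    return u, v

# Hypothetical wind: 10 m/s eastward, 5 m/s northward, seen by two beams.
u_true, v_true = 10.0, 5.0
vr_a = u_true * math.sin(math.radians(30)) + v_true * math.cos(math.radians(30))
vr_b = u_true * math.sin(math.radians(120)) + v_true * math.cos(math.radians(120))
print(wind_from_two_radars(vr_a, 30.0, vr_b, 120.0))
```

The determinant shrinks as the two beams approach the same or opposite directions, which is one way to see why a variety of intersection angles improves the retrieval.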
CASA is currently operating a four node radar testbed in southwest Oklahoma (McLaughlin et al., 2009) (Figure 1). The cursor readouts of Figure 2 illustrate the increase in resolution and decrease in height coverage: 0.5 Kft (152 m) and 53.14 kt (27.3 m s−1) as opposed to 5.9 Kft (1.8 km) and 20.41 kt (10.5 m s−1) respectively. The effective view of Figure 2 is shown by a small rectangle, ∼1.1 km across, in Figure 1. Improvements in data fidelity alone are expected to improve performance (Stewart and Lusk, 1994) and by design, data from this testbed can be described as ‘more relevant’ and ‘high quality’ (due to filling the current sensing gap), attributes which are predicted to increase accuracy and reliability (or consistency) in forecasts (Stewart, 2001). Also, the data contain additional cues, such as very small-scale rotations and strong low-level winds (Brotzge et al., 2010), that are important to severe thunderstorm or tornado warning decisions.
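The unit conversions in the cursor readouts can be checked with a short sketch. It uses the standard definitions of the knot (1852 m per hour) and the foot (0.3048 m); the helper names `knots_to_ms` and `kft_to_m` are illustrative only.

```python
KNOT_IN_MS = 1852.0 / 3600.0   # one knot in metres per second
FOOT_IN_M = 0.3048             # one foot in metres

def knots_to_ms(speed_kt):
    """Convert a speed in knots to metres per second."""
    return speed_kt * KNOT_IN_MS

def kft_to_m(height_kft):
    """Convert a height in thousands of feet to metres."""
    return height_kft * 1000.0 * FOOT_IN_M

# Readouts quoted for Figure 2: (height in Kft, speed in kt).
for label, kft, kt in (("CASA", 0.5, 53.14), ("WSR-88D", 5.9, 20.41)):
    print(label, round(kft_to_m(kft)), "m", round(knots_to_ms(kt), 1), "m/s")
```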
Figure 1. A 250 km wide map of southwest Oklahoma with county borders. Mesonet stations labelled in lower case with small square markers. Radar sites labelled in upper case. Grey shading indicates urban areas including Norman (near koun) and Oklahoma City (near kokc). A small rectangle approximates the viewing window used in Figure 2. CASA radar 40 km range rings also shown
Figure 2. Radial velocity data from CASA KSAO (2° tilt) view (a) and from WSR-88D KFDR (0.50° tilt) view (b) for scenario 5. NINN and CHIC markers are OK Mesonet ground based sensors
As no studies have measured the impact of such gap filling radar on NWS forecaster severe storm warnings, this study focuses on one weather hazard: high winds. Other studies have investigated some aspects of performance or the impact of new data. However, they have not systematically controlled information sources and measured process and outcome measures of experienced practitioners. Doswell (2004) goes so far as to say, ‘To date, the process of weather forecasting by humans has not been subjected to a thorough and comprehensive study’. The present study measures the impact of the addition of CASA radar data (with its greater temporal and spatial resolution and lower troposphere coverage) on forecasters' assessment of near future winds (on the order of minutes) and related warning decisions. This is an extension of a pilot study (Rude et al., 2009) to include a total of 30 NWS forecasters. In a static part-task setting using a case review paradigm, impacts are measured via forecaster accuracy of predictions of ground level wind gusts, magnitude of these wind assessments, forecaster confidence, and the number of warning decisions.
Based on subject matter expert interviews, both operational and experimental observations of forecasters, and a review of NWS training materials, warning decisions are based on both assessment information (such as visual signatures, current understanding of storm structure, and expected trajectory) and forecaster confidence. When CASA data are provided, this research hypothesizes that surface wind speed assessments will be higher, assessment error will be lower, and forecaster confidence will be higher. This research also hypothesizes that the higher wind speed assessments paired with increased confidence will lead to more affirmative decisions to issue warnings.
Because wind speed is a criterion in severe thunderstorm warnings, the purpose of this study was to measure the impact of the addition of high resolution, lower troposphere radar data on wind speed assessments. Operational forecasters were asked to make wind assessments under two data source conditions, WSR-88D only and WSR-88D with CASA. When provided with the additional CASA radar data, forecasters significantly increased wind speed estimates by 20%, reduced wind speed assessment error by 30%, and increased confidence for wind speed assessments. In addition, 23 of 30 participants provided written feedback that the CASA data confirmed or refined their mental models of the atmosphere. High resolution lower troposphere radar data clearly had positive effects on forecaster performance. Further, it is very promising that forecasters with minimal training were able to integrate data effectively from an experimental radar system which does not have the same noise level and performance characteristics as a production WSR-88D. These results, and the evaluation method developed, are an important part of engineering a successful radar system.
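The rounded percentages can be related to the means reported in the abstract (wind speed 17.1 to 20.4 m s−1; absolute error 5.9 to 4.0 m s−1). The sketch below is only a consistency check on those published, rounded means; the helper name `percent_change` is hypothetical, and the percentages quoted in the text may have been derived from unrounded values.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Means taken from the abstract (m/s): WSR-88D only vs WSR-88D with CASA.
speed_change = percent_change(17.1, 20.4)   # wind speed assessment
error_change = percent_change(5.9, 4.0)     # absolute assessment error

print(f"wind speed: {speed_change:+.1f}%")
print(f"abs. error: {error_change:+.1f}%")
```

The computed changes are roughly +19% and −32%, in line with the approximately 20% and 30% quoted above.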
The part-task case-review setting successfully engaged the interest and motivation of NWS forecasters. Even with a small number of cases, forecasters found ‘real events’ to be engaging. Controlled addition of the new radar data enabled comparison with the conventional system. Qualitative and quantitative measures during the decision process were linked to outcomes, such as accuracy and warnings. The combined feedback gathered by this method is generated by practitioners and specific to a genuine sub-task of their process.
The increase in mean wind speed assessments, for forecasters using both WSR-88D and CASA data sources, agrees with prior work (Brown et al., 2005). The combination of both research efforts shows that increased sampling generates the possibility to capture higher wind speeds, which in turn leads to higher displayed wind speeds and to higher assessments of maximum winds. These higher values in the CASA data were visible in the display and observed by the forecasters, resulting in wind speed assessments higher than with only WSR-88D data. However, some additional CASA data were available at lower altitudes where conditions may have differed. The mean of ‘max on display’ values (across all scenarios) for the CASA source is 14.1 m s−1, whereas that for WSR-88D is 16.6 m s−1. This shows that the forecasters did more than report the last display value from the lowest levels at the target location (otherwise wind speeds should have been lower when given CASA data). Forecasters may have been looking at data values further away from the target to compensate for storm motion and the 2–5 min forecast period.
These higher estimates were closer to the ground truth obtained from automated sensors, resulting in lower mean error. The results show that forecasters were able to reduce their wind speed assessment error using this additional data source. This implies that forecasters are able to sift through the extra data points from increased spatial resolution and find the data that are the most informative to their mental model. Forecasters often commented that they trusted the CASA data more because they were sensed closer to the ground. While these results are promising, this research was conducted in simulated operations. Future work should investigate how these additional data affect on-the-job performance, where information overload could potentially harm performance. In addition, future work should provide a wider range of task and scenario combinations in order to identify the impact of individual forecaster performance.
The shift in warning decisions across all scenarios, from 17% ‘Yes’ with WSR-88D only to 39% ‘Yes’ with WSR-88D and CASA data, is interesting, especially because only two scenarios (the first and second) were covered by an actual warning according to NWS archives. This shift may be related to the increase in wind speed assessments when given CASA data which are based on the higher radial velocity values in the display. However, whether this shift is beneficial is a matter of NWS policy, including the event verification process. Since most scenarios had near but sub-severe winds, it seems appropriate that some but not all warning decisions were altered. This implies the new data supported both negative and affirmative warning decisions.
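The percentages above can be reconciled with the counts of affirmative decisions given in the abstract (15 and 35). The sketch assumes 90 decisions per data source condition (30 forecasters, each seeing three scenarios per condition); that denominator is an inference from the design described here, not a figure stated in the text, and the helper name `yes_rate` is hypothetical.

```python
def yes_rate(yes_count, total_decisions):
    """Share of affirmative warning decisions, in percent."""
    return yes_count / total_decisions * 100.0

# Assumed denominator: 30 forecasters x 3 scenarios per condition.
DECISIONS_PER_CONDITION = 30 * 3

wsr_only = yes_rate(15, DECISIONS_PER_CONDITION)
wsr_casa = yes_rate(35, DECISIONS_PER_CONDITION)
print(f"WSR-88D only: {wsr_only:.0f}%  WSR-88D with CASA: {wsr_casa:.0f}%")
```

Under this assumption the counts round to the 17% and 39% quoted above.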
Forecasters revealed confidence in their higher speed estimates both through their higher confidence ratings and through their shift to more warning decisions. In order to understand these confidence ratings better, a Spearman's non-parametric correlation was used to investigate any relation between absolute wind speed assessment error and confidence rating. While this test is not strictly appropriate given the experimental design, its results may still be informative. As expected, this correlation is negative: increased confidence does correlate with decreased error for this experiment (rs = −0.198, N = 175, p = 0.004, one-tailed), suggesting that overconfidence was not a major issue. Future investigation may be able to address this in a more appropriate fashion and provide insights on the differences between ‘familiar’ and ‘experimental’ data sources.
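A Spearman correlation of this kind can be sketched in plain Python as the Pearson correlation of average ranks. The function names and the five data pairs are hypothetical; the sketch does not reproduce the study's data or its significance test.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical confidence ratings and absolute errors (m/s).
confidence = [1, 2, 3, 4, 5]
abs_error = [6.0, 5.0, 5.5, 3.0, 2.0]
print(round(spearman_rho(confidence, abs_error), 3))
```

A negative coefficient, as in this made-up example, indicates that higher confidence tends to accompany smaller error.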
The increase in warnings for the same scenario when given additional radar data has implications for operational forecasters. They will need to adapt their mental models to incorporate the low altitude wind data and the increase in gust estimates. The results indicate an increase in warnings as previously missed low altitude events are now detected. With an increase in detections, the number of false alarms may increase even as more events are properly warned and skill increases. A policy decision may be needed regarding the threshold for severe winds or the size and duration of warnings that would incorporate these low altitude events and properly alert the public (Morss et al., 2010). System enhancements like WSR-88D ‘Super Resolution’ (National Weather Service Radar Operations Center, 2009) increase the spatial resolution of data, while experimental radars like MPAR (Heinselman et al., 2008) have similar spatial resolution with update rates of 1 min. The CASA system features even greater spatial resolution, rapid updates (1 min for the current testbed), and low altitude data. As systems such as CASA, WSR-88D ‘Super Resolution’, and MPAR come on-line, the forecast community may need to revisit the policies and thresholds for issuing warnings.
These results are promising given the limitations of the experimental design. The WSR-88D-first task set was inherently harder than the CASA-first task set based on the difference between the source data ‘max on display’ and the ground truth (Table I). The WSR-88D-first task set had a larger total difference than the CASA-first task set (47.9 vs 27.8 m s−1). For the three scenarios with CASA data in the WSR-88D-first task set, the differences between the CASA ‘max on display’ and the ground truth summed to 36.9 m s−1 and for the three WSR-88D only scenarios, 11.0 m s−1, yielding a total difference of 47.9 m s−1. For the three scenarios with CASA data in the CASA-first task set, the differences for CASA sources summed to 11.4 m s−1 and those with WSR-88D only data summed to 16.4 m s−1, yielding a total of 27.8 m s−1.
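The totals quoted above follow directly from the per-source sums, as the short check below shows. The dictionary layout and key names are illustrative only; the four partial sums are the values given in the paragraph.

```python
# Summed |max on display - ground truth| per task set and source (m/s),
# as quoted in the text; the structure of this dictionary is illustrative.
difference_sums = {
    "WSR-88D-first": {"CASA scenarios": 36.9, "WSR-88D only scenarios": 11.0},
    "CASA-first": {"CASA scenarios": 11.4, "WSR-88D only scenarios": 16.4},
}

totals = {
    task_set: round(sum(parts.values()), 1)
    for task_set, parts in difference_sums.items()
}
for task_set, total in totals.items():
    print(task_set, total)
```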
Also, there were some interactions between task set and data source, which can be expected due to natural variations in the scenarios and the alternating of data sources across participants. This alternation effectively pairs half the scenarios against each other, and no two weather scenarios could ever be perfectly matched. For the ‘max on display’ heuristic across the four combinations of task set and data source, WSR-88D-first with CASA sources has the lowest wind speeds (mean 9.4 m s−1) and the greatest difference from the ground truth (mean of absolute values, 12.3 m s−1), whereas WSR-88D-first with the WSR-88D source has the highest wind speeds (mean 20.2 m s−1) and the smallest ground truth difference (mean of absolute values, 3.7 m s−1). While this is not a perfect match with the observed interactions, it was also shown that forecasters out-performed the simple ‘max on display’ heuristic.
As noted above, the scenarios were placed in time sequential order. This design choice could have influenced performance in later scenarios because of the information given in earlier scenarios. However, any potential effect is representative of real operational forecasting where weather events progress throughout the day.
WDSS-II (Lakshmanan et al., 2007), while an effective tool, would ideally be replaced by standard NWS operations software to remove additional confounds and allow detailed warning generation. This standard software, called AWIPS, provides data from many sensors in real time, allowing forecasters to interrogate them quickly, both visually and with built-in tools (Raytheon Company, 2009). Experienced forecasters have customized AWIPS display configurations as well as strongly developed motor and cognitive routines for accessing radar data in an orderly fashion to build their mental model of the storm. However, the interface control differences between the WDSS-II software used and AWIPS interfered with these routines. Further, radar rendering and display differences may have caused additional error in interpretation due to colouring or other visual differences. Future integration of CASA data into AWIPS would alleviate these issues and provide additional data sources (e.g. satellites) normally available during operations. This integration would allow for even more realistic test settings and possible reductions in assessment error.
Because of CASA design characteristics, the current study could be enhanced by the systematic control of radar beam attributes in various scenarios. Using a single WSR-88D throughout the experiment made radar coverage below 2 km more representative of nationwide coverage. However, radars in the CASA system differ from WSR-88D in more than just beam height or average resolution. CASA radars also update faster and scan regions with automated changes in azimuth coverage and number of tilts per volume (Philips et al., 2008). Additionally, with four radars to choose from in the current testbed, there is no lack of choice for wind observation angle. Update rate, beam height, wind-to-beam intersection angle and sampling fidelity may each influence forecaster performance in different ways. Future work could quantify the impact of these attributes individually regardless of radar source. To understand the impact on warnings fully, additional measures of performance will need to be collected, including the size of warnings, their duration and effective lead time. Future work will investigate the potential effect of increased spatio-temporal and low level data resolution on these severe weather warning attributes.
This work indicates that the addition of high resolution, low altitude, rapidly updating radar data has both qualitative and quantitative benefits. However, training and policy implications for the incorporation of new technology on warning operations must be carefully considered. In particular, the paired increase of confidence and wind speed estimates, no matter how much more accurate, may require changes to warning policy. The method developed here has been very effective for CASA. The use of part-task simulation paired with process and outcome measures has provided feedback vital to the radar system engineering process.