The Value of CME Arrival Time Forecasts for Space Weather Mitigation

Severe geomagnetic storms are driven by coronal mass ejections (CMEs). Consequently, there has been a great deal of focus on predicting if and when a CME will arrive in near‐Earth space. However, it is useful to step back and ask, “How valuable is this information, in isolation, for making decisions to mitigate against the adverse effects of space weather?” While all severe geomagnetic storms are triggered by CMEs, not all CMEs trigger severe storms. Thus, even perfect knowledge of CME arrival time will provide “actionable” forecast information only in operational situations where false alarms can be tolerated. Of course, any CME transit model used to predict CME arrival time must also produce an estimate of CME speed at Earth. This can help discriminate between geoeffective and nongeoeffective CMEs, reducing false alarms and expanding the range of operational scenarios under which a forecast provides value. Thus, from an end‐user perspective, CME arrival speed should form part of the standard metric by which CME transit models are evaluated. Looking to the future, even coarse information about the CME magnetic properties would likely provide even greater forecast value. These points are illustrated by a simple analysis of solar wind data.


Background
Nearly 30 years ago, "The Solar Flare Myth" (Gosling, 1993) conclusively demonstrated that severe geomagnetic activity results from the passage of coronal mass ejections (CMEs; large episodic eruptions of coronal plasma and magnetic field) through near-Earth space. Consequently, over the intervening decades, there has been a great deal of effort directed toward forecasting if and when CMEs will arrive at Earth. Such forecasts are routinely initiated using observed properties of CMEs close to the Sun, particularly the near-Sun speeds derived from coronagraphs and, more recently, heliospheric imagers. A range of approaches have been used to model the subsequent transit of CMEs from the Sun to Earth, from the assumption of constant interplanetary acceleration (Gopalswamy et al., 2000), to three-dimensional numerical magnetohydrodynamic simulations using an observationally derived, structured solar wind (e.g., Odstrcil et al., 2004). The merit, or otherwise, of such CME transit models is primarily judged on the accuracy of their CME arrival time predictions (e.g., Riley et al., 2018, see also https://swrc.gsfc.nasa.gov/main/cmemodels). Of course, any method of forecasting CME arrival time must also produce some prediction of CME arrival speed in near-Earth space. Such speed estimates can differ substantially across the CME transit models, even when the same transit time is predicted.
In this short commentary we consider the value of knowing different CME properties for space weather forecasting. The points we raise are illustrated by a simple analysis of solar wind data. Specifically, we investigate the forecast value of a (hypothetical) perfect forecast of CME arrival time and contrast that with the value of additionally knowing CME arrival speed and, looking to the future, CME magnetic field.
©2020. The Authors. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Data Analysis
To illustrate the forecast value of different CME properties, we use a simple measure of geomagnetic activity resulting from solar wind driving. This follows the methodology of Owens et al. (2018). In brief, the Vasyliunas et al. (1982) solar wind-magnetosphere "coupling function" is used to estimate the potential geoeffectiveness (G) of the solar wind. It is derived from solar wind speed (V), solar wind density (n), heliospheric magnetic field (HMF) intensity (B), and the HMF orientation, in the form of the HMF nonradial clock angle (θ) in heliographic radial-tangential-normal (RTN) coordinates. Omitting constants, as only relative variations are of interest, gives
$$G = n^{(2/3 - \alpha)} B^{2\alpha} V^{(7/3 - 2\alpha)} \sin^4(\theta/2), \qquad (1)$$
where α is an empirically derived constant, found to be approximately 0.5 (Lockwood et al., 2017). G is preferable to using a geomagnetic index, as it removes the systematic (and deterministically predictable, Lockwood et al., 2016) variations resulting from Earth orbital and rotational motion, as well as effects associated with the limited network of stations and methods used to construct the index (Lockwood et al., 2019). This leaves only intrinsic solar wind variability, which is the forecast goal.
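As a rough illustration of Equation 1, the sketch below evaluates G for a single hourly sample. The function name `geoeffectiveness` and its argument names are illustrative choices, not taken from the packaged analysis code, and the clock angle is assumed to be supplied in degrees.

```python
import math

ALPHA = 0.5  # empirical exponent quoted in the text (Lockwood et al., 2017)


def geoeffectiveness(n, b, v, theta_deg, alpha=ALPHA):
    """Relative geoeffectiveness G from Equation 1, constants omitted.

    n: solar wind density, b: HMF intensity, v: solar wind speed,
    theta_deg: HMF clock angle in degrees (an assumed input convention).
    Only relative variations matter, so any consistent units will do.
    """
    theta = math.radians(theta_deg)
    return (n ** (2.0 / 3.0 - alpha)
            * b ** (2.0 * alpha)
            * v ** (7.0 / 3.0 - 2.0 * alpha)
            * math.sin(theta / 2.0) ** 4)
```

With α = 0.5 the exponents reduce to 1/6 for n, 1 for B, and 4/3 for V, so G scales linearly with B and vanishes when θ = 0.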
Near-Earth solar wind conditions (V, B, n, and θ) are obtained from the OMNI data set of near-Earth in situ spacecraft observations at 1-hr resolution. The period of analysis is constrained to 1 January 1995 to 31 December 2019 inclusive, for which there is near-continuous data coverage.
A catalog of 512 interplanetary CMEs in near-Earth space, identified on the basis of a range of in situ properties (Richardson & Cane, 2010), is used to split the OMNI data set into times of (non-CME) ambient solar wind and times of CME-related solar wind. The ability to make this split is equivalent to having a perfect knowledge of the presence of CMEs in near-Earth space, that is, a perfect CME arrival time (and duration) forecast, with no missed CMEs and no falsely predicted CMEs. However, as shall be seen, a perfect CME arrival forecast is not a perfect G forecast.
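The split described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that hourly samples are labeled by integer hour indices, that each catalog entry is a half-open `(start, end)` pair in the same units, and that catalog intervals do not overlap. The name `flag_cme_hours` is hypothetical.

```python
from bisect import bisect_right


def flag_cme_hours(hour_stamps, cme_intervals):
    """Return one boolean per hour: True if it falls inside a CME interval.

    Assumes half-open, non-overlapping (start, end) intervals.
    """
    intervals = sorted(cme_intervals)
    starts = [start for start, _ in intervals]
    flags = []
    for t in hour_stamps:
        # Index of the last interval that starts at or before this hour.
        i = bisect_right(starts, t) - 1
        flags.append(i >= 0 and intervals[i][0] <= t < intervals[i][1])
    return flags
```

Hours flagged `True` form the CME-related solar wind and the remainder form the ambient solar wind.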
To illustrate the value of this hypothetical forecast, we use a "cost-loss" analysis. A brief description of this methodology is provided below, but more detail can be found in Murphy (1977) and Richardson (2000). A worked example in a space weather context is provided in Owens and Riley (2017). The analysis code used in this commentary has been packaged with all the necessary data and is publicly available for readers to experiment with themselves.
Consider an operational scenario where mitigating action needs to be taken if G exceeds a certain threshold. For convenience, let us consider the 99.5th percentile value, G′. Thus, the climatological probability of exceeding G′ is p = 0.005 (note that as hourly data are used, this is not directly equivalent to selecting severe geomagnetic storms, which require more sustained driving over a longer time period, but it serves to illustrate the point).
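A minimal sketch of how such a threshold and its exceedance probability might be obtained from an hourly G series is given below. The nearest-rank percentile convention and the function names are assumptions made for illustration; the text does not specify which percentile definition was used.

```python
import math


def action_threshold(g_values, percentile=99.5):
    """Nearest-rank percentile of the G series, used here as G'."""
    ordered = sorted(g_values)
    rank = math.ceil(percentile * len(ordered) / 100.0)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def exceedance_probability(g_values, threshold):
    """Fraction of hours with G strictly above the threshold."""
    return sum(g > threshold for g in g_values) / len(g_values)
```

By construction, roughly 0.5% of the hourly values lie above the 99.5th percentile, which is the climatological probability p used in the text.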
In any operational situation there is a cost to taking mitigating action, C, which must be weighed against the potential loss, L, incurred if action is not taken when it is needed (Murphy, 1977). The actual monetary values of C and L are not important; what matters is their ratio, C/L, which ranges between 0 and 1 (it would make no sense for the cost of taking action to exceed the loss it is preventing). For example, when the routine maintenance of power lines is delayed for the purpose of protecting the entire power grid from a forecast space weather event, C will be much smaller than L. When C/L is close to 0, an operator would be willing to utilize a forecast that produces false alarms but not one that results in missed events. Conversely, placing a communications satellite in safe mode may incur lost revenue, C, which is a substantial fraction of L. In higher C/L situations, false alarms are also costly.
The total cost of acting on a forecast, T_F, over some period of time is computed by simply summing the C and L that would be incurred. To put this number in context, T_F is normalized relative to the total cost of using the climatological probability, T_CL, and a perfect forecast, T_P, over the same period (Murphy, 1977). This gives the forecast's "potential economic value," V:
$$V = 100 \times \frac{T_{CL} - T_F}{T_{CL} - T_P}. \qquad (2)$$
V is less than 0 when the forecast results in a total cost higher than simply using the climatological probability and rises toward 100 as the forecast becomes more accurate. A perfect deterministic forecast of G would allow an operator to never incur any loss. Hence, over some period of time, T_P = N′C, where N′ is the number of times that G′ is actually exceeded and action must be taken.

10.1029/2020SW002507
Space Weather
OWENS ET AL.
Alternatively, an operator could make a decision on the basis of the climatological probability. As p is treated as being unchanging with time, the operator just has to make one decision: whether to always take action or never take action. In situations where C/L < p, total cost will be minimized by always taking action (T_CL = NC, where N is the total number of time intervals), whereas if C/L > p, the total cost is lowest by never taking action (T_CL = N′L). The change in behavior at p = C/L is best understood by an analogy; if a gambler is seeking to maximize return from betting on a series of coin tosses (p = 0.5), they will always bet when the stake/prize ratio is less than 0.5 but never when it is greater than 0.5.
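The two reference costs and the resulting value score can be sketched as below. The function names are illustrative, not from the packaged analysis code. V is scaled so that it is 0 when the forecast costs the same as climatology and 100 when it matches the perfect forecast, which is the behavior described in the text.

```python
def climatology_cost(n_total, n_events, cost, loss):
    """T_CL: the cheaper of always acting (N*C) and never acting (N'*L)."""
    always_act = n_total * cost
    never_act = n_events * loss
    return min(always_act, never_act)


def perfect_cost(n_events, cost):
    """T_P: act only in the N' intervals where the threshold is exceeded."""
    return n_events * cost


def potential_economic_value(t_forecast, t_clim, t_perfect):
    """V: 0 at climatology, 100 at a perfect forecast, negative if worse.

    Requires t_clim > t_perfect, which holds for 0 < C/L < 1.
    """
    return 100.0 * (t_clim - t_forecast) / (t_clim - t_perfect)
```

Taking the minimum of the two fixed strategies is equivalent to the rule in the text, since NC < N′L exactly when C/L < N′/N = p.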
For a forecast to provide useful information to an operator, that is, for it to be "actionable," it needs to discriminate between times when action (likely) needs to be taken and times when it (likely) does not. A perfect CME arrival time forecast would allow an operator to perfectly discriminate between times of CME and non-CME solar wind in the future. This is actionable information if p is different in CME and non-CME solar wind intervals. Using the observed G time series and the observed CME list gives the number of CME and (non-CME) solar wind hours to be N_CME = 17,420 and N_SW = 196,334, respectively. Of these, the numbers that exceed G′ are N′_CME = 770 and N′_SW = 299, giving p in CMEs to be 0.0442 but in the (non-CME) solar wind to be 0.0015. As with climatology, where C/L < 0.0015, T_F will be minimized by always taking action, and thus T_F = NC = T_CL, resulting in V = 0. Similarly, for C/L > 0.0442, T_F is minimized by never taking action, and T_F = N′L = T_CL, also resulting in V = 0. But in the range 0.0015 < C/L < 0.0442, the perfect CME arrival forecast produces added value, as action is taken during CME intervals but not during the (non-CME) solar wind, giving T_F = N_CME C + N′_SW L, which is lower than T_CL. This is shown graphically in Figure 1. Given most operational scenarios have L ≫ C, C/L is shown on a logarithmic scale. The added forecast value (relative to climatology) of knowing CME arrival time is limited to low C/L values, where false alarms are not a major concern.
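The arithmetic above can be reproduced with the short sketch below, which uses the hour counts quoted in the text. It assumes that, in the intermediate C/L range, the operator acts during CME intervals only, and it sets L = 1 so that the cost equals C/L. The function name is hypothetical.

```python
N_CME, N_SW = 17_420, 196_334   # hours inside CMEs / in ambient solar wind
NX_CME, NX_SW = 770, 299        # hours in each class with G above G'

p_cme = NX_CME / N_CME          # rounds to 0.0442, as quoted in the text
p_sw = NX_SW / N_SW             # rounds to 0.0015, as quoted in the text


def arrival_time_value(cost_loss):
    """V of a perfect CME arrival-time forecast, with the loss L set to 1."""
    cost, loss = cost_loss, 1.0
    n_total, n_events = N_CME + N_SW, NX_CME + NX_SW
    if cost_loss < p_sw:        # protection is cheap: always act
        t_forecast = n_total * cost
    elif cost_loss > p_cme:     # protection is expensive: never act
        t_forecast = n_events * loss
    else:                       # assumed: act only while a CME is present
        t_forecast = N_CME * cost + NX_SW * loss
    t_clim = min(n_total * cost, n_events * loss)
    t_perfect = n_events * cost
    return 100.0 * (t_clim - t_forecast) / (t_clim - t_perfect)
```

Evaluating `arrival_time_value` over a logarithmic grid of C/L values gives zero outside the two class probabilities and positive values between them.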
The same analysis can also be used to quantify the added value of a perfect near-Earth CME speed forecast. CME arrival speed is defined as the average over the whole CME duration. This is used to split CMEs into quartiles (i.e., four speed categories containing equal numbers of CMEs). This represents having a relatively coarse, but accurate, forecast of CME speed in near-Earth space. For each of the CME speed categories, p is computed. It shows a monotonic increase with CME speed. As with CME/non-CME intervals, each speed category is treated independently to estimate T_F. The result is shown as the blue line in Figure 1. At low C/L, there is little gain over the perfect CME arrival time forecast. However, by extending forecast value to higher C/L values, the range of forecast scenarios for which the forecast is "actionable" is increased.
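Treating each category independently generalizes naturally to any number of categories. The sketch below is one way to write this, under the assumption that within each category the operator adopts whichever fixed strategy (always act or never act) is cheaper. The function name and the `(n_hours, n_exceed)` input layout are illustrative, and the usage lines only reuse the two-class counts quoted earlier, since the per-quartile counts are not given in the text.

```python
def categorical_value(categories, cost_loss):
    """V for a forecast that perfectly assigns each hour to a category.

    categories: list of (n_hours, n_exceed) pairs, one per category.
    cost_loss:  C/L, with the loss L set to 1.
    """
    cost, loss = cost_loss, 1.0
    n_total = sum(n for n, _ in categories)
    n_events = sum(k for _, k in categories)
    # Per category, act throughout if n*C < k*L, otherwise never act.
    t_forecast = sum(min(n * cost, k * loss) for n, k in categories)
    t_clim = min(n_total * cost, n_events * loss)
    t_perfect = n_events * cost
    return 100.0 * (t_clim - t_forecast) / (t_clim - t_perfect)


# CME versus ambient solar wind, using the hour counts quoted in the text.
two_class = [(17_420, 770), (196_334, 299)]
value_at_low_cost_loss = categorical_value(two_class, 0.005)
```

Because a sum of minima can never exceed the minimum of the sums, subdividing a category can only lower T_F or leave it unchanged, so a finer but still perfect categorization never reduces V.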
The same process is repeated for the average CME magnetic field intensity. Note that this again is fairly coarse information. In particular, it does not include any information about the magnetic field orientation, θ, which is the primary space weather concern. The red line in Figure 1, however, shows that it significantly adds to the forecast value, even more so than CME speed. Finally, the magnetic field intensity and speed categories are combined, producing greater forecast value than either speed or magnetic field intensity individually.

Discussion
As the passage of a CME through near-Earth space is associated with increased probability of high geoeffectiveness, knowledge of CME arrival time is expected to be valuable for space weather mitigation, even without further information about the CME properties. This was illustrated with an analysis of solar wind data, assuming a perfect CME arrival time forecast. But CME arrival time only constitutes "actionable" information in operational situations where false alarms can be tolerated (i.e., where the cost of taking mitigating action is very small compared to the loss incurred if a space weather event occurs without taking action). The reason is perhaps rather obvious; while severe geomagnetic activity is associated with CMEs, most CMEs are not associated with severe geomagnetic activity.

Sun-Earth CME transit models are primarily used to predict CME arrival time but must also produce estimates of CME speed in near-Earth space, even if it is only the average transit speed. The forecast value analysis highlights that information about the CME arrival speed, even in a relatively coarse form, increases the range of operational scenarios for which forecasts have value. Thus, to an end user, correctly estimating the CME arrival speed is likely to be as important as the CME arrival time. We therefore suggest that it should be part of the standard metric by which CME transit models are assessed. This is likely to enable discrimination between different CME arrival models, which are largely indistinguishable on CME transit time alone (Riley et al., 2018).
The same data analysis was also used to demonstrate that if information about CME magnetic field can be provided, even in a relatively crude fashion, it is potentially even more valuable than CME speed. Unsurprisingly, combining CME speed and magnetic field information improves the forecast value by the greatest amount. Note that this only uses the magnetic field intensity. A reliable forecast of the magnetic field orientation would dramatically improve the forecast value over a much wider range of cost/loss ratios, which could be investigated with the same analysis approach.
There are, of course, a number of limitations to the data analysis we can present in such a short commentary. First, it is based on a point-by-point comparison of forecasts and observations, which can result in "double penalties" from small timing errors (Owens, 2018). In many operational situations, such small timing errors can be tolerated as long as the forecast correctly discriminates between event/nonevent occurrence over a longer time window, such as the next 24 hr. Repeating the presented analysis for 24-hr windows gives the same qualitative results presented here: CME arrival time is valuable in its own right, but any information about CME arrival speed and magnetic field adds significant value. Second, we have only presented results for individual 1-hr values and consequently have neglected the role of time history in geomagnetic activity, particularly geomagnetic storms. However, we do obtain qualitatively similar results for 1-day values. Third, we have only presented a single threshold for action. The same qualitative results are found when this is varied, though in general CME speed becomes more important as the threshold is lowered. We have made the analysis code and data publicly available so that the interested reader can investigate the effect of time scale and action threshold.
Finally, we note that the analysis could be used for a number of additional purposes. For example, it could be used to determine "how good is good enough" for CME arrival time forecasts in various operational scenarios or the value of forecasting corotating interaction regions. Such assessment is considerably more multifaceted than that presented here and will form the basis of a future study.

Figure 1. Forecast value, V, for different operational scenarios. The x-axis shows the cost, C, of taking mitigating action relative to the loss, L, incurred from a space weather event when action is not taken. Thus, low C/L values represent scenarios where missed events are the primary concern, but false alarms can be tolerated. When only CME arrival time is perfectly known, shown in gray, the forecast only has value at low C/L. When CME speed is also known, shown in blue, the range of C/L for which the forecast is valuable is extended. For CME magnetic field intensity, shown in red, it is extended further. Combining speed and magnetic field information, shown in black, significantly improves the forecast value.