Forecasting GIC Activity Associated With Solar Wind Shocks for the Australian Region Power Network

Space weather induced Geomagnetically Induced Currents (GICs) that are hazardous to power networks at high latitudes are often the result of rapid changes in near‐Earth space current systems such as the auroral electrojets and field‐aligned currents. GIC activity at low‐middle latitudes may be driven by currents that are more relevant to these regions such as the ring or magnetopause current systems. Solar wind shocks create rapid changes in the magnetopause current that manifest as large step changes in the geomagnetic field at the Earth's surface, often referred to as geomagnetic sudden impulses (SIs), that are effective drivers of GIC activity in power networks at low‐middle latitudes. This paper describes the results of a study into the relationship between the driver of SIs, solar wind shocks, and their potential impact on power systems in the Australian region, as quantified by a GIC‐index. The initial data set for analysis was produced by programmatically scanning solar wind data obtained from the ACE satellite spanning the period 1998–2008 for solar wind shocks. For each identified solar wind shock, geomagnetic field data from the Australian region were analyzed to determine the corresponding SI and GIC‐index. Statistical analyses of GIC‐indices and various solar wind parameters associated with the shocks resulted in an empirical model that is a function of solar wind dynamic pressure, geomagnetic latitude, and change in solar wind speed with good operational predictive capability. This model was further assessed using online catalogs of solar wind shock events.

High latitude GICs are typically produced by an increase in the electrojet currents (Boteler et al., 1998;Kappenman, 2004;Pulkkinen et al., 2005) in combination with field-aligned currents (Pulkkinen et al., 2005;Viljanen, 1997). At low-middle latitudes GICs can be caused by geomagnetic disturbances driven by intensification of the ring current (Kappenman, 2004). Impulsive geomagnetic disturbances such as storm sudden commencements (SSCs) and sudden impulses (SIs-typically referring to the scenario of an SSC with or without an subsequent geomagnetic storm), are caused by interplanetary solar wind shocks and are also recognized as a potential driver for intense GICs (Carter et al., 2015;Fiori et al., 2014;Kappenman, 2005;Pulkkinen et al., 2005;Russell et al., 1994;Watari et al., 2009;Zhang et al., 2015). Pulkkinen et al. (2005) noted that sudden changes in the geomagnetic field associated with dynamic pressure pulses in the solar wind are a global threat to technological conductor systems due to their scale size.  noted that a large SSC on 24 March 1991 produced the largest GIC measured in the United States. Pulkkinen et al. (2005) attributed two failures in the Swedish power system during the 2003 Halloween storms with SSC induced GICs. Marshall et al. (2012) highlighted the failure of a transformer in New Zealand coincident with the arrival of a solar wind shock and subsequent SI. However, the authors also suggested that premature aging of the transformer contributed to the failure. Zhang et al. (2015) analyzed GIC activity at two low-latitude 500 kV substations in China during the storm of 17 March 2015 and reported that the GIC due to the SSC was much higher than that during the storm main phase. Such SSC/SIs are observed at Australian region latitudes and this study describes the relationship between solar wind shocks and GIC activity, as quantified by the GIC-index defined by Marshall et al. (2011), over the Australian region to develop a forecast model for these space weather phenomena. It should be noted that only GIC activity related to SIs is considered in this paper and possible relationships between main-phase storm related GIC activity and solar wind parameters is the subject of future studies.

Data Analysis
Although GICs are driven by geoelectric fields, GIC studies often employ other parameters as indicators of space weather conditions conducive to GIC activity such as dB/dt (Boteler et al., 1998;Kappenman, 2001;Trichtchenko & Boteler, 2004;Viljanen, 1997;Viljanen et al., 1999). Pirjola (1982) showed that for the case of a uniform downward propagating plane wave incident on a uniform conducting Earth, the geoelectric field is proportional to both dB/dt and frequency. This scenario gives the equation of magnetotellurics (e.g., Cagniard, 1953;Pirjola, 1982Pirjola, , 2002 which provides a first-order approximation to the geoelectric field: where σ is a scalar conductivity, μ o is the permeability of free space, ω is the angular frequency of the incident wavefield, E y is a function of B x , E x is a function of B y , and x and y are the north and east directions respectively. Equation 1 has been used in a number of GIC studies (Koen & Gaunt, 2003;Liu et al., 2009;Pulkkinen et al., 2005;Torta et al., 2012) and may be simplified to indicating that the geoelectric field has a square-root relationship with frequency to a first approximation. If B(t) and B(ω) represent Fourier Transform pairs then Equation 3 shows the frequency domain representation of dB(t)/dt has a linear dependence on frequency. A study by Marshall et al. (2010) proposed a "GIC-index" that includes the functional dependence on frequency of Equation 2, as a better indicator of GIC activity compared with dB/dt. Analogous to dB/dt data, the GIC-index defined in that study only requires geomagnetic field data and can be applied to any location irrespective of ground conductivity or ionospheric current system geometry as an indicator of space weather conditions conducive to GIC activity. This index has been used to quantify space weather related GIC activity in New Zealand (Marshall et al., 2012), Australia (Marshall et al., 2013), China (Liu et al., 2016;Zhang et al., 2015), and Italy (Tozzi et al., 2019) and was highlighted by Pulkkinen et al. (2017) as the only space weather indicator that has been designed specifically to characterize the level of impact of geomagnetic induction in technological systems such as power grids. Further, the GIC-index has previously been calibrated against anomalies observed within power systems from around the world (Marshall et al., 2011) to derive a probabilistic model and associated relative risk levels to power networks corresponding to GIC-index thresholds. The GIC-index is employed for this study.
GIC-indices are derived by applying a frequency domain filter defined in Marshall et al. (2010) to the frequency spectrum of geomagnetic time series data according to where f is frequency at each bin in the frequency domain and f N is the Nyquist (Brigham, 1974) frequency. This filter function is alluded to in Pulkinnen et al. (2005). Typically, a complete day of 1-min sampled single component (e.g., x-component) geomagnetic field variometer data is processed using Equation 1 to produce the corresponding GIC-indices for each day according to the following. Let x(t) and y(t) represent the time series data for geomagnetic field variations recorded in the geographic north and east directions respectively. If x(t) and X(f), and y(t) and Y(f) represent Fourier Transform pairs then where FFT() −1 represents the inverse Fourier Transform of the quantity inside the parentheses, Z(f) is the filter function of Equation 4, and││refers to the absolute value of the complex quantity returned by the inverse transform. A Hamming window (Press et al., 2007) is applied to the time series data prior to transformation into the frequency domain and the π/4 phase shift in Equation 4 is applied to the complex conjugate of negative frequency values. Analogous relationships for geomagnetic field data recorded in the geomagnetic north and east directions can be obtained using geomagnetic h-and d-component data in Equations 5a and 5b, respectively. Geomagnetic time series data were checked prior to GIC-index calculation to remove possible contamination resulting from data gaps, spikes, saturation or other such spurious time series values.
The relationship between solar wind parameters and GIC-indices associated with solar wind shocks was investigated by first searching the solar wind data and identifying solar wind shocks according to the following criteria. Each of the four solar wind parameters, bulk wind speed V SW , density N, IMF total field amplitude B T , and temperature T, were required to correlate with a step function. For each day of solar wind data, correlation coefficient values were calculated between each of the four solar wind parameters and a sliding step function. Shocks were identified where the correlation coefficient was greater than 0.8 and the point of maximum correlation was coincident across all four solar wind parameters V SW , N, B T , and T. The solar wind data set initially used in this study spanned from 1998 to 2008, the majority of solar cycle 23 (SC23).
The step function was defined as 30 points in length with the 0-24 point interval having a value of 0, and the 25-30 point interval having a value of 1. The ratio of 25 points of value 0 and 5 points of value 1 comprising the step function, as opposed to a 50:50% ratio for example, was used to improve identification of "well defined" shocks following a steady state interval. The step function was applied as a sliding window across solar wind data obtained from the Advanced Composition Explorer (ACE) satellite. The step length of 30 points, a time window of approximately 30 min using ACE 64-s averaged data, was selected to ensure the window time interval was sufficiently greater than the shock rise time and small enough that the method may be implemented in an operational mode to provide timely warnings, requiring minimal time delay to acquire sufficient data. For each solar wind shock identified using the above criteria, 5-point averages of pre-and post-shock values of solar wind bulk speed and ion density, V sw1 , V sw2 , N 1 and N 2 respectively, were used to calculate the dynamic pressure change ΔP SW using the pre-and post-shock dynamic pressure, P SW1 and P SW2 and where K is a scale factor of ∼0.884 due to the influence of the bow shock on the solar wind parameters (Prölss, 2004) and H m is the proton mass. The five-point (∼5 min) pre-and post-shock averages were used to reduce the influence of potentially spurious values in the pre-and post-shock plasma parameters.
The SI associated GIC-index resulting from each identified solar wind shock was determined by first locating the solar wind shock driven SI event in the geomagnetic field data and then assigning the corresponding GIC-index for the identified time. Solar wind shocks produce an increase in the magnetopause current, resulting in an SI ground signature at low-middle latitudes primarily in the geomagnetic north-south (h-component) direction (Araki et al., 1997;Russel et al., 1994 and references therein). The GIC-index probabilistic model reported in Marshall et al. (2011) provides risk levels based on GIC-index data derived from geomagnetic field observations measured in the geographic coordinate system to align with the coordinate system familiar to industry and their infrastructure. This study utilizes geographic north-south (x-component) magnetometer data to leverage the relative risk thresholds derived from this probabilistic model. For the geomagnetic observing stations used in this study (Figure 1), the maximum declination is less than 15° resulting in maximum differences between corresponding GIC index values using x-and h-component data of less than 3.5%. The use of x-component geomagnetic data is therefore considered sufficient over the region of interest, so GIC y -index values (Equation 5b) form the dependent variable. Zhang et al. (2015) similarly used the GIC y -index in their analyses of the solar wind shock of 17 March 2015 and subsequent impact on the power system in China.
In order to identify an SI event, the transit time from shock detection at the ACE satellite location to the magnetopause was calculated to determine a search window of ±60 min within the corresponding geomagnetic field data. Within this search window the geomagnetic field data for each Australian region station ( Figure 1) was correlated with a sliding 30-min step function at 1-min increments, analogous to the method used for determining the time of the solar wind shock. The step function values of 0 and 1 were again in the ratio of 25:5 to reduce correlation with large time varying fields occurring during a geomagnetic storm main phase. The time of maximum correlation, required to be greater than 0.8, was then used as the epoch for the GIC y -index value obtained from the GIC-index time series data for the corresponding UT day calculated using Equation 5. In addition, the SI step amplitude was required to be greater than 10 nT to exclude small amplitude events. The identified GIC y -index value was then analyzed with respect to various solar wind parameters, such as dynamic pressure change, ΔP SW , calculated from the associated solar wind shock. The results of this analysis are presented in the next section. Figure 2 shows a typical solar wind shock signature as seen in the solar wind parameters for the CME of 4 November 2003. The shock was identified using the method described in Section 2 to occur at 0558 UT as indicated by the red vertical dashed line. A total of 90 well defined shocks as specified in Section 2 were detected using the ACE data set over 1998-2008. It should be noted that there are cases when not all four solar wind parameters V SW , N, B T , and T exhibit step changes with the arrival of solar wind shocks. In addition, the criteria for all four parameters to correlate coincidentally with values greater than 0.8 restricted the identification to well defined shocks. Unfortunately, the loss of satellite solar wind data during large space weather events, often due to contamination from high energy particles during solar proton events, resulted in a limited number of data points for solar wind shocks exhibiting relatively large step changes. Figure 3 shows x-component magnetometer data for the same UT day as Figure 2 as detected by the magnetometers located at the sites shown in Figure 1, displayed top-bottom for lowest to highest geomagnetic latitude. The red vertical dashed line indicates the SI times as identified using the method described in Section 2. All SI times in Figure 3 were identified to have the same time of 0626 UT. The corresponding GIC y -indices derived from the magnetometer data in Figure 3 are shown in Figure 4. The SI time is also shown in Figure 4 by the red vertical dashed line. Solar wind shock parameters and associated GIC y -indices such as those identified in Figures 2 and 4, respectively, formed the data set for subsequent analyses. The complete data set of solar wind shocks and GICy-indices identified using the above analysis method is hereafter referred to as the ACE-SC23 data set and is available at https://doi.org/10.5281/zenodo.7091161. Le et al. (1993) reported that the simplest response of the ground magnetic field to a compression of the magnetosphere by solar wind shocks occurs at low and mid latitudes and is an increase in the h-component that is roughly proportional to the change in the square root of dynamic pressure. Figure 5 shows scatter plots of the GIC index versus square root of solar wind dynamic pressure change, √ Δ SW , for all identified shocks and associated GIC y -indices derived from the x-component of each of the magnetometers shown in Figure 1. The best fits from linear regression analyses are also shown as the solid-colored lines. Table 1 lists the relevant parameters from the regression analyses for each location and the associated Pearson correlation coefficients which range from 0.69 to 0.82 across all locations. Forecast models for each location are of the form

Results
where a 0 and a 1 are given in Table 1. Russell et al. (1994) reported that a number of studies have found that the ground magnetic field variation associated with SI's increased with the change in the square root of solar wind dynamic pressure. However, Figure 5 shows that other factors such as latitude also need to be considered. Figure 5 (bottom-right) shows the best fits determined from the regression analyses for all stations, suggesting the coefficients vary with station latitude and a multiple linear regression (MLR) analysis should also consider latitude in the model. The distribution of GIC y -indices incorporating all stations is right-skewed and a log transformation is required to adequately normalize the response variable. This provides a model of the form where λ is the absolute value of geomagnetic latitude. An MLR of all 537 data points of the ACE-SC23 data set shown in Figure 5 was used to provide the coefficients a 0 , a 1 , and a 2 of Equation 8 which are listed in Table 2, along with standard errors, correlation coefficients, t-values and associated probabilities, and 95% confidence intervals (CIs) for each covariate. Russell et al. (1994) and  also considered local time dependence in their analysis of solar wind shock parameters and associated ground magnetic variations. Furthermore,  reported that solar wind shocks with high impact angles were associated with higher solar wind speeds that produced larger dB/dt values, suggesting solar wind velocity should be included in the regression analysis.    reported results from a comprehensive study of the relationships between various solar wind energy coupling functions and magnetic indices including solar wind velocity, density, and magnetic field. Fiori et al. (2014) also considered the influence of solar wind parameters such as velocity, density, magnetic field, and dynamic pressure in their analysis of ground magnetic variations associated with SI events. The MLR analysis of Equation 8 was therefore extended to similarly include additional parameters such as change in solar wind velocity, ΔV sw , density, Δn, total magnetic field, ΔB t , geomagnetic longitude, φ, and Δt, where Δt is defined as the absolute difference between noon and the local time of the associated SI. The results of this analysis are provided in Table 2 along with standard errors, correlation coefficients, t-values and associated probabilities, and 95% CIs. The adjusted-R 2 (R 2 that has been adjusted for the number of predictors in the model) values resulting from the addition of the covariates of successive rows of Table 2 are also shown. The descending order of the variables in Table 2 was produced by testing all different order combinations to obtain the maximum MLR correlation coefficient using the set of remaining variables for each row (the first row used all covariates). The adjusted-R 2 values indicate that no significant additional predictive power of the model occurs for more than the first three covariates of Table 2, √ Δ SW , λ, and ΔV sw , as increases in the adjusted-R 2 values only occur in the third significant figure. This is consistent with the t-values of Table 2 that indicate the t-values are considerably less significant beyond the first three covariates. Furthermore, predicted-R 2 calculated using a leave-one-out cross validation approach, whereby the correlation coefficient is calculated using the residual errors between iteratively omitted data points and the predicted value derived from the remaining points, can be used to assess how well the model performs when given new data, as would occur in an operational environment. Table 2 shows that the trends for predicted-R 2 are comparable to the "in-sample" adjusted-R 2 , calculated using all data points, with no significant additional predictive power after the first three covariates. From the MLR analysis the optimal model is therefore achieved using log (GIC ) = 0.235 + 0.216 √ Δ sw + 0.00458 + 0.00155Δ SW Note. The associated standard errors, t-values and associated probabilities, individual Pearson correlation coefficients, and 95% CIs are shown for each covariate. The MLR Pearson correlation coefficients, adjusted-R 2 , and predicted-R 2 values resulting from the addition of the covariates of successive rows of the table are also shown. Figure 6 provides a scatter plot of observed and predicted GIC-index values obtained using Equation 9 and observed values of ΔP sw , λ, and ΔV sw associated with each observed GIC y -index. The different colors of each point represent the different observation sites. Figure 6 shows a reasonably good linear fit to the data using these three coefficients with a correlation coefficient of 0.825, and F-test value of 377.65, giving a Pr(>F) < 10 −15 .

Discussion
This paper describes the results of a study into the relationship between the space weather driver of SIs, solar wind shocks and a GIC-index, a parameter related to the impact of GICs on power systems. The statistical analysis between GIC-indices and various solar wind parameters associated with the identified shocks provided an empirical forecast model that is a function of solar wind dynamic pressure change, √ Δ SW , geomagnetic latitude, λ, and change in solar wind speed, ΔV sw , with good out-of-sample or operational predictive capability. Other solar wind parameters only marginally increased the predictive power of the model.

The model parameters,
√ Δ SW , λ, and ΔV sw are in general agreement with the results of previous studies that analyzed the relationships between ground magnetic field variations and solar wind shocks. Le et al. (1993) and Russell et al. (1994, and references therein) reported a relationship between ground magnetic field response and change in solar wind dynamic pressure.  reported a dependence of dB/dt on solar wind speed due to shock impact angle. Russell et al. (1994) and  also reported a dependence on latitude and local time, although  suggested the time dependence was most significant for high and equatorial latitudes with the dayside perturbations being only slightly larger than the nightside perturbations for middle and low latitudes. Over the range of low to middle latitude locations used in this study, local time only had a marginal influence on the predictive power of the model.
There are a number of underlying assumptions relevant to interpretation of the statistical data in Table 2 including the model coefficients in Equation 9. The assumptions include aspects that relate to collinearity of the dependent variables and the distribution of the residual errors. For an MLR model with parameters listed in Table 2, the errors should be normally distributed and independent of the model variables. The distribution of residual errors, calculated from the difference between the modeled values obtained from Equation 9 and the observed data, is approximately normally distributed and has a flat response when plotted as a function of the observed values. Furthermore, for the standard errors and coefficients of Table 2 to be reliable the covariates should be linearly independent. A Principal Component Analysis (PCA) was used to orthogonalize the independent variable data. The linear model obtained from a Principal Component Regression (PCR) produced predicted GIC-index values almost identical to those obtained using the original MLR, indicating the set of covariates are sufficiently independent and the standard errors and coefficient values in Table 2 are reliable. The 95% CIs for the last four covariates of Table 2 span zero indicating a weak dependence on these covariates, further supporting the optimal model is obtained using only the first three covariates given in Equation 9.
One of the aims of this study was to develop an empirical model that could provide warnings of relative risk levels of solar wind shocks to power network operations over the Australian region.  used a dB/dt threshold of 100 nT/min reported by Fiori et al. (2014) in their analysis to identify solar wind shock parameters that result in geomagnetic variations considered as "high-risk" to the power industry. Marshall et al. (2012) developed a probabilistic model as a function of GIC-index from a large database of reported power system anomalies from around the world to estimate the relative risk of fault occurrence within power systems due to varying levels of geomagnetic activity. This model was used to define relative risk thresholds of low, moderate, high, and extreme defined for GIC y -indices of 50, 100, 250, and 600 respectively. The model developed in this study can be used in conjunction with these thresholds and automatically detected solar wind shock parameters to produce warnings of the relative risk to the power industry with lead times of tens of minutes. and observed values of ΔP sw , λ, and ΔV sw associated with each observed GIC y -index. The different colors correspond to the colors used in Figure 5 for the different observation sites.
One of the products produced by the Australian Bureau of Meteorology's Space Weather Services is the Solar Wind Shock Alert system (http://www. sws.bom.gov.au/Products_and_Services/4/1). This uses the solar wind shock detection method outlined in Section 2 and can be used in combination with Equations 6 and 9 and the above risk thresholds to provide short lead time warnings of potentially hazardous GIC activity for power systems. For example, the parameters of solar wind shocks detected by this system associated with two of the largest storms of solar cycle 24, 22-23 June 2015 and 7-8 September 2017, are given in Tables 3 and 4, respectively, and serve as out-of-sample events not included in the model training data. Using the values in Tables 3 and 4, and Equations 6 and 9 for an out-of-sample location not included in the set of magnetometer stations of Figure 1, Narrabri in New South Wales (−30.28°, 149.58° GG, −37.1°, 227.0° GM), gives GIC y -index values of 36.3 and 13.1, respectively. The observed GIC y -index due to these solar wind shocks for this location were 41.2 and 17.4. The parameters of Table 3 identify the shock that occurred at 1759 UT on 22 June 2015. The automated system issued an alert at 1807 UT due to the inherent delay in acquiring post-shock data as part of the detection algorithm. The associated GIC y -index of 41.2 was observed at 1835 UT, a lead time of nearly 18 min.
The parameters of Table 4 identify a shock that occurred at 2230 UT on 7 September 2017 and the automated alert was issued at 2247 UT. The associated GIC y -index of 17.4 was observed at 2302 UT, a lead time of 15 min. These examples demonstrate the potential practical application of the empirical model of Equation 9 to provide forecasts of relative risk to power networks in the Australian region posed by incoming solar wind shocks with lead times sufficient to raise situational awareness and invoke mitigating actions.
As mentioned in Section 3, the high degree of correlation required between the four solar wind plasma parameters for the shock detection method used in this study resulted in some potential shock events being discarded. Online catalogs of solar wind shocks such as the Harvard-Smithsonian Center for Astrophysics IP shock list for Wind satellite data (http://www.cfa.harvard.edu/shocks/wi_data/) and ACE data (http://www.cfa.harvard.edu/shocks/ ac_master_data/), hereafter referred to as the CFA data set, have been used in previous solar wind shock studies (e.g., Oliveira & Raeder, 2015;Rudd et al., 2019;Xu et al., 2020) and were used here to validate the methods and results of this study. These catalogs contain the plasma and shock parameters obtained by solving the Rankine-Hugoniot conditions, which are based on the conservation of energy and momentum across the shock surface (further details of these calculations can be found under the catalogs and in Oliveira & Samsonov, 2018). Using these catalogs, 461 fast forward shocks were identified during the 11-year period of 1997-2008 spanning solar cycle 23. Of the 90 shock events of the ACE-SC23 data set identified using the method outlined in Section 2 of this study, 71 were within 2 min of the times given in the catalogs.
To further assess the results obtained using the ACE-SC23 data set of this study, analogous data sets were obtained using the solar wind plasma data from the ACE and WIND satellites for the 461 shock events from the catalogs. Solar wind time series data for these events were manually checked along with the associated geomagnetic field data and derived GIC-index data to ensure the GIC-index was associated with a clearly defined step change in the geomagnetic field, reducing the number of suitable shock events to 244 and a potential 1764 corresponding GIC-index data points across the seven stations used for this study. Geomagnetic field data were unavailable for some events, further reducing the number of data point pairs. The data was additionally checked for outliers using Q-Q plots (quantile-quantile scatterplots), resulting in 13 data points being removed and the final number of data point pairs being 1489 (hereafter referred to as the CFA-SC23 data set). This data set is available at https://doi.org/10.5281/zenodo.7091198.
The results of an MLR analysis on this data set are provided in Table 5. Similar to the results of Section 3, the increases in the correlation coefficient shown in Table 5 with the addition of each covariate are for the covariate that produces the maximum MLR correlation from the remaining successive sets of covariates. Table 5 shows the MLR correlation coefficient, the adjusted-R 2 , and the predicted-R 2 values all have minor increases in these statistical metrics with the addition of covariates after the third or fourth  covariate. In addition, the 95% CIs for the coefficients of the last three covariates of Table 5 span the value zero, suggesting there is a weak dependence on these covariates. Hence the optimal MLR model obtained using the CFA-SC23 data set uses the same covariates of Equation 9, that is, √ Δ SW , λ, and ΔV sw , with the addition of the change in solar wind density, Δn. Table 5 shows the values of the coefficients for these covariates are in good agreement with those of Table 2 within the 95% CIs, except for the coefficient of Δn.
The model of Equation 9 was used to predict GIC-index values for out-of-sample events such as those from solar cycle 24 discussed previously. The CFA data set spans solar cycle 24 (SC24) and may be used to further assess the model given by Equation 9. From the CFA data set, 199 fast-forward shock events were identified for the 11-year period spanning solar cycle 24, 2009-2020. Analogous data quality control and checking procedures to those used for the CFA-SC23 events were followed for the SC24 events, providing 122 solar wind shocks for analysis and 624 data point pairs (hereafter referred to as the CFA-SC24 data set). This data set is also available at https://doi.org/10.5281/zenodo.7091198. The summary results of an analogous MLR analysis to those conducted on previous data sets using the CFA-SC24 data set are shown in Table 6. The statistical metrics of Table 6 are analogous to those in Tables 2 and 5, providing an MLR model for the CFA-SC24 data set that uses the same covariates of Equation 9, that is, √ Δ SW , λ, and ΔV sw , with the addition of the change in solar wind total magnetic field strength, ΔB t . Table 6 shows the values of the coefficients for these covariates are in good agreement with those of Tables 2 and 5 within the 95% CIs, except for the coefficient of ΔB t which is only in marginal agreement.
The general agreement between the coefficients of the covariates of Equation 9 obtained using events of SC23 and SC24 suggests the CFA-SC23 and CFA-SC24 data sets may be combined. The summary results of an analogous MLR to those conducted previously using all CFA-SC23 and CFA-SC24 events are shown in Table 7. The statistical metrics of Table 7 are analogous to those in Tables 2, 5, and 6, providing an MLR model for this data set that uses the same covariates of Equation 9, that is, √ Δ SW , λ, and ΔV sw , with the addition Δn and ΔB t . Table 7 shows the values of the coefficients for these covariates are in good agreement with those of Tables 2, 5 and 6 within the 95% CIs, particularly for the covariates  The analysis of the CFA data sets presented above provides an independent validation of the model presented in Section 3 and indicates the covariates of Equation 9 are relatively consistent across solar cycles and different data sets, suggesting Equation 9 provides a valid model for predicting GIC-index resulting from solar wind shocks. Equation 9 may be updated with the coefficients in Table 7 to provide a model derived from the larger CFA data set that spans two solar cycles and is given by log (GIC ) = 0.0453 + 0.265 √ Δ sw + 0.00339 + 0.00192Δ SW (10) Figure 7 is analogous to Figure 6 and provides a scatter plot of observed and predicted GIC-index values obtained using Equation 10 and observed values of ΔP sw , λ, and ΔV sw associated with each observed GIC y -index. Data points corresponding to the solar wind shock event of 22 June 2015 have been omitted from Figure 7 for presentation purposes, although the data points were included in the MLR analyses. For this event, the predicted GIC-index values range between approximately 100 to 120 across the geomagnetic field stations for which data was available. Comparison of the solar wind shock parameters for this event given in Table 3 with corresponding values for the same event from the CFA-SC24 data set revealed the density enhancement for the CFA-SC24 data set event was approximately double that of the value shown in Table 3. The distribution of all ΔP sw values obtained from the plasma parameters of the combined CFA-SC23 and CFA-SC24 data sets suggests the CFA-SC24 value for the 22 June 2015 event is anomalous and hence the values for this event are omitted from Figure 7.

Conclusion
This paper describes the results of a study into the relationship between the space weather driver of SIs, solar wind shocks, and their potential impact on power systems in the Australian region, as quantified by a GIC-index. The initial data set was obtained by programmatically scanning solar wind data from the ACE satellite spanning the period 1998-2008 for solar wind shocks. For each identified solar wind shock, geomagnetic field data from the Australian region were analyzed to determine the corresponding SI and GIC-indices. Statistical analyses of the GIC-indices and various solar wind parameters associated with the shocks resulted in an empirical model that is a function of solar wind dynamic pressure, geomagnetic latitude, and change in solar wind speed with good out-of-sample predictive capability. This model was further assessed using online catalogs of solar wind shock events over two solar cycles. The empirical models have been developed using data from the Australian region and are relevant for the latitudes of the magnetometer stations used. The model may be used to automatically generate warnings of short duration GIC activity with lead times of 15-60 min for situational awareness of power grid operators in the Australian region. Analysis of GIC activity during the main phase of geomagnetic storms and possible relationships with solar wind parameters is the anticipated subject of future studies.   Figure 7. Scatter plot of observed and predicted Geomagnetically Induced Current (GIC y )-index values analogous to Figure 6. The predicted GIC y -index values were obtained using Equation 10 and observed values of ΔP sw , λ, and ΔV sw from the CFA-23 and CFA-24 data sets associated with each observed GIC y -index. The different colors correspond to the colors used in Figure 5 for the different observation sites.