The impact of the El Niño phenomenon on electricity prices in hydrologic‐based production systems: A switching regime semi‐nonparametric approach

Electricity production in highly hydrological‐dependent systems is determined by different weather phenomena, which strongly impact spot prices. To account for such stylized facts, we propose a stochastic process with a mean reversion and switching regime component to represent the dynamics of the spot price. The short‐term movements are represented by semi‐nonparametric (SNP) distributions, in contrast to previous studies that traditionally assume Gaussian processes. We consider the Colombian electricity market as a case study, in which 68% of electrical generation comes from water resources, and the El Niño phenomenon represents a critical source of risk for maintaining long‐term supply, sustainability of investments, and efficiency of prices. We show that during scarcity seasons, the mean, variance, and some higher‐order moments of the electricity spot price distribution increase, as does the risk level of the system. In particular, the switching regime model with SNP distributions for the random components outperforms traditional models, leading to accurate estimates and simulations, and is thus a helpful tool for resource planning, risk management, and policy‐making in electricity markets with high climatic dependency.

An alternative to the rigidity of the normality assumption, a rigidity documented for financial returns since the seminal work of Mandelbrot, 26 is the application of semi-nonparametric (SNP) probability distributions. Brunner 27 shows how SNP statistical techniques are suitable for treating series when normal distributions do not adequately represent the data under study; they also avoid specification errors, since the Gram-Charlier (Type A) series has been proved to be the asymptotic distribution of any "regular" probability density function (pdf). Gallant et al. 28 and Jondeau and Rockinger 29 used SNP modeling to describe the United States stock market behavior. Mauleon and Perote 30 also used the SNP distribution to model the stock market in the United States and the United Kingdom, while Ñíguez and Perote 31 did so to evaluate the stock performance of the United States. This SNP approach has been applied to modeling many other series in recent years, for example, by Del Brio et al. 32-34 and Cortés et al. 35,36 A comparison with other parametric or nonparametric approaches is left for further research (see, e.g., Velásquez-Gaviria et al. 37 ). As far as we know, and despite its clear advantages over Gaussian assumptions, the SNP approach has only been used to model electricity markets by Trespalacios et al. 38,39

This paper proposes a spot price model that considers seasonality, mean reversion, asymmetry, kurtosis, and other distribution moments. Although several papers model electricity spot prices with regime jumps (see, for example, Huisman and Mahieu, 40 Weron et al., 41 or Haldrup and Nielsen 42 ), none of them is built upon the SNP approach. Our research combines both methodologies by considering a stochastic, mean-reverting SNP process for the short-term imbalances and a regime switch to measure the impact of the El Niño phenomenon on Colombian electricity prices, where 68% of the electricity comes from water resources.
The results show that the SNP representation of the random components of the stochastic process outperforms the normal distribution and that the forecast of dramatic changes in weather conditions may help risk management and decision-making in electricity systems in typical weather conditions and those associated with substantial reductions of the hydrologic inflows.
The following section describes the mathematical model proposed in this work, as well as the methodology that was applied. Section 3 describes the data used to calibrate the model and some basic concepts of electricity spot price formation in single-node systems. Section 4 presents the results and the potential applications of the model discussed. Finally, Section 5 summarizes the main conclusions and recommends future works that market professionals or researchers who study related fields could apply.

| Spot price
We assume that $P_t$ is the log-spot price, which may be written as in Equation (1):

$$P_t = F(t) + X_t + J_t. \tag{1}$$

This stochastic process has three components: a deterministic one, $F(t)$, and two stochastic elements, $X_t$ and $J_t$. $X_t$ denotes a mean-reverting process, and $J_t$ captures the effect of switching regimes.
$F(t)$ incorporates three elements, as in Equation (2), where a dummy variable $G_i$ is assigned to each period $i = 1, \ldots, m$, which may represent a group of months, and $\beta_i$ captures its effect. Once the electricity price moves away from such a central mean, the probability that it will return to the starting point is higher than that of moving further away. 43

$X_t$, described in Equation (3), is a mean-reverting process based on an Ornstein-Uhlenbeck process with two modifications in its random component. First, the magnitude of the random effect depends on the state of the variable $h_t$, which is binary and can only take the values 1 and 0. Second, the stochastic differential $dz$ is not normally distributed, unlike the differential of a Brownian motion. This work considers that $dz$ is described by an SNP pdf; see Section 2.3 for further details. The parameter $\kappa$ in Equation (3) is known as the mean reversion speed.

$J_t$ denotes the effect of the regime switch $h_t$ on the price; that is, if $h_t = 0$, then there is no additional effect on the spot price, and if $h_t = 1$, then the expected price increases by $D$. The short-term random switches are governed by the random variable $j_t$, described by the SNP pdf. The SNP distribution and some of its main characteristics are described in Section 2.3.
Electricity price jumps produce significant increases in skewness and kurtosis, with heavy tails in the price series. Therefore, Poisson jump processes, regime switches, and other models are used to represent the stochastic nature of this phenomenon. 6,9,43,44 The exogenous variable $h_t$ governs the regime switch. The specification of this term follows the Markov switching model described by Tsay, 45 and the approach to predicting electricity price jumps of Mount et al. 23 The binary variable $h_t$ has transition probabilities $w_0$ and $w_1$, which depend on the cumulative duration of each regime, as shown in Equations (5) and (6), where $\tau_0$ and $\tau_1$ are the numbers of periods during which $h_t$ has remained unchanged in states 0 and 1, respectively. The parameters of these equations determine the probability of transition from state 1 to 0 (from state 0 to 1). This parametrization of the switching regime allows the modeling of events with both long and short durations.
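To make the interplay between mean reversion and the duration-dependent regime switch concrete, the following Python sketch simulates a log-price path on a unit (e.g., monthly) time grid. It is only an illustration under stated assumptions, not the authors' implementation: the additive combination of the components, the Euler discretization, the logistic form of the switching probabilities, the rule that the mean-reverting shock is active only when $h_t = 0$, and all names (`logistic`, `simulate_log_price`, `a0`, `b0`, `a1`, `b1`, `sigma`, `sigma_j`, `trend`) are hypothetical. Gaussian draws stand in for the SNP shocks.

```python
import math
import random


def logistic(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_log_price(n_steps, kappa, sigma, D, sigma_j,
                       a0, b0, a1, b1, trend=lambda t: 0.0, seed=0):
    """Simulate a log-price path with mean reversion and a binary regime h_t.

    Assumptions (not taken from the paper): unit time step, logistic
    switching probabilities in the regime duration tau, Gaussian shocks.
    """
    rng = random.Random(seed)
    x, h, tau = 0.0, 0, 1          # mean-reverting state, regime, regime duration
    prices, regimes = [], []
    for t in range(n_steps):
        # Probability of leaving the current regime depends on its duration.
        if h == 0:
            w = logistic(a0 + b0 * tau)
        else:
            w = logistic(a1 + b1 * tau)
        if rng.random() < w:
            h, tau = 1 - h, 1      # switch regime and restart the duration count
        else:
            tau += 1
        # Euler step of the mean-reverting component; shock active when h == 0.
        x += -kappa * x + sigma * (1 - h) * rng.gauss(0.0, 1.0)
        # Regime effect: average shift D plus a regime-specific shock.
        jump = h * (D + sigma_j * rng.gauss(0.0, 1.0))
        prices.append(trend(t) + x + jump)
        regimes.append(h)
    return prices, regimes
```

With hypothetical values such as `simulate_log_price(288, 0.3, 0.2, 0.8, 0.3, -3.0, 0.05, -2.5, 0.1)`, the function returns 288 simulated log-prices together with the regime indicator; replacing `rng.gauss` with an SNP sampler would bring the sketch closer to the model described above.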

| Estimation
Equation (7) rewrites the model for the log-spot price ($P_t$) by grouping the different stochastic terms into a single component. It also expands the deterministic function $F(t)$ and treats $h_t$ as a dummy (categorical) variable whose average effect is captured by the constant $D$.
The estimation is performed in stages: first, the parameters ($\beta$) of the deterministic variables are estimated; second, the stochastic autoregressive component represented by $Y_t$. The residuals of this autoregressive process are governed by two different random variables, $\epsilon_t$ and $j_t$, which are assumed to be independent and identically distributed SNP random variables and which affect the price according to the value of $h_t$.
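The parameters of the SNP densities of $\epsilon_t$ and $j_t$ (namely, $d_n^{\epsilon}$ and $d_n^{j}$) are estimated by maximum likelihood (ML) algorithms, recursively selecting the appropriate truncation order and the significant parameters according to linear restriction tests. For instance, the log-likelihood function for a single variable $\epsilon_t$ observed over $T$ periods can be expressed as

$$l(d) = -\frac{T}{2}\ln(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\epsilon_t^{2} + \sum_{t=1}^{T}\ln\left[1 + \sum_{s=1}^{n} d_s H_s(\epsilon_t)\right], \tag{8}$$

where $H_s(\epsilon_t)$ is the so-called Hermite polynomial (HP) of order $s$, described in Equation (10).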
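As a minimal sketch of how this log-likelihood could be evaluated, the Python code below computes it for a sample of standardized residuals, assuming the truncated Gram-Charlier density $[1 + \sum_{s=1}^{n} d_s H_s(\epsilon)]\phi(\epsilon)$. The function names (`hermite`, `snp_loglik`) and the convention of returning minus infinity when the expansion is not positive are assumptions made for illustration; they are not part of the original estimation procedure.

```python
import math


def hermite(s, y):
    """Probabilists' Hermite polynomial H_s(y) via H_{k+1} = y*H_k - k*H_{k-1}."""
    if s == 0:
        return 1.0
    h_prev, h = 1.0, y
    for k in range(1, s):
        h_prev, h = h, y * h - k * h_prev
    return h


def snp_loglik(eps, d):
    """Log-likelihood of residuals `eps` under a truncated SNP density.

    d[s-1] holds the weight d_s of the Hermite polynomial of order s.
    Returns -inf if the expansion is not positive at some observation
    (an illustrative convention, useful inside a numerical optimizer).
    """
    total = 0.0
    for e in eps:
        expansion = 1.0 + sum(ds * hermite(s, e) for s, ds in enumerate(d, start=1))
        if expansion <= 0.0:
            return float("-inf")
        total += -0.5 * math.log(2.0 * math.pi) - 0.5 * e * e + math.log(expansion)
    return total
```

Passing an empty list for `d` recovers the Gaussian log-likelihood, which is the nested benchmark against which the SNP specification is compared.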
One obstacle to ML implementation is that estimation algorithms might converge to estimates that do not guarantee positivity of the density over its full domain. This may be solved by implementing recursive estimation algorithms based on moment conditions that lead the ML procedure to a global optimum (Mauleón and Perote 30 ), by constraining the parameter domain, 29 or by implementing positive transformations. 46 We opt for the first of these alternatives, which may be particularly useful for large expansions, which enlarge the positivity regions. 47 Furthermore, as the normal is nested in the SNP, the best specification can be directly tested with the likelihood ratio (LR) statistic in Equation (9), which compares the log-likelihood under the normal ($l_{normal}$) and the SNP ($l_{SNP}$):

$$LR = 2\,(l_{SNP} - l_{normal}). \tag{9}$$

Under the null hypothesis of normality, the statistic follows a $\chi^2$ distribution with degrees of freedom equal to the number of parameters of the SNP expansion ($n$).
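The likelihood ratio comparison can be sketched in a few lines of Python. The code below assumes the usual form of the statistic, twice the difference between the SNP and normal log-likelihoods, and computes the $\chi^2$ upper-tail probability with a series expansion of the regularized incomplete gamma function so that no external library is needed. The names `chi2_sf` and `lr_test` are hypothetical, and the series loses relative accuracy for extremely small p-values, which is harmless when the only question is whether the p-value falls below a conventional threshold.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with `df` degrees of freedom.

    Uses P(a, z) = z^a e^{-z} / Gamma(a+1) * sum_k z^k / ((a+1)...(a+k)),
    with a = df/2 and z = x/2, and returns 1 - P(a, z).
    """
    if x <= 0.0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    p_lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total
    return max(0.0, 1.0 - p_lower)


def lr_test(loglik_normal, loglik_snp, n_params):
    """Return the LR statistic and its p-value under the null of normality."""
    lr = 2.0 * (loglik_snp - loglik_normal)
    return lr, chi2_sf(lr, n_params)
```

A small p-value rejects normality in favor of the SNP expansion with `n_params` additional parameters.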
For the case of Colombian spot prices, the process $h_t$ depends on the occurrence of the El Niño phenomenon, which triggers the price increases. For this purpose, we consider the information from the National Weather Service of the National Oceanic and Atmospheric Administration on the cold and warm episodes by season, and the warm periods are coded as $h_t = 1$. To estimate the parameters of $w_0$ and $w_1$, we consider a Logit model, whose fit is verified through the Hosmer-Lemeshow test. Finally, the parameter $D$ is the coefficient of the El Niño indicator in a linear regression, while the components $j_t$ and $\epsilon_t$ affect price formation depending on El Niño episodes.

| SNP distribution
The most important contribution of this work is incorporating SNP functions into the components for price uncertainty and the two proposed regimes. A standardized variable $y$ exhibits an SNP distribution if its pdf is given by Equation (10); that is, it can be expressed in terms of the pdf of the standard normal, $\phi(y)$, and a weighted sum of its derivatives or HPs, $H_s(y)$:

$$f(y) = \left[1 + \sum_{s=1}^{\infty} \delta_s H_s(y)\right]\phi(y). \tag{10}$$

FIGURE 1 Cumulative probability function for SNP and standard normal distributions. SNP, semi-nonparametric.
The HPs form an orthogonal basis that guarantees that the expansion in Equation (10) is a pdf whose first $s$ moments depend on the first $s$ parameters $\delta_s$. These parameters may be expressed in terms of the moments of the distribution, since orthogonality implies $\delta_s = E[H_s(y)]/s!$. The HPs can be recursively obtained, being $H_1(y) = y$, $H_2(y) = y^2 - 1$, $H_3(y) = y^3 - 3y$, $H_4(y) = y^4 - 6y^2 + 3$, $H_5(y) = y^5 - 10y^3 + 15y$, and so on. For empirical purposes, the expansion must be truncated at a finite degree $n$ as in Equation (11), with parameter vector $d = (d_1, \ldots, d_n)$:

$$f(y; d) = \left[1 + \sum_{s=1}^{n} d_s H_s(y)\right]\phi(y). \tag{11}$$

Given the orthogonality of HPs, the expression in Equation (11) is a density, and its parameters capture the distribution moments; for example, $d_1$ and $d_2$ account for mean and variance, respectively, and $d_3$ and $d_4$ for skewness and excess kurtosis (provided that $d_1 = d_2 = 0$). The SNP may also be characterized by the cumulative distribution function (cdf), which can be expressed in terms of the standard normal cdf, $\Phi(y)$ (see, e.g., Cortés et al. 35 ):

$$F(y; d) = \Phi(y) - \phi(y)\sum_{s=1}^{n} d_s H_{s-1}(y). \tag{12}$$

It must be noted that the SNP approach to non-Gaussian processes has several advantages compared to other parametric alternatives. The main one is that it is a natural extension of Gaussianity that asymptotically captures the true data-generating process. Therefore, estimation algorithms may endogenously select the number of parameters necessary to reach a given degree of accuracy. This is particularly important for distributions with a substantial number of extreme events, which require high-order parameters/moments to be considered. Most other alternatives used for capturing skewness and leptokurtosis imply a fixed parameter structure that either cannot capture the far end of the tails (in fact, they usually involve smoothly decreasing patterns) or yields parameter estimates that might be misleading or biased; see Mauleón and Perote 30 for a discussion on the potential underestimation of the degrees-of-freedom parameter of a Student's t distribution for financial risk measurement.
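The density and cdf of the truncated SNP distribution can be sketched numerically as follows. The code assumes the truncated Gram-Charlier forms $f(y) = [1 + \sum_{s=1}^{n} d_s H_s(y)]\phi(y)$ and $F(y) = \Phi(y) - \phi(y)\sum_{s=1}^{n} d_s H_{s-1}(y)$, the latter following from the identity $\frac{d}{dy}[\phi(y) H_{s-1}(y)] = -\phi(y) H_s(y)$. The function names are hypothetical, and the parameter values used to exercise them are arbitrary illustrations, not estimates from the paper.

```python
import math


def hermite(s, y):
    """Probabilists' Hermite polynomial H_s(y) via H_{k+1} = y*H_k - k*H_{k-1}."""
    if s == 0:
        return 1.0
    h_prev, h = 1.0, y
    for k in range(1, s):
        h_prev, h = h, y * h - k * h_prev
    return h


def norm_pdf(y):
    """Standard normal density."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def norm_cdf(y):
    """Standard normal cdf, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-y / math.sqrt(2.0))


def snp_pdf(y, d):
    """Truncated SNP density; d[s-1] is the weight d_s of H_s."""
    expansion = 1.0 + sum(ds * hermite(s, y) for s, ds in enumerate(d, start=1))
    return expansion * norm_pdf(y)


def snp_cdf(y, d):
    """Truncated SNP cdf: normal cdf minus a Hermite-weighted correction."""
    correction = sum(ds * hermite(s - 1, y) for s, ds in enumerate(d, start=1))
    return norm_cdf(y) - norm_pdf(y) * correction
```

Evaluating `snp_cdf` on a grid for a single nonzero `d_s` (for instance, only the third-order weight) reproduces the kind of sensitivity analysis described for Figure 1, and a grid check of `snp_pdf` is a simple way to detect parameter values for which the density becomes negative.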
The marginal effect of each Hermite polynomial (up to sixth order) on the cumulative standard normal distribution is illustrated in Figure 1, which also shows the sensitivity to different values of $d_s$. There, the second plot in the first column represents the function in Equation (12) with $s = 3$; that is, the effect of incorporating only the third-order HP for different values of $d_3$ (−0.4, −0.2, 0.2, and 0.4). The first point illustrated in Figure 1 is that nonpositivity of the density may produce inconsistencies in the cdf. When the normal distribution is compared to the SNP distribution, it can be seen that even-order polynomials increase the slope of the cumulative distribution. The odd components modify the skew of the probability density function, shifting the cumulative function to the right (left) if the value of $d_s$ is positive (negative); this effect is more pronounced in the first-order polynomial.
We observe that the higher the polynomial order, the greater the magnitude of its effect for the same values of $d_s$. The sixth-order HP has a higher impact on the shape of the cdf than its second-order counterpart. As the order of the polynomial increases, so does the marginal contribution of the coefficient $d_s$; for example, when the value of $d_5$ changes from 0.2 to 0.4, there is a more significant impact on the cdf than when $d_1$ changes from 0.2 to 0.4. These conditions on the marginal effects, due to both the order of the polynomial $s$ and the value of its parameter $d_s$, help the analyst identify the relevant components of the SNP distribution when fitting a particular data set.

| ELECTRICITY MARKET AND DESCRIPTION OF THE DATA
By 2018, the Colombian electricity market had a supply of 17,328 MW to serve a national load of 8000 MW, with hydraulic generation representing 68% of the market share. This study considers monthly information on the price in the Colombian electricity market from its beginnings in 1995 until December 2018. 1 In Table 1, Panel A presents the descriptive statistics of the average monthly market price (spot price $P_t$) of electricity in Colombia and its transformations: first difference ($\Delta P_t$), log-price ($\ln P_t$), and logarithmic return ($\Delta \ln P_t$). Likewise, we considered different panels: Panel A analyzes the statistics for the complete series of prices.

FIGURE 3 Autocorrelation of electricity log-spot price in Colombia and regime switches ($h_t$). ACF, autocorrelation function; PACF, partial autocorrelation function.

1 Information from Colombian electricity operator, XM.

Table 1 lists the autocorrelation function for the proposed price transformations. Notably, the autocorrelation levels of the differenced series are lower, suggesting that the log-price series may contain autoregressive components. Similar observations can be made about the autocorrelation function (ACF) and the partial autocorrelation function (PACF) in Figure 3.
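The comparison between the autocorrelation of a series in levels and in first differences can be reproduced with a short Python sketch. The functions below compute the sample autocorrelation function and the first difference of a series; the names `sample_acf` and `first_difference` are hypothetical, and no price data from the paper are reproduced here, so the input is left to the reader.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / var
        for k in range(max_lag + 1)
    ]


def first_difference(series):
    """First difference of a series: x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]
```

Applying `sample_acf` to a log-price series and to `first_difference` of that series shows whether persistence in levels largely disappears after differencing, which is the pattern described above.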
The variable that governs the regime switch, $h_t$, is constructed from the periods in which the electricity price in Colombia has been affected by the occurrence of the El Niño phenomenon, as measured by the Oceanic Niño Index (ONI) series, available on the website of the US National Oceanic and Atmospheric Administration (NOAA). Among the moments when $h_t = 1$, the greatest impact in terms of mean and variance occurred during 2015-2016. In that period, a strong El Niño occurred, and this event coincided with a time when the thermal generation park was intensively substituting natural gas (with a relatively low variable production cost) with liquid fuels, oil, or diesel (with a high variable production cost). The monthly series of electricity log prices in Colombia (see Figure 2) presents an upward trend since the start of the market. The Q-Q plot highlights the deviation of the sample from the normal distribution, mainly at the tails, which supports modeling the random components with SNP distributions.

| Model performance
decrease from 80% in typical conditions to 50% during the warming phase. A probabilistic forecast can be produced based on an a priori classification of the atmospheric conditions. 49 Figure 4 presents $\tau_0$, $\tau_1$, and the cdf for $w_1$ and $w_0$. There is a probability of 51% that the regime will change from $h_t = 1$

Table 2 presents the estimated parameters for the log-spot price. 2 The deterministic component shows that the effects of the trend and seasonality are significant at a confidence level above 95%.
During the El Niño regime, standard deviation, skewness, and kurtosis increase along with the increase of uncertainty and risk. Table 3 displays the results for estimating both random components for the logarithm of the electricity spot price. Panel A lists the descriptive statistics for $\epsilon_t$ and $j_t$. Panel B presents the $d_s$ components fitted for the SNP distribution, and Panel C compares empirical percentiles (Obs.) with fitted percentiles under both SNP and normal distributions. It is noteworthy that all SNP parameters are highly significant.
Furthermore, the performance of the SNP specification relative to the Gaussian is tested with the LR statistic in Equation (9), with a value of 56.6 (p-value lower than 0.001) for $\epsilon_t$ and 9.3 (p-value 0.002) for $j_t$, revealing that the SNP outperforms the Gaussian for both processes. This is also illustrated in Figures 5 and 6, which represent the fitted densities under both SNP and normal specifications compared to the data histograms. The figures show that the SNP distribution (dashed-dotted line) achieves a better description of the data density (shaded area) than the normal distribution (solid line).

| Model performance during the El Niño phenomenon
A major advantage of our model is the joint consideration of both non-Gaussian and switching-regime features. This subsection sheds light on the relative importance of both effects, particularly during El Niño occurrences. For this purpose, we simulate 100 trajectories of the log-spot price from 1995 to 2018 considering three different approaches: (i) SNP-switching regime model, (ii) Gaussian-switching regime model, and (iii) SNP model without switching regime. Table 4 reports some historical and simulated statistics for the three models, including the most representative percentiles and the mean squared error (MSE). For comparison, the sample was stratified: Panel A presents the results for the full sample, whilst Panel B records just the results during the El Niño periods.

FIGURE 4 Regime switch in Colombia $h_t$.

2 The modeling of the spot log-prices with an SNP distribution is equivalent to using the log-SNP proposed by Ñíguez et al. 50

TRESPALACIOS ET AL.

TABLE 2 Parameters of the log-spot price model.

Overall, the 95% confidence intervals show an adequate model performance (i.e., historical moments are within the reported confidence intervals), except for the skewness and kurtosis of the Gaussian models and the extreme percentiles of the models that do not consider a switching regime (when El Niño is active). Therefore, ignoring the regime change leads to underestimating the upper price percentiles, affecting the estimation of admissible risks. For the full sample, the SNP seems to be the best for representing the mean, skewness, and kurtosis, although the switching regime should also be considered to better capture the variance. Focusing on the El Niño periods, the combined SNP-switching regime model outperforms the other two, even for the 2015-2016 period, which had the strongest impact. This feature is confirmed by the MSE accuracy criterion for both the El Niño periods and the full sample (note that the smaller the values, the better the performance).
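The kind of summary reported for the simulated trajectories can be sketched in Python as follows. The code computes a statistic for each simulated path, a central interval across paths, and a mean squared error against historical values. How the paper defines its intervals and MSE is not specified in this section, so the linear-interpolation percentile, the equal-tailed interval, and the function names (`percentile`, `simulation_band`, `mse`) are assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Percentile q (0-100) of `values` using linear interpolation."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def simulation_band(stat_per_path, level=95.0):
    """Equal-tailed interval of a statistic computed on each simulated path."""
    tail = (100.0 - level) / 2.0
    return percentile(stat_per_path, tail), percentile(stat_per_path, 100.0 - tail)


def mse(historical, simulated):
    """Mean squared error between historical and simulated values."""
    return sum((h - s) ** 2 for h, s in zip(historical, simulated)) / len(historical)
```

For example, computing `percentile(path, 95)` for each of the 100 simulated paths and passing the results to `simulation_band` yields an interval that can be compared with the historical 95th percentile, which is the type of check described above.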
Figure 7 shows the performance of the three simulated models compared to the historical data. A first observation is that both the Gaussian and SNP models with regime switch (Figure 7A,B) manage to predict the occurrence of El Niño, although they seem to overpredict it when its intensity is moderate. However, in 2015-2016, when El Niño had its strongest impact, the switching regime exhibited a remarkable performance. This suggests that the model can be used as an early warning method for extreme events. On the other hand, the SNP alone (Figure 7C) fails to detect the El Niño events, but at least it seems to accommodate the regime change through higher volatility. Of course, this is just an illustration based on a single simulation of the series.

| Applications and limitations
In this section, we present some ideas about how this modeling of electricity prices can be incorporated into other models to aid decision-making or system planning and the limitations of this perspective that academics or professionals should be aware of during eventual implementations.
An electricity system aims to satisfy all users' electricity needs with proper security, quality, and prices. From the investors' point of view, all these goals must be reached under adequate investor return periods, where market risk management and financial hedging products are essential. Our spot price model accounts for extreme price uncertainties linked to unpredictable events and others that might be somehow anticipated since they depend on recurrent events related to climate phenomena. As in a purely financial risk model, most uncertainties can be accommodated by a flexible SNP density specification, but those produced by recurrent events require implementing some switching regime model. The effectiveness of the applications of the SNP model to asset pricing and hedging has already been shown by Trespalacios et al. 39 However, incorporating switching regimes would also add value to electricity financial derivatives, particularly in highly hydrological-dependent energy systems governed by climate phenomena. Therefore, our modeling perspective has direct applications to improving risk measurement, hedging strategies, and the valuation of electricity financial derivatives. Of course, this combined SNP-switching regime model is particularly useful for energy markets influenced by strong climate phenomena. However, in the absence of price shifts, the SNP is flexible enough to represent electricity returns. In any case, this is only an improvement; under the low carbon transition, it should be complemented with many other risk management practices. 51

On the other hand, this model also benefits electricity generation and planning. Management of spot price risks and electricity generation are interrelated decisions to satisfy the demand and provide efficient spot (and forward) pricing. The research of Trespalacios et al. 39 represents the first step in jointly modeling both variables. However, regime change forecasting is crucial not only for guaranteeing energy provision but also for its long-term sustainability. Indeed, our modeling approach permits electricity planners to complement their analyses and anticipate relevant implications for future market behavior. For example, for the Colombian electricity market, it must be considered that an increasing hydraulic market share could benefit users in terms of price but also make the country more dependent on a nondiversifiable (climate) risk.

TABLE 4 Historical and simulated moments for spot log-prices.

P
1      63    55    14   125    54    15   134    30     7    77
5      65    78    23   173    71    20   168    40    10    98
25     78   131    37   319   124    36   297    65    18   148
50    150   191    49   466   184    54   452    90    25   203
75    196   269    67   657   284    72   725   123    35   269
95    808   527   122  1522   490   123  1267   216    66   457
99   1035   807   158  2749   676   160  1682   334   102   784

Abbreviation: P, percentiles.
Finally, one issue that might condition the results over different time horizons is the frequency of the observations. Our study uses monthly data, which still exhibit non-Gaussian features. However, electricity prices fluctuate on a daily and hourly basis. These higher frequencies are even more appropriate for SNP modeling since they tend to exacerbate non-Gaussian features. However, with high-frequency data, the switching regime linked to El Niño is captured too slowly, and thus applications of the SNP-switching regime model to daily/hourly data should be more useful for modeling other types of uncertainty components governed by changing regimes. An example might be measuring the impact on electricity market prices of the different waves of the COVID-19 pandemic. Of course, implementing this model at daily or hourly resolution requires additional considerations, for example, the eventual presence of nonlinear effects, as in Matias and Reboredo. 52

| CONCLUSIONS
We propose a stochastic process with mean reversion and regime switches to represent the spot price of electricity and its logarithm, with good performance for a highly hydro-dependent economy. We include three components: a deterministic one, a mean-reverting one, and a regime-switching one. The short-term distortions of the mean reversion and regime-switching components are represented by SNP distributions with a probability density function defined by a finite Gram-Charlier expansion, in contrast to previous studies that assume a Gaussian process. Therefore, the most significant contribution of this paper is modeling the switching regime in terms of mean, standard deviation, skewness, kurtosis, and higher-order moments.
For the Colombian electricity market, the SNP assumption combined with a regime switch outperformed the normal distribution and was able to forecast weather shifts. Planning, regulation, and control agencies, as well as agents who buy and sell electricity, can use this model to improve the way they measure risks, manage their portfolios, define the expansion of the system, and try to anticipate eventual energy crises. The activities related to electrical system planning should consider the effect of extreme events in terms of probability, impact, and length. If the regime switch (which, in Colombia, is produced by the occurrence of the El Niño phenomenon in its different categories) further affects the levels of expectation and uncertainty, the agents involved in short- and long-term price formation should incorporate the assumed risks in the price that users pay. The government and energy policy-makers should identify the key conditions and activities necessary to accomplish the sustainable development goals; this market-based approach contributes as a measuring tool. The expansion of systems should ensure not only the supply and long-term prices in accordance with income levels or the needs of users but also the (systemic and idiosyncratic) risk levels that involve hidden or revealed costs that jeopardize the conditions of return on the invested capital or the objectives of development and social welfare.