Assessing skewness in financial markets

It is a matter of common observation that investors value substantial gains but are averse to heavy losses. Obvious as it may sound, this translates into an interesting preference for right‐skewed return distributions, whose right tails are heavier than their left tails. Skewness is thus not only a way to describe the shape of a distribution, but also a tool for risk measurement. We review the statistical literature on skewness and provide a comprehensive framework for its assessment. Then, we present a new measure of skewness, based on the decomposition of variance in its upward and downward components. We argue that this measure fills a gap in the literature and show in a simulation study that it strikes a good balance between robustness and sensitivity.


INTRODUCTION
The financial literature offers a wealth of papers dealing with the construction and implementation of risk measures to allow investors to make informed trading decisions. In light of the recent financial crises, these measures have become increasingly important in order to prevent tail risk events.
One of the most important risk measures is the VIX index adopted by the Chicago Board Options Exchange (CBOE). This is a forward-looking measure of the volatility that investors expect in the coming month (Whaley, 2009). Volatility indexes are deemed by market operators to capture market fear: high index values are associated with high levels of uncertainty in the underlying market, whereas low index values are associated with stable conditions; see Whaley (2000) and Muzzioli (2013b). Traditionally, financial returns are assumed to follow a normal distribution. In this connection, volatility is a good measure of risk based on the idea that investors are averse to uncertainty. However, a significant number of studies highlight the fact that financial returns are not normally distributed; see for example, Fama (1965), Peiró (1999), Chen, So, and Gerlach (2005), Lempérière et al. (2017), and Elyasiani, Gambarelli, and Muzzioli (2018). Specifically, financial returns are found to present an empirical distribution with heavy tails and a negative skew. In other words, the occurrence of extreme and negative events is more probable than in the normal distribution. This creates a need to include higher-order moments as indicators of market risk, as illustrated by the CBOE SKEW index. The CBOE SKEW index has been listed on the CBOE since February 2011 to measure the tail risk not fully captured by the VIX index. While VIX measures the overall risk in the 30-day S&P 500 log-returns without disentangling the probabilities attached to positive and negative returns, the skewness index (CBOE SKEW) is intended to measure the perceived tail risk, that is, the probability that investors attach to extreme negative returns. The CBOE SKEW index relies on Pearson's (third order) moment coefficient of skewness. It is well known that Pearson's moment coefficient of skewness is not a robust measure of skewness.
In the statistical literature there are several cases in which Pearson's moment coefficient of skewness leads to controversial conclusions. From a financial point of view, this could have serious consequences. Indeed, the role of the CBOE SKEW index as an indicator of market fear has been questioned since it frequently moves in the same direction as returns (Elyasiani, Gambarelli, & Muzzioli, 2020).
There is an extensive statistical literature on skewness, which we review in this paper in order to outline a comprehensive framework for its assessment. We find that only the classical measures of skewness rely on higher-order moments, while a number of more recent measures do not. We also propose a new measure of skewness: the risk asymmetry index (RAX) based on the same measure introduced by Elyasiani et al. (2018). The authors derive the RAX index by estimating upside, downside and total volatility of returns using option prices in a model-free setting (implied volatility). On the other hand, in this paper we derive the RAX index by computing upside, downside, and total volatility of returns from the physical distribution. By proposing the RAX index and reviewing existing measures of skewness, we seek to contribute to the investigation of skewness measures from a statistical perspective, arguing that the notion of skewness can play an important role in risk measurement.
The choice of the RAX index is based on its ability to detect not only risk, that is, the volatility of returns, but also asymmetry, that is, the different volatility of positive and negative returns, as empirically found in Elyasiani et al. (2018). In particular, we highlight two main points. First, it has been found that the RAX index subsumes all the information of the skewness index (ITSKEW) and volatility index (ITVIX) for the Italian market. Indeed, when these three indexes are included in the same model, both the ITSKEW and the ITVIX fail to have significant explanatory power for future returns. In this connection, the RAX is useful for investors, who can exploit its information in order to make profitable trades. Second, Elyasiani et al. (2018) found that the RAX is able to indicate future fear or greed, since extremely high or low levels of the index are related to positive or negative future returns. While the volatility index ITVIX provides useful information on future returns only in the high volatility period, the RAX index provides useful information about future returns during the entire sample period. The RAX index gives a clear and unambiguous signal to investors, as extremely low (high) values of the risk-asymmetry index signal a buy (sell) opportunity.
The paper proceeds as follows. In Section 2 we outline our framework for the study of univariate skewness, with a review of the tools available for both qualitative and quantitative assessments, identifying the statistical properties that a valid measure of skewness should satisfy. In Section 3 we propose the new measure of skewness, focussing on its statistical properties and interpretation. Section 4 contains an analysis of the robustness to outliers and sensitivity to changes in the shape of the distribution of the RAX and the most relevant measures of skewness. Section 5 concludes the paper with a brief discussion.

SYMMETRY AND ASYMMETRY
Skewness is defined as a relaxation of symmetry to allow for asymmetry in a specific direction. We therefore start our investigation by introducing the notion of symmetry, which is uncontroversial, at least in the univariate case. A univariate random variable $X$ is symmetric with respect to the real value $m$ when $X - m$ and $m - X$ have the same distribution; see for instance Doksum (1975). This property of the distribution of $X$ can be written as $P\{X \le m - t\} = P\{X \ge m + t\}$ for all $t > 0$, which amounts to saying that all corresponding left and right tails of $X$ with respect to $m$ have the same weight. Letting $t \to 0$, it is straightforward to see that $m$ has to be the median of $X$, uniquely defined as the midpoint of the interval formed by all $\tilde{m}$ such that $P\{X \le \tilde{m}\} \ge 1/2$ and $P\{X \ge \tilde{m}\} \ge 1/2$. It is also evident that symmetry is invariant with respect to affine transformations: if $X$ is symmetric and $Y = \alpha + \beta X$, then $Y$ is also symmetric.
Further considerations will be eased by the introduction of a suitable distributional framework. Let $F$ be the distribution function of $X$. We define the support interval of $F$ as the open interval $]a, b[$ with left endpoint $a = \inf\{x \in \mathbb{R} \mid F(x) > 0\}$ and right endpoint $b = \sup\{x \in \mathbb{R} \mid F(x) < 1\}$. Note that it is possible to have $a = -\infty$ or $b = +\infty$. We assume that $F$ is continuous on the real line and strictly increasing on its support interval. More specifically, we assume that $F$ is obtained from a probability density function $f$ that is continuous on $]a, b[$ and such that $f(x) > 0$ for all $x \in\, ]a, b[$. Let $\mathcal{F}_0$ be the class of all such distributions. Piecewise continuous density functions could be allowed to enlarge $\mathcal{F}_0$, but here we favor simplicity over generality. Nonetheless, we point out that such an enlargement would provide scope for probability density histograms (representing an important class of data-based distributions).
Interesting subclasses of $\mathcal{F}_0$ are obtained by assuming that $X$ has finite moments up to some order; let $\mathcal{F}_k = \{F \in \mathcal{F}_0 \mid E|X|^k < \infty\}$, $k = 1, 2, \ldots$, be such classes. If $F \in \mathcal{F}_1$, we denote by $\mu = E(X)$ the mean of $X$. If $F \in \mathcal{F}_2$, we denote by $\sigma^2 = E(X - \mu)^2$ the variance of $X$ (with $\sigma$ denoting the standard deviation of $X$); note that $\sigma^2 > 0$ because $F$ is continuous.
Assuming $F \in \mathcal{F}_0$, the symmetry condition can be written as
$$F(m - t) = 1 - F(m + t), \quad t > 0, \tag{1}$$
where $m = F^{-1}(1/2)$ with $F^{-1}$ uniquely defined on the open interval $]0, 1[$ as the inverse function of $F$. Note that $F^{-1}$, called the quantile function of $X$, is also continuous and strictly increasing.
Since $f$ is the derivative of $F$, condition (1) can be rewritten as $f(m - t) = f(m + t)$ for all $t > 0$, so that $\mu = m$, for $F \in \mathcal{F}_1$, and $\mu$ can replace $m$ in (1). A special case of interest is given by unimodal distributions. Following Dharmadhikari and Joag-Dev (1988), p. 2, we say that $X$ is unimodal at $x^\star \in\, ]a, b[$ if $F$ is convex on $]a, x^\star[$ and concave on $]x^\star, b[$, which corresponds to $f$ increasing on $]a, x^\star[$ and decreasing on $]x^\star, b[$. We say that $X$ is unimodal (tout court) if it is unimodal at some $x^\star$. In this case, the mode of $X$ (denoted by $M$) can be uniquely defined as the midpoint of the interval formed by all $x^\star$ such that $X$ is unimodal at $x^\star$; if $X$ is also symmetric, then $M = m$ and $M$ can replace $m$ in (1). Writing $\mathcal{F}^\star_k$ for the class of unimodal distributions in $\mathcal{F}_k$, we have $M = m = \mu$ for all symmetric $X$ with $F \in \mathcal{F}^\star_1$, and the three classical measures of central tendency coincide.
If X is not symmetric, we say that X is asymmetric. While all symmetric distributions are alike in symmetry, each asymmetric distribution is asymmetric in its own way. Skewness relaxes symmetry to allow for a specific type of asymmetry: a random variable is left-skewed when its left tails are heavier than its right tails, whereas it is right-skewed when its right tails are heavier than its left tails. A strictly skewed variable is a skewed variable that is not symmetric and, as such, it represents a way of being asymmetric. We formalize these notions in the following: Section 2.1 provides an assessment of when a random variable is manifestly skewed; Section 2.2 provides an assessment of how much of a skew a given random variable exhibits, even though such a skew may not be manifest.

Qualitative assessment of skewness
We start with a simplifying remark: since the left tails of a random variable $X$ are the right tails of the opposite random variable $-X$, and vice versa, it will be sufficient to assess when $X$ is right-skewed; $X$ will be left-skewed when $-X$ is right-skewed. We proceed with a discussion of when a random variable $Y$ is more right-skewed than another random variable $X$, which is both an interesting problem in itself and one whose solution will enable us to consider $X$ as right-skewed when it is more right-skewed than $-X$, as recommended by MacGillivray (1986). Following van Zwet (1964), we compare $X$ with $Y$ by means of the function $R(x) = G^{-1}(F(x))$, $a < x < b$, where $G$ is the distribution function of $Y$. We call $R = G^{-1} \circ F$ the quantile-quantile function of $Y$ against $X$, because it is the function whose graph is represented in the Q-Q plot with $X$ on the horizontal axis and $Y$ on the vertical axis. We say that $X$ is less right-skewed than $Y$, or $Y$ is more right-skewed than $X$, and write $F \precsim G$, or $G \succsim F$, if $R$ is convex, which indicates that the left tails of $X$ are progressively heavier than the left tails of $Y$ and the right tails of $Y$ are progressively heavier than the right tails of $X$. Note that $R$ is strictly increasing (as well as continuous) on $]a, b[$. The quantile-quantile function of $X$ against $Y$ is given by $R^{-1} = F^{-1} \circ G$. An interesting characterization of the above described comparison is that $F \precsim G$ if and only if $Y$ is equal in distribution to a strictly increasing convex transformation of $X$: on the one hand, the variable $R(X)$ has the same distribution as $Y$; on the other hand, if $Y = \varphi(X)$ with $\varphi$ strictly increasing and convex, then $R = \varphi$. If both $F \precsim G$ and $F \succsim G$, we write $F \sim G$. Clearly, this is the case when $R$ is a positive affine function, that is, $Y = \alpha + \beta X$ with $\beta > 0$. This makes relative skewness a property of location-scale models rather than individual distributions. The distribution function of $-X$ is $\bar{F}(x) = 1 - F(-x)$, $x \in \mathbb{R}$.
Since $F \precsim G$ if and only if $\bar{F} \succsim \bar{G}$, we find that $X$ is less right-skewed than $Y$ if and only if $-X$ is more right-skewed than $-Y$; this means that we can safely interpret $F \precsim G$ as $X$ being more left-skewed than $Y$, or $Y$ being less left-skewed than $X$.
It is a simple matter to check that $\precsim$ is reflexive and transitive: $F \precsim F$, and $F \precsim G$ together with $G \precsim H$ implies $F \precsim H$, where $H$ denotes the distribution function of a third variable $Z$. This means that the relationship $\precsim$ is a preorder on $\mathcal{F}_0$, which justifies its common name of convex ordering of distributions and qualifies $\sim$ as the equivalence relationship defined by $\precsim$. Several other orderings of distributions have been proposed in the skewness literature (Arnold & Groeneveld, 1993; MacGillivray, 1986; Oja, 1981) and it turns out that the convex ordering is the strongest one. This is because it only considers the convexity of the quantile-quantile function, without reference to any measure of central tendency, and thus it only signals the most manifest cases of relative skewness.
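As a small numerical illustration of the convex ordering (not part of the original analysis), take $X$ uniform on $]0, 1[$ and $Y$ exponential with unit rate, so that $R(x) = G^{-1}(F(x)) = -\log(1 - x)$. The sketch below checks convexity through second differences on an equally spaced grid; the grid, the tolerance, and the function names are illustrative assumptions.

```python
import math

def qq_uniform_to_exponential(x: float) -> float:
    """R = G^{-1} o F with F(x) = x on ]0, 1[ and G^{-1}(p) = -log(1 - p)."""
    return -math.log(1.0 - x)

def is_convex_on_grid(func, grid, tol: float = 1e-12) -> bool:
    """Check that all second differences on an equally spaced grid are >= 0."""
    vals = [func(x) for x in grid]
    return all(
        vals[i - 1] - 2.0 * vals[i] + vals[i + 1] >= -tol
        for i in range(1, len(vals) - 1)
    )

# Interior grid of the support interval of X (assumed resolution).
grid = [k / 100 for k in range(1, 100)]
print(is_convex_on_grid(qq_uniform_to_exponential, grid))  # R is convex here
```

A convex $R$ on the grid is consistent with the exponential distribution being more right-skewed than the uniform one; a grid check of this kind can only suggest convexity, it does not prove it.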
As anticipated, we say that $X$ is right-skewed when $-X \precsim X$ (and left-skewed when $X \precsim -X$); note that here, for the sake of expressiveness, we apply $\precsim$ to $-X$ and $X$ rather than to $\bar{F}$ and $F$. Since the convex ordering of distributions actually compares location-scale models, we can compare $m - X$ with $X - m$ instead of $-X$ with $X$. This leads us to focus on the quantile-quantile function $R$ of $X - m$ against $m - X$ and its slope $r$, which are given by
$$R(x) = F^{-1}(1 - F(m - x)) - m, \qquad r(x) = \frac{f(m - x)}{f(m + R(x))}, \qquad m - b < x < m - a. \tag{2}$$
Since $R(0) = 0$ and $r(0) = 1$, it follows from the convexity of $R$ that $x \le R(x)$ for all $x$ in (2), and therefore
$$F(m - x) + F(m + x) \le 1, \quad x > 0, \tag{3}$$
which is known as van Zwet's condition (Abadir, 2005) after van Zwet (1979) introduced it to prove the celebrated mode-median-mean inequality (usually considered a sign of right skewness). When (3) holds, the right tails of $X$ with respect to $m$ are uniformly heavier than the corresponding left tails. This notion of skew to the right (Doksum, 1975) is easy to interpret, but it is relative to a specific measure of central tendency. If the mean or mode of $X$ is also available, different tails can be compared and, in general, they will lead to different assessments of skewness: see Sato (1997) for an illustration of this phenomenon.
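The tail comparison $P\{X \le m - t\} \le P\{X \ge m + t\}$, $t > 0$, can be checked numerically for a given distribution function. The following sketch does so for the unit-rate exponential distribution (an illustrative choice, not taken from the text) and for its mirror image; the grid and the helper names are assumptions.

```python
import math

def exp_cdf(x: float) -> float:
    """Distribution function of the unit-rate exponential distribution."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def mirrored_cdf(x: float) -> float:
    """Distribution function of -X when X is unit-rate exponential."""
    return 1.0 - exp_cdf(-x)

def right_tails_dominate(cdf, median: float, ts, tol: float = 1e-12) -> bool:
    """Check F(m - t) + F(m + t) <= 1 for every t in the grid ts."""
    return all(cdf(median - t) + cdf(median + t) <= 1.0 + tol for t in ts)

median = math.log(2.0)                      # median of the unit exponential
ts = [0.01 * k for k in range(1, 1001)]     # grid of t values in ]0, 10]
print(right_tails_dominate(exp_cdf, median, ts))        # exponential
print(right_tails_dominate(mirrored_cdf, -median, ts))  # mirror image
```

For the exponential case the left-hand side equals $2 - \cosh t$ when $t < m$, so the condition holds on the whole grid, while it fails for the mirrored distribution.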
We conclude our discussion of qualitative skewness by noting that a random variable $X$ is symmetric if and only if it is both right-skewed and left-skewed ($-X \sim X$). A notion of strict skewness can then be obtained by excluding symmetric distributions ($-X \prec X$ or $X \prec -X$). In this way, we are able to partition $\mathcal{F}_0$ into four groups of distributions: symmetric distributions, strictly left-skewed distributions, strictly right-skewed distributions, and other asymmetric distributions. As demonstrated in the next subsection, the distributions in the last group can be adjudicated as cases of negative (left) or positive (right) skewness through the choice of a suitable measure of skewness.

Quantitative assessment of skewness
Here we consider the more ambitious problem of measuring the extent to which a given distribution is skewed. This means associating to every distribution of interest a real number, whose sign captures the direction of skewness and whose absolute value is greater when skewness is more pronounced. Formally, within our distributional framework, for distributions with finite moments up to order $k \in \{0, 1, 2, \ldots\}$, we aim to specify a functional $\mathrm{Sk} : \mathcal{F}_k \to \mathbb{R}$, which will be called a measure of skewness of order $k$ and will be required to satisfy the following properties:
(P1) $\mathrm{Sk}(\bar{F}) = -\mathrm{Sk}(F)$ for all $F \in \mathcal{F}_k$;
(P2) $F \precsim G$ implies $\mathrm{Sk}(F) \le \mathrm{Sk}(G)$ for all $F, G \in \mathcal{F}_k$.
If we are only interested in unimodal distributions, we can replace $\mathcal{F}_k$ with $\mathcal{F}^\star_k$ and specify a unimodal measure of skewness of the same order. We will say that $\mathrm{Sk}$ is valid to stress that (P1) and (P2) hold.
The meaning of (P1) and (P2) is that we intend our quantitative assessment to respect our qualitative assessment. Indeed, it follows from (P1) and (P2) that, if $Y = \alpha + \beta X$ with $\beta \ne 0$, then $\mathrm{Sk}(G) = \mathrm{sign}(\beta)\, \mathrm{Sk}(F)$; that is, we require any valid measure of skewness to be location-scale invariant. Hence, to all intents and purposes, we are making the same assumptions as Groeneveld and Meeden (1984), rooted in the foundational work of Oja (1981) and consistently used in later work (Arnold & Groeneveld, 1995; Groeneveld, 1991a, 1991b; Groeneveld & Meeden, 2009; Tajuddin, 1996, 1999). If we define positive skewness as $\mathrm{Sk}(F) \ge 0$ and negative skewness as $\mathrm{Sk}(F) \le 0$, we are able to partition $\mathcal{F}_k$ (or $\mathcal{F}^\star_k$) into three groups of distributions: distributions with strictly positive skew, distributions with strictly negative skew, and distributions with null skew. Properties (P1) and (P2) ensure that such a partition does not contradict the partition obtained in the previous subsection in terms of left and right skewness. We will call two measures of skewness equivalent when they give rise to the same partition.
In the following, we review the literature on measuring skewness with (P1) and (P2) in mind. We also take into account the values taken by $\mathrm{Sk}$. Let $\underline{s} = \inf_F \mathrm{Sk}(F)$ and $\overline{s} = \sup_F \mathrm{Sk}(F)$ be its extrema. Note that $\underline{s} = -\overline{s}$ by (P1). If $\mathrm{Sk}$ is a valid measure of skewness and $\psi$ is an odd and (strictly) increasing real function defined on the image of $\mathrm{Sk}$, then $\psi \circ \mathrm{Sk}$ is another (equivalent) valid measure of skewness. This means that any measure of skewness can be transformed so as to have $-1$ and $+1$ as extrema, if $\overline{s}$ is known. However, this is not necessarily the case and, even if it is, there is an interest in understanding whether the extrema can be attained or not and which distributions come close to them. Finally, we pay attention to the estimation of $\mathrm{Sk}(F)$ when a random sample $X_1, \ldots, X_n$ from $F$ is available (beyond the general but noisy strategy of computing $\mathrm{Sk}$ on a density estimate belonging to its domain of definition).

Higher-order measures
The study of skewness was pioneered by Pearson (1895, 1901, 1916). In fact, the most classical measure of skewness goes under the name of Pearson's moment coefficient of skewness:
$$\gamma_j = \frac{E(X - \mu)^{2j+1}}{\sigma^{2j+1}}, \tag{4}$$
where typically $j = 1$, but possibly $j = 2, 3, \ldots$; since $\gamma_j$ is well-defined when the distribution of $X$ has finite moments up to order $2j + 1$, the choice $j = 1$ is the least demanding in terms of distributional assumptions. The rationale behind (4) is to use higher-order moments to gauge the extent to which the right tails of $X$ are heavier than its left tails. This strategy yields a valid $(2j + 1)$th-order measure of skewness: it is apparent that (4) satisfies (P1) and it was shown by van Zwet (1964) that (4) satisfies (P2); see also Oja (1981) and MacGillivray (1986).
The third-order measure of skewness $\gamma_1$ (standardized third central moment of $X$) is also called the Fisher-Pearson coefficient of skewness (Doane & Seward, 2011); see Arnold and Groeneveld (1995) for its historical attribution and Holgersson (2010) for the link to Fisher (1929). As an aside, note that Karl Pearson was not interested in the sign of skewness and used $\beta_1 = \gamma_1^2$ in place of $\gamma_1$. If $X$ follows a Pareto distribution with unit scale and large enough shape, that is, $f(x) = \theta / x^{\theta + 1}$ for $x > 1$ and $f(x) = 0$ for $x \le 1$, with $\theta > 3$, then $\gamma_1$ is arbitrarily large for $\theta$ close enough to 3 and we conclude that $\overline{s} = +\infty$ (no finite upper bound on $\gamma_1$); see Groeneveld and Meeden (1984) for details.
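The unboundedness of the moment coefficient for the Pareto distribution can be made concrete with the standard closed-form expression for its skewness, $2(1 + \theta)/(\theta - 3) \cdot \sqrt{(\theta - 2)/\theta}$ for shape $\theta > 3$. This expression is not given in the text and is quoted here as a well-known result; the function name and the chosen shape values are illustrative assumptions.

```python
import math

def pareto_skewness(shape: float) -> float:
    """Moment coefficient of skewness of a unit-scale Pareto law (shape > 3)."""
    if shape <= 3.0:
        raise ValueError("the third moment is finite only for shape > 3")
    return 2.0 * (1.0 + shape) / (shape - 3.0) * math.sqrt((shape - 2.0) / shape)

# The coefficient grows without bound as the shape decreases towards 3.
for shape in (10.0, 4.0, 3.1, 3.001):
    print(shape, round(pareto_skewness(shape), 2))
```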
A natural estimator of $\gamma_1$ is the sample moment coefficient of skewness $\hat{\gamma}_1$, defined as the ratio of the sample centered third moment $\langle (X - \hat{\mu})^3 \rangle$ to $\hat{\sigma}^3$, where $\hat{\mu} = \langle X \rangle$ is the sample mean and $\hat{\sigma} = \langle (X - \hat{\mu})^2 \rangle^{1/2}$ is the sample sd. The value $\hat{\gamma}_1$ obtained in this way can be adjusted for sample size, but we are not interested in such an adjustment here; see Doane and Seward (2011) for information and references on this topic. Egon Sharpe Pearson, together with H. O. Hartley, provided tables to use $\hat{\gamma}_1$ as a test for departure from normality (Doane & Seward, 2011); see also Holgersson (2010) on testing asymmetry. Finally, the sharp algebraic bound $|\hat{\gamma}_1| \le (n - 2)/(n - 1)^{1/2}$ holds for all samples of size $n$ (Cox, 2010; Kirby, 1974; Wilkins, 1944) even though we have seen that $\gamma_1$ can take arbitrarily large values.
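A minimal sketch of the sample moment coefficient of skewness, using divisor $n$ for both the centered third moment and the sample variance, together with a check of the algebraic bound $(n - 2)/(n - 1)^{1/2}$. The function name, the seed, and the simulated exponential sample are illustrative assumptions.

```python
import math
import random

def sample_moment_skewness(xs) -> float:
    """Ratio of the sample centered third moment to the cubed sample sd (divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def skewness_bound(n: int) -> float:
    """Sharp algebraic bound on the absolute sample skewness for sample size n."""
    return (n - 2) / math.sqrt(n - 1)

random.seed(0)  # arbitrary seed, for reproducibility only
sample = [random.expovariate(1.0) for _ in range(500)]
print(sample_moment_skewness(sample), skewness_bound(len(sample)))

# A sample with a single deviating observation attains the bound.
extreme = [0.0] * 9 + [1.0]
print(sample_moment_skewness(extreme), skewness_bound(len(extreme)))
```

The second sample shows why the bound is sharp: with $n - 1$ equal observations and one deviating value, the sample coefficient equals $(n - 2)/(n - 1)^{1/2}$.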
As illustrated by Li and Morris (1991), in some cases $\gamma_1$ may not express asymmetry well. Furthermore, as it is based on the third-order moment, $\gamma_1$ is strongly influenced by outliers; see for instance Groeneveld (1991a). This lack of robustness, together with an appetite for broadening the domain of definition, motivates the investigation of alternative measures of skewness.

Unimodal measures
A second measure of skewness that dates back to the pioneering work of Pearson (1895) is called Pearson's mode coefficient of skewness or Pearson's first coefficient of skewness:
$$S'_K = \frac{\mu - M}{\sigma}. \tag{5}$$
Remarkably, for a gamma distribution with unit scale, the equality $S'_K = \gamma_1/2$ holds (Arnold & Groeneveld, 1995). The quantity $\gamma_1/2$ is called the coefficient of momental skewness (Zwillinger & Kokoska, 1999, p. 18) and is clearly equivalent to $\gamma_1$. In general, of course, $\gamma_1$ and $\gamma_1/2$ are not equivalent to $S'_K$. The rationale behind (5) is that, as discussed in Section 2.1, the mode-median-mean inequality is a sign of right skewness. If $-X \precsim X$, then $S'_K > 0$ and (5) gauges the width of the inequality. It is clear that $S'_K$ satisfies (P1). However, as illustrated by Arnold and Groeneveld (1995), the coefficient $S'_K$ does not satisfy (P2): compatibility with right skewness does not extend to full compatibility with the convex ordering of distributions. Hence, we cannot regard $S'_K$ as a valid measure of skewness. Arnold and Groeneveld (1995) proposed replacing (5) with
$$\gamma_{AG} = 1 - 2F(M), \tag{6}$$
which we call the Arnold-Groeneveld coefficient of skewness. The coefficient $\gamma_{AG}$ is well-defined for all $F \in \mathcal{F}^\star_0$, because it does not involve any moment of $X$, which is an improvement in itself. The rationale behind (6) lies in an implicit comparison between $M$ and $m$ (in place of $\mu$): if $M \le m$ then $F(M) \le 1/2$ and $\gamma_{AG} \ge 0$. In this way, as in the previous case, right skewness implies $\gamma_{AG} \ge 0$ through the mode-median-mean inequality. In addition, unlike the previous case, property (P2) is satisfied. This was shown by Arnold and Groeneveld (1995) assuming differentiable probability density functions, but it holds for all $F, G \in \mathcal{F}^\star_0$ with modes $M_X$ and $M_Y$, respectively, that $F \precsim G$ implies $F(M_X) \ge G(M_Y)$; this follows from $F = G \circ R$, where $R = G^{-1} \circ F$, and the definition of unimodality.
Since (P1) follows from the equality $\bar{F}(-M) = 1 - F(M)$, we find that $\gamma_{AG}$ is a valid unimodal measure of skewness of order zero (best possible order).
The coefficient $\gamma_{AG}$ takes values in $[-1, 1]$ and the equality $\gamma_{AG} = 1$ is attained when $M = a$, which requires $a > -\infty$ in the support interval of $F$, while $\gamma_{AG} = -1$ when $M = b$, which requires $b < +\infty$. It follows that all decreasing densities exhibit maximal positive skewness, while all increasing densities exhibit maximal negative skewness. This is clearly a limitation, because $\gamma_{AG}$ cannot discriminate between monotone densities of the same type. A sample version $\hat{\gamma}_{AG}$ of (6) can be obtained from an estimator $\hat{M}$ of the mode and an estimator $\hat{F}$ of the distribution function; the latter can be the empirical distribution function, for simplicity, while the former can be one of the estimators of the mode implemented in package modeest (Poncet, 2019) for R (R Core Team, 2019); see the references therein.

First-order measures
A third classical measure of skewness is called Pearson's median coefficient of skewness or Pearson's second coefficient of skewness:
$$S''_K = \frac{3(\mu - m)}{\sigma}, \tag{7}$$
where the leading (arbitrary) multiplicative constant stems from an approximation of (5); see Yule (1911, p. 150). Equation (7) is well-defined for $F \in \mathcal{F}_2$ and is based, like (5), on the mode-median-mean inequality. It is clear that $S''_K$ satisfies (P1), but like $S'_K$, as shown by van Zwet (1964), $S''_K$ does not satisfy (P2). As a result we cannot consider $S''_K$ a valid measure of skewness. However, a valid replacement for (7) is provided by Groeneveld and Meeden (1984):
$$\gamma_{GM} = \frac{\mu - m}{E|X - m|}, \tag{8}$$
which is well-defined for all $F \in \mathcal{F}_1$ and which we call the Groeneveld-Meeden coefficient of skewness.
The broader domain of definition is an advantage in itself, property (P1) is clearly preserved and, moreover, $\gamma_{GM}$ satisfies (P2), as shown by Groeneveld and Meeden (1984). The coefficient $\gamma_{GM}$ is thus a valid measure of skewness of order one (best possible order using the mean). The mean absolute error turns out to be the right denominator for the difference between the mean and the median, if this is to be used as a measure of skewness. We know from Jensen's inequality that $|E(X - m)| \le E|X - m|$ with equality if and only if $P\{X \ge m\} = 1$ or $P\{X \le m\} = 1$. It follows that $-1 < \gamma_{GM} < 1$ and the extrema of $\gamma_{GM}$ are unattainable by continuous distributions; see Groeneveld (1991b) for the case of discrete distributions. A sample version $\hat{\gamma}_{GM}$ of (8) will be obtained by replacing $m$ with the sample median $\hat{m}$ and $E|X - m|$ with its sample counterpart $\langle |X - \hat{m}| \rangle$, as well as $\mu$ with $\hat{\mu}$. Finally, we point out that (8) admits an interpretation in terms of a player betting that an observation $X$ will exceed its median $m$, which is especially interesting from a financial viewpoint; see Groeneveld and Meeden (1984) for details.
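The sample version of the Groeneveld-Meeden coefficient described above (sample mean minus sample median, divided by the mean absolute deviation from the sample median) can be sketched as follows; the function name and the toy samples are illustrative assumptions.

```python
import statistics

def groeneveld_meeden(xs) -> float:
    """Sample Groeneveld-Meeden coefficient: (mean - median) / mean |x - median|."""
    data = list(xs)
    med = statistics.median(data)
    mean = statistics.fmean(data)
    mean_abs_dev = statistics.fmean([abs(x - med) for x in data])
    return (mean - med) / mean_abs_dev

right_skewed = [1.0, 2.0, 3.0, 4.0, 10.0]   # toy sample with a long right tail
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]     # toy symmetric sample
print(groeneveld_meeden(right_skewed))
print(groeneveld_meeden(symmetric))
```

Reflecting the sample about the origin changes the sign of the coefficient, in line with (P1).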
A simple alternative first-order measure of skewness was suggested by Tajuddin (1999) in parallel to $\gamma_{AG}$:
$$\gamma_T = 2F(\mu) - 1. \tag{9}$$
We call $\gamma_T$ in (9) the Tajuddin coefficient of skewness, noting that Tajuddin (1996) had previously proposed the equivalent measure $\log(F(\mu)/\{1 - F(\mu)\}) = \log\{(1 + \gamma_T)/(1 - \gamma_T)\}$. Equation (9) is clearly well-defined for all $F \in \mathcal{F}_1$, it satisfies (P1), because $\bar{F}(-\mu) = 1 - F(\mu)$, and it satisfies (P2), because Jensen's inequality gives $E(Y) = E(G^{-1}(F(X))) \ge G^{-1}(F(\mu))$ if $X \precsim Y$ ($F \precsim G$); see also Tajuddin (1996). It follows that $\gamma_T$ is a valid alternative to $\gamma_{GM}$. The rationale behind (9) is again the mode-median-mean inequality for right-skewed distributions: if a return is right-skewed, then it is probably below average. It may sound counterintuitive that investors like such returns, but a different wording is possible: if a return is right-skewed, then on average it is in the right tail of its distribution. This may sound more palatable, but neither formulation has any impact on the validity of $\gamma_T$. As for the values that $\gamma_T$ can take, it is immediate to see that $-1 < \gamma_T < 1$. The extrema cannot be attained, because $F$ is continuous, but in Section 3 we present an example where $\gamma_T = 1 - 2\varepsilon \to 1$ as $\varepsilon \downarrow 0$. Finally, a sample version $\hat{\gamma}_T$ of $\gamma_T$ can be obtained from (9) by estimating $F$ with the empirical distribution function and $\mu$ with $\hat{\mu} = \langle X \rangle$.
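A sample version of the Tajuddin coefficient only requires the proportion of observations not exceeding the sample mean, which plays the role of the empirical distribution function evaluated at the sample mean. The sketch below assumes the convention that ties with the mean count as "not exceeding"; the function name and the toy samples are illustrative.

```python
def tajuddin(xs) -> float:
    """Sample Tajuddin coefficient: 2 * (share of observations <= sample mean) - 1."""
    data = list(xs)
    n = len(data)
    mean = sum(data) / n
    share_below = sum(1 for x in data if x <= mean) / n
    return 2.0 * share_below - 1.0

right_skewed = [1.0, 2.0, 3.0, 5.0, 10.0]   # toy sample, mean 4.2
left_skewed = [-x for x in right_skewed]    # its mirror image
print(tajuddin(right_skewed))  # positive: most observations lie below the mean
print(tajuddin(left_skewed))   # negative: most observations lie above the mean
```

With ties at the sample mean, the empirical version is not exactly antisymmetric under reflection, which is a small-sample artifact of the chosen convention rather than a property of the population coefficient.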

Zeroth-order measures
None of the measures of skewness presented until now is well-defined for all $F \in \mathcal{F}_0$. A possibility in this sense is offered by the quantile coefficient of skewness
$$B_\delta = \frac{F^{-1}(1 - \delta) + F^{-1}(\delta) - 2m}{F^{-1}(1 - \delta) - F^{-1}(\delta)}, \tag{10}$$
where $\delta \in\, ]0, 1/2[$ and a typical choice is $\delta = 1/4$. The quartile coefficient of skewness $B_{1/4}$ dates back to Bowley (1920, p. 116) and is known as the Bowley-Yule coefficient of skewness, because the coefficient $2B_{1/4}$ can be traced back to Yule (1911, p. 150). Groeneveld and Meeden (1984) introduced $B_\delta$, inspired by Hinkley (1975), and also let $\delta \downarrow 0$ to obtain the coefficient $B_0 = (a + b - 2m)/(b - a)$ for distributions with a bounded support interval, that is, with $a > -\infty$ and $b < +\infty$.
Remarkably, if both the numerator and denominator in (10) are integrated with respect to $\delta$ from 0 to $1/2$, before taking their ratio, the coefficient $\gamma_{GM}$ in (8) emerges (assuming $F \in \mathcal{F}_1$). It was shown by Groeneveld and Meeden (1984) that $B_\delta$ satisfies (P1) and (P2) for all $\delta \in [0, 1/2[$. Hence, we have a family of valid zeroth-order measures of skewness (making no assumptions on the moments of the distribution). Groeneveld and Meeden (2009) suggest a variant of (10) that is appropriate when the direction of skewness is known a priori, but we do not deal with this case here. Brys, Hubert, and Struyf (2003) argue that the octile coefficient $B_{1/8}$ is more appropriate to detect asymmetry than the quartile coefficient $B_{1/4}$, because it uses more information from the tails of the distribution, but they also note that $B_{1/4}$ is less sensitive to outliers (more robust) than $B_{1/8}$. In the end, this tension between sensitivity and robustness is at the heart of the choice of $\delta$ in (10) and, more generally, of a measure of skewness or any other distributional summary.
It is easy to see that $-1 < B_\delta < 1$ for all $\delta \in [0, 1/2[$; the extreme values are unattainable by continuous distributions, because $B_\delta = -1$ would require $F^{-1}(1 - \delta) = m$ and $B_\delta = 1$ would require $F^{-1}(\delta) = m$, but see Groeneveld (1991b) for discrete distributions. A sample version of (10) will be obtained by replacing all quantiles of $F$ by their sample counterparts (quantiles of the empirical distribution function); in particular, of course, $\hat{m}$ will replace $m$. The coefficients $B_0$ and $B_{1/4}$ admit an interpretation analogous to that of $\gamma_{GM}$; see Groeneveld and Meeden (1984). The coefficient $B_\delta$ features, together with the median $m$ and the $\delta$th interquantile range $S_\delta = F^{-1}(1 - \delta) - F^{-1}(\delta)$, in an interesting decomposition of $F^{-1}(\delta)$, where $\delta$ varies in $[0, 1/2[$; see Benjamini and Krieger (1996).
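A sample quantile coefficient of skewness can be sketched by plugging sample quantiles into the ratio of (upper quantile + lower quantile − 2 × median) to (upper quantile − lower quantile). Sample quantiles admit several conventions; the linear-interpolation rule used below is an assumption made for simplicity and differs slightly from the quantiles of the empirical distribution function. Function names and toy data are illustrative.

```python
import math

def sample_quantile(sorted_xs, p: float) -> float:
    """Linear-interpolation sample quantile (one of several common conventions)."""
    pos = p * (len(sorted_xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

def quantile_skewness(xs, level: float = 0.25) -> float:
    """Sample quantile coefficient of skewness at the given level in ]0, 1/2[."""
    if not 0.0 < level < 0.5:
        raise ValueError("level must lie strictly between 0 and 1/2")
    s = sorted(xs)
    q_lo = sample_quantile(s, level)
    q_hi = sample_quantile(s, 1.0 - level)
    med = sample_quantile(s, 0.5)
    return (q_hi + q_lo - 2.0 * med) / (q_hi - q_lo)

toy = [0.0, 1.0, 2.0, 4.0, 16.0]
print(quantile_skewness(toy, 0.25))    # quartile (Bowley-Yule) coefficient
print(quantile_skewness(toy, 0.125))   # octile coefficient, more tail-sensitive
```

On this toy sample the octile coefficient is larger than the quartile coefficient because it reaches further into the long right tail, which mirrors the sensitivity-robustness trade-off discussed above.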

THE RISK ASYMMETRY INDEX
Let $X$ be a random variable with mean $\mu$ and standard deviation $\sigma$. The centered variable $X - \mu$ can be written as the difference between its positive part $(X - \mu)_+ = \max(0, X - \mu)$ and its negative part $(X - \mu)_- = \max(0, \mu - X)$. Accordingly, the variance of $X$ can be written as $\sigma^2 = \sigma^2_U + \sigma^2_D$, where $\sigma^2_U = E(X - \mu)^2_+$ is called the upside variance of $X$ and $\sigma^2_D = E(X - \mu)^2_-$ is called the downside variance of $X$; note, however, that $\sigma^2_U$ and $\sigma^2_D$ are not ordinary variances, and in particular not the variances of $(X - \mu)_+$ and $(X - \mu)_-$, but rather their second-order moments. The quantities $\sigma_U$ and $\sigma_D$ are called the upside sd and downside sd of $X$, respectively. From a financial viewpoint, $\sigma_U$ represents "good" volatility and $\sigma_D$ represents "bad" volatility, while $\sigma$ represents "total" volatility. The risk asymmetry index (Elyasiani et al., 2018) is defined as $\mathrm{RAX} = (\sigma_U - \sigma_D)/\sigma$ and represents the relative excess of "good" volatility (with respect to "bad" volatility) in the distribution of returns modeled by $X$. The rationale behind this definition is to compare above average returns with below average returns in terms of their root mean squared residuals. We show in the following that such a comparison results in a valid measure of skewness.
The RAX can be rewritten as:
$$\mathrm{RAX} = \sqrt{\frac{\sigma^2_U}{\sigma^2}} - \sqrt{1 - \frac{\sigma^2_U}{\sigma^2}}, \tag{11}$$
that is, as a strictly increasing function of the relative upside variance $\sigma^2_U/\sigma^2$ or, alternatively, as the opposite function of the relative downside variance $\sigma^2_D/\sigma^2 = 1 - \sigma^2_U/\sigma^2$. This rewriting is useful to show that RAX is a valid measure of skewness. In fact, property (P1) follows directly from the fact that the upside variance of $-X$ is the downside variance of $X$. As for property (P2), we first note that (11) is location-scale invariant. This allows us to focus on the standard case $\mu = 0$ and $\sigma = 1$. In this case, we have $\sigma^2_U = E X^2_+$ and it follows from theorem 5.3 in Oja (1981) that $X \precsim Y$ implies $E X^2_+ \le E Y^2_+$. Then, by (11), the same inequality holds for RAX and (P2) holds. We conclude that RAX is a valid second-order measure of skewness and, as such, it fills a gap in the literature reviewed in Section 2.2.
It is clear from the decomposition of variance into its upside and downside components that $0 < \sigma_U^2/\sigma^2 < 1$. If $X$ is symmetric, then $\sigma_U^2 = \sigma_D^2$ and $\sigma_U^2/\sigma^2 = 1/2$, so that RAX = 0. The following example shows that the relative upside variance can come arbitrarily close to 1 for a suitable choice of $F$ in $\mathcal{F}_2$. Let $X$ be a random variable with probability density function defined by f( [...] Similarly, since $\sigma_D^2/\sigma^2 \to 0$, the relative upside variance of $-X$ can get arbitrarily close to 0. It follows that RAX → 1 for $X$ and RAX → −1 for $-X$. Note that $P\{X \le 0\} = 1 - \varepsilon$, so that $\gamma_T = 1 - 2\varepsilon$ as anticipated in Section 2.2.

A sample version of RAX can be obtained from (11) by replacing $\sigma^2$ with the sample variance $\hat{\sigma}^2$ and $\sigma_U^2$ with the sample upside variance $\hat{\sigma}_U^2 = \langle (X - \hat{\mu})_+^2 \rangle$, or $\sigma_D^2$ with the sample downside variance $\hat{\sigma}_D^2 = \langle (X - \hat{\mu})_-^2 \rangle$, where of course $\hat{\mu} = \langle X \rangle$ is the sample mean.

An advantage of RAX is that risk-neutral versions of these quantities are easy to obtain from option data in a model-free set-up (Bakshi, Kapadia, & Madan, 2003; Muzzioli, 2013a, 2013b); this was indeed the original setting of Elyasiani et al. (2018). In this setting $\sigma_U$ and $\sigma_D$ represent the upside corridor implied volatility and the downside corridor implied volatility, respectively (Carr & Madan, 1998; Muzzioli, 2013a, 2013b). From an economic point of view the upside corridor is associated with "good" volatility, as it refers to the potential for substantial gains. On the other hand, the downside corridor is associated with "bad" volatility due to the risk of heavy losses for investors. Note that, in principle, the entire risk-neutral distribution function of returns can be recovered from option data, because it is the discounted first derivative of the European put price, but in practice it can be tricky to go beyond the first moments; see Birru and Figlewski (2012) for an example of work in this direction.
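The sample version of RAX described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_rax` is hypothetical, and the angle-bracket averages are read as plain means with divisor n, so that the sample upside and downside variances add up exactly to the sample variance.

```python
import math


def sample_rax(xs):
    """Sample RAX: (upside sd - downside sd) / total sd, about the sample mean.

    Assumption: all averages use divisor n (not n - 1), so that the upside
    and downside variances sum exactly to the total variance.
    """
    n = len(xs)
    mean = sum(xs) / n
    # Second-order moments of the positive and negative parts of the residuals.
    up_var = sum(max(0.0, x - mean) ** 2 for x in xs) / n
    down_var = sum(max(0.0, mean - x) ** 2 for x in xs) / n
    total_var = up_var + down_var
    if total_var == 0.0:
        raise ValueError("RAX is undefined for constant data")
    return (math.sqrt(up_var) - math.sqrt(down_var)) / math.sqrt(total_var)


if __name__ == "__main__":
    returns = [0.0, 0.0, 0.0, 4.0]          # toy data with one large gain
    print(sample_rax(returns))               # positive: upside dominates
    print(sample_rax([-r for r in returns])) # same magnitude, opposite sign
```

For the toy data the mean is 1, the upside variance is 9/4 and the downside variance is 3/4, so the index equals (1.5 − √0.75)/√3 ≈ 0.366; negating the data flips its sign, in line with property (P1).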

EMPIRICAL FINDINGS
In this section we follow the approach of Brys et al. (2003) and Tajuddin (1996, 1999) to examine the robustness to outliers and the sensitivity to changes in the shape of the distribution of the RAX introduced in Section 3, together with other valid measures of skewness. More specifically, we consider simulated data from four common distributions (Gamma, Weibull, Log-normal, and Pareto) for different values of their shape parameter $\alpha$ and we compare four measures of skewness: the Groeneveld-Meeden ($\gamma_{GM}$), the Bowley-Yule ($B_{1/4}$), the RAX and the Tajuddin ($\gamma_T$) coefficient of skewness (all taking values between −1 and 1).
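For orientation, two of the competing measures can be sketched from their usual textbook forms. The definitions below are assumptions, since the paper's own definitions are given in a section not reproduced here: the Bowley-Yule coefficient is taken as the quartile skewness (Q3 + Q1 − 2·Q2)/(Q3 − Q1) and the Groeneveld-Meeden coefficient as (mean − median) divided by the mean absolute deviation from the median. The function names and the choice of the "inclusive" quantile rule are illustrative; the Tajuddin coefficient is left out because its exact form is not stated in this section.

```python
import statistics


def bowley_yule(xs):
    """Quartile skewness (assumed textbook form of the Bowley-Yule coefficient)."""
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def groeneveld_meeden(xs):
    """(mean - median) / mean absolute deviation about the median (assumed form)."""
    med = statistics.median(xs)
    mean_abs_dev = sum(abs(x - med) for x in xs) / len(xs)
    return (statistics.fmean(xs) - med) / mean_abs_dev


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 10.0]  # toy sample with one large value
    print(bowley_yule(data))           # quartiles are 2, 3, 4: no skewness seen
    print(groeneveld_meeden(data))     # mean 4 vs median 3: positive skewness
```

The toy sample hints at the trade-off studied in this section: the quartile-based coefficient ignores the single large observation entirely, while the mean-based one reacts to it.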
Recall that a Gamma distribution has probability density function given by

$$f(x) = \frac{x^{\alpha - 1} e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \qquad x > 0,$$

where $\alpha > 0$ is a shape parameter, while $\beta > 0$ is a scale parameter and can therefore be ignored (set to 1) as far as skewness is concerned. Arnold and Groeneveld (1995) highlight that the parameterization of Gamma distributions in terms of $\alpha$ respects the convex ordering of distributions. Indeed, the Fisher-Pearson coefficient of skewness is given by $\gamma_1 = 2/\sqrt{\alpha}$ and it decreases as $\alpha$ increases. A Weibull distribution has probability density function given by

$$f(x) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha - 1} \exp\left\{-\left(\frac{x}{\beta}\right)^{\alpha}\right\}, \qquad x > 0,$$

where $\alpha > 0$ is a shape parameter and $\beta > 0$ is a scale parameter (set to 1 without loss of generality). The Fisher-Pearson coefficient of skewness for a Weibull distribution is known to be positive for small $\alpha$, decreasing with $\alpha$ and negative for large $\alpha$ (Tajuddin, 1999). A Log-normal distribution has the following probability density function:

$$f(x) = \frac{\alpha}{x\sqrt{2\pi}} \exp\left\{-\frac{\alpha^{2}(\log x - \mu)^{2}}{2}\right\}, \qquad x > 0,$$

where $\alpha > 0$ is a shape parameter ($\alpha = 1/\sigma$ in terms of the standard deviation $\sigma$ of $\log X$) and $\mu = 0$, without loss of generality, because $\exp(\mu)$ is a scale parameter. The Fisher-Pearson coefficient of skewness for a Log-normal distribution is $\gamma_1 = (\exp(\alpha^{-2}) + 2)\sqrt{\exp(\alpha^{-2}) - 1}$, which decreases with $\alpha$; see for example Tajuddin (1999). Finally, the probability density function of a Pareto distribution is

$$f(x) = \frac{\alpha\, x_0^{\alpha}}{x^{\alpha + 1}}, \qquad x \ge x_0,$$

where $\alpha > 0$ is a shape parameter, while $x_0 > 0$ is a scale parameter (set to 1). Note that, for a Pareto distribution, the expected value and variance are finite if $\alpha > 1$ and $\alpha > 2$, respectively, while the Fisher-Pearson coefficient of skewness requires $\alpha > 3$, and it is decreasing with $\alpha$.
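The monotonicity statements above can be checked numerically from closed-form expressions. In the sketch below, the Gamma and Log-normal formulas are the ones quoted in the text; the Weibull expression (via raw moments Γ(1 + k/α)) and the Pareto expression are standard results brought in here as assumptions, and all function names are illustrative. The shape parameter is written `alpha`, with unit scale throughout.

```python
import math


def gamma_skewness(alpha):
    """Fisher-Pearson skewness of a Gamma distribution with shape alpha."""
    return 2.0 / math.sqrt(alpha)


def weibull_skewness(alpha):
    """Fisher-Pearson skewness of a unit-scale Weibull with shape alpha."""
    m1 = math.gamma(1.0 + 1.0 / alpha)  # E[X]
    m2 = math.gamma(1.0 + 2.0 / alpha)  # E[X^2]
    m3 = math.gamma(1.0 + 3.0 / alpha)  # E[X^3]
    var = m2 - m1 ** 2
    # Third central moment rewritten in terms of the mean and variance.
    return (m3 - 3.0 * m1 * var - m1 ** 3) / var ** 1.5


def lognormal_skewness(alpha):
    """Fisher-Pearson skewness of a Log-normal with sigma = 1 / alpha."""
    w = math.exp(alpha ** -2)
    return (w + 2.0) * math.sqrt(w - 1.0)


def pareto_skewness(alpha):
    """Fisher-Pearson skewness of a Pareto distribution; needs alpha > 3."""
    if alpha <= 3.0:
        raise ValueError("the third moment is infinite unless alpha > 3")
    return 2.0 * (1.0 + alpha) / (alpha - 3.0) * math.sqrt((alpha - 2.0) / alpha)


if __name__ == "__main__":
    for a in (1.0, 2.0, 3.0, 4.0, 5.0):
        print(a, gamma_skewness(a), weibull_skewness(a), lognormal_skewness(a))
    for a in (3.5, 4.0, 5.0, 10.0):
        print(a, pareto_skewness(a))
```

Printing a small grid is enough to see each coefficient fall as the shape parameter grows, and the Weibull coefficient change sign between shape values 3 and 4.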
In the Appendix, we report the average estimated skewness by varying $\alpha$ in a distribution-specific set of values, and letting n = 30, 100, 1,000. Since the standard errors are small, and the four measures of skewness behave similarly for all sample sizes, in Figure 1 we focus on the average estimated skewness, as a function of $\alpha$, for samples of size n = 100. In the four cases considered, we expect all four measures to decrease monotonically in $\alpha$ and Figure 1 confirms this expectation. Each measure starts close to 1 when $\alpha$ assumes its minimum value (indicating high skewness) and falls toward 0 as $\alpha$ grows (indicating low skewness). For the Gamma distribution, in Figure 1a, we see that $B_{1/4}$ and $\gamma_T$ attain the smallest values, while RAX remains between $\gamma_{GM}$ (upper limit) and $B_{1/4}$ or $\gamma_T$ (lower limit) for every $\alpha$. Similar results hold for the Weibull distribution, in Figure 1b, and for the Pareto distribution, in Figure 1d, whereas for the Log-normal distribution, in Figure 1c, the RAX remains slightly above the other measures of skewness, which are here very close to each other. This means that, across a variety of probability distributions, RAX displays an intermediate sensitivity to changes in the shape of each distribution.

FIGURE 2 Boxplots of skewness estimates on 1,000 random samples of n = 1,000 observations from a Gamma distribution with shape parameter $\alpha$ = 1.5 without contamination (a) and with 15% contamination (b). Difference between average skewness estimates at contaminated and at uncontaminated data for different values of $\alpha$ of the Gamma distribution and 5% (c) and 15% (d) contamination, respectively.
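The kind of experiment summarized in Figure 1 can be imitated on a small scale. The sketch below is not the authors' code: it averages the sample RAX over repeated Gamma samples of size n = 100, with a hypothetical grid of shape values, a hypothetical number of replications and fixed seeds, and it assumes divisor-n sample moments. Only the Python standard library is used.

```python
import math
import random


def sample_rax(xs):
    """Sample RAX with divisor-n moments about the sample mean (assumption)."""
    n = len(xs)
    mean = sum(xs) / n
    up_var = sum(max(0.0, x - mean) ** 2 for x in xs) / n
    down_var = sum(max(0.0, mean - x) ** 2 for x in xs) / n
    return (math.sqrt(up_var) - math.sqrt(down_var)) / math.sqrt(up_var + down_var)


def average_rax_gamma(alpha, n=100, reps=200, seed=0):
    """Average sample RAX over `reps` Gamma(alpha, scale 1) samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
        total += sample_rax(sample)
    return total / reps


if __name__ == "__main__":
    # Hypothetical grid of shape values, chosen only for illustration.
    for alpha in (0.5, 1.0, 2.0, 5.0, 10.0):
        print(alpha, round(average_rax_gamma(alpha), 3))
```

With these settings the averages should fall as the shape parameter grows, mirroring the qualitative pattern described for Figure 1a; the other three distributions can be handled the same way with `weibullvariate`, `lognormvariate` and `paretovariate` from the same module.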
As a further step, we analyze the robustness of the four measures of skewness with respect to the influence of a number of outliers. For this purpose, in panels a of Figures 2-5 we propose the boxplots of the skewness estimates on 1,000 random samples of n = 1,000 observations, while panels b of Figures 2-5 depict the same boxplots where we replaced 15% of the data with outliers placed 8 standard deviations to the right of the mean. The value of $\alpha$ for each distribution is fixed, and it is equal to $\alpha$ = 1.5 for the Gamma and the Weibull distributions, and to $\alpha$ = 0.8 and $\alpha$ = 3 for the Log-normal and Pareto distributions, respectively. In the four numerical simulations conducted, it may be seen that the median values increase for all measures of skewness except RAX, bringing the boxes with them, whereas the RAX shows a decrease. We will see that $\gamma_T$ can also decrease upon contamination, for smaller values of $\alpha$, while this will not happen to $\gamma_{GM}$ and $B_{1/4}$. We explore robustness further in panels c and d of Figures 2-5, where further simulations are depicted. Specifically, panels c and d of Figures 2-5 show, for each measure of skewness and several values of $\alpha$, the difference between the average estimated value at the contaminated and at the original datasets. We replaced a percentage of the original data with outliers under two different contamination levels: we contaminated our data at 5% in Figures 2-5c and at 15% in Figures 2-5d. Focusing on the absolute skewness change upon contamination, we note that $B_{1/4}$ stands out as rather insensitive to the presence of outliers, while the performance of RAX is competitive with that of $\gamma_T$ and $\gamma_{GM}$. As for the sign of change, as anticipated, we observe in panels c and d of Figures 2-5 a possible decrease of $\gamma_T$ and RAX, but not of $B_{1/4}$, nor of $\gamma_{GM}$.

FIGURE 3 Boxplots of skewness estimates on 1,000 random samples of n = 1,000 observations from a Weibull distribution with shape parameter $\alpha$ = 1.5 without contamination (a) and with 15% contamination (b). Difference between average skewness estimates at contaminated and at uncontaminated data for different values of $\alpha$ of the Weibull distribution and 5% (c) and 15% (d) contamination, respectively.

FIGURE 4 Boxplots of skewness estimates on 1,000 random samples of n = 1,000 observations from a Log-normal distribution with shape parameter $\alpha$ = 0.8 without contamination (a) and with 15% contamination (b). Difference between average skewness estimates at contaminated and at uncontaminated data for different values of $\alpha$ of the Log-normal distribution and 5% (c) and 15% (d) contamination, respectively.
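The contamination scheme can be sketched as follows. Several details are assumptions rather than facts taken from the paper: every outlier is placed at the single point (mean + 8 standard deviations), both computed on the clean data with divisor n; the replaced positions are drawn at random; and the Bowley-Yule coefficient is taken in its usual quartile form. The helper names `contaminate` and `skewness_shift` are hypothetical, and the sketch makes no attempt to reproduce the figures.

```python
import math
import random
import statistics


def sample_rax(xs):
    """Sample RAX with divisor-n moments about the sample mean (assumption)."""
    n = len(xs)
    mean = sum(xs) / n
    up_var = sum(max(0.0, x - mean) ** 2 for x in xs) / n
    down_var = sum(max(0.0, mean - x) ** 2 for x in xs) / n
    return (math.sqrt(up_var) - math.sqrt(down_var)) / math.sqrt(up_var + down_var)


def bowley_yule(xs):
    """Quartile skewness (assumed textbook form)."""
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def contaminate(xs, fraction, n_sd=8.0, seed=0):
    """Return a copy of xs with a fraction of points moved to mean + n_sd * sd."""
    rng = random.Random(seed)
    k = round(fraction * len(xs))
    outlier = statistics.fmean(xs) + n_sd * statistics.pstdev(xs)
    dirty = list(xs)  # the input list is left untouched
    for i in rng.sample(range(len(xs)), k):
        dirty[i] = outlier
    return dirty


def skewness_shift(measure, xs, fraction):
    """Change in a skewness measure when moving from clean to contaminated data."""
    return measure(contaminate(xs, fraction)) - measure(xs)


if __name__ == "__main__":
    rng = random.Random(42)
    clean = [rng.gammavariate(1.5, 1.0) for _ in range(1000)]
    for name, measure in (("RAX", sample_rax), ("Bowley-Yule", bowley_yule)):
        shifts = [round(skewness_shift(measure, clean, f), 3) for f in (0.05, 0.15)]
        print(name, shifts)
```

Averaging `skewness_shift` over many independent samples, for each measure and each shape value, would give a rough analogue of panels c and d.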
As an aside, following Tajuddin (1999), we analyze the skewness sign pattern according to the relationships between the mean, median, and mode of the Weibull distribution. Table 1 presents the signs of the four measures of skewness depending on the value of $\alpha$. It may be observed that the sign of $B_{1/4}$ is opposite to the signs of all other measures when 3.2589 < $\alpha$ < 3.4395. In this case, $B_{1/4}$ fails to preserve the sign of $(\mu - m)$, where $m$ denotes the median, while RAX, $\gamma_T$, and $\gamma_{GM}$ respect the relationship between $\mu$ and $m$. Based on our empirical findings, we can conclude that RAX strikes a good balance between robustness to outliers and sensitivity to changes in the shape of the distribution.
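The two thresholds quoted above can be related to the mean, median and mode of a unit-scale Weibull distribution with a short numerical check. The closed forms used below (mean Γ(1 + 1/α), median (log 2)^(1/α), mode ((α − 1)/α)^(1/α) for α > 1) are standard results assumed here, and the bisection helper and its bracketing interval [2, 5] are illustrative choices.

```python
import math


def weibull_mean(alpha):
    return math.gamma(1.0 + 1.0 / alpha)


def weibull_median(alpha):
    return math.log(2.0) ** (1.0 / alpha)


def weibull_mode(alpha):
    """Mode of a unit-scale Weibull; this expression is valid for alpha > 1."""
    return ((alpha - 1.0) / alpha) ** (1.0 / alpha)


def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Shape value at which the median and the mode coincide: 1 / (1 - log 2).
median_equals_mode = bisect_root(
    lambda a: weibull_median(a) - weibull_mode(a), 2.0, 5.0
)
# Shape value at which the mean and the median coincide.
mean_equals_median = bisect_root(
    lambda a: weibull_mean(a) - weibull_median(a), 2.0, 5.0
)

if __name__ == "__main__":
    print(median_equals_mode, mean_equals_median)
```

Under these assumptions the two roots agree with the endpoints 3.2589 and 3.4395 to the quoted precision, which suggests that the interval is the range of shape values where the mean still exceeds the median although the median has already fallen below the mode.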

DISCUSSION
In this paper we presented a comprehensive framework for the assessment of univariate skewness. We reviewed existing measures and proposed a new one, called RAX, based on Elyasiani et al. (2018). We showed that RAX is a valid measure of skewness and can be safely used by scholars. By using simulated data, we found that RAX strikes a good balance between robustness to outliers and sensitivity to changes in the shape of the distribution. RAX is the relative difference between upside and downside volatility. We used volatility, following Elyasiani et al. (2018), due to its high standing in finance. In principle, we could also compare above average returns with below average returns in terms of their mean absolute residuals, rather than root mean squared residuals, but we would still need volatility in the denominator to satisfy (P2).