Expectile based measures of skewness

In the literature, quite a few measures have been proposed for quantifying the deviation of a probability distribution from symmetry. The most popular of these skewness measures are based on the third central moment or on quantiles. However, there are major drawbacks in using these quantities. These include a strong emphasis on the distributional tails and poor asymptotic behaviour for the (empirical) moment-based measure, as well as difficult statistical inference and erratic behaviour for discrete distributions for quantile-based measures. Therefore, in this paper, we introduce skewness measures based on or connected with expectiles. Since expectiles can be seen as smoothed versions of quantiles, they preserve the advantages over the moment-based measure while not exhibiting most of the disadvantages of quantile-based measures. We introduce corresponding empirical counterparts and derive asymptotic properties. Finally, we conduct a simulation study, comparing the newly introduced measures with established ones and evaluating the performance of the respective estimators.


INTRODUCTION
Symmetry of a distribution or density function is one of the oldest and, at the same time, one of the most important concepts in probability theory and statistics. A real random variable X with cumulative distribution function (cdf) F is symmetric about μ if X − μ ∼ μ − X, or, equivalently, if F(μ − x) = 1 − F(μ + x), x ∈ R. Over time, a sizeable number of asymmetry or skewness measures has been proposed in the literature to quantify the deviation from symmetry. The best-known skewness measure is certainly the moment-based measure M = E[(X − EX)³] / Var(X)^{3/2}, which is often used synonymously with the notion of skewness itself. However, this measure has a number of disadvantages. First, it is so sensitive to the extreme tails of the distribution that it is difficult to estimate accurately in practice when the distribution is markedly skew (Hosking, 1990). Second, it cannot be normalized, which makes the corresponding skewness values less informative and less comparable. Third, the asymptotic distribution of its empirical counterpart involves moments up to order 6 of the underlying distribution, implying a very slow convergence, especially for heavy-tailed distributions. Finally, M = 0 does not characterize symmetry. This means that there exist nonsymmetric distributions for which M is equal to zero. For a specific example using the generalized lambda distribution, see Ramberg et al. (1979) and Ramberg et al. (1980). Another example is the gamma difference distribution, also known as the generalized asymmetric Laplace distribution. As the latter name indicates, this distribution is asymmetric, but the moment skewness is zero under a specific condition on the parameter values; see Klar (2015). On the plus side, there is its familiarity, and the fact that this measure is also reasonable for discrete distributions with finite third (or sixth) moment.
A distribution is symmetric if and only if q_X(1 − α) − q_X(1/2) = q_X(1/2) − q_X(α) for each α ∈ (0, 1/2), where q_X(α) denotes the α-quantile of the distribution of X. Hence, one can also provide a quantile-based definition of skewness: A distribution is said to be right-skewed (left-skewed) in the quantile sense if and only if q_X(1 − α) − q_X(1/2) ≥ (≤) q_X(1/2) − q_X(α), α ∈ (0, 1/2), and equality does not hold for each α ∈ (0, 1/2). Corresponding scalar measures of skewness are

b₁ = (q_X(3/4) + q_X(1/4) − 2 q_X(1/2)) / (q_X(3/4) − q_X(1/4))

(see Yule, 1912; Bowley, 1920) or the general version

b₂(α) = (q_X(1 − α) + q_X(α) − 2 q_X(1/2)) / (q_X(1 − α) − q_X(α)), α ∈ (0, 1/2),

introduced by David and Johnson (1956) (see also Hinkley, 1975; Groeneveld & Meeden, 1984). As a measure of skewness, b₁ can be criticized for being insensitive to the distribution of X any further into the tails than the quartiles (Hosking, 1990). Quantiles are not unique for discrete distributions, in particular for empirical distributions. Furthermore, statistical inference involving the asymptotic distribution of b₁ or b₂ requires the evaluation of the (unknown) density of the underlying distribution, which typically requires bandwidth selection and, hence, leads to a certain arbitrariness. Again, rather large sample sizes are desirable to reliably represent the population being sampled. On the other hand, b₁ is a robust and quite intuitive measure, and b₂(α) ≡ 0 characterizes symmetry. Both b₁ and b₂ satisfy three properties, introduced by van Zwet (1964), which are often seen as appropriate for a skewness coefficient γ (see Oja, 1981; Groeneveld & Meeden, 1984; Tajuddin, 1999): S1. For c > 0 and d ∈ R, γ(cX + d) = γ(X). S2. The measure satisfies γ(−X) = −γ(X). S3. Let F and G, the cdf's of X and Y, be absolutely continuous and strictly increasing on {x : 0 < F(x) < 1} and {x : 0 < G(x) < 1}, respectively.
If F is smaller than G in the convex transformation order (written F ≤₂ G), that is, if G⁻¹(F(x)) is convex, then γ(X) ≤ γ(Y). The convex transform order is equivalent to

(F⁻¹(w) − F⁻¹(v)) / (F⁻¹(v) − F⁻¹(u)) ≤ (G⁻¹(w) − G⁻¹(v)) / (G⁻¹(v) − G⁻¹(u)) for all 0 < u < v < w < 1. (2)

Even though this characterization of the convex transformation order should be known, we were not able to find it in the literature. Thus, we give some details in the Appendix. Plugging v = 1/2, w = 1 − α and u = α into (2) shows immediately that b₂ satisfies (S3), that is, it preserves the convex transformation order. A different proof of this fact has been given by Groeneveld and Meeden (1984). On the other hand, the convex transform order is the strongest of all commonly used skewness orders. Hence, the requirement b_{2,X}(α) ≤ b_{2,Y}(α) for each α ∈ (0, 1/2) is also a very strong one. As Arnold and Groeneveld (1993) argue, one might be willing to forgive minor local violations rather than to insist on a uniform domination of b_{2,X} by b_{2,Y}, and such a proposal is made in the following.
To do so, we consider measures of skewness based on or related to expectiles (which are formally defined in Section 2). Among others, we propose

s̃₂(α) = (e_X(1 − α) + e_X(α) − 2μ) / (e_X(1 − α) − e_X(α)), α ∈ (0, 1/2),

where μ = EX and e_X(α) denotes the α-expectile of X, as a family of expectile-based measures of skewness. Contrary to M, these measures can be normalized, characterize symmetry when equal to zero, exist for any distribution with finite first moment, and have very convenient asymptotic properties. Since expectiles can be seen as smoothed quantiles, s̃₂(α) has a similar interpretation as b₂(α), but it is sensitive to the whole distribution of X.
There are some further skewness measures with structures similar to b₂ and s̃₂(α). L-skewness, a measure based on L-moments, was introduced by Hosking (1990) and can be written as

τ₃ = (E X_{3:3} − 2 E X_{2:3} + E X_{1:3}) / (E X_{3:3} − E X_{1:3}),

where X_{1:3} ≤ X_{2:3} ≤ X_{3:3} denote the order statistics of a random sample of size 3 from the distribution of X. Like s̃₂, it exists whenever E|X| < ∞. As for M, τ₃ = 0 does not characterize symmetry. As an example, consider the generalized extreme value distribution (GEV) with shape parameter k, for which τ₃ = 2(1 − 3^{−k})/(1 − 2^{−k}) − 3 (Hosking, 1990). Hence, τ₃ = 0 for k* ≈ 0.284. However, having support (−∞, 1/k*), this distribution is asymmetric in the strict mathematical sense. Since the Weibull distribution is related to the GEV by a simple transformation, it also follows that the L-skewness is zero for a Weibull distribution with shape parameter 1/k* ≈ 3.524. This is very close to the value 3.6 for which M vanishes (MacGillivray, 1986). Critchley and Jones (2008) defined an asymmetry function as follows. For any a < b, they considered the class of all rooted unimodal densities with support (a, b). For any 0 < p < 1, there are two points x_L(p) and x_R(p), one on each side of the mode m, satisfying f(x_L(p)) = f(x_R(p)) = p f(m). Skewness or asymmetry is measured by

a(p) = (x_R(p) + x_L(p) − 2m) / (x_R(p) − x_L(p)).

Even if the idea is interesting, it is difficult to use this measure in practice due to the need of estimating the level points of the density as well as the modal value. Accordingly, not much is known about the properties of estimators of this measure. Further, the measure is only defined for a quite restricted class of distributions. Compared with b₂, this is a local measure of asymmetry, whereas s̃₂(α) can be seen as an integral version of b₂.
Clearly, there exist many other proposals of skewness measures besides the above mentioned. Further examples can be found later in this work; for overviews, see, for example, Benjamini and Krieger (1996), Critchley and Jones (2008), and MacGillivray (1986).
This paper is organized as follows. In Section 2, we recall the definitions of expectiles and some of their properties. Section 3 formally introduces s̃₂ and discusses its properties. We define closely linked skewness measures by normalizing s̃₂ and by considering limiting values with respect to α. In Section 4, we introduce related skewness measures based on Omega ratios and stop-loss transforms. Empirical counterparts of the proposed measures are analyzed in Section 5. We illustrate the behavior of the newly proposed skewness measures and their empirical counterparts for some families of distributions in Section 6. Most proofs are postponed to the Appendix.

EXPECTILES AND EXPECTILE LOCATION ORDER
Throughout the paper, we assume that all mentioned random variables X are nondegenerate, have a finite mean (denoted by X ∈ L¹) and are defined on a common probability space (Ω, 𝒜, P) unless stated otherwise. Recall that the expectiles e_X(α) of a random variable X ∈ L² have been defined by Newey and Powell (1986) as the minimizers of an asymmetric quadratic loss:

e_X(α) = argmin_{e ∈ R} E[η_α(X − e)], (3)

where η_α(x) = |α − 1{x < 0}| x² and α ∈ (0, 1). For X ∈ L¹, equation (3) has to be modified (Newey & Powell, 1986) to

e_X(α) = argmin_{e ∈ R} E[η_α(X − e) − η_α(X)]. (4)

The minimizer in (3) or (4) is always unique and is identified by the first order condition

α E[(X − e)₊] = (1 − α) E[(X − e)₋], (5)

where x₊ = max{x, 0}, x₋ = max{−x, 0}. This is equivalent to characterizing expectiles via an identification function, which, for any α ∈ (0, 1), is defined by

I_α(x, e) = α (x − e)₊ − (1 − α) (x − e)₋. (6)

The α-expectile of a random variable X ∈ L¹ is then the unique solution e = e_X(α) of E[I_α(X, e)] = 0. Similarly, the empirical α-expectile ê_n(α) of a sample X₁, …, X_n is defined as the solution of

(1/n) Σᵢ I_α(Xᵢ, e) = 0. (7)

Like quantiles, expectiles are measures of non-central location; we collect some of their properties from Newey and Powell (1986) and Bellini et al. (2014) in the following proposition. Clearly, expectiles depend only on the distribution of the random variable X; they can be seen as statistical functionals defined on the set of distribution functions with finite mean on R.
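As an illustration, the empirical α-expectile in (7) can be computed by exploiting that the sample identification function is strictly decreasing in e, so simple bisection applies. The following Python sketch is ours, not part of the paper; the function name is hypothetical.

```python
import numpy as np

def empirical_expectile(x, alpha, iters=100):
    """Empirical alpha-expectile: root of the sample identification function
    (1/n) sum_i [alpha*(X_i - e)_+ - (1-alpha)*(X_i - e)_-] = 0,
    which is strictly decreasing in e, so bisection applies."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    for _ in range(iters):
        e = 0.5 * (lo + hi)
        val = alpha * np.mean(np.maximum(x - e, 0.0)) \
            - (1 - alpha) * np.mean(np.maximum(e - x, 0.0))
        if val > 0:   # e lies below the expectile
            lo = e
        else:
            hi = e
    return 0.5 * (lo + hi)
```

For α = 1/2, the identification function reduces to (1/n) Σ(Xᵢ − e), so ê_n(1/2) is just the sample mean; for α above (below) 1/2, the expectile moves toward the upper (lower) tail.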
Quantiles and expectiles are closely connected as measures of non-central location. Bellini et al. (2018) introduced the expectile (location) order between two random variables: Two random variables X, Y ∈ L¹ are ordered in the expectile order (written X ≤_e Y) if e_X(α) ≤ e_Y(α) for all α ∈ (0, 1).
It is well known that the usual stochastic order ≤ st is equivalent to the pointwise ordering of the quantiles. In view of this, the preceding definition seems quite natural, since quantiles are just replaced by expectiles.
The next theorem shows that the usual stochastic order implies the expectile order, that is, the ordering of the quantiles implies the ordering of the expectiles.
This has been proved by Bellini (2012) by an order-theoretic comparative statics approach. Since this proof may not be easily accessible for all readers, we give a rather elementary proof in the Appendix.

AN EXPECTILE-BASED ORDERING WITH RESPECT TO SKEWNESS
The distribution of a random variable X is symmetric around μ if X − μ ∼ μ − X. Since a distribution is uniquely determined by its expectile function, and since the mean μ = e_X(1/2) of X coincides with the center of symmetry for a symmetric distribution, it follows that a distribution is symmetric if and only if e_{X−μ}(α) = e_{μ−X}(α) for each α ∈ (0, 1). Using properties (a) and (e) in Proposition 1, this is equivalent to

e_X(1 − α) + e_X(α) = 2μ, α ∈ (0, 1).

In analogy to (1), this leads to the following definition of expectile-based skewness.

Definition 1. A distribution is called right-skewed (left-skewed) in the expectile sense, if and only if

e_X(1 − α) − μ ≥ (≤) μ − e_X(α), α ∈ (0, 1/2),

and equality does not hold for each α ∈ (0, 1/2).
Corresponding scalar measures of skewness are s̃₁ = s̃₂(1/4) or the general versions

s̃₂(α) = (e_X(1 − α) + e_X(α) − 2μ) / (e_X(1 − α) − e_X(α)), α ∈ (0, 1/2). (8)

The numerator of (8) is the difference of the two positive numbers e_X(1 − α) − μ and μ − e_X(α), while the denominator is the sum of these numbers. Hence, −1 ≤ s̃₂(α) ≤ 1. Note that the following decomposition holds for the expectile as a measure of noncentral tendency:

e_X(1 − α) = μ + ½ (e_X(1 − α) − e_X(α)) + ½ (e_X(1 − α) + e_X(α) − 2μ). (9)

This is the counterpart to the decomposition given in Benjamini and Krieger (1996) for the quantile. The first term in (9), the mean, is a measure of central location; the second term, half of the expectile distance e_X(1 − α) − e_X(α), is a measure of variability for α < 1/2. Finally, the third term, which is essentially the numerator in (8), is zero for symmetric distributions and, hence, quantifies the deviation from symmetry.
The actual range of s̃₂(α) depends on α and can be considerably smaller than the interval [−1, 1]. This is specified in the following result, whose proof can be found in the Appendix.
Based on this result, we redefine our expectile-based skewness measure as

s₂(α) = s̃₂(α) / (1 − 2α), α ∈ (0, 1/2).

Then, −1 < s₂(α) < 1, and both inequalities are sharp for any α ∈ (0, 1/2). Further, we define s₁ = s₂(1/4) = 2 s̃₁. Clearly, for the comparison of the skewness of two random variables, it does not matter whether we use s̃₂ or s₂. Hence, we say that Y is more skewed to the right than X in the expectile sense if s̃_{2,X}(α) ≤ s̃_{2,Y}(α) for each α ∈ (0, 1/2). Using the properties of expectiles, it is clear that s̃₂(α) satisfies the skewness properties S1 and S2. The validity of S3 is an open question.
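Both the raw measure s̃₂(α) and its normalized version can be computed from data by plugging in empirical expectiles. The sketch below follows our reading of (8) and of the normalization by (1 − 2α); the helper names are ours.

```python
import numpy as np

def expectile(x, alpha, iters=100):
    # empirical expectile via bisection on the decreasing identification function
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    for _ in range(iters):
        e = 0.5 * (lo + hi)
        val = alpha * np.mean(np.maximum(x - e, 0.0)) \
            - (1 - alpha) * np.mean(np.maximum(e - x, 0.0))
        lo, hi = (e, hi) if val > 0 else (lo, e)
    return 0.5 * (lo + hi)

def s2_tilde(x, alpha):
    """Expectile skewness (8): difference over sum of the two
    expectile distances from the mean."""
    mu = np.mean(x)
    up, low = expectile(x, 1.0 - alpha), expectile(x, alpha)
    return (up + low - 2.0 * mu) / (up - low)

def s2(x, alpha):
    """Normalized version s2(alpha) = s2_tilde(alpha) / (1 - 2*alpha)."""
    return s2_tilde(x, alpha) / (1.0 - 2.0 * alpha)
```

For an exactly symmetric sample the numerator of (8) vanishes, so s₂(α) = 0; for right-skewed data the upper expectile distance dominates and s₂(α) > 0.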
An even more general definition of an expectile-based skewness order would be the analogue to display (2); however, this seems to be rather unmanageable in applications.
For the quantile quotient b₂(α), MacGillivray (1992) proposed sup_{p < α ≤ 1/2} |b₂(α)| as "a measure of overall asymmetry in the central 100 ⋅ (1 − 2p)% of the distribution." Because of the high similarity between b₂ and s₂ (or s̃₂), it is natural to propose sup_{p < α ≤ 1/2} |s₂(α)| as a measure of the same kind of asymmetry. However, s₂(α) and its empirical counterpart are much more stable as a function of α than b₂(α) and its empirical counterpart (see Eberl & Klar, 2020). Hence, there is no necessity for this kind of cumulative version of s₂.
Further indices of skewness could also be constructed by replacing expectiles in (8) by more general measures of noncentral location such as M-quantiles (Jones, 1994).

The limiting case α → 1/2
In this section, we examine the behavior of s₂(α) as α approaches its upper bound 1/2. For this, we assume that the cdf F of X is differentiable with density f; then, e_X(α) is twice differentiable by Theorem 1(f). Note that we can rewrite s₂ as a ratio of first- and second-order difference quotients:

s₂(1/2 − h) = (1/4) · [ (e_X(1/2 + h) − 2 e_X(1/2) + e_X(1/2 − h)) / h² ] / [ (e_X(1/2 + h) − e_X(1/2 − h)) / (2h) ],

where h = 1/2 − α. Splitting the central difference in the denominator into a forward and a backward difference, and taking the limit h → 0, yields

s₃ := lim_{α→1/2} s₂(α) = e″_X(1/2) / (4 e′_X(1/2)).
By Proposition 1(f), e′_X(1/2) = 2 δ_X, where δ_X = E|X − μ| denotes the mean absolute deviation (MAD) from the mean. For the calculation of e″_X(1/2), we denote the numerator and denominator of e′_X(α) in Theorem 1(f) by u(α) and v(α), respectively. Then, lim_{α→1/2} u(α) = δ_X and lim_{α→1/2} v(α) = 1/2, as well as u′(α) → 2 δ_X (2F(μ) − 1) and v′(α) → 1 − 2F(μ) for α → 1/2. By combining these results, it follows that e″_X(1/2) = 8 δ_X (2F(μ) − 1), which overall yields

s₃ = lim_{α→1/2} s₂(α) = 2F(μ) − 1. (10)

Apparently, s₃ can also be used as a measure of skewness; in fact, it has already been introduced as such by Tajuddin (1999). Besides, the quantity F(μ) is the theoretical counterpart of the test statistic of the sign test for symmetry with estimated center (Gastwirth, 1971). The measure s₃ exploits the idea that the difference between the mean μ and the median q_{1/2} indicates the skewness of the underlying distribution, which is also prevalent in other popular skewness measures like Pearson's coefficient 3(μ − q_{1/2})/σ. Since a substitution of the mean by the median in s₃ always results in the value 0 for continuous distributions, a positive value of μ − q_{1/2} yields a positive value of s₃ and thus right-skewness, and vice versa. It is easy to see that s₃ satisfies skewness property S1 generally and S2 under the assumption P(X = μ) = 0. For continuous distributions, the crucial property S3 follows from Jensen's inequality.
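Under our reading of the limit (10), s₃ = 2F(μ) − 1, the plug-in estimate is simply twice the empirical cdf evaluated at the sample mean, minus one. A minimal sketch (the function name is ours):

```python
import numpy as np

def s3(x):
    """Empirical s3 = 2*F(mu) - 1: twice the fraction of observations
    at or below the sample mean, minus one (plug-in estimate)."""
    x = np.asarray(x, dtype=float)
    return 2.0 * np.mean(x <= x.mean()) - 1.0
```

For right-skewed data the mean exceeds the median, so more than half of the observations lie below the mean and s₃ > 0; the estimate is a simple function of the sign-test statistic with estimated center mentioned above.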
Besides its simplicity, an argument for the use of s₃ is that s₂′(α) converges to zero as α tends to 1/2, so s₂(α) flattens out toward s₃. This means that, at least for values of α close to 1/2, s₃ is close to, and thereby representative of, a range of values of s₂(α) without the need for a specific choice of the parameter α. This result on the gradient of s₂ can be proved under the assumption e_X ∈ C⁴((0, 1)) (which is equivalent to the assumption that the density of X is twice differentiable) as follows.
First, we differentiate s₂ with respect to α for α ∈ (0, 1/2). Using the notation h = 1/2 − α, the derivative can once again be rewritten as a composition of difference quotients. Now, using Taylor expansions for each of them, such that the remainders are of order O(h³) in the numerators as well as in the denominators, yields after some computations an expression of order O(h), where we used for the second equality that e′_X(1/2) > 0 by Proposition 1(c). Then, taking the limit h → 0 yields the asserted result lim_{α→1/2} s₂′(α) = lim_{h→0} s₂′(α) = 0.

RELATION TO OMEGA RATIOS AND STOP-LOSS TRANSFORMS
In this section, we give conditions based on Omega ratios and stop-loss transforms which are equivalent to Definition 1.
Expectiles are related to the Omega ratio, which has been introduced in the financial literature by Keating and Shadwick (2002) as

Ω_X(c) = E[(X − c)₊] / E[(c − X)₊], c ∈ R.

Then, Equation (5) can be written as Ω_X(e_X(α)) = (1 − α)/α (Remillard, 2013), which gives the following one-to-one relation between expectiles and Omega ratios:

e_X(α) = Ω_X^{−1}((1 − α)/α).
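The one-to-one relation can be verified numerically: at the empirical α-expectile, the sample Omega ratio equals (1 − α)/α exactly, by the first order condition. A short Python check (function names are ours):

```python
import numpy as np

def expectile(x, alpha, iters=100):
    # empirical expectile via bisection on the identification function
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    for _ in range(iters):
        e = 0.5 * (lo + hi)
        val = alpha * np.mean(np.maximum(x - e, 0.0)) \
            - (1 - alpha) * np.mean(np.maximum(e - x, 0.0))
        lo, hi = (e, hi) if val > 0 else (lo, e)
    return 0.5 * (lo + hi)

def omega(x, c):
    """Sample Omega ratio at threshold c: mean gain over mean loss."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.maximum(x - c, 0.0)) / np.mean(np.maximum(c - x, 0.0))
```

Since the Omega ratio is strictly decreasing in c on the interior of the support, inverting it at the level (1 − α)/α recovers the expectile.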

(b) A distribution is right-skewed (left-skewed) in the sense of Definition 1 if and only if
The following proposition collects some properties of S X .
Proposition 3. Let X be a random variable with cdf F and finite mean μ. Then: Proof. Since lim_{t→∞} π_X(μ + t) = 0, and since the monotone convergence theorem implies the stated representation, where F̄_X(z) = 1 − F_X(z) denotes the survivor function, part (c) follows directly from (b). ▪ Figure 1 illustrates the area below F_X(z) for z ∈ [μ − t, μ + t] for a (symmetric) normal distribution N(2, 4) (left panel, t = 2.5) and a right-skewed exponential distribution with mean 5 (right panel, t = 3). For the normal distribution, the gray areas below F_X sum up to t = 2.5, whereas the sum is larger than t = 3 in the case of the exponential distribution.
Remark 1. (i) The representation of S_X in Proposition 3(c) bears some similarity to skewness functionals defined in Arnold and Groeneveld (1993). In particular, they proposed a closely related skewness function. Note, however, that theirs is a skewness measure with respect to the median, whereas S_X is a measure with respect to the mean (cf. MacGillivray, 1986).
(ii) Note that S_X(t) = 2 ∫_R F_X(z) dH(z) − 1, where H is the cdf of the uniform distribution on (μ − t, μ + t). Replacing H by the Dirac measure in μ results in s₃ in (10). Another reasonable choice for H would be any cdf with unimodal density that is symmetric around μ, for example, a normal distribution with mean μ. S_X(t) is location invariant, but not scale invariant. This is not an issue if one analyzes the skewness of a single distribution. However, scale invariance is essential for a meaningful comparison between several distributions. As a scale invariant modification, we propose S̃_X(t) = S_X(t δ_X), where δ_X = E|X − EX|, as before, denotes the MAD from the mean. This dispersion measure is strongly related to the stop-loss transform, since π_X(EX) = δ_X/2 is just the absolute semideviation. In principle, one could use any other dispersion measure κ_X satisfying κ_{cX} = c κ_X for c > 0 instead of δ_X, but the latter is particularly suitable for our purpose.
For the remainder of this section, we assume that the cdf's are absolutely continuous and strictly increasing (i.e., F is strictly increasing on {x : 0 < F(x) < 1}). Oja (1981) showed that F and G are strongly skewness comparable (i.e., F ≤₂ G or G ≤₂ F) if and only if F(x) and G(ax + b) cross each other at most twice for all a > 0, b ∈ R. He then defined two weakenings of ≤₂ in the case of finite expectations μ_F and μ_G and finite variances σ²_F and σ²_G as follows.
• F ≤*₂ G if the standardized distribution functions F(σ_F x + μ_F) and G(σ_G x + μ_G) cross each other exactly once on each side of x = 0, with F(μ_F) ≤ G(μ_G).
• F ≤ * * 2 G if there exist a > 0, b ∈ R such that F(x) and G(ax + b) cross each other exactly twice with F(x) − G(ax + b) changing sign from positive to negative to positive.
The following implications hold true (Oja, 1981): F ≤₂ G ⟹ F ≤*₂ G ⟹ F ≤**₂ G. Similarly to F ≤*₂ G (see also Definition 2.1 in MacGillivray, 1986), we now define skewness with respect to mean and MAD: Definition 2. G is more skew with respect to mean and MAD than F (written F < G) if F(δ_F x + μ_F) and G(δ_G x + μ_G) cross each other exactly once on each side of x = 0. The next theorem shows that this new skewness order is weaker than the strong skewness order ≤₂; on the other hand, it is stronger than the skewness order implied by S̃_X given in (16).

Theorem 3. Let X ∼ F and Y ∼ G with finite expectations μ_F = EX and μ_G = EY. Then:
In particular, the skewness measure S̃_X(t) satisfies skewness property S3 for any t > 0.
Remark 2. (i) The proof of Theorem 3 shows that it is reasonable to require exactly two crossings in Definition 2. Exactly one crossing can occur only in specific situations where the standardized distribution functions are identical for all values smaller (larger) than zero; in this case, there is no reasonable comparison between the skewness of the two cdf's.
(ii) From Theorem 3 and the strong connection between S̃_X and the skewness measure s̃₂ in (8), we conjecture that s̃₂ also satisfies property S3. This is reinforced by the validity of S3 for the limiting measure s₃, and by numerical computations for specific examples; see Section 6.
In order to obtain the plug-in estimator σ̂²₂ of the asymptotic variance σ²₂, the expectiles e_X(α) are replaced by the empirical expectiles ê_n(α). Moreover, the remaining quantities in the asymptotic variance are estimated by their empirical counterparts, where F̂_n denotes the empirical cdf. It is then easy to see that σ̂²₂ is a composition of consistent estimators; hence, σ̂²₂ itself is a consistent estimator of σ²₂. Consequently, an asymptotic confidence interval for s₂(α) with confidence level 1 − p is given by

ŝ_{2,n}(α) ± z_{1−p/2} σ̂₂ / √n,

where z_q denotes the q-quantile of the standard normal distribution. The left panel in Figure 2 shows a plot of ŝ_{2,n}(α) with (pointwise) confidence limits for a sample of size n = 50 from an exponential distribution with rate 1. Under the hypothesis of symmetry, s₂(α) ≡ 0. Therefore,

± z_{1−p/2} σ̂₂ / √n (18)

define confidence limits under the hypothesis of symmetry. The right panel in Figure 2 shows a plot of ŝ_{2,n}(α) together with the limits given in (18) for the same dataset as in the left panel.
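As a practical complement to the asymptotic interval, a percentile bootstrap interval for s₂(α) can be computed without the plug-in variance. This is our own sketch, not the paper's procedure; helper names are ours.

```python
import numpy as np

def expectile(x, alpha, iters=100):
    # empirical expectile via bisection on the identification function
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    for _ in range(iters):
        e = 0.5 * (lo + hi)
        val = alpha * np.mean(np.maximum(x - e, 0.0)) \
            - (1 - alpha) * np.mean(np.maximum(e - x, 0.0))
        lo, hi = (e, hi) if val > 0 else (lo, e)
    return 0.5 * (lo + hi)

def s2_hat(x, alpha):
    # normalized empirical expectile skewness
    mu = np.mean(x)
    up, low = expectile(x, 1.0 - alpha), expectile(x, alpha)
    return (up + low - 2.0 * mu) / ((up - low) * (1.0 - 2.0 * alpha))

def s2_bootstrap_ci(x, alpha=0.25, p=0.05, B=400, seed=0):
    """Percentile bootstrap confidence interval for s2(alpha), level 1 - p."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    reps = [s2_hat(rng.choice(x, size=x.size, replace=True), alpha)
            for _ in range(B)]
    return np.quantile(reps, [p / 2.0, 1.0 - p / 2.0])
```

The bootstrap avoids estimating the asymptotic variance directly, at the cost of resampling effort; for the sample sizes considered here (n around 50), a few hundred resamples suffice.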
With respect to S(t), we again use the plug-in estimator, obtained by replacing F and μ by their empirical counterparts. Then the following result holds.
Theorem 5. Let X₁, X₂, … be iid with continuous cdf F and EX² < ∞. Then, the stated asymptotic normality holds, and the plug-in estimator for σ²_t is obtained analogously. As for the expectile skewness s₂(α), an asymptotic confidence interval for S(t) with confidence level 1 − p is given by

Ŝ_n(t) ± z_{1−p/2} σ̂_t / √n.

As an example, (pointwise) confidence limits for a sample of size 50 from an exponential distribution with rate 1 are given in the left panel of Figure 3. As before, the hypothesis of symmetry yields S(t) ≡ 0, and the respective confidence limits are given in the right panel of Figure 3.

Comparison of theoretical values
In this section, we examine how the expectile skewness s₂(α) behaves for specific families of continuous distributions, in particular for the gamma distribution. We analyze how the skewness values depend on α as well as on the distributional parameters. The results are compared with the corresponding values of the quantile skewness b₂(α). Due to property S1, skewness depends only on the shape parameter of the gamma distribution, but not on the scale parameter. The density of the gamma distribution is given by

f(x) = (r^k / Γ(k)) x^{k−1} e^{−rx}, x > 0,

with shape parameter k > 0 and rate parameter r > 0. Here, Γ(⋅) denotes the gamma function. All gamma distributions are right-skewed, and the degree of their skewness decreases as k increases. Figure 4 depicts skewness as a function of α for gamma distributions with different shape parameters. First, we only look at the measures b₂ and s̃₂ without the correction term (1 − 2α)^{−1}. Both of them tend to 0 as α tends to 1/2. All curves are strictly decreasing in α. While the quantile skewness exceeds the diagonal for highly skewed distributions, the expectile skewness is restricted to values below the diagonal, corresponding to Proposition 2. The curves above the diagonal are concave, while the ones underneath are convex. If the expectile skewness is normalized, it no longer tends to 0 as α tends to 1/2. Instead, the still convex curves flatten out with increasing α after a steep decline close to 0, illustrating that s₂′(α) converges to zero as α tends to 1/2. Since the curves flatten out rather quickly, this implies that the limiting expectile skewness s₃ is representative of s₂(α) for a considerable part of the range of α.
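The decrease of skewness in the shape parameter can be illustrated numerically. The following Monte Carlo sketch (ours, and only approximate since it is sample-based) evaluates the quantile skewness b₂(α) for gamma distributions; by S1, the scale is irrelevant.

```python
import numpy as np

def b2_gamma(k, alpha, n=200_000, seed=0):
    """Monte Carlo approximation of the quantile skewness b2(alpha)
    for a gamma distribution with shape parameter k (unit scale)."""
    rng = np.random.default_rng(seed)
    x = rng.gamma(k, size=n)
    q = lambda p: np.quantile(x, p)
    num = q(1 - alpha) + q(alpha) - 2 * q(0.5)
    return num / (q(1 - alpha) - q(alpha))
```

For k = 1 (the exponential distribution), the quartile version can be computed in closed form, b₂(1/4) = ln(4/3)/ln 3 ≈ 0.26, and the values decrease toward 0 as k grows, in line with Figure 5.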
If the quantile skewness is multiplied by the factor (1 − 2α)^{−1}, plots show that it also flattens out toward some limiting value with diminishing gradient as α approaches 1/2. However, these values are then no longer normalized and can be equal to any real number.
The observed behavior is very similar for other popular classes of skewed distributions like the log-normal and the Weibull distribution. If the underlying distribution is skewed to the left, all considered skewness measures increase in α, with the expectile skewness curves being concave as long as they stay above the corresponding lower diagonal.
Now we look at the behaviour of the skewness measures b₂ and s₂ as functions of the shape parameter of the underlying distribution. For the shape parameter k of the gamma distribution specifically, this is depicted in Figure 5.
Both skewness measures decrease in k for all values of α. This was to be expected, since van Zwet (1964, pp. 60-62) showed that F⁻¹_{k₁} ∘ F_{k₂} is convex for k₁ ≤ k₂, where F_k denotes the cdf of the gamma distribution with shape parameter k. The qualitative behaviour is analogous for similarly ordered classes of distributions like the Weibull distribution, thus strengthening our conjecture that s₂ satisfies skewness property S3. The curves of the expectile and quantile skewness differ slightly, with the former being strictly convex while the latter become concave for k close to 0. However, except for very small values of k, b₂ decreases more rapidly than s₂, especially for small values of α. The plot also further confirms that the range of s₂ for different values of α is substantially smaller than that of b₂.

Performance of the empirical skewness measures
In this section, we examine and compare the bias and variance of different empirical skewness measures. Here, we include the quantile skewness b₂(α), the expectile skewness s₂(α), Tajuddin's measure s₃, and the moment skewness M. First, we consider a highly skewed (shape parameter 0.1, see Figure 6) and a mildly skewed (shape parameter 10, see Figure 7) gamma distribution. We observe that the MSE generally seems to decrease with increasing skewness; however, that decrease is slower for M than for the other measures. While b₂ and M have the highest MSE at either end of the skewness spectrum, the expectile skewness is always in an acceptable range and converges fairly quickly toward 0 as n increases. While there is almost no bias for the mildly skewed distribution, all measures are at least slightly biased (relative to their variance) for high skewness. For increasing n, the bias vanishes. Irrespective of the distributional skewness, M is always the most biased measure, for high skewness even to a critical degree.
The results for the highly skewed (log-variance 2.25, see Figure 8) and the mildly skewed (log-variance 0.01, see Figure 9) log-normal distributions confirm the observations made concerning the gamma distribution. The first log-normal distribution is even more skewed than the first gamma distribution, having the effect that the MSE of M is almost completely dominated by the bias. Additionally, the MSE of M seems to converge very slowly relative to the other measures, possibly suggesting worse behaviour on heavy-tailed distributions. Finally, we consider Student's t-distribution and the standard normal distribution (as a limiting case) as examples of symmetric distributions. As expected for these distributions, the bias is negligible relative to the variance for all skewness measures. While the MSE basically does not change for b₂ and s₂ (with the latter achieving even lower values), M only behaves nicely for higher degrees of freedom. For lower degrees of freedom, it becomes somewhat unstable (see Figure 10), once again showing poor behaviour for heavy-tailed distributions.
Overall, s₂(α) and s₃ seem to be the most stable skewness measures considered here. While they are outperformed for specific distributions, their MSE never explodes, and their bias is always fairly low compared with their variance.
From (A13) and (A14), we obtain A₃ = 0, which implies that F̃(x) = G̃(x) for x ≥ 0, a contradiction to F̃(0) < G̃(0). Since an analogous reasoning excludes a single root x₁ > 0, it follows that F̃ and G̃ cross each other exactly twice, with M_{G̃,F̃} changing sign from negative to positive to negative, and M_{G̃,F̃}(0) > 0. The second implication follows from the definitions. (b) Assume F ≤ G. Denote the two roots of M_{G̃,F̃} by x₁ and x₃, where x₁ < 0 < x₃. Further, put