### Abstract

- Top of page
- Abstract
- INTRODUCTION
- UNIVARIATE STABLE LAWS
- MODELING FINANCIAL RETURNS
- BOX 1. A SINGLE ASSET EXAMPLE
- MULTIVARIATE STABLE LAWS
- BOX 2. A PORTFOLIO EXAMPLE
- CONCLUSION
- ACKNOWLEDGMENTS
- REFERENCES

The aim of this article was to give an accessible introduction to stable distributions for financial modeling. There is a real need to use better models for financial returns because the normal (or bell curve/Gaussian) model does not capture the large fluctuations seen in real assets. Stable laws are a class of heavy-tailed probability distributions that can model large fluctuations and allow more general dependence structures. *WIREs Comput Stat* 2014, 6:45–55. doi: 10.1002/wics.1286

Conflict of interest: The authors have declared no conflicts of interest for this article.

### INTRODUCTION

The fluctuations in many financial time series are not normal. The consequences of this are significant: underestimating extreme fluctuations in asset returns causes real hardship to people the world over. Unfortunately, most of the financial world still uses a model based on a normal distribution, even coining phrases like six sigma events to signify fluctuations that should never happen in the lifetime of the earth, yet they have occurred multiple times. One financial expert wryly commented ‘We seem to have a once-in-a-lifetime crisis every three or four years’. (Leslie Rahl, founder of Capital Market Risk Advisors, quoted in Ref [1], p. 211.) The simplicity and familiarity of the normal distribution, which is characterized by a mean and a variance, make it an attractive model for practitioners. Yet it does not capture the large fluctuations seen in real-life returns.

In this article, we describe one model for financial returns that explicitly incorporates heavy tails: stable distributions. These are a four parameter family of models that generalize the normal model, allowing both skewness and heavy tails. We do not claim that stable laws perfectly describe real-world returns; no distribution is exact. An old quote relevant here is: ‘Essentially, all models are wrong, but some are useful’ (Ref [2], p. 424). The key question is what we want to use a model for: if one wants to model the average behavior of an asset, wants a simple model, and is not concerned about extremes, the normal model may be appropriate. Models based on stable laws give another choice: they can describe real data well over most of its range, give a tractable model for compounding returns, and can capture skewness and heavy tails.

Many people have advocated the use of stable laws in finance, starting with Mandelbrot.[3] This idea has been pursued by others, including Samuelson[4] and Rachev and Mittnik.[5] In the past, the lack of efficient numerical methods made such models impractical. With recent progress in software and increased computational power, this class of models is now worth another look.

We note that other classes of models have been proposed for financial returns: generalized *t*-distributions, generalized hyperbolic, generalized inverse Gaussian, geometric stable, tempered stable, etc. While these models can give a good fit to data sets, they do not offer all of the features described above. A different approach is to use extreme value theory as described in Embrechts et al.[6] and McNeil et al.[7] A discussion of these other methods is beyond our scope here, so we focus only on the basics of stable laws and illustrate their use in modeling returns.

### UNIVARIATE STABLE LAWS

The theory of stable distributions comes from the pioneering work of Paul Lévy in the 1930s, where he examined what sort of limits can arise when normalizing sums of independent terms. For this reason, these distributions are sometimes called *Lévy stable* laws. Here is the basic definition: a random variable (rv) *X* is *stable* if for *X*_{1} and *X*_{2} independent copies of *X* and any positive constants *a* and *b*,

$$
aX_1 + bX_2 \overset{d}{=} cX + d \tag{1}
$$

for some positive *c* and *d* ∈ ℝ. (The symbol $\overset{d}{=}$ denotes ‘equal in distribution’.) The rv *X* is *strictly stable* if Eq. (1) holds with *d* = 0 for all choices of *a* and *b*. It is *symmetric stable* if it is stable and symmetrically distributed around 0, i.e. $X \overset{d}{=} -X$.

For *X*_{1}, …, *X*_{n} independent and identically distributed as *X* in Eq. (1), iterating that equation shows that there exist constants *c*_{n} > 0 and *d*_{n}, so that

$$
X_1 + X_2 + \cdots + X_n \overset{d}{=} c_n X + d_n \tag{2}
$$

This equation generalizes the familiar property of normal random variables: sums of normal terms are normal. In words, sums of i.i.d. stable terms are stable; this ‘stability under addition’ property is the reason for the use of the word stable. We started with Eq. (1) and derived Eq. (2). It can be shown that this can be reversed, so either condition can be taken as a definition of stability.
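As a quick numerical illustration of this stability property, the sketch below uses the standard Cauchy law (the stable case *α* = 1, *β* = 0), for which the sum of *n* i.i.d. copies has the same distribution as *n* times a single copy, so *c*_{n} = *n* and *d*_{n} = 0. The function name, sample sizes, and seed are illustrative choices and are not taken from the article.

```python
import numpy as np


def cauchy_sum_quartiles(n_terms=10, n_samples=200_000, seed=0):
    """Quartiles of (X_1 + ... + X_n) / n for i.i.d. standard Cauchy X_i."""
    rng = np.random.default_rng(seed)
    x = rng.standard_cauchy(size=(n_samples, n_terms))
    # For the Cauchy law the normalizing constants are c_n = n and d_n = 0.
    normalized_sum = x.sum(axis=1) / n_terms
    return np.quantile(normalized_sum, [0.25, 0.5, 0.75])


if __name__ == "__main__":
    # A standard Cauchy variable has quartiles -1, 0, +1, so the
    # normalized sums should give values close to these.
    print(cauchy_sum_quartiles())
```

If the summands were normal instead, the same experiment would need the normalization *n*^{1/2} rather than *n*; the heavier the tail, the larger the scaling constant.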

This abstract definition does not specify what the possible distributions are for stable laws. Paul Lévy[8] showed that their characteristic functions (Fourier transform) must have a special form. We will describe two parameterizations here, which we call the 0-parameterization and the 1-parameterization. (There are multiple parameterizations in the mathematical literature: Hall[9] describes a tangled history of meanings of the skewness parameter; Zolotarev[10] has forms A, B, C, C′, E, and M; Samorodnitsky and Taqqu[11] uses the 1-parameterization. To document these and other parameterizations, Nolan[12] lists 11 different parameterizations, numbering them from 0 to 10.)

Four parameters are required to specify a stable law: the *index of stability α* is in the interval (0,2], the skewness *β* is in the interval [−1,1], the scale parameter *γ* is any positive number, and the location parameter *δ* is any number. The notation *S*(*α*,*β*,*γ*,*δ*;*k*) will be used to specify a stable distribution with *k* = 0 or *k* = 1 for the two parameterizations.

A random variable *X* is *S*(*α*,*β*,*γ*,*δ*_{0};0) if it has characteristic function

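$$
E \exp(iuX) =
\begin{cases}
\exp\left(-\gamma^{\alpha}|u|^{\alpha}\left[1 + i\beta \tan\left(\tfrac{\pi\alpha}{2}\right)(\operatorname{sign} u)\left(|\gamma u|^{1-\alpha} - 1\right)\right] + i\delta_0 u\right), & \alpha \neq 1, \\[4pt]
\exp\left(-\gamma|u|\left[1 + i\beta \tfrac{2}{\pi}(\operatorname{sign} u)\log(\gamma|u|)\right] + i\delta_0 u\right), & \alpha = 1.
\end{cases}
\tag{3}
$$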

A random variable *X* is *S*(*α*,*β*,*γ*,*δ*_{1};1) if it has characteristic function

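$$
E \exp(iuX) =
\begin{cases}
\exp\left(-\gamma^{\alpha}|u|^{\alpha}\left[1 - i\beta \tan\left(\tfrac{\pi\alpha}{2}\right)(\operatorname{sign} u)\right] + i\delta_1 u\right), & \alpha \neq 1, \\[4pt]
\exp\left(-\gamma|u|\left[1 + i\beta \tfrac{2}{\pi}(\operatorname{sign} u)\log|u|\right] + i\delta_1 u\right), & \alpha = 1.
\end{cases}
\tag{4}
$$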

Here sign *u* is the sign of the number *u*: it is +1 if *u* > 0, −1 if *u* < 0, and 0 if *u* = 0, and *x* · log *x* is always interpreted as 0 at *x* = 0. The only difference between the two parameterizations is in the meaning of the location parameter. If *β* = 0, then the two parameterizations are identical. It is only when *β* ≠ 0 that the asymmetry factor (the imaginary term in brackets) becomes an issue, and in this case the laws are shifts of each other: $\delta_1 = \delta_0 - \beta\gamma\tan(\pi\alpha/2)$ when *α* ≠ 1 and $\delta_1 = \delta_0 - \beta(2/\pi)\gamma\log\gamma$ when *α* = 1.
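The relationship between the two location parameters can be checked numerically. The sketch below codes the two characteristic functions in the forms commonly given for the 0- and 1-parameterizations, together with the shift *δ*_{1} = *δ*_{0} − *βγ* tan(*πα*/2) for *α* ≠ 1 and *δ*_{1} = *δ*_{0} − *β*(2/*π*)*γ* log *γ* for *α* = 1. The function names are our own, and the explicit formulas are an assumption that should be compared with Eqs. (3) and (4) before being relied on.

```python
import numpy as np


def cf_s0(u, alpha, beta, gamma, delta0):
    """Characteristic function in the 0-parameterization (assumed form of Eq. 3)."""
    u = np.asarray(u, dtype=float)
    gu = gamma * np.abs(u)
    safe_gu = np.where(gu == 0, 1.0, gu)  # skew term is 0 at u = 0 anyway
    if alpha != 1:
        skew = (beta * np.tan(np.pi * alpha / 2) * np.sign(u)
                * (safe_gu ** (1 - alpha) - 1))
    else:
        skew = beta * (2 / np.pi) * np.sign(u) * np.log(safe_gu)
    return np.exp(-gu ** alpha * (1 + 1j * skew) + 1j * delta0 * u)


def cf_s1(u, alpha, beta, gamma, delta1):
    """Characteristic function in the 1-parameterization (assumed form of Eq. 4)."""
    u = np.asarray(u, dtype=float)
    if alpha != 1:
        skew = -beta * np.tan(np.pi * alpha / 2) * np.sign(u)
    else:
        log_abs = np.log(np.where(u == 0, 1.0, np.abs(u)))
        skew = beta * (2 / np.pi) * np.sign(u) * log_abs
    return np.exp(-(gamma * np.abs(u)) ** alpha * (1 + 1j * skew) + 1j * delta1 * u)


def delta0_to_delta1(alpha, beta, gamma, delta0):
    """Location shift that turns S(alpha, beta, gamma, delta0; 0) into the 1-form."""
    if alpha != 1:
        return delta0 - beta * gamma * np.tan(np.pi * alpha / 2)
    return delta0 - beta * (2 / np.pi) * gamma * np.log(gamma)


if __name__ == "__main__":
    u = np.linspace(-3, 3, 13)
    for params in [(1.5, 0.5, 2.0, 0.3), (1.0, -0.7, 3.0, 1.2)]:
        d1 = delta0_to_delta1(*params)
        same = np.allclose(cf_s0(u, *params), cf_s1(u, *params[:3], d1))
        print(params, "-> delta1 =", d1, "| same law:", same)
```

Because tan(*πα*/2) is unbounded near *α* = 1, the shift between the two location parameters is large there whenever *β* ≠ 0.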

If one is primarily interested in a simple form for the characteristic function and nice algebraic properties, the 1-parameterization is favored. Because it is simpler to use when proving mathematical properties of stable distributions, it is the most common parameterization in the literature. The main practical disadvantage of the 1-parameterization is that the location of the mode is unbounded in any neighborhood of *α* = 1: if *X* ∼ *S*(*α*,*β*,*γ*,*δ*;1) and *β* > 0, then the mode of *X* tends to +∞ as *α* ↑ 1 and tends to −∞ as *α* ↓ 1, see Figure 1 below. So the 1-parameterization does not have the intuitive properties desirable in applications (continuity of the distributions as the parameters vary, a scale and location family, etc.). We recommend using the 0-parameterization for numerical work and statistical inference with stable distributions: it has the simplest form for the characteristic function that is continuous in all parameters. It lets *α* and *β* determine the shape of the distribution, while *γ* and *δ* determine scale and location in the standard way: if *X* ∼ *S*(*α*,*β*,*γ*,*δ*;0), then (*X* − *δ*)/*γ* ∼ *S*(*α*,*β*,1,0;0). This is not true for the 1-parameterization when *α* = 1.

#### Properties of Stable Laws

We summarize some basic properties of *X* ∼ *S*(*α*,*β*,*γ*,*δ*;1) without proof.

- If *β* = 0, then a stable distribution is symmetric.
- Reflection property: −*X* ∼ *S*(*α*, −*β*, *γ*, −*δ*; 1).
- All stable laws have densities *f*(*x*) that are smooth and unimodal.
- In most cases the support of *X* is the whole real line; the exceptions are when (*α* < 1 and *β* = 1), in which case the support is [*δ*, +∞), or (*α* < 1 and *β* = −1), in which case the support is (−∞, *δ*].
- Tail behavior. If *α* < 2 and −1 < *β* ≤ 1, then the density *f*(*x*) and cumulative distribution function (CDF) *F*(*x*) have an asymptotic power law: as *x* → ∞,

  $$
  P(X > x) = 1 - F(x) \sim \gamma^{\alpha} c_{\alpha} (1+\beta)\, x^{-\alpha}, \qquad f(x) \sim \alpha \gamma^{\alpha} c_{\alpha} (1+\beta)\, x^{-(\alpha+1)}, \tag{5}
  $$

  where $c_{\alpha} = \sin(\pi\alpha/2)\,\Gamma(\alpha)/\pi$. Using the reflection property, the lower tail properties are similar. Owing to the similarity of the tail behavior to a Pareto distribution (an exact power law), the phrase *stable Paretian distribution* is sometimes used in the non-Gaussian case. For all *α* < 2 and −1 < *β* < 1, both tail probabilities and densities are asymptotically power laws. When *β* = −1, the right tail of the distribution is not asymptotically a power law; likewise when *β* = 1, the left tail of the distribution is not asymptotically a power law. These are not exact relations, only asymptotic ones, and the point at which these approximations become accurate is not known exactly; Fofack and Nolan[13] give some numerical information on this question. The answer is messy: for *α* near 2, an *α*-stable law is close to a normal law, and one has to go to a very high quantile to see the power law behavior.
- Fractional moments. When *α* < 2, *E*|*X*|^{p} is finite for 0 < *p* < *α*, but infinite for *p* ≥ *α*. This is a consequence of the power law tail behavior. In particular, for *α* < 2, the (population) variance is infinite and for *α* ≤ 1, the (population) mean is undefined.
- Generalized Central Limit Theorem. Let *X*_{1}, *X*_{2}, … be independent identically distributed random variables. The classical Central Limit Theorem says that if we start with any distribution with a finite mean *μ* and standard deviation *σ*, normalized sums of such terms converge to a normal law:

  $$
  \frac{X_1 + \cdots + X_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} Z \sim N(0,1) \quad \text{as } n \to \infty.
  $$

  There is a more general result called the Generalized Central Limit Theorem (GCLT) that applies when the summands do not have a finite variance. The simplest version is the following. Suppose *P*(*X*_{i} > *x*) ∼ *c*^{+}*x*^{−α} and *P*(*X*_{i} < −*x*) ∼ *c*^{−}|*x*|^{−α} as *x* → ∞ with 0 < *α* < 2 and *c*^{+} + *c*^{−} > 0. Set *β* = (*c*^{+} − *c*^{−})/(*c*^{+} + *c*^{−}); then there are constants *b*_{n} and *γ* such that

  $$
  \frac{X_1 + \cdots + X_n - b_n}{n^{1/\alpha}} \xrightarrow{d} Z \sim S(\alpha, \beta, \gamma, 0; 1) \quad \text{as } n \to \infty.
  $$

  (If *α* > 1, then we may take *b*_{n} = *nμ*.) Note that the normalization factor is different: in the classical case, the scaling is by *n*^{1/2}, whereas in the stable case, the scaling is by the larger factor *n*^{1/α}. There is a more precise statement of the GCLT using the concept of regular variation, which can be found in Feller.[14] In fact, stable laws are the only possible nontrivial limits of normalized sums of i.i.d. terms.
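To see how the asymptotic tail relation behaves in a case where everything is explicit, the sketch below compares the power law approximation with the exact upper tail of a Cauchy law (*α* = 1, *β* = 0), whose CDF is available in closed form. It assumes the usual form of the tail approximation, *P*(*X* > *x*) ≈ *γ*^{α}*c*_{α}(1 + *β*)*x*^{−α} with *c*_{α} = sin(*πα*/2)Γ(*α*)/*π*; the function names are illustrative.

```python
import math


def tail_constant(alpha):
    """c_alpha = sin(pi * alpha / 2) * Gamma(alpha) / pi."""
    return math.sin(math.pi * alpha / 2) * math.gamma(alpha) / math.pi


def power_law_tail(x, alpha, beta, gamma):
    """Asymptotic approximation to P(X > x) for large x."""
    return gamma ** alpha * tail_constant(alpha) * (1 + beta) * x ** (-alpha)


def cauchy_tail(x, gamma=1.0):
    """Exact P(X > x) for a Cauchy law centered at 0 with scale gamma."""
    return 0.5 - math.atan(x / gamma) / math.pi


if __name__ == "__main__":
    for x in (2.0, 10.0, 100.0):
        exact = cauchy_tail(x)
        approx = power_law_tail(x, alpha=1.0, beta=0.0, gamma=1.0)
        print(f"x = {x:6.1f}  exact = {exact:.6f}  "
              f"approx = {approx:.6f}  ratio = {approx / exact:.4f}")
```

For the Cauchy law the ratio approaches 1 fairly quickly. As noted above, for *α* close to 2 the approximation becomes accurate only at much more extreme quantiles.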

#### Calculation and Estimation

There are only a few special cases where there are closed form expressions for stable densities *f*(*x*). These cases are: (a) Gaussian/normal distributions (*α* = 2, *β* = 0), (b) Cauchy distributions (*α* = 1, *β* = 0), and (c) Lévy distributions (*α* = 1/2, *β* = 1). The only case where there is a closed form expression for the CDF is the Cauchy case. In all other cases, including the CDF for Gaussian laws, numerical procedures are needed to calculate densities and CDFs. Using results of Zolotarev,[10] Nolan[15] describes and implements algorithms to numerically compute stable densities, CDFs, and quantiles when *α* < 2. In addition, the method of Chambers et al.[16] gives an algorithm to simulate stable random variables. So there are now reliable programs to compute these quantities, making it practical to apply these models to real problems. Figure 1 shows some plots of stable densities.
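For the symmetric case (*β* = 0), the simulation formula usually attributed to Chambers et al. is short enough to sketch directly. The code below is a minimal illustration under that assumption, not the full algorithm for general *β*; it generates variates whose characteristic function is exp(−*γ*^{α}|*u*|^{α} + *iδu*). The function name and defaults are our own.

```python
import numpy as np


def simulate_symmetric_stable(alpha, size, gamma=1.0, delta=0.0, seed=None):
    """Draw symmetric alpha-stable variates (beta = 0), 0 < alpha <= 2."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)  # uniform angle
    w = rng.exponential(1.0, size)                # unit exponential
    if alpha == 1:
        z = np.tan(u)                             # standard Cauchy
    else:
        z = (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
             * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))
    # With beta = 0 the 0- and 1-parameterizations coincide, so scale and
    # shift can be applied in the usual way.
    return gamma * z + delta


if __name__ == "__main__":
    x = simulate_symmetric_stable(1.5, 200_000, seed=1)
    # E cos(X) is the characteristic function at u = 1, i.e. exp(-1).
    print(np.mean(np.cos(x)), np.exp(-1.0))
```

With *α* = 2 the formula reduces to a normal variate with variance 2*γ*², consistent with the characteristic function exp(−*γ*²*u*²).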

Many of the standard parameter estimation techniques do not work for stable data. For example, the regular method of moments does not work: to estimate the four parameters one would normally compute EX, EX^{2}, EX^{3}, and EX^{4} and then try to solve for *α*, *β*, *γ*, and *δ*. But this will not work, since most (or all) of these moments do not exist. (More precisely, the higher order population moments do not exist. The sample moments do exist, but their behavior is erratic: for example, the sample variance will diverge as *n* → ∞. Large samples do not help with this approach!) Since there are no closed analytic forms for stable densities, the likelihood cannot be written explicitly, making it impossible to solve analytically for maximum likelihood estimators. As a result, there are multiple nonstandard techniques for estimating the stable parameters, some of them ingenious. Five basic methods are the following.

- Tail estimators. This method uses the tail behavior, Eq. (5), to estimate *α*. Different methods have been proposed for doing this, ranging from plotting extremes on a log–log scale and estimating the slope, to the Hill estimator and generalizations. Unfortunately, these do not work very well with stable laws, because the point at which the power law sets in is a complicated function of the parameters, and unless one has a very large data set, it is unlikely that the observed tail will be exactly a power law.
- Fractional moments. When *X* is strictly stable, there are expressions for fractional moments *E*|*X*|^{p}, for −1 < *p* < *α*. One can use these for a generalized method of moments: compute sample fractional moments, set them equal to the expressions in terms of the parameters, and solve for the parameters. Nikias and Shao[17] used this approach in signal processing for the case when the distribution is symmetric.
- Quantile matching. Fama and Roll[18] noticed certain patterns in tabulated quantiles *x*_{p} (the *p*-th quantile of a distribution) of symmetric stable laws that could be used to estimate *α* and the scale. For example, the interquartile range *x*_{0.75} − *x*_{0.25} is a monotonic function of the scale *γ*, and the ratio (*x*_{0.95} − *x*_{0.05})/(*x*_{0.75} − *x*_{0.25}) is a monotonic function of the index *α*. McCulloch[19] generalized this to the nonsymmetric case, using other functions of quantiles to estimate the location *δ* and skewness *β*, giving a way to estimate all four stable parameters from a handful of sample quantiles.
- Empirical characteristic functions. Koutrouvelis[20] used the fact that there is an explicit formula (4) for the characteristic function *φ*(*u*). One can compute the sample/empirical characteristic function on a grid of *u*_{i} values for a data set and then use regression to estimate the parameters. Kogon and Williams[21] simplified this method by using the continuous parameterization, Eq. (3), and centering and scaling the data to avoid numerical difficulties.
- Numerical maximum likelihood estimation. DuMouchel[22] gave an approximate numerical maximum likelihood method and showed that (away from the boundaries *α* = 2 and/or *β* = ±1) the resulting estimators are asymptotically normal. The present author implemented this in Ref [23] and computed tables that can be used for confidence interval estimates. Further work using a precomputed approximation to stable densities has made this method significantly faster.

See Ref [23] for a summary of these methods and a detailed description of the numerical maximum likelihood approach. Simulations show that the efficiency of the estimation procedures is in the reverse of the order listed above.
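To make the empirical characteristic function idea concrete, the sketch below implements only its simplest ingredient. For any stable law, |*φ*(*u*)|² = exp(−2*γ*^{α}|*u*|^{α}), so log(−log|*φ*(*u*)|²) is linear in log|*u*| with slope *α* and intercept log(2*γ*^{α}). Regressing the empirical version on a grid of *u* values gives rough estimates of *α* and *γ*. This is a simplified illustration, not the full procedure of Koutrouvelis or of Kogon and Williams: the grid, the function names, and the assumption that the data have already been roughly centered and scaled are our own.

```python
import numpy as np


def ecf(data, u):
    """Empirical characteristic function of `data` evaluated on the grid `u`."""
    return np.exp(1j * np.outer(u, data)).mean(axis=1)


def ecf_alpha_gamma(data, u=None):
    """Regression estimates of (alpha, gamma) from the modulus of the ECF."""
    if u is None:
        u = np.linspace(0.1, 1.0, 10)  # illustrative grid, assumes scale near 1
    phi = ecf(np.asarray(data, dtype=float), u)
    y = np.log(-np.log(np.abs(phi) ** 2))
    slope, intercept = np.polyfit(np.log(u), y, 1)
    alpha = slope
    gamma = (np.exp(intercept) / 2) ** (1 / alpha)
    return alpha, gamma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_cauchy(100_000)  # alpha = 1, gamma = 1
    print(ecf_alpha_gamma(sample))
```

Estimating *β* and *δ* requires the phase of the characteristic function as well, which is where the choice between Eq. (3) and Eq. (4) matters.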

We end this section with a mention of regression with stable error terms. Nolan and Ojeda[24] describe a procedure for estimating the coefficients for problems of the form

$$
y_i = \theta_1 x_{i,1} + \cdots + \theta_k x_{i,k} + \varepsilon_i, \qquad i = 1, \ldots, n,
$$

where the error terms *ε*_{i} are i.i.d. *S*(*α*,*β*,*γ*,*δ*;0). This gives a robust method of estimating regression coefficients when the error terms are heavy tailed. For example, if one wants to estimate the CAPM *β* for a volatile asset in relation to a broadly based index, this method is more robust than ordinary least squares.

If *X*_{t} ∼ *S*(*α*,*β*,*γ*,*δ*;1) stands for the one period return, then property (2) shows that the multi-period return *X*_{1} + ⋯ + *X*_{n} is also *α*-stable. The exact relationship, derived using Eq. (4), is

$$
X_1 + \cdots + X_n \sim S\left(\alpha, \beta, n^{1/\alpha}\gamma, n\delta; 1\right).
$$

This gives an exact formula for the multi-period return, a very useful fact (Box 1).
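The multi-period relationship is easy to put into code. The sketch below assumes the 1-parameterization rule that follows from raising the characteristic function in Eq. (4) to the *n*-th power: *α* and *β* are unchanged, the scale becomes *n*^{1/α}*γ*, and the location becomes *nδ*. The daily parameter values in the example are hypothetical and are not fitted to any data set.

```python
def multi_period_parameters(alpha, beta, gamma, delta, n_periods):
    """Parameters (1-parameterization) of the sum of n i.i.d. one-period returns."""
    scale_n = n_periods ** (1.0 / alpha) * gamma
    location_n = n_periods * delta
    return alpha, beta, scale_n, location_n


if __name__ == "__main__":
    # Hypothetical daily return parameters, aggregated over 21 trading days.
    daily = dict(alpha=1.7, beta=0.0, gamma=0.01, delta=0.0005)
    monthly = multi_period_parameters(n_periods=21, **daily)
    print("stable scaling:      ", monthly[2])
    # The square-root-of-time rule of the normal model (alpha = 2) for comparison.
    print("square-root scaling: ", 21 ** 0.5 * daily["gamma"])
```

For *α* < 2 the scale grows faster than the square-root-of-time rule, so applying the normal scaling to heavy-tailed returns understates multi-period risk.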

### MULTIVARIATE STABLE LAWS

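The definition of stability is exactly the same, with **X** a *d*-dimensional vector in Eq. (1). The surprise here is that many different dependence structures are possible in the stable case. One can have an elliptically contoured case, but many other unexpected types of dependence are possible; see the contours of the bivariate stable densities in Figure 2. Feldheim[26] showed that every multivariate stable vector has a characteristic function of the form

$$
E \exp(i\,\mathbf{u} \cdot \mathbf{X}) = \exp\left(-\int_{\mathbb{S}} \omega(\mathbf{u} \cdot \mathbf{s} \mid \alpha)\, \Lambda(d\mathbf{s}) + i\,\mathbf{u} \cdot \boldsymbol{\delta}\right),
$$

where *Λ* is a finite measure on the unit sphere $\mathbb{S} = \{\mathbf{s} \in \mathbb{R}^d : |\mathbf{s}| = 1\}$, *δ* is a shift vector in ℝ^{d}, and

$$
\omega(t \mid \alpha) =
\begin{cases}
|t|^{\alpha}\left[1 - i \tan\left(\tfrac{\pi\alpha}{2}\right)\operatorname{sign} t\right], & \alpha \neq 1, \\[4pt]
|t|\left[1 + i \tfrac{2}{\pi}(\operatorname{sign} t)\log|t|\right], & \alpha = 1,
\end{cases}
$$

is minus the exponent of the characteristic function of a univariate *Z* ∼ *S*(*α*, *β* = 1, *γ* = 1, *δ* = 0; 1). Hence every multivariate stable law is characterized by an index of stability *α*, a spectral measure *Λ* on the sphere, and a shift *δ*.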

This is a very large class of distributions which cannot be parameterized by a finite number of parameters. To use multivariate stable laws in practice, one has to restrict the type of spectral measure. We describe three accessible classes.

- Independent components. Here the spectral measure is concentrated on the points where the coordinate axes intersect the sphere. The independence makes it easy to work with, simulate, and compute densities and CDFs based on the univariate case.
- Discrete spectral measures. Here the spectral measure *Λ* is discrete, with point mass *λ*_{j} at locations **s**_{j}. It was shown by Byczkowski et al.[27] that this is a dense class in the sense that for any spectral measure *Λ*_{1}, there is a discrete measure *Λ*_{2} with a finite number of point masses such that |*f*_{1}(**x**) − *f*_{2}(**x**)| (the difference in the corresponding density functions) is uniformly small over all **x**.
- Elliptical contours. In this case, the joint characteristic function is of the form

  $$
  E \exp(i\,\mathbf{u} \cdot \mathbf{X}) = \exp\left(-\left(\mathbf{u}^{T} Q\, \mathbf{u}\right)^{\alpha/2} + i\,\mathbf{u} \cdot \boldsymbol{\delta}\right),
  $$

  where *Q* is a *d* × *d* positive definite shape matrix and *δ* is a shift vector. A major advantage of this class is that it is computationally accessible and that joint dependence is characterized by the set of pairwise parameters, so *d*(*d* − 1)/2 values are needed, just like in the Gaussian case.
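The first two classes above are closely related, which a short numerical check makes visible. The sketch below evaluates the joint characteristic function for a discrete spectral measure, with the exponent taken as the sum of *λ*_{j} *ω*(**u** · **s**_{j} | *α*) over the point masses, and builds the measure that corresponds to independent components: masses *γ*_{k}^{α}(1 ± *β*_{k})/2 at the points ±**e**_{k} on the coordinate axes. The function names are our own, and the form of *ω* is the one assumed for a univariate *S*(*α*, 1, 1, 0; 1) law.

```python
import numpy as np


def omega(t, alpha):
    """Minus the log characteristic function of S(alpha, 1, 1, 0; 1) at t."""
    t = np.asarray(t, dtype=float)
    if alpha != 1:
        return np.abs(t) ** alpha * (1 - 1j * np.tan(np.pi * alpha / 2) * np.sign(t))
    log_abs = np.log(np.where(t == 0, 1.0, np.abs(t)))
    return np.abs(t) * (1 + 1j * (2 / np.pi) * np.sign(t) * log_abs)


def discrete_cf(u, alpha, points, masses, shift):
    """Joint characteristic function for point masses at unit vectors `points`."""
    u = np.asarray(u, dtype=float)
    proj = np.asarray(points, dtype=float) @ u  # u . s_j for each j
    exponent = -np.sum(np.asarray(masses) * omega(proj, alpha))
    return np.exp(exponent + 1j * (u @ np.asarray(shift, dtype=float)))


def independent_components_measure(alpha, betas, gammas):
    """Spectral measure on the coordinate axes giving independent components."""
    d = len(betas)
    points, masses = [], []
    for k in range(d):
        e = np.zeros(d)
        e[k] = 1.0
        points += [e, -e]
        masses += [gammas[k] ** alpha * (1 + betas[k]) / 2,
                   gammas[k] ** alpha * (1 - betas[k]) / 2]
    return np.array(points), np.array(masses)


if __name__ == "__main__":
    pts, lam = independent_components_measure(1.5, [0.5, -0.3], [1.0, 2.0])
    print(discrete_cf([0.7, -0.4], 1.5, pts, lam, [0.0, 0.0]))
```

With this choice of masses the joint characteristic function factors into the product of the univariate ones, which is exactly the independent components case.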

We briefly state some of the basic properties of multivariate stable laws. Sums of independent stable random vectors are stable, and all univariate projections **u** · **X** = ∑ _{k}*u*_{k}*X*_{k} are univariate stable laws. The support of a stable law is generally the whole space but, as in the one-dimensional case, there are exceptions when *α* < 1 and the spectral measure is one-sided. Plot (d) in Figure 2 shows an example where the support of the distribution is the cone bounded by the dashed lines.

To be jointly stable, there has to be one *α* for which every component is univariate *α*-stable. In finance and other applications, there may be different components that have different tail behavior. In this case, one can fit each component with a univariate stable model, each having possibly different *α*'s. A joint distribution can be constructed using copulas or vines, see McNeil et al.[7] or Kurowicka and Joe.[28] If multiple components have the same or similar index of stability, then it may make sense to use a joint stable model for those components, and then build a higher dimensional distribution out of these. An unfortunate consequence of these procedures is that the full distribution is generally not jointly stable.

#### Multivariate Computations, Simulation, and Estimation

At the current time, there is limited ability to compute densities and probabilities for multivariate stable laws. For discrete spectral measures, Nolan and Rajput[29] gave a program to compute bivariate densities *f*(**x**) as in Figure 2. There are integral expressions for stable densities in higher dimensions in Abdul-Hamid and Nolan,[30] but they are complicated and difficult to evaluate numerically, and the difficulties increase with the dimension. Modarres and Nolan[31] give an algorithm to simulate stable random vectors with discrete *Λ* in any dimension. Cheng and Rachev,[32] Rachev and Xin,[33] and Nolan et al.[34] describe ways to estimate a discrete spectral measure. While the methods work in higher dimensions, to our knowledge they have only been implemented in two dimensions.

The elliptical case is much more accessible. Nolan[35] develops algorithms to evaluate the density for dimensions up to *d* = 100. There are also methods for simulating and estimating in arbitrary dimensions, all based on the *d* × *d* shape matrix *Q*. In finance, joint distributions of returns are frequently somewhat elliptically shaped. One intriguing feature of this class is that unlike the Gaussian case, there can be positive tail dependence when there are elliptical contours and *α* < 2 (Box 2). This is due to Hult and Lindskog[36]; values for the tail dependence coefficient are tabulated in Ref [35].
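One standard way to simulate from the elliptical class is the sub-Gaussian representation: multiply a Gaussian vector by the square root of an independent positive (*α*/2)-stable variable. The sketch below uses Kanter's formula for the positive stable factor and targets the characteristic function exp(−(**u**ᵀ*Q***u**)^{α/2} + *i* **u** · *δ*). It is an illustration of that construction under our own naming, and is not necessarily the algorithm used in Ref [35].

```python
import numpy as np


def positive_stable(a, size, rng):
    """Kanter's formula: A > 0 with E exp(-t A) = exp(-t**a), for 0 < a < 1."""
    theta = rng.uniform(0.0, np.pi, size)
    w = rng.exponential(1.0, size)
    return (np.sin(a * theta) / np.sin(theta) ** (1 / a)
            * (np.sin((1 - a) * theta) / w) ** ((1 - a) / a))


def simulate_elliptical_stable(alpha, Q, delta, size, seed=None):
    """Draw vectors with cf exp(-(u'Qu)**(alpha/2) + i u.delta), 0 < alpha < 2."""
    rng = np.random.default_rng(seed)
    Q = np.asarray(Q, dtype=float)
    a = positive_stable(alpha / 2, size, rng)
    # Covariance 2Q makes E exp(i u.(sqrt(A) G)) = E exp(-A u'Qu).
    g = rng.multivariate_normal(np.zeros(Q.shape[0]), 2.0 * Q, size)
    return np.sqrt(a)[:, None] * g + np.asarray(delta, dtype=float)


if __name__ == "__main__":
    Q = np.array([[1.0, 0.5], [0.5, 2.0]])
    x = simulate_elliptical_stable(1.5, Q, [0.0, 0.0], 200_000, seed=0)
    u = np.array([0.5, -0.3])
    # Empirical vs. theoretical characteristic function at u (real part).
    print(np.mean(np.cos(x @ u)), np.exp(-(u @ Q @ u) ** 0.75))
```

Because every component shares the same random factor, large values tend to occur in several components at once, which is the source of the positive tail dependence mentioned above.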