The QVF family and Kullback–Leibler entropies
A defining property of the cumulant-generating function is that, not only does dψ/dθ = μ, but d²ψ/dθ² is the variance of the observation X. Morris defines its relation to the mean μ as a variance function V(μ). The quadratic variance relation is the dependence
V(μ) = v0 + v1μ + v2μ². (36)
By definition of θ(μ) and μ(θ) as inverse functions, it follows that the variance is also the (geometric and algebraic) inverse of the curvature of the relative entropy. We differentiate the second line in eqn (30) twice and substitute eqn (36), to produce
Because we have first and second constants of integration from the relations following eqn (30), eqn (37) has an unambiguous integral. To assign meaning to this integral, however, and in the process to expose a relation between the Morris and Pearson approaches to classification, we first factor the variance function into an overall normalization and the roots of the polynomial. Write
with the solutions
Then, the integral of eqn (37) becomes
If we denote by ϕ ≡ (μ − μ1)/(μ2 − μ1) the analytic continuation of a partition of the unit interval, we may write eqn (40) as
A slight variation on the formula (40), making use of forms (39) for the roots, the Legendre transform relations (30) and the constants of integration, reads
This integral relation between the cumulant-generating function and the variance function appears as eqn 3.7 in Morris (1982).
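For orientation, the chain of relations described above can be sketched as a worked derivation. This is a reconstruction under stated assumptions, not a transcription of eqns (37)–(41): the symbol D for the relative entropy about a reference mean μ0, the symbol ϕ0 for ϕ evaluated at μ0, and the labelling of which root is μ1 are all notational choices made here. The two constants of integration are fixed by requiring D and dD/dμ to vanish at μ = μ0.

```latex
\begin{align}
  V(\mu) &= v_0 + v_1 \mu + v_2 \mu^2
          = v_2 \,(\mu - \mu_1)(\mu - \mu_2) , \\
  \frac{d^2 D}{d\mu^2} &= \frac{1}{V(\mu)}
          = \frac{1}{v_2 (\mu_2 - \mu_1)}
            \left[ \frac{1}{\mu - \mu_2} - \frac{1}{\mu - \mu_1} \right] , \\
  D &= \frac{1}{v_2 (\mu_2 - \mu_1)}
       \left[ (\mu - \mu_2) \log \frac{\mu - \mu_2}{\mu_0 - \mu_2}
            - (\mu - \mu_1) \log \frac{\mu - \mu_1}{\mu_0 - \mu_1} \right] \\
    &= - \frac{1}{v_2}
       \left[ \phi \log \frac{\phi}{\phi_0}
            + (1 - \phi) \log \frac{1 - \phi}{1 - \phi_0} \right] ,
  \qquad
  \phi \equiv \frac{\mu - \mu_1}{\mu_2 - \mu_1} .
\end{align}
```

The last line has the form of a Kullback–Leibler divergence in the pair (ϕ, 1 − ϕ), scaled by −1/v2. For real roots with v2 < 0 and μ between them, ϕ is an ordinary probability; otherwise the same expression is understood by analytic continuation.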
Two fundamental NEF-QVF families and various limits
Working in terms of the signs and magnitudes of the coefficients v0, v1, v2, Morris identifies exactly six inequivalent natural exponential families with quadratic variance functions. Three are continuous (Gaussian, gamma, and hyperbolic-cosecant probability density functions), and three are discrete (binomial, negative binomial, and Poisson probability mass functions), up to offset and scaling of the natural observation X. We will see here that, working in terms of the analytic structure of the entropy (40), and a simple classification of the roots μ1,2, we may identify two main classes, corresponding to the continuous and discrete distributions, and various limiting forms of these, which complete Morris's families.
The quantity that distinguishes the continuous from the discrete NEF-QVF families is the discriminant d ≡ v1² − 4v0v2 (which is unchanged by offset of X). In the case where d > 0, the variance function (36) has two real roots, while if d < 0, it has two complex-conjugate roots. By choice of offset and scale, we may obtain Morris's canonical families by making the complex-conjugate roots purely imaginary when d < 0, or by taking one of the two real roots to lie at the origin if d > 0.
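The role of the discriminant can be illustrated numerically. The following is a minimal sketch, assuming the variance function has the quadratic form V(μ) = v0 + v1μ + v2μ² and that the discriminant is d = v1² − 4v0v2; the function names and the example coefficients are hypothetical choices made for illustration.

```python
import cmath


def discriminant(v0, v1, v2):
    """Discriminant d = v1^2 - 4 v0 v2 of V(mu) = v0 + v1*mu + v2*mu^2."""
    return v1 * v1 - 4.0 * v0 * v2


def roots(v0, v1, v2):
    """Both roots of V (requires v2 != 0); complex when d < 0."""
    s = cmath.sqrt(discriminant(v0, v1, v2))
    return (-v1 + s) / (2.0 * v2), (-v1 - s) / (2.0 * v2)


def offset(v0, v1, v2, c):
    """Coefficients of V after the offset X -> X + c, i.e. V'(mu) = V(mu - c)."""
    return v0 - v1 * c + v2 * c * c, v1 - 2.0 * v2 * c, v2


# Illustrative coefficient choices (assumptions, not taken from the text).
examples = {
    "v1 = 0, v0 = v2 = 1 (imaginary roots)": (1.0, 0.0, 1.0),
    "v0 = 0, v1 = 1, v2 = -1/10 (real roots 0 and 10)": (0.0, 1.0, -0.1),
    "v0 = 0, v1 = 1, v2 = +1/10 (real roots 0 and -10)": (0.0, 1.0, 0.1),
}

for name, coeffs in examples.items():
    d = discriminant(*coeffs)
    d_shifted = discriminant(*offset(*coeffs, 3.7))
    print(name, "d =", d, "d after offset =", d_shifted, "roots =", roots(*coeffs))
```

The offset changes v0 and v1 but leaves d unchanged, which is why the sign of d can be used to separate the two classes before any canonical offset is chosen.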
We begin with the imaginary roots, which select the continuous-valued NEF-QVF distributions. The canonical form for these is obtained when v1 ≡ 0, and v0,v2 > 0. We may then define Λ ≡ √(v0/v2), so that the roots of the variance function lie at ±iΛ.
The relative entropy, about a distribution p_{x∣0} in the NEF-QVF family with mean μ0, must have the form
The relation of θ to μ and μ0 is
If we choose a background in which μ0 = 0 (by freedom to offset X), it follows that we may write the cumulant-generating function as
The canonical normalization for this family of distributions is given by v2 = 1. One may check directly that they are produced by the family of hyperbolic-cosecant density functions
(The proof is by contour integral. Check that
with integration variable u ≡ e^{πx/2Λ} and shifted parameter . The contour that avoids branch cuts, in the log-transform to the variable u, closes in the negative-imaginary half-plane, encircling the pole u = −i.) The distributions at Λ = 1 are the canonical densities given in Morris (1982), eqn 4.2.
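As a cross-check on the contour-integral argument, the normalization can also be verified by direct quadrature. The sketch below assumes that the canonical density at Λ = 1 is f(x) = 1/(2 cosh(πx/2)), and that exponential tilting by e^{θx} gives a normalizer 1/cos θ for |θ| < π/2, so that ψ(θ) = −log cos θ, μ = tan θ, and V(μ) = 1 + μ². These closed forms are assumptions made here for the check; the function names and the grid parameters are hypothetical.

```python
import math


def base_density(x):
    """Assumed canonical density at Lambda = 1: 1 / (2 cosh(pi x / 2))."""
    return 0.5 / math.cosh(0.5 * math.pi * x)


def tilted_moment(theta, k, half_width=60.0, n=24001):
    """Trapezoid estimate of the integral of x^k * exp(theta x) * f(x) over the real line."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * (x ** k) * math.exp(theta * x) * base_density(x)
    return total * h


theta = 0.5
Z = tilted_moment(theta, 0)               # should approximate 1 / cos(theta)
mean = tilted_moment(theta, 1) / Z        # should approximate tan(theta)
var = tilted_moment(theta, 2) / Z - mean ** 2

print("normalizer:", Z, "vs", 1.0 / math.cos(theta))
print("mean:      ", mean, "vs", math.tan(theta))
print("variance:  ", var, "vs 1 + mean^2 =", 1.0 + mean ** 2)
```

The trapezoid rule is adequate here because the integrand is smooth and decays exponentially in both directions whenever |θ| < π/2.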
It is straightforward to check that, as Λ → ∞, the relative entropy (45) reduces to the form
for a Gaussian distribution
with arbitrary mean. We have used v2Λ² ≡ v0 as v2 → 0.
In the other limit, as Λ → 0, it is convenient to take v2 = 1/q ≡ 1/μ0, in which case we recover the relative entropy
appropriate to the standard gamma distribution
Two of the three continuous-valued NEF-QVF families, therefore, are degenerate limits of the hyperbolic-cosecant distribution, which represents the generic case.
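The analytic continuation to imaginary roots, and the Gaussian limit, can both be checked numerically. The sketch below is an illustration under assumptions: it takes the relative entropy to be D = −(1/v2)[ϕ log(ϕ/ϕ0) + (1 − ϕ) log((1 − ϕ)/(1 − ϕ0))] with ϕ ≡ (μ − μ1)/(μ2 − μ1), places the roots at ±iΛ with Λ = √(v0/v2), and compares against a real arctangent form obtained by integrating 1/V(μ) twice with D and dD/dμ vanishing at μ0. The function names are hypothetical.

```python
import cmath
import math


def relative_entropy_roots(mu, mu0, v0, v2):
    """Relative entropy from the root form, evaluated with complex logarithms (v1 = 0)."""
    lam = math.sqrt(v0 / v2)
    mu1, mu2 = 1j * lam, -1j * lam            # assumed labelling of the two roots
    phi = (mu - mu1) / (mu2 - mu1)
    phi0 = (mu0 - mu1) / (mu2 - mu1)
    value = (phi * cmath.log(phi / phi0)
             + (1 - phi) * cmath.log((1 - phi) / (1 - phi0)))
    return -value / v2                         # complex; imaginary part should vanish


def relative_entropy_real(mu, mu0, v0, v2):
    """The same quantity from a direct real double integral of 1 / (v0 + v2 mu^2)."""
    lam = math.sqrt(v0 / v2)
    dtheta = (math.atan(mu / lam) - math.atan(mu0 / lam)) / (v2 * lam)
    return mu * dtheta - math.log((lam ** 2 + mu ** 2) / (lam ** 2 + mu0 ** 2)) / (2.0 * v2)


d_complex = relative_entropy_roots(1.3, 0.4, 1.0, 1.0)
d_real = relative_entropy_real(1.3, 0.4, 1.0, 1.0)
print("root form:", d_complex, " real form:", d_real)

# Gaussian limit: v2 -> 0 at fixed v0, so that Lambda -> infinity.
v0 = 2.0
print("small v2:", relative_entropy_real(1.5, 0.5, v0, 1e-6),
      " Gaussian:", (1.5 - 0.5) ** 2 / (2.0 * v0))
```

Although ϕ and ϕ0 are complex, the two logarithmic terms are complex conjugates of one another, so the result is real up to rounding.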
The discrete-valued families, following when the variance function has real roots, may be handled in similar fashion. We choose canonical forms by offsetting x to set μ1 = 0 and attain this in the variance function by taking v0 → 0. The canonical scale for x is then given by taking v1 = 1.
For the discrete distributions, there are two ‘interior’ families of solutions (the binomial and negative binomial) and one limiting family (the Poisson) that may be reached from either of them. In all cases the second root is μ2 = −v1/v2. To obtain the binomial distribution on N samples with mean μ0 = pN,
we take μ2 = N, corresponding to v2 = −1/N. For this distribution only, the range is finite, 0 ≤ x ≤ N. The relative entropy takes the standard form of a Kullback–Leibler divergence without extending the definition of ϕ by analytic continuation,
The negative binomial distribution is immediately obtained by taking N → −N in the second line of eqn (55) while holding μ0 fixed. The corresponding distribution is
with p = μ0/(N + μ0). This is the other ‘interior’ solution, with μ2 = −N and therefore v2 = 1/N.
For both the negative binomial and the Poisson, the range of x is unbounded, x ≥ 0.
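The relations among the three discrete families can be checked numerically. The sketch below assumes the binomial relative entropy has the Kullback–Leibler form N[ϕ log(ϕ/ϕ0) + (1 − ϕ) log((1 − ϕ)/(1 − ϕ0))] with ϕ = μ/N and ϕ0 = μ0/N, obtains the negative binomial by the substitution N → −N, and takes the Poisson relative entropy to be μ log(μ/μ0) − μ + μ0. It then compares the finite-difference curvature of each with 1/V(μ). The function names and numerical values are hypothetical.

```python
import math


def kl_binomial(mu, mu0, N):
    """Assumed binomial relative entropy; also valid for N < 0 by continuation."""
    phi, phi0 = mu / N, mu0 / N
    return N * (phi * math.log(phi / phi0)
                + (1.0 - phi) * math.log((1.0 - phi) / (1.0 - phi0)))


def kl_negative_binomial(mu, mu0, N):
    """Obtained from the binomial form by N -> -N at fixed mu0."""
    return kl_binomial(mu, mu0, -N)


def kl_poisson(mu, mu0):
    """Assumed Poisson relative entropy."""
    return mu * math.log(mu / mu0) - mu + mu0


def curvature(f, mu, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at mu."""
    return (f(mu + h) - 2.0 * f(mu) + f(mu - h)) / (h * h)


mu, mu0, N = 6.0, 4.0, 20.0
print("binomial:         ", curvature(lambda m: kl_binomial(m, mu0, N), mu),
      "vs 1/V =", 1.0 / (mu - mu * mu / N))
print("negative binomial:", curvature(lambda m: kl_negative_binomial(m, mu0, N), mu),
      "vs 1/V =", 1.0 / (mu + mu * mu / N))
print("Poisson:          ", curvature(lambda m: kl_poisson(m, mu0), mu),
      "vs 1/V =", 1.0 / mu)

# Both interior families approach the Poisson form as N grows.
big = 1.0e7
print(kl_binomial(mu, mu0, big), kl_negative_binomial(mu, mu0, big), kl_poisson(mu, mu0))
```

Under the substitution N → −N both ϕ and ϕ0 become negative, but their ratio and the ratio (1 − ϕ)/(1 − ϕ0) remain positive, so the logarithms stay real.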
The relative entropy expressions (52, 58) for the gamma and the Poisson distributions are the same functional form, under exchange of the reference mean μ0 with the distribution mean μ. Their respective distributions are likewise interchanged under exchange of x with μ0, except that in the gamma case (53), a further shift μ0 → μ0 − 1 must be performed as well. We will return to integer shifts of this form in the next section.
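The exchange symmetry between the gamma and Poisson cases can be made concrete. The sketch below rests on assumed standard forms, since the displayed equations are not reproduced here: a Poisson mass function μ0^x e^{−μ0}/x! with relative entropy μ log(μ/μ0) − μ + μ0, and a standard gamma density x^{q−1} e^{−x}/Γ(q) with q = μ0 and relative entropy μ − μ0 − μ0 log(μ/μ0). The function names are hypothetical.

```python
import math


def kl_poisson(mu, mu0):
    """Assumed Poisson relative entropy."""
    return mu * math.log(mu / mu0) - mu + mu0


def kl_gamma(mu, mu0):
    """Assumed gamma relative entropy, for V(mu) = mu^2 / mu0."""
    return mu - mu0 - mu0 * math.log(mu / mu0)


def poisson_pmf(x, mu0):
    """mu0^x exp(-mu0) / Gamma(x + 1), continued to non-integer x."""
    return math.exp(x * math.log(mu0) - mu0 - math.lgamma(x + 1.0))


def gamma_pdf(x, q):
    """Standard gamma density x^(q - 1) exp(-x) / Gamma(q)."""
    return math.exp((q - 1.0) * math.log(x) - x - math.lgamma(q))


mu, mu0 = 2.5, 4.0
# Relative entropies: exchange the roles of mu and mu0.
print(kl_gamma(mu, mu0), kl_poisson(mu0, mu))

# Distributions: exchange x with mu0, then shift mu0 -> mu0 - 1.
x = 3.2
print(gamma_pdf(x, mu0), poisson_pmf(mu0 - 1.0, x))
```

In the second comparison the Poisson expression is evaluated with the gamma observation x in the place of the mean and μ0 − 1 in the place of the count, which is the exchange plus shift described in the text.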
(We note that the association of imaginary roots with continuous-valued distributions, and of real roots with discrete-valued distributions, is a defining structural feature of quantum mechanical distributions for particles with finite temperature but continuous time-dependence (Mahan, 2000). This is one of many interesting connections to the NEF-QVF families that it will not be possible to explore in this publication.)