Demographic variance in heterogeneous populations: matrix models and sensitivity analysis

Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.


Introduction
The sources of inter-individual variance in population outcomes can be divided into two categories. One is heterogeneity: genuine differences in the properties of individuals. Heterogeneity is defined as any difference among individuals that results in differences among them in any of the vital rates (mortality, fertility, development) that • Individual stochasticity. This is variation among individuals created by random events in the processes of survival, reproduction and development (Caswell 2009, Caswell and Kluge 2015, van Daalen and Caswell 2015. Individual stochasticity has the same meaning as 'dynamic heterogeneity' in the independent usage of Tuljapurkar et al. (2009). Even in the complete absence of heterogeneity; i.e. when every individual is subject to exactly the same vital rates, individuals will differ in their demographic outcomes because of individual stochasticity. It is thus a candidate explanation for variance among individuals in, e.g. longevity and lifetime reproductive output. It has now been documented that individual stochasticity can explain a surprisingly large fraction of inter-individual variance (Steiner et al. 2010, Steiner and Tuljapurkar 2012, Hartemink et al. 2017, Jenouvrier et al. 2017, Hartemink and Caswell 2018. • Demographic stochasticity. This is the result of embedding individual stochasticity in a population growth process, by including the (stochastic) creation of new individuals by reproduction. In models for demographic stochasticity, including the branching processes and diffusion models considered here, the random events at the individual level produce variance in population size over time. Demographic stochasticity is thus a candidate explanation for variance in population growth trajectories, growth rates, and population size, as well as a potential cause of population extinction. Because demographic stochasticity creates variation in population growth due to random individual events, its effects scale inversely with population size. It is particularly important for the viability of small populations in conservation, for the fate of new mutations, for the success or failure of introduced populations, and for the fate of epidemics following exposure to an infected individual. • Environmental stochasticity. This is variation in population growth caused by stochastic changes over time in the environment, which affect all individuals in the population (Tuljapurkar 1990). We will not consider environmental stochasticity here.
The incorporation of individual heterogeneity has always been the main goal and the greatest accomplishment of demography. Age-classified models are so familiar that we forget that, in their absence, age differences were a source of heterogeneity that was unaccounted for in unstructured models. Size-classified models seem so ordinary now that we may forget that they were developed because size was a source of heterogeneity unaccounted for in age-classified models. Age-classified, stage-classified and multistate demographic models incorporate differences among individuals in the vital rates and yield the consequences for population behavior. (Regardless of the mathematical form of the models: renewal equations, matrix projections, integral equations, partial differential equations, etc.). Incorporating age, or size, or more elaborate differences among individuals into an analysis does not change their nature as heterogeneity in any signficant way. The analysis of unobserved or latent heterogeneity, before it is incorporated into demographic models, is a long-standing challenge in demography (Yule 1910). Such heterogeneity is given a variety of names: frailty (Vaupel et al. 1979), individual quality (Wilson and Nussey 2010), fecundability (Sheps and Menken 1973), and the distinction between movers and stayers (Morrison et al. 1971). Recent advances in multistate matrix models and estimation of latent variables have made it possible to incorporate unobserved heterogeneity along with other forms of heterogeneity.
Once the extra source of heterogeneity is incorporated into a multistate model, then demographic analysis can quantify the effects of all types of heterogeneity operating together. The incorporation can be accomplished either by identifying and measuring the source of heterogeneity (e.g. health status or education in human demography) or by estimating it as a latent unobserved variable (Erişoğlu et al. 2012, Hartemink et al. 2017, Jenouvrier et al. 2017. The rest of the paper is organized as follows. The section 'Two models for demographic stochasticity' recalls the definitions of the branching process and diffusion models for demographic stochasticity. The section 'Demographic variance from matrix models' presents a general result for calculating the demographic variance (to be defined below) from any matrix population model. The matrix formulation leads directly, in the section 'Sensitivity analysis of the demographic variance' to a sensitivity analysis that quantifies the effects of mortality, fertility, and transitions on the demographic variance and extinction probability. The next section (Examples) presents examples from several age-classified human populations and a stage-classified plant species.

Notation
Matrices are denoted by upper-case boldface letters (e.g. U), and vectors by lower-case boldface letters (e.g. n). Vectors are column vectors by default. Where necessary, the dimensions of matrices and vectors are denoted by subscripts; thus I s is an identity matrix of order s and 1 s is a s  1 vector of ones. The unit vector e i is a vector with a 1 in the ith entry and zeros elsewhere. The unit matrix E ij is a matrix with a 1 in the (i, j) entry and zeros elsewhere; the dimensions will be indicated if not clear from the context. When convenient, Matlab notation will be used to refer to rows and columns of matrices; thus X(i, :) is the ith row and X(:, j) the jth column of X. The diagonal matrix with x on the diagonal and zeros elsewhere is denoted D(x). The symbol • denotes the Hadamard, or element-by-element product; the symbol ⊗ denotes the Kronecker product. The symbol ǁxǁ denotes the 1-norm of the vector x. The transpose of the matrix X is X T . The number of age classes is ω and the number of stages is s. The expected value of x is E(x); the variance is Var(x). The covariance matrix of the entries of the vector x is Cov(x). The vec operator (vec X) stacks the columns of X into a column vector.

Two models for demographic stochasticity
Two modelling approaches -branching processes and diffusion models -have been widely used to study effects of demographic stochasticity; our goal in this paper is to present a new synthesis of these approaches and develop new analyses from them. There is a close link between matrix population models and multitype branching process (MBP) models (Pollard 1966, Caswell 2001. Information on the higher moments of the population can be obtained from the matrix model with only a small amount of extra information (or, failing that, a few extra approximations; Caswell 2001). Our goal here is to provide an explicit link between the MBP formulation of matrix population models and diffusion models for demographic stochasticity, and to use that link to develop a sensitivity analysis for demographic variance. Our results are generally applicable to age-or stage-classified models.

Moments of multitype branching processes
Demographic stochasticity was the subject of what may be the first stochastic population model: the branching process introduced by Galton and Watson (1874) to analyze the extinction of British surnames. The Galton-Watson branching process describes a population in which all individuals are identical, and independently produce a random number of 'offspring' according to a specified probability distribution. The 'offspring' may, depending on the context, include the survival of the 'parent' individual as well as the production of genuinely new offspring. The analysis is carried out by a now classical application of probability generating functions.
Structured populations, in which individuals differ on the basis of age or stage, are analyzed using multitype branching processes (MPB). Demographic applications of these models were developed for age-classified models by Pollard (1966) and extended to stage-classified models by Caswell (2001, chapter 15). In these models, an individual of a given type is characterized by a multivariate probability distribution of the numbers of 'offspring' of all types produced over a time interval. Again, offspring include the possible survival and stage transition of the starting individual. Analysis based on the probability generating functions of these multivariate distributions gives the properties of future population size and the probabilities of extinction. The analysis also provides expressions for the moments of the population vector as a function of age or stage. The familiar population projection matrix is the expected value operator of a MBP; this makes it possible to develop a MBP from any stage-or age-classified matrix model.
Let n(t) denote the population vector, with expectation E(n(t)) and covariance matrix G = Cov(n(t)). The population projection matrix is A = U+F, where U contains transition probabilities and F contains fertilities.
Define the vectors giving the per-capita production of offspring of all stages (recall that there are s stages) by an individual of stage j as Both x U and x F are random variables. The total offspring prodution is The variability introduced by individual stochasticity depends on the covariance matrices of the x ( j) . At this point, approximations of various kinds must be made. These are well known, and are described at length in (Caswell 2001, chapter 15), so we discuss them only briefly here.
We will rely on a set of approximations tailored to the demographic information that is reported in the overwhelming majority of ecological studies. In idealized situations, some of the approximations can be avoided, and we return to this issue in the Discussion. A matrix population model reports the expected value of the x F ( ) j in the fertility matrix F. A complete branching process model for the population would require the complete joint probability distribution of the numbers of all types of offspring produced by an individual of each stage. Here, we are focused on the mean and variance of the populations, so the calculations require only the means and covariances of offspring production. Confronted with only the expected values, the appropriate approximation is to use a biologically reasonable distribution that can be parameterized by its mean. For species producing at most one offspring, the appropriate choice is the Bernoulli distribution. For species producing many offspring, the Poisson distribution is an appropriate choice. More interesting distributions (e.g. mixture distributions such as the negative binomial) can be parameterized only if data on the variance of reproduction are available. For example, Kendall and Wittmann (2010) compared the fit of a variety of probability distributions to data on vertebrate annual reproductive success, and found that a generalized Poisson distribution gave the best fit. This distribution requires an extra parameter that adjusts the relationship between the variance and the mean, so it cannot be fit from data on the mean alone. We encourage researchers to present full distributions, not only means, of stage-specific reproductive output, to enable use of such distributions.
In this paper we explore both Bernoulli and Poisson models for reproduction. Study of other distributions, and of distributions appropriate to taxonomic groups other than vertebrates, are interesting and significant unsolved research problems.
The second approximation concerns the covariance between x U ( ) j and x F ( ) j ; that is, between reproduction and transitions. The usual demographic models provide no information on this, simply adding together transitions and fertility to obtain A. In the absence of this information, the appropriate approximation is to treat transitions and fertility as independent, in which case the covariance matrix of offspring production is the sum of the covariances due to fertility and those due to transition. This assumption about independence of the outcomes of random events does not imply that the rates of survival and of reproduction, across i-states, are independent.
We write the covariance matrix among the stages as We collect all this covariance information in a matrix D given by This information suffices to project the mean population vector E(n) and its covariance matrix G: (Pollard 1966, Caswell 2001. This is the extension of the matrix population model to incorporate demographic stochasticity. The key to the analysis is covariance matrices that appear in D, which we now derive.

Covariance due to transitions
The transition part of Eq. 4 is easy; if U is the transition probability matrix among the stages, then the distribution of offspring is multinomial, with a covariance matrix

Covariance due to fertility
The covariance matrix C F in Eq. 4 depends on the types of offspring produced and the probability distribution of the numbers produced. Here we consider two common cases: one in which a single offspring is produced, and one in which multiple offspring, of multiple types, are produced. Other reproductive modes can be constructed similarly.

Bernoulli reproduction
Many long-lived organisms, including whales, albatrosses, elephants and humans, typically produce a single offspring at a time. Suppose stages are numbered so that new offspring are stage 1. Then F contains expected fertilities f j in the first row and zeros elsewhere, and the number of offspring is a Bernoulli random variable. The covariance matrix is

Poisson reproduction
A simple model for reproduction in organisms that produce potentially many offspring, of many different types, is obtained by assuming that the number of offspring of type i produced by a parent of type j is a Poisson random variable with mean f ij . The variance of the Poisson distribution is equal to the mean; thus, if the types of offspring are independent, then the covariance matrix satisfies

Diffusion models
At the opposite extreme from the discrete-state, discrete-time and vector-valued multitype branching process models are diffusion processes, which are continuous-state, continuous-time, scalar-valued stochastic processes. There is a rich but mathematically difficult literature on diffusion models (Cox andMiller 1965, Karlin andTaylor 1981), which has been applied to a variety of models in population ecology (Lande et al. 2003) and population genetics (Kimura 1964, Nagylaki 1989. Diffusion models can be derived in several ways. Start, quite generally, with a continuous-time Markov process for a continuous variable x(t). If it can be solved, this original model produces a probability density function f(x, t) of population size at time t. Such a solution may not be possible or known, and Bailey (1964) shows it can be approximated by a diffusion model. Considering the probability of a small change in x over a short time interval, and expanding the resulting change in f(x, t) in a Taylor series, yields a diffusion equation for f: Here the function µ(x) defines the growth of the mean, and ν(x) the growth of the variance, of x. In applications to population growth, x is some measure of population size, and µ(x) and ν(x) are taken to be proportional to x (Bailey 1964). That is, population size influences the growth rate of both mean population and the variance. We will denote the proportionality constant relating x and ν(x) as s d 2 . The diffusion coefficients are obtained from the original model (which is being approximated as a diffusion) as where E[∆x|x] and Var [∆x|x] are obtained from the model being approximated.
Because we are focusing here on variance in the growth increment that is due to demographic stochasticity, we have followed standard practice (Lande et al. 2003) and written the coefficients in terms of s d 2 , referred to as the demographic variance. There are two main strengths of the diffusion approximation. First, the diffusion approximation is usually described by fewer parameters than is the full process. This simplifies the model, and the calculation of µ and ν from Eq. 11 and Eq. 12 provides valuable information about which aspects of the process are important to consider for describing the dynamics. Second, it is also possible to simulate realizations of the diffusion process, using µ(x) and ν(x) to add a random perturbation to the growth of the population (Vindenes et al. 2010(Vindenes et al. , 2011.

Extinction probability
Analytical solutions, which are known for many diffusion processes (Karlin and Taylor 1981), provide information on the time required to reach a given state (first passage time). Extinction, in a population described by a diffusion model, is analyzed as just such a passage time problem. If N is the population size, then a natural extinction boundary exists at N e = 1. The probability of extinction depends on the initial population size N 0 , the growth rate r = log λ of the mean population size, and the demographic variance s d 2 , according to (Engen et al. 2005). For a specified value of r  0 and a given initial population, extinction probability increases with increasing demographic variance.

Demographic variance from matrix models
We now connect the structure of multitype branching processes with the framework of diffusion models, by calculating the demographic variance s d 2 from the matrix model. To do so, the dimensionality of the MBP must be reduced to a scalar population size, in spite of the fact that population structure may be fluctuating. The key is to measure population size not by total numbers, but by the total reproductive value (Engen et al. (2005) for age-classified models, Vindenes et al. (2008) for stage-classified models). It is known that the total reproductive value grows exponentially regardless of the age or stage distribution of the population (Fisher 1930).
Begin with the matrix A, with dominant eigenvalue λ and corresponding right and left eigenvalues w and v, where w gives the stable stage distribution and v the reproductive value distribution. We scale the eigenvectors so that w sums to 1 and Define a vector q whose ith entry is Then it can be shown that (see section 'Derivations' for the derivation). The entries of q are the contributions to the variance from each stage, so the conditional variance in V (t + 1) depends on the stage distribution of the population. Following Engen et al. (2007) and Vindenes et al. (2008), we will use the stable stage distribution, so that n(t) ≈ w N. Under this assumption, , so that Eq. 15 can be written Comparing this to Eq. 12, which defines a conditional variance in terms of s d 2 and an expected population, we obtain the demographic variance in terms of w and q, This result provides the demographic variance for any ageor stage-classified model from the eigenvectors of A and the variances and covariances of the stage-specific transition probabilities and reproductive outputs.

Partitioning the components of demographic variance
Because Eq. 4 expresses the covariances in terms of the matrices U and F, the demographic variance can be partitioned into contributions from these life cycle components. The demographic variance (Eq. 17) is a weighted mean, over the stable stage distribution, of the variances of the offspring production of each stage, v T C (i) v. Each of those variances contains a term due to transitions and a term due to reproduction. Thus 653

Sensitivity analysis of the demographic variance
We turn now to the problem of the sensitivity of s d 2 to changes in parameters affecting the means and covariances of the vital rates. This calculation is a systematic application of the matrix calculus approach to sensitivity analysis; see (Caswell 2007(Caswell , 2008(Caswell , 2009(Caswell , 2012. Derivations of all results are given in the section 'Derivations'.
Differentiating Eq. 17 and applying the vec operator gives The analysis proceeds by determining the differentials d w and dq; the latter will require differentials of both v and of covariances.

Derivatives of w
From Caswell (2008) we have d w subject to the constraint

Derivatives of q
To differentiate q, start by rewriting Eq. 45 as Differentiating and applying the vec operator gives For this, we need two pieces: the differential of the left eigenvector v and the differential of the covariance matrices C i . We address these in turn.

Derivatives of v
An extension of the methods in Caswell (2008) yields the differential of the v scaled so that v T w = 1: where d w is given by Eq. 20.

Derivatives of the covariance matrices C (i)
The derivatives of the covariance matrices C C C include a number of possibilities, because the covariance matrix will change due to changes in U and F, as well as due to changes that may be incorporated in parametric form.

Covariances due to transitions
We start with the covariances due to transitions, which are given by Eq. 7. For each parental stage i, we have where Z = D (vec I).

Covariances due to reproduction
In both Bernoulli and Poisson reproduction, the covariance matrix C (j) is obtained from the fertility matrix F. For the moment, supress the indication of the parent stage i, and let f denote column j of F.
The Bernoulli and Poisson models for reproduction are convenient because they permit the calculation of variances from the mean fertility matrix F routinely reported in demographic studies. But if data were available on the mean and the variance of reproduction by each stage, one could use these empirical measures, and calculate the sensitivities using the derivatives of the matrices C F ( ) i with respect to the variance of fertility.

Sensitivity of extinction probability
The extinction probability in Eq. 13 depends on the initial population size and the extinction threshold. Since population size is a continuous variable in the diffusion approxi mation, extinction must be interpreted as quasi-extinction relative to this threshold.
Consider a population of initial size (total reproductive value) V 0 , and quasi-extinction threshold of V e , and write V V V e = − 0 . Assume that r = log λ is greater than 0 (otherwise extinction is certain); the probability of extinction is (cf. Eq. 13). A parameter vector θ will, in general, affect both r and s d 2 .
Differentiating P e and simplifying yields The first of these terms shows how increases in r reduce extinction probability; the sensitivity of r depends only on the mean matrix A and can be calculated by well known formulae (Caswell 2001). The second term shows how increases in s d 2 increase the extinction probability; the results in previous sections for the sensitivity of s d 2 make it straightforward to compute this term.

Implementation: a protocol for sensitivity analysis
We now have all the information necessary to calculate the sensitivity of the demographic variance to any set of parameters that affect U and/or F. Let θ be such a vector of parameters. The following protocol integrates the results from the previous sections. Starting with the matrices U and F, and the dominant eigenvalue of A and its eigenvectors w and v, and , respectively. These derivatives will, of course, be unique to the choice of parameters, the structure of U and F, and their dependence on θ.
The expressions for each of these steps may appear complicated, but the matrices involved in each are either simple constants, composed of identity matrices and vectors of ones, or are obtained directly from the population projection matrix A or its components U and F.

Examples
The results presented here apply to a wide range of life cycles (age-classified, stage-classified, multistate) and reproductive tactics (single offspring, multiple offspring, single or multiple types of offspring). In this section we explore a few examples to reveal some intruiging patterns. We will calculate the values of the demographic variance and the contributions to demographic variance from the (co)variances of transitions and of fertility. We will also present the sensitivity of s d 2 to fertility and to mortality, and do so for both age-classified and stage-classified examples, including both Bernoulli-and Poisson-distributed numbers of offspring.
Our intention is not a rigorous comparative study, but some of the patterns that begin to appear suggest that such a study would be worthwhile.

Age-classified human populations
The large collections of age-specific mortality and fertility data for human populations, sometimes over long time periods, make them an attractive option for studying demographic stochasticity in age-classified populations. While the details would vary in other species, the human life cycle is typical of a long-lived, slowly maturing, monovular (ignoring uncommon multiple births) species.
We will present results for nine populations studied, in a different but related context, by van Daalen and Caswell (2017). These include time series for the Netherlands (1950), Sweden (1891, and Japan , as well as two hunter-gatherer populations (the Ache of subtropical Paraguay, Gurven andKaplan 2007, Hill andHurtado 1996, and the Hadza of the Tanzanian savanna, Blurton Jones 2011), and the Hutterites of North America. The Netherlands, Sweden and Japan show patterns over time that are typical of developed countries (the demographic transition), in which reductions in mortality lead to increases in life expectancy, followed some time later by reductions in fertility. The hunter-gatherer populations have higher mortality, lower life expectancy and higher fertility than the developed countries. The Hutterites, an Anabaptist religious community in the United States and Canada, make an interesting comparison. They had the highest total fertility recorded for any human population (Eaton and Mayer 1953), but had (or were assumed to have) experienced mortality rates similar to those of the US around 1950.
Female life table data for the Netherlands, Sweden and Japan were obtained from the Human Mortality Database (Human Mortality Database 2013) and the Human Fertility Database (Human Fertility Database 2013). Data for the Ache were obtained from Gurven and Kaplan (2007) and Hill and Hurtado (1996), and for the Hadza from Blurton Jones (2011). The fertility and mortality schedules for the Hutterites were taken from Eaton and Mayer (1953). Fertility was divided by 2 to count female offspring per female.

Analysis
Matrices U and F were created for each population in each year. Fertility was treated as a Bernoulli process. Demographic variance was calculated from Eq. 17, including the contributions from survival and fertility using Eq. 7 and Eq. 8 for the covariance matrices and Eq. 18 for the decomposition. The sensitivity of s d 2 to age-specific mortality and fertility was calculated from Eq. 19.
In addition to s d 2 , its components, and its sensitivity to the life history, other demographic properties were computed. Population growth rate λ was calculated as the dominant eigenvalue of A. The net reproductive rate R 0 was calculated as the dominant eigenvalue of the next generation matrix F (I − U) −1 . The mean longevity (life expectancy) and the variance in longevity were computed from the fundamental matrix (e.g. Eq. 38 and 41 of Caswell (2009)). The variance in lifetime reproductive output was taken from van Daalen and Caswell (2017).
Life expectancy, R 0 , and λ are statistics describing the mean consequences of the life cycle, in terms of mortality, fertility, and their interaction. The variance in longevity and the variance in lifetime reproductive output describe the results of individual stochasticity. One of our goals is to see how these quantities are related to s d 2 , which describes the result of demographic stochasticity.  Figure 1 shows time series results for Sweden from 1891 to 2010. Over this time period, life expectancy increased by 58%, from 53 to 84 years, while lifetime reproductive output decreased by 34%, from 1.5 to 0.99 female offspring per female. The demographic variance s d 2 increased, with fluctuations ( Fig. 1a) almost doubled, from 0.05 to 0.09. The contribution of fertility to s d 2 increased from 70% to 99% (Fig. 1b). In 1891, the contribution of infant and child mortality to s d 2 was visible, but it had disappeared by 2010 (Fig. 1c-d).
Corresponding time series for the Netherlands and Japan are shown in the Supplementary material Appendix 1, and the values at the beginning and end of the time series are shown in Table 1 Fig. A4). Figure 3 and 4 show the contributions to s d 2 and the sensitivity of s d 2 to mortality and fertility for the Ache and Hadza hunter-gatherer populations. The values of s d 2 (0.04 and 0.06) are in the same range of the values for the early developed countries. However, the contributions from fertility and from survival are almost exactly evenly matched (Table 1).
The Hutterite population, with its extremely high fertility, but more contemporary mortality, has a low value of s d 2 (0.023), almost all of which is contributed by fertility.

A stage-classified example
Calathea ovandensis is an herbaceous understory plant native to Central America. Horvitz and Schemske (1995) developed a size-classified model with eight stages (seeds, seedlings, juveniles and pre-reproductives, and small, medium, large  (Horvitz and Schemske 1995). Using this matrix, and assuming a Poisson distribution for fertility, we computed s d 2 using Eq. 17, and the contributions from the fertility matrix F and from the matrix U that includes transitions and survival. The resulting demographic variance for Calathea is 99.8% of which was contributed by transitions and survival; only 0.2% was due to stochasticity in fertility.
We computed the sensitivity of s d 2 to changes in stagespecific fertility and mortality using (19). For the latter calculation, the derivatives of U to mortality were obtained by factoring U, where Σ has survival probabilities on the diagonal and G contains transition probabilities conditional on survival.
The stage-specific contribution and sensitivity results are shown in Fig. 6. The largest contributions come from transitions and survival of stages 3-5. The fertility contributions are tiny, but of them the largest is from stage 5. The sensitivity patterns are quite different from those of the human populations. The sensitivity of s d 2 to mortality is negative in the smallest stage, and becomes positive later. The sensitivity to fertility is largest in stage 1 (which cannot reproduce), and declines across the larger stages.

Discussion
This methodological study has had several specific goals. The first and most general is to link matrix population models, multitype branching processes, and diffusion models as an approach to the study of demographic stochasticity. The second is to use the linkage to develop a sensitivity analysis of the demographic variance to parameters affecting any part[s] of the life cycle. The third is to develop methods to quantify the contributions to the demographic variance of fertility and transitions, throughout the life cycle. And finally, we set out to apply the methods to a variety of animal and plant population models as a first exploration of the patterns that emerge.
The diffusion approximation omits all the details; it is constructed from nothing more than a mean and a variance. That simplicity makes it a powerful technique because it is able due to fertility and survival, the population growth rate λ, the net reprodutive rate R 0 , the mean E(η 1 ) and variance V(η 1 ) of longevity at birth, and the variance V(ρ) in lifetime reproductive output due to individual stochasticity (van Daalen and Caswell 2017).  to derive a great deal of stochastic information (e.g. extinction probability) from such a minimal specification. But that power is also its weakness, because the diffusion approximation obscures all the demographic and life history information that underlies population dynamics. It contains no stages, no transitions, no growth, no mortality schedules, no maturation, no metamorphosis, no size-dependent fertility, and thus no heterogeneity of any kind. As a result, interpretation of the results in terms of the life history traits that we know and love is impossible. For that reason, results that connect the diffusion approximation to age-and stage-structured models that do contain explicit life history information are so valuable.
Our results make it possible to compute the demographic variance from any age-or stage-classified population projection matrix and demonstrate the link between matrix population models, multitype branching processes, and diffusion models. Applications based on the diffusion approximation, including conservation biology (Lande et al. 2003), population dynamics (Lande et al. 2003, Engen et al. 2007, Vindenes et al. 2008, extinction (Lande 1993, Engen et al. 2005, and fixation probability (Vindenes et al. 2010, Cvijović et al. 2015 are now directly applicable to the huge accumulation of published matrix population models (over 7200 projection matrices, for over 950 species of plants and animals; see Salguero-Gómez et al. (2015). It also makes it possible to use powerful matrix calculus methods to carry out sensitivity analyses.
Future research using comparative analyses will reveal patterns across different kinds of heterogeneity and life histories. For comparative analyses to be revealing, sufficient variation must exist in the variables to be analyzed. For these analyses, we note that among the small set of age-classified human populations, s d 2 varies by almost four-fold. The size-classified plant population extends that range to 68-fold. Among the human populations, fertility contributions range from about 50% to almost 100% of s d 2 . Again, the plant population is extremely different: fertility contributes only 0.02% of s d 2 . 1.2 x 10 −3 x 10 −3 Ache 3.5 The empirical examples we included here are only a beginning, but we already see some interesting patterns.
• In human populations where extensive demographic data are available over a long time period, demographic variance has changed markedly over time, with a general increase in Sweden and Japan, while in the Netherlands a large increase in the 70s has been followed by a decline since the peak in the 80s. • Among these human populations there has been a dramatic shift in contributions to s d 2 from survival and fertility. In populations with lower survival, the contributions from survival and fertility are roughly equal (especially in hunter-gatherer populations with life expectancy on the order of 35 years). As survival, and hence life expectancy, increases, the contributions shift to almost complete dominance by fertility in contemporary developed countries.
• The sensitivity of demographic variance to changes in mortality is consistently positive for early ages, and negative at older ages. The sensitivity to changes in fertility may be slightly positive at youngest ages (where fertility is zero anyway), but becomes negative during the reproductive years. These patterns are sufficiently consistent that one could conjecture that they are a common feature of age-classified populations. • These comparisons of demographic variance across human populations cover a range of conditions that would be regarded as significant environmental variation in any mammal species (life expectancy varying by a factor of 2.5, lifetime reproductive output by a factor of 5). The resulting patterns differ from the interspecific comparisons among mammals by Vindenes and Engen (2017), who found a negative relationship between demographic variance and life expectancy. The explanation, as usual for comparative studies, will undoubtedly reflect the details of life history variation among, on the one hand, populations of one species, and populations of various related species.
The stage-classified model for Calathea ovandensis reveals quite different patterns. The demographic variance itself is 1-2 orders of magnitude larger than the values for the human   populations and is almost completely due to variance in survival and transitions, perhaps because the size-classified model offers so many more pathways through the life cycle and because the life history combines high fertility and low seedling survival. The contributions from each stage or age class to the demographic variance depend on the expected contribution from each individual as well as the proportion of that stage in the stable distribution. Thus, stages where the variance contribution from each individual is small, for instance because the survival probability is high, may still have a large impact on the demographic variance if they constitute a high proportion of the stable distribution. If the survival probability approaches zero or one, the contribution to demographic stochasticity will approach zero even for stages constituting a high proportion of the population.

Future research directions
The methods we have introduced suggest a number of future research directions. We describe a few of those here.
Our examples have focused on age and size, two of the most important and most frequently studied sources of demographic heterogeneity. Other sources of heterogeneity will also affect the demographic variance (e.g. spatial structure, morphological or behavioral trait variation, genetic variation). These dimensions of heterogeneity can be incorporated into multistate or hyper-state matrix models (Caswell 2014, Roth and Caswell 2016 and our analyses applied to them, to explore the effects of these sources. The analyses we present here can be applied when integral projection models are treated as a source of matrices by discretizing the state variable (Ellner et al. 2016). This will open new analyses of more fine-scaled variation in the contributions to demographic variance across continuous traits.
It is inviting to use model life tables (Coale andDemeny 1983, Barlow andBoveng 1991) or parametric survival and fertility schedules (Brass 1960, Siler 1979 to explore ageclassified life cycles. Relating the demographic variance to the pace and shape of life histories (Baudisch 2011) might reveal new patterns in age-classified populations operating on different time scales. In the same vein, a systematic study of the consequences of mixture models for fertility (e.g. the negative binomial as a gamma-distributed mixture of Poisson distributions) would be a valuable application.
The diffusion approximation and the demographic variance s d 2 reveal the population consequences of stochastic individual events. The relation between s d 2 and measures of individual stochasticity warrants further investigation. Our results in Table 1 show that it is possible for s d 2 to increase while individual variance in longevity and lifetime reproductive output are declining. Identifying the conditions under which this happens will require extensive new analyses.
The data required to construct these models consists of the means and covariances of the production of all types of individuals, by all types of individuals. The means are precisely the entries of the standard population projection matrix. The only extra information required is the set of covariances. Because the transitions of existing individuals follow multinomial distributions, their covariances can be calculated directly from the means. The covariances for fertility are not part of the usual presentation of a matrix population model, and for that reason we have shown how to proceed from assuming a distribution (Bernoulli or Poisson) for fertility. But a sufficiently detailed individual data set would include information from which these could be estimated empirically. We encourage researchers to present data on variances in reproductive output as well as means, when available.

Derivations
In this section, we collect the derivations of the major results of analysis of the matrix model in terms of the demographic variance s d 2 .

Derivation: demographic variance
We define λ as the dominant eigenvalue of A, and w and v as the right and left eigenvectors corresponding to λ, scaled so that eigenvectors so that w sums to 1 and w T v = 1. We note that, from the properties of the Kronecker product, The total reproductive value is with expectation and variance The expectation of V (t) grows exponentially at the rate λ: The variance of V (t) grows asymptotically at the rate λ 2 , Define a vector q whose ith entry is Conditioning on V (t) sets the last term to 0, leading to Eq. 15, from which Eq. 16 and 17 follow.

Covariances due to reproduction
In both Bernoulli and Poisson reproduction, the co-variance matrix C F ( ) j is obtained from the fertility matrix F. For the moment, supress the indication of the parent stage i, and let f denote column j of F.
The covariance matrix is