Second derivatives of the population growth rate measure the curvature of its response to demographic, physiological or environmental parameters. The second derivatives quantify the response of sensitivity results to perturbations, provide a classification of types of selection and provide one way to calculate sensitivities of the stochastic growth rate.
Using matrix calculus, we derive the second derivatives of three population growth rate measures: the discrete-time growth rate λ, the continuous-time growth rate r = log λ and the net reproductive rate R0, which measures per-generation growth.
We present a suite of formulae for the second derivatives of each growth rate and show how to compute these derivatives with respect to projection matrix entries and to lower-level parameters affecting those matrix entries.
We also illustrate several ecological and evolutionary applications for these second derivative calculations with a case study for the tropical herb Calathea ovandensis.
Using matrix population models, ecological indices can be calculated as functions of vital rates such as survival or fertility. Measures of population growth rate, including the discrete-time growth rate λ, the continuous-time growth rate r = log λ and the net reproductive rate R0, are of particular interest. The discrete-time population growth rate λ is given by the dominant eigenvalue of the population projection matrix. Sensitivities (first partial derivatives) of λ with respect to relevant parameters quantify how population growth responds to vital rate perturbations. These first derivatives are used to project the effects of vital rate changes due to environmental or management perturbations, uncertainty in parameter estimates and phenotypic evolution (i.e. with λ as a fitness measure, the sensitivity of λ with respect to a parameter is the selection gradient on that parameter) (Caswell 2001).
Applications of second derivatives of growth rates
The second derivatives of growth rates have applications in both ecology (e.g. assessing and improving recommendations from sensitivity analysis, approximating the sensitivities of stochastic growth rates) and evolution (e.g. characterizing nonlinear selection gradients and evolutionary equilibria). Several of these applications are summarized in Table 1 and described in the following sections.
Second-order sensitivity analysis and growth rate estimation
The sensitivity of growth rate provides insight into the population response to parameter perturbations. However, such perturbations also affect the sensitivity itself, that is, sensitivity is 'situational' (Stearns 1992). These second-order effects are quantified by the sensitivity, with respect to a parameter θj, of the sensitivity of λ to another parameter θi, that is, by the second derivatives ∂²λ/∂θj∂θi. The sensitivity of the elasticity of growth rate to parameters similarly depends on second derivatives (Caswell 1996, 2001).
Table 1. Potential applications for the pure and mixed second derivatives of λ. Analogous interpretations apply to r or R0 as alternative measures of growth or fitness.

| Second derivative | Sensitivity analysis | Selection on traits | Evolutionary singular strategy |
|---|---|---|---|
| ∂²λ/∂θ² = 0 | Sensitivity of λ to θ is independent of θ | Linear selection on trait θ | |
| ∂²λ/∂θ² > 0 | Sensitivity of λ to θ increases with θ | Convex selection on trait θ | Evolutionarily unstable singular strategy |
| ∂²λ/∂θ² < 0 | Sensitivity of λ to θ decreases with increases in θ | Concave selection on trait θ | Evolutionarily stable singular strategy |
| ∂²λ/∂θj∂θi > 0 | Sensitivity of λ to θi increases with θj | Selection to increase correlation between traits θj and θi | |
| ∂²λ/∂θj∂θi < 0 | Sensitivity of λ to θi decreases with increases in θj | Selection to decrease correlation between traits θj and θi | |
| All (pure and mixed) | Used to calculate sensitivity of the stochastic growth rate λs | | |
In conservation applications, attention is often focused on the vital rates to which population growth is particularly sensitive or elastic; these first-order results may change depending on parameter perturbations. First derivatives also provide a linear, first-order approximation to the response of the growth rate to changes in parameters. The linear approximation is guaranteed to be accurate for sufficiently small perturbations and is often very accurate even for quite large perturbations (Caswell 2001). If the response of λ to θ is nonlinear, it is tempting to use a second-order approximation for Δλ:

$$\Delta\lambda \approx \sum_i \frac{\partial\lambda}{\partial\theta_i}\,\Delta\theta_i + \frac{1}{2}\sum_i\sum_j \frac{\partial^2\lambda}{\partial\theta_i\,\partial\theta_j}\,\Delta\theta_i\,\Delta\theta_j$$
We caution that although this may, in some cases, provide a more accurate calculation, this is not guaranteed. As shown in Fig. 1 of Carslake, Townley & Hodgson (2008), for example, adding the second-order terms may actually reduce the accuracy of the approximation.
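As a minimal sketch of the first- and second-order approximations, consider a hypothetical two-stage matrix [[0, f], [θ, p]], whose dominant eigenvalue has a closed form. The matrix, its parameter values and the function names are assumptions chosen for illustration and are not taken from the case study. In this particular example the second-order term helps, but, as cautioned above, that is not guaranteed in general.

```python
import math

def growth_rate(theta, f=1.5, p=0.6):
    """Dominant eigenvalue of the hypothetical matrix [[0, f], [theta, p]]."""
    return 0.5 * (p + math.sqrt(p**2 + 4.0 * f * theta))

def first_derivative(theta, f=1.5, p=0.6):
    """Sensitivity d(lambda)/d(theta), obtained by differentiating growth_rate."""
    return f / math.sqrt(p**2 + 4.0 * f * theta)

def second_derivative(theta, f=1.5, p=0.6):
    """Second derivative d2(lambda)/d(theta)2."""
    return -2.0 * f**2 / (p**2 + 4.0 * f * theta) ** 1.5

theta0, delta = 0.3, 0.2   # baseline value and a fairly large perturbation

exact = growth_rate(theta0 + delta) - growth_rate(theta0)
first_order = first_derivative(theta0) * delta
second_order = first_order + 0.5 * second_derivative(theta0) * delta**2

print("exact change:       ", exact)
print("first-order approx: ", first_order)
print("second-order approx:", second_order)
```

Comparing the two approximation errors against the exact change, for each perturbation of interest, is a cheap way to decide whether the quadratic term is worth including.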
Characterizing nonlinear selection processes
The second derivatives of fitness with respect to trait values have consequences for selection. The first derivatives of fitness are selection gradients (Lande 1982). When fitness is a linear function of a trait, its second derivatives are zero, and there is selection to shift the trait's mean value. When fitness is a nonlinear function of a trait, its second derivatives are nonzero and provide additional information on how selection affects the trait's higher moments (Lande & Arnold 1983, Phillips & Arnold 1989, Brodie, Moore & Janzen 1995). Such nonlinear selection can be classified as concave or convex depending on whether the second derivatives are negative or positive.
One can classify a selection process as linear, concave or convex using quadratic selection gradients, the local second derivatives of fitness with respect to trait value (Phillips & Arnold 1989). If fitness is measured as λ, these quadratic selection gradients are equivalent to ∂2λ/∂θ2, the pure second derivatives of λ with respect to trait θ (e.g. the second derivatives with respect to stage-specific survival in C. ovandensis, as shown in Fig. 3a). Concave, linear and convex selection correspond to negative, zero and positive second derivatives, respectively.
Concave selection reduces the variance in the trait, and convex selection increases it; Lande & Arnold (1983, p.1216) equate this to a more sophisticated version of the concepts of stabilizing and disruptive selection. Brodie, Moore & Janzen (1995) provide further analysis of the curvature of the fitness surface and its effects on selection.
Selection operating on pairs of traits is said to be correlational if the cross second derivatives are nonzero. Thus, if the pure second derivatives of two different traits, θi and θj, are both nonzero, their mixed second derivative ∂2λ/∂θj∂θi is a measure of correlational selection. If ∂2λ/∂θj∂θi<0, there is selection to decrease the phenotypic correlation between the two traits; if ∂2λ/∂θj∂θi>0, there is selection to increase their correlation. The concepts of nonlinear selection are powerful, but require the second derivatives of fitness to be applied.
Stability of evolutionary singular strategies
Second derivatives play a role in adaptive dynamic analyses. Evolutionary singular strategies (SSs) are phenotypes for which the selection gradient is locally zero (e.g. Geritz et al. 1998). SSs are classified as stable, attracting or repelling, and by whether they can invade or coexist with other nearby phenotypes (Geritz et al. 1998, Diekmann 2004, Waxman & Gavrilets 2005, Doebeli 2011).
These classifications depend on the local second derivatives of invasion fitness, the growth rate of a rare mutant in an equilibrium resident environment. For example, the second derivative of the mutant growth rate λ with respect to the mutant trait y determines whether an SS is evolutionarily stable (∂2λ/∂y2<0) or evolutionarily unstable (∂2λ/∂y2>0). Evolutionarily stable strategies, once established, are unbeatable phenotypes against which no nearby mutants can increase under selection and are thus long-term evolutionary endpoints. Evolutionarily unstable strategies, on the other hand, are branching points open to phenotypic divergence and may ultimately become sources of sympatric speciation (Geritz et al. 1998).
Sensitivity of the stochastic growth rate
Second derivatives provide a way to calculate the sensitivity of the stochastic growth rate in some cases. The stochastic growth rate is

$$\log\lambda_s = \lim_{t\to\infty}\frac{1}{t}\,\log N(t)$$
where N(t) is the population size at time t. Tuljapurkar (1982) derived a small-noise approximation for log λs in the absence of temporal autocorrelation. As shown by Caswell (2001, Section 14.3.6), this approximation can be written in terms of the first derivatives of λ, the dominant eigenvalue of the mean projection matrix Ā. Thus, the derivatives of this approximation can be written in terms of the second derivatives of that eigenvalue (Caswell 2001, Section 14.3.6). We discuss this application further in the section ‘Sensitivity analysis of stochastic growth rates’.
Calculating second derivatives of growth rates
The second derivatives of λ with respect to matrix elements were introduced by Caswell (1996); see also Caswell (2001, Section 9.7). However, these calculations are awkward and error-prone, because they involve all the eigenvalues and eigenvectors of the projection matrix. McCarthy, Townley & Hodgson (2008) introduced an alternative approach for calculating the second derivatives of eigenvalues (they call them 'second-order sensitivities') based on transfer functions, partially to avoid the calculation of all the eigenvectors. However, they consider only rank-one perturbations of a subset of the matrix elements, excluding fertilities, and their calculations are perhaps equally difficult.
Here, we reformulate the second derivative calculations using matrix calculus, providing easily computable results. We extend previous results by including not only second derivatives with respect to matrix elements, but also those with respect to any lower-level parameters that may affect the matrix elements, and by presenting the second derivatives of the continuous-time invasion exponent r and the net reproductive rate R0.
The key to our approach is that the calculation of first derivatives using matrix calculus yields a particular expression, the differentiation of which leads directly to the second derivatives. Second derivatives are easily computed by this method in any matrix-oriented language, such as Matlab or R. Although we consider only the second derivatives of population growth rates, our approach extends naturally to other scalar-dependent variables.
In the section ‘A case study: Calathea ovandensis’, we present an example of the calculation of second derivatives in a case study of the tropical herb Calathea ovandensis.
Matrices are denoted by upper-case boldface letters (e.g. A) and vectors by lower-case boldface letters (e.g. w); unless otherwise indicated, all vectors are column vectors. Transposes of matrices and vectors are indicated by the superscript ⊤. The matrix In is the n×n identity matrix, the vector e is a vector of ones, and e1 is a vector with 1 as its first entry and zeros elsewhere. The matrix Km,n is an mn×mn commutation matrix (vec-permutation matrix) (Magnus & Neudecker 1979, Henderson & Searle 1981), which can be calculated using the Matlab function provided in Appendix S1-D. The expression diag(x) indicates the square matrix with x on the diagonal and zeros elsewhere.
The Kronecker product is denoted by X⊗Y and the Hadamard (element-by-element) product by X∘Y. The vec operator (e.g. vecA) stacks the columns of a matrix into a single vector. For convenience, we will write (vecA)⊤ as vec⊤A. We will make frequent use of Roth's theorem (Roth 1934), which states that for any matrices X, Y and Z:

$$\mathrm{vec}(\mathbf{X}\mathbf{Y}\mathbf{Z}) = (\mathbf{Z}^{\top} \otimes \mathbf{X})\,\mathrm{vec}\,\mathbf{Y}$$
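The notation can be checked numerically. The sketch below, in Python with NumPy rather than the Matlab of the Supporting Information, verifies Roth's theorem, vec(XYZ) = (Z⊤ ⊗ X) vecY, on random matrices and builds a commutation matrix satisfying Km,n vecX = vec(X⊤). The helper names `vec` and `commutation_matrix` are hypothetical and are not the functions of Appendix S1-D.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into a single 1-D vector (column-major order)."""
    return np.asarray(M).reshape(-1, order="F")

def commutation_matrix(m, n):
    """Return K_{m,n}, the mn x mn matrix with K @ vec(X) == vec(X.T) for m x n X."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # X[i, j] sits at position j*m + i in vec(X) and i*n + j in vec(X.T).
            K[i * n + j, j * m + i] = 1.0
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 3))
Y = rng.normal(size=(3, 4))
Z = rng.normal(size=(4, 5))

roth_lhs = vec(X @ Y @ Z)              # vec of the triple product
roth_rhs = np.kron(Z.T, X) @ vec(Y)    # Kronecker form from Roth's theorem
K = commutation_matrix(2, 3)

print(np.allclose(roth_lhs, roth_rhs))
print(np.allclose(K @ vec(X), vec(X.T)))
```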
Matrix calculus notation
Matrix calculus is a system for manipulating vectors and matrices in multivariable calculus and simplifies partial derivative calculations by allowing the differentiation of scalar, vector or matrix functions with respect to scalar, vector or matrix arguments. While there are multiple matrix calculus notations, we will use the system of Magnus & Neudecker (1999). For a more detailed introduction to these methods in an ecological context, see Appendix 1 of Caswell (2007).
The first derivative of an m×1 vector y with respect to an n×1 vector x is defined to be the m×n Jacobian matrix

$$\frac{d\mathbf{y}}{d\mathbf{x}^{\top}} = \left(\frac{\partial y_i}{\partial x_j}\right)$$

that is, a matrix whose (i,j) entry is the derivative of yi with respect to xj. We will also write this as an operator D[y;x]; the first argument of D is the vector-valued function y to be differentiated, and the second argument is the vector-valued variable x with respect to which differentiation is carried out. Thus,

$$\mathrm{D}[\mathbf{y};\mathbf{x}] = \frac{d\mathbf{y}}{d\mathbf{x}^{\top}}$$
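As in the scalar case, second derivatives are obtained by differentiating first derivatives. If we consider a scalar-valued function y(x) of a vector-valued argument x, the matrix of second derivatives (the Hessian matrix) is given by the operator

$$\mathrm{H}[y;\mathbf{x}] = \mathrm{D}\!\left[(\mathrm{D}[y;\mathbf{x}])^{\top};\mathbf{x}\right] = \left(\frac{\partial^2 y}{\partial x_i\,\partial x_j}\right) \qquad \text{(eqn 7)}$$

The matrix of second derivatives of a vector-valued function y(x), where y has dimensions m×1, is obtained by stacking the Hessian matrices for each of the elements of y; that is,

$$\mathrm{H}[\mathbf{y};\mathbf{x}] = \begin{pmatrix} \mathrm{H}[y_1;\mathbf{x}] \\ \vdots \\ \mathrm{H}[y_m;\mathbf{x}] \end{pmatrix} \qquad \text{(eqn 9)}$$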
These first and second derivative definitions are written in terms of vector-valued functions and arguments. When matrices appear, they are transformed into vectors using the vec operator, which stacks the columns of the matrix into a column vector. Thus, the first and second derivatives of λ with respect to the entries of the matrix A would be written, respectively, as D[λ;vecA] and H[λ;vecA].
The identification theorems
Magnus & Neudecker (1985, 1999) showed how to obtain first and second derivatives from the differentials of functions. Their 'first identification theorem' showed that

$$d\mathbf{y} = \mathbf{Q}\,d\mathbf{x} \iff \mathrm{D}[\mathbf{y};\mathbf{x}] = \mathbf{Q} \qquad \text{(eqn 10)}$$

That is, if an expression of the form dy=Qdx can be obtained, then the Jacobian matrix of first derivatives is given by Q.
The 'second identification theorem' does the same for the Hessian matrix of second derivatives, showing that

$$d^2 y = d\mathbf{x}^{\top}\,\mathbf{B}\,d\mathbf{x} \iff \mathrm{H}[y;\mathbf{x}] = \tfrac{1}{2}\left(\mathbf{B} + \mathbf{B}^{\top}\right) \qquad \text{(eqn 11)}$$

Thus, our goal will be to find expressions of the form d2y = dx⊤Bdx, where y is a measure of population growth rate and x represents either matrix entries or lower-level parameters; the matrix B will then provide the Hessian matrix using (eqn 11). The key to our approach is to begin with the expression (eqn 10) for the first differential, differentiate it to obtain the second differential and manipulate the results to obtain a matrix B in the form of (eqn 11).
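A small worked case may help fix the second identification theorem. For the quadratic form y = x⊤Mx with a non-symmetric M, the second differential is d2y = dx⊤(2M)dx, so B = 2M and the theorem gives H = ½(B + B⊤) = M + M⊤. The sketch below (Python with NumPy; the matrix M and the helper `numerical_hessian` are hypothetical) compares that result with a central-difference Hessian.

```python
import numpy as np

def numerical_hessian(f, x, h=1e-4):
    """Central-difference Hessian of a scalar function f at the point x."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

M = np.array([[1.0, 2.0],
              [0.0, 3.0]])            # deliberately not symmetric

def y(x):
    """Quadratic form x' M x, whose second differential is dx' (2M) dx."""
    return x @ M @ x

B = 2.0 * M                           # coefficient matrix of the second differential
H_identified = 0.5 * (B + B.T)        # second identification theorem
H_numeric = numerical_hessian(y, np.array([0.5, -1.0]))

print(H_identified)
print(np.round(H_numeric, 6))
```

The symmetrization step matters here: B itself is not symmetric, whereas a Hessian of a twice-differentiable function must be.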
Second derivatives of growth rates
We now apply the identification theorems to three measures of population growth rate, the discrete-time growth rate λ, the continuous-time growth rate r = log λ and the net reproductive rate R0.
Second derivatives of the discrete-time growth rate λ
Second derivatives of λ with respect to matrix entries: H [λ;vecA]
We assume a population projection matrix A of dimension n×n. The discrete-time growth rate λ is the dominant eigenvalue of A. To derive H[λ;vecA], we begin with an expression of the form (eqn 10) for the first differential of λ. As shown in Caswell (2010),

$$d\lambda = \left(\mathbf{w}^{\top} \otimes \mathbf{v}^{\top}\right) d\,\mathrm{vec}\,\mathbf{A} \qquad \text{(eqn 12)}$$
where w and v are the right and left eigenvectors of A corresponding to λ, scaled so that

$$\mathbf{v}^{\top}\mathbf{w} = 1, \qquad \mathbf{e}^{\top}\mathbf{w} = 1$$

where e is an n×1 vector of ones.
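The first-derivative expression can be sketched numerically. With w scaled to sum to one and v scaled so that v⊤w = 1, the Jacobian of λ with respect to vecA is the row vector w⊤ ⊗ v⊤, whose entry for aij is vi wj. The 3×3 matrix and the helper `dominant_eig` below are hypothetical; a central finite difference serves as an independent check.

```python
import numpy as np

def dominant_eig(A):
    """Return (lam, w, v): dominant eigenvalue with right and left eigenvectors,
    scaled so that sum(w) == 1 and v @ w == 1."""
    vals, W = np.linalg.eig(A)
    k = np.argmax(vals.real)
    lam = vals[k].real
    w = W[:, k].real
    w = w / w.sum()
    vals_left, V = np.linalg.eig(A.T)
    v = V[:, np.argmax(vals_left.real)].real
    v = v / (v @ w)
    return lam, w, v

# Hypothetical three-stage projection matrix (not the C. ovandensis matrix).
A = np.array([[0.0, 1.5, 2.0],
              [0.4, 0.3, 0.0],
              [0.0, 0.5, 0.8]])
n = A.shape[0]

lam, w, v = dominant_eig(A)
D = np.kron(w, v)        # w' kron v': entry j*n + i is v_i * w_j = d(lam)/d(a_ij)

# Central finite-difference check for the entry a_21 (row index 1, column index 0).
i, j, h = 1, 0, 1e-6
dA = np.zeros_like(A)
dA[i, j] = h
fd = (dominant_eig(A + dA)[0] - dominant_eig(A - dA)[0]) / (2.0 * h)

print(D[j * n + i], fd)
```

Reshaping D with `D.reshape((n, n), order="F")` recovers the familiar n×n sensitivity matrix.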
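Differentiate (eqn 12) to obtain the second differential

$$d^2\lambda = \left[d\left(\mathbf{w}^{\top} \otimes \mathbf{v}^{\top}\right)\right] d\,\mathrm{vec}\,\mathbf{A} + \left(\mathbf{w}^{\top} \otimes \mathbf{v}^{\top}\right) d^2\mathrm{vec}\,\mathbf{A} \qquad \text{(eqn 15)}$$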
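Because we are calculating second derivatives with respect to A, the second term will drop out because d2vecA = 0 (Magnus & Neudecker 1999). Apply the vec operator and Roth's theorem to the remaining term to obtain

$$d^2\lambda = \left(d\,\mathrm{vec}\,\mathbf{A}\right)^{\top}\left[\left(\mathbf{I}_n \otimes \mathbf{v}\right) d\mathbf{w} + \left(\mathbf{w} \otimes \mathbf{I}_n\right) d\mathbf{v}\right]$$

Writing dw and dv in terms of dvecA puts this in the form d2λ = (dvecA)⊤B dvecA, so that H[λ;vecA] = ½(B + B⊤) by the second identification theorem (eqn 11).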
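Whatever closed form is used for H[λ;vecA], it can be cross-checked numerically. The sketch below is not the closed-form expression of the text: it differentiates the analytic sensitivity vector w⊤ ⊗ v⊤ by central finite differences, one matrix entry at a time, and symmetrizes the result. The two-stage matrix, the step size and the helper names are assumptions; the matrix is chosen because its second derivative with respect to a21 is known analytically.

```python
import numpy as np

def dominant_eig(A):
    """Dominant eigenvalue with eigenvectors scaled so sum(w) == 1 and v @ w == 1."""
    vals, W = np.linalg.eig(A)
    k = np.argmax(vals.real)
    lam = vals[k].real
    w = W[:, k].real
    w = w / w.sum()
    vals_left, V = np.linalg.eig(A.T)
    v = V[:, np.argmax(vals_left.real)].real
    v = v / (v @ w)
    return lam, w, v

def sensitivity(A):
    """D[lam; vecA] = w' kron v', returned as a 1-D array of length n^2."""
    _, w, v = dominant_eig(A)
    return np.kron(w, v)

def hessian_fd(A, h=1e-5):
    """Finite-difference approximation to H[lam; vecA] (n^2 x n^2)."""
    n = A.shape[0]
    H = np.zeros((n * n, n * n))
    for q in range(n * n):
        step = np.zeros(n * n)
        step[q] = h
        dA = step.reshape((n, n), order="F")    # undo the vec operator
        H[:, q] = (sensitivity(A + dA) - sensitivity(A - dA)) / (2.0 * h)
    return 0.5 * (H + H.T)                      # enforce exact symmetry

# Hypothetical two-stage matrix [[0, f], [theta, p]].
f, p, theta = 1.5, 0.6, 0.3
A = np.array([[0.0, f],
              [theta, p]])
H = hessian_fd(A)

# Known second derivative of lam with respect to a_21 = theta for this matrix.
d2_analytic = -2.0 * f**2 / (p**2 + 4.0 * f * theta) ** 1.5
print(H[1, 1], d2_analytic)    # a_21 is entry 1 of vecA
```

For an n×n matrix this costs 2n² eigenvalue problems, so it is best treated as a test of an analytic implementation rather than a replacement for it.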
Second derivatives of λ with respect to lower-level parameters: H [λ; θ]
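Because many life-history traits and environmental factors affect multiple life cycle transitions, the entries of A are usually functions of lower-level parameters. The first derivatives with respect to lower-level parameters are calculated with the chain rule. To calculate the second derivatives of λ with respect to an s×1 vector θ of lower-level parameters, we must develop a chain rule for the Hessian.
To do so, we begin with the first differential of λ in (eqn 12) and differentiate to obtain the second differential (eqn 15). Because we are calculating second derivatives with respect to θ rather than A, d2vecA is no longer zero. By the chain rule,

$$d\,\mathrm{vec}\,\mathbf{A} = \frac{d\,\mathrm{vec}\,\mathbf{A}}{d\boldsymbol{\theta}^{\top}}\,d\boldsymbol{\theta}, \qquad d^2\mathrm{vec}\,\mathbf{A} = \left(\mathbf{I}_{n^2} \otimes d\boldsymbol{\theta}^{\top}\right)\mathrm{H}[\mathrm{vec}\,\mathbf{A};\boldsymbol{\theta}]\,d\boldsymbol{\theta}$$

and hence by the second identification theorem (eqn 11),

$$\mathrm{H}[\lambda;\boldsymbol{\theta}] = \tfrac{1}{2}\left(\mathbf{B} + \mathbf{B}^{\top}\right), \qquad \mathbf{B} = \left(\frac{d\,\mathrm{vec}\,\mathbf{A}}{d\boldsymbol{\theta}^{\top}}\right)^{\top}\mathrm{H}[\lambda;\mathrm{vec}\,\mathbf{A}]\,\frac{d\,\mathrm{vec}\,\mathbf{A}}{d\boldsymbol{\theta}^{\top}} + \left(\mathbf{w}^{\top} \otimes \mathbf{v}^{\top} \otimes \mathbf{I}_s\right)\mathrm{H}[\mathrm{vec}\,\mathbf{A};\boldsymbol{\theta}] \qquad \text{(eqn 38)}$$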
The first and second derivatives of A with respect to θ, which appear in dvecA/dθ⊤ and H[vecA;θ], respectively, can be evaluated by hand or by using a symbolic math program. This result is in agreement with the Hessian chain rule derived in a different way by Magnus & Neudecker (1999, p. 125).
These results can be used to parameterize constraints or covariation among traits. As a simple example, suppose that survival and fertility are constrained to covary as Fi = cPi, and one wants the total second derivative including this constraint. This is obtained by defining a lower-level parameter θ, setting Pi = θ and Fi = cθ and calculating H[λ;θ].
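The Hessian chain rule for lower-level parameters can be sketched as follows, using the standard form H[λ;θ] = J⊤ H[λ;vecA] J + (D[λ;vecA] ⊗ Is) H[vecA;θ], with J = dvecA/dθ⊤. The parameterization is hypothetical: a two-stage matrix in which juvenile survival σ and maturation probability γ both enter two matrix entries. The derivatives of A are written out by hand, and H[λ;vecA] is approximated by finite differences purely to keep the example short. A constraint such as Fi = cPi would be handled in the same way, with only J and H[vecA;θ] changing.

```python
import numpy as np

def dominant_eig(A):
    """Dominant eigenvalue with eigenvectors scaled so sum(w) == 1 and v @ w == 1."""
    vals, W = np.linalg.eig(A)
    k = np.argmax(vals.real)
    lam = vals[k].real
    w = W[:, k].real
    w = w / w.sum()
    vals_left, V = np.linalg.eig(A.T)
    v = V[:, np.argmax(vals_left.real)].real
    v = v / (v @ w)
    return lam, w, v

def sensitivity(A):
    """D[lam; vecA] = w' kron v' as a 1-D array."""
    _, w, v = dominant_eig(A)
    return np.kron(w, v)

def hessian_fd(A, h=1e-5):
    """Finite-difference stand-in for H[lam; vecA]."""
    n = A.shape[0]
    H = np.zeros((n * n, n * n))
    for q in range(n * n):
        step = np.zeros(n * n)
        step[q] = h
        dA = step.reshape((n, n), order="F")
        H[:, q] = (sensitivity(A + dA) - sensitivity(A - dA)) / (2.0 * h)
    return 0.5 * (H + H.T)

def build_A(theta, f=1.5, p=0.8):
    """Hypothetical model: theta = (sigma, gamma)."""
    sigma, gamma = theta
    return np.array([[sigma * (1.0 - gamma), f],
                     [sigma * gamma, p]])

theta = np.array([0.5, 0.4])
sigma, gamma = theta
A = build_A(theta)

# J = dvecA/dtheta' (4 x 2); rows follow vecA = (a11, a21, a12, a22).
J = np.array([[1.0 - gamma, -sigma],
              [gamma, sigma],
              [0.0, 0.0],
              [0.0, 0.0]])

# H[vecA; theta]: the 2 x 2 Hessians of the four entries, stacked (8 x 2).
H_vecA = np.zeros((8, 2))
H_vecA[0:2] = [[0.0, -1.0], [-1.0, 0.0]]    # a11 = sigma * (1 - gamma)
H_vecA[2:4] = [[0.0, 1.0], [1.0, 0.0]]      # a21 = sigma * gamma

D_lam = sensitivity(A).reshape(1, -1)       # 1 x 4 Jacobian of lam
H_lam = hessian_fd(A)                       # 4 x 4 Hessian of lam
H_theta = J.T @ H_lam @ J + np.kron(D_lam, np.eye(2)) @ H_vecA
print(H_theta)
```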
Second derivatives of the invasion exponent r: H[r;vecA] and H[r;θ]
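The population growth rate in continuous time is the invasion exponent r = log λ. By the definition of the Hessian in (eqn 7), the Hessian of r with respect to A is

$$\mathrm{H}[r;\mathrm{vec}\,\mathbf{A}] = \mathrm{D}\!\left[\left(\mathrm{D}[\log\lambda;\mathrm{vec}\,\mathbf{A}]\right)^{\top};\mathrm{vec}\,\mathbf{A}\right]$$

We insert the first derivative of log λ,

$$\mathrm{D}[\log\lambda;\mathrm{vec}\,\mathbf{A}] = \frac{1}{\lambda}\,\mathrm{D}[\lambda;\mathrm{vec}\,\mathbf{A}]$$

and then apply the product rule to obtain

$$\mathrm{H}[r;\mathrm{vec}\,\mathbf{A}] = \frac{1}{\lambda}\,\mathrm{H}[\lambda;\mathrm{vec}\,\mathbf{A}] + \left(\mathrm{D}[\lambda;\mathrm{vec}\,\mathbf{A}]\right)^{\top}\mathrm{D}\!\left[\tfrac{1}{\lambda};\mathrm{vec}\,\mathbf{A}\right]$$

which simplifies to

$$\mathrm{H}[r;\mathrm{vec}\,\mathbf{A}] = \frac{1}{\lambda}\,\mathrm{H}[\lambda;\mathrm{vec}\,\mathbf{A}] - \frac{1}{\lambda^{2}}\left(\mathrm{D}[\lambda;\mathrm{vec}\,\mathbf{A}]\right)^{\top}\mathrm{D}[\lambda;\mathrm{vec}\,\mathbf{A}] \qquad \text{(eqn 42)}$$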
where H[λ;vecA] is given by (25).
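Replacing vecA in (eqn 42) with a parameter vector θ gives the Hessian

$$\mathrm{H}[r;\boldsymbol{\theta}] = \frac{1}{\lambda}\,\mathrm{H}[\lambda;\boldsymbol{\theta}] - \frac{1}{\lambda^{2}}\left(\frac{d\lambda}{d\boldsymbol{\theta}^{\top}}\right)^{\top}\frac{d\lambda}{d\boldsymbol{\theta}^{\top}}$$

The derivatives dλ/dθ⊤ can be calculated by hand or with a symbolic math program, and H[λ;θ] can be obtained from (eqn 38).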
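The relation between the Hessians of r and λ is easy to apply once the gradient and Hessian of λ are in hand: H[r;θ] = H[λ;θ]/λ − (dλ/dθ⊤)⊤(dλ/dθ⊤)/λ². The sketch below uses a hypothetical two-stage matrix [[0, f], [s, p]] with θ = (f, s), for which λ and its derivatives are available in closed form; the function name `hessian_log_growth` is an assumption.

```python
import numpy as np

def hessian_log_growth(lam, D, H):
    """Hessian of r = log(lam), given lam, its gradient D (1-D) and its Hessian H."""
    D = np.asarray(D, dtype=float)
    return H / lam - np.outer(D, D) / lam**2

p = 0.6    # fixed adult survival in the hypothetical matrix [[0, f], [s, p]]

def lam_fn(x):
    """Dominant eigenvalue as a function of x = (f, s)."""
    f, s = x
    return 0.5 * (p + np.sqrt(p**2 + 4.0 * f * s))

x0 = np.array([1.5, 0.3])
f, s = x0
q = np.sqrt(p**2 + 4.0 * f * s)
lam = lam_fn(x0)

# Closed-form gradient and Hessian of lam with respect to (f, s).
D = np.array([s / q, f / q])
cross = 1.0 / q - 2.0 * f * s / q**3
H = np.array([[-2.0 * s**2 / q**3, cross],
              [cross, -2.0 * f**2 / q**3]])

H_r = hessian_log_growth(lam, D, H)
print(H_r)
```

The subtracted outer product is always positive semidefinite, so the pure second derivatives of r are never larger than those of λ divided by λ.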
Second derivatives of the net reproductive rate R0
The net reproductive rate R0 measures the population growth rate per generation and is used as an alternative fitness measure to r under some special conditions (Pásztor, Meszéna & Kisdi 1996, Brommer 2000). If A is decomposed into transition and fertility matrices, A = U+F, then R0 is the dominant eigenvalue of the next generation matrix R = FN (Cushing & Zhou 1994), where N is the fundamental matrix:

$$\mathbf{N} = \left(\mathbf{I}_n - \mathbf{U}\right)^{-1}$$
The (i,j) entry of N gives the expected number of visits to stage i for an individual starting in stage j. The (i,j) entry of R gives the expected lifetime production of stage i offspring by an individual starting in stage j.
Because R0 is an eigenvalue, our results for H[λ;vecA] and H[λ;θ] can be applied to find its second derivatives, but with R taking the place of matrix A. The resulting expressions are more complicated than the corresponding expressions for λ, because parameters can affect R0 through U, F or both. In the important special case where only a single type of offspring is produced (suppose it is numbered as stage 1), then R is an upper triangular matrix and R0 is its (1,1) entry; in this case, eigenvalue calculations are not necessary.
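The construction of R0 can be sketched directly from the definitions: N = (I − U)⁻¹, R = FN, and R0 is the dominant eigenvalue of R. The three-stage U and F below are hypothetical, with all offspring entering stage 1, so the example also illustrates the special case in which R0 is simply the (1,1) entry of R.

```python
import numpy as np

# Hypothetical transition (U) and fertility (F) matrices; offspring enter stage 1 only.
U = np.array([[0.0, 0.0, 0.0],
              [0.4, 0.3, 0.0],
              [0.0, 0.5, 0.8]])
F = np.array([[0.0, 1.5, 2.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
n = U.shape[0]

N = np.linalg.inv(np.eye(n) - U)    # fundamental matrix: expected visits to each stage
R = F @ N                           # next generation matrix
R0 = np.max(np.linalg.eigvals(R).real)
lam = np.max(np.linalg.eigvals(U + F).real)

print("R0 from eigenvalues:", R0)
print("R0 as R[0, 0]:      ", R[0, 0])
print("lambda:             ", lam)
```

Because R0 and λ lie on the same side of one, comparing the two is a quick consistency check on a decomposition A = U + F.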
We defer the fully general calculation of H[R0;θ] to Appendix S1-C and show results here for two useful special cases: the second derivatives with respect to the entries of the transition matrix U and with respect to the entries of the fertility matrix F. We consider both single and multiple types of offspring.
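If we apply (eqn 38) to the case of R0, replacing vecA with vecR, we obtain

$$\mathrm{H}[R_0;\boldsymbol{\theta}] = \tfrac{1}{2}\left(\mathbf{B} + \mathbf{B}^{\top}\right), \qquad \mathbf{B} = \left(\frac{d\,\mathrm{vec}\,\mathbf{R}}{d\boldsymbol{\theta}^{\top}}\right)^{\top}\mathrm{H}[R_0;\mathrm{vec}\,\mathbf{R}]\,\frac{d\,\mathrm{vec}\,\mathbf{R}}{d\boldsymbol{\theta}^{\top}} + \left(\mathbf{w}_R^{\top} \otimes \mathbf{v}_R^{\top} \otimes \mathbf{I}_s\right)\mathrm{H}[\mathrm{vec}\,\mathbf{R};\boldsymbol{\theta}] \qquad \text{(eqn 47)}$$

where wR and vR are the right and left eigenvectors of R corresponding to R0.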
To evaluate (47), we must calculate the second derivatives of R0 with respect to R, and the first and second derivatives of R with respect to θ. For the former, the Hessian H[R0;vecR] is given by (25), using the dominant eigenvalues and eigenvectors of R rather than those of A. For the latter, we will consider the derivatives of R with respect to U and F in turn. The derivatives of R with respect to general parameters θ are shown in Appendix S1-B.
Second derivatives of R0 with respect to the transition matrix: H[R0;vecU]
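The second derivatives of R0 with respect to the entries of the transition matrix U require the first and second derivatives of R with respect to U. The first derivatives are obtained by differentiating R = FN, applying the vec operator and noting that dvecN = (N⊤⊗N)dvecU (Caswell 2006, 2009), to obtain

$$\frac{d\,\mathrm{vec}\,\mathbf{R}}{d\,\mathrm{vec}^{\top}\mathbf{U}} = \mathbf{N}^{\top} \otimes \mathbf{R}$$

The second derivatives of R are obtained from the definition of the Hessian matrix (eqn 9):

$$\mathrm{H}[\mathrm{vec}\,\mathbf{R};\mathrm{vec}\,\mathbf{U}] = \frac{d\,\mathrm{vec}\left(\mathbf{N} \otimes \mathbf{R}^{\top}\right)}{d\,\mathrm{vec}^{\top}\mathbf{U}}$$

The derivative of vec(N⊗R⊤) is given by a result of Magnus & Neudecker (1985, Theorem 11; 1999, p. 209); for an m×n matrix X and a p×q matrix Y,

$$d\,\mathrm{vec}(\mathbf{X} \otimes \mathbf{Y}) = \left(\mathbf{I}_n \otimes \mathbf{K}_{q,m} \otimes \mathbf{I}_p\right)\left[\left(\mathbf{I}_{mn} \otimes \mathrm{vec}\,\mathbf{Y}\right) d\,\mathrm{vec}\,\mathbf{X} + \left(\mathrm{vec}\,\mathbf{X} \otimes \mathbf{I}_{pq}\right) d\,\mathrm{vec}\,\mathbf{Y}\right]$$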
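In the common case where there is only one type of offspring (Appendix S1-B), R0 = e1⊤Re1 is linear in the entries of R, and H[R0;θ] simplifies to

$$\mathrm{H}[R_0;\boldsymbol{\theta}] = \left(\mathbf{e}_1^{\top} \otimes \mathbf{e}_1^{\top} \otimes \mathbf{I}_s\right)\mathrm{H}[\mathrm{vec}\,\mathbf{R};\boldsymbol{\theta}]$$

where e1 is the n×1 vector with 1 as its first entry and zeros elsewhere.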
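The first-derivative step for the transition matrix can be checked numerically. Since dN = N(dU)N, differentiating R = FN gives dvecR = (N⊤ ⊗ R) dvecU. The sketch below compares that Jacobian with central finite differences for a hypothetical three-stage U and F, and then extracts the sensitivity of R0 for the single-offspring case, where R0 = e1⊤Re1. Helper names are assumptions.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into a 1-D vector."""
    return np.asarray(M).reshape(-1, order="F")

def next_generation(U, F):
    """Return (R, N) with N = inv(I - U) and R = F N."""
    N = np.linalg.inv(np.eye(U.shape[0]) - U)
    return F @ N, N

# Hypothetical matrices; all offspring enter stage 1.
U = np.array([[0.0, 0.0, 0.0],
              [0.4, 0.3, 0.0],
              [0.0, 0.5, 0.8]])
F = np.array([[0.0, 1.5, 2.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
n = U.shape[0]

R, N = next_generation(U, F)
J = np.kron(N.T, R)                  # analytic dvecR/dvec'U, n^2 x n^2

# Central finite-difference version of the same Jacobian.
h = 1e-6
J_fd = np.zeros_like(J)
for q in range(n * n):
    step = np.zeros(n * n)
    step[q] = h
    dU = step.reshape((n, n), order="F")
    R_plus = next_generation(U + dU, F)[0]
    R_minus = next_generation(U - dU, F)[0]
    J_fd[:, q] = (vec(R_plus) - vec(R_minus)) / (2.0 * h)

# Single offspring type: R0 = e1' R e1, so its sensitivity is (e1' kron e1') J.
e1 = np.zeros(n)
e1[0] = 1.0
dR0_dvecU = np.kron(e1, e1) @ J
print(dR0_dvecU.reshape((n, n), order="F"))
```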
A case study: Calathea ovandensis
Calathea ovandensis is a neotropical perennial herb that inhabits forest understories. Horvitz & Schemske (1995) developed a stage-structured matrix model for C. ovandensis that contains eight stages distinguished by size and reproductive ability: seeds, nonreproductive stages (seedlings, juveniles, pre-reproductive), and reproductive stages (small, medium, large and extra large). Plants may grow larger, remain in the same size class, or shrink at each time step; larger adults are typically more fecund.
Horvitz and Schemske summarized four years of population dynamics from four plots of C. ovandensis with a series of 8×8 projection matrices. The average of these matrices, weighted by the observed stage abundances and transition frequencies, is given in Table 8 of Horvitz & Schemske (1995) as
The dominant eigenvalue of this matrix is 0·9923, indicating a near-steady state population.
To obtain the second derivatives of λ to the entries of A, we calculated the Hessian H[λ; vecA] using (25). It is a symmetric 64 × 64 matrix (Fig. 1). In this example, and in others with large projection matrices, H[λ; vecA] contains many entries and may be difficult to interpret, even when entries that are fixed at 0 are omitted. Most of the second derivatives here are small in magnitude (Fig. 1b) with the exception of a few entries, including the highly negative and ∂2λ/∂a3,1∂a4,2 = −75·64, where a3,1 is the transition probability from seed to juvenile and a4,2 is the transition probability from seedling to pre-reproductive.
Using (eqn 38), we calculated the Hessian H[λ;θ] for a set of lower-level parameters θ. For example, the stage-specific survival probabilities are lower-level parameters that affect multiple matrix entries. To analyse these using (eqn 38), write the survival probabilities in a vector σ, which is given by the column sums of U, so that

$$\mathbf{U} = \mathbf{G}\,\mathrm{diag}(\boldsymbol{\sigma})$$

where G describes stage transitions conditional on survival (Caswell 2011). The Hessian of λ with respect to σ is given by (eqn 38), with the parameter vector θ given by σ. Calculating this Hessian requires the first and second derivatives of A with respect to σ. The first derivatives, assuming that F does not depend on σ (i.e. prebreeding census), are

$$\frac{d\,\mathrm{vec}\,\mathbf{A}}{d\boldsymbol{\sigma}^{\top}} = \left(\mathbf{I}_n \otimes \mathbf{G}\right)\mathrm{diag}\!\left(\mathrm{vec}\,\mathbf{I}_n\right)\left(\mathbf{e} \otimes \mathbf{I}_n\right) \qquad \text{(eqn 61)}$$

(see Caswell and Salguero-Gómez 2013, Appendix A).
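The second derivatives of A are given by H[vecA;σ], the derivative of (eqn 61) with respect to σ. However, none of the terms in (eqn 61) depend on σ, so H[vecA;σ] is a zero matrix. Thus, the matrix B in (eqn 38) reduces to

$$\mathbf{B} = \left(\frac{d\,\mathrm{vec}\,\mathbf{A}}{d\boldsymbol{\sigma}^{\top}}\right)^{\top}\mathrm{H}[\lambda;\mathrm{vec}\,\mathbf{A}]\,\frac{d\,\mathrm{vec}\,\mathbf{A}}{d\boldsymbol{\sigma}^{\top}}$$

where dvecA/dσ⊤ is given by (eqn 61) and H[λ;vecA] is given by (eqn 25).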
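The survival parameterization can be sketched as follows, reading the description of σ and G as U = G diag(σ), with σ the column sums of U. Under that assumption vecU is linear in σ, with constant Jacobian (In ⊗ G) diag(vec In)(e ⊗ In), which is why the second derivatives of A with respect to σ vanish. The three-stage U is hypothetical and the arrangement of the Jacobian is one of several equivalent forms.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into a 1-D vector."""
    return np.asarray(M).reshape(-1, order="F")

# Hypothetical transition matrix (not the C. ovandensis matrix).
U = np.array([[0.0, 0.0, 0.0],
              [0.4, 0.3, 0.0],
              [0.0, 0.5, 0.8]])
n = U.shape[0]

sigma = U.sum(axis=0)    # stage-specific survival: column sums of U
G = U / sigma            # transitions conditional on survival (divides column j by sigma[j])

# Jacobian dvecU/dsigma' = (I kron G) diag(vec I) (e kron I); it does not depend on sigma.
e = np.ones((n, 1))
J = np.kron(np.eye(n), G) @ np.diag(vec(np.eye(n))) @ np.kron(e, np.eye(n))

# Linearity check: vec(G diag(s)) == J @ s for any survival vector s.
s_new = np.array([0.5, 0.9, 0.7])
print(np.allclose(J @ sigma, vec(U)))
print(np.allclose(J @ s_new, vec(G @ np.diag(s_new))))
```

Given any n²×n² array `H_lam` holding H[λ;vecA] (a hypothetical name), the survival Hessian would then be `J.T @ H_lam @ J`.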
The resulting Hessian matrix with respect to the lower-level survival probabilities, H[λ;σ], is shown in Fig. 2. These second derivatives are generally of smaller magnitude than those of H[λ;vecA] (Fig. 1). The largest second derivatives in H[λ;σ] appear in rows 1 and 2 (equivalently, columns 1 and 2). Figure 3 highlights the mixed second derivatives ∂²λ/∂σ1∂σi and ∂²λ/∂σ2∂σi, along with the pure second derivatives ∂²λ/∂σi².
C. ovandensis has several large second derivatives involving σ1 and σ2 (the first two rows or columns of Fig. 2, which are shown separately in Fig. 3b,c). As discussed in the section ‘Second-order sensitivity analysis and growth rate estimation’, this indicates that the sensitivity of λ to stage 1 (seed) and stage 2 (seedling) survival will be especially responsive to changes in later survival. Similarly, the sensitivity of λ to later survival is especially responsive to changes in seed and seedling survival.
When interpreted in terms of selection gradients, recall from the section ‘Characterizing nonlinear selection processes’ that selection on a single trait is concave, linear or convex if ∂²λ/∂θ² is negative, zero or positive. Selection on two traits is negatively or positively correlational if ∂²λ/∂θ1∂θ2 is negative or positive. C. ovandensis is experiencing nearly linear selection on survival in stage 8 (∂²λ/∂σ8² ≈ 0), concave selection on survival in stage 2, and convex selection on survival in stages 1, 3, 4 and 5. There is negative correlational selection between survival in stage 1 (seeds) or 2 (seedlings) and survival in stages 4–8 (adults), and positive correlational selection between seed or seedling survival and survival in stages 1–3 (pre-adults). This indicates that seed and seedling survival are being selected to decrease their correlation with adult survival, but to increase their correlation with pre-adult survival.
Because the Hessian matrices include second derivatives with respect to all possible pairs of characters (matrix entries or lower-level parameters), they contain a great deal of information, and there are no established standards for displaying the results. We have shown several possibilities that may be useful: colour plots, plots that remove matrix entries that are of no interest because they are structural zeros, and plots displaying the range of magnitudes of the second derivatives. Others will no doubt be developed. The Matlab code used to generate the analysis is included in the Supporting Information.
Sensitivity analysis of stochastic growth rates
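An application in which second derivatives are not the objective, but in which the Hessian matrix plays a role, is the sensitivity of Tuljapurkar's small-noise approximation to the stochastic growth rate log λs (section ‘Sensitivity of the stochastic growth rate’). Tuljapurkar's approximation can be written in terms of the Jacobian matrix D = D[λ;vecĀ] of first derivatives of the dominant eigenvalue of the mean projection matrix, Ā. Assuming that environments are uncorrelated in time,

$$\log\lambda_s \approx \log\lambda - \frac{1}{2\lambda^{2}}\,\mathbf{D}\,\mathbf{C}\,\mathbf{D}^{\top} \qquad \text{(eqn 63)}$$

where C is the covariance matrix of the entries of the population projection matrix,

$$\mathbf{C} = E\!\left[\mathrm{vec}\!\left(\mathbf{A} - \bar{\mathbf{A}}\right)\mathrm{vec}^{\top}\!\left(\mathbf{A} - \bar{\mathbf{A}}\right)\right]$$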
The sensitivity of the stochastic growth rate can be obtained by differentiating (eqn 63) with respect to the entries of Ā. The sensitivity of the stochastic growth rate to Ā, leaving the variances and covariances fixed, depends on the second derivatives of λ as

$$\frac{d\log\lambda_s}{d\,\mathrm{vec}^{\top}\bar{\mathbf{A}}} \approx \left(\frac{1}{\lambda} + \frac{\mathbf{D}\,\mathbf{C}\,\mathbf{D}^{\top}}{\lambda^{3}}\right)\mathbf{D} - \frac{1}{\lambda^{2}}\,\mathbf{D}\,\mathbf{C}\,\mathbf{H} \qquad \text{(eqn 65)}$$

where D = D[λ;vecĀ] and H = H[λ;vecĀ] is the Hessian matrix of second derivatives. A derivation of (eqn 65) is provided in Appendix S1-C. Much more powerful and general approaches to sensitivity analysis of the stochastic growth rate are available in recent developments of the Monte Carlo method (e.g. Caswell 2005, 2010, Tuljapurkar, Horvitz & Pascarella 2003, Haridas & Tuljapurkar 2005, Horvitz, Tuljapurkar & Pascarella 2005). This approximate result may, however, be useful in situations where the stochastic environment is defined directly in terms of the covariance matrix C of the vital rates.
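The role of the Hessian in this calculation can be sketched numerically. Differentiating the small-noise approximation log λs ≈ log λ − DCD⊤/(2λ²) with C held fixed gives (1/λ + DCD⊤/λ³)D − DCH/λ², where H is the Hessian of λ. The sketch below evaluates that expression for a hypothetical mean matrix and a hypothetical diagonal covariance matrix (a 20% coefficient of variation on each entry), with H approximated by finite differences. It is an illustration of the approximation under those assumptions, not of the simulation-based methods cited above.

```python
import numpy as np

def dominant_eig(A):
    """Dominant eigenvalue with eigenvectors scaled so sum(w) == 1 and v @ w == 1."""
    vals, W = np.linalg.eig(A)
    k = np.argmax(vals.real)
    lam = vals[k].real
    w = W[:, k].real
    w = w / w.sum()
    vals_left, V = np.linalg.eig(A.T)
    v = V[:, np.argmax(vals_left.real)].real
    v = v / (v @ w)
    return lam, w, v

def sensitivity(A):
    """D[lam; vecA] = w' kron v' as a 1-D array."""
    _, w, v = dominant_eig(A)
    return np.kron(w, v)

def hessian_fd(A, h=1e-5):
    """Finite-difference stand-in for H[lam; vecA]."""
    n = A.shape[0]
    H = np.zeros((n * n, n * n))
    for q in range(n * n):
        step = np.zeros(n * n)
        step[q] = h
        dA = step.reshape((n, n), order="F")
        H[:, q] = (sensitivity(A + dA) - sensitivity(A - dA)) / (2.0 * h)
    return 0.5 * (H + H.T)

def log_lambda_s(A_bar, C):
    """Small-noise approximation to the stochastic growth rate."""
    lam = dominant_eig(A_bar)[0]
    D = sensitivity(A_bar)
    return np.log(lam) - (D @ C @ D) / (2.0 * lam**2)

def sensitivity_log_lambda_s(A_bar, C):
    """Derivative of the approximation with respect to vec(A_bar), C held fixed."""
    lam = dominant_eig(A_bar)[0]
    D = sensitivity(A_bar)
    H = hessian_fd(A_bar)
    tau2 = D @ C @ D
    return (1.0 / lam + tau2 / lam**3) * D - (D @ C @ H) / lam**2

# Hypothetical mean matrix and covariance matrix (assumed values).
A_bar = np.array([[0.0, 1.5, 2.0],
                  [0.4, 0.3, 0.0],
                  [0.0, 0.5, 0.8]])
C = np.diag((0.2 * A_bar.reshape(-1, order="F")) ** 2)

grad = sensitivity_log_lambda_s(A_bar, C)
print(grad.reshape(A_bar.shape, order="F"))
```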
Although the first derivatives of population growth rates are commonly used in ecology and demography, tools for calculating the second derivatives are not nearly as well-established, even though second derivatives also have a variety of potential applications. To this end, we have derived new, more easily computable formulae for the second derivatives of three population growth rate measures – the discrete-time growth rate λ, the continuous-time growth rate r, and the per-generation growth rate R0 – both with respect to projection matrix entries and to lower-level parameters. Table 2 provides an overview of the results, with directions to the equations defining the Hessian matrix, containing all second-order partial derivatives, for each type of growth rate and each type of independent variable.
Table 2. An overview of the formulae for the second derivatives of population growth rates (λ, r, R0) with respect to matrix entries (A, U, F), or to lower-level parameters (θ,σ). The equation number for the corresponding Hessian matrix is given in the third column; auxiliary equations for terms in the Hessian expressions are given in the fourth column. The Matlab functions used to calculate each Hessian, as provided in the supplemental material, are listed in the last column
The matrix calculus approach is comprehensive, and even though the formulae may appear complicated, they are easy to apply with any matrix-oriented software. Other methods for finding second derivatives are either more limited or require more difficult and error-prone calculations. Cohen (1978), for instance, derives the second pure derivatives of λ with respect to the diagonal elements of the projection matrix (∂²λ/∂aii²) only. The approaches of Deutsch & Neumann (1984) and Kirkland & Neumann (1994) rely on the calculation of group inverses, while those of Caswell (1996) require all the eigenvalues and eigenvectors of the projection matrix. The method of McCarthy et al. (2008) uses transfer functions rather than eigenvectors and is more complicated when handling lower-level parameters.
Population growth rate, no matter how it is measured, is important in many ecological and evolutionary problems. It is hoped that the methods presented here will contribute to a deeper understanding of the response of growth rates to changes in parameters.
This work was supported by a National Science Foundation Graduate Research Fellowship under Grant 1122374, by NSF Grants DEB-1145017 and DEB-1257545, and by Advanced Grant 322989 from the European Research Council. We thank Dave Koons and an anonymous reviewer for helpful comments.
MATLAB scripts: uploaded as online supporting information