Keywords:

  • age-specific mortality;
  • Gompertz model;
  • life-histories;
  • maximum likelihood;
  • senescence

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Background
  5. Analysing mortality data
  6. Simulations
  7. Discussion
  8. Acknowledgments
  9. References
  10. Appendix

Demographic studies focusing on age-specific mortality rates are becoming increasingly common throughout the fields of life-history evolution, ecology and biogerontology. Well-defined statistical techniques for quantifying patterns of mortality within a cohort and identifying differences in age-specific mortality among cohorts are needed. Here I discuss using maximum likelihood (ML) statistical methods to estimate the parameters of mathematical models, which are used to describe the change in mortality with age. ML provides a convenient and powerful framework for choosing an adequate mortality model, estimating model parameters and testing hypotheses about differences in parameters among experimental or ecological treatments. Simulations suggest that experiments designed to estimate age-specific mortality should involve at least 100-500 individuals per cohort per treatment. Significant bias in the estimation of model parameters is introduced when the mortality model is misspecified and samples are too small to detect the true mortality pattern. Furthermore, the lack of simple and efficient procedures for comparing different mortality models has forced the use of the Gompertz model, which specifies an exponentially increasing mortality with age, and which may not apply to the majority of experimental systems.


Introduction

It is clear that the fields of experimental and evolutionary demography are growing (Curtsinger et al., 1995; Vaupel et al., 1998), and well-defined techniques for identifying differences in phenotypic and genetic patterns of age-specific mortality are needed. Frequently, a mathematical model is fit to observed mortality rates, and hypotheses concerning parameter values among treatment populations are investigated (Fukui et al., 1993; Nusbaum et al., 1996). There are several considerations to keep in mind when using this approach. First, we require an objective method for choosing an adequate model for the data. Historically, the Gompertz model, which predicts an exponential (log-linear) increase in mortality rates with age, was used almost exclusively for mortality analysis. Often this model was fit to data that were clearly not linear on the log scale (for a discussion see Promislow et al., 1997), and recent observations of more complex mortality patterns suggest non-Gompertzian dynamics are the norm rather than the exception (Curtsinger et al., 1995; Pletcher & Curtsinger, 1998).

After choosing the appropriate model, we need an efficient method for estimating the parameters of that model from the data. Traditionally, Gompertz parameters were estimated by linear regression of log-mortality rates on age (Hughes & Charlesworth, 1994; Orr & Sohal, 1994; Stearns & Kaiser, 1996). Parameters estimated in this way can be highly biased: small samples result in large over-estimates of the initial mortality parameter and under-estimates of the rate parameter (Mueller et al., 1995; Promislow et al., 1997; Pletcher, unpublished results). Moreover, variation in sample sizes among populations can generate apparent differences in parameter values (Promislow et al., 1997). Although various forms of nonlinear regression have been suggested for fitting mortality models (Wilson, 1994; Eakin et al., 1995; Hughes, 1995), well-defined techniques for objectively comparing the fit of different models from this method are lacking.

In many cases evolutionary biologists are also interested in testing hypotheses about the parameter values from two or more treatment populations. For example, Tatar et al. (1993) were interested in whether the differences in longevity between populations of bean beetles allowed to vary in reproductive effort resulted from changes in the rate of ageing (senescence) or from a proportional decrease in mortality at all ages. This amounts to asking whether treatment populations differ in certain parameters of a mortality model.

It is the goal of this paper to discuss statistical techniques, based on the ideas of maximum likelihood, for analysing mortality data using mathematical models. Maximum likelihood provides a simple and powerful framework for mortality analysis that consists of: (i) objectively choosing a mortality model that adequately describes the data, (ii) estimating the parameters of that model and (iii) testing hypotheses about differences in parameter values among treatment populations. The methods are then used to investigate how large mortality experiments should be to answer questions about the mortality patterns of experimental populations. A set of computer packages that provides easy implementation of the ideas presented here is freely available from the author for IBM (DOS and Windows) and UNIX environments.

Background

Mortality distributions

Demographers and survival analysts have come to represent death rates as the age-specific 'hazard' (or instantaneous risk of failure) rather than the probability of death (Gavrilov & Gavrilova, 1991; Lee, 1992). As a result, the common mortality models are often represented by their hazard functions. Several formulae have been proposed for describing the relationship between age and hazard (hereafter I use hazard and mortality interchangeably) (Vaupel et al., 1987; Gavrilov & Gavrilova, 1991; Curtsinger et al., 1992; Fukui et al., 1993), but I concentrate on a family of four models: Gompertz, Gompertz–Makeham, logistic and logistic–Makeham.

Hazard functions for the four mortality models are given in Table 1. The Gompertz equation is the most common model used by biologists and demographers to characterize age-specific mortality parameters. It assumes mortality increases exponentially with age. The mortality rate at age x, μx, is determined by two parameters: the initial mortality rate, λ, and the exponential increase in mortality with age, γ. Recent demographic experiments using large sample sizes have documented a significant deceleration of mortality rates at advanced ages (Carey et al., 1992; Curtsinger et al., 1992; Pletcher, 1996; Promislow et al., 1996), and the logistic model has been shown to provide a better description of mortality rate in such cases. In this model (Table 1) the s parameter describes the degree of deceleration in mortality at older ages, λ is the mortality rate at birth, and γ is the rate of exponential increase in mortality with age early in life (Vaupel, 1990). Higher values of s indicate more rapid levelling off of mortality at advanced ages. An s value of zero indicates no deceleration and the logistic model reduces to the Gompertz.

Table 1. Hazard and probability density functions for four mathematical models of mortality. Equations are given as functions of t, the age at death, and they are parameterized by up to four parameters: λ, γ, s and c. These functions and methods for deriving them can be found in Vaupel & Yashin (1985).

The Gompertz and logistic models assume senescent (age-dependent) factors completely determine observed mortality rates and are expressed from birth (age 0). To account for extrinsic (age-independent) causes of death, mortality can be represented as the sum of a constant (age-independent) term and an age-dependent term. Adding a constant to the Gompertz equation produces the well-known Gompertz–Makeham model, and extending the logistic model to account for age-independent mortality gives the logistic–Makeham model (Table 1).
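For reference, the four hazard functions are commonly written in the following form. This is a sketch of one standard parameterization that matches the verbal description above (setting s = 0 recovers the Gompertz, and setting c = 0 removes the Makeham term); it may differ in notation from Table 1.

```latex
\begin{align*}
\text{Gompertz:} \quad & \mu(t) = \lambda e^{\gamma t} \\
\text{Gompertz--Makeham:} \quad & \mu(t) = c + \lambda e^{\gamma t} \\
\text{Logistic:} \quad & \mu(t) = \frac{\lambda e^{\gamma t}}{1 + \dfrac{s\lambda}{\gamma}\left(e^{\gamma t} - 1\right)} \\
\text{Logistic--Makeham:} \quad & \mu(t) = c + \frac{\lambda e^{\gamma t}}{1 + \dfrac{s\lambda}{\gamma}\left(e^{\gamma t} - 1\right)}
\end{align*}
% In every case the probability density follows from the hazard:
% f(t) = \mu(t)\,S(t), \qquad S(t) = \exp\!\left(-\int_0^t \mu(u)\,du\right).
% For the Gompertz model, S(t) = \exp\!\left[-\tfrac{\lambda}{\gamma}\left(e^{\gamma t} - 1\right)\right].
```

Under this form the logistic hazard approaches the plateau γ/s at advanced ages, which is why larger values of s correspond to more rapid levelling off.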

In the evolutionary literature, age-specific mortality is often analysed by using linear or nonlinear regression with age-specific mortality (or its natural logarithm) as the dependent variable and age as the predictor variable (Hughes & Charlesworth, 1994; Stearns & Kaiser, 1996). The mortality rate at age x can be estimated from empirical data using the approximation

$$\mu_x \approx -\ln(p_x) \tag{1}$$

where px is the probability an individual alive at age x survives to age x + 1 (Carey, 1993). Although plots of age-specific hazard are useful for visualizing data, for purposes of estimation and hypothesis testing the observed hazard is not useful and may be misleading. For example, observed mortality is a threshold character, and the smallest measurable mortality rate is 1/Nx, where Nx is the number of individuals alive at the start of age x. Thus, when mortality rates are very low, observed values may be much higher than the true mortality rates and may appear constant when in fact they are increasing (Promislow et al., 1997). In this paper I focus on the distribution of ages at death. Transformations of the data, to survivorship or mortality rates, often result in a loss of information or the introduction of substantial bias (Eakin et al., 1995).
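The threshold effect can be seen in a small numerical sketch. The code below assumes the approximation μx ≈ −ln(px) and uses a hypothetical cohort of 50 individuals; the function name and the counts are illustrative only.

```python
import math

def observed_hazard(alive_at_start):
    """Return (age, N_x, p_x, mu_x) for each age interval.

    alive_at_start[x] is N_x, the number alive at the start of age x.
    mu_x is estimated as -ln(p_x), where p_x = N_{x+1} / N_x.
    """
    rows = []
    for x in range(len(alive_at_start) - 1):
        n_x, n_next = alive_at_start[x], alive_at_start[x + 1]
        if n_x == 0:
            break  # cohort is extinct; no further intervals to report
        p_x = n_next / n_x
        # When nobody survives the interval the estimate is unbounded.
        mu_x = -math.log(p_x) if p_x > 0 else math.inf
        rows.append((x, n_x, p_x, mu_x))
    return rows

# Hypothetical survivor counts for a cohort of 50 individuals.
counts = [50, 50, 49, 49, 47, 40, 25, 5, 0]
for age, n_x, p_x, mu_x in observed_hazard(counts):
    print(age, n_x, round(p_x, 3), round(mu_x, 4))
```

With 50 individuals alive, a single death already gives an observed hazard of about 0.02 (roughly 1/Nx), and intervals with no deaths give exactly zero, so true hazards well below 0.02 cannot be resolved at this sample size.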

Maximum likelihood estimation

Assuming the observed deaths are independent, the likelihood of observing specific ages at death for N individuals is

$$L(\theta) = \prod_{i=1}^{N} f(x_i \mid \theta) \tag{2}$$

(Fisher, 1921) where θ is a vector of parameters in the model, f is the probability density function for the model, and xi is the age at death for the ith individual. The ML estimates (MLEs) are those values of the parameters (θ) that maximize this function or, equivalently, the natural logarithm of it,

$$\mathcal{L}(\theta) = \log L(\theta) = \sum_{i=1}^{N} \log f(x_i \mid \theta) \tag{3}$$

where $\mathcal{L}$ is the natural logarithm of the likelihood $L$. Probability density functions for the four models are provided in Table 1.

The asymptotic (large-sample) properties of ML are well-known and desirable (Searle et al., 1992; Lindgren, 1993). MLEs are asymptotically efficient (i.e. they have the smallest allowable variance) and are normally distributed. Simulation results suggest that, for the estimation of mortality models, MLEs are also unbiased or very nearly so (Mueller et al., 1995; Promislow et al., 1997; Pletcher, unpublished data). Because standard errors of the estimates are readily obtained from the maximization procedure, simple hypothesis tests are straightforward.
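As a minimal sketch of the estimation step, the following Python code fits the Gompertz model to simulated ages at death by maximizing the log-likelihood. It assumes the Gompertz density f(x) = λe^{γx} exp[−(λ/γ)(e^{γx} − 1)]. For a fixed γ the maximizing λ has the closed form λ = Nγ / Σ(e^{γx} − 1), so only a one-dimensional search over γ is needed. The function names, the inverse-transform sampler and the golden-section search are illustrative choices, not the author's software.

```python
import math
import random

def simulate_gompertz(n, lam, gam, rng):
    """Draw n ages at death by inverting the Gompertz survival function."""
    # S(t) = exp(-(lam/gam) * (exp(gam*t) - 1)); solve S(t) = u for t.
    return [math.log(1.0 - (gam / lam) * math.log(1.0 - rng.random())) / gam
            for _ in range(n)]

def profile_loglik(gam, ages):
    """Gompertz log-likelihood at gam, with lam set to its conditional MLE."""
    n = len(ages)
    total = sum(math.expm1(gam * x) for x in ages)
    lam = n * gam / total
    loglik = n * math.log(lam) + gam * sum(ages) - (lam / gam) * total
    return loglik, lam

def fit_gompertz(ages, lo=1e-4, hi=1.0, tol=1e-8):
    """Maximize the profile log-likelihood over gam by golden-section search.

    Assumes the profile is unimodal on [lo, hi].
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if profile_loglik(c, ages)[0] > profile_loglik(d, ages)[0]:
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    gam = (a + b) / 2.0
    loglik, lam = profile_loglik(gam, ages)
    return lam, gam, loglik

rng = random.Random(1)
ages = simulate_gompertz(2000, 0.001, 0.10, rng)
lam_hat, gam_hat, loglik = fit_gompertz(ages)
print(lam_hat, gam_hat, loglik)
```

The models with more than two parameters have no comparable closed form and would need a general multi-parameter optimizer; the profile trick is used here only to keep the sketch short.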

Likelihood ratio tests

When comparing different models, likelihood theory provides straightforward significance tests for determining the best model for the observed data and for comparing parameter values among different treatment populations. Both types of tests are carried out by calculating the difference between the log-likelihood obtained under a constrained null hypothesis and the value obtained under a less constrained alternative hypothesis. The magnitude of the difference between the log-likelihood (see eqn 3) estimated under the null hypothesis (L0) and its value under the alternative (LA) describes the strength of the evidence against the null hypothesis (Shaw, 1987; Lindgren, 1993). Significance tests are carried out by noting that −2(L0 − LA) has, asymptotically, a χ² distribution with degrees of freedom equal to the number of additional constraints in the null hypothesis (Lindgren, 1993). This method is sufficiently general to provide tests of hypotheses that least-squares methods (including nonlinear regression) do not allow.
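The test statistic and its significance value can be computed in a few lines. The sketch below handles the one-degree-of-freedom case, using the identity that the upper tail of a χ² distribution with one degree of freedom equals erfc(√(x/2)); the function name and the two log-likelihood values are hypothetical.

```python
import math

def likelihood_ratio_test(loglik_null, loglik_alt):
    """Likelihood ratio test for nested models differing by one constraint.

    Returns the statistic -2 * (L0 - LA) and its chi-square (1 df) P-value.
    """
    stat = -2.0 * (loglik_null - loglik_alt)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods under the null and the alternative.
stat, p = likelihood_ratio_test(-105.0, -102.0)
print(stat, p)
```

For tests with more than one constrained parameter the same statistic applies, but the reference distribution needs the general χ² tail function rather than this one-degree-of-freedom shortcut.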

Analysing mortality data

Choosing the right model

Meaningful conclusions about the effects of an experimental or ecological factor on age-specific mortality require that a mortality model be chosen that accurately describes the data in question. The family of mortality models described in this paper are sufficiently general to accommodate the range of age-specific mortality described in the literature (Finch, 1990), but more importantly they form a hierarchical structure. For example, the Gompertz model can be obtained from the logistic model by letting s = 0. Setting the Makeham term to zero in the Gompertz–Makeham or the logistic–Makeham reduces these models to the standard Gompertz and logistic models, respectively.

Likelihood ratio testing allows us to utilize the model hierarchy to choose the best model for the data. For example, twice the difference between the log-likelihood for the data under the logistic model (LA) and the log-likelihood under the Gompertz model (L0) is a test statistic, and significance values obtained by comparing it to the χ² distribution with one degree of freedom will provide the evidence against log-linear mortality increase with age throughout the life span and for a deceleration in mortality rates at old ages. Testing the significance of age-independent early mortality can be accomplished by comparing the Gompertz–Makeham to the Gompertz in a similar manner. Moving stepwise up the hierarchy only when the simpler, more-constrained null hypothesis is rejected (say at P < 0.05 or P < 0.10) results in the acceptance of a mortality model that contains the smallest number of parameters to describe the data sufficiently.
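The stepwise procedure can be sketched as a short loop over the model hierarchy. The code below assumes the maximized log-likelihood of each of the four models has already been obtained; the dictionary keys, the example values and the rule of following the more strongly supported branch when both one-step extensions are significant are illustrative assumptions, not part of the published method.

```python
import math

# Each model maps to the models one step above it (one extra parameter).
HIERARCHY = {
    "gompertz": ["gompertz-makeham", "logistic"],
    "gompertz-makeham": ["logistic-makeham"],
    "logistic": ["logistic-makeham"],
    "logistic-makeham": [],
}

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

def choose_model(loglik, alpha=0.05):
    """Move up the hierarchy only while the simpler model is rejected."""
    current = "gompertz"
    while True:
        best, best_p = None, alpha
        for candidate in HIERARCHY[current]:
            stat = 2.0 * (loglik[candidate] - loglik[current])
            p = chi2_1df_pvalue(stat)
            if p < best_p:  # keep the most strongly supported extension
                best, best_p = candidate, p
        if best is None:
            return current  # no extension is significantly better
        current = best

# Hypothetical maximized log-likelihoods for one cohort.
example = {
    "gompertz": -410.2,
    "gompertz-makeham": -409.9,
    "logistic": -405.1,
    "logistic-makeham": -404.8,
}
print(choose_model(example))
```

In this example the logistic model is a significant improvement over the Gompertz, while neither Makeham extension is, so the procedure stops at the logistic model.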

Estimating parameters

Once a specific model is chosen, obtaining the MLEs of the parameters is straightforward (given the appropriate computer software). In addition to the estimates themselves, the optimization procedure allows for the calculation of 95% confidence intervals on all estimates (Lindgren, 1993). Although such confidence intervals can be used to test simple hypotheses, the parameter estimates are only asymptotically normally distributed, and the associated confidence intervals are accurate only for 'large' sample sizes. Their use with small samples may result in erroneous conclusions (Hauck & Donner, 1977). For small samples we need to return to the idea of likelihood ratio testing.

Testing hypotheses

The problem of testing hypotheses about parameter values with small samples can be overcome by introducing an extended model. An extended model allows each of ρ populations to have its own parameter values for a specific mortality model. In such a case, the log-likelihood is given by

$$\mathcal{L}(\theta) = \sum_{j=1}^{\rho} \sum_{i=1}^{N_j} \log f_j\left(x_{i(j)} \mid \theta\right) \tag{4}$$

where f_j is the density function of the model for the jth population (j = 1, …, ρ), N_j is the number of individuals in population j, x_i(j) is the age at death for the ith individual belonging to population j, and θ is a vector of parameter values. For simplicity, I consider only two populations, and θ contains a full set of parameter values for each of the two populations.

An unconstrained estimate of eqn 4 will produce parameter estimates for each population identical to those obtained by maximizing eqn 2 for each population separately. Further, ignoring rounding error, the final log-likelihood will be identical to the sum of the two log-likelihoods obtained from the separate fits. We can test hypotheses concerning parameter estimates by constraining certain parameters of θ to be equal. For example, assuming Gompertz dynamics, if we wish to test the hypothesis that two populations have identical rates of increase in mortality with age (see Table 1), the null hypothesis is that the γ are identical in each population and the alternative is that each population has its own distinct γ. Since, under the null hypothesis, γ is constrained to be equal in both populations, twice the difference between the maximum log-likelihood estimates of eqn 4 with θ = {λ1, γ, λ2, γ} (H0) and θ = {λ1, γ1, λ2, γ2} (HA) will be distributed as a χ² random variable with one degree of freedom. Failure to reject the null hypothesis would suggest the rates of increase in mortality in the two populations are not significantly different.
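The test of equal γ can be sketched for two simulated Gompertz populations as follows. The code assumes the Gompertz density f(x) = λe^{γx} exp[−(λ/γ)(e^{γx} − 1)] and again profiles out each population's λ, so that the constrained model (shared γ, separate λ) and the unconstrained model (separate γ and λ) each require only a one-dimensional search. All function names, sample sizes and parameter values are illustrative assumptions.

```python
import math
import random

def simulate_gompertz(n, lam, gam, rng):
    """Draw n ages at death by inverting the Gompertz survival function."""
    return [math.log(1.0 - (gam / lam) * math.log(1.0 - rng.random())) / gam
            for _ in range(n)]

def profile_loglik(gam, groups):
    """Summed Gompertz log-likelihood with one gam shared by all groups.

    Each group's lam is replaced by its conditional MLE,
    lam_j = N_j * gam / sum(exp(gam * x) - 1).
    """
    total = 0.0
    for ages in groups:
        n = len(ages)
        s = sum(math.expm1(gam * x) for x in ages)
        total += n * math.log(n * gam / s) + gam * sum(ages) - n
    return total

def max_loglik(groups, lo=1e-4, hi=1.0, tol=1e-8):
    """Golden-section maximization of the profile log-likelihood over gam."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if profile_loglik(c, groups) > profile_loglik(d, groups):
            b = d
        else:
            a = c
    return profile_loglik((a + b) / 2.0, groups)

def equal_rate_lrt(pop1, pop2):
    """LRT of H0: gam1 == gam2 against HA: gam1 != gam2 (1 df)."""
    loglik_alt = max_loglik([pop1]) + max_loglik([pop2])   # separate gam
    loglik_null = max_loglik([pop1, pop2])                 # shared gam
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))

rng = random.Random(2)
pop1 = simulate_gompertz(200, 0.001, 0.10, rng)
pop2 = simulate_gompertz(200, 0.001, 0.20, rng)
print(equal_rate_lrt(pop1, pop2))
```

Constraining λ instead of γ, or constraining both at once, follows the same pattern with a different choice of which parameters are shared between the two groups.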

Likelihood ratio testing procedures can be used to assess the evidence for differences in any set of parameter values between any set of populations. Although the results from these procedures are best interpreted when two populations follow the same mortality model, this is not a necessary condition. Identical procedures can be used to test for differences in rates of mortality between one population that follows the Gompertz model and a second population that follows the logistic model. Further, multiple parameters may be constrained simultaneously providing a flexibility in hypothesis testing not provided by any other methods.

Simulations

Computer simulations were utilized to illustrate the flexibility of the ML approach and to answer the following two questions: (1) how large must experiments be to detect non-Gompertzian mortality dynamics such as deceleration of mortality rates late in life or significant amounts of age-independent mortality at early ages, and (2) how large must experiments be to detect differences between populations in the parameters of various mortality models. Although these questions are answered for only a small number of specific situations, the results of the simulations provide general insights into the size of experiments needed to accurately determine the true age-specific mortality trajectory and to detect differences between experimental treatments in various patterns of age-specific mortality. The method used to generate simulated data is outlined in the Appendix.

Estimating parameters

The advantages of using MLE to estimate the parameters of mortality models, such as the Gompertz model or the logistic model, have been addressed a number of times in the literature (Eakin et al., 1995; Mueller et al., 1995; Shouman & Witten, 1995; Promislow et al., 1997). Mueller et al. (1995) use simulations to show that maximum likelihood estimates of the parameters of the Gompertz model have a significantly lower bias and mean squared error than those estimated using linear or nonlinear regression. Promislow et al. (1997) point out how sample size can systematically bias the estimates of Gompertz parameters from linear regression (their fig. 3). Moreover, linear regression is not useful when mortality trajectories are clearly nonlinear, and more general regression models (e.g. nonlinear regression) perform very poorly unless sample sizes are extremely large (>1000 individuals) (Pletcher, unpublished data).

Figure 3. Estimates of statistical power to detect differences between two populations in the rate parameter of the Gompertz mortality model. Sample sizes (SS) are indicated. Data were simulated under the Gompertz mortality model with both populations sharing identical intercept parameters (λ = 0.001). The rate parameter for population 1 was fixed at γ = 0.10 while the same parameter for population 2 was allowed to vary from 0.10 to 0.29. Each point represents the fraction of 1000 simulations in which differences in rate parameters between the two populations were detected by likelihood ratio test (P < 0.05). A: all individuals in the simulation were given identical mortality parameters as described above, B: individual parameters were drawn from a gamma distribution with the appropriate mean, but with variances of 2 × 10⁻⁶ and 2 × 10⁻⁴ for the intercept and rate parameters, respectively.

Choosing the right mortality model

Utilizing the likelihood ratio testing paradigm described above, I investigated the relationship between sample size and the power to detect specific mortality patterns. The power to detect mortality deceleration late in life for a number of sample sizes and degrees of deceleration is shown in Fig. 1. Data were simulated by assuming individual mortality rates change according to the logistic model with λ = 0.001, γ = 0.1 and numerous s values (Fig. 1).

Figure 1. Estimates of statistical power for detecting various degrees of old age mortality deceleration. Data were simulated using the logistic mortality model with λ = 0.001, γ = 0.10 and various values of s. Each point represents the fraction of 1000 simulations that were significantly better fit by the logistic mortality model than by the Gompertz model. Sample sizes (SS) are indicated. A: all individuals were given identical λ and γ values of 0.001 and 0.10, respectively, B: each simulated individual's λ and γ were drawn from a gamma distribution with means 0.001 and 0.10 and variances 2 × 10⁻⁶ and 2 × 10⁻⁴, respectively.

All experiments suffer from an abundance of random variation, and it is undoubtedly true that under standard laboratory conditions, even cohorts of genetically identical individuals will exhibit variation in their patterns of mortality. To examine the effect of individual variation on the ability to detect mortality deceleration, simulated deaths were again assumed to follow the logistic model. However, mortality parameters for each individual were chosen randomly from a gamma distribution. The mean values were identical to those chosen for the logistic simulations without variance (λ = 0.001 and γ = 0.1), and the variances of the parameters were chosen to be 2 × 10⁻⁶ and 2 × 10⁻⁴ for λ and γ, respectively. These variances roughly correspond to the environmental variance estimated in many mortality studies using Drosophila (S. Pletcher and J. Curtsinger, unpublished results). All individuals within a cohort were given the same s value. Since the logistic model is derived from assumptions about individual variation (Vaupel et al., 1987), it is not surprising that introducing individual variation in the Gompertz parameters results in a greater tendency to detect levelling off (Fig. 1B) when s is small.
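A cohort with individual variation of this kind can be simulated along the following lines. The sketch converts each mean and variance into the shape and scale of a gamma distribution (shape = mean²/variance, scale = variance/mean) and then draws each age at death by inverting the logistic survival function S(t) = [1 + (sλ/γ)(e^{γt} − 1)]^(−1/s). The sampler, the function names and the small floor on λ are assumptions made for illustration; the procedure actually used is described in the Appendix.

```python
import math
import random

def gamma_draw(mean, variance, rng):
    """Draw from a gamma distribution with the given mean and variance."""
    shape = mean * mean / variance
    scale = variance / mean
    return rng.gammavariate(shape, scale)

def logistic_age_at_death(lam, gam, s, rng):
    """Invert the logistic survival function for one individual (s > 0)."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return math.log1p((gam / (s * lam)) * (u ** (-s) - 1.0)) / gam

def simulate_cohort(n, s, rng, mean_lam=0.001, var_lam=2e-6,
                    mean_gam=0.10, var_gam=2e-4):
    """Simulate n ages at death with individual variation in lam and gam."""
    ages = []
    for _ in range(n):
        # Floor lam to avoid numerical overflow for extremely small draws.
        lam = max(gamma_draw(mean_lam, var_lam, rng), 1e-12)
        gam = gamma_draw(mean_gam, var_gam, rng)
        ages.append(logistic_age_at_death(lam, gam, s, rng))
    return ages

rng = random.Random(3)
cohort = simulate_cohort(500, 0.6, rng)
print(min(cohort), sum(cohort) / len(cohort), max(cohort))
```

With these settings the gamma distribution for λ has shape 0.5, so it is strongly right-skewed, whereas the distribution for γ has shape 50 and is close to symmetric around 0.10.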

In studies where the logistic model was fit to mortality data, estimates of the s parameter are in the neighbourhood of 0.2–0.8 (Fukui et al., 1993; Promislow et al., 1996; Pletcher et al., 1998). Assuming a level of mortality deceleration such that s = 0.6, a perfectly controlled experiment would require at least 75–100 individuals per cohort to reliably detect levelling off (Fig. 1). Even fairly high amounts of late-life deceleration require sample sizes of 75–250 individuals for a better than 50% chance of detection.

Similar simulations were carried out to determine the power to detect significant nonsenescent (age-independent) mortality early in life (Fig. 2). In this case, data were generated under the Gompertz–Makeham mortality model with λ = 0.001, γ = 0.1 and a variety of c values as indicated in the figure. A second set of simulations incorporated individual variation in the intercept and rate components of the Gompertz–Makeham model. Individual mortality parameters, λ and γ, were drawn from a gamma distribution with means 0.001 and 0.1 and variances 2 × 10⁻⁶ and 2 × 10⁻⁴, respectively.

Figure 2.   Estimates of statistical power for detecting various degrees of age-independent mortality. Data were simulated using the Gompertz–Makeham model with λ = 0.001, γ = 0.1 and various values of c. Each point represents the fraction of 1000 simulations that were significantly better fit by the Gompertz–Makeham model than by the Gompertz model. Panels A and B are as in Fig. 1.

The power to determine the correct mortality model is quite low for cohorts having fewer than 50 individuals (Fig. 2). Individual variation significantly reduces the statistical power in all cases. This effect is roughly proportional to the sample size. Interestingly, there is a limit on the power obtained from any specific sample size. Even with high levels of age-independent mortality (>0.0075) there is a less than 70% chance of correctly identifying the true mortality curve when cohorts are composed of 50 individuals or fewer. There is a tendency for power to drop at very high values of c due to age-independent mortality depleting cohort size before senescent mortality is observed. In these cases a Gompertz model with λ = 0 fits nearly as well as the correct Gompertz–Makeham (data not presented).

Correctly determining the underlying mortality pattern is not a mundane issue. Conclusions regarding experimental effects on characteristics such as senescence (the rate of mortality increase with age) rely on the right answer. The importance of determining the true mortality model is illustrated in Table 2. Simulated data sets (1000) were generated according to the Gompertz–Makeham model, and both the Gompertz and Gompertz–Makeham models were fit to the data. Standard Gompertz parameters were estimated for those simulated data sets that were not significantly better fit by the Gompertz–Makeham model, and the mean of the distribution of estimates was compared to the true intercept and rate parameters. Absolute bias was calculated as the absolute value of the difference between the mean of the estimates and the true value divided by the true value. Results from this investigation clearly show that a failure to detect the correct mortality model results in parameter estimates that are highly biased (Table 2). Similar values were obtained when the underlying mortality pattern was logistic (data not presented).

Table 2. Absolute percentage bias in mortality model parameter estimates resulting from the inability to detect the true mortality pattern. Data were simulated using the Gompertz–Makeham mortality model with λ = 0.001 and γ = 0.10 and various c values. Standard Gompertz parameters were estimated for those simulated data sets that were not significantly better fit by the Gompertz–Makeham model. Absolute bias was calculated as the absolute value of the difference between the mean of the estimates and the true value divided by the true value. Each value is based on 1000 simulations.

Comparing populations

The third major advantage of MLE as presented in this paper is the ability to test hypotheses about differences in mortality parameters between two or more treatment populations. Researchers are often interested in whether some ecological or experimental treatment has resulted in a change in the rate of the Gompertz equation or a change in the intercept parameter (Tatar et al., 1993; Orr & Sohal, 1994). The number of papers reporting treatment effects on the rate parameter currently exceeds the number showing differences in the intercept parameter (e.g. Orr & Sohal, 1994; Nusbaum et al., 1996; Stearns & Kaiser, 1996). This might be due to one or more of the following three causes: (i) the rate parameter, more than the intercept, represents a biological trait sensitive to phenotypic or genetic manipulation (e.g. Sacher, 1977), (ii) the use of improper statistical techniques inflates the actual differences in the rate parameter between the populations (see Promislow et al., 1997) or (iii) for any given sample size there is more power to detect differences in the rate parameter than there is to detect differences in the intercept parameter. In this section, I use MLE to focus on the last of these possibilities.

Figure 3 shows the results of applying the likelihood ratio testing paradigm to hypotheses about the rate values of two treatment populations. Data were simulated for each of two populations according to the standard Gompertz mortality model. Both populations share the same intercept value (λ = 0.001). However, the rate parameters of the two populations were allowed to differ. The rate parameter of population 1 was fixed at 0.10, while the same parameter for population 2 varied from 0.10 to 0.29. For each simulated data set, two models were fit to the combined data of populations 1 and 2. The first model assumes that both populations have unique sets of parameters, i.e. each population is allowed to have its own λ and γ (this would be the alternative hypothesis). The second model constrains both populations to have the same rate parameter (the null hypothesis). The difference in log-likelihoods between these two models forms the basis for the test statistic that the rate parameters differ between the two populations. Figure 3(B) shows similar results when variation is added to individual mortality patterns.

Standard laboratory stocks of Drosophila may differ in their Gompertz rate parameters by 0.02–0.04 (Fukui et al., 1993; Pletcher, unpublished data). Laboratory selection (Nusbaum et al., 1996) or experimental manipulation such as irradiation (Sacher, 1977) can generate rate differences of 0.06–0.10 or greater. It is clear from Figure 3 that differences as small as 0.06 can be reliably detected with experiments as small as 50 individuals per cohort. If there is variation among individuals the power is reduced somewhat but still remains over 50%. In short, even small experiments (50 individuals per cohort) can detect reasonable differences in the rate of senescence as described by the Gompertz equation.

Figure 4 shows the results of applying the likelihood ratio testing paradigm to hypotheses about the Gompertz intercept values of two treatment populations. Data were simulated exactly as described for the examination of rate parameters with the following exception: both populations share the same rate value (γ = 0.10), but the intercept parameters of the two populations were allowed to differ. The intercept parameter of population 1 was fixed at 0.001, while that for population 2 varied from 0.001 to 0.007. Figure 4(B) shows the results when variation is added to individual mortality patterns.

Figure 4. Estimates of statistical power to detect differences between two populations in the intercept parameter of the Gompertz mortality model. Data were simulated under the Gompertz mortality model with both populations sharing identical rate parameters (γ = 0.10). The intercept parameter for population 1 was fixed at λ = 0.001 while the same parameter for population 2 was allowed to vary from 0.001 to 0.007. Each point represents the fraction of 1000 simulations in which differences in intercept parameters between the two populations were detected by likelihood ratio test (P < 0.05). Panels A and B are as in Fig. 3.

Differences in intercept among standard laboratory populations of Drosophila have been reported to range from less than 0.001 to 0.002 (Fukui et al., 1993). Laboratory selection experiments that produce extremely long-lived flies have yielded Gompertz intercepts that differ by 0.002, and Sacher (1977) showed that the Gompertz intercept is displaced by an average across species of 0.003 per unit dose of ionizing radiation. Thus, for experimental treatments known to have large effects on lifespan, such as irradiation or laboratory selection, cohorts of at least 100 individuals are required if the goal is detecting different intercepts at least 50% of the time. Notice that larger sample sizes are required to detect what might be considered 'large' differences in intercepts than are needed to detect 'large' differences in rate.

Discussion

In this paper I have discussed the benefits of a maximum likelihood approach to the analysis of age-specific mortality data. I have extended the standard uses of ML (i.e. parameter estimation) to include a hierarchical testing scheme that allows the choice of a mathematical mortality model that adequately fits the observed data (see also Fukui et al., 1993). Further, I have provided the means for applying likelihood ratio tests to hypotheses about the parameter values for different experimental or ecological treatments. All of the methods presented in the paper are implemented in an easy to use software package (IBM PC compatible and UNIX), which makes the maximum likelihood analyses easily accessible.

The use of the ML techniques described in this paper will result in a number of improvements in the design and analysis of experiments investigating mortality patterns. First, ML provides parameter estimates that are more consistent, and less influenced by technical aspects of the experimental design such as sample size, than those from other methods. Promislow et al. (1997) discuss how published estimates of mortality parameters may be highly biased due to a combination of incorrect statistical methods and small sample sizes.

Second, the range of models presented here will result in greater documentation of different patterns of age-specific mortality. Brooks et al. (1994) reported that mortality rates increased with age in C. elegans according to the Gompertz model. Subsequent analysis (Vaupel et al., 1994) showed a significant deceleration in mortality rates at older ages. It is unknown how many studies fail to report levelling off simply because of the inability to properly test for it. Moreover, because the Gompertz–Makeham model is complex and difficult to fit, it has seldom been applied, and it is likely applicable to a much wider range of data than is currently realized.

Despite the difference in the c and λ parameters in this model, there have not been any attempts to use the different models on real data to assess the relative contribution of each to mortality patterns from different types of populations (Finch, 1990). Identifying a significant Makeham term not only serves to provide insight into the levels of extrinsic mortality but also to remove the effects of age-independent mortality that can bias the estimates of mortality parameters (Table 2).
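A small numerical illustration shows why an ignored Makeham term distorts the apparent rate of senescence. The sketch below is not the ML analysis of Table 2: it swaps in an ordinary least-squares regression of ln(mortality) on age, applied to exact (noise-free) hazards, purely to show the direction of the effect. The value of c and the age range are hypothetical; λ and γ are the values used in the simulations.

```python
import math

def loglinear_slope(ages, hazards):
    """Ordinary least-squares slope of ln(hazard) regressed on age."""
    n = len(ages)
    ys = [math.log(h) for h in hazards]
    mean_x = sum(ages) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, ys))
    sxx = sum((x - mean_x) ** 2 for x in ages)
    return sxy / sxx

lam, gamma = 0.001, 0.10
c = 0.005                      # hypothetical age-independent (Makeham) mortality
ages = list(range(0, 61))      # hypothetical age range

gompertz = [lam * math.exp(gamma * x) for x in ages]
makeham = [c + lam * math.exp(gamma * x) for x in ages]

slope_gompertz = loglinear_slope(ages, gompertz)  # recovers gamma
slope_makeham = loglinear_slope(ages, makeham)    # falls below gamma
```

Because the constant c flattens ln(mortality) at young ages, a log-linear fit to Gompertz–Makeham mortality yields a slope smaller than the true γ.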

Third, the simulations suggest that researchers interested in understanding mortality patterns should venture to obtain sample sizes of at least 100–500 individuals per cohort per experimental treatment. A similar conclusion was presented by Service et al. (1998b) for parameter estimation of single populations and hypothesis testing using ANOVA methods. Shouman & Witten (1995) used simulations to illustrate the large variance associated with ML parameter estimates when sample sizes are small (<100). Perhaps the most crucial problem with small samples is the inability to detect non-Gompertzian mortality dynamics such as age-independent early mortality and mortality deceleration at older ages (Figs 1 and 2). Mortality plateaus are now established as a real and repeatable result in laboratory experiments (Curtsinger et al., 1995; Pletcher & Curtsinger, 1998; Service et al., 1998a), and some degree of random (age-independent) mortality is to be expected in any ecological study. Until these aspects of mortality are recognized and accounted for, estimates of the rate of senescence (γ) or the initial mortality rate (λ) are questionable.

Finally, the need for large samples is exacerbated by variation among individuals. This suggests that laboratory mortality experiments would be better off using fewer cohorts with larger numbers of individuals. Variation introduced within cohorts by different environments (such as different vials in Drosophila studies) could significantly reduce the statistical power to detect specific mortality patterns. Whether this source of variation can be reduced by 'pooling' parameter estimates from two or more replicate cohorts is unknown at this time. This observation points out the biggest limitation of these methods as developed thus far: the inability to properly account for complex experimental designs, such as hierarchical or nested designs. Unfortunately, this is not an easy problem. It involves maximizing the marginal likelihood of the distribution of deaths after accounting for (integrating over) the assumed distribution of the model parameters within a treatment. Since there are often a large number of replicates (e.g. vials) within each treatment, Markov chain Monte Carlo maximization is required (Geyer, 1995; C. Geyer personal communication). This work is in progress.
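One way to write the marginal likelihood described above is sketched here. The notation is an assumption introduced for illustration and does not come from the paper: \theta_j denotes the mortality parameters of replicate j (e.g. a vial), \mathbf{d}_j its observed deaths, f the distribution of deaths implied by the mortality model, and g the assumed distribution of parameters within a treatment, governed by hyperparameters \phi.

```latex
$$L(\phi) = \prod_{j=1}^{J} \int f(\mathbf{d}_j \mid \theta_j)\, g(\theta_j \mid \phi)\, d\theta_j$$
```

Each replicate contributes its own integral over \theta_j, which generally has no closed form. This is why the computation becomes demanding as the number of replicates grows.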

The question of how much power a specific experimental design provides is always a difficult one. Statistical power depends not only on sample size but also on the actual values of the mortality parameters under investigation. Simulations with parameters ranging from ½ to twice the values reported here produce very similar results. Therefore, the sample sizes I suggest are directly relevant to a number of experimental systems, including Drosophila melanogaster, Caenorhabditis elegans, and Callosobruchus maculatus. In general, it is impossible to provide precise advice about how large an average mortality experiment should be, but preliminary estimates of mortality parameters coupled with the power estimates offered here can be used for guidance.

Even larger samples may be needed to address questions about truly age-specific phenomena. The estimation of mortality patterns by ML is based on the distribution of deaths at all ages. Nonparametric statistical techniques designed to examine differences in mortality at specific ages are not well developed. Whether there are transient, localized differences in mortality between experimental treatments (see Pletcher et al., 1998) is an open and difficult question and will likely require much larger samples than those suggested here.

Acknowledgments

I thank J. Curtsinger, M. Tatar and D. Promislow for pointing out the need for simple estimation packages. J. Curtsinger, P. Service and R. Shaw provided comments that greatly improved the manuscript. C. Geyer and R. Shaw provided valuable advice on implementing the maximum likelihood procedures. This work was supported by the University of Minnesota Graduate School and National Institutes of Health grants AG-0871 and AG-11722 to J. Curtsinger.

References

  • 1
    Brooks, A., Lithgow, G.J., Johnson, T.E. 1994. Mortality rates in a genetically heterogeneous population of Caenorhabditis elegans. Science 263: 668–671.
  • 2
    Carey, J.R. 1993. Applied Demography for Biologists. Oxford University Press, Oxford.
  • 3
    Carey, J.R., Liedo, P., Orozco, D., Vaupel, J.W. 1992. Slowing of mortality rates at older ages in large medfly cohorts. Science 258: 457–461.
  • 4
    Curtsinger, J.W., Fukui, H.H., Khazaeli, A.A., Kirscher, A.W., Pletcher, S.D., Promislow, D.E.L., Tatar, M. 1995. Genetic variation and aging. Ann. Rev. Genet. 29: 553–575.
  • 5
    Curtsinger, J.W., Fukui, H.H., Townsend, D.R., Vaupel, J.W. 1992. Demography of genotypes: Failure of the limited life-span paradigm in Drosophila melanogaster. Science 258: 461–463.
  • 6
    Eakin, T., Shouman, R., Qi, Y., Liu, G., Witten, M. 1995. Estimating parametric survival model parameters in gerontological aging studies: Methodological problems and insights. J. Geront. 50: B166–B176.
  • 7
    Finch, C.E. 1990. Longevity, Senescence and the Genome. University of Chicago Press, Chicago.
  • 8
    Fisher, R.A. 1921. On the mathematical foundations of theoretical statistics. Phil. Trans. Roy. Soc. Lond. A 222: 309–368.
  • 9
    Fukui, H.H., Xiu, L., Curtsinger, J.W. 1993. Slowing of age-specific mortality rates in Drosophila melanogaster. Exp. Geront. 28: 585–599.
  • 10
    Gavrilov, L.A. & Gavrilova, N.S. 1991. The Biology of Life Span: a Quantitative Approach. Harwood Academic Publishers, Chur, Switzerland.
  • 11
    Geyer, C.J. 1995. Estimation and optimization of functions. In: Markov Chain Monte Carlo in Practice (W. R. Gilks, S. Richardson & D. J. Spiegelhalter, eds), pp. 241–258. Chapman & Hall, London.
  • 12
    Hauck, W.W. & Donner, A. 1977. Wald's test as applied to hypotheses in logit analysis. J. Amer. Stat. Assoc. 72: 851–853.
  • 13
    Hughes, K.A. 1995. The evolutionary genetics of male life history characters in Drosophila melanogaster. Evolution 49: 521–537.
  • 14
    Hughes, K.A. & Charlesworth, B. 1994. A genetic analysis of senescence in Drosophila. Nature 367: 64–66.
  • 15
    Lee, E.T. 1992. Statistical Methods for Survival Data Analysis. John Wiley and Sons, Inc., New York.
  • 16
    Lindgren, B.W. 1993. Statistical Theory, 4th edn. Chapman & Hall, New York.
  • 17
    Mueller, L.D., Nusbaum, T.J., Rose, M.R. 1995. The Gompertz equation as a predictive tool in demography. Exp. Geront. 30: 553–569.
  • 18
    Nusbaum, T.J., Mueller, L.D., Rose, M.R. 1996. Evolutionary patterns among measures of aging. Exp. Geront. 31: 507–516.
  • 19
    Orr, W.C. & Sohal, R.S. 1994. Extension of life span by overexpression of superoxide dismutase and catalase in Drosophila melanogaster. Science 263: 1128–1130.
  • 20
    Pletcher, S. 1996. Age-specific mortality costs of exposure to inbred Drosophila melanogaster in relation to longevity selection. Exp. Geront. 31: 605–616.
  • 21
    Pletcher, S.D. & Curtsinger, J.W. 1998. Mortality plateaus and the evolution of senescence: why are mortality rates so low? Evolution 52: 454–464.
  • 22
    Pletcher, S.D., Houle, D., Curtsinger, J.W. 1998. Age-specific properties of spontaneous mutations affecting mortality in Drosophila melanogaster. Genetics 148: 287–303.
  • 23
    Promislow, D.E.L., Tatar, M., Khazaeli, A.A., Curtsinger, J.W. 1996. Age-specific patterns of genetic variation in Drosophila melanogaster. I. Mortality. Genetics 143: 839–848.
  • 24
    Promislow, D.E.L., Tatar, M., Pletcher, S.D., Carey, J. 1997. Below threshold mortality and its impact on studies in evolutionary ecology. J. Evol. Biol. in press.
  • 25
    Sacher, G.A. 1977. Life table modifications and life prolongation. In: Handbook of the Biology of Aging (C. E. Finch & L. Hayflick, eds). Van Nostrand, New York.
  • 26
    Searle, S.R., Casella, G., McCulloch, C.E. 1992. Variance Components. Wiley and Sons, New York.
  • 27
    Service, P.M., Michieli, C.A., McGill, K. 1998a. Experimental evolution of senescence: an analysis using a heterogeneity mortality model. Evolution in press.
  • 28
    Service, P.M., Ochoa, R., Valenzuela, R., Michieli, C.A. 1998b. The use of small cohorts for maximum likelihood estimation of mortality parameters. Exp. Geront. in press.
  • 29
    Shaw, R.G. 1987. Maximum-likelihood approaches applied to quantitative genetics of natural populations. Evolution 41: 812–826.
  • 30
    Shouman, R. & Witten, M. 1995. Survival estimates and sample size: what can we conclude? J. Geront. 50: B177–B185.
  • 31
    Stearns, S.C. & Kaiser, M. 1996. Effects on fitness components of P-element inserts in Drosophila melanogaster: analysis of trade-offs. Evolution 50: 795–806.
  • 32
    Tatar, M., Carey, J.R., Vaupel, J. 1993. Long-term cost of reproduction with and without accelerated senescence in Callosobruchus maculatus: analysis of age-specific mortality. Evolution 47: 1302–1312.
  • 33
    Vaupel, J.W. 1990. Relative risks: frailty models of life history data. Theo. Pop. Biol. 37: 220–234.
  • 34
    Vaupel, J.W., Carey, J.R., Christensen, K., Johnson, T.E., Yashin, A., Holm, N.V., Iachine, I.A., Khazaeli, A.A., Liedo, P., Longo, V.D., Yi, Z., Manton, K.G., Curtsinger, J.W. 1998. Biodemographic trajectories of longevity. Science 280: 855–860.
  • 35
    Vaupel, J.W., Johnson, T.E., Lithgow, G.J. 1994. Rates of mortality in populations of Caenorhabditis elegans. Science 266: 826.
  • 36
    Vaupel, J.W. & Yashin, A.I. 1985. The deviant dynamics of death in heterogeneous populations. Soc. Method. 15: 179–211.
  • 37
    Vaupel, J.W., Yashin, A.I., Manton, K.G. 1987. Debilitation's aftermath: stochastic process models of mortality. Math. Pop. Stud. 1: 21–48.
  • 38
    Wilson, D.L. 1994. The analysis of survival (mortality) data: Fitting Gompertz, Weibull and logistic functions. Mech. Age. Devel. 74: 15–33.

Appendix

Simulation method

The age-specific mortality rate (or hazard) as mentioned here represents an instantaneous measure of risk, and it cannot be used directly for simulation purposes. Instead, to determine the probability of death at a particular age for a particular individual, we need the survival function of the mortality model. The survival function, S(x), represents the probability that an individual survives to the beginning of age x. Thus, the probability that an individual alive at the beginning of age interval x dies during that interval, q(x), is

$$q(x) = \frac{S(x) - S(x+1)}{S(x)} \qquad \text{(A1)}$$

The survival functions for the various mortality models are as follows: for the Gompertz–Makeham

$$S(x) = \exp\left[-cx - \frac{\lambda}{\gamma}\left(e^{\gamma x} - 1\right)\right] \qquad \text{(A2)}$$

and the logistic–Makeham

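$$S(x) = e^{-cx}\left[1 + \frac{s\lambda}{\gamma}\left(e^{\gamma x} - 1\right)\right]^{-1/s} \qquad \text{(A3)}$$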

(Vaupel & Yashin, 1985). The standard Gompertz and logistic survival functions are obtained by letting c = 0 in eqs (A2) and (A3), respectively.

Given the values of the model parameters, for each individual in the simulation, I started at age class 0 and calculated the probability of dying in the interval from age 0 to age 1 according to eqn (A1). A random number between 0 and 1 was drawn, and if this number was less than or equal to q(0) the individual was considered to have died in that age class. If the individual survived, the process was repeated for subsequent age classes until death. Repeating this procedure for all individuals produced the desired distribution of deaths, which was subsequently used directly in the maximum likelihood programs to estimate the mortality model of interest.
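The procedure just described can be sketched in Python as follows. This is a minimal illustration rather than the code used for the paper: the function names, the seed handling and the `max_age` safety limit are hypothetical, and it assumes the standard Gompertz–Makeham survival function S(x) = exp[−cx − (λ/γ)(e^{γx} − 1)] together with the conditional form q(x) = 1 − S(x+1)/S(x).

```python
import math
import random

def gm_survival(x, lam, gamma, c=0.0):
    """Gompertz-Makeham survival to the start of age x (c = 0 gives Gompertz)."""
    return math.exp(-c * x - (lam / gamma) * math.expm1(gamma * x))

def prob_death(x, lam, gamma, c=0.0):
    """Probability of dying in age class x, given survival to the start of x."""
    s_now = gm_survival(x, lam, gamma, c)
    if s_now == 0.0:
        # Survival has underflowed to zero: treat death as certain.
        return 1.0
    return 1.0 - gm_survival(x + 1, lam, gamma, c) / s_now

def simulate_age_at_death(lam, gamma, c=0.0, rng=random, max_age=1000):
    """Step through age classes from 0, drawing one uniform number per class."""
    for x in range(max_age):
        if rng.random() <= prob_death(x, lam, gamma, c):
            return x
    return max_age  # safety limit (hypothetical), not expected to be reached

def simulate_cohort(n, lam, gamma, c=0.0, seed=None):
    """Return the age class at death for each of n simulated individuals."""
    rng = random.Random(seed)
    return [simulate_age_at_death(lam, gamma, c, rng) for _ in range(n)]

# Example: one cohort of 500 under the Gompertz model with lambda = 0.001, gamma = 0.10
ages_at_death = simulate_cohort(500, lam=0.001, gamma=0.10, seed=1)
```

Tallying `ages_at_death` by age class gives the distribution of deaths that would then be passed to the estimation step.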