In a rapidly changing world, understanding and predicting population change is a central aim of applied ecologists, and this involves studying the links between environmental variation and vital rates (survival, fecundity, etc.). Demographic analysis and modelling can be daunting for practicing ecologists, and here we provide an overview of some of the most important issues and methods.
Collection of demographic data should follow standardized protocols, and the statistical power to detect links with the environment depends critically on long time-series. Candidate environmental covariates should be carefully selected with a view to reducing the risk of spurious correlations. The relevant sample size for environment–demography links is typically the number of years, and mixed models with random year effects offer a powerful framework to enforce this.
Data on individually marked animals are the best source of information on demography. These data can be analysed and demographic parameters estimated using a wide variety of capture–mark–recapture models, available in standard software packages.
Population models integrate all demographic variables and provide estimates of population growth rate. Two common classes of models are matrix models and integrated population models, where the latter combine parameter estimation and dynamic modelling.
Synthesis and applications. Careful demographic analysis and modelling has provided solutions for many real-world problems in population management, as well as assisting the development of general principles. The tools currently available are flexible and powerful, and the main limitations to their more general use are data availability and training.
One of the most important tasks of applied ecologists is to understand the mechanisms driving increases or decreases in abundance of focal species, often either species of conservation concern or potential problem species, and thus provide the scientific basis for evidence-based management. Because population growth rate is a simple function of the basic demographic parameters (or vital rates) fecundity, survival, immigration and emigration, this is equivalent to identifying environmental causes of, or factors linked to, temporal, spatial, between- or within-individual variation in these parameters. Such factors can be categorized as either extrinsic (e.g. weather, predators or food availability) or intrinsic (e.g. population density or composition) (Aars & Ims 2002). Thus, careful analysis of demographic data and the relationship between demography and environment sensu lato (process) is a prerequisite for understanding and predicting changes in population size or density (pattern) (Benton, Plaistow & Coulson 2006).
Demography is by nature a highly quantitative field, and a large number of advanced statistical methods are available for analysing specific types of data with specific aims. This is particularly the case for the analysis of data from individually marked organisms, or capture–mark–recapture (CMR), on its own or in combination with other data in integrated population models. For practicing ecologists as well as students, many of whom do not have high-level training in statistics, the explosion in the number and variety of methods (reviews in Schwarz & Seber 1999; Schmidt, Schaub & Anholt 2002; Sandercock 2006) can be confusing and at times intimidating. The aim of this review is to provide an overview of the available methods, which should hopefully serve to guide applied ecologists towards appropriate tools for their data and question and thus help lay the foundation for robust inference that can inform evidence-based population management. The emphasis is on methods for identifying links between environment and demography, rather than on providing a complete overview of demographic methods (Williams, Nichols & Conroy 2002) or a primer for specific methods or software (e.g. Cooch & White 2012). We first discuss how to set up a study to answer questions about links between environment and demography (Table 1), including general considerations on the analytical framework. Then, we review the most important methods for answering such questions (Fig. 1) and describe modelling techniques useful for evaluating the influence of observed changes in demographic parameters on population growth rate, either directly from data on marked individuals or using mathematical models. Finally, we discuss briefly whether the demographic tools available can answer the critical questions posed by conservation biologists and wildlife managers.
Table 1. Examples of applied questions about links between environment and demography
How will climate change affect the survival and fecundity of a species?
Will improving the habitat of a threatened species lead to increased population growth?
Is the current hunting pressure on a species sustainable?
How can we most efficiently reduce the population size of a problem species?
This review does not cover the estimation of population size or density, a large field, which has been covered by recent reviews (Amstrup, McDonald & Manly 2006; Thomas et al. 2010; Borchers 2012; McClintock & White 2012). Because our own expertise is mainly in avian ecology, the main focus is on analysis of data from birds and to some extent mammals. Nevertheless, the methods described are equally suitable for other organisms, as long as individuals can be marked or otherwise recognized (e.g. Schtickzelle, Baguette & Le Boulengé 2003; Kéry & Gregg 2004; Rivalan et al. 2005; Buoro, Prévost & Gimenez 2010).
Study design and analytical framework
The identification of environmental drivers of demography involves quantifying the temporal or spatial covariation between vital rates and relevant aspects of the environment. Most studies concern temporal variation in demography, but for questions related to, for example, habitat quality or land management, spatial variation is often more relevant. The number of degrees of freedom for statistical comparisons is thus determined by the number of study occasions, for vertebrates usually years, sites or site–year combinations (if sites can be considered independent over years). Increasing the sample size in terms of individuals improves precision and the power to detect between-year/site variation, but does not affect the degrees of freedom of environment–demography correlations (Devineau, Choquet & Lebreton 2006). Long-term or large-scale studies are thus particularly valuable for investigating links between demography and environment, both because of increased statistical power and because they are more likely to include periods with contrasting environmental conditions (Clutton-Brock & Sheldon 2010).
Field studies of environmental impacts on demography are usually non-experimental (although quasi-experiments may be possible, Schwarz 2002), and, assuming that the study species is determined by management needs, study design thus mainly consists of selecting one or more study site(s) or area(s), ensuring that consistent, repeatable protocols appropriate for the study species are followed to minimize violations of basic assumptions (Kendall et al. 2009) and maintaining a reasonable sample size. These considerations should be guided by detailed knowledge of the ecology of the focal species. Selection of sites (or individuals) to include is critical (Sanz-Aguilar et al. 2009) and should ideally be based on a specific design. In any case, known biases towards selection of, for example, high-quality sites or individuals should be reduced or avoided; without proper attention to this, there is a danger of detection of spurious (nonrepresentative) declines in demographic performance over time. In a CMR context, the annual sample size is determined by the number of individuals encountered and can thus be increased either by marking more animals or by increasing the encounter probability (Devineau, Choquet & Lebreton 2006). In practice, many analyses of environment–demography links use existing data series collected for other purposes, often monitoring; this is not a major problem if the basic study design is sound (see above). Local environmental variables may also be measured as part of the field study, and these should then be carefully selected based on existing knowledge and hypotheses regarding the ecology of the study species (see ‘HOW TO DEFINE ‘ENVIRONMENT’’).
If the aim is to use demographic models to understand the overall influence of environmental variation on population growth rate, it is critical that the life cycle is closed, that is, that all components of fecundity are included in the data collection and analysis (Franklin et al. 2004). In particular, age-specific breeding probabilities need to be accounted for, implicitly or explicitly, which may be difficult for species where only breeding individuals can be observed. Similarly, survival of all age classes (or stages) needs to be modelled, and data collection can be challenging for prebreeders which often show extensive natal dispersal.
Proper replication is the only way to ensure that conclusions of a study can be generalized. Field studies of individually marked vertebrates are labour intensive, and most studies of this type concern only one local population. One way to achieve replication is to form networks of researchers studying the same species with comparable methods, and analyse data jointly. Relatively few studies of this type exist (e.g. Grosbois et al. 2009; Jenouvrier et al. 2009; Papadatou et al. 2011).
How to Define ‘Environment’: Asking the Right Questions
Different study species are affected by different environmental factors, and it is important that the choice of covariates for analysis reflects the ecology of the study species. The choice of specific covariates to include (and thus hypotheses to test) should ideally be based on existing species-specific knowledge, or alternatively on generalizations from similar species or predictions from theory. Often, food availability, predation, physical environment (weather) and management interventions will be among the factors expected to be important. Ideally, these variables should be measured directly at an appropriate spatial and temporal scale (i.e. relevant for the ecology of the study species), but this may not be possible due to logistical limitations. Proxies, that is, variables measured in a standardized way and assumed to covary with ecologically important aspects of the environment, are therefore often used as the main covariates. These range from relatively local indirect measures such as NDVI (Normalized Difference Vegetation Index; Pettorelli et al. 2011) or SST (Sea Surface Temperature, as a proxy for food availability; Rayner et al. 2006), to global climate indices such as SOI (Southern Oscillation Index; Trenberth 1984) or NAO (North Atlantic Oscillation; Hurrell, Kushnir & Visbeck 2001). It has been claimed that global climate indices such as NAO are often more strongly correlated with ecological processes than locally measured weather (Hallett et al. 2004), but there are many cases to the contrary (e.g. Frederiksen et al. 2004). Because global indices necessarily act through local processes, the stronger correlations sometimes observed with global indices most likely reflect a failure to identify and measure the relevant local covariates. If locally measured weather variables are available, we suggest that they be used in preference to global indices, also because the ecological interpretation of any identified links will be more straightforward.
Migratory species pose special problems, as they are exposed to different environmental conditions in breeding, wintering and staging areas, as well as during active migration. Their demography may be affected by factors acting in any or all of these phases (Schaub, Jakober & Stauber 2011; Genovart et al. 2013), and carry-over effects can be important (Norris 2005; Harrison et al. 2011). Identifying suitable local covariates is thus challenging (though possible, Péron et al. 2011b), making global proxies more attractive. Recent advances in technology have allowed the collection of much more detailed data on nonbreeding distribution of migratory species, which may help in the identification of relevant covariates (Ramos et al. 2012).
Time-series of demographic data are rarely very long, and the number of potential environmental covariates can be high. In particular, local and global climate can be quantified in many different ways, and there may be a temptation to test all of these variables as predictors of demographic variation. Indeed, we have seen examples where the number of covariates tested exceeds the sample size in terms of years of data. As with regression in general, including too many predictors leads to an elevated risk of spurious ‘significant’ correlations, in particular because the number of years is inevitably small from a statistical viewpoint (cf. Thorndike 1978). There are two ways of reducing the number of predictors to achieve an acceptably low risk of spurious results. First, careful attention to what is known about the biology of the system can lead to the rejection of many potential predictors. Each predictor included should be related to a unique scientific hypothesis that needs to be tested (confirmatory rather than exploratory approach). Secondly, potential predictors are often highly intercorrelated (e.g. various aspects of local weather), and the number of predictors can be reduced through application of multivariate dimension-reduction techniques such as principal component analysis (McGarigal, Cushman & Stafford 2000; Juillet et al. 2012), or using biologically meaningful combinations of variables (e.g. an index of frequency and strength of onshore winds, Frederiksen et al. 2008).
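The risk described above can be made concrete with a small calculation. The sketch below assumes, for simplicity, that the candidate covariates are tested independently at the same significance level and that none has a real effect; in practice covariates are often intercorrelated, so the result is an approximation rather than an exact rate. The function name is ours and purely illustrative.

```python
def familywise_false_positive_rate(n_covariates: int, alpha: float = 0.05) -> float:
    """Probability of at least one 'significant' result when n_covariates
    covariates with no true effect are each tested at level alpha.

    Assumes the tests are independent (an idealization)."""
    if n_covariates < 0:
        raise ValueError("n_covariates must be non-negative")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1.0 - (1.0 - alpha) ** n_covariates


if __name__ == "__main__":
    # How quickly the risk of a spurious result grows with the number of covariates.
    for k in (1, 5, 10, 20):
        print(k, round(familywise_false_positive_rate(k), 3))
```

Under these assumptions, testing 20 unrelated covariates at the 5% level gives roughly a 64% chance of at least one spurious 'significant' link, which illustrates why restricting the candidate set to biologically motivated hypotheses matters.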
Environment and fecundity
Empirical data on temporal variation in aspects of fecundity come in many different forms (clutch or brood size, pregnancy rate, probability of successful reproduction, etc.), mainly determined by taxon-specific practical constraints. While the most appropriate parameter for evaluating impacts on population growth rate is overall fecundity (annual number of viable offspring produced per female), this is not always possible to measure, and in many cases, it may be more useful to relate environmental variation to specific aspects of fecundity, which can be reliably measured.
Raw data on fecundity are usually collected for individual nests or females, but may sometimes consist of, for example, the proportion of juveniles in the population at the end of the reproductive season. In cases where nests or females are not recognizable from year to year, little is gained by analysing the data at the individual level rather than summarized as annual means. Between-year variation in annual means can then be analysed using appropriately structured generalized linear models.
When data on fecundity are collected for individually recognizable parents, it is most appropriate to analyse the raw data using mixed (hierarchical) models with random individual and year effects (Bolker et al. 2009; Zuur et al. 2009). Besides ensuring that the appropriate number of degrees of freedom is used to test correlations between environment and mean fecundity, through the use of nested and crossed random effects such models allow quantification of variance components associated with, for example, within-individual, between-individual and between-year variation (Benton, Plaistow & Coulson 2006; Browne et al. 2007; Dingemanse & Dochtermann 2013), as well as individual variation in reaction norms (individual–environment interactions) (Reed et al. 2006; Nussey, Wilson & Brommer 2007). Hierarchical models can be fitted in either a frequentist or Bayesian framework.
Longitudinal data on breeding performance of individually recognizable animals can also be analysed in a multistate CMR framework. Such analyses focus on, for example, environmental drivers of the probability of attempting reproduction in a given year, while allowing for imperfect detection (e.g. Rolland et al. 2009; Pradel, Choquet & Béchet 2012).
For many birds, fecundity data consist of repeated visits to individual nests, where the status (active, success, failure) of each nest is recorded at each visit. Because not all nests are initiated or found on the same day, allowance needs to be made for the varying time at risk for each nest. Nest survival models (Dinsmore, White & Knopf 2002) deal with this type of data and can usefully be regarded as a special case of logistic regression, where the effect of covariates can be tested (Aebischer 1999; Hazler 2004; Shaffer 2004).
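The idea of allowing for varying time at risk can be illustrated with a minimal sketch. It assumes a single constant daily survival rate, no covariates, and hypothetical data given as (interval length in days, whether the nest survived the interval). It is not the implementation used in the cited nest survival software; a real analysis would model the daily rate on the logit scale with covariates.

```python
import math


def log_likelihood(daily_survival, intervals):
    """Log-likelihood of a constant daily survival rate.

    intervals: iterable of (days, survived) pairs, with days > 0.
    A nest surviving an interval of t days contributes s**t;
    a nest failing during the interval contributes 1 - s**t."""
    total = 0.0
    for days, survived in intervals:
        log_p_survive = days * math.log(daily_survival)
        if survived:
            total += log_p_survive
        else:
            # log(1 - s**days), computed stably via expm1
            total += math.log(-math.expm1(log_p_survive))
    return total


def estimate_daily_survival(intervals, tol=1e-9):
    """Maximum-likelihood estimate by ternary search on (0, 1).

    The log-likelihood is concave in s for intervals of at least one day,
    so a simple bracketing search is sufficient for this sketch."""
    lo, hi = 1e-6, 1.0 - 1e-6
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, intervals) < log_likelihood(m2, intervals):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical visit intervals: (days between visits, nest still active?)
    visits = [(3, True), (4, True), (3, False), (5, True), (2, True), (4, False)]
    print(round(estimate_daily_survival(visits), 3))
```

When every interval is exactly one day long, the estimate reduces to the simple proportion of intervals survived, which is a useful sanity check on the likelihood.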
Capture–mark–recapture: an omnibus tool in demography
Repeated observations of individually recognizable animals provide the most detailed information on variation in demographic parameters (survival, dispersal, and often also reproduction). Individuals are most often marked by researchers, but between-individual variation in external characters (Langtimm et al. 2004; Karanth et al. 2006) or genetic characteristics (Lukacs & Burnham 2005) can provide similar information, although the possibility of misidentification should be accounted for. A defining characteristic of this type of data is that the nonobservation of an individual does not necessarily imply that it is no longer alive; it may have left the study area temporarily or permanently or it may have been present, but missed (in which case its state will also be unknown). In addition to the demographic quantities of primary interest, it is therefore also necessary to estimate a ‘nuisance’ parameter, the probability of detection. Individuals may either simply be observed or not on a given occasion (single-state CMR; Lebreton et al. 1992), or they may be allocated to a number of discrete states, for example physical sites or reproductive states (multi-state CMR; Lebreton et al. 2009). Observations of dead marked individuals (typically ringed birds) also provide information on demography, and again the probability of observing and reporting an individual has to be estimated. The field of CMR statistics has developed to deal with the increasing number and complexity of field studies on marked individuals (Williams, Nichols & Conroy 2002; Thomson, Cooch & Conroy 2009).
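To show why nonobservation does not imply death, and why detection probability must be estimated alongside survival, the following sketch computes the probability of a single-state encounter history conditional on first capture. It assumes constant apparent survival (phi) and detection probability (p) across occasions; the function name and example values are illustrative assumptions, and real analyses would use dedicated CMR software.

```python
def cjs_history_probability(history: str, phi: float, p: float) -> float:
    """Probability of an encounter history such as '1011', conditional on
    first capture, with constant survival phi and detection p."""
    if not history or any(c not in "01" for c in history):
        raise ValueError("history must be a non-empty string of 0s and 1s")
    seen = [c == "1" for c in history]
    if not any(seen):
        raise ValueError("history must contain at least one capture")
    first = seen.index(True)
    last = len(seen) - 1 - seen[::-1].index(True)

    # Between first and last capture the animal is known to be alive:
    # it survives each interval and is either detected or missed.
    prob = 1.0
    for t in range(first + 1, last + 1):
        prob *= phi * (p if seen[t] else 1.0 - p)

    # After the last capture it either died, or survived and was missed,
    # repeatedly until the end of the study (computed backwards in time).
    never_seen_again = 1.0
    for _ in range(len(seen) - 1 - last):
        never_seen_again = (1.0 - phi) + phi * (1.0 - p) * never_seen_again
    return prob * never_seen_again


if __name__ == "__main__":
    for h in ("100", "101", "110", "111"):
        print(h, cjs_history_probability(h, phi=0.8, p=0.5))
```

The four histories in the usage example are all the possible outcomes for an animal first captured on the first of three occasions, so their probabilities sum to one. Note that '100' receives substantial probability even when survival is high, because an animal that is alive may simply be missed.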
Data on encounters of live or dead marked individuals have been used to investigate links between demography and environment for decades (e.g. North & Morgan 1979). Sequential estimates of, for example, survival show negative sampling covariance and are thus nonindependent (Jolly 1965). It is therefore statistically invalid to extract annual estimates of survival, etc. and regress these estimates against environmental variables; instead, relationships of interest should be tested as an integral part of data analysis using a so-called ultrastructural model (Lebreton et al. 1992).
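As an illustration of what such a constrained model looks like, a common choice (assumed here, not prescribed by the text) is to express survival in year t as a linear function of an environmental covariate on the logit scale. The symbols below are ours: \phi_t is survival over interval t, x_t is the covariate value, and \beta_0 and \beta_1 are the intercept and slope estimated directly from the encounter histories.

```latex
\operatorname{logit}(\phi_t) = \log\!\left(\frac{\phi_t}{1-\phi_t}\right) = \beta_0 + \beta_1 x_t
```

Because \beta_1 is estimated within the CMR likelihood, its uncertainty accounts for the sampling covariance among annual estimates, which a post hoc regression of separate estimates would ignore.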
Although this approach is easily applied in CMR software, there are potential pitfalls when assessing the statistical significance and biological importance of environmental covariates (detailed review in Grosbois et al. 2008). Briefly, when between-year variation in a given parameter is pronounced (which is often the case in even moderately large data sets), both standard likelihood ratio tests and AIC-based model selection (Burnham & Anderson 2002) are biased. Two approaches exist to deal with this problem: analysis of deviance (Skalski, Hoffmann & Smith 1993), which provides an anova-like partitioning of the total between-year variation into a component explained by the covariate and residual variation, and mixed models with random year effects (Loison et al. 2002). Analysis of deviance has recently been shown to give a robust approximation to the more sophisticated approaches in the mixed model framework (Lebreton, Choquet & Gimenez 2012). Proper statistical assessment of the importance of environmental covariates is critical for achieving robust inference.
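The partitioning performed by analysis of deviance can be sketched as follows. The calculation assumes three nested models fitted to the same data: one with the parameter constant over time, one with the parameter constrained by the covariate(s), and one with full time dependence. The function and argument names are hypothetical, the deviances in the usage example are invented for illustration, and the degrees-of-freedom convention used here (number of covariates, and number of time-specific estimates minus number of covariates minus one) is our assumption and should be checked against the cited references.

```python
def analysis_of_deviance(dev_constant, dev_covariate, dev_time,
                         n_time_params, n_covariates=1):
    """ANOVA-like partitioning of between-year variation.

    Returns (r2, f_stat, df_num, df_den), where r2 is the fraction of the
    temporal deviance accounted for by the covariate model and f_stat
    compares explained deviance with residual temporal deviance."""
    explained = dev_constant - dev_covariate   # accounted for by the covariate(s)
    total = dev_constant - dev_time            # all between-year variation
    residual = dev_covariate - dev_time        # temporal variation left unexplained
    df_num = n_covariates
    df_den = n_time_params - n_covariates - 1
    if total <= 0 or residual <= 0 or df_den <= 0:
        raise ValueError("models must be nested with positive residual deviance and df")
    r2 = explained / total
    f_stat = (explained / df_num) / (residual / df_den)
    return r2, f_stat, df_num, df_den


if __name__ == "__main__":
    # Invented deviances for a study with 15 time-specific survival estimates.
    print(analysis_of_deviance(1250.0, 1235.0, 1220.0, n_time_params=15))
```

The F statistic would then be compared with an F distribution on the returned degrees of freedom; that step is omitted here because it requires a statistical library.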
CMR studies are most often correlative, because designed experiments are difficult in most vertebrates. This limits the strength of inference regarding causal links. Path analysis (or structural equation modelling) allows causal modelling with observational data, and this approach has recently been implemented in CMR models (Cubaynes et al. 2012; Gimenez, Anker-Nilssen & Grosbois 2012).
Goodness-of-Fit Testing and Capture Heterogeneity
Capture–mark–recapture models make a number of assumptions about the data used (Lebreton et al. 1992; Pradel, Gimenez & Lebreton 2005; Kendall et al. 2009), and whenever possible these assumptions should be tested using goodness-of-fit tests. If not detected and accounted for, violations of these assumptions can lead to bias in model selection and overestimation of precision. The most important assumption is that all individuals should have the same probability of surviving and being observed or captured. While obviously never exactly true, this, like other assumptions, is a useful approximation and a guideline for evaluating data quality. The program U-CARE (Choquet et al. 2009) provides informative tests for specific violations of this assumption for both single-state and multi-state CMR data, and the results can be used to select an appropriate starting model for the data. While among-individual variation in survival often can be accommodated through stratification by, for example, age or sex, or by inclusion of individual-level covariates such as body condition, capture heterogeneity is more difficult to handle. Experience shows that capture heterogeneity is near-ubiquitous in ecological data sets, often in the form of ‘immediate trap-happiness’ identified through test 2.CT in U-CARE. This simply implies that individuals observed on the previous occasion are more likely to be observed on the current occasion than those not observed on the previous occasion (Pradel 1993). There are many potential biological explanations for this phenomenon, including the combination of nest site fidelity and variable observability of nest sites. 
Trap happiness (and capture heterogeneity in general) can cause bias in survival estimates (Pradel 1993) and should therefore be handled whenever possible; potential solutions include multistate models with unobservable states (Gimenez, Choquet & Lebreton 2003; Pradel & Sanz-Aguilar 2012), inclusion of auxiliary data such as dead recoveries (Frederiksen & Bregnballe 2000), mixture models where individuals are assigned to two or more classes with different encounter probabilities (Pledger, Pollock & Norris 2003; Pradel 2009), and CMR mixed models in which an individual random effect on encounter probabilities is included to account for interindividual differences (Royle 2008; Gimenez & Choquet 2010).
Advanced CMR Models
Capture–mark–recapture statistics have progressed rapidly in recent years (e.g. Schaub & Kendall 2012), and models are now available that take into account many common data peculiarities, allow relaxations of standard assumptions and make possible the estimation of additional parameters. The scope of this review does not allow full coverage of all these developments, but Table 2 provides an overview of recent progress with key references (see also Lindberg 2012).
Table 2. Examples of questions that can be addressed with advanced capture–mark–recapture (CMR) models, including key references
Identifying environmental covariates
Too many candidate covariates; risk of spurious significant results (if no correction) or lack of power (e.g. Bonferroni correction)
PCA of local covariates, and more generally, methods for protecting regression
Capture–mark–recapture models generally cannot be fitted in standard statistical software because they involve the estimation of the ‘nuisance’ parameter detection probability. There are currently two main software packages for CMR: MARK (White & Burnham 1999) and E-SURGE (Choquet, Rouan & Pradel 2009). Both programs are very flexible and allow a wide range of CMR models to be fitted. MARK probably has the shallower learning curve, largely due to the excellent documentation (Cooch & White 2012), and also includes, for example, models for the estimation of population size from CMR data. The greatest strengths of E-SURGE are the quality of its numerical procedures and the underlying multievent framework (Pradel 2005), where observations are seen as imperfect reflections of underlying true states. This framework is particularly useful for models incorporating, for example, state uncertainty or heterogeneity (Pradel 2009; Gimenez et al. 2012). CMR models can also be fitted in a Bayesian state-space framework (King 2012), but this requires access to statistical expertise and knowledge of programming in software such as OpenBUGS (Lunn et al. 2000).
Population modelling: joining the pieces of the jigsaw
Often, the end goal of demographic analysis is to estimate population growth rate and the factors affecting it, so that appropriate management action can be taken. Population growth rate is determined by the values of the basic demographic parameters, so this exercise involves joining up all demographic information available for the study population. Depending on the data available and the specific aims of the case study, this can be carried out in several ways.
Age- or stage-structured matrix models provide a mathematically stringent framework for exploring the population-level consequences of a set of demographic parameter values. For deterministic models (i.e. with constant parameter values), the asymptotic properties of the projection matrix include the projected population growth rate, stable age distribution and reproductive values (Caswell 2001). Models can be constructed in accessible software and can easily be extended to include, for example, density dependence, environmental stochasticity or demographic stochasticity (only relevant for small populations) (Legendre & Clobert 1995; Legendre 1999). Time-varying matrices can also be used to explore the implications of functional links between environment and demography (e.g. Frederiksen et al. 2004). Due to rapid environmental change, realized age distributions may be far from stable, and in such cases, transient dynamics are of more interest than asymptotic properties (Hodgson & Townley 2004; Koons, Grand & Arnold 2006; Stott, Townley & Hodgson 2011).
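As a minimal sketch of the deterministic case, the asymptotic growth rate and stable age distribution can be obtained by repeatedly projecting the population forward until its structure stops changing. The two-class matrix and its parameter values below are hypothetical, and the approach assumes a matrix for which repeated projection converges (a primitive matrix); dedicated software would normally be used for real analyses.

```python
def asymptotic_growth(matrix, max_iter=10000, tol=1e-12):
    """Asymptotic growth rate and stable structure by repeated projection.

    matrix: square list of lists, with matrix[i][j] the contribution of
    class j at time t to class i at time t + 1."""
    n = len(matrix)
    structure = [1.0 / n] * n   # start from equal proportions in each class
    growth = 0.0
    for _ in range(max_iter):
        projected = [sum(matrix[i][j] * structure[j] for j in range(n))
                     for i in range(n)]
        total = sum(projected)   # one-step growth, since structure sums to 1
        projected = [x / total for x in projected]
        converged = abs(total - growth) < tol
        growth, structure = total, projected
        if converged:
            break
    return growth, structure


if __name__ == "__main__":
    # Hypothetical two-class model (first-year birds, older birds):
    # the top row holds fecundities, the bottom row holds survival probabilities.
    projection_matrix = [
        [0.0, 1.5],
        [0.22, 0.8],
    ]
    growth_rate, stable_structure = asymptotic_growth(projection_matrix)
    print(round(growth_rate, 4), [round(x, 4) for x in stable_structure])
```

For this hypothetical matrix the growth rate converges to 1.1, that is, a projected increase of 10% per year once the age structure has stabilized. The early iterations of the same loop show the transient dynamics that arise when the starting structure is far from stable.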
Integrated Population Models
This powerful tool (Besbeas, Freeman & Morgan 2005; Thomas et al. 2005; Schaub & Abadi 2011) allows the inclusion of all available data on the study population, for example, counts and observed age ratios, as well as demographic data. In contrast to matrix models, which take estimated demographic parameter values as input, integrated models combine parameter estimation and dynamic modelling, so that population projections take proper account of estimation error and potentially model uncertainty. These models include statistical estimation of demographic parameters as well as a matrix model and are thus complex to construct; some level of proficiency in programming is required (Kéry & Schaub 2012). The main advantages are that they make full use of all available data, provide honest projections reflecting the full range of uncertainties and sometimes allow the estimation of additional demographic parameters on which no direct data are available (e.g. immigration or emigration, Reynolds et al. 2009; Abadi et al. 2010; Péron et al. 2010b).
Estimating Population Growth Rate Using CMR
Population growth rate can also be estimated directly from data on marked individuals, and hypotheses regarding the importance of environmental covariates can be tested using specific CMR models (Pradel 1996; Nichols et al. 2000). Unlike most other CMR models covered here, this approach does not condition on first capture; in other words, the initial capture process must be modelled. In practice, this is equivalent to an assumption that the initial capture process is comparable to later recaptures. This assumption is obviously not met for species where field methods differ between first capture and recapture (e.g. physical capture vs. resighting from a distance). On the other hand, this approach is robust in certain conditions to violations of the standard assumption of homogeneous detection probabilities (Pradel et al. 2010; Marescot et al. 2011).
Conclusions: can demography answer the critical questions?
Applied issues that require an understanding of drivers of demographic variation are ubiquitous and diverse, including management of threatened or declining species, minimizing problems caused by pest species, or understanding the impacts of global change. Whenever sufficient data are available, demographic analyses of varying degrees of sophistication can provide robust answers to management-related questions. The following examples demonstrate this.
Management of rare and/or threatened species: The takahe Porphyrio hochstetteri is a globally threatened large flightless rail, which has only one naturally occurring population, in the Fiordland area of New Zealand. Hegg et al. (2012) demonstrated that while annual variation in demography was related to local weather, management initiatives to reduce predation by stoats Mustela erminea and rearing chicks in captivity had resulted in higher annual survival and fecundity, respectively.
Identifying causes of declines in common species: The black-legged kittiwake Rissa tridactyla is a widespread and common seabird, which has declined recently in large parts of its Atlantic range. In one colony, Frederiksen et al. (2004) showed that both fecundity and survival were negatively affected by high winter temperatures and the presence of a local fishery and that these impacts were sufficient to induce a long-term decline in population size. For fecundity, the relationship was confirmed by independent data from several nearby colonies (Frederiksen, Mavor & Wanless 2007).
Management of problem species: The great cormorant Phalacrocorax carbo is a very efficient predator of fish in shallow waters, and from 1970 to 2000, the European population increased dramatically as it recovered from past persecution. This led to widespread conflicts with fishery interests. Demographic analysis and modelling revealed pronounced density dependence in all vital rates (Frederiksen & Bregnballe 2000; Frederiksen, Lebreton & Bregnballe 2001), and along with increased culling, this has contributed to a stabilization of breeding populations in Western Europe.
Identifying and predicting impacts of global change: Many studies of climate–demography links concern seabirds, but few have used projections from global climate models in conjunction with demographic analysis and modelling to predict impacts on population growth rate. Barbraud et al. (2011) did this based on long-term demographic data for three seabird species in the southern Indian Ocean. They found that the most northerly species, the Amsterdam albatross Diomedea amsterdamensis, was unaffected by projected global warming, whereas two more southerly distributed species, the black-browed albatross Thalassarche melanophrys and the snow petrel Pagodroma nivea, were strongly negatively affected.
It should thus be clear that careful demographic analysis and modelling can answer a large range of highly relevant questions in applied ecology, both specific and generic. In particular, CMR analysis has led to new insights of general relevance for fields such as conservation, including the widespread pattern that survival is higher, overall fecundity lower, and generation time thus longer than previously assumed for most species (Lebreton 2006), with strong implications for the sensitivity of populations to environmental change and their ability to recover from perturbations. The main factor limiting the potential of these methods is the availability of high-quality demographic data, although valuable insights can be gained from comparison with related or ecologically similar species or from meta-analyses. We therefore stress the importance of implementing and maintaining standardized long-term programmes for collection of demographic data, including observations of marked individuals (Clutton-Brock & Sheldon 2010). In addition, there is a need for better and more widespread training of students and practicing ecologists in quantitative demographic methods (Gimenez et al. 2013).
This work was partly motivated by the experience of the first author as reviewer and associate editor. Our thanks therefore go to the many authors who have brought the issues covered in this review to our attention. Thanks also to Nigel Yoccoz and an anonymous reviewer for extremely helpful comments on a previous version.