Estimating demographic parameters from capture–recapture data with dependence among individuals within clusters




  1. Two-level data, in which level-1 units or individuals are nested within level-2 units or clusters, are very common in natural populations. However, very few multilevel analyses are conducted for data with imperfect detection of individuals. Multilevel analyses are important to quantify the variability at each level of the data.
  2. In this study, we present two-level analyses for estimating demographic parameters from data with imperfect detection of individuals and with a source of individual variability that is nested within a source of cluster variability.
  3. This method allows separating and quantifying the phenotypic plasticity or facultative behavioural responses from the evolutionary responses. We illustrate our approach using data from studies of a long-lived perennially monogamous seabird, the Cory's shearwater (Calonectris diomedea) and a patchy population of collared flycatchers (Ficedula albicollis).
  4. We demonstrate the existence of dependence in recapture probability between paired individuals in the Cory's shearwater. In addition, we show that family structure has no influence on parent–offspring resemblance in collared flycatcher dispersal.
  5. The new method is implemented in program e-surge, which is freely available from the internet.


Demographic parameters are key population dynamics components to address important questions in ecology, management and evolution. For example, estimation of survival and dispersal often involves a capture–recapture (CR) protocol in which individuals are captured, marked, and released in their environment. CR models allow inferring demographic processes in spite of the practical impossibility to detect all individuals in a natural population at each sampling session. However, most CR studies rely on the assumption of independence between individuals, hence ignoring the natural associations among individual fates. For example, each individual may belong to a cluster of individuals (i.e. a subset of individuals that remains the same across time) such as a family, a set of young born the same year with the same mother or a set of individuals occupying the same geographical location. In these situations, two individuals belonging to the same cluster may have more similar parameter values than two individuals from different clusters.

If individual characteristics such as phenotype, cluster characteristics or habitat quality are measured in the field, it is relatively easy to incorporate these sources of variation in CR models, using covariates. However, there are many situations in which the information at the individual or the cluster level cannot be measured, for example when individuals are not captured physically. Importantly, ignoring heterogeneity arising from individuals or clusters of individuals may induce biases in parameter estimates (Barry et al. 2003). This leads to the detection of an effect more often than it should (i.e. an inflated type I error rate) (Lin 1997). From a biological perspective, ignoring this heterogeneity may lead to flawed inference in conservation biology (Cubaynes et al. 2010) and evolutionary ecology (Cam et al. 2002; Péron et al. 2010). These data sets can be statistically modelled using a multilevel analysis, taking into account sources of variation in the cluster-level process that generates the dependence.

Recent studies have focused on accounting for the variation in demographic parameters at the level of either the individual (Royle 2008; Gimenez & Choquet 2010) or the cluster of individuals (Choquet & Gimenez 2012). However, studies accounting for both levels simultaneously are lacking. Nevertheless, modelling for dependence between individuals within clusters is crucial to quantify inter- and intra-individual variation in demographic parameters (Doran et al. 2007; van de Pol & Wright 2009) and thus to address relevant questions in evolutionary ecology (Doligez et al. 2012) and behavioural ecology (Cohas et al. 2007). Of particular importance is the assessment of the proportion of phenotypic variation that can be attributed to between-subject variation, versus the proportion due to measurement error and phenotype flexibility. This so-called intra-class coefficient (ICC) has many applications (Nakagawa & Schielzeth 2010). For example, ICC has been used to quantify the resemblance in time series of demographic parameters between populations (Grosbois et al. 2009) or species (Lahoz-Monfort et al. 2009).

The simplest and most common form of data structure is two-level data, in which ‘level-1 units’ are nested within ‘level-2 units’. Some typical examples of this scheme include measurements nested within subjects (in the case of longitudinal data) and individuals nested within families. In both cases, relationships between level-1 and level-2 units frequently exist and are worthy of research, for example those existing between pairs (Schmutz et al. 1995; Schwarz 2002) or those arising from within-family resemblance (i.e. between parents and offspring or between siblings) in dispersal (Massot & Clobert 2000; Doligez et al. 2012).

In medicine and in the social and agricultural sciences, data arise from an exhaustive monitoring of individuals. Several models and tools currently use random effects to account for dependence at both the level of the individual and the cluster, and they are widely used by practitioners (Diez-Roux 2000; Gelman & Hill 2006). Mixed models accommodating two or more levels of hierarchical data are available in popular statistical programs such as SAS with its procedure NLMIXED, or R with its packages lme4 and nlme (Pinheiro & Bates 2000; Bates & Sarkar 2007). However, this category of models has not yet been fully developed in population biology, where data often originate from a nonexhaustive monitoring of individuals, resulting in complex likelihoods that are very time consuming to calculate. Recently, Choquet & Gimenez (2012) considered the full dependence of individuals belonging to the same cluster in CR studies. However, this preliminary work did not consider two-level analyses and therefore was unable to compute the ICC.

Here, we generalize this work by allowing flexibility in the dependence between individuals of a cluster. We develop random-effects models to accommodate data hierarchy with two levels: subjects nested within clusters. Our new approach is illustrated using two case studies. First, we consider the dependence of fates (survival and recapture) between members of pairs in a long-lived perennially monogamous seabird, the Cory's shearwater (Calonectris diomedea). Secondly, we investigate the influence of family structure on parent–offspring resemblance in dispersal in a patchy population of collared flycatchers (Ficedula albicollis). These models are implemented in the software application e-surge (Choquet, Rouan & Pradel 2009) within a frequentist approach using numerical integration.

Probabilistic framework for modelling capture–recapture

Assume we have K capture occasions and N individuals. Let the encounter history for individual i of cluster j be hij = (oij1, …, oijK), where oijk denotes whether individual i of cluster j is observed in state m (oijk = m) or not observed (oijk = 0) at occasion k. We consider three sets of parameters commonly used in multistate CR models [see Lebreton et al. (2009) for a review]:

  • Survival probabilities st,m: The probability that an individual in state m at time t survives from occasion t to occasion t + 1.
  • Transition probabilities ψt,mn: The probability for an animal being in state m at time t to be in state n at time t + 1 conditional on being alive at time t.
  • Recapture probabilities pt,n: The probability for an animal to be recaptured at occasion t in state n.

This model, in which survival, transition and recapture probabilities are state- and time-dependent, corresponds to the Arnason–Schwarz model [AS, Arnason (1973); Schwarz, Schweigert & Arnason (1993)] and is denoted Sf.t Ψ Pto.t where ‘f’ and ‘to’ are notation for, respectively, state of departure and state of arrival and ‘t’ denotes dependence upon time [see Choquet (2008) for further details on notation]. When there is only one state, then there is no transition and the AS model reduces to the so-called Cormack-Jolly-Seber [CJS, Lebreton et al. (1992)] model denoted StPt.
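As a minimal sketch of how such a model assigns a probability to one encounter history, the following Python function evaluates the single-state CJS case, conditional on first capture. Constant survival s and recapture p, the 0/1 coding of the history and the function name are simplifying assumptions made for illustration; this is not the implementation used in e-surge.

```python
def cjs_history_probability(history, s, p):
    """Probability of a 0/1 encounter history under a CJS model with
    constant survival s and recapture p, conditional on first capture."""
    if 1 not in history:
        raise ValueError("history must contain at least one capture")
    n_occasions = len(history)
    first = history.index(1)
    last = n_occasions - 1 - history[::-1].index(1)

    # Between first and last capture the individual is known to be alive:
    # each interval contributes survival times (recaptured or missed).
    prob = 1.0
    for k in range(first + 1, last + 1):
        prob *= s * (p if history[k] == 1 else 1.0 - p)

    # chi: probability of never being seen again after the last capture,
    # built backwards from the final occasion (where it equals 1).
    chi = 1.0
    for _ in range(last + 1, n_occasions):
        chi = (1.0 - s) + s * (1.0 - p) * chi
    return prob * chi
```

A quick sanity check is that the probabilities of all histories sharing the same first capture occasion sum to one.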

Modelling dependence in a cluster of individuals

Individuals of distinct clusters are assumed independent whereas individuals of the same cluster are not. We consider a parameter θ(ij) (for survival, transition or recapture probability) for the i-th individual of cluster j. For the sake of simplicity, we remove the index for time. We assume also that the size of clusters is constant and equal to NI. The model for the response (on the logistic scale) of the i-th individual to the j-th cluster is as follows:

$$\operatorname{logit}\big(\theta_{(ij)}\big) = \sum_{l} \beta_l X_{lij} + Z_{ij}\, b_j \qquad \text{(eqn 1)}$$

where each vector bj = (b1j, …, bNIj) is a vector of independent and identically distributed (i.i.d.) random effects following a multivariate normal distribution N(0,Σ), where Σ is the variance–covariance matrix associated with the design matrix Ζ, and βl is a fixed effect associated with the design matrix Xlij. The fixed effects in X may take different values for individuals and clusters. That is, one may make use of variables that describe the individual (‘level-1 variables’) or the cluster (‘level-2 variables’) or both. For illustration, we consider three biological situations in which parameter θ is successively survival probability (s), transition probability (ψ) and recapture probability (p).

Example 1 – Modelling survival probability in the CJS model: We consider pairs (level 2) in which males and females are individuals of the level 1. To quantify between-pair variation in survival, we consider the following model:

$$\operatorname{logit}\big(s_{(ij)}\big) = \beta_0 + b_j \qquad \text{(eqn 2)}$$

where β0 is the mean survival on the logit scale and the bj's are i.i.d. following a univariate normal distribution N(0,σ²), where σ² is the variance between pairs.

Example 2 – Modelling recapture probability in the CJS model: As in example 1, we consider pairs. To quantify the variability of the recapture between pairs and the proportion of total variation explained by between-pair variation, we consider the following model:

$$\operatorname{logit}\big(p_{(ij)}\big) = \beta_0 + b_{ij} \qquad \text{(eqn 3)}$$

where bj = (b1j, b2j) is i.i.d. as a bivariate normal N(0,Σ) with $\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$ and β0 is the intercept. Parameter ρ is the correlation between the two individuals of a pair. Here, σ² is the total variance and, when ρ is positive, ρσ² is the between-pair variance, i.e. the covariance shared by the two members of a pair. In that case, ρ is the ICC and has a direct biological interpretation (see above).
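The structure of example 2 can be made concrete with a small simulation. The sketch below draws the two random effects of each pair with a common standard deviation σ and correlation ρ, then checks that the sample correlation is close to ρ. The function names, the seed and the parameter values in the usage note are hypothetical and serve only as an illustration.

```python
import math
import random


def inverse_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_pair_effects(sigma, rho, n_pairs, seed=1):
    """Draw (b_1j, b_2j) for n_pairs pairs from a bivariate normal with
    common standard deviation sigma and correlation rho."""
    rng = random.Random(seed)
    effects = []
    for _ in range(n_pairs):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        b1 = sigma * z1
        # Mixing z1 and z2 gives corr(b1, b2) = rho and var(b2) = sigma^2.
        b2 = sigma * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        effects.append((b1, b2))
    return effects


def sample_correlation(pairs):
    """Pearson correlation between the first and second member values."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / n
    v1 = sum((a - m1) ** 2 for a, _ in pairs) / n
    v2 = sum((b - m2) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(v1 * v2)
```

For instance, `[(inverse_logit(1.0 + b1), inverse_logit(1.0 + b2)) for b1, b2 in simulate_pair_effects(1.7, 0.6, 1000)]` gives pair-specific recapture probabilities for an assumed intercept β0 = 1; setting ρ = 1 makes both members of a pair share the same effect.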

Example 3 – Modelling transition (conditional on being alive) probability in the AS model: We consider families (level 2) in which all members are exchangeable (i.e. all individuals behave the same concerning the transition). We consider the transition probabilities between two states, ‘ND: non dispersers’ (state 1) and ‘D: dispersers’ (state 2). We evaluate the between-family variability in the transition from state 2 to state 1 using the model:

$$\operatorname{logit}\big(\psi^{D \to ND}_{(ij)}\big) = \beta_2 + b_j$$

where the bj's are i.i.d. following a univariate normal distribution N(0,σ²) and β2 is the intercept for the transition from state D to ND.

To describe a mixed model in e-surge, we use a general syntax of the form ‘phrase1 + phrase2’ where phrase1 is any phrase for fixed effects and phrase2 (in italics) is any phrase for random effects. The phrase ‘i+pairs’ models Eqn 2, where i is the intercept. The phrase ‘i+sex/pairs’ models Eqn 3, where a/b means a nested in b. In that case, the standard deviation (σ) is assumed constant.

Structure of the variance–covariance matrix

The set of covariance matrices that we can consider can be conveniently described by first decomposing

$$\Sigma = \Gamma\, P\, \Gamma \qquad \text{(eqn 4)}$$

with Γ the diagonal matrix of the standard deviations and P the matrix of correlations. For a matrix of dimension 2 × 2 associated with a cluster of size two, $\Gamma = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}$ and $P = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$. For ρ = 0 and constant standard deviations, example 2 reduces to an individual random effect where all individuals are independent (Gimenez & Choquet 2010). For ρ = 1, this random effect reduces to a random effect where individuals are fully dependent (Choquet & Gimenez 2012). For a variance–covariance matrix

$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$

ρ captures the relative importance of the between-cluster variance $\sigma_B^2$ and the within-cluster variance $\sigma_W^2$. For a positive correlation, ρ is the ICC and there is a direct relationship between $\sigma_B^2$, $\sigma_W^2$ and ρ, which is $\rho = \sigma_B^2 / (\sigma_B^2 + \sigma_W^2)$.
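The decomposition into standard deviations and correlations can be sketched in a few lines of Python. The example builds a variance–covariance matrix from a vector of standard deviations (constant or level dependent) and a constant-correlation matrix, and computes the ICC as the share of between-cluster variance in the total. Function names and numerical values are illustrative assumptions, not part of e-surge.

```python
def compound_symmetry(n, rho):
    """Constant-correlation matrix P: ones on the diagonal, rho elsewhere."""
    return [[1.0 if r == c else rho for c in range(n)] for r in range(n)]


def covariance_from_decomposition(sds, corr):
    """Sigma = Gamma * P * Gamma, with Gamma = diag(sds)."""
    n = len(sds)
    return [[sds[r] * corr[r][c] * sds[c] for c in range(n)] for r in range(n)]


def intraclass_correlation(var_between, var_within):
    """ICC: proportion of total variance due to between-cluster variance."""
    return var_between / (var_between + var_within)


# Constant standard deviations (sigma = 2) and rho = 0.25 for a pair.
sigma_pair = covariance_from_decomposition([2.0, 2.0], compound_symmetry(2, 0.25))
# Level-dependent standard deviations with the same correlation structure.
sigma_het = covariance_from_decomposition([1.0, 3.0], compound_symmetry(2, 0.5))
```

With constant standard deviations the off-diagonal term equals ρσ², so a between-cluster variance of 1 and a within-cluster variance of 3 correspond to the same ρ = 0·25 as in the first matrix.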

Several structures for matrices Γ and P are possible (Table 1), the choice of which depends on the problem under consideration (Wolfinger 1996). Sub-table 1A shows two possible structures for standard deviations, either constant (HOM) or level dependent (HET). Sub-table 1B shows two possible structures for correlation between individuals nested within a cluster. The first one (CS) considers a constant correlation, whereas the second one (UN) considers the full parameterization of correlation. The order of the variance–covariance matrix is the size of the cluster, which can be large. The order of Σ can be reduced when the correlation ρ is assumed to be positive and when both variances and correlations are constant. When the size of the cluster is large, it is convenient from a computational perspective (see above) to consider instead the addition of two random effects, one for the individual (level 1) and the other for the cluster (level 2), which leads to the following variance–covariance matrix:

Table 1. Main diagonal standard-deviation matrices Γ (Part A) and symmetric correlation matrices P (Parts B and C, upper part only). Structures of Parts A and B are implemented in e-surge. Part A shows two possible structures for standard deviations, either constant (HOM) or level dependent (HET). Part B shows two possible structures for correlation between individuals nested within a cluster. Structure CS considers a constant correlation, whereas structure UN considers the full parameterization of correlation. Structures described in Part C are used for spatial and time correlation. TOEP may be used for spatial analysis. A distance structure can be taken into account by applying an exponent to the correlation. The exponent may depend on the number of lags between two observations (second row) or it may be real for a distance (third row: dij is the distance between site i and site j).
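$$\Sigma = \begin{pmatrix} \sigma_1^2 + \sigma_2^2 & \sigma_2^2 & \cdots & \sigma_2^2 \\ \sigma_2^2 & \sigma_1^2 + \sigma_2^2 & \cdots & \sigma_2^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_2^2 & \sigma_2^2 & \cdots & \sigma_1^2 + \sigma_2^2 \end{pmatrix}$$

where $\sigma_1^2$ denotes the variance of the individual (level 1) random effect and $\sigma_2^2$ the variance of the cluster (level 2) random effect.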

Parameter estimation

Assuming that individuals are independent conditional on the random effects, the likelihood for fixed effects for the entire set of encounter histories is obtained as the product of the probability P(hij|β,b) for the history hij of individual i in cluster j. The marginal likelihood is obtained by integrating the product of the probability P(hij|β,b) with respect to the distribution of the random effects. For i.i.d. random effects on clusters, calculating the marginal likelihood can be made much more efficient from a computational perspective by reducing the dimension of the integral to the order n of Σ applied to the cluster. This integral is one-dimensional in examples 1 and 3 and bi-dimensional in example 2. The marginal likelihood of the CR mixed model becomes:

$$L(\beta, \Sigma) = \prod_{j=1}^{J} \int_{\mathbb{R}^n} \Big[ \prod_{i \in I_j} P(h_{ij} \mid \beta, b) \Big] f(b \mid \Sigma)\, \mathrm{d}b \qquad \text{(eqn 5)}$$

where J is the number of clusters, Ij is the set of individuals of cluster j and f(x|Σ) is the density function of a N(0,Σ) with Σ of order n, which is lower than or equal to the size of the cluster and may be constant between clusters or not. For an individual or a group random effect, n is equal to 1.

The marginal likelihood involves integrals that cannot be evaluated analytically due to the complexity of the CR model likelihood. We use the Gauss-Hermite quadrature [GHQ, Liu & Pierce (1994)] which is known to work well for a large class of problems, at least for low-dimensional integrals and a Gaussian distribution for the random effects. We obtain maximum likelihood estimates (MLEs) of the model parameters by maximizing the marginal likelihood using a quasi-Newton algorithm. Approximate standard errors (SEs) are obtained from the inverse of the Hessian calculated from a standard finite-difference scheme. We used a GHQ of order 8 up to order 15 (Gimenez & Choquet 2010; Choquet & Gimenez 2012).
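To illustrate the quadrature step, here is a minimal sketch of the one-dimensional case (as in examples 1 and 3), where the members of a cluster share a single normal random effect. It uses a hard-coded 5-point Gauss-Hermite rule for brevity, whereas the analyses reported here used orders 8 to 15. The function names and the representation of each P(hij | b) as a Python callable are assumptions made for the example.

```python
import math

# 5-point Gauss-Hermite rule for the weight exp(-x^2) (standard tabulated values).
GH_NODES = (-2.020182870456086, -0.958572464613819, 0.0,
            0.958572464613819, 2.020182870456086)
GH_WEIGHTS = (0.0199532420590459, 0.393619323152241, 0.945308720482942,
              0.393619323152241, 0.0199532420590459)


def normal_expectation(g, sigma):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature,
    using the change of variable b = sqrt(2) * sigma * x."""
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        total += w * g(math.sqrt(2.0) * sigma * x)
    return total / math.sqrt(math.pi)


def cluster_marginal_likelihood(conditional_probs, sigma):
    """One factor of the marginal likelihood: the product over the members
    of a cluster of P(h_ij | b), integrated over the shared effect b.
    conditional_probs is a list of functions b -> P(h_ij | b)."""
    def product_given_b(b):
        prod = 1.0
        for prob in conditional_probs:
            prod *= prob(b)
        return prod
    return normal_expectation(product_given_b, sigma)
```

The full marginal likelihood is then the product of these cluster factors over the J clusters; in practice one would sum their logarithms before passing the result to an optimizer.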

Testing hypotheses about variance–covariance components

As argued before, testing for dependence is crucial for biologists. However, in equation (2), a likelihood ratio test (LRT) cannot be safely carried out in its standard form to test whether the variance differs from 0, whether the cluster correlation differs from 1 (to test whether there is an individual effect) or whether the cluster correlation differs from −1. The issue arises because the tested value lies on the boundary of the parameter space, which violates the conditions required to apply the LRT. In a standard approach, the LRT statistic is assumed to follow a chi-square distribution with one degree of freedom. The test of whether the cluster correlation is equal to 1 can still be performed in this way, but it is then conservative, i.e. it fails to reject the null hypothesis too often. Fortunately, because a correlation is constrained within the interval (−1, 1), testing whether a correlation differs from 1 is essentially asking whether this correlation is significantly lower than 1. This can be accomplished by testing the observed chi-square against a chi-square distribution that is an equal mix of a chi-square distribution with one degree of freedom and another one with zero degrees of freedom [e.g. Self & Liang (1987)]. In practice, this means that one needs to perform a one-sided test, in other words to halve the P-value obtained by using the standard chi-square distribution with one degree of freedom. Note that a similar reasoning holds when testing whether variance components differ from zero in equation (2). We refer to Self & Liang (1987) for more complex situations.
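The halved P-value can be computed without any statistical library, because the upper tail of a chi-square distribution with one degree of freedom has a closed form in terms of the complementary error function. The sketch below is an illustration only; the function names and the optional overdispersion argument `c_hat` are assumptions introduced for the example.

```python
import math


def chi2_sf_one_df(x):
    """Upper tail P(X > x) of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(deviance_null, deviance_alt, c_hat=1.0):
    """P-value under the 50:50 mixture of chi-square(0) and chi-square(1).
    c_hat is an optional overdispersion coefficient dividing the statistic."""
    statistic = max(deviance_null - deviance_alt, 0.0) / c_hat
    if statistic == 0.0:
        # All of the chi-square(0) mass sits at zero, so the P-value is 1.
        return 1.0
    return 0.5 * chi2_sf_one_df(statistic)
```

Applied to the deviances and overdispersion coefficient reported for the family effect in illustration 2, `boundary_lrt_pvalue(5997.82, 5994.96, c_hat=1.32)` gives a value close to the reported 0·07.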


We illustrated the above method by addressing two questions using capture–recapture data: the influence of behaviour on recapture probability (illustration 1) and the estimation of the heritability of a trait via transition probability (illustration 2). In both illustrations, we performed model selection in two steps. First, we carried out ‘standard’ model selection to determine which fixed effects should be kept, using the Akaike Information Criterion corrected for small sample size (AICc) or, when overdispersion was present, its quasi-likelihood counterpart (QAICc). Then we performed an LRT for the random effect(s) on the best fixed-effect model at the 5% significance level. The biological relevance and context of the models are partly detailed in Appendices S1 and S2 as a complement for illustrations 1 and 2, respectively (see Supporting Information).
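For readers unfamiliar with these criteria, the sketch below computes AICc and QAICc from a model deviance in their usual textbook form. Program-specific conventions (for example the effective sample size used, or whether the overdispersion coefficient counts as a parameter) may differ and are not taken from e-surge; the function name and the numbers in the usage note are hypothetical.

```python
def aicc(deviance, n_params, n_obs, c_hat=1.0):
    """AICc when c_hat == 1, QAICc otherwise (deviance divided by c_hat).

    deviance: -2 log-likelihood of the fitted model.
    n_params: number of identifiable parameters.
    n_obs:    effective sample size used in the small-sample correction.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("sample size too small for the correction term")
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return deviance / c_hat + 2.0 * n_params + correction
```

With a hypothetical deviance of 100, 4 parameters and 45 observations, the penalty is 8 and the small-sample correction is 1; the model with the lowest value among the candidates is retained.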

Illustration 1

We investigated the dependence of fates (local survival and recapture probabilities) between members of pairs in a long-lived perennially monogamous seabird (Calonectris diomedea), using a pair random effect. The data used here were collected between 2001 and 2008 on a medium-size breeding colony of Cory's shearwater (ca. 200 pairs) on the small island of Pantaleu (2·5 ha, 26 m a.s.l.) in the Balearic Archipelago (39°34′ N, 2°21′ E, Spain). Breeding adults were captured on their nests while incubating and marked with individually numbered rings. Individual sex was assigned from morphometric measurements. We considered the observations (captures/recaptures) of individuals from their first capture together with their mate, and each pair received a different code. Data on individuals pairing with widowed or divorced individuals from the initial pairs were discarded, and a total of 140 pairs were therefore considered. Consequently, the data set includes two levels, the individual level (characterized by sex within a pair) and the pair level. Goodness-of-fit tests (GOF) for the CJS model (Pollock, Hines & Nichols 1985) were carried out using program u-care (Choquet et al. 2009). No overdispersion was detected and the CJS model was retained as the general model under which we explored the fixed-effect structure. Models in which survival, recapture or both were constant (denoted i) or time-dependent (denoted t) were tested first, and models including the random effects (pair and sex nested within pair or not) were then tested against these fixed-effects models (Table 2).

Table 2. Model selection for the Cory's shearwater data. Fixed effects considered: i, constant parameter and t, time effect. Random effects (in italics): pair and sex; sex represents the level 1 of clustering, whereas pair represents the level 2. Sex/pair means sex nested in pair; sex.pair is equivalent to an individual effect. # Id. Par. is the number of identifiable parameters of the model
Model    Survival    Recapture    # Id. Par.    Deviance    AICc

Including an individual (model 6, Table 2, sex.pair) or a pair random effect (models 5 and 7, Table 2, pair and sex/pair) on local survival only slightly improved the deviance of the model compared with the constant survival model. For survival, the assumption H0: σ = 0 could not be rejected (P-value = 0·15). Therefore, the results suggested that the survival probabilities of individuals associated in pairs are independent in this species.

Including an individual (model 3, Table 2, t+sex.pair) or a pair random effect (models 1 and 2, Table 2, t+pair and t+sex/pair) on recapture markedly decreased the AICc value of the model compared with the model with time dependence alone on recapture. For recapture, the assumption H0: σ = 0 was rejected (P-value <0·01) and the estimate of σ (SE) was 1·708 (0·381). The assumption H0: ρ = 1 could not be rejected (P-value = 1). Therefore, the estimate of σ suggested a high heterogeneity between pairs in recapture probability. In this species, skipping reproduction is a common phenomenon (Sanz-Aguilar et al. 2011). Because only breeding birds were captured, sabbatical breeding periods taken by pairs may create the observed between-pair heterogeneity in recapture probability.

The estimate of survival probability did not differ substantially between the model taking into account a pair effect on recapture probability and the model not accounting for the pair effect (Table 3, s = 0·82 vs. s = 0·81 respectively). In contrast, the estimates of recapture probability for each year were higher when a pair effect was taken into account than when it was not (Table 3). In Appendix S3, we give details for implementing these models in e-surge.

Table 3. Comparisons of the estimates for parameters of the two-level model (i.e. including a pair random effect, model 2 of Table 2) and the standard model without random effect (model 4 of Table 2) fitted to the Cory's shearwater data. MLE stands for maximum likelihood estimate, SE for standard error. σ is the standard deviation of the pair random effect
Parameter    Without random effect, MLE (SE)    With random effect, MLE (SE)
p2           0·597 (0·156)                      0·749 (0·168)
p3           0·754 (0·043)                      0·767 (0·060)
p4           0·790 (0·040)                      0·839 (0·049)
p5           0·910 (0·026)                      0·95 (0·021)
p6           0·862 (0·034)                      0·922 (0·031)
p7           0·972 (0·019)                      0·985 (0·011)
p8           0·844 (0·052)                      0·893 (0·046)
s            0·814 (0·014)                      0·822 (0·014)
σ(pair)      NA                                 1·708 (0·381)

Illustration 2

We assessed the influence of family structure on the level of parent–offspring resemblance in dispersal propensity in a patchy population of collared flycatchers (Ficedula albicollis), using a family random effect. The collared flycatcher is a small hole-nesting migratory passerine bird that readily breeds in artificial nest boxes, providing an easy access to breeding data (Gustafsson 1989; Doligez, Gustafsson & Pärt 2009). The data used here were collected on a patchy population of collared flycatchers on the Swedish island of Gotland (Gustafsson 1989; Doligez et al. 2004; Doligez, Gustafsson & Pärt 2009). In this population, dispersal can easily be defined by a change of patch between years. Dispersal status of each individual is therefore determined by comparing the patches of capture between years, ignoring years when the individual was not captured (see Doligez et al. 2004; Doligez, Gustafsson & Pärt 2009 and Appendix S2 for discussions about the validity of this definition). We considered here only females, whose capture histories were built using three observations: 0, not encountered, 1, non-dispersing females (i.e. females caught in the same patch on two successive capture occasions) and 2, dispersing females (i.e. females caught in two different patches on two successive occasions). The first observation (i.e. at age 0) was here the dispersal status of the mother; subsequent events were defined by the capture and dispersal status of the female herself, see Doligez et al. (2012) and Appendix S2 for more details on the data and CR models. We considered a multistate model with two states, ‘ND: non-dispersers’ and ‘D: dispersers’, which had the same structure as the AS model. Transition probabilities to state ND and D measure philopatry and dispersal probabilities, respectively, and we explore the influence on these probabilities of the departing state, i.e. the mother's dispersal status (first transition) or the female's previous dispersal status (subsequent transitions). Families have heterogeneous size (i.e. here number of recruited sisters), from 1 to 4 with 519, 84, 12 and 2 families respectively. Multistate GOF tests for the AS model (Pradel, Wintrebert & Gimenez 2003) were conducted using program u-care. A strong overdispersion was detected and was attributed to an age effect on survival (Table 4). We therefore considered either two age classes (1 to 2 years old vs. older, denoted a2) or a full age effect (8 age classes, denoted a8) on survival. In addition to previous dispersal state (denoted f), we included the effect of age (two classes, first year vs. older) on transition probabilities to estimate separately mother–daughter resemblance (i.e. first-year transition) and further individual consistency (i.e. transitions in older ages) in dispersal propensity (Doligez et al. 2012). A model with a full age effect (8 age classes) on transitions did not perform better (see Table 5). We added the random cluster effect (here family effect) on the first-age transition probability (denoted a(1)/family) only, to focus on the impact of data sibling structure on parent–offspring resemblance level. The data set includes only one cluster level, the family, and members of a given brood (sisters) are considered fully dependent. We did not account for a long-term (i.e. after 1 year) family effect here because individual experience is likely to have a major influence on breeding dispersal decisions (Doligez et al. 1999, 2002). Finally, we included the effects of time (denoted t) and state (denoted to) on recapture probability (a constant capture was also tested, denoted i). Therefore, the starting AS model was denoted Sa8.f ψ Pto.t with an overdispersion factor of 1·32.
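The coding of capture histories described above can be sketched as follows. The function name, the use of `None` for years without capture and, in particular, the use of the natal patch as the reference for a female's first capture are assumptions made for this illustration; the exact rules applied to the flycatcher data are given in Doligez et al. (2012) and Appendix S2.

```python
def encode_dispersal_history(mother_status, natal_patch, patches):
    """Build an event history coded 0 (not encountered), 1 (non-disperser)
    or 2 (disperser).

    mother_status: 1 or 2, used as the first observation (age 0).
    natal_patch:   patch where the female was born (assumed reference
                   for her first capture as a breeder).
    patches:       patch of capture on each later occasion, or None
                   when the female was not captured.
    """
    if mother_status not in (1, 2):
        raise ValueError("mother_status must be 1 or 2")
    history = [mother_status]
    previous_patch = natal_patch
    for patch in patches:
        if patch is None:
            # Year without capture: coded 0 and ignored for comparisons.
            history.append(0)
        else:
            history.append(1 if patch == previous_patch else 2)
            previous_patch = patch
    return history
```

For example, a daughter of a dispersing mother born in patch "A" and later caught in "A", missed, then caught twice in "B" would be coded `[2, 1, 0, 2, 1]`.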

Table 4. Results of the different components of goodness-of-fit tests for the general multistate model on the flycatcher data. df: number of degrees of freedom
Table 5. Model selection for the flycatchers data. Fixed effects: a2, two age classes; a8, eight age classes; f, dispersal state effect. a(1,3_8)+a(2).f thus stands for eight age classes with an effect of state on age class 2 only; to, dispersal status effect; t, time effect. Random effect (in italics): family. a(1)/family means that the random effect applies only to the first age class. # Id. Par. is the number of identifiable parameters of the model
Model    Survival    Transition    Recapture    # Id. Par.    Deviance    QAICc

Female survival decreased with age (Table 6), confirming previous results obtained without accounting for individual detection probability and suggesting actuarial senescence in this population (Sendecka 2007). Survival at age 2 was also higher for dispersing compared with non-dispersing females. Furthermore, dispersing females were less likely to be detected than non-dispersing females (Table 6); whether this lower recapture rate results from higher breeding failure, lower mating probability and/or higher temporary emigration probability in dispersers compared with nondispersers requires further investigation (Doligez & Pärt 2008).

Table 6. Comparisons of the estimates for parameters of the one-level model (i.e. including a family random effect, model 1 of Table 5) and the standard model without random effect (model 2 of Table 5) fitted to the collared flycatcher data. Note that survival at age 1 does not appear in the table because it was fixed to 1. s,i and p,i mean survival and recapture for state i. State ND = nondisperser, state D = disperser. MLE stands for maximum likelihood estimate, SE for standard error. σ is the standard deviation of the family random effect
Parameter       Without random effect, MLE (SE)    With random effect, MLE (SE)
s,ND age 2      0·496 (0·048)                      0·4973 (0·044)
s,D age 2       0·777 (0·030)                      0·778 (0·030)
s age 3         0·571 (0·032)                      0·571 (0·032)
s age 4         0·506 (0·042)                      0·506 (0·042)
s age 5         0·441 (0·056)                      0·441 (0·057)
s age 6         0·454 (0·086)                      0·454 (0·086)
s age 7         0·251 (0·028)                      0·251 (0·115)
s age 8         0·163 (0·172)                      0·163 (0·166)
ψDND age 1      0·215 (0·028)                      0·173 (0·042)
ψDND age>1      0·335 (0·023)                      0·335 (0·023)
ψNDD age 1      0·738 (0·024)                      0·777 (0·040)
ψNDD age>1      0·200 (0·036)                      0·199 (0·036)
p,ND            0·973 (0·025)                      0·972 (0·025)
p,D             0·581 (0·022)                      0·582 (0·022)
σ(family)       NA                                 1·012 (0·4732)

Both the effects of mother dispersal status and family on first-year transition (i.e. natal dispersal) probability were retained in the model selected with the above structure for survival and recapture probabilities (models 1 and 2, Table 5, a2.f+a(1)/family and a2.f). The null hypothesis σ = 0 could not be rejected (P-value = 0·07) using an LRT corrected for overdispersion by dividing the difference of deviance by the estimated coefficient of overdispersion [(5997·82 − 5994·96)/1·32] (Madsen & Thyregod 2010). Including the family effect did not affect the survival estimates (Table 6), but markedly increased natal dispersal probabilities (i.e. transition probability to the dispersing state between age 0 and 1): from 0·74 to 0·77 and from 0·79 to 0·83 for daughters of non-dispersing and dispersing mothers respectively (Table 6). The parent–offspring resemblance in dispersal propensity was observed in both models (including or not a family effect): in both cases, daughters of dispersing mothers were more likely to disperse than daughters of non-dispersing mothers, i.e. the transition probability to the dispersing state was higher from the dispersing than the nondispersing state. However, the difference was slightly more pronounced in the model accounting for the family effect (Table 6, ΨND→D < ΨD→D = 1 − ΨD→ND: model without family effect: 0·74 and 0·79; model with family effect: 0·77 and 0·83 respectively). After the first year, dispersing females were more likely to disperse again than non-dispersing females (0·35 vs. 0·20, respectively, Table 6), as previously found in this population (Pärt & Gustafsson 1989; Doligez et al. 1999, 2012; Doligez, Gustafsson & Pärt 2009).


Modelling dependence among individuals within each cluster

We developed a frequentist approach for CR mixed models using two levels of data, the level of individuals belonging to a cluster (level 1) and the level of clusters (level 2). This two-level approach allows considering different types of random effects with two levels. A cluster effect can also be considered where individuals within a cluster are treated as fully dependent (ρ = 1), fully independent (ρ = 0) or in between (−1 ≤ ρ ≤ 1 or 0 ≤ ρ ≤ 1). Note that our implementation used a standard GHQ and did not resort to the Laplace method (Liu & Pierce 1994) or to the adaptive GHQ (Choquet & Gimenez 2012), because these methods are only efficient when either the number of capture occasions or the size of the cluster is large.

Biological extensions

In the Cory's shearwater analysis, we used a two-level clustering analysis to estimate the ICC for pairs. Further extensions of the model include considering the pair structure when assessing divorce rate, and the influence of the cause of divorce on the variability of divorce rate. To do this, pairs should be monitored in the field to record the origin of divorce over time. In the collared flycatcher analysis, we used a one-level analysis to estimate the intra-family variability in first-year transition probability (i.e. the level of parent–offspring resemblance in dispersal behaviour). Fully distinguishing genetic from environmental effects in the determinism of natal dispersal, while accounting for individual detection probability, however, requires the incorporation of full pedigree information into CR models (Papaïx et al. 2010). Further extensions of the model include considering both males and females, by using a sex random effect nested within families. This would allow testing for sex differences in the variance component of parent–offspring resemblance in dispersal behaviour.


Our method allows modelling data structures with independent clusters only. Several biological situations will, however, not meet this assumption. When estimating the heritability level of a given trait, accounting for the full multigenerational pedigree of the population allows more refined analyses. In this case, Vazquez et al. (2010) used the Laplace method, whereas Papaïx et al. (2010) used a Bayesian approach with MCMC to combine the pedigree and CR data. Other cases of non-independence of clusters include time series and spatial dependence with neighbours, for which sub-table 1C shows specific correlation matrices. For example, cases of temporal or spatial autocorrelation (Johnson & Hoeting 2003; Saracco et al. 2010), where each unit is linked to its neighbours in time and/or space, may be taken into account with TOEP(2). Alternative algorithms to the GHQ should in this case be considered, as in Zhu, Gu & Peterson (2007). Furthermore, we did not consider here repeated measurements within subjects because the computation of the marginal likelihood would have been intractable. Finally, field data in natural populations will very often be structured with more than two levels, e.g. due to spatio-temporal aspects of data. For example, measurements can be nested within individuals nested within clusters, but individuals can also be nested within cluster 1 (e.g. family) nested within cluster 2 (e.g. location/year). Thus, further developments are needed to take clusters into account in these situations.


This work provides a solid foundation for quantifying inter- and intra-individual variation and will help to promote the use of two-level models applied to CR data. Nevertheless, significant issues of multilevel models applied to CR were not addressed here. Therefore, our contribution should stimulate similar efforts for developing efficient algorithms for further frequentist analyses.


This research was supported by a grant from the ‘Jeunes Chercheuses et Jeunes Chercheurs’ programme of the French ANR (ANR–08–JCJC–0028–01). We thank T. Pärt for his role in the management of the collared flycatcher long-term data base and the many field assistants that collected field data over the years. A. Sanz-Aguilar was funded by a Marie Curie Fellowship (reference MATERGLOBE). We thank the Population Ecology Group (IMEDEA, Esporles, Spain) for making the Cory's shearwater data available to us. The Spanish Ministry of Science funded the study through project CGL 2009-08298.