Using social structure to improve mortality estimates: an example with sperm whales



  1. Estimates of mortality are fundamental to studies of population ecology and assessments of conservation status. Mortality is frequently estimated using individual identifications by means of mark–recapture methods. These estimates become biased with heterogeneity in identification and especially if patterns of heterogeneity change with time.
  2. If animals are social, then survival may be inferred from the identifications of social partners. We produce a likelihood model for estimating mortality using such social data.
  3. We show using simulation that this method can produce less biased and more precise estimates of mortality than standard methods when individuals are almost always identified with associates, and when there are time-varying patterns of heterogeneity in identifiability. The method seems little affected by some change in social affiliations or by growth or decline in population size. SEs and confidence intervals of mortality estimates can be estimated using likelihood methods. We apply the method to data from a population of sperm whales (Physeter macrocephalus) in the eastern Caribbean, obtaining estimates that are more precise and probably less biased than those from other methods.
  4. The method should be useful in improving mortality estimates for social species.


Mortality (or survival, its complement) is one of the two key elements of population biology, along with reproduction. Thus, estimates of mortality are vital for assessing the status and potential increase of a population. For wild vertebrates, survival or mortality is usually estimated either from age distributions – ‘life tables’ – or records of time series of observations of individually identified animals (Murray & Patterson 2006). However, in some circumstances, usefully precise estimates of mortality are hard to achieve. For instance, when animals are nomadic with large ranges, and so have loose, variable or unpredictable ties to any geographical area, then the absence of an animal from a study area could be due to either mortality or movement into a less sampled range. In such cases, estimates of survival from individual identifications using mark-recapture methods become confounded with temporary or permanent emigration, and consequently imprecise, and perhaps biased.

The ‘robust’ mark-recapture design in which short sampling sessions are embedded within longer separated sampling periods is one way to address such emigration (Pollock 1982). However, the robust design makes several assumptions that may not be realistic in some situations: the population is assumed closed within sampling periods, and movement into and out of the study area is at most a one-step Markov process, with the probability of moving out of the study area being constant for all animals inside it, and the probability of moving into the study area being equal for all animals outside it. For many animals, perhaps especially nomadic ones, whose forays into different areas can be either short or long term, these assumptions are problematic. Furthermore, the structure of the robust design, with its interspersed short samples, may not be logistically feasible.

An alternative approach is based upon sociality. If animals have strong social affiliations, then potentially these can be used to improve assessments of the fate of a non-sighted animal. In particular, when the animals form persistent social units that largely travel together, then the presence of the unit without a particular animal can be a strong indication of mortality. If the units are entirely closed, then good inferences about mortality are obtained quite simply from missing observations of unit members. This approach has been used very effectively when studying the demography of ‘resident’ killer whales (Orcinus orca L.) in the eastern North Pacific (Olesiuk, Bigg & Ellis 1990). However, temporary or permanent migration between units, or unit fission, may need to be taken into account in many cases.

These are all issues with the sperm whale (Physeter macrocephalus L.), one of the most ecologically and economically important of mammal species (Whitehead 2003). Despite considerable work on the population biology of the species, especially during the last phase of commercial whaling in the 1970s and early 1980s, estimates of mortality are extremely poor (Chiquet et al. 2013). However, female and immature sperm whales travel in fairly permanent social units (Whitehead 2003), albeit with occasional interunit movement (Christal, Whitehead & Lettevall 1998), so there may be potential in using the dynamic membership of identified units to improve our mortality estimates. It is this potential that we explore.

We develop a fairly general method that does not include a particular model of social structure. It assumes that individuals travel with, and tend to be identified with, a set of associates and that in each time unit they have a probability of changing associates, as well as a probability of mortality. We use likelihood models to estimate these probabilities, as well as their standard errors (SEs). We compare the performance of the method with more standard mark-recapture methods using simulated data. We use the simulations to examine bias and error, as well as the potential effects of social fluidity and systematic changes in population size. We also use the method to estimate mortality for a sperm whale population from the eastern Caribbean.

Materials and methods

The General Population and Sampling Model

We assume that all members of the population have a uniform instantaneous rate of mortality at α per time unit and of changing their set of social associates at β per time unit. The population itself can be increasing, stable or decreasing with a constant rate of increase or decrease. Sets of associates are assumed transitive so that if in any time unit i and j are associates, and j and k are associates, then i and k are also associates.

Identification of individuals takes place during short (compared with the time units of mortality and associate change) field seasons that occur at most once per time unit. There is also available a measure of identification effort for an animal or a set of animals in each field season. (In the sperm whale illustration, we use the number of days on which one or more of these animals were identified, but there are other possibilities.) During the field season in time unit y, the probability that an individual, i, is identified is zero if not alive. If alive and with known associates, the probability of identification is P(δi,y) where the effort directed to i's known associates in field season y is δi,y, and, more generally, q(y) if its associates are unknown. We assume that when an individual is identified in any time unit, a set of associates can be determined from the identification record, although this identified set of associates may be incomplete.

Likelihood of Data Set

In this subsection, we show how to approximate the likelihood of the data set. We assume there are no time units in which individuals are identified but without associates (or simply omit these data). We also ignore the possibility of individuals leaving a set of associates and then returning to be with them between two identifications.

Consider each interval between successive identifications of an individual i: yi,t to yi,t+1. If we condition on its observation in yi,t, the probability that it is next identified in time unit yi,t+1, and it has the same associates in both time units, is:

display math(eqn 1)

On the right of the equation, the first multiplicative term is the probability that the individual survives over the time interval and does not switch associates, the second that it is identified in time unit yi,t+1, and the third a product of the probabilities that it is not identified during the intervening time units.
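As a minimal sketch of how the three multiplicative terms combine for one interval, the function below multiplies them together. It assumes, hypothetically, that the probability of surviving and keeping the same associates over an interval of Δ time units is exp(−(α + β)Δ), following the description of α and β as instantaneous rates; the exact form used in eqn 1 may differ. The function and argument names are illustrative only.

```python
import math

def prob_next_seen_same_associates(alpha, beta, p_between, p_next):
    """Probability of one inter-identification interval (illustrative sketch).

    alpha, beta : mortality and associate-switching rates per time unit
    p_between   : identification probabilities P(delta) for the intervening
                  time units, in which the individual was NOT identified
    p_next      : identification probability in the time unit of the next
                  identification
    """
    interval = len(p_between) + 1  # time units between the two identifications
    # ASSUMPTION: exponential form for surviving and not switching associates
    survives_and_stays = math.exp(-(alpha + beta) * interval)
    # Product of the probabilities of being missed in each intervening time unit
    missed = math.prod(1.0 - p for p in p_between)
    return survives_and_stays * p_next * missed
```

With α = β = 0 the first term is 1, so the result reduces to the identification terms alone.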

Again conditioning on its observation in yi,t, the probability that it is next identified in time unit yi,t+1, but it has different sets of associates in the two time units, is (c is the time unit when it switched associates):

display math(eqn 2)

Here, the first term on the right is the probability of survival, the second the probability that it switches associates, and the third that it is identified in time unit yi,t+1. This is multiplied by a summation, over c, of the product of the probabilities that it does not switch associates before c and is not identified before or after c.

Now, consider the period from the last time unit in which i was identified, f(i), until the end of the study in time unit T. If it does not die, and does not switch companions, the probability of this sequence (removing the second multiplicative term from eqn 1) is:

display math(eqn 3)

If it does not die, but does switch companions in time unit c, the probability (removing the third multiplicative term from eqn 2) is:

display math(eqn 4)

If it does die, in time unit d, but does not switch companions, the probability is:

display math(eqn 5)

Here, the first term is the probability of mortality. This is multiplied by a summation, over d, of the product of the probabilities that it does not die or switch associates before d and is not identified before d.

If it does die, in time unit d, and does switch associates in time unit c (<d), the probability is:

display math(eqn 6)

This combines eqns 2 and 5.

Then, the total probability of individual i not being identified after time period f(i) is, using eqns 3–6:

display math(eqn 7)

The log-likelihood of the data, conditioning on when individuals were first identified and when associates were identified, is then (using eqns 1, 2 and 7):

display math(eqn 8)

Here, the first summation is over individuals, and the second over each time interval between successive identifications of the individual, including the interval from the last identification to the end of the study (if the individual was not identified during the final time unit).

Then, α, β and the parameters that determine the identification functions P and q can be estimated by maximizing the log-likelihood, L. SEs and confidence intervals for these parameter estimates can be estimated from the shape of the likelihood functions.

We note that, because of the social structure of the population, the likelihoods of the sighting histories of the different individuals are not independent. This should not much bias the parameter estimates (Whitehead 2001), but may invalidate confidence intervals derived from the support function. We examine these issues using simulation.

For identification functions, we used:

$$P(\delta) = 1 - (1 - f)^{\delta} \qquad \text{(eqn 9)}$$

Here, f is the probability of identifying an individual per unit of effort directed at its associates (in our case, the number of days in that field season during which one or more of its associates were identified), assuming effort units are independent. We examined other, more complex functions for P, but none fitted our sperm whale data as well (as indicated by AIC) as this function. We also used:

$$q(y) = g \cdot \frac{n(y)}{\max_{y'} n(y')} \qquad \text{(eqn 10)}$$

where n(y) is the total number of animals identified in time unit y, and the maximum is over all time units {y′}. This assumes that, if there is no information on social associates, the probability of identifying a particular individual in a time unit is proportional to the total number of individuals identified in that time unit. Thus, f and g are the identification rate parameters estimated by maximum likelihood. Both are constrained to be in the interval [0,1].
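The two identification functions can be sketched as follows. The forms are inferred from the verbal descriptions: independent effort units each succeeding with probability f imply P(δ) = 1 − (1 − f)^δ, and proportionality to n(y), scaled by its maximum, implies q(y) = g·n(y)/max n(y′). The function names and the dictionary layout are assumptions made for illustration.

```python
def p_identified(delta, f):
    """P(delta): probability of identification given delta units of effort
    directed at the individual's associates, each unit independently
    identifying the individual with probability f."""
    return 1.0 - (1.0 - f) ** delta

def q_identified(n_by_year, year, g):
    """q(y): identification probability when associates are unknown, taken
    as proportional to the number of animals identified in that year and
    scaled so that the best-sampled year has probability g."""
    return g * n_by_year[year] / max(n_by_year.values())
```

For example, with f = 0·5 and two days of effort the probability of identification is 0·75; with no effort it is zero.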

We also consider a model, with an additional parameter, in which the population has an annual exponential growth of r (which could be negative giving an exponential population decline), so modifying eqn 10 for an exponentially increasing/decreasing population:

display math(eqn 11)

Simulated Data Sets

To examine the performance of our proposed ‘sociality’ estimator of mortality, we simulated populations of social animals. We call the time units of the simulations ‘years’ within which there are ‘days’. The animals occur in social groups and may die (at a rate of α per year) or change groups (at a rate of β per year). Each death is replaced by a new individual whose group membership is chosen randomly from those of the animals still alive (so, the overall population size, but not group sizes, is stable). When animals switch groups, the new group is chosen with equal probability from all other groups in the population. Sampling occurs on different days (which are the units of effort used in calculating δ) within time units, and the probability that an individual is identified on a day when its group is present is f. Two individuals are considered to be associated if they are identified from the same group on the same day.
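One year of the simulated population dynamics might be sketched as below. This is an illustration under stated assumptions, not the simulation programme itself: α and β are applied as per-year probabilities, recruits take the group of a randomly chosen survivor after any switching has occurred, and all names are hypothetical.

```python
import random

def simulate_year(membership, alpha, beta, rng, next_id):
    """Advance group membership by one year.

    membership : dict mapping animal id -> group label
    Returns (new membership dict, next unused animal id).
    ASSUMPTION: alpha and beta are treated as per-year probabilities.
    """
    all_groups = sorted(set(membership.values()))
    # Each animal dies with probability alpha
    survivors = {a: g for a, g in membership.items() if rng.random() >= alpha}
    n_deaths = len(membership) - len(survivors)
    updated = {}
    for animal, group in survivors.items():
        # A switching animal moves to one of the other groups, chosen with equal probability
        others = [g for g in all_groups if g != group]
        if others and rng.random() < beta:
            group = rng.choice(others)
        updated[animal] = group
    # Each death is replaced by a recruit placed in the group of a randomly
    # chosen living animal, so population size (not group size) is stable
    living_groups = list(updated.values())
    for _ in range(n_deaths):
        if not living_groups:
            break
        updated[next_id] = rng.choice(living_groups)
        next_id += 1
    return updated, next_id
```

Choosing a recruit's group by sampling a living animal means larger groups receive proportionally more recruits, which is why group sizes drift while the total stays fixed.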

We then simulated four sampling schemes (illustrated in Fig. 1):

Figure 1.

Illustration of sampling schemes. The areas of the discs indicate the probability of being able to identify any of the 20 groups (rows) over 10 consecutive time periods (columns) for four sampling schemes (blocks).

  1. Random variation in group identification rates over years (‘group–year’): Each group, u, in each year, y, is given an identifiability, v(u,y), a random variable chosen from the uniform distribution in the interval [0,20]. The number of days that group u is identified in year y is a Poisson-distributed random variable with mean v(u,y). These data fit the model assumed by the standard mark–recapture methods of estimating mortality (except for dependence in sampling rates among group members).
  2. Variation in group identification rates (‘group’): Each group is given an identifiability, v(u), a random variable chosen from the uniform distribution in the interval [0,20]. In each year, y, the number of days that the group is identified is a Poisson-distributed random variable with mean v(u). This variation in identifiability among groups, and thus individuals within the population, is a situation that the mixture methods of Pledger, Pollock & Norris (2003) attempt to model.
  3. Variation in group identification rates plus group-specific trends (‘group–trend’): In each year, y, the number of days that group u is identified is a Poisson-distributed random variable with mean v(u)·(1 + 2w(u)·(y/(T − y(0)) − 0·5)), where v(u) is uniform in the interval [0,20] and w(u) is uniform in the interval [0,1]. This scenario, in which members of a particular group may be more or less relatively identifiable at the beginning or end of the study, violates the assumptions for standard, mixture and robust models.
  4. Variation in group identification rates plus strong group-specific trends (‘group–trend3’): In each year, y, the number of days that group u is identified is a Poisson-distributed random variable with mean v(u)·(1 + 2w(u)·(y/(T − y(0)) − 0·5))^3, where v(u) is uniform in the interval [0,20] and w(u) is uniform in the interval [0,1]. This is a more extreme version of scenario 3 and severely violates the assumptions for standard, mixture and robust models.
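The four sampling schemes can be sketched with NumPy as below. This is a hedged illustration: the trend term is taken as a linear ramp in the fraction of the study elapsed (running from 0 to 1), which is an assumption about the normalisation of the time variable, and the function names and scheme labels are hypothetical.

```python
import numpy as np

def expected_days(scheme, n_groups, n_years, rng):
    """Mean number of days each group (rows) is identified in each year (columns)."""
    v = rng.uniform(0, 20, size=(n_groups, 1))   # group identifiability v(u)
    w = rng.uniform(0, 1, size=(n_groups, 1))    # group-specific trend strength w(u)
    # ASSUMPTION: time enters as the fraction of the study elapsed, from 0 to 1
    frac = np.linspace(0.0, 1.0, n_years)[None, :]
    trend = 1 + 2 * w * (frac - 0.5)             # lies in [1 - w, 1 + w]
    if scheme == "group-year":
        return rng.uniform(0, 20, size=(n_groups, n_years))
    if scheme == "group":
        return np.repeat(v, n_years, axis=1)
    if scheme == "group-trend":
        return v * trend
    if scheme == "group-trend3":
        return v * trend ** 3
    raise ValueError(f"unknown scheme: {scheme}")

def sample_days(scheme, n_groups=20, n_years=10, seed=0):
    """Poisson-distributed number of days each group is identified in each year."""
    rng = np.random.default_rng(seed)
    return rng.poisson(expected_days(scheme, n_groups, n_years, rng))
```

Cubing the trend term stretches its range from [0, 2] to [0, 8], which is what makes the fourth scheme the more extreme version of the third.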

These simulations were used to examine several issues:

Compare methods of estimating mortality

Initially, the parameters of the simulation roughly followed those of the eastern Caribbean sperm whale population (Gero et al. in press): 10 years of study (time units), 20 groups, initial group sizes (year 1) Poisson distributed with mean 10, mortality rate α = 0·03 per year, no switching of groups (β = 0). To give different rates at which individuals were identified with social associates, the model was run with a range of values of the probability of identifying animals on days when their group was present (f = 0·1, 0·14, 0·2, 0·3 or 0·45), which correspond, roughly, to the proportion of animals identified with associates in each year being in the range 0·2–1·0. For each of twenty runs for each of the four sampling scenarios and five values of f, we calculated the true mortality in the data set and estimated mortality using a standard likelihood model with two parameters (population size and mortality), Pledger, Pollock & Norris's (2003) mixture model allowing heterogeneity of identifiability, as modified by Whitehead & Wimmer (2005), and the sociality model introduced in this article. For each run, and each estimation technique, we calculated the percentage bias in estimated mortality, 100·(estimated mortality − true mortality)/true mortality, and plotted this bias against the proportion of times animals were identified with associates (excluding the few occasions when an individual was identified in a year without associates, but two or more associates from its previous identification were identified separately). To investigate the generality of our results, we also carried out these simulations with different sets of input parameters (columns three and four in Table 2; five runs with each set of parameters).
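The percentage bias defined above, and a summary across runs, can be computed as in this short sketch. The root-mean-square summary is an assumption about how a percentage RMSE would be formed from per-run percentage biases; the function names are illustrative.

```python
import math

def percent_bias(estimated, true):
    """Percentage bias: 100 * (estimated mortality - true mortality) / true mortality."""
    return 100.0 * (estimated - true) / true

def summarise(runs):
    """Mean percentage bias and percentage RMSE over (estimated, true) pairs.

    ASSUMPTION: the percentage RMSE is the root mean square of per-run percentage biases.
    """
    biases = [percent_bias(est, true) for est, true in runs]
    mean_bias = sum(biases) / len(biases)
    rmse = math.sqrt(sum(b * b for b in biases) / len(biases))
    return mean_bias, rmse
```

Two runs that over- and under-estimate a true mortality of 0·03 by the same amount have zero mean bias but a non-zero RMSE, which is why both summaries are worth reporting.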

Examine effects of group switching

In these runs, parameters were as just outlined, except we only used f = 0·45, and rates of switching groups of β = 0·01, 0·02, 0·03 and 0·04 per year were introduced. Using 20 runs for each set of parameters, we examined how group switching changed the estimates of mortality using the sociality model, as well as the performance of the model in estimating β.

Examine effects of growth in population size

In these simulations, we were interested in the effects of an increasing or decreasing population on the estimates of mortality. The population growth rates used were −0·03, 0, 0·03, 0·06 and 0·09 per period. Thus, instead of replacing deaths with the same number of new individuals at each time period, we replaced them with a number allowing the population to grow or shrink at the given exponential rate. (The number of new animals in period y was thus n(0)·e^(r(y − y(0))) − ns(y), where r is the trend and ns(y) is the number of survivors after mortality in period y.) In these runs, we estimated mortality using versions of each model that included a growth term, as well as the standard, heterogeneity and sociality models.
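The recruitment rule in parentheses can be written as a small function. Rounding to a whole number of animals and flooring at zero are assumptions added here so that the result is a usable count; the function name is hypothetical.

```python
import math

def n_recruits(n0, r, y, y0, n_survivors):
    """New animals in period y: n(0) * exp(r * (y - y(0))) minus the survivors.

    ASSUMPTION: the result is rounded to an integer and cannot be negative.
    """
    target = n0 * math.exp(r * (y - y0))
    return max(0, round(target - n_survivors))
```

With r = 0 this reduces to replacing each death, as in the stable-population simulations.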

Standard error estimates

The social relationships within these populations theoretically invalidate the independence assumption of the likelihood calculations, and thus, the validity of measures of confidence calculated using it. However, it is only the sighting histories that are dependent, not the mortalities themselves. Thus, we used simulation to check whether SEs for parameter estimates can be estimated from the inverse of the information matrix (the information matrix being the negative second derivative of the log-likelihood function at the maximum likelihood estimate). For three sets of parameters (those used to compare the methods of estimating mortality, above, but only using the group–trend3 sighting scheme and f = 0·45), we compared the standard deviation of sociality estimates of mortality from 1000 runs of the model (2000 runs in one case), with the mean of the SE estimates from each run calculated from the inverse information matrices (square roots of the corresponding diagonal elements).
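The SE calculation can be sketched as follows: approximate the second derivatives of the negative log-likelihood numerically at the maximum likelihood estimate, invert the resulting matrix and take square roots of the diagonal. The finite-difference scheme, the step size and the function names are assumptions chosen for illustration; the likelihood passed in can be any function of a parameter vector.

```python
import numpy as np

def numerical_hessian(fun, x, h=1e-4):
    """Central finite-difference approximation to the matrix of second derivatives."""
    x = np.asarray(x, dtype=float)
    k = x.size
    hess = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (fun(x + ei + ej) - fun(x + ei - ej)
                          - fun(x - ei + ej) + fun(x - ei - ej)) / (4 * h * h)
    return hess

def standard_errors(neg_log_lik, mle):
    """SEs as square roots of the diagonal of the inverse observed information."""
    info = numerical_hessian(neg_log_lik, mle)  # Hessian of the negative log-likelihood
    return np.sqrt(np.diag(np.linalg.inv(info)))
```

As a check on a toy problem with a known answer, a binomial likelihood with 30 deaths among 1000 animals should give an SE close to the analytic value sqrt(α(1 − α)/n).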

We also examined the utility of estimating confidence intervals from the likelihood support function, taking as the interval limits the values of a particular parameter at which the log-likelihood of the data (optimizing over the other parameters) drops 1·92 below its maximum (1·92 is half the 95th percentile of the chi-squared distribution with 1 d.f.; Venzon & Moolgavkar 1988). In this case, the true confidence interval of the estimation was approximated by the range between the 2·5% and 97·5% percentiles of the estimates of mortality for 1000 (or 2000) runs of the simulation with the original set of parameters (only using the group–trend3 sighting scheme and f = 0·45). This span was compared with the means of the upper and lower 95% confidence limits estimated by the likelihood support method from the first 100 of these runs.
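For a single parameter, the support-function interval can be sketched with a simple bisection search for the two points at which the log-likelihood falls 1·92 below its maximum. With several parameters, the log-likelihood passed in would need to be the profile, re-maximized over the other parameters at each trial value; that step is omitted here. The function name, search bounds and tolerance are assumptions.

```python
import math

def support_interval(log_lik, mle, lower, upper, drop=1.92, tol=1e-8):
    """Interval where log_lik stays within `drop` of its maximum.

    ASSUMPTION: log_lik is unimodal, and `lower` and `upper` lie outside the interval.
    """
    target = log_lik(mle) - drop

    def crossing(a, b):
        # Bisection between a and b, where log_lik - target changes sign
        fa = log_lik(a) - target
        for _ in range(200):
            mid = 0.5 * (a + b)
            fm = log_lik(mid) - target
            if (fm > 0) == (fa > 0):
                a, fa = mid, fm
            else:
                b = mid
            if abs(b - a) < tol:
                break
        return 0.5 * (a + b)

    return crossing(lower, mle), crossing(mle, upper)
```

Unlike an interval built from the SE, this interval need not be symmetric about the estimate, which suits a mortality rate bounded below by zero.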

Sperm Whale Data

The sperm whale data come from photoidentification studies of the neighbouring islands of Dominica and Guadeloupe in the eastern Caribbean carried out by teams from Dalhousie University, the International Fund for Animal Welfare, the Ocean Research and Exploration Society, and l'Association Evasion Tropicale between 1984 and 2012, although the great majority (89%) of the identifications come from research undertaken by teams from Dalhousie University between 2005 and 2012. Photo-identification followed the methods described by Arnbom (1987). We omitted all identifications of young calves, adult males and photographs with quality Q < 3 (as defined by Arnbom 1987).

We use time units of one calendar year, and units of effort within field seasons are the number of days on which animals were photoidentified. We considered animals to be associates during a year if identified within 2 h of one another; however, we also made estimates with a 12-h cut-off (i.e. animals associated if identified on the same day, as identification only occurred between 06:00 and 18:00). As with the simulation studies, we estimated mortality using the standard, heterogeneity and sociality methods, as well as versions of these models with a population trend added. SEs and confidence intervals were estimated from the information matrix and shape of the support function, as with the simulated data.
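The association rule, combined with the transitivity assumed in the general model, can be sketched as below: animals identified within the cut-off of one another are linked, and linked animals are merged into sets of associates with a union-find structure. The input layout (a list of animal and time pairs for one year) and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def associate_sets(sightings, max_gap=timedelta(hours=2)):
    """Group animals into transitive sets of associates.

    sightings : list of (animal_id, datetime) identifications from one year
    Two animals are linked if any of their identifications fall within max_gap.
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    ordered = sorted(sightings, key=lambda s: s[1])
    for i, (a, ta) in enumerate(ordered):
        find(a)  # register animals that are never linked to another
        for b, tb in ordered[i + 1:]:
            if tb - ta > max_gap:
                break  # later sightings are further away still
            union(a, b)

    sets = {}
    for a in parent:
        sets.setdefault(find(a), set()).add(a)
    return list(sets.values())
```

Because sets are transitive, two animals identified three hours apart are still placed together if a third animal was identified between them within two hours of each.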


Results

Comparison of Methods of Estimating Mortality

The performance of the different methods of estimating mortality is illustrated in Fig. 2. As expected, with the first ‘group–year’ sampling scheme where there are no systematic differences between groups in identifiability, the standard mark-recapture model performs well, estimating mortality with little bias. The mixture model including heterogeneity in identifiability performs similarly. However, the other sampling schemes that include systematic differences in identifiability between groups lead to substantial overestimates of mortality by the standard model. With the ‘group’ sampling scheme, where the differences in group identifiability are fixed, the heterogeneity model does a good job of reducing bias. But when there is a temporal trend in the group-specific identifiability, as in the group–trend and especially the group–trend3 sampling schemes, then both the standard and heterogeneity models have a substantial positive bias. In contrast, the sociality model that we have introduced gives highly positively biased estimates when animals are identified with associates at low rates, but accurate, and almost unbiased, ones when this rate is above about 0·80. This pattern varied little between sampling schemes. Runs with different parameters gave generally similar results (see Figs S1 and S2). Combining results for all runs with different combinations of parameters (but without changes of group membership) in which individuals were identified with associates 80–96% of the time (to correspond roughly with sperm whale results, see below) produced the biases and root-mean-squared errors displayed in Table 1. The standard model is best when its assumptions hold, but the sociality model performs consistently well in a wide range of conditions and much better than the standard model when the standard model's assumptions are violated. The heterogeneity mixture model deals with fixed differences in identifiability between groups, but does not perform well when these vary with time.

Table 1. Mean percentage bias and root-mean-square error (RMSE) in the estimation of mortality using simulated data in which individuals were identified with associates during an average of 80–96% of the samples. Results are summarized for each of four sampling schemes (rows) and three estimation models (columns)

                         Estimation model
Sighting scheme   n    Standard         Heterogeneity    Sociality
                       %Bias (%RMSE)    %Bias (%RMSE)    %Bias (%RMSE)
Group–year        42   4·7 (13·7)       3·2 (14·1)       9·6 (18·7)
Group             48   20·4 (26·9)      8·1 (16·7)       8·0 (18·7)
Group–trend       44   34·5 (43·0)      24·0 (31·6)      15·4 (23·0)
Group–trend3      56   62·2 (70·4)      53·6 (61·7)      12·4 (22·3)

Figure 2.

Percentage bias in estimating mortality using three different methods (colours) on data sampled in the four ways illustrated in Fig. 1 (panels), plotted against the proportion of occasions individuals were identified with associates (x axis). Each dot represents one run of the simulation programme, and curves fitted using the cubic spline are shown for each combination of sampling scheme and mark-recapture method. Other parameters for these simulations are a mean group size of 10, 10 years of study (time units) during each of which each group was identified on an average of 10 days (although the distribution of these probabilities varied with sampling scheme), 20 groups, mortality rate 0·03 per year, and no switching of groups. There were five runs with each of the following values of the probability of identifying animals on days when their group was present: f = 0·1, 0·14, 0·2, 0·3 or 0·45. Results of two sets of runs with other parameters are shown in the Figs S1 and S2.

Effects of Group Switching

The effects of group switching on the estimates of mortality by the sociality model are indicated by the results of the simulations plotted in Fig. 3. While the estimates of mortality become rather less precise as the rate of group switching approaches and then exceeds the mortality rate, the bias changes little. The estimated rates of group switching from the sociality model are generally positively biased within the range of parameters that we explored, but the percentage bias decreases with the rate of switching (Fig. 4).

Figure 3.

Percentage bias in estimating mortality using the sociality method on data sampled in the four ways illustrated in Fig. 1 (panels), plotted against the rate at which individuals switched groups. Each dot represents one run of the simulation programme, and curves fitted using the cubic spline are shown. Other parameters for these simulations are as in Fig. 2 except we only used f = 0·45.

Figure 4.

Percentage bias in estimating the rate at which individuals switch groups using the sociality method, using the same simulations illustrated in Fig. 3 (excluding those with no switching). Each dot represents one run of the simulation programme, and curves fitted using the cubic spline are shown.

Effects of Growth in Population Size

Increases or decreases in population size introduced biases into the standard and heterogeneity methods of estimating mortality (Fig. 5). Adding growth parameters to these models reduced this bias, as they are supposed to do. In contrast, estimates of mortality from the sociality model were not noticeably affected by population increase or decrease, and adding a growth term to this model (eqn 11) gave little or no benefit. Estimates of population growth rate from the mortality plus growth, and heterogeneity plus growth, methods were useful and little biased, while that from the sociality plus growth method did not give useful estimates of population growth (Fig. 6).

Figure 5.

Percentage bias in estimating mortality using six different methods (colours) on data sampled in the four ways illustrated in Fig. 1 (panels), plotted against the rate at which the population was shrinking or growing (slightly jittered so the dots do not overlay one another too much). Each dot represents one run of the simulation programme, and curves fitted using the cubic spline are shown. Other parameters for these simulations are as in Fig. 2 except we only used f = 0·45.

Figure 6.

Estimates of the rate at which the population is declining or increasing using three different methods (colours) plotted against the true rate (slightly jittered so the dots do not overlay one another too much), using the same simulations illustrated in Fig. 5. Each dot represents one run of the simulation programme. The dashed line represents the ideal situation in which the estimated rate equals the true rate, and linear regressions are shown for the mortality and heterogeneity models.

Standard Error Estimates

For each of our three sets of parameters, the standard deviation of sociality estimates of mortality from 1000 (or 2000) runs of the model was similar to the mean of the SE estimates from each run derived from information matrices, and the 2·5–97·5% span of 1000 (or 2000) mortality estimates was similar to the mean-estimated 95% confidence intervals (Table 2).

Table 2. Efficacy of likelihood estimates of SE and 95% confidence intervals of mortality estimates for three sets of parameters. The distributions of the mortality estimates from the different runs are compared with the mean of the estimates of SE and 95% confidence intervals from likelihood. Each set of runs (columns) used the group–trend3 sampling scheme and f = 0·45

                         Model parameters as in simulations of
                         Figure 1         Figure 1A        Figure 2A
Simulation runs          1000             2000             1000
Mean group size          10               5                15
No. of groups            20               30               10
No. of periods           10               15               10
True mortality           0·03             0·08             0·15
Mean(estimated (SE))     0·0049           0·0078           0·0164
2·5–97·5% estimates      0·0218–0·0415    0·0718–0·1033    0·1356–0·1900
Mean(95% CI)             0·0223–0·0413    0·0727–0·1029    0·1320–0·1876

Sperm Whale Data

The sperm whale data set contained 7626 identifications of 267 individuals. Sperm whales were identified with associates 87% of the time (89% of the time with a 12-h cut-off for association). The sperm whale data support heterogeneity of identification within the population, as well as an increasing trend in population size (Table 3), as found by Gero et al. (2007) in their analysis of the 1995–2006 data. The sociality model gives a similar, but slightly more precise, estimate of mortality compared with the heterogeneity model. In contrast to the simulations, the sociality plus trend model appears to fit slightly better than the sociality model alone, although its mortality estimate is less precise; the better fit may arise because the real sperm whale data span a longer period than the simulated data. Estimates of mortality with the 12-h cut-off for the association between two animals are less precise than those with the 2-h cut-off, probably because the 2-h cut-off represents the true social dynamics in this population more accurately. The sociality model with the 2-h cut-off estimates sperm whale mortality at 0·044 per year (95% CI 0·028–0·065).

Table 3. Estimates of mortality (per year) of sperm whales in the eastern Caribbean Sea, using a simple likelihood model, Pledger, Pollock & Norris's (2003) mixture model incorporating heterogeneity in identifiability, and the sociality model (with association defined based on 2- or 12-h maximum difference in identifications), as well as versions of these models that incorporate a trend in population size. Also shown are estimates of the SEs of the estimates from the information matrix, 95% confidence interval from likelihood support function and the AIC. Because of the differences in the data used, AICs from the sociality as well as the sociality plus trend models are not comparable with those from other models, nor can we compare AICs from sociality models with different definitions of association (as indicated by the horizontal lines separating these AIC values)

Model                          Mortality (per year)   SE       95% CI          AIC
Standard plus trend            0·0907                 0·0136   0·0655–0·1186   1614·1
Heterogeneity plus trend       0·0259                 0·0099   0·0091–0·0478   1496·7
-------------------------------------------------------------------------------------
Sociality (2 h)                0·0440                 0·0094   0·0279–0·0646   [missing]
Sociality plus trend (2 h)     0·0372                 0·0110   0·0050–0·0585   1039·5
-------------------------------------------------------------------------------------
Sociality (12 h)               0·0567                 0·0106   0·0380–0·0791   [missing]
Sociality plus trend (12 h)    0·0481                 0·0127   0·0086–0·0718   1252·4

Discussion


The sociality method that we have introduced is useful, in the sense that it produces superior mark-recapture estimates of mortality to standard methods in some circumstances. These circumstances are that individuals have long-lasting social relationships that lead to repeated associations, that usually when an individual is identified, one or more of its associates are also identified, and that sets of individuals have systematic differences in their identifiability that may change over time. Individual variation in identifiability leads to a positive bias in standard mark-recapture methods of estimating mortality, and if this variation in identifiability trends with time, the mixture models that incorporate heterogeneity in identifiability (Pledger, Pollock & Norris 2003) do not remove much of the bias. In contrast, the sociality model, when its conditions hold, is nearly unbiased. This is intuitive, as the sociality model is fundamentally assessing mortality on the basis of whether or not an animal is identified when its associates are identified and so complex patterns of heterogeneity in identifiability between groups of associates are factored out.

The model that we have used is not the only way that sociality could be incorporated into studies of survival. For instance, in some circumstances, group membership could be inferred reliably and entered into the population model directly, and the probability that an animal is not identified could be modelled in a range of ways that better approximate the perceived manner in which the availability of social groups of animals interacts with the effort put into identifying them. We have tried to introduce a fairly generic method that will approximate a range of situations in which animals form long-lasting social groups, move between groups occasionally, and have variable patterns of availability to an identification process that itself varies with time. In some preliminary explorations, the method that we have presented here performed similarly to more constrained methods tailored directly to specific situations.

However, the method should not be used as a ‘black-box’ on new data sets with characteristics very different from those investigated in our simulations. Because the method is quite computer intensive, we could only investigate its performance over a small region of parameter space, and sociality is typically multidimensional. It may be that in other situations different likelihood models will be more appropriate. Nevertheless, our work does show that social information can help reduce bias and improve the precision of mark-recapture estimates of mortality.

Estimates of mortality have little utility without some kind of measure of precision. Our simulations indicate that SE and confidence interval estimates calculated using the shape of the likelihood function are reasonably good. The nonparametric bootstrap, a frequently used and often recommended method of estimating confidence in parameter estimates (Efron & Gong 1983), is not applicable for social data, because, with resampling, there will be animals with identical sighting histories, and this will artificially boost social relationships (Whitehead 2008). The parametric bootstrap, which requires an explicit model of the process being modelled, will be hard to implement with complex societies and sighting schemes. The jackknife method, in which ‘pseudovalues’ are constructed by omitting segments of the data in turn (Efron & Stein 1981), is one method likely to work with the kinds of data that we are addressing. Preliminary analyses, omitting either individuals or groups of individuals formed using Newman's (2006) eigenvector modularity method, suggested that the jackknife method produces acceptable measures of confidence in the sociality estimates of mortality. However, they were not clearly better than the simpler and less time-consuming likelihood measures used above.
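The jackknife calculation described above can be sketched generically. The following is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: `jackknife` omits each unit (an individual or a group of individuals) in turn, forms the pseudovalues n·θ(all) − (n − 1)·θ(without unit i), and takes their mean and standard error. The function `pooled_proportion` and the data in `toy_groups` are hypothetical stand-ins for refitting the sociality model to real identification data.

```python
import math
import statistics


def jackknife(units, estimator):
    """Delete-one jackknife over `units` (a list) using pseudovalues.

    Returns (jackknife estimate, standard error, list of pseudovalues).
    """
    n = len(units)
    if n < 2:
        raise ValueError("the jackknife needs at least two units")
    theta_all = estimator(units)
    pseudovalues = []
    for i in range(n):
        reduced = units[:i] + units[i + 1:]  # data with unit i omitted
        theta_minus_i = estimator(reduced)
        pseudovalues.append(n * theta_all - (n - 1) * theta_minus_i)
    estimate = statistics.fmean(pseudovalues)
    # Standard error of the mean of the pseudovalues.
    se = math.sqrt(statistics.variance(pseudovalues) / n)
    return estimate, se, pseudovalues


def pooled_proportion(groups):
    """Hypothetical stand-in estimator: the pooled proportion of 1s across
    all groups. In a real analysis this would be a refit of the model."""
    flat = [x for group in groups for x in group]
    return sum(flat) / len(flat)


# Made-up indicator data for five groups of individuals (illustration only).
toy_groups = [[0, 0, 1, 0], [0, 1, 0], [0, 0, 0, 0, 1], [1, 0, 0], [0, 0, 0, 1, 0, 0]]
estimate, se, pseudovalues = jackknife(toy_groups, pooled_proportion)
print(f"jackknife estimate {estimate:.4f}, SE {se:.4f}")
```

Omitting whole groups rather than single individuals keeps associates together in each reduced data set. Each omitted unit requires a full refit of the model, which is why this approach is more time-consuming than reading the SE from the likelihood.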

Our development of this method was sparked by the need for better estimates of mortality for female and immature sperm whales; existing estimates have been notoriously imprecise (Chiquet et al. 2013). The Scientific Committee of the International Whaling Commission estimated mortalities of 0·066 per year for males, 0·055 per year for females, and 0·093 per year for infants, using the age or length distribution of the catches (International Whaling Commission 1982). These estimates thus include whaling mortality and are compromised by uncertainties in the standard method of ageing sperm whales from tooth sections. More recent mark-recapture estimates for female and immature sperm whales using photoidentifications are very imprecise: 0·021 per year (SE 0·066 per year) in the eastern tropical Pacific (Whitehead 2001) and 0·094 per year (95% CI 0·035–0·169 per year) in the eastern Caribbean (Gero et al. 2007). The sociality model does give better estimates (Table 3), although they are not very much better than those from the heterogeneity model employed on the same data set. That the two methods produce quite similar estimates is reassuring. In the case of the sperm whale, the greatest benefits of the new method are probably yet to come. There is a large data set of photoidentifications of sperm whales from the eastern tropical Pacific, where the population is considerably larger than that in the eastern Caribbean; the more recent parts of this data set are currently being analysed. However, the animals in this population have redistributed considerably over the past two decades, over thousands of kilometres, and patterns of photoidentification effort are very different in different study areas scattered over their large range (Whitehead et al. 2008). This invalidates the standard and heterogeneity methods of estimating mortality, but the sociality method should work.
This new method will give more precise and less biased estimates of a much needed population parameter, leading to better management and conservation of wide-ranging species, which can be difficult to sample representatively.


Acknowledgements

We thank an anonymous reviewer for constructive comments. We are most grateful to Marina Milligan for her work on the Caribbean sperm whale data set, to all those who collected the sperm whale data, as well as to those organizations that contributed data: the International Fund for Animal Welfare, especially Jonathan Gordon and Carole Carlson; the Ocean Research and Exploration Society, especially the late George Nichols; and l'Association Evasion Tropicale, run by Caroline and Renato Rinaldi.

Data accessibility

A MATLAB script that performs the sociality analysis is attached as online Supporting Information.