R. Choquet (email@example.com), J.-D. Lebreton, O. Gimenez, A.-M. Reboulet and R. Pradel, Centre d'Ecologie Fonctionnelle et Evolutive, CNRS, UMR 5175, 1919 Route de Mende, FR-34293 Montpellier cedex 5, France.
In any statistical analysis, assessing the goodness of fit of a model to the data is crucial to avoid drawing incorrect conclusions. U-CARE is a computer application that deals with the mandatory first steps of the analyses of capture–recapture data: the preparation of the data set and the assessment of the fit of a general model (Cormack-Jolly-Seber and variants for single-state data; Jolly-Move and variants for multi-state data). U-CARE implements the current state of the art in goodness-of-fit testing by incorporating components aimed at detecting the most likely departures from assumptions (Pradel et al. 2003, 2005). It is a free and stand-alone application for Windows.
The estimation and comparison of demographic rates in animal and plant populations (Lebreton et al. 1992, Gregg and Kery 2006) is currently based on following marked individuals over time and possibly space and analyzing the resulting data with adequate statistical models. Such models, in which individuals may move among states (such as geographical sites or breeding statuses) over discrete time periods, must account for the non-exhaustive detection of individuals. These so-called capture–recapture models are still evolving (Pradel et al. 2005). In their current form, they are based on two key assumptions: 1) the individuals are assumed to be independent; 2) within the history of each individual, the successive pairs formed by each release and its subsequent reencounter (if any) are assumed to be independent (Burnham 1991).
While fitting multi-state capture–recapture models can now be carried out without much difficulty with a variety of computer software applications (M-SURGE (Choquet et al. 2004), E-SURGE (Choquet et al. 2009) or MARK (White and Burnham 1999) for a frequentist approach, MARK for Bayesian inference), the above-mentioned assumptions can be examined in an optimal way (Pradel et al. 2003) only with program U-CARE. And yet this step is critical. For instance, if a major structural effect such as the presence of transient individuals on the study site is overlooked, survival will be severely underestimated. More generally, spurious effects will often be detected if no model fits the data. This is best seen when considering how model selection proceeds.
In the frequentist paradigm, model selection is generally based on the Akaike information criterion (AIC; Burnham and Anderson 2002). The AIC is equal to the deviance plus two times the number of estimable parameters. In the presence of lack of fit, the deviance tends to be inflated, thus leading to the selection of over-parameterized models and potentially to erroneous biological conclusions. Moreover, the estimated precision of the maximum likelihood estimators (MLE) will be overly optimistic. The consequences of lack of fit are thus too deleterious to be ignored.
A preliminary assessment of goodness-of-fit (GOF) is thus a crucial prerequisite, just as in any statistical analysis (D'Agostino and Stephens 1986). Unlike in regression, residuals are not easily available in capture–recapture (CR) to check the validity of a model. Furthermore, the omnibus approach to goodness-of-fit testing that consists of comparing expected vs observed sufficient statistics is impractical due to the sparseness of the data. A specific approach was therefore developed (Pradel et al. 2003). By making this approach available in an easy-to-use software application, we aim to encourage practitioners to assess the validity of their assumptions. Sound model selection, i.e. preceded by appropriate GOF assessment, is indeed becoming more and more common in the literature (Henaux et al. 2007, Jenouvrier et al. 2008) but it is still far from being systematic. We describe here the main features of program U-CARE. More details on tests and tools can be found in the user's manual (Choquet et al. 2005).
U-CARE (<http://purl.oclc.org/NET/U-CARE>) is a free, easy-to-use, stand-alone, menu-driven computer application for Windows, with a set of options for managing data and GOF testing capabilities for both single-state and multi-state capture–recapture models (Fig. 1). The two kinds of test are presented as the sum of components examining different aspects of the data through a range of contingency tables. This structure guides the choice of an appropriate model. For instance, if the only significant subcomponent of the multi-state goodness-of-fit test is TEST 3G.SR, a two age-class structure for survival should be used (Pradel et al. 1995), while, if it is TEST WhereBeforeWhereAfter (WBWA), a memory model is recommended (Pradel et al. 2003). For historical reasons, biological interpretation is supported in particular for single-state data, through the computation of directional tests. We briefly examine hereafter the following four aspects of data analysis for which U-CARE can be very useful. 1) How to prepare the data? 2) How to detect overdispersion? 3) How to correct for overdispersion? 4) How to conduct more specific tests?
1) How to prepare the data?
First, the menu TRANSFORM DATA offers some tools for selecting a data subset, such as a particular range of years or some groups (e.g. males or females, “ringed as young” or “ringed as adults”), and for recoding data, such as pooling states, groups or years. Among other features, the menu FILE makes it possible to convert data between two main formats (BIOMECO (Lebreton and Roux 1989) and MARK (White and Burnham 1999)).
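The recoding steps described above can be illustrated with a minimal Python sketch. It is not U-CARE's implementation. It assumes single-group records written in the MARK style "history count;", one single-character state code per occasion and "0" for "not encountered"; the function names and the example records are hypothetical.

```python
from collections import Counter


def parse_mark_line(line):
    """Split one single-group record 'history count;' into (history, count)."""
    history, count = line.strip().rstrip(";").split()
    return history, int(count)


def pool_states(history, mapping):
    """Recode state codes using `mapping`; unmapped codes (including '0') are kept."""
    return "".join(mapping.get(code, code) for code in history)


def recode(lines, mapping, first=0, last=None):
    """Pool states, keep occasions first..last-1, and merge identical histories."""
    totals = Counter()
    for line in lines:
        history, count = parse_mark_line(line)
        history = pool_states(history[first:last], mapping)
        # Animals never encountered within the retained occasions carry no information.
        if set(history) != {"0"}:
            totals[history] += count
    return dict(totals)


# Hypothetical records: three states (1, 2, 3) over four occasions.
records = ["1020 12;", "1030 5;", "0201 7;", "3000 4;"]
pooled = recode(records, {"3": "2"})           # pool state 3 into state 2
trimmed = recode(records, {"3": "2"}, first=1)  # also drop the first occasion
```

Pooling merges the histories "1020" and "1030" into a single record, and dropping the first occasion removes the animals that were encountered only on that occasion.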
2) How to detect overdispersion?
When the data are ready, the next step should be the assessment of the fit of a general model to the data at hand. For single-state data, the classical model is the model with only time-dependent parameters (Cormack-Jolly-Seber, CJS); for multi-state data, it is a time- and state-dependent model (Arnason-Schwarz, AS). For both of them, the probability of the encounter history of each individual is calculated conditional on its first capture. Optimal goodness-of-fit tests of the assumptions inherent in the CJS or the JollyMove (JMV) model, the latter being a slight generalization of the AS model, are then derived based on the classical partitioning, according to sufficient statistics T, of the likelihood P(data∣parameters):

$$P(\text{data} \mid \text{parameters}) = P(\text{data} \mid T) \times P(T \mid \text{parameters})$$
The compatibility of the data with the hypergeometric distributions in P(data∣T) is tested asymptotically by contingency table chi-squared tests. These tests are organized into several interpretable components by further partitioning P(data∣T) (Pradel et al. 2005). If systematic departures from the general model can be ruled out, lack of fit may result from a lack of independence among individuals (the two members of a pair, social groups, a contagious disease …). In that case, the value of the Pearson statistic is much larger than the residual degrees of freedom. One simple measure of overdispersion is the ratio of the Pearson statistic X² to its number of degrees of freedom df:

$$c = \frac{X^2}{df}$$
This ratio can be calculated for each component individually or overall. Although there is no clear-cut rule for deciding that an observed lack of fit results solely from overdispersion, a reasonable rule of thumb is that the c ratio is greater than 1 for all components and that there is no component for which the c ratio greatly exceeds the others. This procedure, although not perfect, generally works well as long as the overall ratio does not exceed 3, sometimes 5 (Burnham and Anderson 2002); if it does, an important factor has likely been left out of the model and should be identified. On the other hand, small structural effects that result in an overall c between 1 and 3 can without much damage be assimilated to overdispersion and treated as noise (see below for how to do this in practice).
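As an illustration of this rule of thumb, the following Python sketch computes the c ratio per component and overall from Pearson statistics and their degrees of freedom. It is only a sketch under stated assumptions: the component names and values are invented, and the threshold `max_spread`, which quantifies "greatly exceeds the others", is an arbitrary choice that is not prescribed by the text.

```python
def c_ratios(components):
    """components maps a name to (Pearson X2, df); returns (per-component c, overall c)."""
    per_component = {name: x2 / df for name, (x2, df) in components.items()}
    total_x2 = sum(x2 for x2, _ in components.values())
    total_df = sum(df for _, df in components.values())
    return per_component, total_x2 / total_df


def plausibly_overdispersion(per_component, overall, max_overall=3.0, max_spread=3.0):
    """Rule of thumb: every c above 1, no dominant component, moderate overall c.

    max_spread is an assumed cut-off for 'greatly exceeds the others'.
    """
    values = list(per_component.values())
    all_above_one = all(c > 1.0 for c in values)
    none_dominates = max(values) <= max_spread * min(values)
    return all_above_one and none_dominates and overall <= max_overall


# Invented values: lack of fit spread evenly over the components.
even = {"component_1": (30.0, 20), "component_2": (24.0, 15), "component_3": (18.0, 12)}
even_c, even_overall = c_ratios(even)

# Invented values: one component dominates, pointing to a structural effect.
uneven = {"component_1": (30.0, 20), "component_2": (236.0, 10), "component_3": (18.0, 12)}
uneven_c, uneven_overall = c_ratios(uneven)

even_ok = plausibly_overdispersion(even_c, even_overall)
uneven_ok = plausibly_overdispersion(uneven_c, uneven_overall)
```

In the second case the rule fails on two counts: one component has a c ratio far above the others, and the overall ratio exceeds 3.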
To illustrate these general ideas, we will now treat the example of the study of movements of Canada geese Branta canadensis between three wintering regions, mid-Atlantic, Chesapeake and Carolinas, between 1984 and 1989 (Hestbeck et al. 1991).
The option “GOODNESS-OF-FIT for Multi-state” of the main menu of U-CARE opens onto the goodness-of-fit test of the JMV model (Brownie et al. 1993). Multi-state models (Hestbeck et al. 1991) allow for transitions between states (the wintering sites for the geese), survival probabilities and encounter probabilities. In the JMV model, transitions vary by state of departure, state of arrival and time interval. Survival probabilities vary by state of departure and time interval. Encounter probabilities vary by previous state, current state and date. This model, in contrast with the better-known Arnason-Schwarz (AS) model, allows encounter probabilities to vary by previous state. However, there is currently no optimal GOF test available for the AS model. Program U-CARE contains specific tests for transience (test 3G.SR, null hypothesis H0: ‘there is no difference in the probability of being later reencountered between “new” and “old” individuals encountered simultaneously’), trap-dependence (test M.ITEC, H0(i): ‘there is no difference in the probabilities of being reencountered in the different states at i+1 between the animals in the same state at occasion i whether encountered or not encountered at this date, conditional on presence at both occasions’), and memory (test WBWA, H0: ‘there is no difference in the expected state of next reencounter among individuals previously encountered in the different states’). The c ratio for the JMV model is computed from these three main components plus two complementary tests, 3G.SM and M.LTEC (Pradel et al. 2005).
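To make the contingency-table logic concrete, here is a minimal Python sketch of a Pearson chi-squared test on a single 2 × 2 table of the kind underlying the transience null hypothesis: “new” versus “old” individuals, cross-classified by whether they were later reencountered. The counts are invented, the function names are hypothetical, and the sketch ignores the handling of sparse tables; it does not reproduce how U-CARE builds or combines its tables.

```python
import math


def pearson_chi2(table):
    """Return (X2, df) for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df


def p_value_one_df(x2):
    """Upper-tail probability of a chi-squared variable with exactly 1 df."""
    return math.erfc(math.sqrt(x2 / 2.0))


# Invented counts for one occasion and one site.
# Rows: "new" and "old" individuals; columns: later reencountered, never reencountered.
table = [[40, 60],
         [70, 30]]
x2, df = pearson_chi2(table)
p = p_value_one_df(x2)
```

With these invented counts, “new” individuals are reencountered less often than “old” ones, which is the direction expected when transients are present. Dividing `x2` by `df` gives the c ratio of this single table.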
If we were to consider that the lack of fit is due solely to overdispersion, the overall c ratio for the geese would be:
However, the c ratio relative to the WBWA component is huge: 23.6 (Fig. 2)!
3) How to correct for overdispersion?
Given the large value of the overall c ratio obtained for the Canada geese, we must try to find a more general model that takes into account the effects unveiled by the GOF test: the strong memory effect and a transient effect. The c ratio of test 3G.SR is indeed 9.8. A model studied by Rouan et al. (in press) combines these two effects. After discarding the components corresponding to the effects incorporated in this model (memory: component WBWA; transience: component 3G.SR), the new c ratio is
This value is typical of large data sets with several thousands of individuals where individual differences inherent in any animal population are inevitably detected. This ratio can be used as a variance inflation factor for the model of Rouan et al. and derived models. There is an option in M-SURGE, E-SURGE and MARK to introduce this factor. The correction factor c has no impact on parameter estimates, but estimated (co)variances are multiplied by c, the widths of the Wald confidence intervals are $\sqrt{c}$ times larger and the deviance is divided by c. This in turn affects model selection, where the AIC is replaced with the QAIC (Burnham and Anderson 2002). This procedure of identifying a starting model is valid more generally and should be pursued until an acceptable value for the c ratio is reached.
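The effect of the correction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of M-SURGE, E-SURGE or MARK: the deviances, parameter counts and standard error are invented, the Wald interval is computed directly on the scale of the estimate, and the QAIC follows the description above (deviance divided by c plus twice the number of parameters). Some authors additionally count c as one more estimated parameter, which this sketch does not do.

```python
import math


def qaic(deviance, n_params, c=1.0):
    """Deviance divided by c plus twice the number of parameters; c = 1 gives the AIC."""
    return deviance / c + 2 * n_params


def wald_ci(estimate, se, c=1.0, z=1.96):
    """Wald interval after multiplying the variance by c (the SE by sqrt(c))."""
    half_width = z * se * math.sqrt(c)
    return estimate - half_width, estimate + half_width


# Invented fits: model B has six more parameters and a slightly lower deviance.
model_a = {"deviance": 1000.0, "n_params": 10}
model_b = {"deviance": 980.0, "n_params": 16}

aic_a = qaic(model_a["deviance"], model_a["n_params"])
aic_b = qaic(model_b["deviance"], model_b["n_params"])
qaic_a = qaic(model_a["deviance"], model_a["n_params"], c=2.0)
qaic_b = qaic(model_b["deviance"], model_b["n_params"], c=2.0)

# Invented estimate: the interval widens by a factor sqrt(c) = 2 when c = 4.
low, high = wald_ci(0.80, 0.02)
low_c, high_c = wald_ci(0.80, 0.02, c=4.0)
```

With these invented numbers, the uncorrected AIC prefers the richer model B (1012 against 1020), whereas the QAIC with c = 2 prefers the simpler model A (520 against 522), which illustrates how ignoring overdispersion favours over-parameterized models.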
4) How to conduct more specific tests?
U-CARE provides details that may give some additional information, notably the tables of observed and expected numbers. For example, Fig. 3 shows that newly marked individuals are recaptured less often than previously marked individuals at occasion 5 on sites 1 and 3 but not on site 2. This is a general pattern over all occasions, which suggests that transients may be absent from the central site 2. A more specific model could be considered. For an application to seasonal data, see Gauthier et al. (2001). U-CARE also gives directional tests (Lebreton et al. 1992) focusing more closely on the detection of transience and trap-dependence.
Finally, the option “TOOLS” contains some specific tools that are not easily found elsewhere, such as the ability to test for mixtures of multinomials in a contingency table (Pradel et al. 2003), with an improved algorithm.
To cite U-CARE or acknowledge its use, cite this Software Note as follows, substituting the version of the application that you used for “Version 2.3”:
Choquet, R., Lebreton, J.-D., Gimenez, O., Reboulet, A.-M. and Pradel, R. 2009. U-CARE: Utilities for performing goodness of fit tests and manipulating CApture–REcapture data. – Ecography 32: 1071–1074 (Version 2.3).