#### SELECTION OF STUDY SITES

This study used data collected from cross-sectional surveys of *M. bovis* infection in feral ferrets at five sites in New Zealand (Fig. 2). Sites were primarily selected for survey on the basis that *M. bovis* occurred in wildlife, as inferred either from previous wildlife surveys or from tuberculin testing of cattle herds. Sites were deliberately chosen to sample a range of possum and ferret densities. In the case of possum density, this ranged from low at Lake Ohau in the Mackenzie Basin, which has a naturally sparse population of possums, to moderate at Scargill Valley in North Canterbury, to high at Awatere Valley in Marlborough and the Castlepoint and Cape Palliser study sites in the coastal Wairarapa. Possum and ferret abundance are in general inversely related (Caley 1998). In New Zealand, ferrets occur at highest densities in semi-arid regions, where their principal prey species (European rabbits *Oryctolagus cuniculus* Lin. 1758) are most abundant, whereas possums tend to be more abundant in areas of higher rainfall. For sites that were subjected to repeated surveys (e.g. Castlepoint), the number of ferrets removed in each survey was considered insignificant, hence data from all surveys were included for analysis. For the purpose of analysis, the magnitude of unmeasured factors considered possibly to influence the force of *M. bovis* infection in ferrets (specifically site) was assumed to be constant over time. Becker (1989) points out that, from a single cross-sectional survey, any effects of age and time on λ are confounded, so whether λ is age-dependent [i.e. λ(*a*)], time-dependent [i.e. λ(*t*)] or both [i.e. λ(*a*,*t*)] cannot be determined.

#### DIAGNOSIS OF *M. bovis* INFECTION

From each ferret caught, the jejunal (mesenteric), both caudal cervical (prescapular) and both retropharyngeal lymph nodes were collected. All other major lymph nodes and organs were also examined, and a portion of any potentially tuberculous lesion added to the lymph-node pool, which was stored frozen. Diagnosis of *M. bovis* infection in ferrets was made from bacterial culture of the pooled lymph-node samples for each animal, whatever its apparent disease status. There is an unknown period between infection and positive diagnosis based on the mycobacterial culture of pooled lymph nodes. However, because of the high sensitivity of modern mycobacterial culture techniques, and the collection of the entire lymph nodes considered to be the sites of predilection, this period can be considered negligible (G. de Lisle, unpublished data).

#### MODEL SPECIFICATION

Mathematical models were used to represent the various hypotheses of disease transmission among ferrets, and model fit used as a method of choosing the best hypothesis (or best working model) (Burnham & Anderson 1998). For each hypothesis, we consider candidate mathematical models first with, and then without, disease-induced mortality (α). Few data exist on the disease-induced mortality rate of *M. bovis* infection in feral ferrets. Lugton *et al*. (1997) document a radio-collared feral ferret surviving at least 1 year with tuberculosis infection, and suggest that the time of survival after infection probably ranges from several months in a few cases, to in excess of a year in many cases. For badgers, the most closely related species (also Family Mustelidae) for which data are available, disease-induced mortality arising from *M. bovis* infection is moderate, although highly variable (Wilkinson *et al*. 2000).

For α equal to zero, H1 may be modelled by the exponential model (Lee 1992), by only allowing transmission during the suckling period (*s*) (hazard function 1 and model 1.1; Table 1). For α non-zero, H1 may be modelled based on the model of Cohen (1973; see below) (model 1.2; Table 1).

Table 1. Details of each hazard function (hazard) in terms of the age-specific force of infection (λ(*a*)) for various age classes (age), and the age-specific disease-prevalence model without (α = 0) and with (α > 0) disease-induced mortality. The suckling period is *s*, and the guarantee time *g*. Model numbers are given to the right of brackets.

Hypotheses H2, H3, H4 and H5 may be modelled by the exponential model, modified to allow for a period when ferrets are not exposed to infection, here termed *g* (hazard function 2; Table 1). This is analogous to the concept of a guarantee time in survival analysis (Lee 1992). In epidemiological studies it commonly arises when individuals are protected from disease for a period after birth due to the presence of maternal antibodies (for mycobacterial infections such as *M. bovis*, immunity is cell-mediated only, hence there is no maternally derived immunity). The value of *g* was set to specify each relevant hypothesis (10, 2·5, 1·75 or 0 months for H2, H3, H4 and H5, respectively). For α equal to zero, the age-prevalence solution is model 2.1 (Table 1). For non-zero α, the age-specific prevalence for hypotheses H2–H5 can be obtained from the solution of Cohen (1973), although modified as before to include the term *g* and omitting the disease latent period term (model 2.2; Table 1).
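The exponential model with a guarantee time can be illustrated with a short sketch. Table 1 is not reproduced here, so the prevalence form used below, *p*(*a*) = 1 − exp(−λ(*a* − *g*)) for *a* > *g* and zero otherwise, is an assumption taken from equation 2 with γ = 1 and α = 0. The function name and the value of λ are hypothetical and serve only as illustration.

```python
import math

def prevalence_exponential(age, lam, g=0.0):
    """Expected prevalence at `age` (months) under a constant force of
    infection `lam` that only acts after the guarantee time `g`.
    Assumes no disease-induced mortality (alpha = 0)."""
    exposure = max(age - g, 0.0)  # time actually exposed to infection
    return 1.0 - math.exp(-lam * exposure)

# Guarantee times (months) quoted in the text for H2-H5.
GUARANTEE_MONTHS = {"H2": 10.0, "H3": 2.5, "H4": 1.75, "H5": 0.0}

lam_per_month = 0.05  # illustrative value only, not an estimate from the study
for hypothesis, g in GUARANTEE_MONTHS.items():
    p = prevalence_exponential(12.0, lam_per_month, g)
    print(hypothesis, round(p, 4))
```

For a fixed λ, a longer guarantee time lowers the expected prevalence at any given age, which is what lets the age-prevalence data discriminate between H2 and H5.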

To represent the candidate hazard functions H6–H12 (Fig. 1), the hazard function needs to be able to take different values (not just 0 or λ) for up to three age classes. For hypotheses with a single step in the hazard function at *g*_{1} (H7–H9) this is represented by hazard function 3 (Table 1). For α equal to zero, the age-specific prevalence for H7–H9 is modelled as model 3.1 (Table 1). For non-zero α, the resulting age-specific prevalence for hypotheses H7–H9 can be obtained from the solution below (model 4.2) with *g*_{1} set to zero.

Hypotheses H6, H10, H11 and H12, which have two steps in the hazard function (say at *g*_{1} and *g*_{2}), are modelled by hazard function 4 (Table 1). For α = 0, the age-specific prevalence for H6 (setting λ_{1} = 0), H10, H11 and H12 (setting λ_{2} = 0) is model 4.1 (Table 1). For non-zero α, there are considerable complications in finding solutions of the age-specific prevalence. For reasons made clear in the Results, solutions with a non-zero force of infection up until the age of weaning were not needed. For the piece-wise constant exponential model with λ_{1} = 0 (H6), the age-specific prevalence including disease-induced mortality (G. Fulford, personal communication) is given by model 4.2 (Table 1).
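For the α = 0 case, a piece-wise constant hazard can be sketched as follows. The exact expressions of Table 1 are not reproduced here, so the code assumes the standard relationship *p*(*a*) = 1 − exp(−Λ(*a*)), where Λ(*a*) is the hazard integrated from birth to age *a*. The function names and numerical values are hypothetical.

```python
import math

def cumulative_hazard(age, breaks, rates):
    """Integrate a piece-wise constant hazard from 0 to `age`.

    `breaks` are the ages at which the hazard steps (e.g. [g1, g2]);
    `rates` has one more entry than `breaks` (e.g. [lam1, lam2, lam3])."""
    edges = [0.0] + list(breaks) + [math.inf]
    total = 0.0
    for lower, upper, rate in zip(edges[:-1], edges[1:], rates):
        if age <= lower:
            break  # this age class has not been reached yet
        total += rate * (min(age, upper) - lower)
    return total

def prevalence_piecewise(age, breaks, rates):
    """Prevalence at `age`, assuming no disease-induced mortality."""
    return 1.0 - math.exp(-cumulative_hazard(age, breaks, rates))

# Illustrative H6-like shape: lam1 = 0, then lam2 <= lam3 (values are made up).
print(prevalence_piecewise(12.0, [2.0, 6.0], [0.0, 0.1, 0.2]))
```

Setting the first or second rate to zero gives the special cases described in the text (λ_{1} = 0 for H6, λ_{2} = 0 for H12).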

As an alternative to the piece-wise smooth exponential models being used to account for λ varying with age, the exponential models can be generalized to the Weibull model (Lee 1992). The Weibull model contains an additional shape parameter γ, with λ now termed a scale parameter. Lambda, now age-dependent and including a guarantee time, *g*, is given by hazard function 5 (Table 1), and increases with age when γ > 1 and decreases with age when γ < 1; hence the Weibull hazard may model an increasing, decreasing or constant λ. The age-specific solution is given by model 5 (Table 1). Setting γ equal to 1 simplifies the hazard function to the exponential case [λ(a) = λ]. The flexibility of the Weibull model, modified to include a guarantee time, enables it to represent broadly all the hypotheses except those that are U-shaped (H11 and H12). It does not, however, represent any of the hypotheses explicitly. There is no explicit solution for the Weibull model with a non-zero disease-induced mortality rate (G. Fulford, personal communication).
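The behaviour of the Weibull hazard can be checked numerically. The sketch below assumes the parameterization implied by equation 2, namely *p*(*a*) = 1 − exp(−λ(*a* − *g*)^γ) with hazard λγ(*a* − *g*)^(γ − 1) for *a* > *g*. Table 1 is not reproduced here, so this form, the function names and the parameter values are all assumptions.

```python
import math

def weibull_hazard(age, lam, gamma, g=0.0):
    """Age-specific force of infection; zero up to the guarantee time g."""
    if age <= g:
        return 0.0
    return lam * gamma * (age - g) ** (gamma - 1.0)

def weibull_prevalence(age, lam, gamma, g=0.0):
    """Prevalence at `age`, assuming no disease-induced mortality."""
    if age <= g:
        return 0.0
    return 1.0 - math.exp(-lam * (age - g) ** gamma)

# gamma > 1: hazard rises with age; gamma < 1: falls; gamma = 1: constant.
for gamma in (0.5, 1.0, 2.0):
    hazards = [round(weibull_hazard(a, 0.01, gamma, g=1.75), 5) for a in (3, 6, 12)]
    print(gamma, hazards)
```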

Finally, we fitted the polynomial hazard function (of order *k*) (model 6; Table 1) of Grenfell & Anderson (1985), which has the flexibility to fit many shaped curves, including those that the Weibull model is unable to fit. Grenfell & Anderson (1985) allowed λ(a) to be zero below a lower threshold age, which is analogous to the guarantee time used in this study. As for the Weibull model, analytical solutions for the polynomial models including disease-induced mortality do not exist (G. Fulford, personal communication), except for the case with *k* = 0 (equivalent to the model 2 version of the exponential model).

#### MODEL FITTING

All models were fitted by maximum likelihood. The likelihood (*L*) to be maximized was the binomial likelihood (equation 1), where *p*_{i} is the modelled probability of infection, and *y*_{i} is the number of *M. bovis*-infected individuals out of a total *n*_{i} in each age class *i* (*m* in total).

$$L = \prod_{i=1}^{m} \binom{n_i}{y_i}\, p_i^{\,y_i} (1 - p_i)^{\,n_i - y_i} \qquad \text{(eqn 1)}$$

Maximizing *L* was achieved by numerically minimizing the negative log-likelihood [−ln(*L*)] with respect to λ (a separate estimate for each site), gender effect (single multiplicative factor) and α (if applicable), after substituting for *p*_{i} from the relevant model. The gender effect acted only on the elevation (cf. slope) of the hazard function in question. The slope was considered a constant for all sites and sexes (e.g. γ held constant for Weibull models, and sex and site factors multiplied on *b*_{0} only for the polynomial hazard model). When undertaking numerical minimization, biological (λ and α were constrained to be positive for all models) and hypothesis-generated bounds were placed on the values for parameters. Hypothesis-generated bounds were: H6, λ_{2} ≤ λ_{3}; H7, λ_{1} ≥ λ_{2}; H8, λ_{1} ≤ λ_{2}; H9, λ_{1} ≤ λ_{2}; H10, λ_{1} ≤ λ_{2} ≤ λ_{3}; H11, λ_{1} ≥ λ_{2} ≤ λ_{3}. S-PLUS (Data Analysis Products Division, MathSoft, Seattle, WA) was used for numerical minimization.
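The fitting step can be illustrated with a minimal sketch for the simplest case: one site, no gender effect, α = 0 and a fixed guarantee time. It is not the S-PLUS routine used in the study. A golden-section search stands in for the numerical minimizer, the prevalence form is the assumed exponential model with guarantee time, and the age-class data are hypothetical.

```python
import math

def neg_log_likelihood(lam, ages, infected, totals, g=0.0):
    """Negative binomial log-likelihood for the exponential model (alpha = 0)."""
    nll = 0.0
    for a, y, n in zip(ages, infected, totals):
        p = 1.0 - math.exp(-lam * max(a - g, 0.0))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
        nll -= log_choose + y * math.log(p) + (n - y) * math.log(1.0 - p)
    return nll

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2.0

# Hypothetical age classes (months), infected counts and sample sizes.
ages, infected, totals = [4.0, 8.0, 16.0, 30.0], [1, 3, 6, 9], [20, 20, 18, 15]
g = 1.75  # guarantee time, e.g. as for H4
lam_hat = golden_section_minimize(
    lambda lam: neg_log_likelihood(lam, ages, infected, totals, g), 1e-6, 1.0)
print(round(lam_hat, 4))
```

The lower bound of the search interval plays the role of the positivity constraint on λ. Fitting several sites, a gender multiplier or α would require a multi-parameter bounded optimizer instead.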

#### MODEL SELECTION

Akaike’s information criterion corrected for sample size (AIC_{c}) (Burnham & Anderson 1998) was used to compare models. Burnham & Anderson (2001) suggest that models having ΔAIC_{c} (difference in AIC_{c} scores) within 1–2 of the best model have substantial support. Models within about 4–7 of the best model have considerably less support, while models with ΔAIC_{c} > 10 have essentially no support. Plots of Pearson residuals (Collett 1991) were used to assess model fit further. For the chosen model, confidence intervals for α were calculated by profile likelihood (McCallum 2000). Confidence intervals for estimates of λ were not estimated here. Rather, model 2.1 was used to test for the relative differences between sites and sexes in λ, as this model can be fitted as a generalized linear model (GLM), making estimates of standard errors for parameters relatively straightforward (see below).
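The model comparison can be sketched using the standard definition AIC_{c} = −2 ln(*L*) + 2*K* + 2*K*(*K* + 1)/(*n* − *K* − 1), where *K* is the number of estimated parameters and *n* the sample size. The log-likelihoods, parameter counts and sample size below are hypothetical, not results from the study.

```python
def aicc(log_likelihood, k, n):
    """Akaike's information criterion with the small-sample correction."""
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def delta_aicc(scores):
    """Difference between each model's AICc and the lowest AICc."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}

# Hypothetical maximized log-likelihoods and parameter counts for three models.
n = 200
fits = {"model 2.1": (-120.4, 6), "model 3.1": (-119.8, 7), "model 5": (-126.9, 7)}
scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
for name, delta in sorted(delta_aicc(scores).items(), key=lambda item: item[1]):
    print(name, round(delta, 2))
```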

As this study aimed to estimate the absolute rate at which ferrets encounter *M. bovis* infection, Cox’s proportional hazards model (Cox 1972) was not considered, despite its popularity for many epidemiological investigations. Cox’s model is primarily concerned with estimating the proportional effects of different factors on the hazard rate, rather than the baseline hazard function, which in the current study is the variable of intrinsic interest.

#### HYPOTHESIS TESTING FOR EFFECTS OF SEX AND SITE

The Weibull model may be linearized into the form of a GLM (equation 2), and this provides a convenient method for testing the relative effects of gender and site on λ. The simplest exponential model (model 2.1) is nested within this model by setting γ to 1.

$$\ln(-\ln(1 - p(a))) = \ln\lambda + \gamma \ln(a - g) \qquad \text{(eqn 2)}$$

Equation 2 was fitted to the data using the computer software package GLIM4 (Francis, Green & Payne 1993). *Mycobacterium bovis* prevalence data were classified by sex, site and age. The error structure of *p*(*a*) was specified as binomial, with the response variable the number of animals infected, and the binomial denominator the total number of individuals in that age group. The link function was specified as complementary log-log. The term ln(*a* − *g*) was fitted as an offset (Collett 1991). The significance of sex and site on the force of *M. bovis* infection was assessed using the deletion test (Crawley 1993), which assesses the change in model deviance (Collett 1991) arising from the removal of a parameter from the model. The interaction between sex and site was not examined, as there was no a priori reason for doing so. The adequacy of the fit of the chosen model was examined by testing the significance of the residual model deviance (Collett 1991).
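The linearization in equation 2 can be verified numerically. The sketch below generates prevalences from an assumed Weibull form, *p*(*a*) = 1 − exp(−λ(*a* − *g*)^γ), applies the complementary log-log transform, and recovers ln λ as the intercept and γ as the slope against ln(*a* − *g*). Parameter values are illustrative, and this is a check of the algebra rather than a reproduction of the GLIM4 fit.

```python
import math

def cloglog(p):
    """Complementary log-log link: ln(-ln(1 - p))."""
    return math.log(-math.log(1.0 - p))

lam, gamma, g = 0.02, 1.5, 1.75       # illustrative values only
ages = [3.0, 6.0, 12.0, 24.0]         # months

xs = [math.log(a - g) for a in ages]  # the ln(a - g) term of eqn 2
ys = [cloglog(1.0 - math.exp(-lam * (a - g) ** gamma)) for a in ages]

# The points are exactly collinear, so two of them determine the line.
slope = (ys[-1] - ys[0]) / (xs[-1] - xs[0])
intercept = ys[0] - slope * xs[0]
print(round(slope, 6), round(math.exp(intercept), 6))
```

With γ fixed at 1, the ln(*a* − *g*) term has a known coefficient of one, which is why it enters the GLM as an offset and only ln λ (with its sex and site effects) is estimated.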