Estimating population size by genotyping remotely plucked hair: the Eurasian badger



    Corresponding author
    1. Department of Biology and Environmental Science, University of Sussex, Brighton BN1 9QG, UK
    2. Musée National d’Histoire Naturelle, 25 rue Münster, L-2160 Luxembourg, Luxembourg
      Alain Frantz, Musée National d’Histoire Naturelle, 25 rue Münster, L-2160 Luxembourg, Luxembourg (fax +325 365295; e-mail

    1. Musée National d’Histoire Naturelle, 25 rue Münster, L-2160 Luxembourg, Luxembourg

    1. Department of Animal and Plant Sciences, University of Sheffield, Western Bank, Sheffield S10 2TN, UK

    1. Laboratoire National de Santé, Institute of Immunology, PO Box 1102, L-1011 Luxembourg, Luxembourg

    1. Service de la Conservation de la Nature, Direction des Eaux et Forêts, 16 rue Eugène Ruppert, L-2453 Luxembourg, Luxembourg

    1. Laboratoire National de Santé, Institute of Immunology, PO Box 1102, L-1011 Luxembourg, Luxembourg

    1. Department of Biology and Environmental Science, University of Sussex, Brighton BN1 9QG, UK



  1. Size is a basic attribute of any population but it is often difficult to estimate, especially if the species under investigation is rare or cryptic. For example, there is currently no cheap and robust way of estimating the abundance of the European badger Meles meles, despite the species’ role as an agricultural pest and carrier of bovine tuberculosis.
  2. We tested the reliability and accuracy of estimating badger abundance by genotyping DNA extracted from remotely plucked hair. We assessed the accuracy of our methodology by estimating local abundance by direct observation. Hair samples were collected near five target setts using a baited barbed wire enclosure or (at one sett) barbed wire suspended over a clearly visible badger run. All the hairs found on a barb were included in the extraction.
  3. Of the 113 samples collected over a 6-month period, 105 gave rise to amplifiable DNA and originated from single animals. Through comparison with reliable reference genotypes of captured badgers, we showed that amplifiable DNA, including extracts obtained from single guard hairs, produced accurate profiles in a single round of amplifications.
  4. Direct observation of the target setts suggested that a minimum of 13 badgers was present in the study area. Analysis of the 105 usable samples provided a baseline estimate of 15 animals.
  5. To test the practical use of hair trapping to estimate population size, hair samples were collected daily during a 3-week period. The 66 usable samples obtained originated from 14 of the 15 known badgers. Estimates of true abundance were generated using rarefaction analyses, the least biased of which produced an abundance estimate of 14·23, corresponding well with the number of genetic profiles obtained over the 6-month period. The results allowed comparisons of theoretical predictions and empirical data relating to rarefaction analyses.
  6. Synthesis and applications. DNA extracted from remotely plucked badger hair could form the basis of a potentially cost-effective, reliable and widely applicable method of estimating badger abundance. Hair trapping may offer a feasible method of estimating population size in a range of species even when the species are rare or patchily distributed.


Population size is a basic attribute of populations and its estimation is of central importance in conservation and wildlife management. Reliable estimates of population size are often necessary to assess the conservation status of a population, or may be required in the context of disease and pest control. However, censusing a population is often difficult, especially in species that are rare, cryptic, small, arboreal or fossorial. A case in point is the Eurasian badger Meles meles L., which, due to its nocturnal and semifossorial life-style, is notoriously difficult to census (Macdonald, Mace & Rushton 1998; Tuyttens et al. 2001). Nevertheless, accurate estimates of local abundance are required because the species is of conservation concern (Griffiths & Thomas 1993). It can also be an agricultural pest (Schley 2000) and has a potential role in the transmission of bovine tuberculosis to cattle in certain parts of Europe (for a review see Krebs et al. 1997). Indeed, the UK Department for Environment, Food and Rural Affairs (DEFRA) has called for research into innovative ways of censusing badgers (Krebs et al. 1997).

Badgers live in territorial, mixed-sex social groups, with the members of each group inhabiting a communal burrow system known as a main sett (Neal & Cheeseman 1996). As each social group usually has only one main sett, and main setts are relatively easy to recognize, the number of groups inhabiting a given area can be determined with a reasonable degree of accuracy (Harris, Cresswell & Jefferies 1989; Ostler & Roper 1998). However, there is as yet no reliable way of estimating social group size, or at least no reliable methodology that can be readily applied to all badger populations and that does not require extensive effort in time, human resources and money (Rogers et al. 1997; Schley 2000).

Non-invasive genetic sampling has been emerging over recent years as an important tool for the estimation of animal abundance, especially in rare or elusive species, because DNA can be extracted from sources such as faeces and hair follicles without the need to catch the target animal (Piggott & Taylor 2003b). The number of animals in a population can then be estimated from the number of individually distinct genetic profiles.

In a previous study, accurate estimates of badger group size were obtained by generating genetic profiles from badger faeces (Frantz et al. 2003; Wilson et al. 2003). However, this method is unlikely to be cost-effective on a larger scale because faecal DNA extracts are often of poor quality and therefore require repeated amplifications to obtain reliable profiles (Taberlet et al. 1996; Frantz et al. 2003). In the present study, we attempted to circumvent this problem by estimating the size of badger social groups using DNA obtained from hair samples collected non-invasively at barbed wire ‘hair traps’ placed close to main setts (cf. Sloane et al. 2000). We compared the results with an independent estimate of the size of the target groups of badgers obtained by a combination of live trapping and direct observation. The study was conducted on a medium-density population of badgers in Luxembourg.

Given information on individual badger genotypes, the true population size can be estimated using rarefaction methods. Estimates are obtained by plotting accumulation curves of the number of hairs sampled against the cumulative number of new profiles. Population size corresponds to the projected asymptote of the curve determined by the accumulation of unique genotypes (Kohn et al. 1999; Eggert, Eggert & Woodruff 2003). Three possible equations for the function to be fitted to the accumulation plot have been suggested in the literature. While two studies have reported simulations aimed at determining the accuracy of the three rarefaction methods (Eggert, Eggert & Woodruff 2003; Valière 2002), the theoretical predictions have not been tested empirically. We also tested the accuracy of these rarefaction methods using our independent estimates of population size.

Materials and methods

genetic variability of luxembourg badgers

In a previous study involving badger faecal DNA, seven microsatellite loci were required to produce individual-specific genetic profiles from a UK population (Frantz et al. 2003). Badgers in Luxembourg and neighbouring countries may have suffered a loss of genetic variability during the 1960s and 1970s, when large-scale gassing of setts for purposes of rabies control resulted in a dramatic decline in population size (Griffiths & Thomas 1993; Schley & Krier 2000). Consequently, the background genetic variability of the microsatellite loci used for genetic profiling might not be the same in Luxembourg as in the UK. Therefore, we collected DNA samples from road-killed badgers in Luxembourg and used these to analyse the variability of the target loci and to investigate their efficacy in generating individual-specific genetic profiles.

Samples of ear tissue were collected opportunistically from 52 badgers killed on Luxembourg roads during 2000 and 2001. The mean distance between the locations of the road kills was 16·8 km and the overall area from which the samples were collected, calculated as a minimum-area convex polygon, was 1325 km² (51% of the total area of Luxembourg). DNA was extracted from tissue samples using an ammonium acetate precipitation method (Richardson et al. 2001).

Microsatellite loci were tested for linkage disequilibrium using an exact test based on a Markov chain method as implemented in genepop version 3.3 (Raymond & Rousset 1995). genepop version 3.3 was also used to implement the exact tests of Guo & Thompson (1992) to test for deviations from the Hardy–Weinberg equilibrium at each locus. In both analyses, the Bonferroni technique was used to eliminate false assignment of significance by chance (Rice 1989). Allele numbers and frequencies, estimates for expected heterozygosity (HE) and observed heterozygosity (HO) were calculated for each locus using gimlet 1.3.3 (Valière 2002).
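The sequential Bonferroni adjustment referred to here can be illustrated with a short sketch. It follows the usual step-down procedure in which the smallest P-value is compared with alpha/k, the next with alpha/(k − 1), and so on. The P-values below are hypothetical and are not results from this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where a test remains significant
    after the sequential (step-down) Bonferroni adjustment."""
    k = len(p_values)
    # Indices of the tests, ordered from smallest to largest P-value.
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # Smallest P is compared with alpha/k, the next with alpha/(k - 1), ...
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # once one test fails, all larger P-values fail as well
    return significant

# Hypothetical P-values for seven per-locus Hardy-Weinberg tests.
hw_p = [0.62, 0.03, 0.41, 0.88, 0.02, 0.19, 0.55]
unadjusted = [p <= 0.05 for p in hw_p]
adjusted = sequential_bonferroni(hw_p)
print(sum(unadjusted), "significant before adjustment;", sum(adjusted), "after")
```

With these illustrative values, two tests fall below 0·05 on their own but neither survives the adjustment, because the smallest P-value (0·02) exceeds 0·05/7.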

When using microsatellite loci to genotype remotely collected hair samples, it was important to ensure that the resulting genetic profiles were specific to each individual. We tested whether the seven loci planned for use in our study would fulfil this requirement by computing a probability of identity of siblings (PID-Sib) statistic (Evett & Weir 1998; Waits, Luikart & Taberlet 2001). PID-Sib values for badgers in Luxembourg were calculated using gimlet 1.3.3 (Valière 2002) on the data set of 52 genotypes of road-killed badgers. prob-id5 (G. Luikart, unpublished data) was used to estimate the observed PID (PID-Obs) by computing the proportion of all possible pairs of individuals that had identical genotypes. Because the power of these loci to distinguish between individuals may vary between the whole and the study population (Banks et al. 2002), PID statistics were calculated for 13 animals captured in the main study area in 2002 and 2003.
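The two statistics can be sketched as follows. The per-locus PID-Sib expression is the one commonly quoted from Waits, Luikart & Taberlet (2001); whether gimlet and prob-id5 implement exactly these steps is an assumption. The allele frequencies and genotypes in the example are hypothetical.

```python
from itertools import combinations

def pid_sib_locus(allele_freqs):
    """Sibling probability of identity for one locus:
    0.25 + 0.5*sum(p^2) + 0.5*(sum(p^2))^2 - 0.25*sum(p^4)."""
    s2 = sum(p ** 2 for p in allele_freqs)
    s4 = sum(p ** 4 for p in allele_freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4

def pid_sib_product(freqs_by_locus):
    """Cumulative product of the per-locus PID-Sib values."""
    product = 1.0
    for freqs in freqs_by_locus:
        product *= pid_sib_locus(freqs)
    return product

def pid_obs(genotypes):
    """Proportion of all possible pairs of individuals with identical genotypes."""
    pairs = list(combinations(genotypes, 2))
    identical = sum(1 for first, second in pairs if first == second)
    return identical / len(pairs)

# Hypothetical allele frequencies at two loci.
example_freqs = [[0.5, 0.5], [0.4, 0.3, 0.2, 0.1]]
print(pid_sib_product(example_freqs))

# Hypothetical genotypes: each individual is a tuple of sorted allele pairs.
example_genotypes = [
    ((142, 144), (121, 136)),
    ((142, 144), (121, 136)),
    ((144, 146), (121, 123)),
]
print(pid_obs(example_genotypes))
```

Adding loci in order of increasing per-locus PID-Sib, as in Table 1, shows how many loci are needed before the cumulative product drops below a chosen threshold such as 0·01.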

study site

The main study site was located in the north-east of the Grand-Duchy of Luxembourg, east of the river Ernz Blanche and between the villages of Ermsdorf and Eppeldorf. The site covered approximately 5·4 km², was situated between 225 and 420 m above sea level and consisted of a mosaic of pasture, arable land and woodland (for further details see Schley 2000). The study focused on five adjoining main setts previously identified by Schley (2000), named Ermsdorf 1, Ermsdorf 2, Knäipenhecken, Bëlz and Grott, respectively.

counting badgers by direct observation

An independent estimate of abundance was obtained by direct observation of the badgers at the five target setts. This was done mainly in the spring of 2003 because at that time of year it was still possible to distinguish between adults and cubs. Observations of setts began approximately 2 h before sunset and continued until 1 h after sunset, using night-vision equipment where necessary. We aimed to visit each sett until badgers were observed on a minimum of six occasions. This was achieved at Ermsdorf 1, Ermsdorf 2 and Bëlz setts. At Grott and Knäipenhecken, wind conditions were often unfavourable and the topography and vegetation cover of the setts made them difficult to observe. Consequently, badgers were observed only rarely and the censuses obtained at these setts should be considered a minimum estimate of social group size.

field methods

We set out to snare badger hair using barbed wire enclosures, similar to those described by Woods et al. (1999). They were constructed less than 10 m away from each main sett by suspending a single strand of barbed wire around two or three trees, 20 cm above ground level. The stations were baited with peanuts, placed under a pierced box covered with a stone in order to stop non-target species from reaching the bait. Previous studies suggested that it would be difficult to attract Luxembourg badgers to bait (Schley 2000), so peanuts were put near the setts up to 4 months prior to the construction of the hair trap. Old bait was replaced weekly until the peanuts first disappeared, after which bait was renewed every 2–3 days. By February 2003, badgers were eating the bait at only four of the five setts, the exception being Grott. Rather than delaying the start of the pilot study, two additional hair traps were constructed at Grott by suspending barbed wire between stakes set either side of a clearly visible badger run. Given the relatively low quantities of peanuts used as bait, the low- to medium-density of the animals and the relatively mild winter in 2003, we assume that the lengthy prebaiting period did not have an effect on the density of the animals.

A preliminary test of a barbed wire enclosure was carried out at Ermsdorf 2 during 8–15 December 2002 and 12–19 January 2003. The station was baited daily and all snared hairs were collected and stored in paper envelopes at −20 °C until DNA extraction. The main study was carried out from 14 February to 7 March 2003. The four enclosures and the suspended barbed wire were visited daily to replace bait and collect hair samples. A hair sample was defined as all the hair collected from a single barb. Hair samples that did not contain any follicles were discarded immediately. Both guard hairs and under-fur were collected, stored in paper envelopes at room temperature, and extracted the same day.

Because of the small number of hairs collected at some setts (see Results) and given that differing results on the reliability of genotyping from single hairs have been reported (Goossens, Waits & Taberlet 1998; Woods et al. 1999; Sloane et al. 2000), further samples were systematically collected during 17–24 March 2003. After this period, and until May 2003, hairs were collected opportunistically and extracted within 3 days of collection.

dna extraction and amplification

In order to avoid contamination, all extractions and polymerase chain reactions (PCR) were performed in a separate laboratory that was free of concentrated badger DNA or PCR product. Aerosol-resistant pipette tips were used and negative controls were included in each manipulation to monitor contamination. When working with remotely collected samples, all the hair roots found on a barb were used in the extraction. Hair samples were extracted using a Chelex protocol (Chelex®-100, Bio-Rad, Hercules, CA; Walsh, Metzger & Higuchi 1991). After incubating the root portion of the hairs at room temperature for 30 min in 1 mL doubly-distilled H₂O, 200 µL of 5% Chelex was added to the root and mixed well. This was followed by incubation at 56 °C for 30–45 min, mixing the samples occasionally. After checking that the hairs were immersed, the Chelex solution was boiled for 8 min. After centrifugation for 3 min at 13 000 g, the supernatant was removed and placed in a sterile tube.

As an initial test of the quality of the hair DNA extracts, we tried to amplify a 533-basepair (bp) product of exon-3 of the c-myc proto-oncogene (CMYC_seq_F1: 5′-GAAATCGATGTTGTTTCTGTG-3′, CMYC_seq_R1: 5′-CAAGAGTTCCGTAGCTGTTC-3′; Smith, Vigilant & Morin 2002). The sequences of these two primers are conserved among 18 eutherian mammalian species (Miyamoto, Porter & Goodman 2000; Smith, Vigilant & Morin 2002) and the product proved to be amplifiable using badger DNA. The PCR reaction conditions were the same as those used for the microsatellite loci. The c-myc–PCR products were visualized on 1% agarose gels.

Of the 39 microsatellite loci published by Carpenter et al. (2003), the following seven loci were used: Mel-105, Mel-106, Mel-109, Mel-111, Mel-113, Mel-116 and Mel-117. These loci have alleles shorter than 250 bp, as the amplification success of non-invasive DNA can be reduced for alleles longer than 300 bp (Frantzen et al. 1998). In order to be able to run all the samples on one gel, and after checking the allele range of the various loci in Luxembourg (Table 1), the primers were end-labelled with the following dyes: Mel-105, TET; Mel-106, TET; Mel-109, HEX; Mel-111, HEX; Mel-113, TET; Mel-116, 6-FAM; Mel-117, 6-FAM (Carpenter et al. 2003). The microsatellite loci were amplified in a 25-µL volume, each containing 5 µL of DNA extract. The final PCR concentrations and reaction times were the same as those described in Frantz et al. (2003). Reactions were performed using a Bio-Rad iCycler. Amplification products were separated on a 5% polyacrylamide gel using an ABI 377 DNA sequencer (Applied Biosystems, Foster City, CA, USA), and sized with a tamra-labelled size marker with bands of known size every 50 bp. All gels were analysed using genescan Analysis 2.0 software (Applied Biosystems).

Table 1.  Genetic variability of the seven microsatellite loci used in the study of Luxembourg badgers. The samples were obtained from road kills collected in 2000 and 2001. n, number of individuals analysed; A, number of different alleles observed; HE, expected heterozygosity; HO, observed heterozygosity; PID-Sib/locus, sibling probability of identity for individual loci (the loci in the table are arranged in order of increasing PID-Sib values); PID-Sib product, cumulative product of individual PID-Sib values; PID-Obs, proportion of all possible pairs of individuals that had identical genotypes after loci were added consecutively

| Locus | n | A | Observed allele size range (bp) | HE | HO | PID-Sib/locus | PID-Sib product | PID-Obs |
|---|---|---|---|---|---|---|---|---|
| Mel-105 | 52 | 6 | 138–148 | 0·79 | 0·73 | 0·38 | 3·75 × 10⁻¹ | 6·11 × 10⁻² |
| Mel-116 | 52 | 10 | 113–136 | 0·75 | 0·73 | 0·40 | 1·49 × 10⁻¹ | 4·52 × 10⁻³ |
| Mel-113 | 52 | 6 | 120–132 | 0·71 | 0·77 | 0·43 | 6·34 × 10⁻² | 0 |
| Mel-106 | 52 | 7 | 216–228 | 0·69 | 0·56 | 0·44 | 2·79 × 10⁻² | 0 |
| Mel-117 | 52 | 6 | 174–195 | 0·69 | 0·69 | 0·44 | 2·23 × 10⁻² | 0 |
| Mel-111 | 52 | 4 | 132–142 | 0·58 | 0·50 | 0·52 | 6·35 × 10⁻³ | 0 |
| Mel-109 | 52 | 4 | 106–129 | 0·33 | 0·35 | 0·70 | 4·46 × 10⁻³ | 0 |

reliability of hair dna typing and identification of genetic profiles of individual badgers

By following the guidelines of Cheeseman & Mallinson (1979), 13 badgers were captured near the five target setts in 2002 and 2003. A hair sample was taken from each captured animal in order to obtain reference profiles to which the profiles generated from remotely plucked hairs could be compared. The reliability of the reference profiles was ensured by including at least 10 hairs in each extraction (Goossens, Waits & Taberlet 1998) and by genotyping every sample twice at all loci. After a single round of amplification, the profiles obtained from remotely collected hair DNA were tested for their accuracy. Samples that gave rise to three or more alleles at any locus originated from more than one animal, or could have been cross-contaminated, and were excluded from the analysis. gimlet 1.3.3 (Valière 2002) was used to compare the reference profiles with the remote DNA profiles and to group profiles together that were 100% identical.
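The screening and grouping steps described above can be sketched in a few lines. This is only an illustration of the logic (exclude any sample with three or more alleles at a locus, then group profiles that are 100% identical) and not a reproduction of how gimlet works internally. All sample names, locus names and genotypes are hypothetical.

```python
from collections import defaultdict

def is_mixed(profile):
    """Return True if any locus shows three or more distinct alleles."""
    return any(len(set(alleles)) >= 3 for alleles in profile.values())

def group_identical(profiles):
    """Group sample names by profile; only 100% identical profiles share a group."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        # Sort loci and alleles so that allele order within a locus is ignored.
        key = tuple((locus, tuple(sorted(profile[locus]))) for locus in sorted(profile))
        groups[key].append(name)
    return list(groups.values())

# Hypothetical two-locus data: one reference animal and three remote samples.
profiles = {
    "reference_1": {"locus_1": (142, 144), "locus_2": (121, 136)},
    "hair_01": {"locus_1": (144, 142), "locus_2": (121, 136)},
    "hair_02": {"locus_1": (142, 142), "locus_2": (132, 136)},
    "hair_03": {"locus_1": (142, 144, 146), "locus_2": (121, 136)},
}
usable = {name: p for name, p in profiles.items() if not is_mixed(p)}
groups = group_identical(usable)
print(groups)
```

In this example hair_03 is excluded as a likely multiple-individual sample, hair_01 matches the reference profile, and hair_02 remains as a unique profile that would need further checks.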

Remote DNA profiles that were observed only once and did not match any reference profiles could have been obtained for a number of reasons. First, they could have originated from an unknown animal that was sampled only once. Secondly, they could correspond to a multiple-individual sample that showed no more than two alleles at any of the loci examined. Thirdly, a genotyping error could have occurred. Finally, they could be the result of a multiple-individual sample that was genotyped with errors. In order to exclude genotyping errors, the unique profiles were amplified a total of three times.

Multiple-individual samples could be a mixture of two known profiles, one known and one unknown profile, or two unknown profiles. To test the first possibility, all the available reliable single-badger profiles were compared by hand to test whether a combination could be found that would give rise to the observed unique profiles. The possibility that a unique profile was a mixture of a known and unknown profile was tested by comparing the profile in question with all the single-badger profiles on a pairwise basis. If three different alleles were observed at a specific locus in a pairwise comparison this possibility was excluded. The probability of a unique fingerprint originating from one rather than two unknown animals was estimated by means of the likelihood ratio of Weir et al. (1997). The likelihood of generating a genotype homozygous for an allele A (that has a frequency of p) from a single contributor rather than two DNA contributors was $p^{2}/p^{4}$. The likelihood of generating a heterozygous genotype with alleles A and B (with frequencies p and q) from a single contributor rather than two contributors was $[(p + q)^{2} - p^{2} - q^{2}]/[(p + q)^{4} - p^{4} - q^{4}]$. Likelihood ratios were multiplied across loci.
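The likelihood ratio described above can be computed directly from allele frequencies. The sketch below applies the two expressions given in the text and multiplies the per-locus ratios. The locus names, genotypes and allele frequencies are hypothetical and serve only to show the arithmetic.

```python
def lr_locus(genotype, freqs):
    """Likelihood ratio (one contributor vs. two) for a single locus.

    genotype: pair of alleles observed at the locus.
    freqs: mapping from allele to its population frequency.
    """
    first, second = genotype
    if first == second:
        p = freqs[first]
        return p ** 2 / p ** 4
    p, q = freqs[first], freqs[second]
    single = (p + q) ** 2 - p ** 2 - q ** 2
    double = (p + q) ** 4 - p ** 4 - q ** 4
    return single / double

def lr_profile(profile, freqs_by_locus):
    """Multiply the per-locus likelihood ratios across all loci."""
    ratio = 1.0
    for locus, genotype in profile.items():
        ratio *= lr_locus(genotype, freqs_by_locus[locus])
    return ratio

# Hypothetical two-locus profile and allele frequencies.
profile = {"locus_1": (144, 144), "locus_2": (121, 132)}
freqs_by_locus = {
    "locus_1": {144: 0.5},
    "locus_2": {121: 0.25, 132: 0.25},
}
print(lr_profile(profile, freqs_by_locus))
```

Rare alleles give large ratios, so a profile that is homozygous or heterozygous for uncommon alleles at several loci quickly becomes far more likely to stem from one animal than from two.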

Some researchers have tried to identify genotyping errors by comparing mother–offspring pairs and verifying that they share an allele at each locus as expected (Vigilant et al. 2001). This approach could be misleading in badgers, where animals are known to disperse from their native groups (Cheeseman et al. 1988; Christian 1994; Revilla & Palomares 2002) even though social groups are thought to form primarily through philopatry (Kruuk & Parish 1982; da Silva, Macdonald & Evans 1994). Revilla & Palomares (2002) identified annual dispersal rates of between 0·14 and 0·33 per territory.

estimation of local badger abundance

In order to simulate an actual census operation, we estimated local badger abundance using only the samples collected during the main 3-week study period. As a result of the relatively small number of different profiles collected during this period, mark–recapture analysis was unlikely to produce meaningful results so a rarefaction curve method was adopted to estimate population size. In this method, population size corresponds to the projected asymptote of a function of number of samples analysed vs. the cumulative number of unique genetic profiles.

Three possible equations for the function to be fitted to the accumulation plot have been suggested in the literature. (i) Kohn et al. (1999) used the following hyperbolic function to estimate coyote numbers with faecal genotyping: $y = ax/(b + x)$, where y = cumulative number of genetic profiles, x = number of genotypes sampled, a = asymptote (or population size estimate) and b = non-linear slope of the function. This equation is referred to henceforth as Kohn's equation. (ii) Eggert, Eggert & Woodruff (2003) used an exponential function (Eggert's equation) to estimate elephant abundance from faecal DNA typing: $y = a(1 - e^{bx})$. (iii) In the manual of the program gimlet (Valière 2002), D. Chessel suggests using the equation $y = a - a(1 - 1/a)^{x}$, corresponding to the expectation of the number of full boxes when x balls are distributed in a boxes. This equation will be referred to as Chessel's equation. Parameters in (ii) and (iii) are the same as in (i).
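As an illustration, the three functions can be written out and evaluated. Chessel's equation is taken here as y = a − a(1 − 1/a)^x, the form that matches the stated expectation of the number of full boxes. The parameter values below are hypothetical and chosen only to show that each curve approaches the asymptote a. In Eggert's equation b must be negative for the curve to level off.

```python
import math

def kohn(x, a, b):
    """Hyperbolic function: y = a*x / (b + x)."""
    return a * x / (b + x)

def eggert(x, a, b):
    """Exponential function: y = a*(1 - exp(b*x)); b < 0 gives an asymptote at a."""
    return a * (1.0 - math.exp(b * x))

def chessel(x, a):
    """Expected number of full boxes when x balls are distributed in a boxes."""
    return a - a * (1.0 - 1.0 / a) ** x

# Hypothetical parameters: asymptote a = 14 for all three curves.
for x in (1, 10, 66, 1000):
    print(x, round(kohn(x, 14, 10), 2), round(eggert(x, 14, -0.05), 2),
          round(chessel(x, 14), 2))
```

In Kohn's equation, b is the number of samples at which half of the asymptote is reached, which is why that curve tends to approach a more slowly than the other two.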

Program r (Ihaka & Gentleman 1996) can be used to perform analyses of the rarefaction curves using a script and data input file generated in gimlet 1.3.3 (Valière 2002). gimlet generates the data input file by regrouping and counting the samples that have an identical genetic profile. As the order in which the samples are added affects the shape of the accumulation curve (Colwell & Coddington 1994), the order of the profiles in the data set was randomized 1000 times and the asymptote was projected using the three equations described above for each of these randomizations. The mean value of all iterations for the asymptote, a, was taken to be the population estimate. The variance of the a estimate was analysed by calculating the SD and the 95% confidence intervals (CI) of that mean.
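The randomization procedure can be sketched as follows. This is a simplified stand-in for the gimlet and R workflow, not a reproduction of it. It fits only the one-parameter Chessel equation, taken here as y = a − a(1 − 1/a)^x, and uses a plain grid search rather than a non-linear regression routine. The sample counts are the period A counts from Table 3. The number of iterations, the grid limits and the random seed are assumptions made to keep the example fast.

```python
import random
import statistics

def accumulation_curve(samples, rng):
    """Cumulative number of distinct profiles after each sample, in random order."""
    order = list(samples)
    rng.shuffle(order)
    seen, curve = set(), []
    for profile in order:
        seen.add(profile)
        curve.append(len(seen))
    return curve

def chessel(x, a):
    """Expected number of full boxes when x balls are distributed in a boxes."""
    return a - a * (1.0 - 1.0 / a) ** x

def fit_chessel(curve, a_min=1.0, a_max=60.0, step=0.05):
    """Least-squares estimate of a, found by scanning a grid of candidate values."""
    best_a, best_sse = a_min, float("inf")
    n_steps = int(round((a_max - a_min) / step))
    for i in range(n_steps + 1):
        a = a_min + i * step
        sse = sum((y - chessel(x, a)) ** 2 for x, y in enumerate(curve, start=1))
        if sse < best_sse:
            best_a, best_sse = a, sse
    return best_a

# Samples per profile in period A, read from Table 3 (66 samples, 14 profiles).
counts = [2, 1, 5, 1, 19, 10, 5, 1, 4, 3, 11, 2, 1, 1]
samples = [profile for profile, n in enumerate(counts) for _ in range(n)]

rng = random.Random(1)   # fixed seed so the sketch is repeatable
n_iterations = 20        # the study used 1000; kept small here for speed
estimates = [fit_chessel(accumulation_curve(samples, rng)) for _ in range(n_iterations)]
mean_a = statistics.mean(estimates)
sd_a = statistics.stdev(estimates)
print(f"mean a = {mean_a:.2f}, SD = {sd_a:.2f}")
```

Because each randomization yields its own fitted asymptote, the spread of the estimates reflects sensitivity to sample order. The two-parameter Kohn and Eggert equations would need a proper non-linear least-squares fit in place of the grid search.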


Results

genetic diversity of badgers in luxembourg

The suitability of the seven microsatellite loci for genetic profiling of Luxembourg badgers was tested using 52 complete fingerprints obtained from road kills (Table 1). Two loci (Mel-111 and Mel-116) departed from Hardy–Weinberg expectations at the 0·05 level before adjustment with the sequential Bonferroni test, but not after. There was no linkage disequilibrium between any pair of loci. The average number of alleles per locus was 6·14 (SD = 2·04, range 4–10) and the mean expected heterozygosity value was 0·65 (SD = 0·15; range 0·33–0·79). The PID-Sib calculation suggested that, in the Luxembourg population, the six most informative loci would be sufficient to distinguish between sibling badgers with more than 99% certainty (Table 1). The observed PID showed that the proportion of individuals with identical profiles dropped to zero if the three most informative loci were included in a genetic profile. Considering only the 13 animals caught at the study site, PID-Sib across the seven loci was calculated to be 0·012. However, the two most informative loci were enough to distinguish between all the individuals.

success of hair capture

Overall, 113 hair samples were collected at the five target setts during the 6-month study, of which 71 were collected during the main 3-week study period (Table 2). Not more than a single guard hair could be included in about one-third of all the extractions.

Table 2.  Success of hair capture, DNA extraction and badger censuses in the five social groups for the period from 14 February to 7 March 2003. Hairs were remotely plucked with barbed wire hair traps. Independent estimates of group size were obtained by observation of setts

| Social group | Ermsdorf 1 | Ermsdorf 2 | Knäipenhecken | Bëlz | Grott | Total |
|---|---|---|---|---|---|---|
| Hair samples collected | 35 | 9 | 5 | 16 | 6 | 71 |
| Samples that yielded DNA | 34 | 9 | 5 | 14 | 4 | 66 |
| No. of genetic profiles | 3 | 4 | 2 | 2 | 3 | 14 |
| No. of observed badgers | 3 | 5 | 1* | 2 | 2* | 13 |

* Minimum number (see text for explanation).

reliability of hair dna typing

Of the 113 hair samples, 108 gave rise to a complete profile after a single round of amplification, with only five samples not containing any amplifiable DNA. Three profiles contained more than two alleles at one locus and were excluded from the analysis. The accuracy of the remaining 105 profiles was tested by comparing them with reliable reference profiles obtained from trapped animals (Table 3). There was a 100% match between 94 hair DNA profiles and the reference profiles. Of these 94 profiles, 31 were generated from a DNA extract obtained from a single guard hair.

Table 3.  Genetic profiles generated from badgers in the five social groups under investigation. Individuals whose profiles were only known through non-invasive hair capture are marked with – in the second column. The frequency with which the different genetic profiles were generated from 105 remote hair DNA extracts is given in the last three columns. Period A, 14 February 2003–7 March 2003; period B, 17 March 2003–2 May 2003; period C, 9 December 2002–25 January 2003. During period C, samples were only collected from Ermsdorf 2; – in the last three columns indicates that the profile of the captured badger had not been obtained non-invasively. The order of the loci in the table corresponds to the increasing probability of identity values determined for the local study population (see text)

| Individual | Year and status when caught | Alleles at microsatellite loci under investigation | Observations in period A | Observations in period B | Observations in period C |
|---|---|---|---|---|---|
| Ermsdorf 2 social group | | | | | |
| EMa1 | 2002, adult | 142 144; 121 136; 193 193; 124 130; 222 224; 132 138; 116 116 | 2 | 1 | 1 |
| EMa2 | 2002, adult | 144 144; 123 132; 187 193; 124 124; 222 224; 132 132; 116 116 | 1 | 0 | 0 |
| EMa3 | 2002, cub | 144 146; 121 123; 193 193; 124 124; 222 224; 132 132; 116 116 | 5 | 0 | 2 |
| EMa4 | 2002, cub | 142 144; 121 123; 174 193; 124 124; 222 222; 132 132; 116 116 | 0 | 0 | 10 |
| Profile A | – | 142 144; 121 132; 193 193; 124 130; 222 222; 138 138; 116 116 | 1 | 0 | 0 |
| Ermsdorf 1 social group | | | | | |
| EMb1 | 2003, adult | 142 144; 132 136; 193 193; 124 124; 218 222; 132 138; 116 116 | 19 | 7 | NA |
| EMb2 | 2002, cub | 142 142; 132 136; 174 193; 124 124; 222 222; 132 138; 116 116 | – | – | – |
| EMb3 | 2002, cub | 142 142; 121 136; 193 195; 124 130; 222 224; 132 132; 116 116 | 10 | 2 | NA |
| Profile B | – | 142 146; 121 132; 174 174; 124 130; 222 224; 132 138; 116 116 | 5 | 1 | NA |
| Knäipenhecken social group | | | | | |
| KH1 | 2003, adult | 148 148; 123 123; 174 195; 126 130; 222 222; 132 140; 106 116 | 1 | 0 | NA |
| KH2 | 2002, cub | 146 148; 123 136; 174 174; 130 130; 222 222; 138 140; 106 116 | 4 | 5 | NA |
| KH3 | 2002, cub | 146 148; 123 132; 174 189; 130 130; 222 222; 132 140; 106 116 | – | – | – |
| Bëlz social group | | | | | |
| B1 | 2003, adult | 144 146; 132 136; 187 189; 124 124; 222 222; 132 138; 116 116 | 3 | 2 | NA |
| B2 | 2003, adult | 144 146; 123 132; 174 191; 124 124; 218 222; 132 140; 116 116 | 11 | 2 | NA |
| Grott social group | | | | | |
| G1 | 2003, adult | 142 142; 123 136; 174 189; 120 124; 222 224; 132 140; 116 116 | 2 | 4 | NA |
| Profile C | – | 144 146; 132 132; 174 193; 124 130; 222 222; 140 140; 116 116 | 1 | 2 | NA |
| Profile D | – | 142 144; 123 132; 189 193; 124 124; 224 224; 132 140; 116 116 | 1 | 0 | NA |

The 11 samples that did not match any reference profiles were compared with one another to identify possible amplification errors. Four different profiles were identified. Two profiles (called B and C) were observed more than once (Table 3) and were considered accurate because identical genotypes were generated from reliable samples obtained from more than 10 outer guard hairs as well as from single-hair extracts. The two remaining profiles, A and D, were amplified a total of three times at all the loci. The same results were obtained all three times, so that a single round of amplification would have been sufficient to generate a reliable profile.

At the start of the study, we tried to predict the quality of the 99 samples collected from 14 February onwards by amplifying a 533-bp product of the c-myc proto-oncogene. A band of the expected size could be obtained from 92 extracts and these samples subsequently also allowed the generation of an accurate genetic profile in a single round of amplification.

To summarize, in a total of 749 positive amplifications of microsatellite loci from non-invasively collected hair DNA, there were no obvious genotyping errors. Even DNA samples obtained from single outer guard hairs provided reliable genotypes after only one round of amplification.

counting badgers by direct observation

Hair DNA typing results were validated by censuses obtained from direct observation of the five setts. Overall, the minimum number of badgers counted in the five social groups was 13 (Table 2).

identification of the genetic profiles of individual badgers

From the 105 usable DNA samples collected from December 2002 to May 2003, a total of 15 different profiles was generated (Table 3). Of these, 11 could be matched to known profiles from badgers captured during 2002 and 2003. As explained above, profiles B and C were scored reliably from single-hair extracts. Comparison of the reliable single-badger profiles showed that, assuming equal contribution from different genomes, the remaining profiles A and D could neither be a mixture of any two known individuals nor of a known and an unknown individual. The Weir et al. (1997) likelihood ratio suggested that it was 1·9 × 10⁵ or 1·0 × 10⁴ times more likely that profiles A and D, respectively, originated from a single rather than two unknown individuals. The genetic profiles of badgers EMb2 and KH3 that had been captured as cubs in 2002 at Ermsdorf 1 and Knäipenhecken setts, respectively, were not obtained from the remotely collected hair samples (Table 3).

The results suggest that the 15 different profiles are genuine and that a minimum of 15 different animals was present at the five setts early in 2003. At Grott and Knäipenhecken setts, one more genetic profile was generated from the hair samples collected during 6 months than animals that had been observed. At the remaining three setts, direct counts and numbers of profiles were identical (Tables 2 and 3).

estimation of local badger abundance

In order to simulate an actual application of the hair trapping technique, only hair samples collected during the main study period from 14 February to 7 March were used to estimate the local abundance of badgers. Given that one sample originated from more than one individual and that four extractions did not produce any DNA, a total of 66 profiles was available for the analyses. During the 3-week study, a genetic profile was obtained from 14 of the 15 animals known to be present, even though only small numbers of samples were collected from Ermsdorf 2, Knäipenhecken and Bëlz setts (Table 2).

When estimating the total size of the local badger population, the program R found an asymptote for each equation in all 1000 iterations of the rarefaction method (Fig. 1). Using Chessel's equation, the asymptotic population size was estimated as 12·36 ± 0·92 (SD) with a 95% CI of 12·31–12·42. This estimate was therefore lower than the number of different profiles obtained during the 3-week study. Eggert's equation generated an estimate of 14·23 ± 1·45 (SD) badgers with a 95% CI of 14·14–14·32. Rounding up to the next integer, this estimate was equivalent to the actual number of profiles obtained from the 105 usable single-badger hair samples collected between December 2002 and May 2003. Kohn's equation suggested that 18·75 ± 3·07 (SD) badgers would be present in the study area, with a 95% CI of 18·56–18·94.
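The text does not say how the 95% CIs were computed. The short Python check below shows that the reported intervals are numerically consistent with a normal-approximation interval for the mean of the 1000 iterations, i.e. mean ± 1·96 × SD/√1000. That formula is an assumption made here for illustration; the means, SDs and intervals are copied from the paragraph above.

```python
import math

N_ITER = 1000   # number of randomized regressions reported in the text
Z_95 = 1.96     # normal quantile; using a normal interval is an assumption

# (mean, SD, reported 95% CI) for each equation, copied from the text.
reported = {
    "Chessel": (12.36, 0.92, (12.31, 12.42)),
    "Eggert": (14.23, 1.45, (14.14, 14.32)),
    "Kohn": (18.75, 3.07, (18.56, 18.94)),
}

def mean_ci(mean, sd, n=N_ITER, z=Z_95):
    """Normal-approximation confidence interval for the mean of n values."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

for name, (mean, sd, ci) in reported.items():
    low, high = mean_ci(mean, sd)
    print(f"{name}: computed {low:.2f}-{high:.2f}, reported {ci[0]}-{ci[1]}")
```

If this reading is right, the narrow intervals describe how precisely the mean asymptote is known across randomizations of the sample order, whereas the SD is the better guide to how much a single projection can vary.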

Figure 1.

Estimation of badger population size using rarefaction analysis of genotypes from hairs. Regression curves correspond to the mean of the coefficients calculated for the three equations used in this study after 1000 iterations of the regression, with the sample order randomized each time. Circles, combination of all the accumulation plots obtained after 1000 randomizations of the sample order; solid black line, Eggert's equation; short dashes, Kohn's equation; long dashes, Chessel's equation.
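The procedure summarized in the caption (shuffle the sample order, build an accumulation curve, fit a curve and read off its asymptote) can be sketched as follows. This is a hedged illustration, not the script used in the study. The three functional forms are those commonly given for the Kohn, Eggert and Chessel methods and are not restated in this section, so they should be treated as assumptions; the data are simulated; and the one-parameter Chessel curve is fitted with a coarse grid scan standing in for proper non-linear regression.

```python
import math
import random

def accumulation_curve(sample_profiles, rng):
    """Distinct profiles seen after each sample, with the sample order shuffled."""
    order = list(sample_profiles)
    rng.shuffle(order)
    seen, curve = set(), []
    for profile in order:
        seen.add(profile)
        curve.append(len(seen))
    return curve

# Assumed functional forms: x = number of samples typed, a = asymptote.
def kohn(x, a, b):
    return a * x / (b + x)

def eggert(x, a, b):
    return a * (1.0 - math.exp(b * x))  # b is negative

def chessel(x, a):
    return a - a * (1.0 - 1.0 / a) ** x

def fit_chessel(curve):
    """Least-squares asymptote of the Chessel curve, found by a grid scan."""
    def sse(a):
        return sum((chessel(x, a) - y) ** 2 for x, y in enumerate(curve, start=1))
    candidates = [k / 20 for k in range(21, 1001)]  # 1.05 to 50.0 in steps of 0.05
    return min(candidates, key=sse)

# Hypothetical data: 66 samples drawn at random from 15 labelled animals.
rng = random.Random(1)
samples = [rng.randrange(15) for _ in range(66)]
estimates = [fit_chessel(accumulation_curve(samples, rng)) for _ in range(20)]
print(sum(estimates) / len(estimates))  # mean asymptote over 20 randomizations
```

The two-parameter Kohn and Eggert curves would be fitted in the same loop with a non-linear least-squares routine; only the curve function and the number of fitted coefficients change.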


genetic variability of Luxembourg badgers

Previous studies, using different types of genetic markers, have reported low to moderate values for the genetic variability of Eurasian badgers both in the UK (Carpenter et al. 2003; Domingo-Roura et al. 2003) and on the European mainland (Bijlsma et al. 2000; Domingo-Roura et al. 2003). Compared with these studies, the average HE of 0·64 in Luxembourg was relatively high. However, high variability (in addition to small product size) was one of the criteria for choosing our specific loci out of the panel of 39 markers reported by Carpenter et al. (2003). PID-Sib values showed that the seven loci exhibited sufficient variability to produce individual-specific profiles with more than 99% certainty in the Luxembourg population, but not in the local study population. However, given the fact that PID-Sib is the upper limit of the possible ranges of PID in a population (Waits, Luikart & Taberlet 2001) and that the two most informative loci were enough to distinguish the profiles of the animals captured in the core study area, we judged the seven loci to be sufficient to produce individual-specific profiles.
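For readers unfamiliar with the statistics used here, the sketch below computes expected heterozygosity and the probability of identity for unrelated individuals and for siblings, using the standard single-locus formulas summarized by Waits, Luikart & Taberlet (2001) and multiplying across loci. The allele frequencies are hypothetical and are not the Luxembourg data; the function names are illustrative only.

```python
import math

def expected_heterozygosity(freqs):
    """H_E = 1 - sum of squared allele frequencies at one locus."""
    return 1.0 - sum(p ** 2 for p in freqs)

def pid_unrelated(freqs):
    """Probability that two unrelated individuals share a genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * freqs[i] * freqs[j]) ** 2
                        for i in range(len(freqs))
                        for j in range(i + 1, len(freqs)))
    return homozygotes + heterozygotes

def pid_sib(freqs):
    """Probability that two full siblings share a genotype at one locus."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4

# Hypothetical frequencies: seven loci, each with four alleles.
loci = [[0.4, 0.3, 0.2, 0.1]] * 7
print("mean H_E:", sum(expected_heterozygosity(f) for f in loci) / len(loci))
print("multilocus PID:", math.prod(pid_unrelated(f) for f in loci))
print("multilocus PID-sib:", math.prod(pid_sib(f) for f in loci))
```

Because the sibling value is the larger of the two at every locus, the multilocus PID-Sib acts as the conservative upper bound referred to above.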

predicting the quality of DNA extracts

Amplification of the 533-bp fragment of exon 3 of the c-myc proto-oncogene appears to be a good predictor of DNA quality. The primers used are conserved across 18 mammalian species (Miyamoto, Porter & Goodman 2000; Smith, Vigilant & Morin 2002) and could thus also be used in work on other species.

hair-sampling as a practical method of estimating population size

If genotyping of non-invasively collected hair samples is to be an effective way of estimating the size of badger populations, it needs to be reliable, accurate and cost-effective. Here, we assess our results with respect to these three attributes.

As regards reliability, it is necessary that a high proportion of hair samples yields reliable genotypes without the need for repeated amplifications. Of 113 hair samples, five did not produce any amplifiable DNA and three originated from at least two different animals. By comparing with reliable reference genotypes of captured badgers, by comparing samples amongst themselves and by triple amplification of unique profiles, we showed that all extracts that contained amplifiable DNA, including those (about a third of the samples) that were obtained from single guard hairs, produced 100% accurate profiles in a single round of amplifications. Thus, the DNA extracted from remotely collected badger hairs allowed error-free genotyping.

A previous study has been equally successful in obtaining reliable genotypes from single hairs (Sloane et al. 2000) whereas others have needed to pool up to 10 hair follicles (Goossens, Waits & Taberlet 1998; Woods et al. 1999). A comparison of these studies suggests that a possible explanation for these discrepancies might be a delay between hair plucking and DNA extraction. According to Roon, Waits & Kendall (2003), the DNA quality of hair samples started to degrade after a storage period of 6 months and we suggest that it is important for DNA to be extracted as soon as possible after collection of the hair, as was done in our study. It is less likely that discrepancies between the studies arose from differences in the laboratory procedures as the studies mentioned above used exactly the same extraction procedure, namely a simple 5% Chelex-100 protocol. The advantage of this method is that it is simple enough to be carried out in the field on the day of sample collection to reduce the delay between collection and extraction.

It is possible that, in a high-density population, the proportion of samples that are a mix of two individuals will be greater than observed here. However, even though reliable genotypes can be obtained from single-hair extracts, on a practical level it is desirable to pool hairs in order to increase DNA quantity, the amount of PCR product and, ultimately, the ease with which samples can be genotyped. In future applications of this methodology, we recommend that two extractions should be performed with multiple-hair samples, one containing a single guard hair with a clearly visible follicle and the other containing all the remaining hairs. While mainly working with the pooled extracts, the corresponding single-hair extracts could always be used at a later stage to confirm or reject the profile generated using the first extract.

As regards accuracy, we tried to validate the counts obtained from genetic profiles by direct enumeration of badgers observed at the same setts. Overall, 13 animals were observed at the five setts whereas genotyping of hair samples yielded 15 profiles. However, because two setts, Grott and Knäipenhecken, were difficult to observe, it is unlikely that all the badgers from these setts were counted. Also, it is generally suspected that direct observation leads to underestimation of population sizes (Macdonald, Mace & Rushton 1998; Tuyttens et al. 2001), and the fact that more profiles were generated than badgers were counted by direct observation is consistent with this view. Nevertheless, it is encouraging that at Ermsdorf 1, Ermsdorf 2 and Grott setts, where badgers were relatively easy to observe and where most of the hair samples were collected, the number of badgers observed corresponded exactly with the number of genetic profiles compiled from hair samples.

The genetic profiles of two juvenile badgers that were caught in 2002 near the five target setts were not identified from the hair samples collected non-invasively in February/March 2003. Badger EMb2, a female that was captured and radio-collared in November 2002, was found dead on 29 March 2003 on a road 8 km linear distance away from its natal sett. It therefore seems likely that the animal had already dispersed at the start of the hair-capture exercise. As no such claim can be made about female badger KH2, captured as a cub in 2002 at Knäipenhecken sett, it is possible that a total of 16 badgers was present in the study area. Both males and females have been shown to disperse from their natal group (Cheeseman et al. 1988; Christian 1994; Revilla & Palomares 2002).

We estimated the true size of the population by applying a rarefaction analysis using the equations of Kohn et al. (1999), Eggert, Eggert & Woodruff (2003) and Valière (2002). The asymptotic population size obtained using Chessel's equation was lower than the number of different profiles identified, while Kohn's equation suggested that about four animals remained undetected using hair capture. With Eggert's equation, the estimated population size of 14·23 animals corresponded well not only with the number of genetic profiles obtained over a long study period (i.e. 15) but also with the higher estimate of 16 badgers. Thus, applying a rarefaction analysis based on Eggert's equation to a data set collected over a 3-week period generated a result similar to the best alternative baseline estimate.

Two studies have reported simulations on the accuracy of the projected results of all three equations for the rarefaction curve and the results reported here, especially the superiority of Eggert's equation, correspond well with the theoretical predictions. The gimlet manual (Valière 2002) reported results from limited simulations on the accuracy of Kohn's and Chessel's methods. It was predicted that the estimates generated using Chessel's method would be lower than the ones produced by Kohn's. Furthermore, in the presence of heterogeneity of capture probability amongst individuals, Chessel's method would underestimate population size while Kohn's method would generate an overestimate if a large proportion of the population had been sampled. Eggert, Eggert & Woodruff (2003) compared the accuracy of the estimates generated using Kohn's and Eggert's equations for the rarefaction curves. The results suggested that, while Kohn's method significantly overestimated population size, Eggert's approach would produce consistently unbiased results.

Finally, we turn to the issue of cost-effectiveness. During the main study, a relatively large number of hair samples (71) was collected at the five hair traps during a relatively short collection period (3 weeks). This was sufficient to provide a good estimate of the number of badgers in each social group, without the need for expensive repeated amplifications of DNA. However, setts needed to be prebaited for up to 4 months in order to attract badgers to the bait, and at one sett (Grott) bait was never taken. Lengthy prebaiting would obviously lower the cost-effectiveness of the technique.

In Luxembourg, badgers are unused to the presence of humans and this may be why they are relatively bait-shy (Schley 2000). In the UK, badgers are readily attracted to peanut bait (Delahay et al. 2000) so that the use of baited barbed wire enclosures should be feasible. An alternative approach, however, would be to suspend barbed wire over well-used badger runs, as was done at the Grott sett. The fact that this technique yielded three different profiles from four usable hair samples collected at a single run suggests that it deserves further testing. It might also be possible to suspend barbed wire or double-sided adhesive tape over sett entrances (Sloane et al. 2000).

To illustrate the efficacy of genotyping badger hair DNA compared with badger faecal DNA, it is worth considering the requirements for genotyping 100 samples of both types of DNA extract. First, virtually all the hair samples collected yielded fully amplifiable DNA, while this was only the case with 74% of faecal samples (Frantz et al. 2003). Thus fewer hair samples need to be collected and extracted to obtain the desired quantity of DNA samples. Secondly, while it would in principle require 700 PCR reactions to obtain genetic profiles consisting of seven loci from hair DNA, approximately 2240 reactions would be required to obtain reliable profiles from faecal DNA (with an average of 3·2 PCR per locus per genotype; Frantz et al. 2003). Thirdly, failed reactions occurred more frequently with faecal than hair DNA. Generally, sufficient PCR product was obtained from badger hair DNA to allow the visualization of all seven microsatellite loci in one lane of a polyacrylamide gel. This was not the case with faecal DNA, where often a microsatellite locus could only be visualized if it was run alone in a single gel lane. In other words, considerably more polyacrylamide gels are required when working with faecal compared with hair DNA samples. As a rough guide, genotyping 100 faecal samples would cost about £1000 (1500 euros) more, in consumables alone, than genotyping the same number of hair samples.
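The reaction counts quoted above follow from simple arithmetic, reproduced here as a small Python sketch. The figures are taken from the text; the number of faecal samples that would need to be collected is derived here from the 74% success rate and is an illustrative assumption rather than a value reported by the authors.

```python
import math

N_PROFILES = 100            # target number of genotyped samples (from the text)
N_LOCI = 7                  # microsatellite loci per profile (from the text)
FAECAL_PCR_PER_LOCUS = 3.2  # mean PCRs per locus per genotype (Frantz et al. 2003)
FAECAL_SUCCESS = 0.74       # proportion of faecal samples with amplifiable DNA

# One reaction per locus per sample for hair DNA.
hair_reactions = N_PROFILES * N_LOCI
# Repeated amplifications are needed for reliable faecal genotypes.
faecal_reactions = round(N_PROFILES * N_LOCI * FAECAL_PCR_PER_LOCUS)
# Derived assumption: samples to collect so that about 100 yield usable DNA.
faecal_samples_needed = math.ceil(N_PROFILES / FAECAL_SUCCESS)

print(hair_reactions, faecal_reactions, faecal_samples_needed)
```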


Our results show that genotyping of remotely plucked badger hair does not suffer from the drawbacks of faecal DNA typing. Reliable microsatellite profiles were obtained in a single round of amplifications, even from single-hair extracts. Baited barbed wire enclosures or suspension of barbed wire over sett entrances or clearly visible badger runs should allow easy collection of hair samples from most members of a social group, independent of population density. As we demonstrate that population size estimated from remotely collected hair is similar to a conservative baseline estimate, this method has the potential to form the basis of a feasible and practicable technique of estimating badger abundance, applicable independently of habitat characteristics and over a range of population densities. If methods of hair collection can be improved or prebaiting time reduced, the methodology will also be cost-effective.

Based on the present study and on our work with faecal DNA (Frantz et al. 2003; Wilson et al. 2003), we suggest that wildlife researchers working with badgers consider remotely plucked hair rather than faeces as a source of non-invasive DNA. However, it should be emphasized that the reliability of faecal DNA is species-dependent (Piggott & Taylor 2003a) and that it can be more practical and less disruptive to the target species to genotype faeces rather than plucked hair DNA (Vigilant et al. 2001; Eggert, Eggert & Woodruff 2003). Nevertheless, hair trapping is a feasible approach in a variety of species, even in populations that are small and sparsely distributed (Foran, Minta & Heinemeyer 1997; Woods et al. 1999).


We would like to thank Edmée Engel and Guy Colling for their enthusiasm and support of the project. The laboratory work was initiated at the NERC-funded Sheffield Molecular Genetics Facility headed by Terry Burke and co-ordinated by Deborah Dawson, where the road kill samples were analysed. We would like to thank Andy Krupa, Sylvie Hermant, Wim Ammerlaan and Mick M. Mulders for help in the laboratory. Nathaniel Valière provided us with script files for the rarefaction analysis. A. C. Frantz was supported by a Bourse de Formation-Recherche of the Ministère de la Culture, de l’Enseignement Supérieur et de la Recherche, Luxembourg.