Microsatellite genotyping errors: detection approaches, common sources and consequences for paternal exclusion


Joseph I. Hoffman, Fax: +44 1223 336676; E-mail: jih24@cam.ac.uk


Microsatellite genotyping errors will be present in all but the smallest data sets and have the potential to undermine the conclusions of most downstream analyses. Despite this, little rigorous effort has been made to quantify the size of the problem and to identify the commonest sources of error. Here, we use a large data set comprising almost 2000 Antarctic fur seals Arctocephalus gazella genotyped at nine hypervariable microsatellite loci to explore error detection methods, common sources of error and the consequences of errors on paternal exclusion. We found good concordance among a range of contrasting approaches to error-rate estimation, our range being 0.0013 to 0.0074 per single locus PCR (polymerase chain reaction). The best approach probably involves blind repeat-genotyping, but this is also the most labour-intensive. We show that several other approaches are also effective at detecting errors, although the most convenient alternative, namely mother–offspring comparisons, yielded the lowest estimate of the error rate. In total, we found 75 errors, emphasizing their ubiquitous presence. The most common errors involved the misinterpretation of allele banding patterns (n = 60, 80%) and of these, over a third (n = 22, 36.7%) were due to confusion between homozygote and adjacent allele heterozygote genotypes. A specific test for whether a data set contains the expected number of adjacent allele heterozygotes could provide a useful tool with which workers can assess the likely size of the problem. Error rates are also positively correlated with both locus polymorphism and product size, again indicating aspects where extra effort at error reduction should be directed. Finally, we conducted simulations to explore the potential impact of genotyping errors on paternity exclusion. Error rates as low as 0.01 per allele resulted in a rate of false paternity exclusion exceeding 20%. Errors also led to reduced estimates of male reproductive skew and increases in the numbers of pups that matched more than one candidate male. Because even modest error rates can be strongly influential, we recommend that error rates should be routinely published and that researchers make an attempt to calculate how robust their analyses are to errors.