A simulation test of the effectiveness of several methods for error-checking non-invasive genetic data


All correspondence to: David A. Roon. Tel: (208) 885-7323; Fax: (208) 885-9080; E-mail: roon8505@uidaho.edu


Non-invasive genetic sampling (NGS) is becoming a popular tool for population estimation. However, multiple NGS studies have demonstrated that polymerase chain reaction (PCR) genotyping errors can bias demographic estimates. These errors can be detected by comprehensive data filters such as the multiple-tubes approach, but this approach is expensive and time-consuming as it requires three to eight PCR replicates per locus. Thus, researchers have attempted to correct PCR errors in NGS datasets using non-comprehensive error-checking methods, but these approaches have not been evaluated for reliability. We simulated NGS studies with and without PCR error, ‘filtered’ the datasets using non-comprehensive approaches derived from published studies, and calculated mark–recapture estimates using CAPTURE. In the absence of data filtering, simulated error resulted in serious inflations in CAPTURE estimates; some estimates exceeded N by ≥200%. When data filters were used, CAPTURE estimate reliability varied with per-locus error (Eμ). At Eμ=0.01, CAPTURE estimates from filtered data displayed <5% deviance from error-free estimates. When Eμ was 0.05 or 0.09, some CAPTURE estimates from filtered data displayed biases in excess of 10%. Biases were positive at high sampling intensities; negative biases were observed at low sampling intensities. We caution researchers against using non-comprehensive data filters in NGS studies, unless they can achieve baseline per-locus error rates below 0.05 and, ideally, near 0.01. However, we suggest that data filters can be combined with careful technique and thoughtful NGS study design to yield accurate demographic information.
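
To illustrate why unfiltered per-locus error can inflate abundance estimates, the following is a minimal sketch and not the simulation used in this study. It assumes that errors occur independently at each locus, that every error produces a genotype never seen before, and that all individuals are equally likely to be sampled. The number of loci, population size, and sample count are hypothetical values chosen for illustration, and the sketch only counts unique genotypes; it does not reproduce CAPTURE estimates or any data filter.

```python
import random


def genotype_error_prob(per_locus_error, n_loci):
    """Probability that a multilocus genotype has at least one error,
    assuming independent errors across loci (an assumption of this sketch)."""
    return 1.0 - (1.0 - per_locus_error) ** n_loci


def simulate_unique_genotypes(n_individuals, n_samples, n_loci,
                              per_locus_error, seed=0):
    """Return (individuals actually sampled, unique genotypes observed).

    Simplification: each individual has one distinct true genotype, and
    every genotyping error yields a novel allele code, so an erroneous
    sample always looks like a new, false individual.
    """
    rng = random.Random(seed)
    sampled_individuals = set()
    observed_genotypes = set()
    next_error_code = 0  # counter that guarantees each error is unique

    for _ in range(n_samples):
        individual = rng.randrange(n_individuals)
        sampled_individuals.add(individual)
        genotype = [individual] * n_loci  # true genotype, one code per locus
        for locus in range(n_loci):
            if rng.random() < per_locus_error:
                next_error_code += 1
                genotype[locus] = -next_error_code  # miscalled allele
        observed_genotypes.add(tuple(genotype))

    return len(sampled_individuals), len(observed_genotypes)


if __name__ == "__main__":
    # Hypothetical settings: 50 animals, 200 samples, 6 loci.
    for error in (0.0, 0.01, 0.05, 0.09):
        true_n, observed_n = simulate_unique_genotypes(50, 200, 6, error)
        print(f"E_mu={error:.2f}  "
              f"P(genotype error)={genotype_error_prob(error, 6):.3f}  "
              f"sampled={true_n}  unique genotypes={observed_n}")
```

Under these assumptions, the count of unique genotypes equals the number of sampled individuals when the error rate is zero and can only grow as the per-locus error rate rises, which is the mechanism by which false genotypes push mark–recapture estimates upward.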