Authors in alphabetic order.
Accuracy, efficiency and robustness of four algorithms allowing full sibship reconstruction from DNA marker data
Article first published online: 17 MAR 2004
Volume 13, Issue 6, pages 1589–1600, June 2004
How to Cite
Butler, K., Field, C., Herbinger, C. M. and Smith, B. R. (2004), Accuracy, efficiency and robustness of four algorithms allowing full sibship reconstruction from DNA marker data. Molecular Ecology, 13: 1589–1600. doi: 10.1111/j.1365-294X.2004.02152.x
- Issue published online: 17 MAR 2004
- Received 14 September 2003; revision received 3 December 2003; accepted 16 January 2004
Keywords: DNA marker; full sib; pedigree reconstruction
Three existing algorithms and one new algorithm for reconstructing full sib pedigrees from DNA marker data are compared in terms of accuracy, efficiency and robustness using real and simulated data sets. An algorithm based on the exclusion principle and another based on maximization of the Simpson index were very accurate at reconstructing data sets comprising a few large families but had problems with data sets with limited family structure, while a Markov chain Monte Carlo (MCMC) algorithm based on the maximization of a partition score had the opposite behaviour. An MCMC algorithm based on maximizing the full joint likelihood performed best in small data sets comprising several medium-sized families but did not work well under most other conditions. The likelihood surface appears to be rough, which makes it difficult for the MCMC algorithm to find the global maximum. This likelihood algorithm also had problems reconstructing large family groups, possibly because of limits in computational precision. The accuracy of each algorithm improved with an increasing amount of information in the data set, and was very high with eight loci with eight alleles each. All four algorithms were quite robust to deviation from an idealized uniform allelic distribution, to departures from idealized Mendelian inheritance in simulated data sets and to the presence of null alleles. In contrast, none of the algorithms was very robust to the probable presence of errors or mutations in the data. Depending on the type of mutation or error and the algorithm used, between 70% and 98% of the affected individuals were classified improperly on average.
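As a rough illustration of two ideas named in the abstract, the sketch below shows (a) a Simpson index computed over a candidate partition of individuals into families and (b) a single-locus exclusion test that asks whether a group of diploid genotypes could all come from one pair of parents. This is not the authors' implementation: the function names, the data layout (each individual as a list of per-locus allele pairs), the particular form of the Simpson index (the probability that two individuals drawn without replacement fall in the same group), and the brute-force parent search are all assumptions made for illustration. The sketch also assumes codominant markers with no genotyping error, mutation or null alleles.

```python
from itertools import combinations_with_replacement


def simpson_index(partition):
    """Probability that two distinct individuals share a group (assumed form)."""
    sizes = [len(group) for group in partition]
    n = sum(sizes)
    if n < 2:
        return 0.0
    return sum(s * (s - 1) for s in sizes) / (n * (n - 1))


def compatible_at_locus(genotypes):
    """True if some parent pair could produce every genotype at this locus.

    genotypes: iterable of (allele, allele) tuples for one locus.
    Only observed alleles are tried for the parents; an untransmitted
    parental allele can always be swapped for an observed one.
    """
    offspring = {tuple(sorted(g)) for g in genotypes}
    if not offspring:
        return True
    alleles = sorted({a for g in offspring for a in g})
    parents = list(combinations_with_replacement(alleles, 2))
    for p1, p2 in combinations_with_replacement(parents, 2):
        # All genotypes obtainable with one allele from each parent.
        possible = {tuple(sorted((x, y))) for x in p1 for y in p2}
        if offspring <= possible:
            return True
    return False


def full_sib_compatible(group):
    """True if the group passes the exclusion test at every locus.

    group: list of individuals, each a list of per-locus (allele, allele) tuples.
    """
    if not group:
        return True
    n_loci = len(group[0])
    return all(
        compatible_at_locus([individual[locus] for individual in group])
        for locus in range(n_loci)
    )
```

Under these assumptions, a partition with a few large groups scores a higher Simpson index than one made of many singletons, so a search that maximizes the index while requiring every group to pass `full_sib_compatible` would favour large families. That is consistent with the behaviour the abstract reports for data sets with limited family structure, though the actual algorithms may differ in detail.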