Analysis of multilocus fingerprinting data sets containing missing data

Authors

  • PHILIPP M. SCHLÜTER,

    1. Department of Systematic and Evolutionary Botany, Institute of Botany, University of Vienna, Rennweg 14, A-1030 Vienna, Austria,
    2. Department of Plant Sciences, University of Oxford, South Parks Road, Oxford OX1 3RB, UK
  • STEPHEN A. HARRIS

    1. Department of Plant Sciences, University of Oxford, South Parks Road, Oxford OX1 3RB, UK

Correspondence: Philipp M. Schlüter, Fax: +43-1-4277-9541; E-mail: philipp.maria.schlueter@univie.ac.at

Abstract

Missing data are commonly encountered when using multilocus, fragment-based (dominant) fingerprinting methods, such as random amplified polymorphic DNA (RAPD) or amplified fragment length polymorphism (AFLP). Data sets containing missing data have been analysed by eliminating those bands or samples with missing data, by assigning values to missing data, or by ignoring the problem. Here, we present a method that uses random assignments of band presence–absence to the missing data, implemented by the computer program famd (available from http://homepage.univie.ac.at/philipp.maria.schlueter/famd.html), for analyses based on pairwise similarity and Shannon's index. When missing values are clustered in a data set, sample or band elimination is likely to be the most appropriate action. However, when missing values are scattered across the data set, minimum, maximum and average similarity coefficients are a simple means of visualizing the effects of missing data on tree structure. Our approach indicates the range of values that a data set containing missing data points might generate, and forces the investigator to consider the effects of missing values on data interpretation.
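
The random-assignment idea described in the abstract can be illustrated with a minimal Python sketch. It is not the famd implementation, whose details are not given here. The choice of the Jaccard coefficient, the `MISSING` marker, the function names, the 0.5 probability of band presence, and the number of replicates are all assumptions made for illustration.

```python
import random

MISSING = None  # hypothetical marker for an unscored band (assumption)

def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles: shared presences / any presence."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0

def fill_missing(profile, rng, p_present=0.5):
    """Replace each missing entry with a random presence (1) or absence (0)."""
    return [(1 if rng.random() < p_present else 0) if v is MISSING else v
            for v in profile]

def similarity_range(a, b, replicates=1000, seed=0):
    """Return (min, mean, max) Jaccard similarity over random fills of missing data."""
    rng = random.Random(seed)
    values = [jaccard(fill_missing(a, rng), fill_missing(b, rng))
              for _ in range(replicates)]
    return min(values), sum(values) / len(values), max(values)

# Two illustrative samples scored at six bands, each with one missing value.
sample_a = [1, 0, 1, 1, MISSING, 0]
sample_b = [1, 1, MISSING, 1, 0, 0]
low, mean, high = similarity_range(sample_a, sample_b)
print(f"min={low:.2f} mean={mean:.2f} max={high:.2f}")
```

For these two profiles, the four possible fills give similarities of 0.4, 0.5, 0.6 and 0.75, so the spread between minimum and maximum shows how strongly two missing data points can affect a single pairwise comparison.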
