• Open Access

Genetic diversity contribution to errors in short oligonucleotide microarray analysis

Authors

  • Matias Kirst,

    Corresponding author
    1. Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853-2703, USA
    • Present address: School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL 32611, USA

  • Rico Caldo,

    1. Department of Plant Pathology, Iowa State University, Ames, IA 50011, USA
  • Paula Casati,

    1. Department of Biological Sciences, Stanford University, Stanford, CA 94305, USA
    • Present address: Centro de Estudios Fotosinteticos y Bioquimicos (CEFOBI), Suipacha 531, 2000 Rosario, Argentina

  • Gene Tanimoto,

    1. Affymetrix, Inc., Santa Clara, CA 95051, USA
  • Virginia Walbot,

    1. Department of Biological Sciences, Stanford University, Stanford, CA 94305, USA
  • Roger P. Wise,

    1. Department of Plant Pathology, Iowa State University, Ames, IA 50011, USA
    2. USDA-ARS, Corn Insects and Crop Genetics Research, Iowa State University, Ames, IA 50011, USA
  • Edward S. Buckler

    1. Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853-2703, USA
    2. USDA-ARS, Plant, Soil, and Nutrition Research Unit, Cornell University, Ithaca, NY 14853, USA

* Correspondence (fax +1 352 846 1277; e-mail: mkirst@ufl.edu)

Summary

DNA arrays based on short oligonucleotide (≤ 25-mer) probes are being developed for many species and are being applied to quantify variation in transcript abundance in species with high genetic diversity. To define the parameters necessary to design short oligo arrays for maize (Zea mays L.), a species with particularly high frequencies of single nucleotide polymorphisms (SNPs) and insertion-deletion (indel) polymorphisms, we analysed gene expression estimates generated for four maize inbred lines using a custom Affymetrix DNA array, and identified biases associated with high levels of polymorphism between lines. Statistically significant interactions between probes and maize inbreds were detected, affecting five or more probes (out of 30 probes per transcript) in the majority of cases. SNPs and indels, identified by re-sequencing, are the primary source of probe-by-line interactions; they affect probeset-level estimates and reduce the power to detect transcript-level variation between maize inbreds. This analysis identified 36 196 probes in 5118 probesets containing markers that may be used for genotyping in natural and segregating populations for gene association analysis and genetic mapping.
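
As a rough illustration of what a probe-by-line interaction looks like within a single probeset, the sketch below fits an additive probe + line model to log2 intensities and inspects the residuals. It is not the statistical model used in this study. The function names, the threshold of 1.0 and the simulated intensities are all hypothetical, chosen only to show how a polymorphism that weakens hybridisation of one probe in one inbred line stands out from the additive pattern.

```python
from statistics import mean


def interaction_residuals(y):
    """Residuals of y[i][j] (probe i, line j) after removing additive
    probe and line effects: y_ij - probe_mean_i - line_mean_j + grand_mean."""
    n_probes, n_lines = len(y), len(y[0])
    grand = mean(v for row in y for v in row)
    probe_means = [mean(row) for row in y]
    line_means = [mean(y[i][j] for i in range(n_probes)) for j in range(n_lines)]
    return [
        [y[i][j] - probe_means[i] - line_means[j] + grand for j in range(n_lines)]
        for i in range(n_probes)
    ]


def flag_probes(y, threshold=1.0):
    """Indices of probes whose largest absolute residual exceeds the
    (hypothetical) threshold, i.e. candidates for a probe-by-line interaction."""
    residuals = interaction_residuals(y)
    return [i for i, row in enumerate(residuals)
            if max(abs(r) for r in row) > threshold]


# Simulated probeset: 5 probes x 4 inbred lines, purely additive signal.
probe_effects = [8.0, 9.0, 10.0, 11.0, 12.0]
line_effects = [0.0, 0.5, -0.5, 0.0]
example = [[p + l for l in line_effects] for p in probe_effects]

# Assumption: a SNP under probe 2 in line 3 lowers its log2 signal by 3.
example[2][3] -= 3.0

flagged = flag_probes(example)
print(flagged)
```

With these simulated values only the perturbed probe is flagged. A real analysis would need replicate hybridisations and a formal test of the interaction term rather than a fixed cutoff.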
