The effect of RAD allele dropout on the estimation of genetic variation within and between populations

Authors

  • Mathieu Gautier,

    Corresponding author
    1. Inra, UMR CBGP (INRA ‒ IRD ‒ Cirad ‒ Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France
  • Karim Gharbi,

    1. The GenePool, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JT, UK
  • Timothee Cezard,

    1. The GenePool, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JT, UK
  • Julien Foucaud,

    1. Inra, UMR CBGP (INRA ‒ IRD ‒ Cirad ‒ Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France
  • Carole Kerdelhué,

    1. Inra, UMR CBGP (INRA ‒ IRD ‒ Cirad ‒ Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France
  • Pierre Pudlo,

    1. Inra, UMR CBGP (INRA ‒ IRD ‒ Cirad ‒ Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France
    2. I3M, UMR CNRS 5149, Université Montpellier 2, F-34095 Montpellier, France
  • Jean-Marie Cornuet,

    1. Inra, UMR CBGP (INRA ‒ IRD ‒ Cirad ‒ Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France
  • Arnaud Estoup

    1. Inra, UMR CBGP (INRA ‒ IRD ‒ Cirad ‒ Montpellier SupAgro), Campus international de Baillarguet, CS 30016, F-34988, Montferrier-sur-Lez, France

Correspondence: Mathieu Gautier, Fax: +33 (0)4 99 62 33 45; E-mail: mathieu.gautier@supagro.inra.fr

Abstract

Inexpensive short-read sequencing technologies applied to reduced representation genomes are revolutionizing genetic research, especially population genetics analysis, by allowing the genotyping of massive numbers of single-nucleotide polymorphisms (SNPs) for large numbers of individuals and populations. Restriction site–associated DNA (RAD) sequencing is a recent technique based on the characterization of genomic regions flanking restriction sites. One of its potential drawbacks is the presence of polymorphism within the restriction site, which makes it impossible to observe the associated SNP allele (i.e. allele dropout, ADO). To investigate the effect of ADO on genetic variation estimated from RAD markers, we first mathematically derived measures of the effect of ADO on allele frequencies as a function of different parameters within a single population. We then used RAD data sets simulated using a coalescence model to investigate the magnitude of biases induced by ADO on the estimation of expected heterozygosity and FST under a simple demographic model of divergence between two populations. We found that ADO tends to cause genetic variation to be overestimated both within and between populations. Assuming a mutation rate per nucleotide between 10⁻⁹ and 10⁻⁸, this bias remained low for most studied combinations of divergence time and effective population size, except for large effective population sizes. Averaging FST values over multiple SNPs, for example, by sliding window analysis, did not correct ADO biases. We briefly discuss possible solutions to filter the most problematic cases of ADO using read coverage to detect markers with a large excess of null alleles.
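
As a rough illustration of why ADO can distort allele frequencies, the following Python sketch computes the frequency of a SNP allele among haplotypes whose restriction site is still intact, and the expected heterozygosity that would be estimated from it. It is a toy calculation, not the derivation or the coalescent simulations used by the authors. The function names and the two dropout fractions (null_on_A, null_on_a) are hypothetical parameters introduced only for this example.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def observed_frequency_under_ado(p_true, null_on_A, null_on_a):
    """Frequency of allele A among haplotypes with an intact restriction site.

    p_true    -- true frequency of SNP allele A in the population
    null_on_A -- fraction of A haplotypes whose restriction site is mutated
                 (hypothetical parameter; such haplotypes are never sequenced)
    null_on_a -- same fraction for haplotypes carrying the other allele
    """
    seen_A = p_true * (1.0 - null_on_A)
    seen_a = (1.0 - p_true) * (1.0 - null_on_a)
    total = seen_A + seen_a
    if total == 0.0:
        raise ValueError("every haplotype drops out; nothing can be observed")
    return seen_A / total


# Assumed toy values: allele A is common, and half of the A haplotypes
# carry a mutated restriction site and therefore drop out.
p_true = 0.8
p_obs = observed_frequency_under_ado(p_true, null_on_A=0.5, null_on_a=0.0)

he_true = expected_heterozygosity(p_true)
he_obs = expected_heterozygosity(p_obs)
print(f"true p = {p_true:.3f}, observed p = {p_obs:.3f}")
print(f"true He = {he_true:.3f}, He estimated under ADO = {he_obs:.3f}")
```

In this particular configuration the dropout is attached to the common allele, so the observed frequency moves toward 0.5 and the estimated heterozygosity rises. Attaching the dropout to the rare allele would have the opposite effect in this toy model, which is why the overall direction of the bias has to be established by the kind of derivation and simulation described in the abstract.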
