Statistical analysis of amplified fragment length polymorphism data: a toolbox for molecular ecologists and evolutionists


  • A. Bonin,

    1. Diversity Arrays Technology P/L, PO Box 7141, Yarralumla, ACT 2600, Australia,
    2. Laboratoire d’Ecologie Alpine, CNRS-UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble cedex 09, France,
  • D. Ehrich,

    1. National Centre for Biosystematics, Natural History Museum, University of Oslo, PO Box 1172, Blindern, NO-0318 Oslo, Norway
  • S. Manel

    1. Laboratoire d’Ecologie Alpine, CNRS-UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble cedex 09, France,

  • Box 1 Analysing dominant genetic data such as AFLPs: two alternative philosophies

    Two different approaches exist to analyse individual AFLP profiles (Kosman & Leonard 2005). One can choose to focus on the pattern of band presence or absence and to compare it between samples. This is termed here the band-based approach, which is usually conducted at the individual level. Alternatively, one may decide to adopt the allele frequency-based approach, which involves estimating allele frequencies at each AFLP locus. This strategy is thus population orientated; moreover, as AFLPs are dominant markers, allele frequencies are accessible only with preliminary assumptions or additional data about the inbreeding coefficient in the examined population(s).

    For each approach, this box presents the main metrics or estimators used as starting points and discusses their respective statistical advantages and drawbacks.

    Metrics for the band-based approach

    The band-based approach usually resorts to several metrics called ‘coefficients of similarity’, which can be viewed as distance measures. Estimates of band frequencies, applied in some assignment tests (Duchesne & Bernatchez 2002), also belong to the band-based approach.

    The properties of the main coefficients of similarity can be illustrated with an example where two individuals i and j are genotyped at n AFLP loci. The following table summarizes the pattern of matches and mismatches among their respective band presences and absences:

                                        Individual i
                                        Band presence (1)    Band absence (0)
    Individual j    Band presence (1)   a                    b
                    Band absence (0)    c                    d

    where n = a + b + c + d

    Jaccard coefficient (Jaccard 1908)

    $J = a / (a + b + c)$

    The Jaccard coefficient only takes into account the bands present in at least one of the two individuals, and is therefore unaffected by homoplasic absent bands (when the absence of the same band is due to different mutations).

    Dice coefficient (Dice 1945)

    The Dice coefficient is equivalent to the Nei and Li coefficient (Nei & Li 1979) and the Sørensen coefficient (Sørensen 1948).

    $D = 2a / (2a + b + c)$

    The Dice coefficient is comparable to the Jaccard coefficient but gives more weight to the bands present in both individuals. It thus lays the emphasis on the similarity between individuals, rather than on their dissimilarity.

    Simple-matching coefficient (Sokal & Michener 1958)

    $SM = (a + d) / n$

    The simple-matching coefficient maximizes the amount of information drawn from an AFLP profile by considering all scored loci. Double-band absence and double-band presence are given the same biological importance, which may not be adequate in case of frequent band absence homoplasy. This coefficient has interesting Euclidean metric properties that allow its use in an analysis of molecular variance (amova; Excoffier et al. 1992).
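    The three coefficients can be illustrated with a short Python sketch. It assumes that each AFLP profile is a list of 0/1 scores over the same n loci and uses the standard definitions J = a / (a + b + c), D = 2a / (2a + b + c) and SM = (a + d) / n; the function names are hypothetical and chosen for illustration only.

```python
def band_counts(profile_i, profile_j):
    """Count matches and mismatches (a, b, c, d) between two 0/1 band profiles."""
    if len(profile_i) != len(profile_j):
        raise ValueError("both profiles must be scored at the same loci")
    a = b = c = d = 0
    for x_i, x_j in zip(profile_i, profile_j):
        if x_i == 1 and x_j == 1:
            a += 1  # band present in both individuals
        elif x_i == 0 and x_j == 1:
            b += 1  # band present in j, absent in i
        elif x_i == 1 and x_j == 0:
            c += 1  # band present in i, absent in j
        else:
            d += 1  # band absent in both individuals
    return a, b, c, d


def jaccard(a, b, c, d):
    """Ignores double absences (d)."""
    return a / (a + b + c)


def dice(a, b, c, d):
    """Like Jaccard, but shared bands count twice."""
    return 2 * a / (2 * a + b + c)


def simple_matching(a, b, c, d):
    """Treats double presences and double absences alike."""
    return (a + d) / (a + b + c + d)


individual_i = [1, 1, 0, 0, 1, 0, 1, 0]
individual_j = [1, 0, 0, 1, 1, 0, 0, 0]
counts = band_counts(individual_i, individual_j)
print(counts, jaccard(*counts), dice(*counts), simple_matching(*counts))
```

    In this toy example the two profiles share two bands and three double absences, so the simple-matching coefficient is higher than the Jaccard coefficient. Profiles with no band at all would make the Jaccard and Dice denominators zero and are not handled here.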

    Metrics for the allele frequency-based approach

    Allelic frequencies can be extracted from dominant data following at least five procedures:

    The square-root procedure

    This procedure simply uses the inbreeding coefficient and the square root of the frequency of null homozygotes (i.e. of band absences) to calculate the frequency of the null allele (Stewart & Excoffier 1996). Because the inbreeding coefficient is rarely known, this method often implies assuming Hardy–Weinberg equilibrium. Moreover, estimates tend to be downwardly biased for null alleles with low frequencies.
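    A minimal sketch of the square-root idea follows. It assumes the usual inbreeding model, in which the expected frequency of null homozygotes is x = (1 - F)q² + Fq for a null allele frequency q and an inbreeding coefficient F between 0 and 1; with F = 0 this reduces to q = √x. The function name and its arguments are hypothetical, and the sketch is not claimed to reproduce the exact estimator of Stewart & Excoffier (1996).

```python
import math


def null_allele_frequency(n_absent, n_total, f_is=0.0):
    """Estimate the null allele frequency q from the count of band absences.

    Solves x = (1 - F) * q**2 + F * q for q, where x is the observed
    frequency of band absences (assumed to be null homozygotes).
    """
    if n_total <= 0 or not 0 <= n_absent <= n_total:
        raise ValueError("counts must satisfy 0 <= n_absent <= n_total, n_total > 0")
    if not 0.0 <= f_is <= 1.0:
        raise ValueError("this sketch only covers 0 <= F <= 1")
    x = n_absent / n_total
    if f_is == 1.0:
        # Complete inbreeding: no heterozygotes, so q equals x directly.
        return x
    discriminant = f_is ** 2 + 4.0 * (1.0 - f_is) * x
    return (-f_is + math.sqrt(discriminant)) / (2.0 * (1.0 - f_is))


# 25 band absences among 100 individuals, Hardy-Weinberg equilibrium (F = 0)
print(null_allele_frequency(25, 100))
# Same data, assuming F = 0.5
print(null_allele_frequency(25, 100, f_is=0.5))
```

    For the same observed frequency of band absences, a higher assumed F gives a lower estimate of q, which is why the choice of F matters for everything computed downstream.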

    The Lynch & Milligan procedure

    Lynch & Milligan (1994) refined the square-root method by proposing a more accurate estimate of null allele frequencies. However, their procedure requires restricting the analysis to AFLP loci where the frequency of band absence is higher than 3/N, N being the number of samples. It thus induces a bias in the choice of loci for analysis together with a loss of information, and this can ultimately bias estimates of genetic diversity and differentiation (Isabel et al. 1999).
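    The locus restriction can be sketched as a simple filter. The sketch below only illustrates the 3/N criterion stated above, not the Lynch & Milligan estimator itself; it assumes that profiles are lists of 0/1 scores, one per sampled individual, and the function name is hypothetical.

```python
def loci_passing_absence_filter(profiles):
    """Return the indices of loci whose band-absence frequency exceeds 3/N."""
    n_samples = len(profiles)
    if n_samples == 0:
        raise ValueError("at least one profile is required")
    n_loci = len(profiles[0])
    kept = []
    for locus in range(n_loci):
        n_absent = sum(1 for profile in profiles if profile[locus] == 0)
        # n_absent / N > 3 / N is the same as n_absent > 3
        if n_absent > 3:
            kept.append(locus)
    return kept


profiles = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
print(loci_passing_absence_filter(profiles))
```

    Here only the first locus is retained: the second has exactly three band absences and the third has none, so both are excluded, which shows how the criterion removes loci with rare null alleles.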

    The Bayesian procedure

    Zhivotovsky (1999) introduced a Bayesian approach that gives satisfactory estimates of null allele frequencies, even in case of moderate departure from Hardy–Weinberg equilibrium. Because of these advantages, this procedure is now routinely employed.

    The moment-based procedure

    Hill & Weir (2004) developed a robust moment-based method using the observed mean and variance of the frequency of null homozygotes at any given locus. The underlying model assumes random mating, Hardy–Weinberg equilibrium, linkage equilibrium, no mutation from common ancestor and equally distant populations.

    The Holsinger procedure

    Holsinger et al. (2002) used a Bayesian framework to estimate allelic frequencies, FIS and FST from dominant data. This procedure only assumes that FIS and FST are similar across loci and that the pattern of allelic frequency variation follows a beta distribution among populations. However, it seems to provide poor estimates of allelic frequencies and FIS.

  • Box 2 Metrics commonly used to measure genetic diversity with AFLPs

    Genetic diversity can be estimated at different hierarchical levels (within populations, among populations within regions, among regions, etc.) and total diversity can be partitioned in components at these different levels. Such a hierarchical approach has been used for several of the metrics described below. The partitioning of genetic diversity among populations is specifically addressed in a subsequent section (Population structure) and in Box 3.

    Band-based metrics

    The various similarity coefficients (see Box 1)

    These coefficients can be calculated for each pair of individuals in the population and then averaged to give a measure of the genetic diversity in the population.

    The Shannon index of phenotypic diversity S, derived from the Shannon-Weaver index (Shannon 1948)


    where pi is the frequency of the band presence at the ith marker within the population. This index gives more weight to the presence than to the absence of bands. This has no real biological support, although it might account for the occurrence of homoplasic absences of bands.

    The nucleotide diversity π

    In a population, π can be defined as the average number of nucleotide differences per site between two randomly chosen DNA sequences. Borowsky (2001) proposed a simple equation to estimate π from band data, assuming the Hardy–Weinberg equilibrium, an absence of band homoplasy, and a low overall π:


    where Φe is the proportion (over all polymorphic loci) of mismatched bands between two individuals drawn at random in the population, and m is the number of bases screened per band. For AFLP data, m is the total number of bases in both restriction sites and in the selective extensions.

    Clark & Lanigan (1993) also introduced a method to assess π, but it requires the prior calculation of allelic frequencies for diploid species. The estimate of Innan et al. (1999) is complex enough that it has to be computed numerically, and its validity rests on the assumption that the GC content of the screened genome is around 50%. Computer simulations, however, have shown that this method still gives reliable estimates when the GC content departs from this value, as long as π remains small.

    Allele frequency-based metrics

    Allele frequencies have to be estimated first to calculate these metrics (see Box 1).

    Percentage of polymorphic loci P at the x% confidence level

    This corresponds to the percentage of loci where p, the frequency of the allele presence, obeys the following criterion:


    The percentage of polymorphic loci P is sometimes calculated based on band frequencies.

    Nei's average gene diversity per locus H (Nei 1973)

    This parameter is equivalent to the average expected heterozygosity He in the population:

    $H_e = \frac{1}{n} \sum_{i=1}^{n} 2 p_i q_i$

    where pi and qi stand for the frequencies of the presence and absence alleles at locus i, respectively, and n is the number of loci examined.

    For a small sample size, an unbiased estimate of He is given by the formula (Nei 1978; Nei 1987):

    $\hat{H}_e = \frac{2N}{2N - 1} \cdot \frac{1}{n} \sum_{i=1}^{n} 2 p_i q_i$

    where N is the number of diploid samples. Kosman (2003) showed that Nei's average gene diversity calculated from band frequencies, instead of allele frequencies, was equivalent to the average number of pairwise differences within populations (see simple-matching coefficient, Box 1).
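    As an illustration, Nei's gene diversity for biallelic loci can be computed as below. The sketch assumes that the frequency p of the presence allele has already been estimated at each locus (see Box 1), takes He as the mean of 2pq over loci, and applies the small-sample factor 2N / (2N - 1) when a sample size is supplied; the function name is hypothetical.

```python
def nei_gene_diversity(presence_freqs, n_individuals=None):
    """Average expected heterozygosity over biallelic loci.

    presence_freqs: estimated frequency of the presence allele at each locus.
    n_individuals: number of diploid samples; if given, the small-sample
    correction 2N / (2N - 1) is applied.
    """
    if not presence_freqs:
        raise ValueError("at least one locus is required")
    if any(p < 0.0 or p > 1.0 for p in presence_freqs):
        raise ValueError("allele frequencies must lie between 0 and 1")
    # 2pq equals 1 - p**2 - q**2 when there are only two alleles
    h = sum(2.0 * p * (1.0 - p) for p in presence_freqs) / len(presence_freqs)
    if n_individuals is not None:
        h *= 2.0 * n_individuals / (2.0 * n_individuals - 1.0)
    return h


freqs = [0.5, 0.2, 1.0]  # the third locus is monomorphic
print(nei_gene_diversity(freqs))
print(nei_gene_diversity(freqs, n_individuals=10))
```

    Monomorphic loci contribute zero to the sum but still count in n, so including or excluding them changes the average; this should be decided and reported explicitly.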

  • Box 3 Comparison of FST estimators using real and simulated AFLP data

    Real data

    We used a data set obtained for 13 vascular plant species sampled at 9–16 localities in Fennoscandia (nine individuals per locality on average) and scored for 65–181 polymorphic AFLP markers to compare the estimates of genetic differentiation obtained by different approaches (Skrede et al. 2006; P.B. Eidesen, D. Ehrich, I.G. Alsos, unpublished data). Region-wide estimates of differentiation were calculated by using the following methods: (1) FST according to Lynch & Milligan (1994) based on allele frequencies estimated (a) as band frequencies, (b) using the square-root method (assuming Hardy–Weinberg equilibrium), (c) using the Bayesian method with uniform priors, and (d) using the Bayesian method with non-uniform priors (see Box 1). Differentiation was calculated using aflp-surv. In addition, (e) GST was calculated in popgene from allele frequencies estimated according to the square-root method assuming Hardy–Weinberg equilibrium. Furthermore, differentiation was estimated (2) by the Bayesian approach of Holsinger et al. (2002) using the default settings in Hickory, and either (a) the full model or (b) the f-free model; and (3) as ΦST in an amova based on pairwise differences between AFLP profiles using arlequin.

    Figure 1a shows that there are large discrepancies between some of the differentiation estimates for real data. GST (method 1e) is often the highest, followed by the two band-based measures (methods 1a and 3). Estimates based on Bayesian estimation of allele frequencies are the lowest, especially those using uniform priors. However, despite these large discrepancies, the relative differences between the data sets are largely comparable. The low Bayesian FST estimates obtained for Betula nana are a noteworthy exception.

    Figures 1b and 1c show the sensitivity of two estimators based on allele frequencies to assumptions about FIS: FST based on allele frequencies estimated with the Bayesian method with non-uniform priors (1d) and with the square-root method (1b), both calculated in aflp-surv. Kremer et al. (2005) showed that multilocus estimates of gene diversity were surprisingly robust to changes in assumed FIS. The same was true in general for FST (square-root method). For some data sets, however, FST (square-root) increased slightly with FIS, and FST (Bayesian non-uniform) almost always increased considerably. However, as for the comparison of methods, relative differences between the species were in general conserved.

    Figure 1. Comparison of differentiation estimators (real data), and sensitivity of two estimators to assumptions about FIS.

    Simulated data

    Genetic drift was simulated at 300 biallelic loci for 10 populations of 100 diploid individuals each. Initial allele frequencies at each locus were chosen from a beta distribution with shape parameters 0.3 and 0.8. Preliminary simulations showed that these values produced U-shaped marker frequency distributions resembling those often observed in empirical data sets. Initial genotypes in each population were chosen at random from the same ancestral frequencies. There was no mutation, no migration, no selfing and mating was random. Each generation, FST was calculated from the diploid genotypes of all individuals in all populations according to Weir (1996) and considered as the real value. Simulations were stopped when FST reached 0.05 or 0.25 (five replicates each). Samples of 10 and 50 individuals were taken from each population, genotypes were converted to dominant data and these were analysed in the same way as the 13 data sets above. Simulations were carried out in r (The R Development Core Team 2004) using a script available from D.E. on request.
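    The general logic of such a simulation can be sketched in Python. This is not the authors' R script: it is a simplified illustration that assumes plain binomial (Wright–Fisher) resampling of 2N gene copies per generation, starts every population exactly at the ancestral frequencies, runs for a fixed number of generations instead of stopping at a target FST, and draws sampled genotypes in Hardy–Weinberg proportions from the final frequencies. The function name and all arguments are hypothetical.

```python
import random


def simulate_dominant_samples(n_loci=300, n_pops=10, pop_size=100,
                              generations=20, sample_size=10, seed=1):
    """Return band profiles indexed as data[population][individual][locus]."""
    rng = random.Random(seed)
    # Ancestral frequency of the band-presence allele at each locus,
    # drawn from a beta distribution with shape parameters 0.3 and 0.8.
    ancestral = [rng.betavariate(0.3, 0.8) for _ in range(n_loci)]
    # Every population starts from the same ancestral frequencies.
    freqs = [list(ancestral) for _ in range(n_pops)]
    n_copies = 2 * pop_size
    for _generation in range(generations):
        for pop in freqs:
            for locus, p in enumerate(pop):
                # Drift: resample 2N gene copies from the current frequency.
                copies = sum(1 for _copy in range(n_copies) if rng.random() < p)
                pop[locus] = copies / n_copies
    data = []
    for pop in freqs:
        individuals = []
        for _individual in range(sample_size):
            profile = []
            for p in pop:
                # Draw two alleles; the band is scored as present (1) if at
                # least one of them is the presence allele (dominance).
                first = rng.random() < p
                second = rng.random() < p
                profile.append(1 if (first or second) else 0)
            individuals.append(profile)
        data.append(individuals)
    return data
```

    A small call such as simulate_dominant_samples(n_loci=20, n_pops=3, pop_size=50, generations=10, sample_size=5) runs almost instantly; the default settings resample several million gene copies and take noticeably longer in pure Python.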

    Figure 2 shows that the FST estimates based on allele frequencies calculated with the square-root method (method 1b; according to Lynch & Milligan 1994) and the Bayesian method with non-uniform priors (method 1d) are closest to the real value based on the diploid genotypes. All estimates based on allele frequencies improved for larger sample sizes. GST based on allele frequencies calculated with the square-root method (method 1e) was particularly sensitive to small sample sizes. The approaches based on band frequencies (methods 1a and 3) overestimated differentiation considerably, as did the Bayesian calculations performed by Hickory (methods 2a and 2b). Larger sample sizes did not improve these last estimates. Estimates were also calculated for 150 loci (not shown). Differences were in general very small (< 0.01). For the two estimates based on the square-root method (1b and 1e; FST = 0.05, n = 10), however, the bias increased by 0.01–0.02. These simulations used only one type of initial distribution of allele frequencies, chosen to resemble the 13 empirical data sets, and their results may thus not be representative of all possible situations.

    Figure 2. Comparison of differentiation estimators (simulated data).

Dr Stéphanie Manel, Fax: +33 4 76 51 42 79; E-mail:


Recently, the amplified fragment length polymorphism (AFLP) technique has gained a lot of popularity, and is now frequently applied to a wide variety of organisms. Technical specificities of the AFLP procedure have been well documented over the years, but information about the statistical analysis of AFLPs remains scarce and scattered. In this review, we describe the various methods available to handle AFLP data, focusing on four research topics at the population or individual level of analysis: (i) assessment of genetic diversity; (ii) identification of population structure; (iii) identification of hybrid individuals; and (iv) detection of markers associated with phenotypes. Two kinds of analysis methods can be distinguished, depending on whether they are based on the direct study of band presences or absences in AFLP profiles (‘band-based’ methods), or on allelic frequencies estimated at each locus from these profiles (‘allele frequency-based’ methods). We investigate the characteristics and limitations of these statistical tools; finally, we appeal for a wider adoption of methodologies borrowed from other research fields, such as those especially designed to deal with binary data.


The amplified fragment length polymorphism (AFLP) technique has aroused a lot of enthusiasm since its development in the mid-1990s (Vos et al. 1995). By bringing key answers to major biological issues in a wide variety of organisms, like fungi (Kis-Papo et al. 2003), plants (Savolainen et al. 2006), birds (Irwin et al. 2005), fish (Barluenga et al. 2006) and even humans (Prochazka et al. 2001), it has established itself as a valuable genetic marker system in population genetics, ecology and evolution.

Technical specificities of the AFLP procedure have been well documented over the years (see for example Mueller & Wolfenbarger 1999; Bensch & Akesson 2005; Mba & Tohme 2005; Meudt & Clarke 2007). Likewise, the performance of AFLPs compared to traditional codominant markers like microsatellites or allozymes has been investigated on several occasions (e.g. Mariette et al. 2001; Mariette et al. 2002; Gaudeul et al. 2004; Nybom 2004). In contrast, information about the statistical analysis of AFLPs is scarce and scattered. As a result, the whole body of population diversity and structure descriptors, originally developed for codominant and multi-allelic markers, has often been applied to AFLP data without any real assessment or discussion of their appropriateness (Hollingsworth & Ennos 2004). Even more surprisingly, AFLP studies largely ignore some alternative methods that could be particularly helpful. One example is logistic regression, which is very popular in ecology (e.g. Manel et al. 1999), but much less so in population genetics (see Joost & Bonin in press; Joost et al. in press).

Two features of AFLPs considerably constrain their statistical analysis (Meudt & Clarke 2007). First, polymorphic AFLP loci are generally scored for two alleles, the ‘band-presence’ allele and the ‘band-absence’ allele. Each locus is thus less informative than a typical multi-allelic microsatellite locus, although the large number of AFLP markers available across the genome and their largely random (but see Rogers et al. 2007) distribution balance this drawback (Mariette et al. 2002; Campbell et al. 2003; Kremer et al. 2005). Second, AFLP markers are generally scored as dominant markers. Thus, it is difficult to distinguish heterozygous individuals from individuals homozygous for the band-presence allele, unless exact genotypes can be inferred from pedigree studies (van Haeringen et al. 2002).

On the basis of these characteristics, two different approaches exist to extract statistical information from AFLP data (Kosman & Leonard 2005; see Box 1). The first one corresponds to the direct study of AFLP band presences or absences. We will refer to it as the ‘band-based’ approach. The second one, which we call the ‘allele frequency-based’ approach, consists of estimating allelic frequencies at each locus. Several procedures have been proposed to obtain allelic frequencies from dominant biallelic data in diploids, and most of them rely either on the Hardy–Weinberg hypothesis or on a known inbreeding coefficient (Box 1). Estimates of allelic frequencies are then used to survey genetic diversity or differentiation with classical population genetics methods.

In this review, we aim to address the statistical aspects of AFLP analysis, focusing on four of the important research topics previously identified by Bensch & Akesson (2005): (i) assessment of genetic diversity; (ii) identification of population structure; (iii) identification of hybrid individuals; and (iv) detection of markers associated with phenotypes. More precisely, we present the statistical methods most widely used for AFLP data, including both band-based and allele frequency-based methods, and investigate their characteristics and limitations. Most of the procedures examined here were developed in a population genetics framework, to deal with codominant or, more rarely, with dominant markers, but we also consider more general statistical strategies suitable for binary data like presence or absence of bands. In addition, we discuss some aspects of the experimental design, which are important for subsequent statistical analysis. Finally, we identify promising topics for future research and notably discuss the lack of an established mutation model for AFLP data. Our review is mainly centred on AFLPs, but will also be largely valid for other dominant biallelic markers: random amplified polymorphic DNA (RAPD, Williams et al. 1990), intersimple sequence repeat (ISSR, Zietkiewicz et al. 1994; Wolfe et al. 1998), and diversity arrays technology (DArT) markers (Jaccoud et al. 2001).

Experimental design of AFLP studies

The power of any AFLP analysis depends on the sampling strategy and experimental protocol chosen in the early stage of the study. Establishing the experimental design is thus a crucial step that deserves careful consideration and should take into account the specific features of AFLP markers (e.g. dominance and biallelism).

Sampling strategy: how many bands and individuals?

Various parameters (e.g. mating system, effective population size, existing level of population structure) influence the accuracy of population genetics estimates (Mohammadi & Prasanna 2003; Mendelson & Shaw 2005; Singh et al. 2006). Among these parameters, only two can really be chosen when establishing a sampling strategy: the numbers of individuals and loci sampled.

To achieve the same level of accuracy in estimates of population parameters, studies based on AFLPs require an extra sampling effort compared to those employing codominant markers, because of the low information content of dominant biallelic data. More precisely, genotyping 2–10 times more individuals per population is suggested for AFLPs than for microsatellites (Lynch & Milligan 1994; Mariette et al. 2002; Nybom 2004). Krauss (2000) found that most procedures for estimating diversity in AFLP data yielded accurate results when about 30 individuals were analysed per population. This recommendation is far from being followed in practice: for example, Nybom (2004) registered an average of only 14.5 samples per population in a compilation of 27 AFLP studies in plants.

The optimal number of AFLP markers to assess depends on the goal to achieve (Mendelson & Shaw 2005). When searching for loci under selection or associated with a phenotype, there is no theoretical upper limit beyond which extra sampling effort is worthless, since it is preferable to screen as many loci as possible (Luikart et al. 2003; Storz 2005). In practice, the true degree of genome coverage remains difficult to evaluate without information on both the genome size and the localization of the markers (by way of genome sequencing or linkage mapping). A literature survey reveals that genome scans looking for selection signatures among AFLPs are usually based on more than 300–400 markers (e.g. Campbell & Bernatchez 2004; Wilding et al. 2001; Bonin et al. 2006).

In contrast, for classical surveys of genetic diversity, population structure, genetic relatedness, or assignment tests, there is usually a range in the number of markers below which sampling variance is too high and estimates are thus not reliable. Conversely, sampling above this range does not necessarily increase the power but may add some noise in the data (Hollingsworth & Ennos 2004). Several research groups have tried to specify the acceptable number of markers to consider in particular situations. Cavers et al. (2005) explored the sampling limit for reasonable estimates of the fine-scale spatial genetic structure in natural tree populations with limited gene flow and seed dispersal. They simulated an artificial population of 1900 trees in a 1200 × 1200 m area (using diameter distribution and density data) for the neotropical tree species, Symphonia globulifera. An artificial genotype at 100 AFLP loci was assigned to each tree, so that (i) overall, the frequency of the band-presence allele was evenly distributed from 5% to 95% over all loci; (ii) there was no initial fine-scale genetic structure; and (iii) the genotypes were in Hardy–Weinberg proportions. Then, the evolution of this population was simulated for 1000 years several times given limited pollen and seed dispersal. The results indicated that 150 individuals have to be genotyped at 100 loci for a reliable estimation of the fine-scale genetic structure. However, these values may not be extendable to species with weaker genetic structure or to other allelic frequency distributions (Cavers et al. 2005).

Hollingsworth & Ennos (2004) investigated the efficiency of simulated dominant data to build resolved neighbour-joining trees and showed that 250 loci were required to correctly cluster individuals at low levels of population differentiation. Data sets with fewer markers (e.g. 50) were unable to resolve the tree topology even at much higher levels of population differentiation. A sufficient number of markers is thus essential to unravel the true genetic structure of populations using clustering methods. The general pattern emerging from these results is that the optimum number of loci differs considerably according to the studied species, its reproductive biology, the level of gene flow between populations, etc. However, assessing at least 200 AFLP markers seems to be an acceptable starting point when measuring genetic variation or differentiation (Mariette et al. 2002; Hollingsworth & Ennos 2004; Cavers et al. 2005; Singh et al. 2006), even if this might sometimes be difficult to achieve in practice.

Obtaining the data: technical pitfalls to be avoided

Although the AFLP technique is highly reproducible, with error rates typically falling in a 2–5% range (Hansen et al. 1999; Ajmone-Marsan et al. 2002; Bonin et al. 2004), genotyping errors should not be overlooked. They can arise from various causes such as sample contamination, biochemical artefacts, human error, low quality DNA, etc. (Bonin et al. 2004; Pompanon et al. 2005), but two types of errors prevail in AFLP genotyping: allele homoplasy and scoring errors.

Allele homoplasy occurs when nonhomologous fragments migrate at the same position in an electrophoretic profile, or when different mutations lead to the loss of the same fragment (Meudt & Clarke 2007; Simmons et al. 2007). Estimates of genetic diversity or differentiation are thus expected to be biased downwardly (Koopman & Gort 2004) and this ultimately limits the power of the analyses (Vekemans et al. 2002; Meudt & Clarke 2007). Investigating the reasons for the absence of a particular band is difficult. Thus, the empirical study of allele homoplasy has often been restricted to the sequencing of comigrating fragments at different taxonomic levels. It appears that homoplasy of comigrating fragments is limited in intraspecific comparisons (Rouppe van der Voort et al. 1997; Vekemans et al. 2002; Mendelson & Shaw 2005) but increases with the taxonomic distance (Mechanda et al. 2004). In addition, the smaller fragments (< 150 bp) and dense profiles are particularly subject to homoplasy (Vekemans et al. 2002; Mendelson & Shaw 2005). When choosing AFLP markers, we advise (i) confining the analyses to the intraspecific level and avoiding transfers of markers between species; (ii) favouring primer combinations that generate clearly readable and exploitable profiles (with a number of bands < 100 and whose bands are homogeneously scattered along the profile); (iii) giving preference to longer bands as markers; and (iv) if possible, assessing the extent of fragment homoplasy by means of sequencing, in silico analyses (Rombauts 2003), or other available protocols (Hansen et al. 1999; O’Hanlon & Peakall 2000).

Scoring errors can represent the vast majority of genotyping errors occurring in AFLP data sets (Bonin et al. 2004). This is mostly due to the difficulty and subjectivity of reading profiles correctly, especially when there are differences in band intensity between individuals or between runs. Therefore, the scoring step has to be taken seriously and entrusted to experienced and meticulous laboratory staff. Double reading of the profiles is also a helpful method to limit scoring errors (Bonin et al. 2004). In any case, running replicates, tracking all genotyping errors and estimating their rate should be a priority in any AFLP study (see Pompanon et al. 2005). At a particular locus, the genotyping error rate can be calculated as the ratio of the total number of mismatches (band presence vs. band absence) to the number of replicated individuals (Pompanon et al. 2005). The maximum acceptable error rate for an individual locus will vary according to the goal of the study (Bonin et al. 2004; Pompanon et al. 2005), but we recommend that it never exceed 0.1. The number of replicated individuals is also crucial for an accurate estimation of the error rate, and should represent a substantial fraction of the total number of samples (> 5–10%).
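The per-locus error rate described above can be computed with a few lines of Python. The sketch assumes that each replicated individual was scored twice, giving two 0/1 profiles over the same loci; the function name and the toy data are hypothetical.

```python
def locus_error_rates(first_scores, replicate_scores):
    """Per-locus error rate: mismatches divided by replicated individuals."""
    if not first_scores or len(first_scores) != len(replicate_scores):
        raise ValueError("need one replicate profile per replicated individual")
    n_loci = len(first_scores[0])
    mismatches = [0] * n_loci
    for first, second in zip(first_scores, replicate_scores):
        if len(first) != n_loci or len(second) != n_loci:
            raise ValueError("all profiles must cover the same loci")
        for locus in range(n_loci):
            if first[locus] != second[locus]:
                mismatches[locus] += 1  # presence scored against absence
    return [count / len(first_scores) for count in mismatches]


# Four replicated individuals scored at three loci
first_scores = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
replicate_scores = [[1, 0, 1], [1, 0, 0], [0, 0, 1], [1, 0, 0]]

rates = locus_error_rates(first_scores, replicate_scores)
# Loci above the 0.1 threshold recommended in the text
flagged = [locus for locus, rate in enumerate(rates) if rate > 0.1]
print(rates, flagged)
```

With only four replicated individuals, a single mismatch already gives a rate of 0.25, which illustrates why the number of replicates matters for a meaningful estimate.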

Another major technical concern is the nonindependence of markers. It occurs, for example, when a mutation, deletion or insertion displaces an AFLP band along a profile and the two positions are scored as two independent loci although they are linked (Simmons et al. 2007). This issue of nonindependence of AFLP markers is increasingly recognized in phylogenetic reconstruction (Koopman 2005; Simmons et al. 2007). In contrast, it is still largely ignored in population genetics and ecology although it has the potential to artificially inflate similarity and relatedness indices. In practice, nonindependence of markers can be ruled out by testing for linkage disequilibrium between markers (e.g. Gaudeul et al. 2004), but there is a clear need for more sophisticated methods to allow for nonindependence of markers in the analyses.

Statistical methods in AFLP analysis

Genetic diversity

Since its development, the AFLP technique has been primarily dedicated to assessments of intraspecific genetic diversity. This is especially true for plants (e.g. Gaudeul et al. 2004; Nybom 2004; Mba & Tohme 2005) and crop cultivars (e.g. Shan et al. 2005; Wu et al. 2006), bacteria and fungi (e.g. Kis-Papo et al. 2003; Kolliker et al. 2006), but also for invertebrates (Mendelson & Shaw 2005; Conord et al. 2006) and vertebrates, including fish (McMillan et al. 2006), birds (Wang et al. 2003) and mammals (Polyakov et al. 2004; Foulley et al. 2006; SanCristobal et al. 2006).

Box 2 lists several parameters which are routinely encountered in studies of genetic variation with AFLPs and all fall either under the band-based or the allele frequency-based approaches previously described. The band-based metrics can be directly estimated from the AFLP profiles and include various coefficients of similarity (the Jaccard, Dice, or simple-matching coefficients; Box 1), in addition to the Shannon index (Shannon 1948) and, less employed so far, the nucleotide diversity π (Clark & Lanigan 1993; Borowsky 2001). Some coefficients of relatedness developed for dominant markers could also qualify for the band-based category; however, they will not be mentioned further here because of their more restricted use (see Hardy 2003; Wang 2004; Ritland 2005). On the other hand, Nei's gene diversity (Nei 1973; Nei 1978) is based on allele frequencies and requires additional assumptions, but produces estimates which are directly comparable with estimates from codominant markers.

The properties and robustness of these different measures of diversity have seldom been properly assessed, and in many studies, the choice of a particular metric seems to owe more to chance than to reasoning. This has the potential to lead to serious inconsistencies and debatable results (Kosman & Leonard 2005). A few general rules can help to select the most appropriate diversity metrics in a specific case. First, even if coefficients of similarity are expected to be highly correlated (Duarte et al. 1999; Shan et al. 2005), this should not be taken for granted and should be carefully tested. Poor correlation may for example be a sign of frequent homoplasy of band absence, and in that particular case, more credit is given to results based on Jaccard or Dice coefficients (Duarte et al. 1999; Mohammadi & Prasanna 2003; Meudt & Clarke 2007). Nonetheless, as noted by Koopman & Gort (2004), these two similarity coefficients still ignore the issue of comigrating bands. Second, although the use of Nei's gene diversity is questionable because of its dependency on the Hardy–Weinberg hypothesis (Mendelson & Shaw 2005), multilocus estimates of this index have proved to be robust to violations of this assumption (Kremer et al. 2005). This underlines the importance of basing surveys of genetic diversity on a sufficiently large number of loci, that is, at least several hundred (Mariette et al. 2002; Kremer et al. 2005). More generally, Nei's gene diversity is considered reliable when (i) a substantial number of individuals are sampled in the population, allowing an accurate estimation of allele frequencies (Mba & Tohme 2005); and when (ii) outcrossing species are examined (Meudt & Clarke 2007).

Example. SanCristobal et al. (2006) examined the genetic diversity in 58 European pig breeds and one Chinese breed (n = 50 individuals per breed) using a set of 148 AFLP markers generated with four different primer combinations. Allele frequencies were estimated with the square-root procedure under the Hardy–Weinberg hypothesis. The results showed that AFLP markers could be partitioned into two distinct groups according to their levels of polymorphism within breeds: a quasi-monomorphic group (M) and a more polymorphic one (P). On average, the percentage of monomorphic loci per breed was 63%. The correlation coefficient between the average simple-matching coefficient and Nei's gene diversity was 0.69 (P < 0.001). Nei's gene diversity obtained with markers from the P group was comparable to estimates from 50 microsatellite loci genotyped in the same animals. In contrast, there was little correlation between diversity indices calculated from the M group and from microsatellites. Foulley et al. (2006) re-examined the same data set but estimated allelic frequencies following Hill & Weir (2004). This procedure allowed the quasi-monomorphic (M group) and the more polymorphic (P group) markers to be considered together in the calculations of genetic diversity. Results were considerably different from those obtained with the square-root method, which indicates that the method by Hill & Weir (2004) may be particularly helpful in cases of low levels of polymorphism.
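The square-root procedure mentioned in this example can be illustrated with a short sketch. It assumes Hardy–Weinberg equilibrium, so that the frequency q of the band-absence (null) allele is the square root of the proportion of individuals lacking the band, and it takes gene diversity at a locus as 2q(1 − q). No small-sample correction is applied, and the function names and toy data are hypothetical.

```python
from math import sqrt


def sqrt_null_allele_frequency(n_band_absent, n_individuals):
    """Null-allele frequency q under Hardy-Weinberg: q = sqrt(share of band absences)."""
    return sqrt(n_band_absent / n_individuals)


def nei_gene_diversity(band_matrix):
    """Mean of 2q(1 - q) over loci for one population.

    band_matrix: list of individuals, each a list of 0/1 band states.
    """
    n = len(band_matrix)
    n_loci = len(band_matrix[0])
    total = 0.0
    for locus in range(n_loci):
        absent = sum(1 for ind in band_matrix if ind[locus] == 0)
        q = sqrt_null_allele_frequency(absent, n)
        total += 2 * q * (1 - q)
    return total / n_loci


# Toy population: 4 individuals, 3 loci (one polymorphic, two monomorphic).
population = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [0, 1, 0]]
print(nei_gene_diversity(population))
```

Monomorphic loci contribute zero to the average, which is why a high proportion of quasi-monomorphic markers pulls the multilocus estimate down.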

Population structure

As underlined by Bensch & Akesson (2005), ‘The top of the agenda for many molecular ecologists is to study the genetic structure of populations’. Evaluating population structure is of considerable interest because it is a precursor to answering many other questions such as estimating migration, identifying conservation units, and specifying phylogeographical patterns (Manel et al. 2005). So far, AFLPs have been used in a range of applications to assess population structure from small to large scales (e.g. de Casas et al. 2006; Hardy et al. 2006; Rivera-Ocasio et al. 2006). The methods mentioned below and the related software are summarized in Table 1.

Table 1.  Methods and associated software for the statistical analysis of AFLP data. The software list is not exhaustive
| Methods | Approach | Example of software(s) | Underlying assumptions | Comments | Reference(s) |
| --- | --- | --- | --- | --- | --- |
| Genetic diversity | | | | | |
| Estimation of band-based metrics | Band-based | pco* | | Estimation of various similarity indices (Jaccard, Dice, simple-matching, etc.) | (Anderson 2003) |
| | Band-based | popgene | | Estimation of the Shannon Index | (Yeh & Boyle 1997) |
| Estimation of allele frequency-based metrics | Allele frequency-based | aflp-surv | Hardy–Weinberg equilibrium or known FIS | Estimation of Nei's gene diversity from allelic frequencies calculated with the square-root or the Bayesian procedure | (Vekemans et al. 2002) |
| | Allele frequency-based | popgene | Hardy–Weinberg equilibrium (or known FIS for popgene) | Estimation of Nei's gene diversity and Nei's genetic distance from allelic frequencies calculated with the square-root procedure | (Yeh & Boyle 1997; Peakall & Smouse 2006) |
| Population structure: no prerequisite about populations | | | | | |
| Nonspatial descriptive methods | | | | | |
| Neighbour-joining trees | Band- or allele frequency-based | treecon; phylip version 3.6** | Independent loci | Visual representation of a distance matrix | (Van de Peer & De Wachter 1994; Felsenstein 2004) |
| Multivariate analyses (e.g. principal coordinate analysis) | Band- or allele frequency-based | pco* | | Visual representation of a distance matrix | (Anderson 2003) |
| Nonspatial Bayesian clustering methods | | | | | |
| | Band-based | structure version 2.1†† | Hardy–Weinberg equilibrium; independent loci or known linkage groups | Requires successive trials to estimate the number of clusters. Analysis of band frequencies: data are coded with missing values for the second allele | (Pritchard et al. 2000) |
| | Allele frequency-based | structure version 2.2†† | Hardy–Weinberg equilibrium; independent loci or known linkage groups | Accounts for the genotypic ambiguity inherent in dominant markers | (Falush et al. in press) |
| | Band-based | baps‡‡ | Hardy–Weinberg equilibrium; independent loci | Analysis of band frequencies: data are coded with missing values for the second allele | (Corander et al. 2003) |
| Spatial Bayesian clustering methods | | | | | |
| | Band-based | geneland§§ | Hardy–Weinberg equilibrium; independent loci | Analysis of band frequencies: data are coded with missing values for the second allele | (Guillot et al. 2005) |
| | Band-based | tess¶¶ | Hardy–Weinberg equilibrium; independent loci | Analysis of band frequencies: data are coded with missing values for the second allele | (François et al. 2006) |
| | Band-based | baps4‡‡ | Hardy–Weinberg equilibrium; independent loci | Analysis of band frequencies: data are coded with missing values for the second allele | (Corander et al. in press) |
| Methods to identify barriers to gene flow | | | | | |
| Based on the Monmonier Algorithm | Band-based | barrier*** | | Analysis of distances between individuals | (Manni et al. 2004) |
| | Band-based | ais††† | | Analysis of band frequencies | (Miller 2005) |
| Based on Wombling | Band-based | wombsoft‡‡‡ | | Analysis of band frequencies | (Crida & Manel in press) |
| Mantel test to test for isolation by distance | | | | | |
| | Allele frequency-based | spagedi§§§ | Hardy–Weinberg equilibrium or known FIS | Analysis of pairwise relatedness between individuals | (Hardy & Vekemans 2002) |
| | Band- or allele frequency-based | genalex§ | | Analysis of the correlation between a geographic matrix and a distance matrix for individuals or populations | (Peakall & Smouse 2006) |
| Spatial autocorrelation | | | | | |
| | Band-based | genalex§ | | Based on a distance matrix of individuals or populations | (Peakall & Smouse 2006) |
| | Band-based | sgs¶¶¶ | | No missing data allowed | (Degen et al. 2001) |
| Population structure: prerequisite about populations | | | | | |
| Estimation of FST | | | | | |
| | Band- or allele frequency-based | popgene | Hardy–Weinberg equilibrium or known FIS | Estimation of band frequencies or estimation of allelic frequencies calculated with the square-root procedure | (Yeh & Boyle 1997) |
| | Band- or allele frequency-based | aflp-surv | Hardy–Weinberg equilibrium or known FIS | Estimation of band frequencies or estimation of allelic frequencies calculated with the square-root procedure or the Bayesian procedure | (Vekemans et al. 2002) |
| | Allele frequency-based | hickory**** | Beta distribution of allelic frequency variation across populations; FIS and FST similar across loci | Bayesian estimation of FST. Also provides estimates of allelic frequencies and FIS, but these are less reliable | (Holsinger et al. 2002) |
| Analysis of molecular variance (amova) | Band-based | arlequin†††† | | Based on a phenotypic (band-based) distance matrix | (Excoffier et al. 1992) |
| Identification of hybrids | | | | | |
| Bayesian clustering methods | Allele frequency-based | newhybrids‡‡‡‡ | | Bayesian identification of hybrid individuals. Distinction between several categories of hybrids | (Anderson & Thompson 2002) |
| | Band- or allele frequency-based | structure††, baps‡‡ | | See above (Population structure) | |
| Assignment test | Band-based | aflpop§§§§ | | | (Duchesne & Bernatchez 2002) |
| Identification of candidate loci for selection | | | | | |
| Estimation of the null distribution of genetic differentiation | | | | | |
| | Allele frequency-based | winkles¶¶¶¶ | Hardy–Weinberg equilibrium | Simulation of neutral genetic differentiation between two populations. Estimation of allelic frequencies by the square-root procedure | (Wilding et al. 2001) |
| | Allele frequency-based | dfdist***** | Hardy–Weinberg equilibrium | Simulation of neutral genetic differentiation between two populations or more. Estimation of allelic frequencies by the Bayesian procedure | (Beaumont & Nichols 1996) |
| Logistic regression | | | | | |
| | Band-based | r††††† | Binomial distribution for the response variable; no correlation between explanatory variables | Allows identifying the specific selection pressure(s) | (R Development Core Team 2004) |

Identification of populations.  For many species, demarcation of populations is problematic, and there is no a priori knowledge of population entities. Tree-based methods (Hollingsworth & Ennos 2004) or multivariate analyses (e.g. principal coordinates analysis), which represent distance matrices graphically, are well suited to exploratory analyses (see Box 1 for distances). For statistical inference, however, model-based approaches such as the Bayesian clustering methods are more suitable (e.g. Pritchard et al. 2000).

Most of the clustering methods (Pritchard et al. 2000; Falush et al. 2003; Corander et al. 2004; Wu et al. 2006), reviewed in Manel et al. (2005) and in Wu et al. (2006), are applied to dominant markers, although until recently, no AFLP-specific implementation existed. Clusters are in this case characterized by their band frequencies. For example, in the case of structure 2.1, the most widely used method to infer population structure (not AFLP specific), the presence/absence of a band is treated as a haploid allele and the second allele at each locus is entered as missing. Using presence/absence of bands directly is valid under the no-admixture model, but not correct for the estimation procedure of structure 2.1 under the admixture model (see documentation of structure 2.1). Similarly, in the case of baps (Corander et al. 2004), dominant data are entered into the program as haploid.
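The coding convention described for structure 2.1 (band state entered as the first allele, second allele entered as missing) can be sketched as follows. This is only an illustration of the idea: it assumes a two-rows-per-individual layout and a missing-data code of -9, both of which must match the settings actually declared to the program, and the function name is hypothetical.

```python
MISSING = -9  # assumed missing-data code; must agree with the program settings


def to_two_row_format(sample_ids, band_matrix, missing=MISSING):
    """Write each individual as two rows: band states, then a row of missing values."""
    lines = []
    for sample_id, bands in zip(sample_ids, band_matrix):
        first_allele = " ".join(str(b) for b in bands)
        second_allele = " ".join(str(missing) for _ in bands)
        lines.append(f"{sample_id} {first_allele}")
        lines.append(f"{sample_id} {second_allele}")
    return lines


for line in to_two_row_format(["ind1", "ind2"], [[1, 0, 1], [0, 0, 1]]):
    print(line)
```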

Recently, a modified version of structure treating dominant markers explicitly was developed (version 2.2, Falush et al. in press). This version allows admixture proportions to be estimated. Instead of assuming that the genotype of each individual at each locus is known (or entirely unknown in the case of missing data) as in the previous version, structure version 2.2 treats the genotypes themselves as unknown. The observations (i.e. the dominant data) provide only partial information about the genotypes, and an additional step is introduced into the algorithm, which updates the diploid genotypes based on the probability of all possible genotypes (Falush et al. in press). From simulations, Falush et al. (in press) showed that the allelic frequency estimates are more accurate when the alleles are codominant, especially when the frequency of the presence allele is high. However, the loss of accuracy incurred when one allele is made recessive to the other (i.e. dominant markers) is rather small. We strongly recommend using the version adapted for dominant markers.
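The idea that a band phenotype carries only partial information about the genotype can be made concrete with a small calculation. The sketch below assumes a single biallelic locus in Hardy–Weinberg proportions with a known frequency q of the recessive (band-absence) allele; it is a simplified illustration, not the updating step of structure itself, which also involves cluster membership and admixture.

```python
def genotype_posterior(band_present, q):
    """Probability of each diploid genotype given the band phenotype.

    q: frequency of the recessive (band-absence) allele a; A is the presence allele.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    if not band_present:
        # Band absence is only compatible with the recessive homozygote.
        return {"AA": 0.0, "Aa": 0.0, "aa": 1.0}
    if q == 1.0:
        raise ValueError("a band cannot be observed when q = 1")
    p = 1.0 - q
    visible = 1.0 - q * q  # probability of showing a band: p^2 + 2pq
    return {"AA": p * p / visible, "Aa": 2 * p * q / visible, "aa": 0.0}


print(genotype_posterior(True, 0.5))
print(genotype_posterior(False, 0.5))
```

With q = 0.5, an individual showing the band is twice as likely to be heterozygous as homozygous for the presence allele, which is the ambiguity a dominant-marker model has to carry through the analysis.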

When geographical locations of individuals are known and sampling is relatively even in space, spatial model-based clustering methods are available to identify clusters of individuals (Guillot et al. 2005; François et al. 2006; Corander et al. in press). Assuming that populations occupy geographically delimited areas, the use of spatial information increases the power to correctly detect the underlying population structure. But these spatial methods have not been developed specifically for dominant markers and require adding missing values (see above). We will thus not discuss these approaches any further (see François et al. 2006 for a technical discussion).

Specific spatial methods looking for genetic boundaries at the individual level can be applied directly to infer population structure from dominant markers. Once genetic boundaries have been identified and their significance assessed, it is possible to assemble clusters of individuals. Two different band-based approaches can be used. First, the Monmonier algorithm (Monmonier 1973; Manni et al. 2004; Miller 2005) is based on the analysis of the genetic distances between individuals (see Box 1). The first step of this approach consists in connecting the sampled localities using a graphical method for defining adjacent points on a map. Genetic distances are computed between all pairs of localities connected by direct edges, and genetic boundaries are then associated with the highest genetic distances. Second, Wombling is based on the analysis of band frequencies (Womble 1951; Barbujani et al. 1989). It locates boundaries across a sampled area by searching for regions where the gradient (i.e. slope) in allelic frequency is steep.
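The distance-ranking step of the Monmonier approach can be sketched in a few lines. The example assumes that the neighbourhood graph (the list of directly connected localities) has already been built by some other means and that pairwise genetic distances are available; it only ranks edges, and does not implement the subsequent edge-by-edge extension of a boundary. All names and values are hypothetical.

```python
def rank_edges_by_distance(edges, genetic_distance):
    """Sort neighbourhood-graph edges from the largest to the smallest genetic distance.

    edges: iterable of (locality_a, locality_b) pairs.
    genetic_distance: dict mapping frozenset({a, b}) to a distance.
    """
    scored = [(genetic_distance[frozenset(edge)], tuple(sorted(edge))) for edge in edges]
    scored.sort(reverse=True)
    return [(edge, dist) for dist, edge in scored]


# Hypothetical localities and distances between adjacent localities.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
distances = {
    frozenset(("A", "B")): 0.10,
    frozenset(("B", "C")): 0.45,
    frozenset(("A", "C")): 0.40,
    frozenset(("C", "D")): 0.05,
}
for edge, dist in rank_edges_by_distance(edges, distances):
    print(edge, dist)
```

Edges at the top of the ranking are the candidates where a boundary would start.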

Finally, for species in which individuals are continuously distributed and/or where sampling was continuous, spatial autocorrelation (Epperson 2003; Hardy 2003) or regression methods (Rousset 1997; Bensch et al. 2002a) can be used to investigate a pattern of genetic isolation by distance (Table 1).
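A Mantel test of isolation by distance (Table 1) can be sketched as a permutation test on the correlation between a genetic and a geographical distance matrix. The version below is a minimal, one-sided implementation in plain Python; the number of permutations, the seed and the toy matrices are arbitrary choices for illustration.

```python
import random
from math import sqrt


def _upper(matrix):
    """Values above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """Return (r, p) where p is the one-sided permutation p-value for r > 0."""
    rng = random.Random(seed)
    n = len(genetic)
    geo = _upper(geographic)
    observed = _pearson(_upper(genetic), geo)
    hits = 1  # the observed arrangement counts as one permutation
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of the genetic matrix together.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if _pearson(_upper(permuted), geo) >= observed - 1e-12:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy example: four localities on a line, genetic distance proportional to geography.
geo = [[abs(i - j) for j in range(4)] for i in range(4)]
gen = [[0.1 * abs(i - j) for j in range(4)] for i in range(4)]
r, p = mantel_test(gen, geo)
print(r, p)
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent observations.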

Estimation of FST values.  Once populations have been defined, a common question is ‘how different are they?’ (Manel et al. 2005). Five different methods can be found in the literature to estimate FST from dominant markers (Table 1): (i) classical estimation of FST (Weir & Cockerham 1984) or GST (Nei 1987) based on the estimation of allele frequencies (see Holsinger 1999 for the difference between the two parameters); (ii) Bayesian estimation of FST that introduces uncertainty about the magnitude of FIS (Holsinger et al. 2002); (iii) ΦST, which can be estimated in an analysis of molecular variance (amova; Excoffier et al. 1992) from a distance matrix. The amova requires a metric with Euclidean properties (see Reif et al. 2005 for description of these properties), such as the simple-matching coefficient (Box 1). In addition, genetic differentiation can be estimated by (iv) partitioning nucleotide diversity (Box 1) according to Charlesworth (1998) (e.g. Tero et al. 2005); or by (v) calculating FST according to the moment-based method developed by Hill & Weir (2004) (e.g. Foulley et al. 2006). In addition, structure 2.2 calculates population-specific FST estimates for identified clusters under the F-model (Falush et al. 2003).
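Approach (i) can be illustrated with a GST-type calculation from allele frequencies. The sketch below assumes Hardy–Weinberg equilibrium (square-root procedure), weights populations equally regardless of sample size, and omits any bias correction; it is meant to show the partition of diversity into within- and among-population components rather than to reproduce any particular program. Function names and data are hypothetical.

```python
from math import sqrt


def null_allele_freqs(band_matrix):
    """Per-locus null-allele frequency q = sqrt(share of band absences)."""
    n = len(band_matrix)
    n_loci = len(band_matrix[0])
    return [sqrt(sum(1 for ind in band_matrix if ind[k] == 0) / n) for k in range(n_loci)]


def gst(populations):
    """Multilocus GST = (HT - HS) / HT for a list of band matrices sharing the same loci."""
    freqs = [null_allele_freqs(pop) for pop in populations]
    n_loci = len(freqs[0])
    hs_sum = 0.0  # within-population gene diversity, summed over loci
    ht_sum = 0.0  # total gene diversity, summed over loci
    for k in range(n_loci):
        qs = [f[k] for f in freqs]
        hs_sum += sum(2 * q * (1 - q) for q in qs) / len(qs)
        q_bar = sum(qs) / len(qs)
        ht_sum += 2 * q_bar * (1 - q_bar)
    if ht_sum == 0.0:
        raise ValueError("no variation in the total sample")
    return (ht_sum - hs_sum) / ht_sum


pop1 = [[1, 0], [1, 0], [0, 0], [1, 1]]
pop2 = [[0, 1], [0, 1], [0, 1], [1, 1]]
print(gst([pop1, pop2]))
```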

Box 3 presents a comparison of the first three estimation approaches for real and simulated AFLP data. The partitioning of nucleotide diversity is based on several additional assumptions (Innan et al. 1999) and is rarely used; it was therefore not considered further. The method proposed by Hill & Weir (2004) was also not included in our comparisons because it is not yet implemented in any easily accessible software, and has to our knowledge been applied only once (Foulley et al. 2006).

Our calculated examples showed that one has to be cautious in interpreting differentiation estimated from dominant data. While several methods overestimated differentiation, others clearly underestimated it. It is thus only possible to compare levels of differentiation estimated using the same method. Sample sizes of 10 individuals, regularly occurring in empirical studies, produced biased estimates in most cases. Larger sample sizes (50 individuals) improved allele frequency-based estimates considerably. The number of loci used (300 or 150) had a negligible effect on the estimates. Our simulations showed that, given a sufficient number of sampled individuals, allele frequency-based estimates should be preferred. However, the analysis of empirical data sets showed that the relative level of differentiation is in general the same independently of the estimation method chosen, and independently of assumptions about the FIS value. Comparative interpretations, which are common in phylogeographical and ecological studies, are thus rather robust.

Example.  Few studies use several approaches simultaneously to estimate population differentiation. Tero et al. (2005) investigated the genetic structure in the endangered plant species Silene tatarica by analysing 193 polymorphic AFLPs in plants from seven sites (24–30 individuals per site) in order to address the degree of isolation of the subpopulations. Using structure 2.1 (Pritchard et al. 2000), they identified the seven subpopulations in which the individuals were collected. They found considerable discrepancies between the results of different estimators of differentiation, in accordance with Box 3. The Bayesian method of Holsinger et al. (2002) provided only weak evidence of inbreeding in the total population, but Holsinger et al. (2002) advise that estimates of FIS derived from dominant markers be regarded with caution. This method estimated the FST value among all subpopulations at around 0.287 ± 0.012. This was lower than the FST estimates based on allele frequencies using the square-root method as implemented in popgene (FST = 0.390 ± 0.018, assuming Hardy–Weinberg equilibrium), the amova estimate (ΦST = 0.369), and the value estimated on the basis of nucleotide diversities (FST = 0.580).

Interpretation of population structure: dispersal and phylogeography.  Once the population structure has been identified and quantified, it can be interpreted in the context of various biological questions. Below we discuss two common examples: the estimation of dispersal rates and the discussion of phylogeographical patterns.

Several approaches exist to estimate dispersal rates from AFLP data. Methods differ by the timescale over which gene flow is estimated and by their dependence on underlying models, whose assumptions are usually not tested (Sork et al. 1999; Hardy et al. 2006). The difficulties associated with the estimation of dispersal rates from genetic data are not specific to AFLPs. The oldest and most classical approach is to assume an island model of structure and migration–drift equilibrium, and to estimate a mean dispersal rate from FST. This approach has been amply criticized (e.g. Whitlock & McCauley 1999). Applying this approach to AFLP data, de Casas et al. (2006) estimated gene flow and showed that reproductive barriers separated neither populations nor lineages of Olea europaea in the western Mediterranean area. In a continuous habitat, dispersal distances can be estimated from the slope of genetic differentiation against geographical distance (isolation-by-distance approach; Rousset 1997). Like the previous one, this approach assumes migration–drift equilibrium and has been shown to perform best at small geographical scales (Rousset 1997). Hardy (2003) introduced an estimation of pairwise relatedness between individuals for dominant markers that can be used instead of genetic differentiation in the isolation-by-distance approach (Table 1). Dispersal distance can thus be estimated directly from individual data, without calculating pairwise FST estimates between populations. This estimator does not assume genotypes to be in Hardy–Weinberg proportions but requires knowledge of the inbreeding coefficient. It has been successfully used to compare gene dispersal distance in 10 neotropical tree species (Hardy et al. 2006). A third approach to assess recent gene flow consists in identifying individual migrants using assignment tests (Paetkau et al. 2004; Manel et al. 2005). Assignment of individual genotypes to the most likely population of origin can be carried out using the band-based method implemented in aflpop (Raffl et al. 2006), an assignment test based on genetic distances as implemented in geneclass (Mariac et al. 2006), or a Bayesian clustering program such as structure. The first two approaches are clearly best at small scales, as genetic differentiation at larger scales is strongly influenced by history, migration–drift equilibrium is reached very slowly and the assumption of no mutation is more likely to be violated at large scales.
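The classical island-model estimate mentioned at the start of this paragraph relies on the relationship FST ≈ 1/(4Nm + 1) for diploid nuclear markers at migration–drift equilibrium. The sketch below simply inverts this relationship; the input values are the four differentiation estimates quoted above for Silene tatarica, used here only to illustrate how strongly the result depends on the estimator, not as an analysis reported by the original authors.

```python
def migrants_per_generation(fst):
    """Island-model estimate Nm = (1 / FST - 1) / 4, valid only under that model."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


# Differentiation estimates quoted in the text for the same data set.
for label, value in [("Bayesian FST", 0.287), ("square-root FST", 0.390),
                     ("amova PhiST", 0.369), ("nucleotide-diversity FST", 0.580)]:
    print(label, round(migrants_per_generation(value), 3))
```

Because Nm is a steeply nonlinear function of FST, differences between estimators that look moderate on the FST scale translate into severalfold differences in the inferred number of migrants.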

AFLPs are commonly used in phylogeographical studies, especially in plants (e.g. Després et al. 2002; Schonswetter et al. 2004; Skrede et al. 2006). In such a framework, population structure is inferred by a combination of the methods described above and then interpreted as reflecting the history of the species, for example as resulting from past range contractions and expansions in response to climate change. Common questions are the identification of refugial areas, the source for (re)colonization of other areas and the distinction between old vicariance and recent dispersal. In addition to looking at the distribution of genetic diversity (assumed to be higher in refugial areas, but also in meeting zones), the distribution of individual markers provides some information. Fragments exclusive to one population (or region) can be counted and a large number of such private fragments indicates old divergence (Schonswetter & Tribsch 2005). Rare markers are expected to accumulate in long-term isolated populations. In order to avoid setting an arbitrary threshold for rareness, Schonswetter & Tribsch (2005) suggested calculating a frequency down-weighted (DW) marker value equivalent to range-down-weighted species values in historical biogeography. For some areas known to have been colonized recently, like areas extensively glaciated during the Pleistocene (e.g. Scandinavia), the source for recolonization has been identified by assignment tests (e.g. Skrede et al. 2006). Because of the lack of an appropriate mutation model, statistical phylogeographical methods based on coalescent simulations (Knowles 2004; Mallet 2005) have not been applied to AFLPs. It is also very difficult to suggest a time frame for processes identified exclusively on the basis of AFLP data, which limits the conclusions that can be drawn from phylogeographical studies restricted to this one marker type.
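Counts of private fragments and a DW-type value can be sketched as follows. The formula used for DW is an assumption made for this illustration: for each marker, the number of occurrences in a population is divided by the number of occurrences in the whole data set, and these ratios are summed over markers. Unequal sample sizes would require some form of standardization that is not shown here, and all names and data are hypothetical.

```python
def marker_occurrences(band_matrix):
    """Number of individuals showing each marker (column sums of a 0/1 matrix)."""
    return [sum(column) for column in zip(*band_matrix)]


def private_and_dw(populations):
    """populations: dict mapping a population name to its 0/1 band matrix."""
    occ = {name: marker_occurrences(matrix) for name, matrix in populations.items()}
    n_markers = len(next(iter(occ.values())))
    totals = [sum(o[k] for o in occ.values()) for k in range(n_markers)]
    result = {}
    for name, o in occ.items():
        # A private fragment occurs in this population and nowhere else.
        private = sum(1 for k in range(n_markers) if o[k] > 0 and o[k] == totals[k])
        # Assumed DW: occurrences here divided by occurrences overall, summed over markers.
        dw = sum(o[k] / totals[k] for k in range(n_markers) if totals[k] > 0)
        result[name] = {"private": private, "dw": dw}
    return result


data = {
    "A": [[1, 1, 0], [1, 0, 0]],
    "B": [[1, 0, 1], [1, 0, 1]],
}
print(private_and_dw(data))
```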

Identification of hybrid individuals

The identification of hybrid individuals and backcrosses between genetically distinct species or populations is important in many biological investigations addressing a broad range of topics including speciation, hybrid zones and conservation biology (e.g. Mallet 2005). In addition to this individual level, hybridization can be studied in a phylogenetic framework, addressing the role of historical hybridization in the formation of the gene pool of a species (e.g. Guo et al. 2005; Paun et al. 2006). Because AFLPs allow generating a large number of markers distributed throughout the genome, they are a useful tool to study hybridization both as current admixture and historical introgression (e.g. Bensch et al. 2002b). However, because our review focuses on studies of individuals and populations, we will not discuss the case of hybrid species.

Different approaches exist to identify hybrids in AFLP data sets. Analyses can be based on a set of ‘diagnostic’ markers, chosen to be fixed or to show clear differences in frequency between the parent groups (e.g. Bensch et al. 2002b), or on total data sets of randomly generated markers (e.g. Albert et al. 2006). Because diagnostic markers may be difficult to find in closely related species, the second approach is more common. Several statistical methods to identify hybrids are related to assignment tests (Manel et al. 2005). These methods have been developed for codominant data but were modified to be applied to dominant data. In a first step, pure individuals of the parent species are identified morphologically or genetically (e.g. using mtDNA). Using the AFLP data, unknown individuals can then be assigned to these two groups. Hybrids are identified as individuals with a small difference in likelihood between the parental species (Bensch et al. 2002b; Helbig et al. 2005). Alternatively, artificial hybrid genotypes can be simulated to create additional gene pools to which individuals can be assigned (Congiu et al. 2001). The software aflpop (Duchesne & Bernatchez 2002) performs assignment tests based on band frequencies (Table 1). It also allows simulating each of the following categories of hybrids: F1 hybrids, backcrosses to each parental species and F2 hybrids. Band frequencies in the hybrid gene pools are calculated from observed frequencies in the parental species assuming Hardy–Weinberg equilibrium. Given a sufficient power in the data, the status of hybrids can thus be analysed in detail. It is recommended to investigate the power of the data to distinguish these different categories using the simulation procedure implemented in aflpop (Duchesne & Bernatchez 2002).
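How band frequencies in hybrid gene pools follow from parental band frequencies under Hardy–Weinberg equilibrium can be sketched as below. This is a simplified illustration assuming unlinked biallelic loci with a recessive band-absence allele; it is not claimed to reproduce the exact computation performed by aflpop, and the function name is hypothetical.

```python
from math import sqrt


def hybrid_band_frequencies(f_parent1, f_parent2):
    """Expected band frequencies in hybrid classes from parental band frequencies."""
    q1 = sqrt(1.0 - f_parent1)  # null-allele frequency in parental species 1
    q2 = sqrt(1.0 - f_parent2)  # null-allele frequency in parental species 2
    q_f1 = (q1 + q2) / 2.0      # null-allele frequency among gametes produced by F1s
    # A band is absent only when both gametes carry the null allele.
    return {
        "F1": 1.0 - q1 * q2,
        "F2": 1.0 - q_f1 ** 2,
        "backcross_1": 1.0 - q_f1 * q1,
        "backcross_2": 1.0 - q_f1 * q2,
    }


# A diagnostic marker: band fixed in species 1, absent in species 2.
print(hybrid_band_frequencies(1.0, 0.0))
```

For a diagnostic marker the expected frequencies differ clearly between classes, whereas for markers with similar parental frequencies they converge, which is why the power to separate hybrid categories has to be checked by simulation.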

In the case of two known and distinct parental gene pools, it is possible to estimate a hybrid index, which is an estimate of the proportion of the alleles of an individual that were inherited from each parental species (Rieseberg et al. 1999; Rogers et al. 2001). For nondiagnostic markers, such an index can be estimated both for dominant and codominant data using a maximum-likelihood approach (Rieseberg et al. 1999; Buerkle 2005).

The software structure 2.2 (Falush et al. in press; see Population Structure section) allows implementing an admixture model also for dominant data and can thus be used to address hybridization. Under the admixture model, the proportion of ancestry of each individual in each cluster or population is estimated by its posterior probability. Hybrids are identified as individuals with ancestry in two different clusters. If information about the relative position of the markers is available, the linkage model can also be used. It would be particularly appropriate in the case of a large number of markers. Anderson & Thompson (2002) developed a specific model-based Bayesian method for identifying hybrids, which is implemented in the program newhybrids. This method computes the posterior probability that an individual in the sample belongs to each of several different hybrid categories. Given sufficient data, it allows distinguishing between F1 hybrids, different backcrosses or later generation hybrids (insufficient data lead to a lack of resolution). It does not require that parental populations are sampled separately, thus avoiding the sometimes difficult step of identifying pure individuals. Dominant data are treated explicitly in newhybrids using an approach similar to that used in structure 2.2 (Falush et al. in press). An extra layer of latent variables is added to the model. Each marker is modelled as a biallelic locus and the individual genotypes are treated as latent variables that cannot be observed directly, but are estimated together with the rest of the model variables in order to fit the observed data (i.e. the dominant phenotypes; E. Anderson, personal communication). The advantage of structure or newhybrids over frequentist assignment tests is that these programs can identify hybrids also without reference samples of ‘pure’ genotypes. 
In addition, both of them treat dominant data explicitly and incorporate the uncertainty about the true genotypes into the model.

More descriptive approaches, such as principal coordinate analysis (PCO, e.g. Helbig et al. 2005) or neighbour-joining trees (e.g. Congiu et al. 2001) are also informative. Hybrid individuals are identified on the diagrams from their intermediate position between the clusters of the parent species. It is important to note, however, that tree-based analyses are by definition not well adapted to investigate hybridization, as hybridization introduces reticulation. Tree-based analyses are thus bound to produce ‘wrong’ or ambiguous results for AFLP data sets including hybrids.

Example. Albert et al. (2006) studied the dynamics of introgression between the American and European eels by genotyping 1127 individuals at 373 AFLP loci. They used the simulation option of aflpop to investigate the power of their data to discriminate between different categories of hybrids. As the probability of erroneous assignment was relatively high between backcrosses and F2 hybrids, these two categories were combined into a pool of later generation hybrids. Thus, individuals were classified either as pure American or European eels, F1 or later generation hybrids. Hybrids were identified mainly in Iceland. As a second approach, Albert et al. (2006) used structure 2.2. They ran an analysis assuming two genetic clusters. Based on the 90% posterior probability interval of the admixture value, individuals were assigned to four possible categories: pure (American or European) if their probability interval overlapped with 0 or 1, F1 hybrid if their probability interval overlapped with 0.5, but not with 0 or 1, and later generation hybrid if their probability interval did not overlap with either 0, 0.5 or 1. The results of this approach were in agreement with those from aflpop for individuals with pure European origin and F1 individuals; however, structure identified considerably fewer later generation hybrids than aflpop (34 vs. 180; V. Albert, personal communication). Finally, they compared the results from structure with those obtained from newhybrids using four categories of hybrids (distinguishing backcrosses and F2 hybrids). The results were largely congruent with 95% of 1127 assignments being identical. The largest discrepancy in the results was thus between the band-based approach of aflpop and the two allele-frequency based Bayesian approaches.
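The interval-based classification rule described in this example can be written as a small function. The sketch assumes that the admixture value is the proportion of ancestry in the first of two clusters and that 'overlap' means the interval contains the value; the category labels and the optional tolerance are hypothetical.

```python
def classify_by_interval(lower, upper, tol=0.0):
    """Classify an individual from the posterior interval of its admixture value."""
    def covers(value):
        return lower - tol <= value <= upper + tol

    if covers(1.0):
        return "pure_cluster_1"
    if covers(0.0):
        return "pure_cluster_2"
    if covers(0.5):
        return "F1"
    return "later_generation"


for interval in [(0.0, 0.12), (0.40, 0.62), (0.62, 0.90), (0.85, 1.0)]:
    print(interval, classify_by_interval(*interval))
```

Under this rule, wide intervals tend to be classified as pure or F1, so the number of later generation hybrids detected depends on how informative the markers are.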

Detection of markers associated with phenotype

AFLPs are a tool of choice for unravelling the genetic architecture of complex traits because they are particularly suitable for screening basically any genome, at low cost and effort (Blears et al. 1998; Mueller & Wolfenbarger 1999; Bensch & Akesson 2005). As a result, the AFLP technique is popular in quantitative trait loci (QTL) analyses or linkage studies (see for example Hawthorne 2001; Verhoeven et al. 2004; Rogers & Bernatchez 2005; Assunção et al. 2006; Gardner & Latta 2006). Phenotypic information is central in these two approaches, which aim at detecting statistical associations between the genotype at a given locus and the phenotypic value of the studied trait under a quantitative genetic framework (Mackay 2001). QTL analysis requires manipulated pedigrees or hybridizing populations and its usefulness for natural populations remains to be evaluated. In contrast, it is possible to build partial linkage maps or to saturate existing linkage maps using AFLPs on unmanipulated pedigrees or natural populations. The methodological advantages and drawbacks of AFLP genotyping in QTL and linkage analysis will not be addressed further here, since they have already inspired many exhaustive reviews (see for example Erickson et al. 2004; Gupta et al. 2005; Slate 2005).

Less documented is the application of anonymous markers to the search for genes or genomic regions under selection when prior knowledge about the advantageous trait is missing. So far, this has been carried out mainly under a population genomics framework. The underlying philosophy of population genomics consists in capturing the patterns of genetic diversity at the genome scale by genotyping many markers scattered in the genome of many individuals for several populations (Black et al. 2001). All the sampled loci are expected to be influenced similarly by the global evolutionary forces, such as genetic drift or gene flow. A few of them, however, may also be subject to locus-specific forces like selection and therefore display an unusual pattern of genetic differentiation (Luikart et al. 2003; Storz 2005). Identifying loci presumably under selection thus comes down to detecting those with a deviant behaviour, which are often referred to as ‘outlier loci’.

Because AFLP genotyping allows an accurate assessment of baseline levels of neutral genetic variation across the genome, it is particularly appropriate to the population genomics approach. This is well-demonstrated by the increasing number of articles reporting genome surveys using AFLPs, which revealed loci potentially involved in adaptation to biotic (Mealor & Hild 2006) or abiotic factors (Bonin et al. 2006; Jump et al. 2006), as well as loci associated with adaptive divergence between sympatric ecotypes, morphotypes or host races (Wilding et al. 2001; Campbell & Bernatchez 2004; Emelianov et al. 2004). Exploratory AFLP-based genome scans have also been carried out in parallel in closely related species, to disentangle the genetic basis of speciation (Scotti-Saintagne et al. 2004; Savolainen et al. 2006).

However, despite the enthusiastic use of AFLPs in population genomics, methodological tools to detect outlier loci among such markers are still scarce and not always well adapted. Only two programs designed for this purpose are currently available: winkles (originally wink; Wilding et al. 2001) and dfdist, modified from fdist (Beaumont & Nichols 1996) to deal with dominant data (see Table 1). Both of these programs simulate the theoretical null distribution of the genetic differentiation between two (winkles) or several populations (dfdist) conditional on the allelic frequency (winkles) or heterozygosity (dfdist) under specific models of neutral evolution. They thus rely on prior assumptions about population structure, size and history, migration and mutation rates, etc. that may introduce a bias in the results if these assumptions strongly deviate from the generally unknown empirical conditions (Storz 2005). In addition, FST values have to be estimated from allele frequencies (Box 1).

Several directions can be explored in order to enhance the potential of population genomics to successfully identify loci with a genuine selection signature by means of AFLP-based genome surveys. First, we would like to call for a proper validation of the outlier detection programs already existing for AFLP data, in terms of power and robustness. In this respect, particular attention has to be paid to the confounding effects of demography, which can create spurious selection footprints (Akey et al. 2004). Second, more effort is needed to expand the range of methods capable of handling dominant binary data. It would be especially profitable to design methods implementing various demographic scenarios and/or examining different population parameters (FST, FIS, excess of homozygosity, etc.; Luikart et al. 2003). Indeed, such parameters are not all equally sensitive to population history (Nielsen 2005). Third, it is time to initiate a global reflection on the statistical significance thresholds to be used in population genomics. Conducting multiple statistical tests goes together with an increased probability of making type-I errors, which is traditionally controlled by adopting the Bonferroni correction. However, this procedure is increasingly criticized because it greatly reduces the statistical power of the analyses. Recently, some alternative methods were implemented to control the false discovery rate (FDR) while preserving a satisfactory power (Storey & Tibshirani 2003; Narum 2006). Fourth, the construction of realistic mutation models for AFLP data would also allow a substantial refinement of the current analytical methods.
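The contrast between Bonferroni and FDR control raised in the third point can be illustrated with a minimal sketch. It uses the Benjamini-Hochberg step-up procedure, which is one classical way of controlling the FDR and is not necessarily the procedure implemented in the references cited above. The function names and the ten per-locus p-values are hypothetical and serve only as an illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag tests that remain significant after the Bonferroni correction."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag tests declared significant by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Indices of the tests, sorted by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Every test ranked at or below k_max is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            significant[idx] = True
    return significant


# Hypothetical p-values for ten AFLP loci (illustration only, not real data).
p_vals = [0.001, 0.008, 0.012, 0.019, 0.041, 0.20, 0.35, 0.50, 0.74, 0.91]

n_bonferroni = sum(bonferroni(p_vals))
n_bh = sum(benjamini_hochberg(p_vals))
print("Bonferroni:", n_bonferroni, "loci; Benjamini-Hochberg:", n_bh, "loci")
```

With these invented values, the Bonferroni correction retains fewer loci than the FDR procedure at the same nominal level, which is the loss of power referred to above. The price is that a controlled proportion of the retained loci is expected to be false positives.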

When investigating the genetic basis of adaptation without any phenotypic information, a promising alternative to population genomics may be provided by regression models such as logistic regression, which is a band-based approach. Logistic regression describes the association between a qualitative response variable following a binomial distribution [B(n,p), where n is the number of Bernoulli trials and p is the probability of success] and one or more explanatory variables, assuming a logit link [log(p/(1 − p))] (McCullagh & Nelder 1989). Such a regression is well suited to analysing the relationship between an AFLP marker (n = 1) with two possible states (band presence or absence, with p the probability of band presence) for each individual, and an environmental variable (e.g. mean temperature, nutrient concentration, luminosity, probability of predation, etc.). The likelihood ratio (or G-statistic) and/or the Wald test (Hosmer & Lemeshow 2000) can then be used to test whether the examined association is significant, that is, whether the model including the studied variable fits the observed distribution better than a model containing only a constant. Logistic regression is commonly applied in ecology, for example to investigate the best combination of explanatory (i.e. environmental) variables that explain the presence or absence of a species (Manel et al. 1999). By contrast, it is currently underexploited in population genetics to search for adaptive loci (but see Joost & Bonin in press; Joost et al. in press), despite several significant advantages compared to the population genomics methods mentioned above. First, it works directly with the probability of occurrence of bands and not with allele frequencies, so it does not require any knowledge or assumption about FIS. Second, and this is a corollary of the previous advantage, logistic regression is an individual-based method free from the notion of population, as long as environmental data can be provided for each individual genotype. Third, it is able to specify which variable is coupled with the candidate locus, giving a valuable clue about the selective pressure at stake. Like other outlier detection methods, however, logistic regression may highlight false positives.
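To make the test described above concrete, the following minimal sketch fits a single-variable logistic regression to the presence or absence of one AFLP band and compares it with a constant-only model through the likelihood ratio (G) statistic. It is an illustration under stated assumptions, not the implementation used in the cited studies. The function name band_environment_test, the altitude values and the band scores are all hypothetical. The environmental variable is standardized internally for numerical stability, and the G-statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _log_lik(y, p):
    # Bernoulli log-likelihood; probabilities are clipped to avoid log(0).
    eps = 1e-12
    return sum(
        yi * math.log(max(pi, eps)) + (1 - yi) * math.log(max(1.0 - pi, eps))
        for yi, pi in zip(y, p)
    )


def band_environment_test(env, band, max_iter=100, tol=1e-10):
    """Return (slope per unit of env, G-statistic, p-value) for one locus.

    env  : environmental value recorded for each individual
    band : 1 if the AFLP band is present in that individual, 0 otherwise
    """
    n = len(env)
    if n != len(band) or n < 3:
        raise ValueError("env and band must have the same length (>= 3)")
    mean = sum(env) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in env) / n)
    if sd == 0:
        raise ValueError("the environmental variable does not vary")
    x = [(e - mean) / sd for e in env]

    # Newton-Raphson on logit(p) = b0 + b1 * x.
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [_sigmoid(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - pi for yi, pi in zip(band, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, band, p))
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break  # near-singular information matrix, e.g. perfect separation
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break

    ll_full = _log_lik(band, [_sigmoid(b0 + b1 * xi) for xi in x])
    p_bar = sum(band) / n  # constant-only model: overall band frequency
    ll_null = _log_lik(band, [p_bar] * n)
    g_stat = max(0.0, 2.0 * (ll_full - ll_null))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(g_stat / 2.0))
    return b1 / sd, g_stat, p_value


# Hypothetical data: altitude (m) and band presence for 12 individuals.
altitude = [400, 550, 700, 850, 1000, 1150, 1300, 1450, 1600, 1750, 1900, 2050]
presence = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
slope, g_stat, p_value = band_environment_test(altitude, presence)
print(f"slope = {slope:.5f} per m, G = {g_stat:.2f}, P = {p_value:.4f}")
```

In a genome scan this test would be repeated for every locus and every environmental variable, so the resulting p-values would need a correction for multiple testing. The sketch does not handle loci where the band is fixed or perfectly separated along the gradient, for which the maximum-likelihood estimate does not exist.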

Example. Bonin et al. (2006) investigated adaptation along a gradient of altitude in the common frog (Rana temporaria), by means of a genome survey based on 392 AFLP loci. Using two different population genomics programs, they revealed eight strong candidate loci possibly under selection on the basis of their systematically high genetic differentiation between independent pairs of populations of different altitudes. Confirmation of this result is currently under way using other populations of the same geographical area, and seven of the identified loci were significantly associated with altitude in logistic regression analyses (Joost & Bonin in press). These promising results may trigger a fruitful joint application of logistic regression and population genomics methods in the quest for selected loci (Joost et al. in press).

Discussion and future directions

In this article, we focused on the statistical aspects of AFLP analysis and described methods available in the toolbox of molecular ecologists and evolutionists to assess genetic diversity, to identify population structure and potential hybrids, and to detect loci affecting phenotypes. These methods are of two kinds: the band-based methods, which are based on the direct study of band presences or absences in AFLP profiles; and the allele frequency-based methods, whose application is contingent upon the estimation of allelic frequencies within populations, and which are thus population-centred. Because of the dominant nature of AFLPs, the latter methods require additional assumptions (e.g. Hardy–Weinberg equilibrium) or information about the population genotypic structure (FIS) to estimate allele frequencies.
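The dependence of allele frequency estimates on such assumptions can be sketched in a few lines. If q is the frequency of the null (band-absent) allele and F stands for FIS, the expected frequency of band-absent individuals is x = q² + Fq(1 − q), which reduces to x = q² under Hardy–Weinberg equilibrium. The function below simply solves this equation for q. It is a naive plug-in estimator shown for illustration only, the function name is hypothetical, and F is restricted to the range 0 to 1 for simplicity.

```python
import math


def null_allele_frequency(n_absent, n_total, f_is=0.0):
    """Estimate the null-allele frequency q at a dominant locus.

    n_absent : number of individuals lacking the band
    n_total  : number of individuals scored
    f_is     : assumed inbreeding coefficient (0 = Hardy-Weinberg equilibrium)
    """
    if n_total <= 0 or not 0 <= n_absent <= n_total:
        raise ValueError("need 0 <= n_absent <= n_total and n_total > 0")
    if not 0.0 <= f_is <= 1.0:
        raise ValueError("this sketch assumes 0 <= f_is <= 1")
    x = n_absent / n_total  # observed frequency of band-absent individuals
    if f_is == 1.0:
        return x  # complete inbreeding: no heterozygotes, so q = x
    a = 1.0 - f_is
    # Positive root of a*q**2 + f_is*q - x = 0.
    return (-f_is + math.sqrt(f_is ** 2 + 4.0 * a * x)) / (2.0 * a)


# Same hypothetical locus (12 band-absent individuals out of 100),
# analysed under two different assumptions about FIS.
q_hwe = null_allele_frequency(12, 100)            # assumes FIS = 0
q_inbred = null_allele_frequency(12, 100, 0.5)    # assumes FIS = 0.5
print(f"q (FIS = 0): {q_hwe:.3f}; q (FIS = 0.5): {q_inbred:.3f}")
```

The same band counts yield clearly different allele frequencies depending on the value assumed for FIS, which is why allele frequency-based methods inherit any error in that assumption.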

Several valuable observations emerge from this review. First, there is no general recipe and no magical tool to extract information from AFLP data. Instead, one should (i) carefully adapt the experimental design to the goal of the study; (ii) favour specific and robust statistical tools; (iii) be aware of the underlying hypotheses and check their biological validity; and (iv), if possible, test both band-based and allele frequency-based methods to strengthen the results. Second, when using population-based methods, it appears that the sample size is a crucial parameter that greatly influences the accuracy of allelic frequency estimates. Therefore, we recommend devoting special effort to sampling, that is, collecting at least 30 individuals per population. If this is not feasible, we suggest favouring individual-based methods. Third, some methods mentioned here (e.g. structure 2.1) were originally conceived for codominant markers and can only handle AFLP data if these are recoded by adding a missing value. The consequences of such a coding still need to be properly evaluated. Fourth, as demonstrated by the example of logistic regression, the AFLP toolbox can be easily enriched by methodologies borrowed from other research fields, especially those designed to deal with binary data. Therefore, we would like to encourage the adoption and/or development of such promising methods in molecular ecology and evolution, together with the more traditional ones.

This review revealed another startling fact: very little is known about mutation processes in AFLPs, and in most statistical analyses mutation is ignored. Whereas this may be appropriate when working with related populations on short timescales (Excoffier & Heckel 2006), mutation is expected to have a non-negligible role in shaping genetic variation on longer timescales. Moreover, even when mutation is not totally ignored, the implemented mutation model is often the infinite-allele model with a single mutation rate (e.g. Mariette et al. 2002). This model assumes that any allele mutates at a given rate and that every mutation leads to a new allele, that is, an infinite number of alleles can be generated. The application of this model to AFLPs is debatable for two reasons. First, AFLP markers have only two alleles, not many; and second, it is not obvious that the presence allele is lost at the same rate as the absence allele. As a matter of fact, in phylogenetic reconstruction based on AFLPs, the probabilities of losing or gaining a band are generally considered asymmetrical (Koopman 2005). Mutation rates found in the literature for AFLPs are not particularly low and range from 10⁻⁶ in Gaudeul et al. (2004) to 10⁻⁴ in Wilding et al. (2001) and Campbell & Bernatchez (2004). It is thus time to initiate a thorough reflection about how exactly mutation affects AFLP alleles and to integrate this knowledge into the existing statistical procedures.

In conclusion, in the last decade, the AFLP technique has proven to be a useful marker system, and hundreds of AFLP data sets have been produced for various species. By contrast, the statistical aspects specific to AFLP analysis are only now beginning to be addressed. For example, new methods are being explored to accurately estimate FIS directly from AFLP profiles, which would circumvent one of the major drawbacks of AFLP markers, that is, their dominance. An adequate mutation model for AFLP data would also make it possible to take advantage of the powerful approach of coalescent simulations. We thus predict that a new research phase has been reached, where more emphasis will be laid on AFLP data analysis to make the most of this helpful genotyping technique.


We thank P.B. Eidesen, M. Foll and other members of the LECA Journal Club for helpful comments on an earlier version of the manuscript. We also thank P.B. Eidesen and I.G. Alsos for providing some of the data sets analysed in Box 3. A.B. was funded by an Emergence grant from the Région Rhône-Alpes. D.E. was funded by the Research Council of Norway (grant 150322/720 to C. Brochmann). S.M. was funded by Fonds National de la Science (ACI IMPBIO).

The authors have long-standing experience in the production and analysis of AFLP data sets. A.B. is a postdoctoral fellow whose research focuses on understanding the genetic basis of adaptation in various biological models. D.E. is a postdoctoral researcher now working at the University of Tromsø (Norway) on the ecology and phylogeography of arctic animals and plants. S.M. is an assistant professor with a special interest in landscape genetics, i.e., the study of the interactions between landscape features and microevolutionary processes.