Genome-wide scans for selection are based on the premise that while demographic events impact variation across all parts of the genome equally, the effects of selection are unique to the target gene or genomic region (Maynard Smith & Haigh 1974; Slatkin 1995). Although there are exceptions to this general rule, it forms a solid basis for exploring the genomes of a variety of organisms and can be used to detect the effects of both natural and artificial selection. This approach has two main drawbacks. First, it can be difficult to move from identifying a marker that shows the signature of selection (generally directional or positive selection) to identifying the gene or genetic element under selection, especially in the absence of a sequenced genome. Second, the scan itself is not informative about the phenotype that is the target of selection (Butlin 2010). This can leave researchers with a black-box scenario: they know that selection has occurred in the past, and they see the result of it in the patterns of variation in the genome, but the mechanism and target of selection are not clear.
Mariac et al. (2011) took an interesting approach to opening the black box, in part by developing AFLP markers that were enriched (by approximately 60–70%) for motifs of MADS-box genes, a family of genes that contribute to a diversity of plant developmental processes, and using these markers to conduct a genome scan. While MADS-box genes are best known for their prominent role in controlling floral morphology (e.g. as articulated in the ABC model; Coen & Meyerowitz 1991), many of the genes in the family also contribute to developmental processes related to how plants respond to environmental conditions, such as emergence from dormancy and initiation of flowering (Aikawa et al. 2010; Horvath et al. 2010). These developmental traits were of particular interest in domesticated pearl millet, where varieties grown in rainier climates tend to flower later, and varieties grown in drier climates tend to flower earlier.
For this study, cultivated pearl millet was sampled from the southern border of Niger to the northern limit of rain-fed agriculture at the Sahara Desert, spanning a pronounced environmental gradient of decreasing rainfall from south to north (plants flower late in the south, early in the north). The authors used a differentiation-based Bayesian approach (Foll & Gaggiotti 2008) to detect the signature of selection and identified two markers that showed signs of divergence across the extremes of the sampled populations. The two markers were highly correlated and confirmed as corresponding to a single MADS-box gene named PgMADS11. The identification of a candidate gene directly from a genome scan was possible because their AFLP markers were highly enriched for the MADS-box gene family; randomly distributed SSRs or AFLPs would generally only allow the researcher to say that something closely linked to the marker was under selection. As the authors note, however, the functional variation at the gene was not identified, and it is still possible that the target of selection is closely linked to the gene, rather than being the gene itself.
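The logic of a differentiation-based scan can be illustrated with a minimal sketch. This is not the Bayesian method of Foll & Gaggiotti (2008) used in the study; it swaps in a much simpler Nei-style per-locus F_ST and ranks loci by it, only to show how one strongly divergent marker stands out against a genome-wide background. The marker names and allele frequencies are hypothetical.

```python
from statistics import mean


def per_locus_fst(freqs):
    """Nei-style F_ST for one biallelic locus.

    freqs: frequency of the same allele in each sampled population.
    """
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled sample
    if h_t == 0:
        return 0.0  # monomorphic locus: no differentiation to measure
    h_s = mean(2 * p * (1 - p) for p in freqs)  # mean within-population value
    return (h_t - h_s) / h_t


def rank_by_fst(locus_freqs):
    """Return (locus, F_ST) pairs sorted from most to least differentiated."""
    scores = {name: per_locus_fst(f) for name, f in locus_freqs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical allele frequencies in a southern and a northern population.
toy_loci = {
    "marker_A": [0.50, 0.55],
    "marker_B": [0.30, 0.35],
    "marker_C": [0.10, 0.90],  # strongly differentiated between extremes
    "marker_D": [0.60, 0.50],
}

for name, fst in rank_by_fst(toy_loci):
    print(f"{name}: F_ST = {fst:.3f}")
```

A real scan would also need a model of the neutral distribution of F_ST in order to decide which loci are genuine outliers; ranking alone does not provide that.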
The other tool used to open the black box was association mapping. Based on the significant AFLP markers identified in the genome scan, a single indel marker was developed to track the alternate alleles of PgMADS11, and the association between the marker and morphological variation was tested in an independent mapping population consisting of 90 inbred lines. Polymorphism in the gene was significantly correlated with both flowering time and spike length (Fig. 1), and these correlations were also significant in the original populations used in the genome scan. The association mapping step is a key aspect of the study, as it allowed the authors to connect a gene showing the signature of selection to the traits that are the possible targets of selection.
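The core of such a marker-trait test can be sketched as follows. This is an illustration only, not the authors' analysis: it assumes fully inbred lines (so each line carries one of two indel alleles), uses a simple permutation test on the difference in mean flowering time, and ignores the corrections for population structure and relatedness that a real association study would need. The function name and the flowering-time values are hypothetical.

```python
import random
from statistics import mean


def permutation_association(trait_a, trait_b, n_perm=2000, seed=0):
    """Permutation test for a difference in mean trait value between two alleles.

    trait_a, trait_b: trait values of lines carrying allele A or allele B.
    Returns (observed absolute difference in means, two-sided p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(trait_a) - mean(trait_b))
    pooled = list(trait_a) + list(trait_b)
    n_a = len(trait_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break any link between allele and trait
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical days to flowering for inbred lines carrying each indel allele.
early_allele = [52, 55, 54, 51, 56, 53]
late_allele = [68, 71, 66, 70, 69, 72]

diff, p_value = permutation_association(early_allele, late_allele)
print(f"difference in means = {diff:.1f} days, p = {p_value:.4f}")
```

The same function could be applied to spike length or any other measured trait, which is why phenotyping breadth matters: an association can only be detected for traits that were actually scored.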
While this association mapping step is not trivial, requiring either the development of an artificial mapping population or at the very least the thorough phenotyping and genotyping of a naturally occurring admixed population, it can potentially be accomplished for many nonmodel systems (Buerkle & Lexer 2008). Moreover, the probability of a successful outcome was enhanced because the candidate gene was identified through a screen enriched for MADS-box genes contributing to development and morphology. This meant that the researchers were not trying to map a gene that was actually associated with, for example, water use efficiency, against a population for which only morphological traits had been measured. Overall, the combination of a targeted genome scan with association mapping exemplifies a powerful approach for dissecting the genetic basis of traits under selection in nonmodel systems.
Just as the approach and techniques in this study are worthy of note, so too are the results. The MADS-box gene identified, PgMADS11, was correlated with flowering time and spike length, and allele frequencies covaried with rainfall within southern Niger. Annual rainfall ranges from a high of approximately 650 mm/year at the southernmost point to approximately 250 mm/year at the northernmost point, and pearl millet can be cultivated across this range without irrigation (Fig. 2). This geographical gradient in annual rainfall potentially mimics environmental change over time, either in the past or in the future. It is possible that changes in allele frequency at the PgMADS11 gene could be important in allowing pearl millet cultivars to survive climate change, a scenario that can be tested either by historical sampling or by tracking allele frequencies in the future. Finally, it will be interesting to determine whether selection on PgMADS11 was local in nature, or whether this signature can be found across the range of cultivated pearl millet in Africa.