Keywords:

  • cline;
  • domestication;
  • microsatellite;
  • MITE;
  • pearl millet;
  • Teosinte-Branched1

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Materials and methods
  5. Results
  6. Discussion
  7. Acknowledgements
  8. References
  9. Data accessibility
  10. Supporting Information

Unravelling the mechanisms involved in adaptation to understand plant morphological evolution is a challenging goal. For crop species, identification of molecular causal polymorphisms involved in domestication traits is central to this issue. Pearl millet, a domesticated grass mostly found in semi-arid areas of Africa and India, is an interesting model to address this topic: the domesticated form shares derived phenotypes with some other cereals, such as a decreased ability to develop basal and axillary branches in comparison with the wild phenotype. Two recent studies have shown that the orthologue of the maize gene Teosinte-Branched1 in pearl millet (PgTb1) was probably involved in branching evolution during domestication and that a miniature inverted-repeat transposable element (MITE) of the Tuareg family was inserted in the 3′ untranslated region of PgTb1. For a set of 35 wild and domesticated populations, we compared the polymorphism patterns at this MITE and at microsatellite loci. The Tuareg insertion was nearly absent in the wild populations, whereas a strong longitudinal frequency cline was observed in the domesticated populations. The geographical pattern revealed by neutral microsatellite loci clearly demonstrated that isolation by distance does not account for the existence of this cline. However, comparison of population differentiation at the microsatellite and the MITE loci and analyses of the nucleotide polymorphism pattern in the downstream region of PgTb1 did not show evidence that the cline at the MITE locus has been shaped by selection, suggesting the involvement of a neutral process. Alternative hypotheses are discussed.


Introduction

Plant domestication has led, over a span of only a few thousand years, to the parallel evolution of many phenotypic traits in a large number of crop species (Purugganan & Fuller 2009), through conscious and unconscious selection for similar characters of interest. The domestication process is consequently a good model for unravelling the molecular polymorphisms involved in repeated and rapid morphological evolution (Gepts 2004; Ross-Ibarra et al. 2007). Many morphological traits associated with plant domestication are controlled by a limited number of genes (Konishi et al. 2006; Frary et al. 2000). Although several of these genes are now well described (Doebley et al. 2006; Sang 2009), the molecular polymorphisms responsible for the phenotypic variation between domesticated and wild forms remain largely unknown (but see Komatsuda et al. 2007; Konishi et al. 2006; Simons et al. 2006 for some examples). Identifying these causal polymorphisms is crucial to understanding the evolutionary history of crops (Shomura et al. 2008).

Pearl millet [Pennisetum glaucum (L.) R. Br.] is a cereal belonging to the family Poaceae and the subfamily Panicoideae, like sorghum, foxtail millet and maize. As it is drought tolerant and able to grow on low-fertility soil, it is mostly cultivated in the semi-arid areas of the Sahel in Africa and in India, where it is a major food source. Pearl millet is highly outcrossing. Crosses between the domesticated form (P. glaucum ssp. glaucum) and the wild progenitor (P. glaucum ssp. monodii) produce viable and fertile hybrids. Wild pearl millet is found only in the Sahelian zone in Africa, where domestication probably occurred (Brunken et al. 1977). Whether this domestication happened according to a noncentre model (Harlan 1971) or in a single west Sahelian centre (Oumar et al. 2008; Tostain 1992) followed by a possible secondary diversification in the eastern Sahel (Tostain 1992) remains debated. Even today, wild and domesticated forms of pearl millet can be found in sympatry in some places in West Africa, and evidence of gene flow between them has been shown (Marchais 1994; Mariac et al. 2006b). This makes inferring the domestication history of pearl millet from neutral molecular markers challenging. It has been suggested that tracking the genealogical histories of domestication genes as a complement to neutral genes could be a useful approach to study the origin of this crop (Robert et al. 2011) and the role of genetic introgression in the domestication process (Zhang et al. 2009). To date, very few studies aiming to identify domestication genes in pearl millet have been carried out (Clotault et al. 2012; Lakis et al. 2012a; Remigereau et al. 2011).

In maize, Tb1 is a major domestication gene involved in branching architecture and in the determination of inflorescence sex (Doebley & Stec 1991; Doebley et al. 1995). It encodes a putative transcription factor (Cubas et al. 1999) that acts as a growth repressor in organs where its transcript accumulates (Doebley et al. 1997; Hubbard et al. 2002). During maize domestication, selection did not target the coding region of Tb1 itself, but instead the putative cis-regulatory regions upstream of the gene (Clark et al. 2006; Wang et al. 1999). Studer et al. (2011) have demonstrated that a transposable element 63 kb upstream of the coding region in domesticated genotypes enhances Tb1 expression and is therefore responsible for decreased branching relative to wild plants.

As in maize and other grasses such as sorghum and foxtail millet, basal and axillary branching is greatly reduced in domesticated pearl millet in comparison with the wild form, even though the domesticated form generally still has some branches (Remigereau et al. 2011). Whereas branching architecture is mainly controlled by Tb1 in maize, the effect of its orthologue in pearl millet (PgTb1) is minor, explaining 10–18% of the branching variation between wild and domesticated pearl millet (Poncet et al. 2000; Remigereau et al. 2011). Evidence of a selective sweep signature at PgTb1 in the domesticated gene pool has been reported, although the causal polymorphisms have not yet been identified (Remigereau et al. 2011). This suggests that this gene has been a target of human selection during pearl millet domestication. Even though the selective sweep was shown to be stronger in the 5′ untranslated region of the gene, a positive signal was also detected in the 3′ untranslated region (3′ UTR). The presence of a 315-base pair (bp)-long miniature inverted-repeat transposable element (MITE) of the Tuareg family has been reported in the 3′ UTR of PgTb1, 66 bp downstream of the stop codon (Remigereau et al. 2006). MITEs are short (less than 600 bp) nonautonomous transposable elements with no coding potential and are very abundant in plants (Feschotte et al. 2002). They are mostly found in coding or regulatory regions and could be involved in gene regulation (Lisch & Bennetzen 2011; Wessler et al. 1995), in particular as polyadenylation signals (Bureau & Wessler 1994), DNA methylation signals (Yang et al. 2005) and sources of microRNA and short interfering RNA sequences (Piriyapongsa & Jordan 2008). Because the MITE Tuareg in the 3′ UTR of PgTb1 has the capacity to form hairpins (Remigereau et al. 2006), its presence raises the question of its potential role in PgTb1 expression and of whether selection events targeted this gene during pearl millet domestication.

In this study, we analysed the variation in MITE frequencies with respect to the neutral genetic variation observed at microsatellite loci for a collection of domesticated and wild populations sampled across the distribution area of pearl millet in the African Sahel. Specifically, we first analysed the genetic structure of African wild and domesticated pearl millet populations and then tested whether MITE frequencies departed from neutral expectations in domesticated populations. We also checked for a specific selective sweep signature within PgTb1 sequences carrying the MITE insertion.

Materials and methods

Plant material

The sampling consisted of seeds from 35 pearl millet populations, 22 domesticated and 13 wild, collected by the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) and the Institut de Recherche pour le Développement (IRD) or as part of the PLANTADIV ANR project. Sampled populations were chosen to provide the best coverage of the distribution range of wild and domesticated pearl millet in Africa, except for the southernmost part of the continent (Fig. 1). The sampled wild populations were distributed from Senegal to Sudan, and the sampled domesticated populations from Morocco to Tanzania. The geographical location of each sampled population and the number of individuals analysed per population are provided in Table S1 (Supporting information). For three domesticated populations, the geographical coordinates of the sampling location were unknown. To replace these missing data, we used either the coordinates of the centre of pearl millet's cultivation area in the country (for the population CHA3-D) or the centre of the country where the populations were sampled (for KEN-D and TAN-D).

Figure 1. Geographical distribution of the sampled populations. *: populations collected in localities where wild and domesticated forms were both present.

Molecular analyses

DNA extraction

Plants were grown in a greenhouse, and genomic DNA was extracted from fresh young leaves ground mechanically in liquid nitrogen using the NucleoSpin 96 Plant II Kit (Macherey-Nagel) following the manufacturer's recommendations.

MITE genotyping and sequencing

Presence or absence of the MITE Tuareg in the 3′ UTR of PgTb1 was determined by PCR amplification using primers MITE-F and MITE-R (as described in Remigereau et al. 2006) and electrophoretic separation of the PCR products on a 1.5% agarose gel. Amplified fragments without the MITE insert have a length of about 330 bp, while those with the MITE have an expected length of about 645 bp. In total, 1357 individuals were genotyped (21–48 individuals per population). Among genotypes having the MITE insertion, we chose 11 individuals from domesticated populations distributed across the Sahelian area (BEN-D, NIA-D, NIG3-D, CHA1-D, CHA2-D, CAM1-D, CAM2-D, CAR-D, SUD-D, KEN-D and TAN-D) to amplify a 2493-bp fragment surrounding the MITE insertion, including part of the coding sequence (1019 bp) and the 3′ UTR of PgTb1 (1474 bp). Primers used for these amplifications are listed in Table S2 (Supporting information). PCR amplifications were conducted in 30-μL reaction volumes (1X Taq Platinum DNA polymerase buffer (Invitrogen), 0.08 mM MgSO4, 0.2 mM dNTP, 0.2 μM of each primer, 0.75 U of high fidelity Taq Platinum (Invitrogen) and 2% DMSO). The cycling profile consisted of an initial denaturation phase at 94 °C for 4 min, followed by a touchdown step of 10 cycles (94 °C for 1 min, annealing phase of 1 min with temperature decreasing from 62 to 52 °C, 68 °C for 2 min 30 s), followed by 30 cycles at 94 °C for 1 min, 52 °C for 1 min and 68 °C for 2 min 30 s, and a final extension step at 68 °C for 10 min. Amplified PCR products were cloned using the TOPO TA cloning kit (Invitrogen) with the TOP10 E. coli strain and the pCR2.1-TOPO plasmid vector. Cells were grown in LB with kanamycin (50 μg/mL), and plasmids were extracted using the NucleoSpin Plasmid kit (Macherey-Nagel) and sent for sequencing (Cogenics, Beckman Coulter Genomics; primers used for sequencing are in Table S2). Sequences were assembled, checked and edited when required using CodonCode Aligner v 3.7.1. We aligned sequences using ClustalW (Thompson et al. 1994) in BioEdit v7.0.5.3 (Hall 1999).
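
The gel-scoring step can be illustrated with a minimal sketch. It assumes that the assay is read codominantly (one band per allele, so a heterozygote shows both bands) and that bands are matched to the two expected sizes within a tolerance. The function names, the 40-bp tolerance and the example band sizes are hypothetical and are not taken from the study; only the 330-bp and 645-bp expected sizes come from the text.

```python
EMPTY_BP = 330   # expected amplicon without the MITE (from the text)
INSERT_BP = 645  # expected amplicon with the 315-bp MITE (from the text)


def classify_band(size_bp, tolerance_bp=40):
    """Map an estimated band size to 'empty', 'insert' or None (unexpected).

    The tolerance is an illustrative assumption, not a value from the study.
    """
    if abs(size_bp - EMPTY_BP) <= tolerance_bp:
        return "empty"
    if abs(size_bp - INSERT_BP) <= tolerance_bp:
        return "insert"
    return None


def score_individual(band_sizes):
    """Return the number of MITE-bearing alleles (0, 1 or 2), or None if unscorable."""
    alleles = {classify_band(b) for b in band_sizes}
    if not alleles or None in alleles:
        return None  # no band, or a band of unexpected size
    if alleles == {"empty"}:
        return 0
    if alleles == {"insert"}:
        return 2
    return 1  # both bands present: heterozygote


def insertion_frequency(individuals):
    """Frequency of the insertion allele among scorable diploid individuals."""
    scored = [s for s in map(score_individual, individuals) if s is not None]
    if not scored:
        raise ValueError("no scorable individual")
    return sum(scored) / (2 * len(scored))


# Hypothetical gel readings for four plants of one population.
example_population = [[330], [330, 645], [645], [328]]
example_frequency = insertion_frequency(example_population)
```

Unscorable individuals are dropped rather than counted as absences, so that failed amplifications do not bias the estimated frequency downwards.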

Microsatellite genotyping

Sixteen dinucleotide repeat microsatellite loci were selected among those already developed for pearl millet (Allouis et al. 2001; Qi et al. 2001, 2004; Senthilvel et al. 2008) and grouped into 5 PCR multiplexes (Table S3, Supporting information). PCR amplifications were conducted as described in Lakis et al. (2012b). Size separation of the PCR products was carried out on an ABI 3730XL sequencer at the INRA Research Unit Génétique, Diversité et Ecophysiologie des Céréales (UMR 1095), Gentyane platform, at Clermont-Ferrand. Alleles were scored using GeneMapper v4.0 (Applied Biosystems) and checked visually. Among the 16 loci, three (psmp2201, psmp2043 and psmp2267) were discarded, because their profiles showed multiple peaks in several populations and could therefore lead to an ambiguous determination of genotypes. Analyses were consequently conducted with 13 microsatellite loci, and we genotyped a subset of the total sample, that is, 1068 individuals (15–34 individuals per population, Table S1).

Data analyses

Genetic diversity within and among populations at microsatellite loci

Intrapopulation genetic polymorphism was estimated by calculating the allelic richness (Rs, El Mousadik & Petit 1996) using the software Fstat v2.9.3 (Goudet 2001). Estimations of observed (Ho) and unbiased expected heterozygosity (He, Nei 1978) for each population were performed using the software Genetix v4.05.2 (Belkhir et al. 2004). FIS values (Weir & Cockerham 1984) were also estimated for each population and for the domesticated and wild groups separately using Genetix. The 95% confidence interval (95% CI) was computed on the basis of 1000 bootstraps, and departure from Hardy–Weinberg expectations was tested by 10 000 permutations over loci. Pairwise FST values between all pairs of populations and overall FST values for the wild and domesticated groups were also calculated, and their significance was tested by 1000 permutations of individuals among the populations. The allelic richness, the expected heterozygosity and the FIS values were compared between the domesticated and the wild groups, using a Wilcoxon rank sum test implemented in the R statistical software (R Development Core Team 2011).
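
As a rough illustration of the per-locus quantities named above, the sketch below computes observed heterozygosity, Nei's (1978) unbiased expected heterozygosity and a simple inbreeding coefficient for one locus in one population. It is an assumption-laden simplification: FIS is taken here as 1 − Ho/He, which is not the Weir & Cockerham (1984) estimator computed by Genetix, and the function names and toy genotypes are hypothetical.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Proportion of heterozygous individuals; genotypes are (allele, allele) pairs."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def unbiased_expected_heterozygosity(genotypes):
    """Nei's (1978) unbiased gene diversity: 2n/(2n-1) * (1 - sum of p_i^2)."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    return (2 * n / (2 * n - 1)) * (1 - sum_p2)


def simple_fis(genotypes):
    """Simplified inbreeding coefficient, 1 - Ho/He (not Weir & Cockerham's f)."""
    he = unbiased_expected_heterozygosity(genotypes)
    if he == 0:
        raise ValueError("locus is monomorphic; FIS is undefined")
    return 1 - observed_heterozygosity(genotypes) / he


# Toy microsatellite genotypes (allele sizes in bp) for four individuals.
toy_genotypes = [(180, 180), (180, 182), (182, 182), (180, 182)]
toy_ho = observed_heterozygosity(toy_genotypes)
toy_he = unbiased_expected_heterozygosity(toy_genotypes)
toy_fis = simple_fis(toy_genotypes)
```

A positive value indicates a deficit of heterozygotes relative to Hardy–Weinberg proportions, which is the "homozygote excess" reported per population in the Results.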

An analysis of molecular variance (AMOVA, Excoffier et al. 1992) was performed using the software Arlequin v3.5.1.2 (Excoffier et al. 2005) considering two groups: the wild populations on one side and the domesticated populations on the other. Significance of the F-statistics estimators was assessed using 1000 permutations. We tested for the occurrence of isolation by distance between populations, considering wild and domesticated populations separately. We used the software Genepop v4.0.1 (Raymond & Rousset 1995; Rousset 2008) to perform a Mantel test (10 000 permutations) to assess whether the genetic distances between pairs of populations, estimated by FST/(1 − FST), were significantly correlated with the logarithm of their geographical distances, as suggested in Rousset (1997).
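
The isolation-by-distance test described above can be sketched as follows. This is a generic permutation Mantel test on a Pearson correlation, written under stated assumptions: it is not Genepop's implementation (which may use a different statistic), the one-sided P-value convention is a choice made here, and the helper names and the great-circle distance function are hypothetical additions for illustration.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def linearized_fst(fst):
    """Rousset's (1997) linearized differentiation, FST / (1 - FST)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle values after relabelling populations by `order`."""
    k = len(order)
    return [matrix[order[i]][order[j]] for i in range(k) for j in range(i + 1, k)]


def mantel_test(genetic, geographic, permutations=10000, seed=0):
    """Return (r, one-sided P) for the correlation between two distance matrices.

    `genetic` would hold FST/(1-FST) and `geographic` ln(distance) values.
    Population labels of the genetic matrix are permuted jointly over rows
    and columns, which preserves the dependence structure of pairwise data.
    """
    rng = random.Random(seed)
    identity = list(range(len(genetic)))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

Permuting whole populations rather than individual matrix cells is the essential design point: the pairwise values are not independent, so an ordinary regression P-value would be too optimistic.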

Spatial genetic structure of the pearl millet populations

We further characterized the genetic population structure of pearl millet based on the variability of the microsatellite loci using a Bayesian model-based approach implemented in the software tess v2.3.1 (François et al. 2006; Chen et al. 2007; Durand et al. 2009). The algorithm implemented in tess v2.3.1 groups individual multilocus genotypes into a number of clusters (K), while minimizing departure from Hardy–Weinberg proportions and linkage disequilibria within each cluster. A distinctive feature of tess is its ability to account explicitly for spatial patterns when inferring population structure (Durand et al. 2009; François & Durand 2010).

The analysis was performed using the whole data set including the domesticated and wild groups and on each of them separately. Two admixture models were tested: (i) the conditional autoregressive (CAR) model with a linear trend surface (updating the spatial interaction parameter ψ, initially set to the default value 0.6, and using the mean of pairwise distances between individuals for the geographical scale parameter θ); and (ii) a model where any contribution of the spatial information was removed (i.e. constant spatial trend ψ = 0), close to the algorithm implemented in Structure (Falush et al. 2003; Pritchard et al. 2000). For both models, we allowed K to vary from 2 to 15, with 10 replicates for each value to check for convergence. Each individual replicate consisted of 3 × 10⁴ burn-in iterations followed by 7 × 10⁴ iterations. The two models were compared using the deviance information criterion (DIC), a model-complexity penalized measure of how well the model fits the data: models with the lowest DIC values are those which best fit the data. tess also includes a regularization procedure that can lead to a less ambiguous decision than using DIC values regarding the choice of the optimal number of clusters (François et al. 2006). For this optimal number, 90 additional replicates were carried out, and the 10 ‘best’ replicates with the lowest DIC values were analysed. The software clumpp v1.1.2 (Jakobsson & Rosenberg 2007), whose Greedy algorithm can compute a symmetric similarity coefficient between the different replicates (100 random input sequences, G’ statistic), was used to account for label switching and to detect multimodality, that is, the existence of distinct solutions across replicates. Graphical displays of the results were generated using the software Distruct v1.1 (Rosenberg 2004).

MITE and role of selection in domesticated populations

Nucleotide polymorphism in the MITE and the region surrounding the insertion was investigated using the sequences from the 11 domesticated individuals belonging to populations used in the microsatellite polymorphism survey and sequenced for this study (see above). We also used 15 published sequences of this genomic region originating from other domesticated individuals (Remigereau et al. 2011). Altogether, 26 sequences (12 including the MITE insertion and 14 without it) were obtained from domesticated individuals distributed across the whole cultivation area of pearl millet in the Sahel (Table S4, Supporting information). We also used three published sequences (Remigereau et al. 2011) from wild plants having the MITE insertion to compare sequences of the element between wild and domesticated populations (Table S4). Nucleotide variation (nucleotide diversity π, Nei 1987) and the minimum number of recombination events (RM, Hudson & Kaplan 1985) were estimated in the studied sequences. For domesticated individuals, Tajima's D values (Tajima 1989) were computed for the genomic region surrounding the MITE separately for sequences with and without the MITE, to compare the level of departure from the strict Wright–Fisher neutral model between those two groups. π, D and RM values were estimated using DnaSP v5.10 (Librado & Rozas 2009).
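
To make the statistic concrete, the sketch below computes Tajima's D from a set of aligned sequences using the standard formulas of Tajima (1989). It is a minimal illustration, not a substitute for DnaSP: it assumes equal-length sequences with no gaps or missing data, and the function name and the toy alignment are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for aligned, gap-free sequences of equal length.

    Compares the mean number of pairwise differences (k) with the number of
    segregating sites (S) scaled by a1; an excess of rare variants
    (e.g. singletons) gives a negative value.
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(sequences[0])
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating site; D is undefined")

    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / math.sqrt(variance)


# Toy alignment in which every variable site is a singleton.
singleton_alignment = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
singleton_d = tajimas_d(singleton_alignment)
```

In the toy alignment every polymorphism is carried by a single sequence, so D is negative, which is the direction expected after a selective sweep or a population expansion.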

Geographical variation of the MITE frequencies was investigated with multiple linear regression between MITE frequencies and longitude and latitude, using the software Statistica 6.0 (StatSoft Inc). We also tested whether the MITE frequencies significantly departed from neutral expectations following the approach of Beaumont & Nichols (1996). The rationale of this approach is that loci under local directional selection should show a genetic differentiation across populations significantly larger than the level of differentiation observed at neutral loci, while in contrast, loci under balancing selection should show a smaller differentiation (Lewontin & Krakauer 1973). The level of differentiation at the MITE insertion and at microsatellite loci among populations for the domesticated group was estimated using overall FST values. However, differences in levels of heterozygosity and mutational rates and modes make FST values hard to compare between loci and require a standardization procedure as described in Hedrick (2005). We used the program smogd (Crawford 2010) to compute the standardized FST value for each microsatellite locus and for the MITE.
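
The need for standardization can be illustrated with a small sketch of Hedrick's (2005) G′ST, which divides GST by the maximum value it could reach given the observed within-population diversity. The version below is a simplification under stated assumptions: it works from known allele frequencies with equal population weights and applies no sample-size correction, so it will not reproduce smogd output exactly; the function names and toy frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def standardized_gst(pop_freqs):
    """Return (GST, G'ST) for one locus.

    pop_freqs: list of dicts mapping allele -> frequency, one dict per population.
    G'ST = GST / GST_max, with GST_max = (k - 1)(1 - HS) / (k - 1 + HS)
    for k populations (Hedrick 2005).
    """
    k = len(pop_freqs)
    hs = sum(gene_diversity(f) for f in pop_freqs) / k
    alleles = set().union(*pop_freqs)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in pop_freqs) / k for a in alleles}
    ht = gene_diversity(mean_freqs)
    if ht == 0:
        raise ValueError("locus is monomorphic overall; GST is undefined")
    gst = (ht - hs) / ht
    gst_max = (k - 1) * (1 - hs) / (k - 1 + hs)
    return gst, gst / gst_max


# Two populations sharing no allele at a multi-allelic (microsatellite-like) locus.
no_shared_alleles = [{"a": 0.5, "b": 0.5}, {"c": 0.5, "d": 0.5}]
gst_raw, gst_std = standardized_gst(no_shared_alleles)
```

In this toy case the two populations are as different as they can be, yet the raw GST is only one third because each population is itself diverse; the standardized value is 1. This is why a highly variable microsatellite and a biallelic presence/absence locus such as the MITE cannot be compared on raw values.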

Results

Microsatellite genetic diversity within and among populations

We first analysed how the genetic variation was distributed between domesticated and wild groups using an AMOVA (Table 1). Most of the genetic diversity (about 75%) was observed at the intrapopulation level, indicating very high genetic diversity within pearl millet populations. However, about 10% of the genetic variance was distributed between the wild and domesticated groups (P < 10⁻⁵), showing substantial differentiation between the two groups.

Table 1. Analysis of molecular variance (AMOVA). The two groups considered were the domesticated populations and the wild populations

Sources of variation               d.f.   Sum of squares   Variance components   Percentage variation   F-statistics
Among groups                       1      437.8            0.414 Va              9.66                   FCT = 0.097 (a)
Among populations within groups    32     1350             0.641 Vb              14.98                  FSC = 0.166 (a)
Within populations                 2034   6563             3.227 Vc              75.36                  FST = 0.246 (a)
Total                              2067   8350.822         4.28136

(a) P < 10⁻⁵.
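
The percentages and F-statistics in Table 1 follow from the three variance components. The short sketch below recomputes them using the usual hierarchical definitions (FCT = Va/VT, FSC = Vb/(Vb + Vc), FST = (Va + Vb)/VT); the function name is hypothetical, and small differences in the last digit relative to the table are expected because the published components are rounded.

```python
def amova_summary(va, vb, vc):
    """Percentages of variation and fixation indices from AMOVA variance components.

    va: among groups, vb: among populations within groups, vc: within populations.
    """
    total = va + vb + vc
    return {
        "pct_among_groups": 100 * va / total,
        "pct_among_pops_within_groups": 100 * vb / total,
        "pct_within_pops": 100 * vc / total,
        "FCT": va / total,
        "FSC": vb / (vb + vc),
        "FST": (va + vb) / total,
    }


# Variance components as printed in Table 1.
table1 = amova_summary(va=0.414, vb=0.641, vc=3.227)
for name, value in table1.items():
    print(f"{name}: {value:.3f}")
```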

Allelic richness (Rs) ranged from 2.14 to 5.65 alleles per locus, with a mean of 4.89 ± 0.47 for the wild populations (range: from 4.00 to 5.65), and of 3.94 ± 0.73 (range: from 2.14 to 4.98) for the domesticated populations. Gene diversity (He) ranged from 0.52 to 0.68, with a mean of 0.61 ± 0.04, for the wild populations, and from 0.26 to 0.70, with a mean of 0.54 ± 0.10, for the domesticated populations. Domesticated pearl millet exhibited 81% of the total allelic richness and 89% of the total genetic diversity of the wild form, showing the existence of a significantly (P < 0.01) higher genetic diversity in the latter. Almost all populations, except NIG1-D, CHA3-D, CAM1-D, CAM2-D and CAR-W, showed a homozygote excess. This excess was higher (P < 0.001) in wild populations (FIS = 0.33, 95% CI: [0.26–0.41]) than in domesticated populations (FIS = 0.16, 95% CI: [0.10–0.22]). Values of diversity indices and FIS for each population are presented in Table 2.

Table 2. Intrapopulation diversity assessed with microsatellite markers

Population      N    Rs     Ho (SD)       He (SD)       FIS [95% CI]          HW
Domesticated populations
MOR-D           31   3.88   0.46 (0.27)   0.55 (0.21)   0.16 [0.04–0.24]      ***
MAU-D           32   4.8    0.58 (0.15)   0.70 (0.09)   0.17 [0.08–0.23]      ***
SEN-D           32   4.8    0.50 (0.21)   0.61 (0.19)   0.18 [0.09–0.22]      ***
MAL-D           31   3.69   0.48 (0.25)   0.53 (0.20)   0.10 [0.00–0.17]      **
IVO-D           32   3.78   0.42 (0.28)   0.49 (0.28)   0.15 [0.04–0.21]      ***
BUR-D           32   3.52   0.42 (0.27)   0.48 (0.26)   0.14 [0.03–0.21]      ***
BEN-D           32   2.72   0.33 (0.21)   0.42 (0.26)   0.21 [0.08–0.29]      ***
NIA-D           31   3.32   0.44 (0.26)   0.51 (0.17)   0.13 [0.03–0.19]      **
NIG1-D          15   3.9    0.53 (0.23)   0.56 (0.21)   0.05 [−0.09–0.12]     ns
NIG2-D          19   4.32   0.47 (0.27)   0.57 (0.22)   0.18 [0.05–0.25]      ***
NIG3-D          31   4.21   0.49 (0.24)   0.54 (0.22)   0.09 [−0.01–0.15]     **
NIG4-D          29   4.35   0.45 (0.29)   0.58 (0.21)   0.23 [0.14–0.27]      ***
NIG5-D          31   4.65   0.51 (0.22)   0.61 (0.18)   0.17 [0.07–0.22]      ***
CHA1-D          32   4.2    0.46 (0.24)   0.54 (0.22)   0.14 [0.05–0.19]      ***
CHA2-D          31   4.98   0.57 (0.20)   0.62 (0.18)   0.08 [0.00–0.12]      **
CHA3-D          32   2.14   0.28 (0.27)   0.26 (0.25)   −0.08 [−0.18–0.00]    ns
CAM1-D          30   4.23   0.54 (0.22)   0.57 (0.20)   0.06 [−0.03–0.10]     ns
CAM2-D          31   4.64   0.61 (0.20)   0.62 (0.17)   0.03 [−0.06–0.07]     ns
CAR-D           32   3.46   0.43 (0.26)   0.47 (0.25)   0.07 [−0.02–0.13]     *
SUD-D           32   3.82   0.39 (0.17)   0.58 (0.19)   0.33 [0.28–0.40]      ***
KEN-D           30   4.4    0.40 (0.16)   0.63 (0.13)   0.37 [0.25–0.46]      ***
TAN-D           32   2.81   0.17 (0.13)   0.36 (0.20)   0.55 [0.43–0.63]      ***
Average (SD)         3.94 (0.73)          0.54 (0.10)   0.16 [0.10–0.22]
Wild populations
SEN-W           30   4.9    0.46 (0.30)   0.59 (0.26)   0.23 [0.12–0.30]      ***
MAU1-W          30   5      0.39 (0.24)   0.60 (0.24)   0.35 [0.24–0.42]      ***
MAU2-W          31   4.53   0.40 (0.19)   0.61 (0.23)   0.36 [0.24–0.43]      ***
MAU3-W          32   5.35   0.38 (0.20)   0.63 (0.24)   0.39 [0.28–0.46]      ***
MAU4-W          32   4.48   0.49 (0.29)   0.63 (0.23)   0.22 [0.13–0.27]      ***
MAL1-W          32   4.19   0.31 (0.19)   0.57 (0.28)   0.45 [0.33–0.53]      ***
MAL2-W          32   4.37   0.25 (0.46)   0.53 (0.26)   0.53 [0.43–0.61]      ***
MAL3-W          31   5.51   0.49 (0.26)   0.66 (0.28)   0.26 [0.15–0.33]      ***
MAL4-W          32   5.65   0.49 (0.21)   0.68 (0.25)   0.29 [0.18–0.36]      ***
NIG-W           32   4.56   0.33 (0.17)   0.60 (0.24)   0.44 [0.33–0.52]      ***
CAR-W           34   4      0.49 (0.24)   0.52 (0.21)   0.05 [−0.06–0.13]     ns
CHA-W           30   5.16   0.43 (0.20)   0.62 (0.24)   0.30 [0.19–0.37]      ***
SUD-W           30   4.95   0.49 (0.28)   0.63 (0.26)   0.23 [0.13–0.29]      ***
Average (SD)         4.82 (0.51)          0.61 (0.05)
without CAR-W        4.89 (0.47)          0.61 (0.04)   0.33 [0.26–0.41]

N: number of individuals; Rs: allelic richness; Ho: observed heterozygosity; He: unbiased expected heterozygosity (SD: standard deviation). FIS calculated according to Weir & Cockerham (1984), with 95% confidence intervals (95% CI). HW: departure from Hardy–Weinberg expectations (***: P < 0.001, **: P < 0.01, *: P < 0.05, ns: not significant).

Pairwise FST values ranged from 0 to 0.503. Significant departures from 0 at the 1% threshold were observed between all pairs of populations (Table S5, Supporting information), except for the pairs SUD-D/NIG1-D, SUD-D/NIG2-D, CHA2-D/NIG5-D, CAM1-D/CAM2-D and SEN-W/MAU1-W. Genetic structure was detected both in wild and domesticated groups: the overall FST value was 0.16 (95% CI: [0.13–0.20]) for the domesticated group and 0.17 (95% CI: [0.13–0.22]) for the wild group.

Spatial structure of genetic diversity in pearl millet

The Bayesian clustering analysis implemented in tess revealed that, for the three data sets analysed (all populations, wild group only and domesticated group only), the spatial model (i.e. CAR model) provided a better fit to the data than the nonspatial model, which had higher DIC values (Fig. S1, Supporting information). As the results obtained under the two models were similar (data not shown), we only present here those obtained under the spatial model. A first major result of the analysis of the whole data set is the existence of a clear genetic differentiation between wild and domesticated populations, which were clearly assigned to 2 distinct clusters when the allowed number of clusters was set to K = 2 (Fig. 2A). However, admixture between wild and domesticated genomes was observed in some populations (e.g. the domesticated accession from Mauritania and wild accessions from Mali, Niger, Chad and Sudan). One wild population, CAR-W, was clustered with the domesticated populations. For this reason, it was excluded from other analyses, as its status as a wild population could be called into question. The maximum observed number of clusters for pearl millet in Africa was 10. For this value of K, the 10 replicates with the lowest DIC values out of the 90 analysed displayed seven distinct solutions. These distinct clustering results were very similar to each other, still separating domesticated populations from wild populations, and revealing a fine structure inside those two groups (Fig. S2, Supporting information). This fine genetic structure was confirmed when analysing the wild and domesticated groups separately (Fig. 2B and C, Fig. S3, Supporting information).

Figure 2. Population structure estimated by the tess analysis for pearl millet in the Sahel for (A) the whole data set when K = 2, (B) the domesticated populations when K = 7 and (C) the wild populations when K = 7. In each case, only the most frequent solution among the 10 best replicates is displayed (all the solutions among the 10 best replicates, including those shown here, are presented in Fig. S3). Numbers on the right of barplots show how many times the solution was observed. In barplots, each individual is represented by a thin vertical line divided into K coloured segments representing the proportion of the individual's genome assigned to each cluster. On the maps, pie chart sections represent the average proportion of membership to each cluster for each population.

When focusing on the wild group alone, we observed a clear structure of the neutral genetic diversity along the Sahelian area, with the existence of six or seven clusters, according to the two best solutions inferred from the Bayesian approach (Fig. 2C, Fig. S3B, Supporting information). This revealed a geographical structure in wild pearl millet. Results also showed admixture between geographically close clusters, indicating the likely existence of preferential gene flow between wild populations at short distances. This was confirmed by the existence of a strong isolation by distance (IBD) pattern (R² = 0.337, P < 0.004; Fig. S4C, Supporting information), indicating that dispersal and gene flow were geographically restricted.

Genetic diversity in the domesticated gene pool was also geographically structured (Fig. 2B, Fig. S3A). Clustering of populations was clearly dependent on their geographical proximity, even though the level of genetic admixture between domesticated populations, even when they were distant from each other, was much higher than between wild populations. A very weak IBD pattern was detected in domesticated pearl millet (R² = 0.050, P < 0.05) when considering all populations (Fig. S4A, Supporting information). However, we found no significant IBD pattern (R² = 0.003, P = 0.2697) when the two easternmost populations (KEN-D and TAN-D) were excluded (Fig. S4B, Supporting information). This shows that the detected IBD was largely driven by the genetic and geographical distances separating these two populations from the rest of the sample. Thus, in contrast to wild populations, the pattern of neutral genetic diversity in domesticated pearl millet in the Sahelian area of Africa is not structured by an IBD process. This is in agreement with the signal of admixture between distant domesticated populations (Fig. 2B, Fig. S3A).

Polymorphism pattern at the MITE insertion locus

The MITE insertion was almost absent from wild pearl millet populations, except in populations CHA-W, SUD-W and CAR-W (frequencies of 9%, 14% and 27%, respectively) (Fig. 3), which were found in localities where domesticated populations were also present and were the wild populations showing the highest membership coefficients in the domesticated cluster (Fig. 2A). We observed a longitudinal cline of the MITE insertion frequency in Africa for domesticated populations (Fig. 3). The insertion was absent in domesticated West African populations and reached a frequency of 70% in East Africa. The multiple regression analysis showed that MITE frequencies were not correlated with latitude (P = 0.868), but were strongly correlated with longitude (P = 0.001), which explained 58.3% of the variation in MITE frequency. The overall genetic differentiation among domesticated populations was similar for the MITE locus and for the microsatellite markers. Indeed, the standardized FST value estimated on the basis of MITE frequencies in domesticated populations was 0.28. This value was within the range of values estimated for microsatellite markers, which varied from 0.18 to 0.55, with a mean of 0.37 (Table S6, Supporting information).
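
The cline analysis can be illustrated with a simplified, single-predictor least-squares regression of insertion frequency on longitude (the published analysis was a multiple regression on longitude and latitude run in Statistica). The sketch below is an assumption-based illustration: the function name is hypothetical, and the longitudes and frequencies are invented toy values, not the study's data.

```python
def linear_regression(x, y):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the share of
    the variance in y explained by x.
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept, sxy ** 2 / (sxx * syy)


# Invented toy values (NOT the study's data): population longitude in decimal
# degrees and frequency of the insertion allele in each population.
toy_longitude = [-15.0, -5.0, 5.0, 15.0, 25.0, 35.0]
toy_frequency = [0.00, 0.05, 0.20, 0.30, 0.55, 0.70]
toy_slope, toy_intercept, toy_r2 = linear_regression(toy_longitude, toy_frequency)
```

A positive slope corresponds to a frequency increasing from west to east, and the coefficient of determination plays the role of the "58.3% of the variation" reported for longitude in the text.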

Figure 3. Geographical distribution of the MITE insertion in Africa. (A) Frequency of the MITE insertion in the wild and domesticated populations. (B) Linear regression of the MITE frequencies on longitude for domesticated pearl millet.

The MITE sequences found in wild plants were identical to the major haplotype found in domesticated plants (data not shown). Nucleotide polymorphism in the MITE sequence was very low for the whole sample (only one polymorphic site, π = 0.00048). Observation of the region surrounding the MITE showed that the presence of the MITE in the 3′ UTR of PgTb1 in our sequences corresponded to a unique insertion event. Therefore, the presence/absence polymorphism for the MITE insertion defined two independent PgTb1 allele lineages.

The studied 2.5-kb sequence of the region surrounding the MITE insertion included 66 polymorphic sites, 64 of which were singletons (π = 0.00247). A Tajima's D test carried out on this region confirmed the existence of an excess of rare alleles compared to strict neutral expectations. Because the presence of the MITE could influence PgTb1 expression and consequently branching architecture, we wanted to check whether one of the two lineages of PgTb1 alleles could have been more specifically targeted by selection, which would result in a difference in the excess of singletons between them. Tajima's D values for the region surrounding the MITE were similar for sequences with (D = −2.238, P < 0.001) and without the insertion (D = −2.195, P < 0.01), indicating that the patterns of nucleotide polymorphism were similar for the two PgTb1 lineages. This strong similarity between the two observed patterns could not be attributed to recombination, because we detected only one recombination event (RM = 1). As a result, it seems likely that the two PgTb1 lineages were subjected to the same demographic and selection events during the domestication process.
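
To make the statistic concrete, the sketch below computes Tajima's D from a set of aligned sequences using the standard formula (the scaled difference between the mean number of pairwise differences and the number of segregating sites divided by a1). The toy alignments are invented, the function name `tajimas_d` is hypothetical, and the sketch returns only the statistic; it does not reproduce the significance tests reported above.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences without gaps."""
    n = len(sequences)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least four sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no polymorphic sites; D is undefined")

    # k: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Toy alignment in which every polymorphic site is a singleton.
singleton_rich = ["AAAA", "AAAT", "AATA", "ATAA"]
# Toy alignment in which both sites segregate at intermediate frequency.
intermediate = ["AA", "AA", "TT", "TT"]

print(tajimas_d(singleton_rich))  # negative: excess of rare variants
print(tajimas_d(intermediate))    # positive: excess of intermediate variants
```

Applying such a function separately to the sequences with and without the insertion mirrors the comparison described above: similar, strongly negative values in both groups indicate a comparable excess of rare variants in the two lineages.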

Discussion

Diversity and differentiation of wild and domesticated pearl millet

We found that domesticated pearl millet retained 81 to 89% of the diversity of wild pearl millet, and that differentiation between those two groups accounted for about 10% of the total genetic variation. This is in good agreement with values found in other studies using microsatellite markers and carried out on a regional (Mariac et al. 2006a) or a continental scale (Oumar et al. 2008). Comparison with studies using nucleotide polymorphism is less straightforward, as values reported varied from 68% (Clotault et al. 2012) to 84% (Lakis et al. 2012a) of the level of genetic diversity retained in domesticated pearl millet compared with the wild form.

Despite the existence of situations of sympatry between domesticated and wild pearl millet in some regions of the Sahel, these two groups are genetically well differentiated with a relatively low level of admixture, showing that gene flow is low, even in these regions, as already shown by Mariac et al. (2006b). This can be caused by the existence of partial postzygotic reproductive barriers (Amoukou & Marchais 1993) or pollen competition (Robert et al. 1991). However, gene flow may explain the high admixture rates observed specifically in the wild populations from Chad and Sudan, which were collected in locations where domesticated pearl millet is grown. The wild population from Central African Republic (CAR-W) was grouped with the domesticated populations and was collected farther south than the generally assumed distribution area of wild pearl millet (Brunken et al. 1977; Tostain 1992). It is quite possible that this population is a weedy pearl millet population. Indeed, weedy pearl millet displaying intermediate phenotypes between domesticated and wild forms and high levels of genetic introgression by genes from the domesticated gene pool is commonly found in pearl millet fields throughout its cultivation area, at least in the Sahelian region (Mariac et al. 2006b). In any case, it is currently impossible to know whether the observed admixture between domesticated and wild pearl millet populations is a consequence of introgression after domestication, a signal of shared ancestral variation or both.

Genetic structure of wild and domesticated pearl millet

Wild pearl millet is geographically structured, as already shown by Tostain (1992) in his study of pearl millet isoenzymatic diversity. However, the populations of Mali were assigned to several groups in our study, whereas they clustered together on the basis of genetic distances in Tostain's study. This difference could be due to the fact that microsatellite markers are more polymorphic than isoenzyme markers, thus allowing the detection of a finer genetic structure. Additionally, our results showed a pattern of isolation by distance (IBD) in wild populations, strongly suggesting that gene flow between wild populations occurs preferentially at short range. In contrast, the absence of an IBD pattern and the large number of admixed individuals in domesticated pearl millet show that gene flow between domesticated populations is not proportional to geographical distance. The pattern of genetic diversity revealed by the spatial structure analysis of domesticated pearl millet suggests that gene flow can sometimes occur over long distances. This may be the result of seed migration due to human exchanges, which have already been revealed to be important on a regional scale in southwestern Niger (Allinne et al. 2008) and could occur at longer distances in some circumstances. The lack of an IBD pattern could also be explained by departure from migration–drift equilibrium in domesticated populations due to nonrecurring gene flow: because of the irregularity of precipitation during the rainy season in the Sahel between years and localities, exchanges of seeds and seed sources could very well be extremely variable from one year to the next (Allinne et al. 2008). Regardless of the high level of genome admixture, domesticated populations show a clear geographical structure in Africa, with clusters of populations similar to those found previously based on isoenzyme diversity (Tostain & Marchais 1989; Tostain 1992).

Our results contrast with those of another recent study on domesticated Sahelian populations of pearl millet (Stich et al. 2010) in which no geographical structure was found on the basis of microsatellite marker diversity (seven markers in common with our study). An explanation for such a discrepancy could be the differences in sampling schemes. Indeed, that study used a very limited number of inbred lines (1 or 2) derived from each domesticated population. However, as we found that about 75% of the total genetic diversity is observed at the intrapopulation level, we argue it is preferable to have larger samples for each sampling point and avoid using inbred lines derived from populations when investigating population genetic structure in pearl millet because allelic sampling within populations could otherwise be strongly biased.
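
One common way to test for an isolation-by-distance pattern such as the one discussed in this section is a Mantel test, which correlates pairwise genetic and geographical distances and assesses significance by permuting population labels. The sketch below is a minimal one-sided version under that assumption; it is not necessarily the procedure used in this study, the distance matrices are idealised toy values, and the function names are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("distances must vary")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with populations relabelled by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test; returns (correlation, permutation p-value)."""
    n = len(genetic)
    identity = list(range(n))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    as_extreme = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            as_extreme += 1
    return observed, as_extreme / (permutations + 1)


# Toy example: six populations on a line, genetic distance growing with distance.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.05 * abs(a - b) for b in positions] for a in positions]

r, p_value = mantel_test(genetic, geographic)
print(f"Mantel r = {r:.3f}, p = {p_value:.3f}")
```

Under isolation by distance, the correlation is positive and rarely matched by permuted data; the absence of such a signal, as reported here for domesticated populations, is what one would expect when gene flow is not proportional to geographical distance.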

Clinal distribution of the Tuareg MITE insertion

The near absence of polymorphism in the MITE sequences in both wild and domesticated populations indicates a recent insertion. The fact that it is present, even at very low frequency, in wild populations isolated from the domesticated populations and that the MITE sequence is identical in these two groups could mean that the insertion predated the domestication of pearl millet. It is also possible that the insertion of Tuareg in PgTb1 in the domesticated gene pool occurred after pearl millet domestication and that the MITE was introduced in wild populations through introgression from domesticated pearl millet, as gene flow has been shown to occur between wild and domesticated populations (Mariac et al. 2006b). Unfortunately, because of the absence of polymorphism within the sequence of the MITE, it was not possible to use a phylogeographical approach to test these different hypotheses.

A key result concerning the MITE insertion is the existence of a longitudinal cline in the domesticated pearl millet populations, the MITE being absent in West Africa. Several mechanisms can lead to the formation of a cline: isolation by distance (Barbujani 1987; Sokal & Wartenberg 1983), selection (Nagylaki 1975) or, as it has been shown recently, gene surfing during range expansion (Excoffier et al. 2009). Isolation by distance (IBD) tested using microsatellite markers was not observed in domesticated populations, indicating that this mechanism could not be responsible for the formation of the cline observed at the MITE locus, as IBD is expected to affect all neutral loci in the pearl millet genome.

The location of the MITE in the 3′ UTR of PgTb1, a gene probably involved in the branching architecture of pearl millet and bearing the signature of a selective sweep (Remigereau et al. 2011), suggests a possible role in the regulation of PgTb1 for this element, which then could have been the target of selection in domesticated populations. If this was the case, alleles of PgTb1 with the MITE insertion could have been selected positively in East Africa but not in West Africa (or the allele without the MITE could have been selected in West Africa and not in East Africa) for environmental or anthropogenic reasons. Under these conditions, a cline could indeed occur (Nagylaki 1975). Our results did not support this hypothesis. First, the MITE locus showed a level of differentiation similar to that of microsatellite markers. This indicated that local directional selection or balancing selection was probably not involved in the frequency differences for the MITE insertion between domesticated populations. Additionally, despite the existence of a departure from neutral expectations shown by the nucleotide polymorphism pattern in the region of PgTb1 surrounding the MITE insertion, already observed by Remigereau et al. (2011), this departure was similar for sequences with and without the MITE insertion. Altogether, these results did not support the hypothesis of a selection event that would have specifically targeted haplotypes with or without the MITE insertion at the PgTb1 locus. Nevertheless, the role of selection in the geographical distribution of the MITE insertion cannot be ruled out: it has been shown through theoretical simulations that individual loci contributing to quantitative traits such as branching architecture in pearl millet (Poncet et al. 2000; Remigereau et al. 2011) can show levels of differentiation similar to that of neutral markers, even though these quantitative traits are the targets of diversifying selection, and especially when gene flow is important between populations (Le Corre & Kremer 2003). Furthermore, the weak effect of PgTb1 on branching variation could explain why no selection footprint was detected at the MITE locus. The third mechanism likely to produce an allelic frequency cline is allelic surfing: during range expansion, alleles can be fixed due to founder effects at the edge of the expansion wave (Excoffier et al. 2009). The MITE could have already existed in the gene pool of the founder population of domesticated plants, have been introgressed from the wild gene pool after domestication or have inserted itself in PgTb1 after pearl millet domestication. In these three cases, it could have surfed during expansion and reached high frequencies far from its point of origin.

More in-depth studies are needed to definitively reject the hypothesis of selection acting at the MITE insertion locus. Association studies between polymorphisms at PgTb1 (including the presence/absence of the MITE insertion) and phenotypic traits, especially branching architecture, will be needed to confirm the role of PgTb1 in the control of branching and to assess the hypothetical role of the MITE in the variation of domesticated traits. However, even if the MITE did not undergo selection, it could have a role in the regulation of PgTb1 given its ability to form hairpins (Remigereau et al. 2006). Hence, it would be interesting to study the effect of the MITE insertion on the expression of PgTb1. More generally, other genes implicated in the control of branching architecture in pearl millet have not yet been identified, even though it has been found that at least two other loci are involved in the branching variation between wild and domesticated forms (Poncet et al. 2000). A genome scan approach will be useful to identify such genes once the pearl millet genome sequence is available.

Acknowledgements

We thank three anonymous reviewers for their helpful suggestions for the improvement of the manuscript. This work was funded by the Agence Nationale pour la Recherche Scientifique projects PLANTADIV (ANR-07Biodiv-005-04) and TRANSBIODIV (ANR-06Bdiv003-05). YD is the recipient of a PhD grant from Université Pierre et Marie Curie Paris VI.

References

M.-S.R., T.L., A.S., and T.R. designed research. Y.D., M.-S.R., A.S., G.L., and S.S. conducted the laboratory work. Y.D., M.-S.R., and M.C.F. analysed the data. Y.D., M.-S.R., M.C.F., and T.R. wrote the manuscript.

Supporting Information

Filename: mec12139-sup-0001-TableS1-S6-FigS1-S4.pdf (format: application/PDF; size: 1335K). Description:

Table S1 Sampling sites. Nμ: number of individuals genotyped with microsatellite loci; NM: number of individuals genotyped for the MITE insertion. *: populations collected in localities where wild and domesticated forms are both present.

Table S2 Primers used for the study of the MITE insertion.

Table S3 Microsatellite multiplexes. LG: linkage group on the pearl millet genetic map; Labelling: fluorescent dye used for each forward primer; Ta: annealing temperature; Na: number of alleles identified in this study. u: unknown, (imp): imperfect repeats.

Table S4 Origin of the sequences used for the study of the nucleotide polymorphism in the region surrounding the MITE insertion.

Table S5 Pairwise FST values. All values are significant (P < 0.01) except when underlined.

Table S6 Standardized FST values for the microsatellite loci.

Fig. S1 DIC analysis for the whole data set (A), the domesticated group (B) and the wild group (C), with the maximum numbers of clusters K ranging from 2 to 15, for the model with (CAR) and without (NoSpatial) spatiality (error bars: 95% CIs).

Fig. S2 Population structure estimated by the TESS analysis for the whole data set assuming 10 clusters (K = 10). All solutions among the 10 best replicates are displayed. Numbers on the right show how many times a solution was observed. Each individual is represented by a vertical line divided into K colored segments representing the individual's estimated membership in each cluster.

Fig. S3 Population structure estimated by the TESS analysis for pearl millet in the Sahel for the domesticated populations when K = 7 (A) and the wild populations when K = 7 (B). All solutions among the 10 best replicates are displayed. Numbers on the right show how many times a solution was observed. Each individual is represented by a vertical line divided into K colored segments representing the individual's estimated membership in each cluster.

Fig. S4 Isolation by distance tested on the basis of microsatellite markers in domesticated populations when taking into account (A) or not (B) KAN-D and TAN-D, and in wild populations (C).

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.