The RDB population structure
This study analyzed the structure of genetic diversity in a subdivided bread wheat population-variety named RDB. The sub-populations have circulated for several years in a network of French actors (including farmers and the national genebank) involved in the conservation and use of crop diversity. The goal of these analyses was to provide insights into the history of the populations and to assess the impact of human practices on genetic diversity at the molecular level, in order to guide decisions on the conservation of genetic resources. In this study, we did not analyze quantitative genetic variation of adaptive or economic significance.
We applied the Population Graph method (Dyer and Nason 2004), a network theory-based method, to study inter-population relationships, rather than the FST-based or distance-based methods developed within the theoretical framework of population genetics (Wright 1951; Nei 1972; Excoffier et al. 1992). While both types of methods rely on the covariance structure among all populations, with no assumptions about the underlying evolutionary processes, the Population Graph method accounts for multiple relationships among populations using partial regression coefficients. Nineteen sub-populations (586 individuals) were analyzed using 19 neutral markers. Two main genetic groups of populations (group1 and group2) were detected and found to be connected to each other. These two groups were also detected based on the four VRN1 polymorphisms. The Population Graph topology is expected to strongly reflect the migration model, as shown by a simulation approach using N-island and one-dimensional stepping-stone models (Dyer 2007). The observed topology of the RDB population-variety differed from both the stepping-stone and the N-island models because strong clustering was detected, which points to a more complex migration system. This pattern seemed to be mostly shaped by human activities (in particular by seed diffusion practices). A similar pattern was encountered in natural populations of the Sonoran Desert cactus (Lophocereus schottii L.) subjected to historical vicariance (the splitting of a population into discontinuous parts, here induced by the sea) (Dyer and Nason 2004).
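The idea that a Population Graph keeps an edge only when two populations remain associated after conditioning on all the others can be illustrated with a minimal sketch. The code below is not the procedure of Dyer and Nason (2004), which relies on a statistical test of edge deviance; it assumes a small, hypothetical among-population covariance matrix, derives partial correlations from its inverse, and retains edges with a simple arbitrary threshold. The population names A, B, and C, the matrix values, and the threshold are illustrative assumptions only.

```python
from math import sqrt

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity matrix on the right.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(cov):
    """Partial correlation of each pair of populations given all the others."""
    prec = invert(cov)  # precision matrix = inverse covariance
    n = len(cov)
    return [[1.0 if i == j else -prec[i][j] / sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]

def graph_edges(cov, names, threshold=0.1):
    """Keep an edge when the absolute partial correlation exceeds the threshold.

    The fixed threshold is a stand-in (assumption) for a formal significance test.
    """
    r = partial_correlations(cov)
    return [(names[i], names[j])
            for i in range(len(names))
            for j in range(i + 1, len(names))
            if abs(r[i][j]) > threshold]

# Hypothetical covariance matrix: A and C covary (0.25), but only through B.
cov = [[0.75, 0.50, 0.25],
       [0.50, 1.00, 0.50],
       [0.25, 0.50, 0.75]]
edges = graph_edges(cov, ["A", "B", "C"])
print(edges)
```

In this toy case, A and C show a non-zero marginal covariance, yet no edge is drawn between them because their association is entirely accounted for by B. This is the sense in which the method accounts for multiple relationships among populations, in contrast to pairwise distance-based summaries.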
In a study on a metapopulation of the seagrass Posidonia oceanica in the Mediterranean basin, the authors highlighted the key role of a few populations as hubs for relaying gene flow (Rozenfeld et al. 2008). In the RDB case, five populations contributed to the transition between the two genetic groups and might play an analogous role. However, this comparison should be made with caution because Rozenfeld et al. (2008) used a different network theory-based approach. In our study, the three populations from group2 (JAS04, JOP06, JFB05) were composed of haplotypes from classes II, III, or IV. As haplotypes from class II were very close to haplotypes from class I, almost all alleles were shared between both classes, which could explain their position in the Population Graph (Fig. 3B). Except for one individual found in JFB03, there was thus no evidence that group2 received specific haplotypes or alleles from group1. Two populations of group1 (ALB03A and ALB06B) showed one specific allele from class III that explained their boundary position in the Population Graph. This shared allele could be the footprint of an ancestral common population rather than of recent gene flow between the two groups of populations. Under recent gene flow, we would expect a higher frequency of haplotypes intermediate between the two groups.
Intra-population genetic structure was studied through the haplotype spanning network. A haplotype-based approach was relevant because bread wheat is mainly a self-pollinated species [5–10% outcrossing (Enjalbert et al. 1998; Enjalbert and David 2000)], so recombination is not expected to be frequent. Accordingly, pairwise linkage disequilibrium estimated for each pair of loci over all 19 populations was significant in more than 80% of cases. Haplotype clustering revealed 29 OT, which were not detected using STRUCTURE-like software. Indeed, when we used the INSTRUCT software (Gao et al. 2007) on this dataset, the assignment of OT to the genetic groups was unstable and the likelihood values for the different numbers of ancestral groups assessed were altered (data not shown). As a consequence, the criterion used to choose the optimal number of groups did not show a strong and stable elbow. Haplotype clustering highlighted different population substructures ranging from homogeneous populations (composed of only one haplotype class) to composite populations (composed of up to three haplotype classes). In addition, the global genotype richness (polyclonality) level was 19.4%. Polyclonality has previously been observed in cassava (Manihot esculenta Crantz) landraces (Elias et al. 2000, 2001; Pujol et al. 2005a,b), with values between 29% and 55% associated with an excess of heterozygous genotypes (−0.94 < FIS < −0.37). This excess resulted from a complex system of agricultural management: volunteer plants recruited from soil seed banks often resulted from outcrosses, and the most productive volunteer plants, in general largely heterozygous, were propagated by clonal reproduction. For this reason, heterozygotes occurred at a high frequency. In bread wheat, rare spontaneous cross-pollination can also occur, which could increase heterozygosity. However, after successive generations of self-pollination, heterozygosity decreases. Thus, self-pollination in heterogeneous populations can lead to the maintenance of polyclonal or composite populations with a low level of heterozygotes, as has been shown in natural populations of Medicago truncatula (Siol et al. 2008).
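The decrease of heterozygosity under predominant self-pollination can be made concrete with the classical mixed-mating recurrence for the inbreeding coefficient, F(t+1) = s(1 + F(t))/2, where s is the selfing rate, with equilibrium F = s/(2 − s). The sketch below is only an illustration under that textbook model; the function names are hypothetical, and the selfing rates of 0.90 and 0.95 are simply the complements of the 5–10% outcrossing range cited above.

```python
def inbreeding_trajectory(selfing_rate, generations, f0=0.0):
    """Inbreeding coefficient over generations under mixed mating.

    Uses the recurrence F(t+1) = s * (1 + F(t)) / 2, starting from f0.
    """
    f = f0
    trajectory = [f]
    for _ in range(generations):
        f = selfing_rate * (1.0 + f) / 2.0
        trajectory.append(f)
    return trajectory

def equilibrium_inbreeding(selfing_rate):
    """Equilibrium inbreeding coefficient, F = s / (2 - s)."""
    return selfing_rate / (2.0 - selfing_rate)

for s in (0.90, 0.95):
    f_eq = equilibrium_inbreeding(s)
    # Heterozygosity relative to Hardy-Weinberg expectation is (1 - F).
    print(f"s = {s:.2f}: F_eq = {f_eq:.3f}, relative heterozygosity = {1 - f_eq:.3f}")
```

Under these assumptions, the equilibrium heterozygosity is roughly 10–18% of the Hardy-Weinberg expectation, which is in line with composite populations that carry several haplotype classes but few heterozygotes.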
In the following sections, the practices of the different actors (farmers and genebank curators) are divided into two distinct processes: one acting at the overall scale of the system, that is, seed diffusion, and the other acting locally, at the farm level, that is, reproduction of the seed lot, which is largely dependent on agronomic practices.
Impact of the seed diffusion network on the genetic structure
As far as we know, this is the first interdisciplinary ethnobotanic and genetic study conducted at the level of a single population-variety. Previous studies have pointed out that seeds have a strong symbolic importance for farmers. In most cases, farmers explain that they have been maintaining the same variety for a long time, even if they occasionally replace their own seed entirely or mix it with seed from external sources (Louette et al. 1997; Smale et al. 1999; Badstue et al. 2007), actions which would affect the genetic make-up of populations. Contrary to these situations, the genetic structure found in our study was highly consistent with the SDRNs obtained through interviews: within-SDRN cGD was significantly lower than between-SDRN cGD. Consistency between the rules described as structuring social networks of seed exchange between farming communities and the genetic structure of manioc (Manihot esculenta Crantz) was also recently described in Gabon (Delêtre et al. 2011). In general, several cycles of reproduction are conducted between two seed diffusion events. Recycling seeds from one’s own harvest is the backbone of local seed supply (Perales et al. 2003; Carpenter 2005; Delaunay et al. 2008). This is also what we observed in this network of actors. On average, the 19 populations sampled in this study had been grown for 5.7 generations on the same farm since the previous diffusion event. In comparison, populations were grown for 4.1 to 15 generations in farmer communities in Ethiopia (McGuire 2007). In other words, in our study, 89% of the seed came from the previous harvest of the same farmer. This value is similar to those observed in local farming contexts [80% in farmer communities growing sorghum in Burkina Faso (Delaunay et al. 2008), 53% in farmer communities growing maize in Mexico (Louette et al. 1997)].
Seed diffusion can be considered as a colonization event in the metapopulation model with two basic mechanisms: the ‘migrant pool’ model and the ‘propagule pool’ model (Slatkin 1977). In the seed diffusion process described here, colonization events mainly correspond to the propagule model with the exception of one seed sample (JOP06), which came from seed mixtures (following the migrant model). Even though strong differentiation among subpopulations is expected because of strong founder effects in the propagule model of colonization (Whitlock and McCauley 1990), the fact that we found no evidence of connection between the two SDRNs might indicate that two independent founding effects have occurred in the past. In addition, as bread wheat is mainly a self-pollinated species, the differentiation might be increased by a family group founding effect (Ingvarsson and Giles 1999). This lack of evidence for connection was consistent with the high level of differentiation between the two connected components (SDRN1 and SDRN2: FST = 0.697). Furthermore, the fact that all the populations have been diffused suggested that populations might not yet have achieved equilibrium.
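To give a sense of what a differentiation value such as FST = 0.697 represents, the following minimal sketch computes a heterozygosity-based FST, (HT − HS)/HT, for one locus from allele frequencies in equally weighted sub-populations. This is a simplified, Nei-type formulation chosen for illustration; it is an assumption and not necessarily the estimator used in this study, and the allele frequencies in the example are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), from allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def fst_single_locus(pop_freqs):
    """Heterozygosity-based FST = (HT - HS) / HT for one locus.

    pop_freqs: list of allele-frequency lists, one per sub-population,
    all given in the same allele order. Sub-populations are weighted equally.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    mean_freqs = [sum(pop[a] for pop in pop_freqs) / k for a in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)  # total heterozygosity
    h_s = sum(expected_heterozygosity(pop) for pop in pop_freqs) / k  # mean within
    if h_t == 0:
        return 0.0  # monomorphic locus: no differentiation can be measured
    return (h_t - h_s) / h_t

# Hypothetical frequencies for two strongly diverged groups at a biallelic locus.
print(fst_single_locus([[0.9, 0.1], [0.1, 0.9]]))
```

Values approaching 1 indicate that the groups are close to fixation for different alleles, which is the situation expected after independent founder events followed by predominant self-pollination.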
Thus, the genetic analysis provided new insights into the seed diffusion history and, by extension, into the associated social processes. Relying on information collected through the interviews, it was initially not possible to connect three populations (JEF06, FRP06, ALP05) to any SDRN although we collected seed circulation information back to the 1990s. With the molecular analyses of the population structure, it was possible to assign these three populations to SDRN2, because they showed a pattern similar to that of SDRN2 populations. In addition, because two of them also presented a composite structure, we inferred that the composite structure is relatively old in the history of the RDB population-variety. Because JEF06 was not a composite population and showed no trace of alleles from haplotype class II while showing several satellite haplotypes from class III, JEF probably received a seed lot from an RDB population before the composite pattern occurred in SDRN2. We also showed that haplotypes at low frequency were shared by different populations of SDRN2 (Fig. 6). This result confirmed that these populations were connected by seed circulation. Although a farmer (JFB) from SDRN2 received his RDB population from a single source (ARC) (Fig. 2), we detected that his oldest RDB population (JFB03) was composed of individuals belonging to three classes of haplotypes, including one belonging to class I. This is an argument for a complex ancestral population-variety composed of three main haplotype classes (I–III). However, this hypothesis needs to be considered carefully because only one individual was observed to come from haplotype class I. Furthermore, we showed that only a few specific alleles were shared between both SDRNs. An alternative hypothesis could be that two distinct cryptic varieties with almost the same phenotypic traits are being maintained independently in these two SDRNs.
Impact of human local practices on the genetic structure
We showed that, on average, the genetic diversity observed in SDRN1 was significantly lower than that in SDRN2. According to the information collected during the interviews, populations from SDRN1 (Fig. 2, in blue) come from the formal seed sector. The initial donor of the SDRN1 populations was a breeder. Thus, these populations were initially subjected to a strong homogenizing pressure to meet the distinctness, uniformity, and stability (DUS) criteria of the formal system. Consequently, the CLM genebank sample (CLM03) obtained from this source showed a much lower genetic diversity than most of the other samples. The trend for genebank accessions to have lower genetic diversity than in situ collections has also been highlighted in several papers (see Negri et al. 2009 for a review). In contrast to the populations of SDRN1, the populations of SDRN2 have always been grown on farm, without the DUS constraints and under agricultural practices that differed among farms, so they were subjected to less homogenization.
Demographic size of crop populations is generally highly variable (Rice et al. 1998). In this context, population size could play an important role in the evolution of populations, depending upon the seed quantity obtained after the diffusion event and/or the seed quantity recycled. Generally, actors who practice variety conservation grow their populations on small plots (a few m²), in contrast to others who follow multiplication, isolation, or production practices (field surfaces from 10 to several thousand m²). Genetic drift, particularly in diversified populations with a small demographic size, might reduce the genetic diversity and increase the genetic load. This situation could account for some patterns observed in SDRN1, because five populations out of seven were grown in small plots. However, as mentioned in the previous paragraph, the overall low level of genetic diversity found in SDRN1 could be explained by the historical conservative practices of the formal system. Estimating the effective size, Ne, from the temporal variation of allele frequencies between the two samples available at farm BER resulted in an infinite estimate, because allelic frequency variation was too low. This was associated with a low variation in the haplotype composition of the population between 2003 and 2006, which is consistent with the conservative practices used by BER. Except for JFB05 and JOP06, which followed cultural practices best described as selection, populations in SDRN2 seemed to have a larger size than populations from SDRN1. Estimated Ne based on the JFB03 and JFB06 populations, within SDRN2, was of the same order of magnitude as that of bread wheat populations grown in a dynamic management experiment [104.5 in this study compared with 123.0 after 10 generations of evolution in Goldringer et al. (2001)], while within-population genetic diversity was relatively high in these populations (0.32 and 0.31, respectively, for 2003 and 2006). This trend might be amplified when there was occasional past or recent mixture with other varieties (ARC80 and JOP06, respectively).
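Why a temporal estimate of Ne can be infinite when allele frequencies barely change is easier to see with a worked sketch. The code below uses one common moment-based formulation as an assumption: the standardized variance Fc of Nei and Tajima (1981) with the sampling correction of Waples (1989), Ne = t / (2[Fc − 1/(2S0) − 1/(2St)]). It is not necessarily the estimator applied in this study; the function names, allele frequencies, sample sizes, and number of generations in the example are hypothetical.

```python
import math

def fc_locus(x, y):
    """Standardized variance in allele frequency change (Fc) at one locus.

    x, y: allele frequencies at the first and second sampling dates.
    Alleles absent from both samples are skipped.
    """
    terms = []
    for xi, yi in zip(x, y):
        denom = (xi + yi) / 2.0 - xi * yi
        if denom > 0:
            terms.append((xi - yi) ** 2 / denom)
    return sum(terms) / len(terms) if terms else 0.0

def temporal_ne(loci_t0, loci_t1, generations, n0, n1):
    """Moment-based temporal estimate of effective size.

    loci_t0, loci_t1: per-locus allele-frequency lists at the two dates.
    n0, n1: numbers of individuals sampled at each date.
    Returns math.inf when sampling noise alone accounts for the observed change.
    """
    fc_values = [fc_locus(x, y) for x, y in zip(loci_t0, loci_t1)]
    fc = sum(fc_values) / len(fc_values)
    drift = fc - 1.0 / (2 * n0) - 1.0 / (2 * n1)  # change not explained by sampling
    if drift <= 0:
        return math.inf
    return generations / (2.0 * drift)

# Hypothetical case 1: frequencies unchanged over 3 generations -> infinite estimate.
print(temporal_ne([[0.5, 0.5]], [[0.5, 0.5]], generations=3, n0=30, n1=30))
# Hypothetical case 2: a marked shift in frequencies -> finite estimate.
print(temporal_ne([[0.5, 0.5]], [[0.7, 0.3]], generations=3, n0=50, n1=50))
```

When the observed frequency change is no larger than what sampling error alone would produce, the drift term is zero or negative and the estimate is reported as infinite, which is the situation described for the two BER samples.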
Migration is one of the evolutionary forces that could significantly influence the differentiation within the system. In the case of an open-pollinated species such as maize, pollen-mediated gene flow is important and generally leads to a low level of genetic differentiation, though farmers’ selection on ear type induces stronger phenotypic differentiation among landraces (Pressoir and Berthaud 2003). Because phenotypes are quite distinct between varieties and because wheat is a self-pollinated species, uncontrolled migration among populations is expected to be rare. However, the composite property of some populations of SDRN2 (mainly haplotype classes II and III) and the higher number of haplotypes observed in class III indicated that migration might have occurred in the past, with individuals of haplotype class II migrating into populations of haplotype class III. In addition, we know that haplotype class II is genetically very similar to class I, thus possibly indicating a common ancestral origin. These results describe only the structure of neutral genetic diversity. If a convergent phenotype were also observed among the different haplotype classes, this could explain why farmers continue to grow these different populations under the same name, RDB; a detailed phenotyping of the haplotype classes would help to confirm this point. The low outcrossing rate found in wheat [5–10% (Enjalbert et al. 1998; Enjalbert and David 2000)] is consistent with finding some recombinant individuals, as observed in CLM04 and FRP06. Present at low frequencies, these recombinants illustrate contact with other varieties. This is consistent with two identified practices: as already mentioned, some farmers have grown their RDB populations in mixture with other varieties, while other farmers maintain their populations in collections and grow them in small plots close together, which could result in mixtures or outcrosses at different steps of the reproduction process.
Genetic differentiation (pairwise FST) measured in neutral regions was highly correlated with genetic differentiation measured in VRN-1 genes involved in flowering time (adaptive trait) (Fig. 4). Divergent selection between wheat populations grown for several generations in contrasted sites would have led to specific patterns such as higher FST at genes under selection compared with FST at neutral markers (Vitalis et al. 2001; Rhoné et al. 2010). Thus, the structure of genetic diversity observed seems to be more influenced by actors’ practices than by the short-term environmental conditions where populations have been grown. Different types of selection can be described. The first is negative selection performed by farmers or genebank curators when they remove off-type plants that appear spontaneously in the population in the field. These practices could explain the low rate of OT in the dataset. The second type of selection is positive: for example, the ear-based selection for the RDB ear type [red awnless (JOP06)]. The farmer explained that he received a mixture of different wheat varieties including RDB. He thus decided to select a few ears of the RDB type to initiate a new cycle of multiplication as a pure variety. This selected population showed low genetic diversity (unbiased He = 0.008) with only one class of haplotype detected (class II). Finally, there was another case of positive selection when, in 2001, one farmer (JFB) selected a new derived ear type (red awned) which had appeared spontaneously in his RDB population. He further grew the progeny as a separate population, which he named ‘Rouge du Roc’. This process corresponds to the creation of a new population-variety related to RDB. In 2003, he gave a sample to CLM.