Keywords:

  • diversity pattern;
  • domestication;
  • geographic variation;
  • introgression;
  • population structure

Summary

  1. Top of page
  2. Summary
  3. Introduction
  4. Materials and Methods
  5. Results
  6. Discussion
  7. Acknowledgements
  8. References
  9. Supporting Information
  • The study of genetic diversity between a crop and its wild relatives may yield fundamental insights into evolutionary history and the process of domestication.
  • In this study, we genotyped a sample of 303 accessions of domesticated soybean (Glycine max) and its wild progenitor Glycine soja with 99 microsatellite markers and 554 single-nucleotide polymorphism (SNP) markers.
  • The simple sequence repeat (SSR) loci averaged 21.5 alleles per locus, with an overall Nei’s gene diversity of 0.77. The SNPs had substantially lower genetic diversity (0.35) than the SSRs. SSR analyses indicated that G. soja exhibited higher diversity than G. max, but SNPs provided a slightly different snapshot of diversity between the two taxa. For both marker types, the primary division of genetic diversity was between the wild and domesticated accessions. Within taxa, G. max formed four groups corresponding to geographic regions in China, and G. soja formed six subgroups. Genealogical analyses indicated that cultivated soybean tended to form a monophyletic clade with respect to G. soja.
  • G. soja and G. max represent distinct germplasm pools. Limited evidence of admixture was discovered between these two species. Overall, our analyses are consistent with the origin of G. max from regions along the Yellow River of China.

Introduction

Plant domestication fundamentally altered the course of human history, prompting the shift from hunter–gatherer to agricultural societies. Domestication is a multifaceted process that is amenable to study by a wide range of disciplines, including archaeology, anthropology, molecular genetics and evolutionary biology. Accordingly, the study of domestication has yielded fundamental insights into early societies, the genes and biological mechanisms that underlie morphological change, and the strength and patterns of selection (Doebley et al., 2006).

Recent genetic and archaeological investigations have also shown that the process of domestication can vary substantially among crop species. For example, genetic studies of maize suggest that it was domesticated only once, from a wild progenitor located in highland Mexico (Matsuoka et al., 2002). After domestication c. 9000 yr ago (Matsuoka et al., 2002), archaeological evidence indicates that cultivated maize dispersed throughout the Americas quite rapidly – that is, within, perhaps, hundreds of years (Pohl et al., 2007). By contrast, evolutionary genetic analyses of barley and rice have demonstrated at least two domestication events for both species (Cheng et al., 2003; Morrell & Clegg, 2007). Moreover, archaeological study of cereal grains suggests that the process of domestication may have taken thousands of years (Fuller, 2007), and perhaps as long as five millennia for rice (Fuller et al., 2009).

While the result for rice is still open to interpretation (Jones & Liu, 2009) and may not be completely compatible with genetic evidence (Zhang et al., 2009), extending the duration of domestication from a rapid to a multimillennial process has important implications for interpreting patterns of genetic diversity between crops and their wild relatives. For example, a domestication process of thousands (as opposed to hundreds) of years provides more opportunities for local domestication, migration among local domesticates and local extinction. Just as importantly, a domestication event of long duration provides an expanded period for potential introgression between the domesticated and local populations of its wild relative(s). Evidence for local introgression between a crop and its wild relatives can be inferred from patterns of genetic diversity – for example, rice (Garris et al., 2005).

Here, we investigate genetic diversity in cultivated soybean (Glycine max) and its wild progenitor Glycine soja. Soybean is cultivated globally, in part because it produces among the highest gross oil output – with the highest protein content – of any vegetable crop (Mohamed & Rangappa, 1992). The weight of cytological, biochemical and molecular evidence supports the domestication of soybean from G. soja, a wild annual species that is native throughout China and parts of Korea, Japan and Russia (Fig. 1). Nonetheless, several aspects of soybean domestication are not well established, which is surprising given its agricultural importance. For example, the location of domestication in China is not yet clearly substantiated. It has been hypothesized that soybean was domesticated in north-eastern China (Fukuda, 1933; Li, 1994), the Yellow River valley of northern China (Vavilov, 1951; Hymowitz & Newell, 1981; Chang, 1989; Zhou et al., 1998; Dong et al., 2004; Zhao & Gai, 2004; Li et al., 2008) and southern China (Gai et al., 2000). It is also unclear as to whether soybean was domesticated more than once, but multiple domestications have been explicitly suggested (Xu et al., 2002).

Figure 1.  The geographic distributions of samples used in this study. Regional sampling is designated by circles, where Glycine soja is represented by the open part of the circle and Glycine max is the closed portion. Each circle represents a different province. The number of samples per province is indicated by the number. The four colored portions divide China into four regions: NER (northeast region), NR (north region), HR (Huanghuai region) and SR (south region). The blue lines represent the Yellow and Yangtze rivers.

Patterns of molecular diversity often yield insights into the location and number of domestication events. Molecular diversity in G. max and G. soja has been examined with a series of markers, including simple-sequence repeats (SSRs), random amplified polymorphic DNA (RAPD) markers and amplified fragment length polymorphisms (AFLPs). To date, these studies have yielded similar insights into the patterning of genetic diversity in G. soja and its relationship to G. max. Typically, genetic diversity clusters by taxon, with a clear differentiation between wild and domesticated taxa (Powell et al., 1996). Within a taxon, the genetic structure of G. max and G. soja typically agrees with geographic location (Dong et al., 2001, 2004; Abe et al., 2003; Xu & Gai, 2003; Li et al., 2008b). For example, Chinese and Japanese G. soja populations form distinct germplasm pools (Hirata et al., 1999; Kuroda et al., 2006), and Asian accessions of G. max (Abe et al., 2003) group in general accordance with planting region and sowing season.

In this study, we investigate genetic diversity in a broad sample of G. max and G. soja. Our study differs from previous studies of genetic diversity in Glycine in three important ways. First, we rely on two types of molecular markers – SSRs and SNPs, which differ in mutational properties (Payseur & Jing, 2009) – and compare results between them. Second, our study differs with respect to the size and extent of the Glycine sample. Our survey includes G. max individuals that represent over 90% of the phenotypic diversity found within the Chinese soybean germplasm collection and G. soja individuals representative of its natural range. Third, we explicitly consider the possibilities of multiple centers of domestication and of admixture and prolonged gene flow between the wild and the cultivated species after the initial domestication event(s).

With SSR data from 99 loci and SNP data from 554 loci (of 738 assayed) typed in a common sample of 303 individuals, we address the following questions: Do G. soja and G. max continue to represent distinct germplasm pools, as in previous studies? Is G. soja geographically structured? If so, can we identify the region or the regions of China in which G. max was domesticated? Is there evidence that the process of domestication included substantial admixture between wild and cultivated populations? Along the way, we also evaluate the relative merits of SNPs and SSRs to address these questions.

Materials and Methods

Plant materials

We sampled a total of 435 accessions representing cultivated G. max (L.) Merr. (321), its wild progenitor species G. soja Sieb. et Zucc (112) and two outgroup species (Fig. 1). The G. max population consisted of 240 landraces and 81 cultivars, including the minicore collection of cultivated soybean in the Chinese National Soybean GeneBank (CNSGB) (248 G. max accessions). The minicore collection represents most of the phenotypic diversity and c. 70% of the molecular genetic diversity of 23 587 cultivated accessions housed in the CNSGB (Wang et al., 2006; Qiu, 2009). Our accessions originate from four large ecological regions: the northeast region (NER), north region (NR), Huanghuai region (HR) and south region (SR) (Fig. 1), which together range from 19.4 to 50.2°N and from 86.3 to 130.2°E. These four regions represent the four major planting areas of soybean in China (Li et al., 2008).

Accessions of the wild progenitor were selected to represent the geographical range of this species from 24.5 to 52.2°N and 100.5 to 141.2°E. Of the 112 G. soja accessions, 73 were from China, 8 from Korea, 9 from Russia and 22 from Japan (Fig. 1). A single accession each of Glycine tomentella Hayata (one of two perennial species found in China) and Glycine falcata Benth. was included as an outgroup. The G. max accessions and 73 Chinese G. soja were obtained from the CNSGB, with the remainder provided by Dr Randall Nelson from the USDA-ARS Soybean Germplasm Collection (University of Illinois, Urbana, IL, USA). Detailed information about each accession is provided in the Supporting Information, Table S1.

Data collection

For both SNP and SSR analyses, DNA was extracted from young leaf tissue of one plant (G. tomentella and G. falcata) or a bulk of young leaf tissue of 20–30 plants (G. soja and G. max) per accession as previously described (Xie et al., 2005). We bulked samples to produce enough DNA for genotyping; bulking is justified because each accession of the minicore collection has been culled for both phenotypic and genotypic homogeneity.

Ninety-nine simple sequence repeats (SSRs) were selected for genotyping, based on their distribution across the genetic linkage map (http://bldg6.arsusda.gov/cregan/soymap.htm). The SSR loci were mapped onto the Williams82 genome sequence (http://www.phytozome.net) with blast (E-value < 10^−10), using the SSR primers as a query. If the best blast hit covered only part of the primer sequence, we extended it to cover the full length. The allele size in Williams82 was then calculated from the boundaries of the extended hit. Overall, the 99 SSR loci were located on 20 integrated genetic linkage groups, covering 1581.8 cM of the soybean genome, with an average genetic distance of 20.0 cM between adjoining loci (Table S2). PCR amplification of SSRs followed Xie et al. (2005). The PCR products were separated on an ABI PRISM 377 DNA Analyser (Applied Biosystems, Foster City, CA, USA). Allele sizes were estimated with an automated sequencer (Applied Biosystems) and inspected manually. When a genotype had multiple (≥ 3) peaks, we treated the observation as missing data. These cases had little overall effect, because they represented only 0.2% of the total genotyping data. In a small proportion of cases, two peaks were identified at four SSR markers (c. 4.0% of the 99 SSR markers); for these, we scored the higher peak after confirmation by a repeat test (ABI PRISM 377 DNA Analyser) and polyacrylamide gel electrophoresis (PAGE).

We assayed 738 SNPs in the complete set of 435 accessions. These SNPs were polymorphic in a set of six diverse G. max accessions and used to build the first transcript map of soybean (Choi et al., 2007). They were chosen based on a design ability rank score > 0.6 and a pre-evaluation of 60 bp of upstream and downstream flanking regions by Illumina (http://www.illumina.com/, San Diego, CA, USA). The upstream and downstream data were accessed at http://bfgl.anri.barc.usda.gov/soybean/.

The SNPs were assayed using the Illumina GoldenGate assay which was performed based on the manufacturer’s protocol and the methods described in Shen et al. (2005). For each SNP, the lowest acceptable score of GenCall and GenTrain were set at 80% and 0.6, respectively, for separating homozygote and heterozygote clusters.

The 738 SNPs were also mapped to the Williams82 genome sequence with blast (E-value < 10^−10) using the SNP flanking sequences as queries; 17 of the 738 SNP markers either overlapped with another marker or had an ambiguous location and were not considered further.

A list of the SNP and SSR loci, along with genetic and physical positions is provided in Table S2.

Analyses of genetic diversity and population structure

Summary statistics were computed for both the SSR and SNP data sets. The statistics included the number of alleles, the proportion of heterozygous individuals in the population and Nei’s gene diversity, as calculated by powermarker 3.25 (Liu & Muse, 2005). In addition, π was calculated on SNP data with DnaSP (Rozas et al., 2003). As sample sizes differed across populations, the number of distinct alleles per sample was estimated by adze (Szpiech et al., 2008), which employs a rarefaction approach to obtain sample-size corrected estimates. Sample-size corrected estimates of allelic richness were calculated with fstat (Goudet, 2001).
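As an illustration of the gene diversity statistic named above, the following minimal sketch computes Nei's gene diversity at a single locus as one minus the sum of squared allele frequencies. It is not the powermarker implementation; the function name, the optional n/(n - 1) correction and the example allele calls are assumptions made for illustration only.

```python
from collections import Counter


def gene_diversity(alleles, unbiased=False):
    """Nei's gene diversity at one locus: 1 - sum(p_i ** 2).

    alleles  -- list of allele calls, one entry per sampled gene copy
    unbiased -- if True, apply the n / (n - 1) small-sample correction
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    h = 1.0 - sum((c / n) ** 2 for c in counts.values())
    return h * n / (n - 1) if unbiased else h


# Hypothetical data: a biallelic SNP and a ten-allele SSR.
snp = ["A"] * 6 + ["G"] * 4
ssr = [150, 152, 154, 156, 158, 160, 162, 164, 166, 168]

print("SNP gene diversity:", gene_diversity(snp))
print("SSR gene diversity:", gene_diversity(ssr))
```

A locus with k alleles cannot exceed a gene diversity of 1 - 1/k, so a biallelic SNP is capped at 0.5 whereas a multi-allelic SSR can approach 1.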

We examined population structure and differentiation with two methods. First, we used two Bayesian Markov chain Monte Carlo approaches, structure 2.1 (Pritchard et al., 2000; Falush et al., 2003) and instruct (Gao et al., 2007). structure minimizes deviations from Hardy–Weinberg equilibrium within an inferred population; by contrast, instruct uses expected genotype frequencies and estimates of selfing rate to make population assignments.

For instruct, one haplotype per line was included in the dataset. Haplotypes were inferred from the structure dataset using phase (Stephens et al., 2001) version 2.1 (Stephens & Donnelly, 2003). For both structure and instruct analyses, we employed the admixture and independent allele frequency models, using a number of clusters (K) ranging from 1 to 8. Five runs were performed for each value of K, without using previous population information. Burn-in time and replication number were both set to 100 000 for each run. Additional parameters in the instruct analyses were set to the default values on the website (http://cbsuapps.tc.cornell.edu/instruct.aspx/). The value of ln Pr(X|K) and its variance, Var(ln Pr(X|K)), were used to identify the appropriate values of K.
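Replicate runs at each K can be tabulated before K is chosen. The sketch below is a hypothetical helper, not part of structure or instruct: it takes the log probability of the data, ln Pr(X|K), from each replicate run, reports the mean and the spread among runs for each K, and lists the gain in the mean from one K to the next. The values are invented, and the within-run variance that structure itself reports is not reproduced here.

```python
from statistics import mean, pstdev


def summarize_runs(lnp_by_k):
    """Return {K: (mean ln Pr(X|K), standard deviation among runs)}."""
    return {k: (mean(v), pstdev(v)) for k, v in sorted(lnp_by_k.items())}


def gains(summary):
    """Increase in mean ln Pr(X|K) when moving from each K to the next."""
    ks = sorted(summary)
    return {k: summary[k][0] - summary[prev][0] for prev, k in zip(ks, ks[1:])}


# Hypothetical ln Pr(X|K) values for five replicate runs at K = 1..4.
runs = {
    1: [-52000.0, -52004.0, -51998.0, -52001.0, -52002.0],
    2: [-48500.0, -48510.0, -48495.0, -48502.0, -48498.0],
    3: [-47200.0, -47230.0, -47190.0, -47210.0, -47205.0],
    4: [-46900.0, -46950.0, -46880.0, -46910.0, -46905.0],
}

summary = summarize_runs(runs)
for k, (avg, sd) in summary.items():
    print(f"K={k}: mean={avg:.1f}, sd among runs={sd:.1f}")
print("gain per step:", gains(summary))
```

A small spread among runs corresponds to concordant replicates; gains that shrink steadily without levelling off correspond to the absence of a clear plateau.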

Our second method of examining population structure was analysis of molecular variance (AMOVA), as implemented in arlequin (http://cmpg.unibe.ch/software/arlequin3/).

Phylogenetic analyses

We constructed two types of phylogenetic trees. Both trees were based on the shared-allele distance among accessions, as calculated by powermarker 3.25 and displayed by mega4 (Tamura et al., 2007). Both tree types were also based on the neighbor-joining algorithm (Saitou & Nei, 1987) implemented in mega, using the G. tomentella and G. falcata accessions as outgroups. The first type of tree used data from all 435 accessions and treated each accession as an operational taxonomic unit (OTU). The second tree grouped accessions into 15 OTUs on the basis of their geographic location and their position in the first type of tree. Accessions from the same geographic region or subregion were grouped as an OTU when they clustered together in the first tree. Both trees were calculated for three data sets – SSRs, SNPs and SSRs + SNPs – resulting in a total of six trees. Confidence on each tree was assessed with 1000 bootstrap replications.
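The shared-allele distance underlying these trees can be sketched as follows. This is one common definition (one minus the proportion of alleles shared between two diploid individuals, averaged over loci) and is offered as an assumption; the exact computation in powermarker may differ. The function names, the treatment of missing data and the example genotypes are hypothetical.

```python
from collections import Counter


def shared_alleles(g1, g2):
    """Number of alleles (0, 1 or 2) shared by two diploid genotypes at one locus."""
    return sum((Counter(g1) & Counter(g2)).values())


def shared_allele_distance(ind1, ind2):
    """1 - proportion of shared alleles, over loci typed in both individuals.

    Each individual is a list of genotypes, one (allele, allele) tuple per
    locus, with None marking a locus that failed to genotype.
    """
    shared = typed = 0
    for g1, g2 in zip(ind1, ind2):
        if g1 is None or g2 is None:  # skip loci with missing data
            continue
        shared += shared_alleles(g1, g2)
        typed += 1
    if typed == 0:
        raise ValueError("no loci typed in both individuals")
    return 1.0 - shared / (2 * typed)


# Hypothetical accessions typed at two SSRs and one SNP.
acc_a = [(150, 150), (200, 204), ("A", "A")]
acc_b = [(150, 152), (200, 204), ("G", "G")]

print(shared_allele_distance(acc_a, acc_b))
```

Computing this distance for every pair of accessions gives the matrix that a neighbor-joining program takes as input.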

We compared the shared-allele distance matrices between SSR and SNP data with a Mantel test, which was based on 1000 random permutations as implemented in MXCOMP within the ntsyspc 2.10j package (New York, NY, USA).
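A Mantel test of this kind can be sketched in a few lines. The code below is a generic permutation version, not the MXCOMP implementation: it correlates the upper triangles of two distance matrices, then permutes rows and columns of one matrix together to build a null distribution. The function names, the one-sided p-value and the small example matrices are assumptions for illustration.

```python
import random
from math import sqrt


def upper(m, order=None):
    """Upper-triangle entries of square matrix m, optionally with rows and
    columns relabelled by the permutation `order`."""
    n = len(m)
    idx = list(order) if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, permutations=1000, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    x = upper(d1)
    r_obs = pearson(x, upper(d2))
    n = len(d1)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # permute accessions in the second matrix only
        if pearson(x, upper(d2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical 4 x 4 distance matrices; d2 is simply d1 rescaled.
d1 = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
d2 = [[2 * v for v in row] for row in d1]

r, p = mantel(d1, d2, permutations=1000, seed=1)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Rows and columns are permuted together because the entries of a distance matrix are not independent, so shuffling individual cells would give a misleading null distribution.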

Results

Diversity among G. soja and G. max samples

We attempted to genotype 435 accessions of wild and cultivated soybean at 99 SSRs and 738 SNPs. Both SSRs and SNPs were culled with respect to quality and failure rates. For the SSR dataset, all 99 loci provided reliable results, but 62 accessions had missing data for nine or more loci; these accessions were removed from further analyses. For the SNP dataset, 167 of the 721 mapped SNP loci were removed: 121 failed in 20% or more of samples, 34 showed apparent heterozygosity in > 20% of samples, suggesting paralogous markers (Fig. S1), and 12 were monomorphic in all accessions. The SNP dataset thus ultimately consisted of 554 SNPs (Table S2). However, 70 accessions had missing data for 55 SNP loci, and these accessions were removed. Thus, the genotyping data resulted in three data sets: the SSR dataset, consisting of 373 accessions scored for 99 SSRs; the SNP dataset, consisting of 365 accessions genotyped for 554 SNPs; and a combined SSRs + SNPs dataset, comprising 303 common accessions scored for 554 SNPs and 99 SSRs.
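The culling rules described here can be expressed as a short filter. The sketch below is an assumed reading of those rules, not the authors' pipeline: the genotype encoding (two-character strings, None for a failed call), the function names and the use of all samples as the denominator for the heterozygosity rate are hypothetical. The final lines check the locus and accession counts reported in the text.

```python
def snp_passes(calls, max_fail=0.20, max_het=0.20):
    """Return True if a SNP locus survives the three culling rules.

    calls -- one genotype per accession, e.g. 'AA' or 'AG'; None = failed call
    """
    n = len(calls)
    failed = sum(c is None for c in calls)
    if failed / n >= max_fail:      # failure in 20% or more of samples
        return False
    het = sum(c is not None and c[0] != c[1] for c in calls)
    if het / n > max_het:           # apparent heterozygosity in > 20% of samples
        return False
    alleles = {a for c in calls if c is not None for a in c}
    return len(alleles) > 1         # drop monomorphic loci


def accession_passes(genotypes, max_missing):
    """Keep an accession only if it has fewer than max_missing failed calls."""
    return sum(g is None for g in genotypes) < max_missing


# Bookkeeping check against the counts given in the text.
mapped = 738 - 17                   # SNPs left after the mapping step
removed = 121 + 34 + 12             # failed + heterozygous + monomorphic
assert removed == 167 and mapped - removed == 554
assert 435 - 62 == 373 and 435 - 70 == 365

print(snp_passes(["AA", "AG", "GG", "AA", "GG"]))
```

For the SSR data, the rule of removing accessions with missing data at nine or more loci corresponds to accession_passes(genotypes, max_missing=9).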

To explore properties of genetic variation, summary statistics were calculated for the various datasets (Table 1). The 99 SSR loci averaged 21.5 alleles per locus, with an overall Nei’s gene diversity of 0.77. The SNPs had substantially lower genetic diversity (0.35) than the SSRs (t-test, P < 0.001). The marker types also had markedly different allele frequency distribution (Fig. S2); for SSRs, most (80.9%) of the alleles were at < 5% frequency, but most (92.0%) SNPs had an overall frequency ≥ 5%. Presumably, the relatively high frequency of SNP markers reflects ascertainment biases.

Table 1.   Summary statistics for Glycine max and Glycine soja populations by single-nucleotide polymorphism (SNP) (554) and simple sequence repeat (SSR) (99) loci

SNP loci
Species    Type      Origin   Sample size  Polymorphic loci  π (1)   Nei’s gene diversity  Heterozygosity (2)  f (3)
G. max                        298          546               0.343   0.337                 0.049               0.854
           Bred      China    65           524               0.322   0.313                 0.050               0.842
           Landrace  China    233          541               0.344   0.338                 0.049               0.856
G. soja                       65           533               0.309   0.301                 0.072               0.763
                     China    41           525               0.314   0.340                 0.067               0.809
                     Japan    14           319               0.259   0.269                 0.059               0.795
                     Korea    5            300               0.242   0.233                 0.113               0.593
                     Russia   5            196               0.185   0.177                 0.013               0.942
Total (5)                     363          554               0.357   0.350                 0.054               0.836

SSR loci
Species    Type      Origin   Sample size  Number of alleles  Nei’s gene diversity  Allelic richness (4)  Heterozygosity (2)  f (3)
G. max                        279          1473               0.687                 10.1 (43)             0.049               0.929
           Bred      China    62           884                0.672                 8.0 (30)              0.056               0.917
           Landrace  China    217          1332               0.682                 8.8 (30)              0.046               0.932
G. soja                       92           1807               0.871                 16.7 (43)             0.153               0.826
                     China    61           1506               0.851                 5.4 (4)               0.155               0.821
                     Japan    15           842                0.807                 5.1 (4)               0.156               0.819
                     Korea    7            542                0.736                 4.5 (4)               0.229               0.730
                     Russia   9            463                0.695                 4.0 (4)               0.073               0.909
Total (5)                     371          2133               0.766                 14.0 (43)             0.075               0.903

  (1) Nucleotide diversity. Sites with alignment gaps or missing data were considered.
  (2) The average proportion of heterozygous individuals in the population (Liu & Muse, 2005).
  (3) f, inbreeding coefficient.
  (4) The number in parentheses is the minimum sample size used for estimating allelic richness.
  (5) The overall estimates are calculated as the average across all loci, whereas variances and confidence intervals are estimated by nonparametric bootstrapping (100 times) across different loci.

A few major themes become apparent when contrasting diversity between G. soja and G. max. First, SSRs indicate that wild G. soja has significantly higher allelic richness, gene diversity and allele numbers than cultivated G. max (t-test, P < 0.01). For example, the number of alleles observed in G. soja (1807) exceeded that of G. max (1473), despite the smaller sample size in G. soja (92 vs 279). After using the rarefaction method to standardize for sample sizes, G. soja still exhibited higher numbers of expected distinct and private alleles than G. max at different sample sizes (Fig. S3).
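The rarefaction idea can be illustrated with the standard expectation for the number of distinct alleles in a random subsample of g gene copies. This is a generic sketch of that formula, not the adze or fstat code; the function name and the two example samples are hypothetical.

```python
from collections import Counter
from math import comb


def rarefied_richness(alleles, g):
    """Expected number of distinct alleles in a random subsample of g copies.

    For each allele with count c among n copies, the chance that it is
    absent from the subsample is C(n - c, g) / C(n, g).
    """
    n = len(alleles)
    if not 0 < g <= n:
        raise ValueError("g must be between 1 and the sample size")
    total = comb(n, g)
    return sum(1 - comb(n - c, g) / total for c in Counter(alleles).values())


# Hypothetical locus: a small wild sample with many alleles and a larger
# cultivated sample dominated by two alleles.
wild = [150, 152, 154, 156, 158, 160, 150, 152, 162, 164, 166, 168]
cultivated = [150] * 18 + [152] * 10 + [154] * 2

g = 10  # common subsample size, no larger than the smaller sample
print("wild:", round(rarefied_richness(wild, g), 2))
print("cultivated:", round(rarefied_richness(cultivated, g), 2))
```

Evaluating both samples at the same g removes the advantage that a larger sample would otherwise have in raw allele counts.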

Second, SNPs provided a slightly different snapshot of molecular diversity between the two taxa. The numbers of distinct and private alleles expected in G. soja were higher than in G. max (Fig. S3), as was found with SSR data. However, genetic diversity, as measured both by Nei’s gene diversity and by π, was lower in G. soja (0.301 and 0.309, respectively) than in G. max (0.337 and 0.343, respectively) (Table 1).

The information from SSR and SNP markers was concordant in some respects, however. For example, both indicate that China had the highest gene diversity within G. soja followed by Japan, Korea and Russia (Table 1). Both marker types also provided similar insights into the partitioning of diversity based on AMOVA analyses (Table 2). The proportion of variance caused by differences between species was 9.7–10.4% for SSRs and 14.3–15.2% for SNPs. For both, the largest component of variation was among individuals within population (SSRs, 73.9–84.1%; SNPs, 65.7–78.4%).

Table 2.   Analyses of molecular variance (AMOVA) based on simple sequence repeat (SSR) (99 loci) and single-nucleotide polymorphism (SNP) (554 loci) analyses

Percentage of variation and 95% confidence intervals (%): AG, among groups; AP, among populations within groups; AI, among individuals within populations; WI, within individuals. A dash marks a cell that does not apply.

Sample   Number of groups                 Number of populations   AG            AP            AI            WI
                                          Total  G. soja  G. max  SSR    SNP    SSR    SNP    SSR    SNP    SSR    SNP
Total    2 (G. soja, G. max)              11     7        4       10.4   15.2   4.6    5.9    73.9   65.7   11.2   13.3
Total    2 (G. soja, G. max)              8      4        4       9.7    14.3   4.6    6.3    74.5   66.2   11.2   13.3
Total    1                                2      1        1       -      -      11.5   16.5   77.3   70.2   11.3   13.3
G. max   1                                4      -        4       -      -      5.0    7.2    84.1   78.4   10.9   14.4
G. soja  4 (China, Korea, Japan, Russia)  7      7        -       5.8    7.2    1.5    2.4    75.6   66.8   17.2   23.7
G. soja  1                                7      7        -       -      -      5.2    7.3    77.3   68.5   17.6   24.3
G. soja  1                                4      4        -       -      -      6.8    8.7    76.1   67.7   17.1   23.6

Population structure

structure vs instruct. We applied two Bayesian approaches – structure (Pritchard et al., 2000; Falush et al., 2003) and instruct (Gao et al., 2007) – to investigate genetic clustering among G. max and G. soja accessions. Each approach was applied to SSR data alone, SNP data alone, and combined SSR + SNP data. The analyses using structure did not produce a clear ‘plateau’: the estimated log probability of the data, ln Pr(X|K), increased gradually as K increased (Fig. S4). The variance, Var(ln Pr(X|K)), increased steadily from K = 1 to K = 4 (SSR dataset) or K = 5 (SNP dataset), with only slight changes at higher K-values. For SSR data, most (83.4%) accessions were assigned to a population at K = 6, and this subdivision seemed biologically sensible for selfing soybean (see the section ‘structure analyses among datasets’). Hence we selected K = 6 as the optimal cluster number. For SNPs we selected K = 5 because only 57.1% of accessions could be assigned to a single population when K = 6, but 76.3% of accessions could be assigned with K = 5. instruct suggested the same number of populations for the SSR (K = 6) and SNP (K = 5) datasets (data not shown).

The results were highly concordant among runs within both the structure and the instruct analyses, and thus results are shown for a single run. At K = 2, the population structure inferred by the two approaches was similar (Fig. 2a,b); both first differentiated G. soja and G. max. However, the assignment of accessions differed slightly between the two approaches (Table S1). For example, structure split cultivated soybean into four populations and wild soybean into two populations, but instruct split cultivated soybean into three populations and wild soybean into three populations.

Figure 2.  Population structure inferred by Bayesian clustering approaches based on simple sequence repeat (SSR), single-nucleotide polymorphism (SNP) and SSR + SNP data, respectively. (a) Total accessions using structure; (b) total accessions using instruct; (c) Glycine soja without inferred hybrids using structure. Each individual is shown as a thin vertical line partitioned into K colored components, representing inferred membership in K genetic clusters. The top row (a and b) provides the species name and the bottom row (a, b and c) indicates geographic region. NER, northeast region, China; NR, north region, China; HR, Huanghuai region, China; SR, south region, China; C, China; R, Russia; K, Korea; J, Japan.

Overall, the differences in assignment seemed to be relatively minor, because the consistency of assignment was very high for most groups. For example, we calculated the frequency of assignment between structure and instruct based on SSR, SNP and SSR + SNP data (Table S3). For the SSR data, 100% of NER and NR structure-inferred accessions were assigned by instruct into the NER and NR clusters. Similarly, 97.3% of the HR and 99.0% of the SR accessions inferred by structure were assigned into the HR + SR cluster identified by instruct.

The greatest difference between the structure and instruct analyses was in the number of admixed individuals. structure identified far more admixed individuals (78), whereas the number of unassigned accessions was much lower in instruct analyses (8). Because the structure results more closely fit the geographical distribution and the results of previous studies (Li et al., 2008; Wen et al., 2008), we chose to focus on them in more detail.

structure analyses among datasets. The structure results varied somewhat among the three data sets (Fig. 2a,b). For SSRs alone, the two taxa were clearly delineated at K = 2. Each additional cluster delineated geographic regions (Fig. 3): at K = 3, G. max accessions from south China separated from north China; at K = 4, accessions from northeast China were separated from the Yellow River region; at K = 5, G. soja accessions split into two clusters (China vs neighboring countries); finally, at K = 6, the K-value with the highest likelihood, G. max separated into two clusters along the Yellow River. Thus, the cultivated accessions ultimately grouped into four clusters that were largely concordant with major geographic regions in China, including NER, NR, HR and SR.

Figure 3.  Schematic of the clustering procedure used to infer population structure with structure, based on simple sequence repeat (SSR), single-nucleotide polymorphism (SNP) and SSR + SNP data for Glycine max and Glycine soja. NER, northeast region, China; NR, north region, China; HR, Huanghuai region, China; SR, south region, China; C, China; R, Russia; K, Korea; J, Japan.

The structure analyses of the SNP and SNP + SSR datasets agree with the SSR data in most respects – that is, G. max is differentiated into four regional groups, and G. soja is clearly separated from G. max. However, the group delineation with K = 2 was not primarily along taxonomic lines. This initial delineation separated G. max from a group that included G. soja and part of the NR group of G. max (the other NR accessions were defined as unassigned accessions), thus suggesting either introgression between G. soja and NR or recent shared ancestry.

The structure analyses provided limited evidence of admixture between G. soja and G. max (Fig. 4). For all three datasets, some accessions labeled as G. max contained an appreciable component of diversity that was assigned to the wild gene pool. For the SSR data set, for example, the accessions could be divided into sets with low, medium and high assignment probabilities to the G. soja gene pool. The low set consisted of 264 G. max and two G. soja accessions with an ancestry coefficient ≤ 0.27. (Here ‘ancestry coefficient’ is defined as the inferred proportion of membership in the G. soja gene pool when K = 2.) The high set included 61 accessions (60 G. soja and one G. max) with ancestry coefficients ≥ 0.88. The middle set, which is the most interesting because it may represent hybrids or introgressed material, consisted of 44 accessions (15 from G. max and 29 from G. soja) with ancestry coefficients between 0.33 and 0.83. This middle set contained more accessions with the SNP dataset (60 G. max and 14 G. soja accessions) and with the SNP + SSR dataset (43 G. max and nine G. soja accessions). Thus, while Bayesian analyses clearly delineate between taxa, these analyses also suggest that c. 20% of wild and domesticated accessions are either poorly differentiated or owe their origin to admixture.
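The three-way grouping by ancestry coefficient can be written as a simple threshold rule. The sketch below uses the SSR cut-offs quoted in the text (≤ 0.27 and ≥ 0.88) as defaults; the function name, the treatment of values falling in the gaps between the reported ranges and the example coefficients are assumptions, since in the study the boundaries were read from gaps in the ranked distribution.

```python
from collections import Counter


def ancestry_group(q_wild, low=0.27, high=0.88):
    """Classify an accession by its inferred membership in the wild gene pool."""
    if not 0.0 <= q_wild <= 1.0:
        raise ValueError("ancestry coefficient must lie in [0, 1]")
    if q_wild <= low:
        return "low"
    if q_wild >= high:
        return "high"
    return "middle"  # candidate hybrids or introgressed material


# Hypothetical (labelled taxon, ancestry coefficient) pairs.
accessions = [
    ("G. max", 0.02), ("G. max", 0.10), ("G. max", 0.45),
    ("G. soja", 0.95), ("G. soja", 0.60), ("G. soja", 0.99),
]
tally = Counter((taxon, ancestry_group(q)) for taxon, q in accessions)
for key, count in sorted(tally.items()):
    print(key, count)

# Bookkeeping check for the SSR set sizes reported in the text.
assert (264 + 2) + 61 + 44 == 371
```

Cross-tabulating the group against the labelled taxon, as in the tally above, is what identifies accessions whose inferred ancestry disagrees with their label.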

Figure 4.  Distinction of wild and cultivated soybean, expressed as individual ancestry to the wild gene pool in structure analyses assuming two populations. Ancestry was ranked for each individual, and the ranks are plotted against the ancestry in the wild gene pool. The arrows indicate small gaps in the distribution (see text for details). Glycine max, circles; Glycine soja, crosses. SSR, simple sequence repeat; SNP, single-nucleotide polymorphism.

To detect population structure within G. soja, we removed the putative hybrids (Fig. 4) and reanalyzed these G. soja datasets with structure. The SSR, SNP and SSR + SNP datasets yielded similar results (Fig. S4b). We found that K = 6 converged well and showed the highest average likelihood among runs of the program for all three datasets. The accessions from different regions (except NER) tended to form six distinct clusters, corresponding to their geographical origins (Japan, Korea, Russia and NR, HR and SR in China) (Fig. 2c). This pattern demonstrates that geographical genetic differentiation exists in wild soybean (G. soja).

Genealogical analyses

To gain insight into potential locations of G. max domestication, we constructed a neighbor-joining tree, based on individual accessions, with G. tomentella and G. falcata as outgroups (Fig. 5a). Regardless of the dataset examined, with few exceptions, cultivated soybean tended to form a monophyletic clade with respect to G. soja. Moreover, accessions within G. max tended to form subpopulations corresponding to geographic origin, but there was overlap, particularly among accessions from the NR and HR regions with SSR data, consistent with the structure analyses (Fig. 2a).

Figure 5.  Neighbor-joining tree of Glycine soja and Glycine max rooted with Glycine tomentella and Glycine falcata based on shared-allele pairwise distances. (a) Trees for individual soybean accessions. Colored symbols indicate the inferred genetic cluster from structure analyses. SSR, simple sequence repeat; SNP, single-nucleotide polymorphism. (b) Trees for operational taxonomic units (OTUs) of individual soybean. The percentage bootstrap support is indicated at each node. Abbreviations for OTUs are: c, cultivated soybean (G. max); W, wild soybean (G. soja); NER, northeast region, China; NR, north region, China; HR, Huanghuai region, China; SR, south region, China; EP, east part; WP, west part; NP, north part; SP, south part.

To better assess the monophyly of G. max and relationships among geographic regions, we collapsed clades to form OTUs. We pooled the accessions into 13 (SNPs and SSRs + SNPs) or 15 (SSRs) OTUs based on their geographic origin (latitude and longitude), their position in Fig. 5(a) and their assignment in structure analyses (see Table S1). The OTUs comprised 3–20 accessions. This approach again resulted in a monophyletic grouping of G. max (Fig. 5b), but also revealed that wild accessions from the HR region (SSR analyses) or the NR + HR cluster (SNP analyses) are closest phylogenetically to the G. max clade. Interestingly, structure analyses of SNP data assigned NR cultivated accessions into G. soja clusters at K = 2 (Fig. 3). These patterns suggest that the NR accessions best represent early domestication germplasm.

In most trees, the deepest split separated the two species (Fig. 5). Within species, the OTUs showed clear geographic patterns. Within G. max, OTUs from NR split first. Within G. soja, populations from China were more closely related to G. max than were populations from Japan or Korea. Although not entirely consistent among data sets, phylogenetic analyses tended to suggest that wild soybeans from NR and HR, both of which are along the Yellow River, were genetically most closely related to cultivated soybeans.
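The trees in Fig. 5 are built from shared-allele pairwise distances. As a rough illustration of what such a distance measures, the sketch below computes one minus the mean proportion of alleles shared across loci for pairs of diploid genotypes. The function names, the genotype encoding and the toy accessions are assumptions made for illustration; they are not the software or data used in the study.

```python
from collections import Counter


def shared_allele_distance(geno_a, geno_b):
    """Return 1 - mean proportion of shared alleles over loci scored in both.

    Each genotype is a list with one entry per locus: a 2-tuple of alleles
    for a diploid call, or None for missing data (hypothetical encoding).
    """
    shared_total = 0.0
    loci_used = 0
    for locus_a, locus_b in zip(geno_a, geno_b):
        if locus_a is None or locus_b is None:
            continue  # skip loci missing in either accession
        # Multiset intersection counts alleles in common (0, 1 or 2).
        shared = sum((Counter(locus_a) & Counter(locus_b)).values())
        shared_total += shared / 2
        loci_used += 1
    if loci_used == 0:
        raise ValueError("no loci scored in both accessions")
    return 1.0 - shared_total / loci_used


def distance_matrix(genotypes):
    """Pairwise distance matrix for a dict of name -> genotype."""
    names = sorted(genotypes)
    matrix = [
        [shared_allele_distance(genotypes[a], genotypes[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Toy data only: two SSR loci (allele sizes) and one SNP locus.
toy = {
    "max_1": [(150, 150), (200, 200), ("A", "A")],
    "max_2": [(150, 150), (200, 204), ("A", "A")],
    "soja_1": [(160, 162), (210, 210), ("G", "G")],
}
names, dist = distance_matrix(toy)
```

A matrix of this kind is the usual input to a neighbor-joining routine; accessions that share no alleles at any scored locus sit at the maximum distance of 1.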

Discussion

Analysis of genetic diversity in domesticated crops and their wild ancestors is typically used for one of three purposes. The first is to identify distinct genetic groups for retention of germplasm (Agrama et al., 2009). The second is to identify the genes that underlie important phenotypic and genetic shifts during domestication and breeding, using the approach of selective sweep mapping (Vigouroux et al., 2002; Wright et al., 2005; Ross-Ibarra et al., 2007; Chapman et al., 2008). The third is to infer aspects of the history and timing of domestication. Here, we have assayed genetic diversity in a broad sample of domesticated soybean (G. max) and its wild progenitor (G. soja) with this third purpose in mind – that is, to provide additional insights into soybean domestication – but the breadth of our study ensures that it is also useful for analyses of germplasm collections.

Our study uses both SSRs and SNPs, and our sampling is much broader than that of previous studies (Matsuoka et al., 2002; Becquet et al., 2007). Nonetheless, our data, like previous data (Powell et al., 1996; Kuroda et al., 2006), suggest that the primary division of genetic diversity is consistently that between wild (G. soja) and domesticated soybean (G. max). For example, structure and instruct analyses on all datasets clearly delineate wild and cultivated germplasm at K = 2, and AMOVA analyses also clearly differentiate between wild and cultivated accessions. Moreover, phylogenetic analyses of both individual accessions (Fig. 5a) and collapsed OTUs (Fig. 5b) tend to suggest that the cultivated germplasm is monophyletic. Based on these lines of evidence, we favor the interpretation that soybean, like maize (Matsuoka et al., 2002), barley (Badr et al., 2000), pearl millet (Oumar et al., 2008), emmer wheat (Ozkan et al., 2002) and einkorn wheat (Heun et al., 1997), may result from a single domestication event.

If this inference is correct, soybean differs from other species for which genetic data provide compelling evidence of multiple domestication events (Londo et al., 2006; Morrell & Clegg, 2007; Sang & Ge, 2007; Aguilar-Melendez et al., 2009). In this context, it should be emphasized that inferential methods are imperfect, because simulations indicate that multiple domestication events can lead to monophyletic clustering of domesticated accessions under some conditions (Allaby et al., 2008; Ross-Ibarra & Gaut, 2008). Thus, a pattern of monophyly could provide a false signal of a single domestication event, and there may be a bias toward concluding there has been a single domestication event even when it is untrue. Nonetheless, our inference of a single domestication event is consistent with most previous studies of soybean (Xu et al., 1986; Zhu et al., 1995; Zhou et al., 1998; Gai et al., 2000), except one based on a modest sample of chloroplast DNA SSRs (cpSSRs), which suggested that the cultivated soybean originated independently in different regions from different wild gene pools (Xu et al., 2002).

Admixture and geographic subdivision

Because our study relies on more genetic markers than previous studies, our data provide more potential for insight into geographic delineations within species and hybridization between species. With regard to genetic subdivisions within species, G. max clustered by geographic location; G. max in China divides into the geographic regions NER, NR, HR and SR. This is a coarser clustering than inferred previously from a study that used fewer SSR markers (59) but more landrace accessions (1863) (Li et al., 2008). The latter identified seven clusters, representing roughly the clusters inferred here, except that four separate clusters were inferred within the geographical region of SR, and these SR subclusters reflected differences in sowing season. Thus, increasing sample size may be beneficial for inferring fine-tuned geographic structure (Morin et al., 2004).

We infer six genetic subgroups within G. soja. These six clusters separate geographically, corresponding to Japan, Korea, Russia and three distinct regions in China. This pattern was also discovered in a previous study (Wen et al., 2009). AMOVA analyses at 60 SSR loci and eight morphological traits with 196 Chinese G. soja accessions also showed that significant variation exists among northeast China, the Huanghuai Valleys and southern China. The lone exception was the Northeast region (NER) of China, for which accessions were mainly assigned to the mixed cluster and to the HR, NR and Korea subgroups. This may reflect the small number of accessions from NER (six for SSR, five for SNP and three for SSR + SNP analyses).

The structure and instruct analyses provided slightly different insights into the extent of potential hybridization between wild and cultivated soybean. structure suggests that fully 20% of our accessions are admixed, which may indicate extensive post-domestication hybridization between species. By contrast, instruct assigns only 0.5% of individuals as hybrid, based on the SSR data set, which is more similar to the measured natural hybridization rate of 0.73% (Nakayama & Yamaguchi, 2002). Potential hybrid individuals have been noted in the field, and sometimes these hybrids are considered to belong to an intermediate evolutionary species, Glycine gracilis (Skvortzow, 1927; Fukuda, 1933; Chen & Nelson, 2004), but others have considered them as hybridization products of G. soja and G. max (Hymowitz, 1970).

Almost all unassigned accessions originated from the region from 30 to 40°N latitude in China, but along a wide longitudinal swath. We thus examined morphological characteristics of NR accessions that were collected from the 34 to 40°N region and assigned into the G. soja cluster when K = 2 with SNP data. Based on catalog descriptions of seed color, 100-seed weight, growth habit and stem termination (Wang, 1982; Chang & Sun, 1991; Chang et al., 1996), most of these exhibit ancestral traits, including: 72% with black and 22% with bicolor or green seed coat color, 100% with small seed size (100-seed weight < 12 g), 78% with viney (or semi-viney) growth habit and 89% with indeterminate (or semi-determinate) stem termination. Thus, these accessions seem to show evidence of admixture.

SNPs vs SSRs

This study utilized extensive data from both SNP and SSR markers and hence provides an opportunity to carefully assess the relative utility of these two marker types. Overall, SNPs had lower resolving power for detecting population structure. For example, SSR data yielded six clusters that were consistent with geographical origin, but SNPs revealed only five clusters without resolving groups expected to be clearly differentiated (e.g. Chinese vs Russian populations and Japanese vs Korean populations within G. soja). Despite these differences, SSR and SNP analyses still yielded similar population structure within the species, especially within G. max, similar fractions of diversity attributable to various hierarchical components of population structure (Table 2) and similar phylogenetic information, as measured by Mantel tests on the pairwise shared-allele distance (r = 0.505, P = 0.001).
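A Mantel test of the kind mentioned above asks whether two distance matrices over the same accessions are correlated, with significance judged by permuting the accession labels of one matrix. The following is a minimal sketch under stated assumptions: a Pearson correlation on the upper triangles, a one-sided permutation test, and toy 4 × 4 matrices invented for illustration. It does not reproduce the software, settings or data of the study.

```python
import math
import random


def _upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after relabeling rows/columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p) for two symmetric distance matrices of equal size.

    p is the one-sided permutation p-value for a positive correlation.
    """
    rng = random.Random(seed)
    x = _upper_triangle(d1)
    r_obs = _pearson(x, _upper_triangle(d2))
    n = len(d1)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # relabel accessions in the second matrix
        if _pearson(x, _upper_triangle(d2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy matrices only (hypothetical SSR-based and SNP-based distances).
ssr = [[0.0, 0.2, 0.7, 0.8],
       [0.2, 0.0, 0.6, 0.9],
       [0.7, 0.6, 0.0, 0.3],
       [0.8, 0.9, 0.3, 0.0]]
snp = [[0.0, 0.1, 0.5, 0.6],
       [0.1, 0.0, 0.4, 0.6],
       [0.5, 0.4, 0.0, 0.2],
       [0.6, 0.6, 0.2, 0.0]]
r, p = mantel(ssr, snp)
```

Whole rows and columns are permuted together, rather than individual cells, because the pairwise distances within a matrix are not independent of one another.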

Some previous studies have also found that SSR markers perform better at clustering germplasm into populations than SNP markers (Rosenberg et al., 2003; Hamblin et al., 2007; Payseur & Jing, 2009). In our case, it seems that there are two reasons for the discrepancy in the power to resolve populations. The first is the level of genetic diversity. The number of observed SNP alleles (1108) was only half that of SSR markers (2133), despite assaying > 5 times as many SNP loci (554 vs 99). The second, as noted previously (Morin et al., 2004; Kauwe et al., 2005), is the frequency of distinct alleles. The SNP markers used in this study were discovered in only six G. max cultivars (Choi et al., 2007) and hence most of our alleles were neither specific to G. soja nor rare within populations. Hence, population structure was probably more poorly resolved because common alleles are more likely to be shared among populations.
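The allele counts quoted above can be turned into per-locus averages with a few lines of arithmetic. The sketch below also shows why a biallelic SNP cannot carry as much diversity as a multiallelic SSR, using the uncorrected form of gene diversity, one minus the sum of squared allele frequencies. That formula choice and the function names are assumptions for illustration; the allele frequencies shown are hypothetical, not estimates from the data.

```python
def alleles_per_locus(n_alleles, n_loci):
    """Mean number of observed alleles per locus."""
    return n_alleles / n_loci


def gene_diversity(freqs):
    """Uncorrected gene diversity at one locus: 1 - sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs)


# Counts taken from the text: 1108 SNP alleles at 554 loci, 2133 SSR alleles at 99 loci.
snp_mean = alleles_per_locus(1108, 554)
ssr_mean = alleles_per_locus(2133, 99)

# Hypothetical frequencies: a balanced biallelic SNP vs an SSR with 21 equally common alleles.
snp_ceiling = gene_diversity([0.5, 0.5])
ssr_example = gene_diversity([1 / 21] * 21)
```

With exactly two alleles per SNP locus, diversity at any single SNP is capped at 0.5 under this formula, whereas a locus with many alleles can approach 1.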

Acknowledgements

This research was supported by the State Key Basic Research and Development Plan of China (973) (Nos. 2010CB125900 and 2004CB117203), National Key Technologies R&D Program in the 11th Five-Year Plan (No. 2006BAD13B05), State High-tech (863) (Nos. 2006AA10A110 and 2006AA10Z164), International Science and Technology Cooperation and Exchanges Projects (No. 20061773) and the Academy and Institute Foundation for Basic Scientific Research in Institute of Crop Science, Chinese Academy of Agricultural Sciences. We thank Dr Song Ge (Institute of Botany, Chinese Academy of Sciences, Beijing, China), Dr Marinus J. M. Smulders (Plant Research International, Wageningen UR, the Netherlands), Dr Richard Abbott and two anonymous reviewers for stimulating discussions and useful suggestions.

References

  • Abe J, Xu D, Suzuki Y, Kanazawa A, Shimamoto Y. 2003. Soybean germplasm pools in Asia revealed by nuclear SSRs. Theoretical and Applied Genetics 106: 445–453.
  • Agrama HA, Yan WG, Lee F, Fjellstrom R, Chen MH, Jia M, McClung A. 2009. Genetic assessment of a mini-core subset developed from the USDA rice genebank. Crop Science 49: 1336–1346.
  • Aguilar-Melendez A, Morrell PL, Roose ML, Kim SC. 2009. Genetic diversity and structure in semiwild and domesticated chiles (Capsicum annuum; Solanaceae) from Mexico. American Journal of Botany 96: 1190–1202.
  • Allaby RG, Fuller DQ, Brown TA. 2008. The genetic expectations of a protracted model for the origins of domesticated crops. Proceedings of the National Academy of Sciences, USA 105: 13982–13986.
  • Badr A, Müller K, Schäfer-Pregl R, Rabey HE, Effgen S, Ibrahim HH, Pozzi C, Rohde W, Salamini F. 2000. On the origin and domestication history of barley (Hordeum vulgare). Molecular Biology and Evolution 17: 499–510.
  • Becquet C, Patterson N, Stone AC, Przeworski M, Reich D. 2007. Genetic structure of chimpanzee populations. PLoS Genetics 3: e66. 0617-0626.
  • Chang R. 1989. Studies on the origin of the cultivated soybean (Glycine max (L.) Merr.). Oil Crops of China: 16.
  • Chang R, Sun J. 1991. Catalogues of Chinese soybean germplasm and resources (sequel 1). Beijing, China: China Agricultural Press.
  • Chang R, Sun J, Qiu L, Chen Y. 1996. Catalogues of Chinese soybean germplasm and resources (sequel 2). Beijing, China: China Agricultural Press.
  • Chapman MA, Pashley CH, Wenzler J, Hvala J, Tang S, Knapp SJ, Burke JM. 2008. A genomic scan for selection reveals candidates for genes involved in the evolution of cultivated sunflower (Helianthus annuus). The Plant Cell 20: 2931–2945.
  • Chen Y, Nelson RL. 2004. Genetic variation and relationships among cultivated, wild, and semiwild soybean. Crop Science 44: 316–325.
  • Cheng C, Motohashi R, Tsuchimoto S, Fukuta Y, Ohtsubo H, Ohtsubo E. 2003. Polyphyletic origin of cultivated rice: based on the interspersion pattern of SINEs. Molecular Biology and Evolution 20: 67–75.
  • Choi I, Hyten D, Matukumalli L, Song Q, Chaky J, Quigley C, Chase K, Lark K, Reiter R, Yoon M. 2007. A soybean transcript map: gene distribution, haplotype and single-nucleotide polymorphism analysis. Genetics 176: 685–696.
  • Doebley J, Gaut B, Smith B. 2006. The molecular genetics of crop domestication. Cell 127: 1309–1321.
  • Dong Y, Zhao L, Liu B, Wang Z, Jin Z, Sun H. 2004. The genetic diversity of cultivated soybean grown in China. Theoretical and Applied Genetics 108: 931–936.
  • Dong Y, Zhuang B, Zhao L, Sun H, He M. 2001. The genetic diversity of annual wild soybeans grown in China. Theoretical and Applied Genetics 103: 98–103.
  • Falush D, Stephens M, Pritchard J. 2003. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164: 1567–1587.
  • Fukuda Y. 1933. Cytological studies on the wild and cultivated Manchurian soybeans. Japanese Journal of Botany 6: 489–506.
  • Fuller D. 2007. Contrasting patterns in crop domestication and domestication rates: recent archaeobotanical insights from the old world. Annals of Botany 100: 903–924.
  • Fuller D, Qin L, Zheng Y, Zhao Z, Chen X, Hosoya L, Sun G. 2009. The domestication process and domestication rate in rice: spikelet bases from the lower Yangtze. Science 323: 1607–1610.
  • Gai J, Xu D, Gao Z, Shimamoto Y, Abe J, Fukushi H, Kitajima S. 2000. Studies on the evolutionary relationship among eco-types of G. max and G. soja in China. Acta Agronomica Sinica 26: 513–520.
  • Gao H, Williamson S, Bustamante CD. 2007. A Markov chain Monte Carlo approach for joint inference of population structure and inbreeding rates from multilocus genotype data. Genetics 176: 1635–1651.
  • Garris A, Tai T, Coburn J, Kresovich S, McCouch S. 2005. Genetic structure and diversity in Oryza sativa L. Genetics 169: 1631–1638.
  • Goudet J. 2001. Fstat, a program to estimate and test gene diversities and fixation indices (version 2.9.3) [WWW document]. URL http://www2.unil.ch/popgen/softwares/fstat.htm [accessed on 28 June 2010].
  • Hamblin MT, Warburton ML, Buckler ES. 2007. Empirical comparison of simple sequence repeats and single nucleotide polymorphisms in assessment of maize diversity and relatedness. PLoS ONE 2: e1367.
  • Heun M, Schafer-Pregl R, Klawan D, Castagna R, Accerbi M, Borghi B, Salamini F. 1997. Site of einkorn wheat domestication identified by DNA fingerprinting. Science 278: 1312–1314.
  • Hirata T, Abe J, Shimamoto Y. 1999. Genetic structure of the Japanese soybean population. Genetic Resources and Crop Evolution 46: 441–453.
  • Hymowitz T. 1970. On the domestication of the soybean. Economic Botany 24: 408–421.
  • Hymowitz T, Newell C. 1981. Taxonomy of the genus Glycine, domestication and uses of soybeans. Economic Botany 35: 272–288.
  • Jones MK, Liu X. 2009. Origins of agriculture in East Asia. Science 324: 730–731.
  • Kauwe J, Bertelsen S, Bierut L, Dunn G, Hinrichs A, Jin C, Suarez B. 2005. The efficacy of short tandem repeat polymorphisms versus single-nucleotide polymorphisms for resolving population structure. BMC Genetics 6(Suppl. 1): S84.
  • Kuroda Y, Kaga A, Tomooka N, Vaughan D. 2006. Population genetic structure of Japanese wild soybean (Glycine soja) based on microsatellite variation. Molecular Ecology 15: 959–974.
  • Li FS. 1994. A study on origin and evolution of soybean. Soybean Science (China) 13: 61–66.
  • Li Y, Guan R, Liu Z, Ma Y, Wang L, Li L, Lin F, Luan W, Chen P, Qiu L. 2008. Genetic structure and diversity of cultivated soybean (Glycine max (L.) Merr.) landraces in China. Theoretical and Applied Genetics 117: 857–871.
  • Liu K, Muse S. 2005. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21: 2128–2129.
  • Londo JP, Chiang YC, Hung KH, Chiang TY, Schaal BA. 2006. Phylogeography of Asian wild rice, Oryza rufipogon, reveals multiple independent domestications of cultivated rice, Oryza sativa. Proceedings of the National Academy of Sciences, USA 103: 9578–9583.
  • Matsuoka Y, Vigouroux Y, Goodman M, Sanchez G. 2002. A single domestication for maize shown by multilocus microsatellite genotyping. Proceedings of the National Academy of Sciences, USA 99: 6080–6084.
  • Mohamed AI, Rangappa M. 1992. Nutrient composition and anti-nutritional factors in vegetable soybean. II: Oil, fatty acids, sterols, and lipoxygenase activity. Food Chemistry 44: 277–282.
  • Morin PA, Luikart G, Wayne RK. 2004. SNPs in ecology, evolution and conservation. Trends in Ecology and Evolution 19: 208–216.
  • Morrell P, Clegg M. 2007. Genetic evidence for a second domestication of barley (Hordeum vulgare) east of the Fertile Crescent. Proceedings of the National Academy of Sciences, USA 104: 3289–3294.
  • Nakayama Y, Yamaguchi H. 2002. Natural hybridization in wild soybean (Glycine max ssp. soja) by pollen flow from cultivated soybean (Glycine max ssp. max) in a designed population. Weed Biology and Management 2: 25–30.
  • Oumar I, Mariac C, Pham J, Vigouroux Y. 2008. Phylogeny and origin of pearl millet (Pennisetum glaucum (L.) R. Br) as revealed by microsatellite loci. Theoretical and Applied Genetics 117: 489–497.
  • Ozkan H, Brandolini A, Schafer-Pregl R, Salamini F. 2002. AFLP analysis of a collection of tetraploid wheats indicates the origin of emmer and hard wheat domestication in Southeast Turkey. Molecular Biology and Evolution 19: 1797–1801.
  • Payseur BA, Jing P. 2009. A genome-wide comparison of population structure at STRPs and nearby SNPs in humans. Molecular Biology and Evolution 26: 1369–1377.
  • Pohl M, Piperno D, Pope K, Jones J. 2007. Microfossil evidence for pre-Columbian maize dispersals in the neotropics from San Andrés, Tabasco, Mexico. Proceedings of the National Academy of Sciences, USA 104: 6870–6875.
  • Powell W, Morgante M, Doyle J, McNicol J, Tingey S, Rafalski A. 1996. Genepool variation in genus Glycine subgenus soja revealed by polymorphic nuclear and chloroplast microsatellites. Genetics 144: 793–803.
  • Pritchard J, Stephens M, Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics 155: 945–959.
  • Qiu L. 2009. Establishment, representative testing and research progress of soybean core collection and mini core collection. Acta Agronomica Sinica 35: 571–575.
  • Rosenberg NA, Li LM, Ward R, Pritchard JK. 2003. Informativeness of genetic markers for inference of ancestry. The American Journal of Human Genetics 73: 1402–1422.
  • Ross-Ibarra J, Gaut BS. 2008. Multiple domestications do not appear monophyletic. Proceedings of the National Academy of Sciences, USA 105: E105.
  • Ross-Ibarra J, Morrell PL, Gaut BS. 2007. Plant domestication, a unique opportunity to identify the genetic basis of adaptation. Proceedings of the National Academy of Sciences, USA 104(Suppl. 1): 8641–8648.
  • Rozas J, Sanchez-DelBarrio J, Messeguer X, Rozas R. 2003. DnaSP, DNA polymorphism analysis by the coalescent and other methods. Bioinformatics 19: 2496–2497.
  • Saitou N, Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406–425.
  • Sang T, Ge S. 2007. The puzzle of rice domestication. Journal of Integrative Plant Biology 49: 760–768.
  • Shen R, Fan J, Campbell D, Chang W, Chen J, Doucet D, Yeakley J, Bibikova M, Wickham Garcia E, McBride C. 2005. High-throughput SNP genotyping on universal bead arrays. Mutation Research-Fundamental and Molecular Mechanisms of Mutagenesis 573: 70–82.
  • Skvortzow BV. 1927. The soybean–wild and cultivated in Eastern Asia. In: Proceedings of the Manchurian Research Society, Natural History Section Publication Series A, No. 22. Harbin, China.
  • Stephens M, Donnelly P. 2003. A comparison of Bayesian methods for haplotype reconstruction from population genotype data. The American Journal of Human Genetics 73: 1162–1169.
  • Stephens M, Smith NJ, Donnelly P. 2001. A new statistical method for haplotype reconstruction from population data. The American Journal of Human Genetics 68: 978–989.
  • Szpiech Z, Jakobsson M, Rosenberg N. 2008. ADZE: Allelic diversity analyzer version 1.0. WWW document. URL http://rosenberglab.bioinformatics.med.umich.edu/adze.html [accessed on 28 June 2010].
  • Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Molecular Biology and Evolution 24: 1596–1599.
  • Vavilov N. 1951. The origin, variation, immunity and breeding of cultivated plants. New York, NY, USA: Ronald Press. Translated from the Russian by K. Starr Chester.
  • Vigouroux Y, McMullen M, Hittinger CT, Houchins K, Schulz L, Kresovich S, Matsuoka Y, Doebley J. 2002. Identifying genes of agronomic importance in maize by screening microsatellites for evidence of selection during domestication. Proceedings of the National Academy of Sciences, USA 99: 9650–9655.
  • Wang G. 1982. Catalogues of Chinese soybean germplasm and resources. Beijing, China: China Agricultural Press.
  • Wang L, Guan R, Liu Z, Chang R, Qiu L. 2006. Genetic diversity of Chinese cultivated soybean revealed by SSR markers. Crop Science 46: 1032–1038.
  • Wen Z, Ding Y, Zhao T, Gai J. 2009. Genetic diversity and peculiarity of annual wild soybean (G. soja Sieb. et Zucc.) from various eco-regions in China. Theoretical and Applied Genetics 119: 371–381.
  • Wen ZX, Zhao TJ, Zhang YZ, Liu SH, Wang CE, Wang F, Gai JY. 2008. Association analysis of agronomic and quality traits with SSR markers in Glycine max and Glycine soja in China: I. Population structure and associated markers. Acta Agronomica Sinica 34: 1169–1178.
  • Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, McMullen MD, Gaut BS. 2005. The effects of artificial selection on the maize genome. Science 308: 1310–1314.
  • Xie H, Chang R, Guan R, Qiu L. 2005. Genetic diversity of Chinese summer soybean germplasm revealed by SSR markers. Chinese Science Bulletin 50: 526–535.
  • Xu B, Zheng H, Lu Q, Zhao S, Zhou S. 1986. Three new evidences of the original area of soybean. Soybean Science (China) 5: 123–130.
  • Xu D, Abe J, Gai J, Shimamoto Y. 2002. Diversity of chloroplast DNA SSRs in wild and cultivated soybeans: evidence for multiple origins of cultivated soybean. Theoretical and Applied Genetics 105: 645–653.
  • Xu D, Gai J. 2003. Genetic diversity of wild and cultivated soybeans growing in China revealed by RAPD analysis. Plant Breeding 122: 503–506.
  • Zhang LB, Zhu Q, Wu ZQ, Ross-Ibarra J, Gaut BS, Ge S, Sang T. 2009. Selection on grain shattering genes and rates of rice domestication. New Phytologist 184: 708–720.
  • Zhao TJ, Gai JY. 2004. The origin and evolution of cultivated soybean (Glycine max (L.) Merr.). Scientia Agricultura Sinica 37: 954–962.
  • Zhou X, Peng Y, Wang G, Chang R. 1998. Preliminary studies on the centers of genetic diversity and origination of cultivated soybean in China. Acta Agronomica Sinica 31: 37–43.
  • Zhu T, Shi L, Doyle JJ, Keim P. 1995. A single nuclear locus phylogeny of soybean based on DNA sequence. Theoretical and Applied Genetics 90: 991–999.

Supporting Information

Filename | Format | Size | Description
NPH_3344_sm_FiguresS1-S5.doc | doc | 1844K | Supporting info item
NPH_3344_sm_TableS1.xls | xls | 267K | Supporting info item
NPH_3344_sm_TableS2.xls | xls | 186K | Supporting info item
NPH_3344_sm_TableS3.doc | doc | 63K | Supporting info item