Keywords:

  • Carthamus tinctorius L.;
  • Genetic diversity;
  • Safflower;
  • Simple sequence repeat;
  • SSR enrichment

Abstract


A genetic evaluation of safflower germplasm collections derived from different geographical regions and countries will provide useful information for sustainable conservation and the utilization of genetic diversity. However, the molecular marker information available for evaluating the genetic diversity of safflower germplasm is limited. In this study, we acquired 509 putative genomic SSR markers for sufficient genome coverage using next-generation sequencing methods and characterized thirty polymorphic SSRs in a safflower collection composed of 100 diverse accessions. The average allele number and expected heterozygosity were 2.8 and 0.386, respectively. Analysis of population structure and phylogeny based on the thirty SSR profiles revealed genetic admixture between geographical regions rather than genetic clustering by region. However, the accessions from Korea were genetically conserved in a distinct group, in contrast to the rest of the safflower gene pool. In conclusion, these new genomic SSRs will facilitate studies that clarify genetic relationships and support population structure analyses, genetic map construction and association analysis in safflower.


Introduction


Safflower (Carthamus tinctorius L.) is one of the oldest domesticated crops and was traditionally grown as a source of natural dye, food flavouring and medicine (Dajue & Mundel 1996). In particular, safflower seed oil has a high proportion of linoleic and oleic acids in its total fatty acids, and it is used in cosmetics and food oils (Futehally & Knowles 1981). Although its oil yield is lower than that of other oilseed crops, safflower is cultivated throughout the world because of its adaptability to salinity and drought (Weiss 1983).

It is believed that after domestication in the Fertile Crescent region over 4000 years ago, safflower diverged into seven phenotypically distinct ‘centres of similarity’: the Far East, India–Pakistan, the Middle East, Egypt, Sudan, Ethiopia and Europe (Knowles 1969; Knowles & Ashri 1995). While Chapman & Burke (2007) found little support for a distinct genetic architecture among geographical regions based on DNA sequence differences, Johnson et al. (2007) found that some proposed centres (Far East, Middle East and Europe) are biologically meaningful based on AFLP variation, and Chapman et al. (2010) suggested five genetic clusters (Europe, Turkey–Iran–Iraq–Afghanistan, Israel–Jordan–Syria, Egypt–Ethiopia and the Far East–India–Pakistan) based on microsatellite analysis.

As breeding sources, diverse genetic resources are crucial for supplying valuable alleles in a changing environment. The diversity that traditional safflower accessions possessed has narrowed during domestication, and therefore, adaptability to adverse environments has decreased (Yang et al. 2007). Thus, genetic evaluation of safflower collections derived from diverse origins will give potentially useful information for sustainable conservation and utilization of diversity.

To reveal the genetic diversity of each crop, various molecular markers have been used as genetic evaluation tools. However, genomic studies on safflower have been restricted because the applicable molecular markers are insufficient compared with those available for major crops; genetic diversity in safflower has been mainly assessed by randomly amplified polymorphic DNA (RAPD) markers (Amini et al. 2008; Khan et al. 2009; Mahasi et al. 2009), amplified fragment length polymorphisms (AFLPs) (Johnson et al. 2007), intersimple sequence repeats (ISSRs) (Ash et al. 2003; Yang et al. 2007) and mixes of these markers (Sehgal & Raina 2005; Sehgal et al. 2009). These markers are typically dominant and not suitable for detecting hybrid types within a population or species.

Length variation in simple sequence repeats (SSRs) occurs primarily by slipped-strand mispairing during DNA replication (Levinson & Gutman 1987) and is one of the mutations with relatively high frequency in eukaryotic genomes. SSRs are codominant, multiallelic, highly reproducible and polymorphic markers. With these properties, SSRs have been widely used for population genetic analysis, genetic mapping and molecular breeding in diverse species (Feingold et al. 2005; La Rota et al. 2005; Belzile et al. 2007; Laurent et al. 2007; Park et al. 2008; Ma et al. 2009). Recently, with the availability of expressed sequence tags (ESTs) in public databases, EST-based SSR markers have been developed and applied for genetic analysis in many crops (Feingold et al. 2005; La Rota et al. 2005; Belzile et al. 2007; Laurent et al. 2007).

In safflower (Carthamus tinctorius L.), Chapman et al. (2009) developed 104 polymorphic EST-SSR markers with an average of 6.0 alleles per locus, and they showed 50% transferability in the Asteraceae. EST-derived SSRs are part of or adjacent to functional genes, and therefore, these SSRs are conserved and concentrated in gene-rich regions with low polymorphism (Thiel et al. 2003). On the other hand, genomic SSRs tend to be spread throughout the genome and are better for map coverage, with high polymorphism (La Rota et al. 2005). Hamdan et al. (2011) constructed an SSR-enriched library from genomic DNA of safflower and developed 64 polymorphic SSR markers, which revealed an average of 3.2 alleles per locus among 10 genotyped safflower accessions.

Recently, next-generation sequencing methods have been developed, producing massive sequence information that can be used for developing DNA markers. These methods may allow cost-effective development of SSR markers because a portion of the acquired sequences contains SSR motifs. In this study, we aimed to develop more genomic SSR markers for sufficient genome coverage based on the acquisition of massive sequence information using next-generation sequencing methods.

Materials and methods


Plant materials and DNA extraction

In total, 100 safflower (Carthamus tinctorius L.) accessions (Table 1), mainly composed of the Far East, India–Pakistan and Middle East groups according to the seven ‘centres of similarity’, were used to test the genetic variability of the new SSR markers and to survey the diversity status of the safflower collection conserved at the Rural Development Administration (Korea). Of these, three accessions from Uzbekistan (No. 19), Korea (No. 44) and Mexico (No. 91) were sequenced with the Genome Sequencer (GS)-FLX Titanium System (Roche, Mannheim, Germany). Total genomic DNA was extracted from young fresh leaves using a modified cetyltrimethylammonium bromide (CTAB) method (Dellaporta et al. 1983). The DNA concentration and quality were analysed with a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA), and the extracted DNA was diluted to a final working concentration of 20 ng/μL.

Table 1. Information on the safflower (Carthamus tinctorius L.) accessions used in this study

No.     IT/Temp No. [a]  Origin (group) [b]  Subpopulation [c]  No.     IT/Temp No. [a]  Origin (group) [b]  Subpopulation [c]
1       202721           KAZ (I)             ISP-1              51      K002627          KOR (II)            ISP-2
2       202722           KAZ (I)             ISP-1              52      K002628          KOR (II)            ISP-2
3       202725           KAZ (I)             ISP-1              53      K002694          KOR (II)            Admixture
4       202727           KAZ (I)             ISP-1              54      K011540          KOR (II)            ISP-2
5       209543           KAZ (I)             ISP-1              55      K014334          KOR (II)            ISP-2
6       209545           KAZ (I)             ISP-1              56      K014335          KOR (II)            ISP-2
7       202723           TJK (I)             ISP-3              57      K014336          KOR (II)            ISP-2
8       202724           TJK (I)             ISP-3              58      K138397          KOR (II)            ISP-2
9       209511           TJK (I)             ISP-3              59      K141870          KOR (II)            ISP-2
10      209535           TJK (I)             ISP-3              60      209512           AFG (III)           ISP-3
11      202728           UZB (I)             ISP-1              61      209513           AFG (III)           ISP-1
12      202729           UZB (I)             ISP-1              62      209514           AFG (III)           ISP-1
13      209506           UZB (I)             ISP-1              63      209515           AFG (III)           ISP-3
14      209507           UZB (I)             ISP-1              64      209516           AFG (III)           ISP-1
15      209508           UZB (I)             ISP-1              65      209517           AFG (III)           ISP-1
16      209509           UZB (I)             ISP-3              66      209521           AFG (III)           ISP-1
17      209510           UZB (I)             ISP-1              67      209540           AFG (III)           ISP-3
18      209518           UZB (I)             ISP-1              68      209547           IRN (III)           ISP-3
19 [d]  209525           UZB (I)             Admixture          69      209548           IRN (III)           ISP-1
20      209527           UZB (I)             ISP-1              70      209549           IRN (III)           ISP-1
21      209539           UZB (I)             ISP-1              71      209550           IRN (III)           ISP-1
22      209542           UZB (I)             ISP-1              72      209551           IRN (III)           ISP-1
23      209544           UZB (I)             ISP-1              73      209552           IRN (III)           ISP-1
24      209546           UZB (I)             ISP-1              74      209553           IRN (III)           ISP-3
25      909219           UZB (I)             ISP-1              75      209554           IRN (III)           ISP-3
26      909220           UZB (I)             ISP-1              76      209555           IRN (III)           ISP-1
27      909221           UZB (I)             ISP-1              77      209556           IRN (III)           ISP-3
28      909222           UZB (I)             ISP-1              78      209557           IRN (III)           ISP-1
29      909223           UZB (I)             ISP-1              79      209561           PAK (III)           ISP-3
30      909224           UZB (I)             ISP-1              80      209563           PAK (III)           ISP-3
31      909225           UZB (I)             ISP-1              81      209564           PAK (III)           ISP-1
32      909226           UZB (I)             ISP-1              82      209565           PAK (III)           ISP-1
33      909227           UZB (I)             ISP-3              83      209558           AZE (IV)            ISP-3
34      909228           UZB (I)             ISP-1              84      209559           AZE (IV)            ISP-1
35      K014642          UZB (I)             ISP-3              85      209560           AZE (IV)            ISP-1
36      191030           CHN (II)            ISP-1              86      209531           TUR (IV)            ISP-1
37      K002847          CHN (II)            Admixture          87      K014636          TUR (IV)            ISP-3
38      K003636          CHN (II)            ISP-2              88      K014639          TUR (IV)            ISP-3
39      K003637          CHN (II)            Admixture          89      K014640          TUR (IV)            ISP-3
40      K024715          CHN (II)            ISP-3              90      K019153          TUR (IV)            ISP-1
41      K035535          CHN (II)            ISP-3              91 [d]  183706           MEX (V)             ISP-1
42      175073           KOR (II)            ISP-3              92      183707           MEX (V)             ISP-1
43      183233           KOR (II)            ISP-2              93      183708           MEX (V)             ISP-1
44 [d]  185433           KOR (II)            ISP-2              94      183709           MEX (V)             ISP-3
45      209125           KOR (II)            ISP-1              95      K131662          MEX (V)             ISP-3
46      209879           KOR (II)            ISP-2              96      154918           USA (V)             ISP-1
47      212780           KOR (II)            ISP-2              97      201434           USA (V)             ISP-1
48      221709           KOR (II)            ISP-2              98      K131659          USA (V)             ISP-3
49      804334           KOR (II)            Admixture          99      K001079          CAN (V)             ISP-1
50      807731           KOR (II)            ISP-2              100     K131661          CAN (V)             ISP-1

Abbreviations: AFG, Afghanistan; AZE, Azerbaijan; CAN, Canada; CHN, China; IRN, Iran; KAZ, Kazakhstan; KOR, Korea; MEX, Mexico; PAK, Pakistan; TJK, Tajikistan; TUR, Turkey; USA, United States; UZB, Uzbekistan.
[a] Introduction number (National Agrobiodiversity Center of RDA in Republic of Korea).
[b] Geographical group: I, Central Asia; II, Eastern Asia; III, Southern Asia; IV, Western Asia; V, America.
[c] Subpopulation groups deduced by the structure software (K = 3).
[d] Pyrosequenced accessions.

454 pyrosequencing

The genomic DNA (5 μg) of each safflower accession was fragmented (300 to 800 bp) by nebulization, and the size range of the fragmented DNA was confirmed with the DNA 7500 LabChip (Agilent Technologies, Waldbronn, Germany) before further analysis. After fragmentation of the DNA samples, ten multiplex identifier (MID) adaptors serving as priming regions for amplification and sequencing were ligated to the size-selected, blunt-ended DNA according to the manufacturer's instructions (MID set, Roche). The optimal amount of the ssDNA library for emPCR was assessed by emulsion titration. The immobilized ssDNA library was amplified, loaded onto a PicoTiterPlate (PTP) device and finally sequenced as single reads using a GS-FLX Titanium instrument (Roche) as per the manufacturer's instructions. The pyrosequencing results were assembled with the GS De Novo Assembler software (Roche) to build the contigs.

Sequence data analysis and SSR marker screening

Identification of SSR motifs and design of primer pairs flanking the SSR regions were conducted with the SSR Manager program (Kim 2004). The type of SSR motif was distinguished by the core repeat unit. The M13F-tail PCR method was used to measure the size of the PCR products, as described previously (Schuelke 2000; Lee et al. 2011). Briefly, each 20-μL reaction mixture contained 2 μL genomic DNA (20 ng/μL), 0.2 μL of the specific primer (10 pmol/μL), 0.4 μL M13 universal primer (10 pmol/μL), 0.6 μL normal reverse primer, 2.0 μL 10 × h-Taq PCR buffer (Solgent, Daejeon, Korea), 1.6 μL dNTP mix (2.5 mM) and 0.3 μL h-Taq polymerase (2.5 unit/μL; Solgent). PCR was performed in a PTC-200 thermocycler (MJ Research, Waltham, MA, USA) as follows: initial denaturation at 94 °C (3 min), then 30 cycles each at 94 °C (30 s), 55 °C (45 s) and 72 °C (1 min), followed by 10 cycles at 94 °C (30 s), 53 °C (45 s) and 72 °C (1 min), and a final extension at 72 °C for 10 min. SSR alleles were resolved on the ABI 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) and sized to base-pair resolution against an internal size standard (35–500 bp, GeneScan 500 ROX; Applied Biosystems).
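
As a quick check of the reaction set-up described above, the following minimal Python sketch sums the listed component volumes and derives the volume left to reach 20 μL. Treating that remainder as water is an assumption (the diluent is not stated in the text), the cycling time ignores ramp rates, and all variable names are illustrative only.

```python
# Component volumes (in μL) for one 20-μL M13F-tail PCR, as listed in the text.
REACTION_VOLUME_UL = 20.0
components_ul = {
    "genomic DNA (20 ng/μL)": 2.0,
    "specific primer (10 pmol/μL)": 0.2,
    "M13 universal primer (10 pmol/μL)": 0.4,
    "normal reverse primer": 0.6,
    "10x h-Taq PCR buffer": 2.0,
    "dNTP mix": 1.6,
    "h-Taq polymerase": 0.3,
}

listed_ul = round(sum(components_ul.values()), 1)
# Assumption: the remaining volume is made up with water.
water_ul = round(REACTION_VOLUME_UL - listed_ul, 1)

# Thermal profile as (number of cycles, seconds per cycle); ramp time is ignored.
cycle_s = 30 + 45 + 60  # denaturation + annealing + extension
profile = [
    (1, 3 * 60),    # initial denaturation at 94 °C
    (30, cycle_s),  # cycles with annealing at 55 °C
    (10, cycle_s),  # cycles with annealing at 53 °C
    (1, 10 * 60),   # final extension at 72 °C
]
total_hold_s = sum(n * s for n, s in profile)

print(f"listed components: {listed_ul} μL; remainder (assumed water): {water_ul} μL")
print(f"programmed hold time: {total_hold_s / 60:.0f} min")
```

Under these assumptions, the listed components account for 7.1 μL of the 20-μL reaction, and the programme holds for about 103 min excluding ramping.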

Data statistics

To evaluate the genetic variability of the genomic SSRs, we calculated the following indices with the PowerMarker software (ver. 3.25) (Liu & Muse 2005): major allele frequency (MAF), number of alleles (NA), observed heterozygosity (HO), expected heterozygosity (HE) and polymorphism information content (PIC). With this programme, an unweighted pair group method with arithmetic averages (UPGMA) dendrogram was also constructed based on a genetic distance matrix. The possible subpopulations within the safflower collection, ranging from K = 2 to K = 15 (three independent runs), were analysed using the STRUCTURE software (version 2.2) (Pritchard et al. 2000), using a model allowing for admixture and correlated allele frequencies with a burn-in period of 100 000 and MCMC repeats of 100 000 followed by three iterations. The molecular variance for the geographical groups and estimated subpopulations was calculated with an analysis of molecular variance (AMOVA) approach using the GenAlEx software (version 6.1) (Peakall & Smouse 2006).
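
To make two of these indices concrete, the sketch below computes expected heterozygosity and PIC from allele frequencies. It assumes the standard textbook definitions (HE = 1 − Σp², and the PIC formula of Botstein et al.), which may differ in detail from the estimators implemented in the software used here. The function names are illustrative, and the second allele frequency of each two-allele locus is taken as 1 − MAF from Table 3.

```python
from itertools import combinations

def expected_heterozygosity(freqs):
    """Expected heterozygosity (gene diversity): 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content: HE minus 2 * p_i^2 * p_j^2 summed over allele pairs."""
    he = expected_heterozygosity(freqs)
    return he - sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))

# Two-allele loci from Table 3: the minor allele frequency is 1 - MAF.
loci = {"GB-CT-094": 0.920, "GB-CT-006": 0.965}
for locus, maf in loci.items():
    freqs = [maf, 1.0 - maf]
    print(f"{locus}  HE = {expected_heterozygosity(freqs):.3f}  PIC = {pic(freqs):.3f}")
# GB-CT-094  HE = 0.147  PIC = 0.136
# GB-CT-006  HE = 0.068  PIC = 0.065
```

For these two loci the sketch reproduces the HE and PIC values listed in Table 3; loci with more than two alleles would need the full set of allele frequencies, which the table does not report.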

Results


Characterization of sequenced contigs

From the pyrosequenced genomic DNA, we acquired a total of 15 034 contigs (average contig length = 373 bp) in three safflower accessions (Table 2). Among these, 1133 contigs contained a unique SSR motif with at least four repeat units: specifically, 295 SSR contigs within a total of 4247 contigs in IT183706, 413 contigs within 5497 contigs in IT185433 and 425 contigs within 5290 contigs in IT209525. The dinucleotide repeat motifs were predominant, occurring in 817 contigs (72.1%), followed by trinucleotide motifs in 300 contigs (26.5%). The repeat number in dinucleotide and trinucleotide motifs ranged from 4 to 17 and from 4 to 12, respectively (data not shown). Within the dinucleotide motifs, AT was the most frequent repeat, seen in 298 contigs, followed by AG/CT and AC/GT repeats in 284 and 231 contigs, respectively. The most frequent trinucleotide motif was ACC (81), followed by AAG (78) and ATC (57). A total of 509 primer pairs covering SSR motifs were designed from the sequence data. The remaining contigs had flanking regions too short for primer design.
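
To illustrate the kind of motif search involved, the sketch below finds perfect tandem repeats in a sequence with a regular expression. It is not the SSR Manager algorithm, whose internals are not described here. The function name `find_ssrs`, the unit-length range of 2 to 6 bp and the minimum of four repeat units are assumptions chosen to match the repeat numbers reported above, and the input sequences are made up for the example.

```python
import re

def find_ssrs(seq, min_repeats=4, min_unit=2, max_unit=6):
    """Return (start, unit, n_repeats) for perfect tandem repeats in seq."""
    seq = seq.upper()
    # Lazy unit match so the shortest repeating unit is tried first,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

print(find_ssrs("GGACACACACGG"))       # [(2, 'AC', 4)]
print(find_ssrs("AAGCTTCTTCTTCTTGA"))  # [(3, 'CTT', 4)]
print(find_ssrs("GGACACACGG"))         # [] (only three AC units)
```

A production pipeline would additionally handle imperfect and compound repeats and check that enough flanking sequence remains for primer design, neither of which this sketch attempts.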

Table 2. Description of the safflower used for microsatellite discovery and results of sequencing SSR-enriched

IT number [a]  Origin       Number of sequenced contigs  Average contig length (bp)  Number of contigs with SSR  Di-/Tri-/Other nucleotide [b]       Primer designed
183706         Mexico       4247                         378                         295                         228 (77.3%)/59 (20.0%)/8 (2.7%)
185433         South Korea  5497                         368                         413                         295 (71.4%)/115 (27.9%)/3 (0.7%)
209525         Uzbekistan   5290                         374                         425                         294 (69.2%)/126 (29.6%)/5 (1.2%)
Total                       15 034 (14 974 [c])          373                         1133 (1115 [c])             817 (72.1%)/300 (26.5%)/16 (1.4%)   508

[a] Introduction number (National Agrobiodiversity Center of RDA in Republic of Korea).
[b] SSR motif types.
[c] The number of unique contigs, excluding redundant ones.
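
The totals and motif-class percentages in Table 2 can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the table; the variable names are illustrative.

```python
# (sequenced contigs, contigs with SSR) per accession, from Table 2.
per_accession = {
    "IT183706": (4247, 295),
    "IT185433": (5497, 413),
    "IT209525": (5290, 425),
}
total_contigs = sum(n for n, _ in per_accession.values())
total_ssr = sum(s for _, s in per_accession.values())

# Motif classes among all SSR-containing contigs (Total row of Table 2).
motif_counts = {"di": 817, "tri": 300, "other": 16}
shares = {k: round(100 * v / total_ssr, 1) for k, v in motif_counts.items()}

print(total_contigs, total_ssr)  # 15034 1133
print(shares)                    # {'di': 72.1, 'tri': 26.5, 'other': 1.4}
for name, (n, s) in per_accession.items():
    print(f"{name}: {100 * s / n:.1f}% of contigs contain an SSR")
```

The per-accession counts sum to 15 034 contigs and 1133 SSR-containing contigs, and the motif-class shares match the percentages in the Total row.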

Polymorphic SSRs and genetic variability

In this study, 509 markers were tested for consistent amplification in eight diverse safflower accessions. Of these, 302 markers yielded reproducible amplicons and thirty SSRs were polymorphic. The thirty SSR clone sequences registered in the NCBI database do not coincide with previously published nucleotide sequences of safflower. These thirty SSR markers were applied to reveal genetic variability and population structure in a safflower collection composed of 100 accessions (Table 3). Among the 30 SSRs tested on 100 accessions, the perfect repeat type was predominant (63.3%). The number of alleles detected with the thirty genomic SSR markers in the safflower collection ranged from 2 to 7, with an average of 2.8 alleles. The major allele frequency (MAF) ranged from 0.424 to 0.995 with a mean of 0.703. Hamdan et al. (2011) reported a similar allele range, from 2 to 8 (mean = 3.2), for genomic SSRs of safflower derived from an AG and AC repeat-enriched library. The mean values of observed and expected heterozygosity were 0.452 and 0.386, respectively. The PIC values ranged from 0.010 (GB-CT-022) to 0.682 (GB-CT-081) with a mean value of 0.325 (Table 3).

Table 3. Characteristics and genetic diversity statistics of the polymorphic microsatellite markers isolated for safflower

Locus      GenBank no.  Repeat motif     Tm (°C)  Repeat status  Primer sequences (5′→3′)                       MAF    NA   HE     HO     PIC    Size range (bp)
GB-CT-006  HQ153146     (AC)8            55       Perfect        GGAGGGGCGACTTTAGAA / CGAGGAGTTTTGTGTCCG        0.965  2    0.068  0.030  0.065  243–245
GB-CT-011  HQ153147     (TA)5,(TA)6      58       Imperfect      AAAATTTGACATCCCGCC / TGCGTACCAAACGCTACC        0.830  3    0.289  0.130  0.260  288–294
GB-CT-022  HQ153148     (AG)6            55       Perfect        TTTCCATGTCCAACCCTG / TTCAATCGGTTTTCACGG        0.995  2    0.010  0.010  0.010  299–301
GB-CT-023  HQ153149     (AC)6            58       Perfect        ATCATCAACTTCGGCCCT / TGGTGTAAGGGCAACCTG        0.516  2    0.500  0.968  0.375  228–240
GB-CT-035  HQ153150     (AT)6            58       Perfect        GGCCCAAAGAAACGAAAG / CTCGCTTTCATCCTTCCC        0.732  2    0.392  0.000  0.315  264–268
GB-CT-042  HQ153151     (AACCTC)4        58       Perfect        CTCAACCCATCTCAGCCA / CCTTCCCTTTGTCCTTCG        0.763  5    0.394  0.412  0.366  210–237
GB-CT-057  HQ153152     (AG)6            58       Perfect        TCGGTCTGACGCTCTGAT / GGAAGCAACTGCTTGTAGGA      0.712  4    0.443  0.414  0.393  289–300
GB-CT-081  HQ153153     (CTT)9           58       Perfect        CAGACGCTGATGGGGTAG / TTCCCCAGCTGTACCTCC        0.424  7    0.722  0.424  0.682  198–224
GB-CT-094  HQ153154     (AC)7            57       Perfect        TTTTTGAAGGCATAGCGG / CGATCCCAAGGGGAGTTA        0.920  2    0.147  0.160  0.136  256–260
GB-CT-100  JX313037     (TA)5            55       Perfect        GCAAACGCCTACGAAAAG / AACATACGATTCACGCGG        0.720  2    0.403  0.000  0.322  286–288
GB-CT-107  JX313038     (ATTT)3          55       Perfect        AGGCGAAAGTTTACAATCCT / GGATTGGAACTCTTAGTTCTCG  0.854  2    0.250  0.111  0.219  231–233
GB-CT-112  JX313039     (ATT)2TT(ATTT)2  55       Imperfect      CGTACCAAACGCACCCTA / ACCGATGGAGTGAACACG        0.923  2    0.141  0.153  0.131  201–207
GB-CT-115  JX313040     (AC)4,(AC)4      55       Imperfect      CCAACCCTCCCAACCTAA / CGGTACGCAACCTGTGAT        0.904  2    0.174  0.192  0.158  198–202
GB-CT-121  JX313041     (CA)5            55       Perfect        GCACTGCCACATCAGCTT / CTTAGCGGGTTGGTAGCC        0.569  2    0.490  0.691  0.370  201–218
GB-CT-146  JX313042     (AGC)3,(AGC)2    55       Imperfect      GCACTTTTGGAGTCGCAG / CGGATTTGAGCTTGTTGC        0.778  2    0.346  0.061  0.286  247–250
GB-CT-176  JX313043     (TG)5            55       Perfect        ATTGGCAAATGAACGGTG / TGGCATACAAGGTGGGAA        0.500  3    0.624  1.000  0.554  230–236
GB-CT-184  JX313044     (GAT)2,(GAT)3    55       Imperfect      GAGGAGGTCGAAAGCCTG / TGGTAAAGGGTTGAGGGC        0.515  3    0.578  0.969  0.492  201–207
GB-CT-198  JX313045     (GAA)3(GA)2      55       Imperfect      ACCGTCAATTAGGGGGAA / TGCGTTGCAATGTGGATA        0.801  3    0.321  0.010  0.272  292–296
GB-CT-264  JX313046     (GT)4AT(GT)4     55       Imperfect      AGGGAAGGTCTCAAAGGC / AAACCCAGATCCTTTGGC        0.515  3    0.551  0.970  0.452  170–178
GB-CT-294  JX313047     (GT)3(GCGTGT)2   55       Imperfect      GTTCGATAGTTTGGGGGC / AAATCCCTGCCTCTCTGG        0.565  4    0.572  0.870  0.501  199–206
GB-CT-298  JX313048     (CA)4GT(CA)3     55       Imperfect      ACCTCTCATCGTACCCCC / CGTGCACAAGAGTGTAGGAG      0.495  3    0.549  0.140  0.448  231–235
GB-CT-329  JX313049     (ACAT)5          55       Perfect        TGTCCTTACGGCTCTGTACC / CAAAAAGGGCCCTCTGAC      0.505  3    0.563  0.990  0.468  169–177
GB-CT-337  JX313050     (GCTT)3          55       Perfect        TGGTTTAAGGGCCATGTG / GACTCATGCGCTTTGGAG        0.980  2    0.040  0.040  0.039  310–313
GB-CT-347  JX313051     (AC)4            55       Perfect        GCAGCTGCTCTCCAAATA / TCAGGGACCAATTGTGACTT      0.625  2    0.469  0.750  0.359  191–197
GB-CT-370  JX313052     (TA)2CA(TA)4     55       Imperfect      TCAAAACCCCTTCACCAA / AAGGGAGAAGGGAGAGGG        0.885  3    0.205  0.230  0.186  251–265
GB-CT-394  JX313053     (AAC)4           55       Perfect        AATTTCCACCTCCTCCGA / CGAATCGAGGTCTTGACG        0.717  2    0.406  0.566  0.323  219–222
GB-CT-400  JX313054     (AAG)4           55       Perfect        AAGCCAAAGAGAATCCATGA / TTCCTCAACCTGCTTTGC      0.535  3    0.539  0.930  0.438  163–169
GB-CT-412  JX313055     (TG)4CA(TG)2     55       Imperfect      TGGCCAGTGTAACTGTGGA / TCCATCGGTATCGTTTCG       0.515  2    0.500  0.970  0.375  184–191
GB-CT-445  JX313056     (ATGGC)4         55       Perfect        GCCTAGAAGGACCAAGGC / CATGCGAGGTTTTAAGCG        0.495  3    0.575  1.000  0.484  212–242
GB-CT-499  JX313057     (AT)4            55       Perfect        GCCAAGTTTGCTGATTCG / GCTCACTCATGCTACCTCG       0.823  3    0.305  0.354  0.280  234–248
Average                                                                                                         0.703  2.8  0.386  0.452  0.325

Abbreviations: MAF, major allele frequency; NA, number of alleles; HE, expected heterozygosity; HO, observed heterozygosity; PIC, polymorphic information content.

Population structure and genetic relationship based on SSR profiles

For the effective conservation and utilization of germplasm, the population structure of each crop must be analysed. The possible population structure in the safflower collection (100 accessions) was deduced with a model-based clustering method based on SSR profiles. Estimated log-likelihood values in three independent runs were consistent for a given K. The maximum value was obtained at K = 3, which indicates three possible informative subpopulations within the safflower germplasm. The relatively small value of the alpha parameter (α = 0.038) indicates that most accessions originated from one primary ancestor, with a few admixed individuals (Ostrowski et al. 2006). The distribution of safflower accessions within the inferred subpopulations with >80% shared ancestry is summarized in Table 1. Of the 100 accessions, 95 were assigned to three subpopulations (ISP-1, ISP-2, ISP-3) at the 80% inferred ancestry cut-off, while the remaining five accessions were categorized into an admixture group (Fig. 1a). The phylogenetic tree based on a genetic distance matrix could be divided into two main groups (Group-I, Group-II) (Fig. 1b). Group-I included 28 accessions, predominantly belonging to the ISP-3 subpopulation, and these diverse accessions originated from the five geographical groups. Group-II consisted of accessions belonging to the ISP-1 and ISP-2 (mainly accessions from Korea) subpopulations. ISP-1 was divided into two subgroups in the dendrogram. The AMOVA revealed that the molecular variance among deduced subpopulations accounted for 14% of the total variation, while the remaining 86% was due to differences within the subpopulations (Table 4). For the geographical groups, 4% and 96% of the total variation were due to the molecular variance among and within groups, respectively. These AMOVA results indicated that the accessions in this study were structured.
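
The 80% ancestry cut-off described above amounts to a simple assignment rule, sketched below. The membership coefficients shown are hypothetical values for illustration only and are not data from this study. The function name and the strict ">" comparison are assumptions based on the ">80% shared ancestry" wording.

```python
LABELS = ("ISP-1", "ISP-2", "ISP-3")

def assign_subpopulation(q, cutoff=0.80):
    """Assign an accession from its membership coefficients (one per inferred cluster)."""
    best = max(range(len(q)), key=lambda k: q[k])
    # Accessions without a single dominant cluster are treated as admixed.
    return LABELS[best] if q[best] > cutoff else "Admixture"

# Hypothetical membership coefficients, for illustration only.
examples = {
    "accession A": (0.95, 0.03, 0.02),
    "accession B": (0.10, 0.85, 0.05),
    "accession C": (0.55, 0.40, 0.05),
}
for name, q in examples.items():
    print(name, assign_subpopulation(q))
```

With this rule, accessions A and B fall into ISP-1 and ISP-2, while accession C, with no cluster above the cut-off, is labelled as admixed.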

Table 4. Analysis of molecular variance (AMOVA) for geographical origins and deduced subpopulations

Source                 d.f.  SS      MS    Est. Var.  %   Value  P value
Geographical group
  Among Pops           4     58.9    14.7  0.2        4
  Within Pops          195   1096.5  5.6   5.6        96  0.041  0.001
Deduced subpopulation
  Among Pops           3     124.9   41.6  0.9        14
  Within Pops          196   1032.1  5.3   5.3        86  0.144  0.001

Abbreviations: SS, sum of squares; MS, mean squares; Est. Var., estimates of variance.
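
As a consistency check on Table 4, each mean square is the sum of squares divided by its degrees of freedom. The sketch below recomputes the MS column from the reported d.f. and SS values; the variable names are illustrative, and the variance components and percentages are not recomputed because they depend on unrounded values that the table does not give.

```python
# (grouping, source, d.f., sum of squares) as reported in Table 4.
rows = [
    ("Geographical group", "Among Pops", 4, 58.9),
    ("Geographical group", "Within Pops", 195, 1096.5),
    ("Deduced subpopulation", "Among Pops", 3, 124.9),
    ("Deduced subpopulation", "Within Pops", 196, 1032.1),
]

# Mean square = sum of squares / degrees of freedom, rounded as in the table.
mean_squares = [round(ss / df, 1) for _, _, df, ss in rows]

for (grouping, source, df, ss), ms in zip(rows, mean_squares):
    print(f"{grouping:22s} {source:12s} MS = {ss} / {df} = {ms}")
```

The recomputed values (14.7, 5.6, 41.6 and 5.3) agree with the MS column of the table.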

Figure 1. Model-based population structure (a) and phylogenetic dendrogram (b) based on SSR profiles in the safflower collection. Geographical groups: I, Central Asia; II, Eastern Asia; III, Southern Asia; IV, Western Asia; V, America.


Discussion


Safflower is cultivated in various regions of the world for the vegetable oil extracted from its seeds and is also used in herbal medicine (Dajue & Mundel 1996). Some reports have evaluated safflower varieties for yield and production (Kolsarici & Eda 2002). The diversity of safflower germplasm is not only complex in genetic structure but also dynamic, with constantly evolving entities (Sehgal et al. 2009). Hence, consistent efforts are needed to measure the diversity in safflower collections.

Molecular marker-based genetic diversity analysis, particularly using SSR markers, has been frequently used in germplasm management and in genetic and breeding research because of the relatively high allelic polymorphism of these markers and their easy genotyping by PCR. In this study, we developed additional genomic SSR markers to evaluate the genetic diversity of a safflower collection. SSR markers, being reproducible, codominant, highly polymorphic and distributed genome-wide, are widely used for genetic fingerprinting and to analyse genomic variation and population structure (Varshney et al. 2005). Sequence information, traditionally acquired by probe hybridization (Kresovich et al. 1995; Varshney et al. 2002; Feingold et al. 2005; Ma et al. 2009; Hamdan et al. 2011), is needed to develop SSR markers. This information is scarce in safflower and related species compared with other crops, although Chapman et al. (2009), Naresh et al. (2009), Mayerhofer et al. (2010) and Hamdan et al. (2011) reported several polymorphic EST-SSR or genomic SSR markers in safflower. In this study, we used a cost-effective pyrosequencing method to search for genome-wide SSRs in safflower.

In this study, AT was the most frequent repeat motif, followed by AG/CT. The AG/CT motif was slightly more frequent than AC/GT, but these two motifs had similar frequencies, resembling the motif distributions of Allium species (Tsukazaki et al. 2007; Lee et al. 2011). Generally, the most common dinucleotide repeat in plants is AT followed by AG/CT, in contrast to animals, which have AC/GT as the most common, and our results were in accordance with previous reports (Stallings et al. 1991; Lagercrantz et al. 1993; Cardle et al. 2000; Katti et al. 2001). Among genomic SSRs, dinucleotide motifs show a higher frequency than trinucleotide motifs, whereas trinucleotide motifs are more frequent in EST-SSRs (Cardle et al. 2000; Gao et al. 2003). Our results corroborate those of other studies (Cardle et al. 2000; Gao et al. 2003); a high percentage of dinucleotides (72.1%) was detected compared with trinucleotides (26.5%) among the genomic SSRs.

With regard to the genetic informativeness of the thirty SSR markers, the average allele number and expected heterozygosity were relatively low, at 2.8 and 0.386, respectively, whereas Hamdan et al. (2011) reported an average of 3.2 alleles and a heterozygosity of 0.52 among 10 diverse safflower cultivars and breeding lines. In the Compositae family, Tang & Knapp (2003) reported an average of 3.5 alleles and a heterozygosity of 0.510 per locus in 19 elite lines of sunflower, and Van de Wiel et al. (1999) showed an average of 3.4 alleles and a heterozygosity of 0.55 per locus in 18 genotypes of lettuce. The relatively low informativeness in this study might be due to the biased diversity of the samples used.

Genomic SSRs are spread throughout the whole genome and show good map coverage with higher polymorphism compared with EST-SSRs (La Rota et al. 2005). On the other hand, EST-SSRs are adjacent to functional genes, so these repeat regions are more conserved than genomic SSRs (Thiel et al. 2003). However, in this study, the thirty newly developed SSR markers showed a somewhat lower average allele number and expected heterozygosity (genetic diversity), 2.8 and 0.386, respectively, whereas the EST-SSRs of Chapman et al. (2009) revealed an average of 6.0 alleles per locus and an average expected heterozygosity of 0.540 in the genus Carthamus. This difference is understandable because the genetic variability of the EST-SSRs was surveyed across the genus Carthamus, in three species, and therefore included interspecific variability.

The model-based clustering method and the genetic distance-based phylogenetic tree based on SSR profiles revealed similar grouping patterns; three subpopulations (ISP-1, ISP-2, ISP-3) were inferred by model-based clustering, and these three subpopulations were accordingly divided into divergent clades in the phylogenetic tree (Fig. 1). Contrary to the hypothesis of geographical distinction among the seven ‘centres of similarity’ (Knowles 1969), our results indicated genetic admixture between geographical regions, although three clear subpopulations were estimated. Chapman et al. (2010) and Johnson et al. (2007) showed that geographical groups were significantly correlated with genetic clustering, but accessions of ISP-1 and ISP-3 were not clearly divided on the basis of geographical origin in our study. Because we tested only a subset (Far East, India–Pakistan and the Middle East) of the ‘centres of similarity’, this suggests only that genetic and geographical distances were not related in our tested safflower collection, and this genetic mixture may be attributed to human intervention via breeding and domestication processes in these regions. On the other hand, most accessions collected from Korea were assigned to ISP-2 (Fig. 1). These findings revealed that the safflower accessions from Korea were genetically conserved in a distinct group, in contrast to the other safflower clusters. Sehgal et al. (2009) obtained significant variation within the safflower group, and a large amount of variation was likewise detected within the inferred subpopulations (86%) in this study. Given that differences between gene pools are likely to be large in self-pollinating crops during domestication (Bussell 1999), the low variation between subpopulations in safflower might be caused by artificial breeding and weakened divergent directional selection. Likewise, the low variation between geographical groups indicates that the domestication process has been less affected by regional barriers and has formed a major gene pool in the Asian region.

In conclusion, we acquired thirty polymorphic SSR markers based on a cost-effective pyrosequencing method for evaluating safflower genetic resources. As an efficient genetic characterization tool, these new genomic SSRs, which add to the existing markers in safflower (Chapman et al. 2009; Naresh et al. 2009; Mayerhofer et al. 2010; Hamdan et al. 2011), can be used for clarifying taxonomic relationships, population structure analysis, genetic map construction and association analyses of safflower genetic resources.

Acknowledgements


This study was supported by a grant (Code # PJ008625) from the National Academy of Agricultural Science, Rural Development Administration (RDA), Republic of Korea.

References

  • Amini F, Saeidi G, Arzani A (2008) Study of genetic diversity in safflower genotypes using agro-morphological traits and RAPD markers. Euphytica, 163, 21–30.
  • Ash G, Raman R, Crump N (2003) An investigation of genetic variation in Carthamus lanatus in New South Wales, Australia, using intersimple sequence repeats (ISSR) analysis. Weed Research, 43, 208–213.
  • Belzile FBF, Hanai LRHLR, Tatiana de Campos T et al. (2007) Development, characterization, and comparative analysis of polymorphism at common bean SSR loci isolated from genic and genomic sources. Genome, 50, 266–277.
  • Bussell J (1999) The distribution of random amplified polymorphic DNA (RAPD) diversity amongst populations of Isotoma petraea (Lobeliaceae). Molecular Ecology, 8, 775–789.
  • Cardle L, Ramsay L, Milbourne D, Macaulay M, Marshall D, Waugh R (2000) Computational and experimental characterization of physically clustered simple sequence repeats in plants. Genetics, 156, 847–854.
  • Chapman M, Burke J (2007) DNA sequence diversity and the origin of cultivated safflower (Carthamus tinctorius L.; Asteraceae). BMC Plant Biology, 7, 60.
  • Chapman MA, Hvala J, Strever J et al. (2009) Development, polymorphism, and cross-taxon utility of EST-SSR markers from safflower (Carthamus tinctorius L.). Theoretical and Applied Genetics, 120, 85–91.
  • Chapman MA, Hvala J, Strever J, Burke JM (2010) Population genetic analysis of safflower (Carthamus tinctorius; Asteraceae) reveals a Near Eastern origin and five centers of diversity. American Journal of Botany, 97, 31840.
  • Dajue L, Mundel H (1996) Safflower (Carthamus tinctorius L.). Promoting the Conservation and Use of Underutilized and Neglected Crops. 7. International Plant Genetic Resources Institute (IPGRI), Rome, Italy.
  • Dellaporta SL, Wood J, Hicks JB (1983) A plant DNA minipreparation: version II. Plant Molecular Biology Reporter, 1, 19–21.
  • Feingold S, Lloyd J, Norero N, Bonierbale M, Lorenzen J (2005) Mapping and characterization of new EST-derived microsatellites for potato (Solanum tuberosum L.). Theoretical and Applied Genetics, 111, 456–466.
  • Futehally S, Knowles P (1981) Inheritance of very high levels of linoleic acid in an introduction of safflower (Carthamus tinctorius L.) from Portugal. Proc. 1st Int. Safflower Conf, Davis, CA, USA. pp. 56–61.
  • Gao L, Tang J, Li H, Jia J (2003) Analysis of microsatellites in major crops assessed by computational and experimental approaches. Molecular Breeding, 12, 245–261.
  • Hamdan Y, Garcia Moreno M, Redondo Nevado J, Velasco L, Perez Vich B (2011) Development and characterization of genomic microsatellite markers in safflower (Carthamus tinctorius L.). Plant Breeding, 130, 237–241.
  • Johnson RC, Kisha T, Evans M (2007) Characterizing safflower germplasm with AFLP molecular markers. Crop Science, 47, 1728–1736.
  • Katti MV, Ranjekar PK, Gupta VS (2001) Differential distribution of simple sequence repeats in eukaryotic genome sequences. Molecular Biology and Evolution, 18, 1161–1167.
  • Khan MA, von Witzke-Ehbrecht S, Maass BL, Becker HC (2009) Relationships among different geographical groups, agro-morphology, fatty acid composition and RAPD marker diversity in Safflower (Carthamus tinctorius). Genetic Resources and Crop Evolution, 56, 19–30.
  • Kim K (2004) Developing one step program (SSR Manager) for rapid identification of clones with SSRs and primer designing. MsD thesis, Seoul National University, Seoul.
  • Knowles P (1969) Centers of plant diversity and conservation of crop germ plasm: Safflower. Economic Botany, 23, 324–329.
  • Knowles PF, Ashri A (1995) Safflower: Carthamus tinctorius (Compositae). In: Evolution of Crop Plants, 2nd edn (eds Smartt J & Simmonds NW), pp. 47–50. Longman, Harlow, UK.
  • Kolsarici O, Eda G (2002) Effects of different row distances and various nitrogen doses on the yield components of a safflower variety. Sesame and Safflower Newsletter, 17, 108–111.
  • Kresovich S, Szewc-McFadden A, Bliek S, McFerson J (1995) Abundance and characterization of simple-sequence repeats (SSRs) isolated from a size-fractionated genomic library of Brassica napus L. (rapeseed). Theoretical and Applied Genetics, 91, 206–211.
  • La Rota M, Kantety R, Yu JK, Sorrells M (2005) Nonrandom distribution and frequencies of genomic and EST-derived microsatellite markers in rice, wheat, and barley. BMC Genomics, 6, 23.
  • Lagercrantz U, Ellegren H, Andersson L (1993) The abundance of various polymorphic microsatellite motifs differs between plants and vertebrates. Nucleic Acids Research, 21, 1111–1115.
  • Laurent V, Devaux P, Thiel T et al. (2007) Comparative effectiveness of sugar beet microsatellite markers isolated from genomic libraries and GenBank ESTs to map the sugar beet genome. Theoretical and Applied Genetics, 115, 793–805.
  • Lee GA, Kwon SJ, Park YJ et al. (2011) Cross-amplification of SSR markers developed from Allium sativum to other Allium species. Scientia Horticulturae, 128, 401–407.
  • Levinson G, Gutman GA (1987) Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Molecular Biology and Evolution, 4, 203–221.
  • Liu K, Muse SV (2005) PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics, 21, 2128–2129.
  • Ma KH, Kim NS, Lee GA et al. (2009) Development of SSR markers for studies of diversity in the genus Fagopyrum. Theoretical and Applied Genetics, 119, 1247–1254.
  • Mahasi M, Wachira F, Pathak R, Riungu T (2009) Genetic polymorphism in exotic safflower (Carthamus tinctorious L.) using RAPD markers. Journal of Plant Breeding and Crop Science, 1, 008–012.
  • Mayerhofer RMR, Archibald CAC, Bowles VBV, Allen G, Good GA (2010) Development of molecular markers and linkage maps for the Carthamus species C. tinctorius and C. oxyacanthus. Genome, 53, 266–276.
  • Naresh V, Yamini K, Rajendrakumar P, Dinesh Kumar V (2009) EST-SSR marker-based assay for the genetic purity assessment of safflower hybrids. Euphytica, 170, 347–353.
  • Ostrowski MF, David J, Santoni S et al. (2006) Evidence for a large-scale population structure among accessions of Arabidopsis thaliana: possible causes and consequences for the distribution of linkage disequilibrium. Molecular Ecology, 15, 1507–1517.
  • Park YJ, Dixit A, Ma KH et al. (2008) Evaluation of genetic diversity and relationships within an on-farm collection of Perilla frutescens (L.) Britt. using microsatellite markers. Genetic Resources and Crop Evolution, 55, 523–535.
  • Peakall R, Smouse PE (2006) GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6, 288–295.
  • Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945.
  • Schuelke M (2000) An economic method for the fluorescent labeling of PCR fragments. Nature Biotechnology, 18, 233–234.
  • Sehgal D, Raina SN (2005) Genotyping safflower (Carthamus tinctorius) cultivars by DNA fingerprints. Euphytica, 146, 67–76.
  • Sehgal D, Rajpal VR, Raina SN, Sasanuma T, Sasakuma T (2009) Assaying polymorphism at DNA level for genetic diversity diagnostics of the safflower (Carthamus tinctorius L.) world germplasm resources. Genetica, 135, 457–470.
  • Stallings R, Ford A, Nelson D, Torney D, Hildebrand C, Moyzis R (1991) Evolution and distribution of (GT)n repetitive sequences in mammalian genomes. Genomics, 10, 807–815.
  • Tang S, Knapp SJ (2003) Microsatellites uncover extraordinary diversity in native American land races and wild populations of cultivated sunflower. Theoretical and Applied Genetics, 106, 990–1003.
  • Thiel T, Michalek W, Varshney R, Graner A (2003) Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theoretical and Applied Genetics, 106, 411–422.
  • Tsukazaki H, Nunome T, Fukuoka H et al. (2007) Isolation of 1,796 SSR clones from SSR-enriched DNA libraries of bunching onion (Allium fistulosum). Euphytica, 157, 83–94.
  • Van de Wiel C, Arens P, Vosman B (1999) Microsatellite retrieval in lettuce (Lactuca sativa L.). Genome, 42, 139–149.
  • Varshney RK, Thiel T, Stein N, Langridge P, Graner A (2002) In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species. Cellular and Molecular Biology Letters, 7, 537–546.
  • Varshney RK, Graner A, Sorrells ME (2005) Genic microsatellite markers in plants: features and applications. Trends in Biotechnology, 23, 48–55.
  • Weiss E (1983) Oilseed Crops. Chapter 6. Safflower. Longman Group Limited, Longman House, London, UK.
  • Yang YX, Wu W, Zheng YL, Chen L, Liu RJ, Huang CY (2007) Genetic diversity and relationships among safflower (Carthamus tinctorius L.) analyzed by inter-simple sequence repeats (ISSRs). Genetic Resources and Crop Evolution, 54, 1043–1051.

G.-A.L. performed research, analysed data and wrote the manuscript. J.-S.S. phenotyped samples, performed research and analysed data. S.-Y.L. designed research, performed research and wrote the manuscript. J.-W.C. performed research and analysed data. J.-Y.Y. performed research. Y.-G.K. performed research. M.-C.L. designed research, performed research, analysed data and wrote the manuscript.

Supporting Information

Filename | Format | Size | Description
men12146-sup-0001-TableS1.doc | Word document | 70K | Table S1 Characteristics of the microsatellite sequences identified in pyrosequenced safflower accessions.
men12146-sup-0002-TableS2.xls | application/msexcel | 102K | Table S2 Designed 508 primers.
men12146-sup-0003-TableS3.xls | application/msexcel | 70K | Table S3 Genotypes of 100 accessions using 30 microsatellite markers.

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.