Development and characterization of EST‐SSR markers for Camellia reticulata

Premise: Camellia reticulata, which is native to southwestern China, is an economically important plant belonging to the family Theaceae. We developed expressed sequence tag–simple sequence repeat (EST‐SSR) markers for C. reticulata, which can be used to investigate its genetic diversity, population structure, and evolutionary history.

Methods and Results: We detected 4780 SSRs in C. reticulata from Camellia RNA‐Seq data deposited in the National Center for Biotechnology Information's expressed sequence tags database (dbEST). Primer pairs for 70 SSR loci were designed and used for PCR amplification in 90 individuals from four populations of C. reticulata. Of these loci, 50 microsatellite markers were successfully identified, including 11 polymorphic markers. The allele number per locus ranged from two to seven (mean = 4.182), and the levels of observed and expected heterozygosity ranged from 0.044 to 0.567 and from 0.166 to 0.642, respectively. Eleven primer pairs amplified PCR products in three other species of Camellia (C. saluenensis, C. pitardii, and C. yunnanensis).

Conclusions: The set of microsatellite markers developed here can be used to study the genetic variation and population structure of C. reticulata and related species and thereby help to develop conservation strategies for this species.

characteristics of C. reticulata, evaluate its evolutionary potential, and develop effective strategies for the conservation, development, and utilization of wild natural populations. In addition, we tested the cross-species transferability of these markers in three other species of Camellia: C. saluenensis, C. pitardii, and C. yunnanensis, which are thought to be involved in the polyploidy of C. reticulata.

METHODS AND RESULTS
Fresh healthy leaves collected from 90 individuals of C. reticulata sampled from four wild populations from Yunnan Province, China, were freeze-dried or silica-dried. Forty samples from three populations of C. saluenensis, C. pitardii, and C. yunnanensis were also collected to test the cross-amplification of the markers. Voucher specimens were deposited at the Kunming Institute of Botany, Chinese Academy of Sciences (KUN) (Appendix 1). Genomic DNA was extracted from 20-30 mg of dried leaf tissue using a modified cetyltrimethylammonium bromide (CTAB) method (Doyle and Doyle, 1987).
We obtained 50,287 EST sequences from the National Center for Biotechnology Information (NCBI) expressed sequence tags database (dbEST) (accessed in June 2019) (Boguski et al., 1993). To obtain a nonredundant EST data set for SSR identification and primer design, vector sequences were removed from the ESTs using SeqTrim (Falgueras et al., 2010) and poly(A) tails were trimmed using est-trimmer.pl. The clean EST sequences were then clustered and assembled into contigs and singletons using CAP3 (Huang and Madan, 1999), generating 17,989 unigenes consisting of 5099 contigs and 12,890 singletons. These unique sequences were screened for microsatellites using the MISA Perl program (Thiel et al., 2003). SSRs were defined as sequences with at least 10, six, five, five, five, and five repeat units for mono-, di-, tri-, tetra-, penta-, and hexanucleotide motifs, respectively. In total, 4780 SSRs were identified, with an average frequency of 1 SSR/1.57 kbp. Primers were designed using Primer Premier 5.0 software (PREMIER Biosoft International, Palo Alto, California, USA) with the following criteria: primer lengths of 16-22 bp, GC content of 40-65%, annealing temperature ranging from 40°C to 60°C, and a predicted PCR product size ranging from 100 to 300 bp.
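The repeat-number thresholds above can be illustrated with a short script. The following is only a minimal sketch of perfect-SSR detection using regular expressions; it is not the MISA program, it ignores compound and interrupted SSRs, and the function name `find_ssrs` and the toy sequence are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text:
# mono = 10, di = 6, tri/tetra/penta/hexa = 5.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration; a simplified stand-in for MISA.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of length unit_len followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" reported as a dinucleotide), to avoid double counting.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)


# Toy sequence (not real data): an (AG)7 repeat and an (A)12 repeat.
toy = "GGC" + "AG" * 7 + "TTC" + "A" * 12 + "CGT"
ssrs = find_ssrs(toy)
```

On the toy sequence, the (AG)7 and (A)12 tracts pass the thresholds, whereas the same poly(A) tract read as a trinucleotide (four AAA units) does not.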
We randomly selected 70 primer pairs and tested them for PCR amplification in 12 accessions of C. reticulata (three individuals from each population; Appendix 1) in an initial screening test. PCR amplification was performed in an 18-μL reaction mixture containing 20-30 ng of genomic DNA, 9 μL of 2× Easy Taq PCR Super Mix (TransGen Biotech, Beijing, China), and 1 μL of each primer (10 μM), with ddH2O added to a final volume of 18 μL. Cycling consisted of 32 cycles of 30 s of denaturation at 94°C, 30 s at the optimized annealing temperature (Table 1), and a 1-min extension at 72°C, followed by a final extension at 72°C for 5 min. The amplified products were separated on 8% denaturing polyacrylamide gels and visualized by silver staining, with a 2-kbp DNA Ladder Marker (Hangzhou Bioer Technology Co. Ltd., Hangzhou, China) as a size reference. The ploidy level of the sampled populations was unknown, but multiple bands per locus attributable to polyploidy were not observed. Of the 70 primer pairs tested, 50 yielded clear and reproducible amplicons in C. reticulata; the others were unstable or gave no product. Eleven loci were polymorphic (Table 1), and 39 loci were monomorphic (Appendix 2). The 11 polymorphic primer pairs were then used to genotype all 90 individuals of C. reticulata (four populations) for the population genetic analyses, following the same protocol as in the initial test. The polymorphic SSR loci were analyzed with POPGENE 32 software (Yeh et al., 2000) and GenAlEx (Peakall and Smouse, 2006) to obtain the number of alleles per locus, observed heterozygosity, and expected heterozygosity (Table 2). Hardy-Weinberg equilibrium (1000 randomizations) and linkage disequilibrium were tested using POPGENE 32 (Yeh et al., 2000); in Table 2, Hardy-Weinberg equilibrium was assessed with a chi-square test, and significant deviations are reported at P < 0.001. Among the 11 polymorphic loci, the number of alleles per locus ranged from two to seven, with a mean of 4.182.
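For readers unfamiliar with the per-locus statistics reported in Table 2, the sketch below shows how the number of alleles, observed heterozygosity, and expected heterozygosity can be computed from diploid genotypes at one codominant locus. It is an illustration under stated assumptions and not the POPGENE or GenAlEx implementation: genotypes are assumed diploid, expected heterozygosity uses the uncorrected formula 1 minus the sum of squared allele frequencies (the software may apply a sample-size correction), and the function name `locus_stats` and the genotype data are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Compute Na, Ho, and He for one locus.

    genotypes: list of (allele1, allele2) tuples, one per diploid individual.
    Hypothetical helper for illustration only.
    """
    n = len(genotypes)
    # Count every allele copy across all individuals (2 per individual).
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions (uncorrected).
    he = 1.0 - sum(p * p for p in freqs)
    return {"Na": len(counts), "Ho": ho, "He": he}


# Toy genotypes scored as fragment sizes in bp (not real data):
# six 150/150, two 150/154, and two 154/154 individuals.
toy_genotypes = [(150, 150)] * 6 + [(150, 154)] * 2 + [(154, 154)] * 2
stats = locus_stats(toy_genotypes)
```

In this toy example the allele frequencies are 0.7 and 0.3, so the expected heterozygosity (0.42) exceeds the observed value (0.2), which is the kind of heterozygote deficit that leads to deviation from Hardy-Weinberg equilibrium.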
The levels of observed and expected heterozygosity were 0.044-0.567 and 0.166-0.642, with averages of 0.242 and 0.457, respectively (Table 2). Four SSR markers showed expected heterozygosity above 0.5, indicating a high level of polymorphism in C. reticulata. All 11 polymorphic loci deviated from Hardy-Weinberg equilibrium in two or more populations (Table 2) as a result of heterozygote deficits, most likely caused by the reproductive mode, habitat fragmentation, and inbreeding. We found no consistent deviation from linkage equilibrium for any loci within the populations (P < 0.001). Cross-species amplification of the 11 polymorphic loci was tested in C. saluenensis, C. pitardii, and C. yunnanensis. All 11 EST-SSR markers amplified successfully using the same PCR protocol as for C. reticulata and were polymorphic (Table 3).
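The size of the heterozygote deficit can be made concrete with a fixation index, F = 1 - Ho/He, and with a chi-square goodness-of-fit test against Hardy-Weinberg proportions. The sketch below is only an illustration: it assumes diploid inheritance, handles the biallelic case (one degree of freedom), and is not the POPGENE procedure; the function names and the toy genotype counts are hypothetical. Note that an F computed from the mean Ho and He is a rough summary and is not equivalent to averaging per-locus values.

```python
import math


def fixation_index(ho, he):
    """Fixation index F = 1 - Ho/He; positive values indicate a heterozygote deficit."""
    return 1.0 - ho / he


def hwe_chi_square_biallelic(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic locus.

    Arguments are observed counts of the two homozygotes and the heterozygote.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Mean Ho and He reported in the text (0.242 and 0.457).
f_mean = fixation_index(0.242, 0.457)

# Toy genotype counts (not real data): 6 AA, 2 AB, 2 BB.
chi2, p_value = hwe_chi_square_biallelic(6, 2, 2)
```

With the reported means, F is about 0.47, consistent with a substantial heterozygote deficit. Small expected counts, as in the toy example, make the chi-square approximation unreliable, which is one reason randomization-based tests are often preferred.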

CONCLUSIONS
The EST-SSR polymorphic markers developed in this study will add to the existing resources of molecular markers and are expected to be useful for studies on population structure and genetic diversity in C. reticulata. The microsatellite loci described here were successfully cross-amplified in C. saluenensis, C. pitardii, and C. yunnanensis, suggesting that these markers may also be applicable to the study of genetic diversity in other Camellia species.

ACKNOWLEDGMENTS
This work was supported by the Surface Project of the Natural Science Foundation of Yunnan Province (2016FB031) (to Y.T.), the Key Project of the Natural Science Foundation of Yunnan Province (2015FA030), and the Yunnan Innovation Team Project (to L.Z.G.).

DATA AVAILABILITY
Expressed sequence tag sequences for the newly developed primers have been deposited in the National Center for Biotechnology Information (NCBI) GenBank database; accession numbers are listed in Table 1 and Appendix 2.