DNA barcoding the flowering plants from the tropical coral islands of Xisha (China)

Abstract
Aim: DNA barcoding has been widely applied to species diversity assessment in various ecosystems, including temperate forests, subtropical forests, and tropical rain forests. However, tropical coral islands have never been barcoded before because of the difficulty of field exploration. This study aims to barcode the flowering plants of a unique ecosystem, the tropical coral islands in the Pacific Ocean, and to supply evolutionary information that will help future work on the plant community assembly of these islands.
Location: Xisha Islands, China.
Methods: This study built a DNA barcode database for 155 plant species from the Xisha Islands using three DNA markers (ITS, rbcL, and matK). We applied a sequence similarity method and a phylogeny-based method to assess barcoding resolution.
Results: All three DNA barcodes showed high levels of PCR success (96%–99%) and sequencing success (98%–100%). ITS achieved the highest rate of species resolution (>95%) among the three markers, while the plastid markers delivered a relatively poor species resolution (85%–90%). Combining the three DNA barcodes yielded only a marginal increase in species resolution.
Main conclusions: This study provides the first plant DNA barcode data for the unique ecosystem of tropical coral islands and considerably supplements the DNA barcode library for the flowering plants of oceanic islands. Based on the PCR and sequencing success rates and the discriminatory power of the three DNA regions, we recommend ITS as the most successful DNA barcode for identifying the flowering plants of the Xisha Islands. Owing to its high sequence variation and low fungal contamination, ITS could also be a preferable candidate DNA barcode for plants from other tropical coral islands. Our results also shed light on the importance of biodiversity conservation on tropical coral islands.


| INTRODUCTION
The primary application of DNA barcoding is to identify unknown samples, and the emergence of DNA barcoding has greatly promoted the survey of biodiversity (Gregory, 2005). DNA barcoding has been particularly valuable in the inventorying of biodiversity hotspots. Successful investigations have been carried out in Mount Kinabalu, Malaysia (Merckx et al., 2015), and Ontario, Canada (Telfer et al., 2015), providing a convenient and efficient way to catalogue biodiversity in these regions. DNA barcoding can also be a powerful tool for addressing fundamental questions in ecology, evolution, and conservation biology (Kress, García-Robledo, Uriarte, & Erickson, 2015). A considerable number of cryptic and new species have been discovered based on evidence from DNA barcodes (García-Robledo, Kuprewicz, Staines, Kress, & Erwin, 2013; Hamsher & Saunders, 2014; Hebert, Penton, Burns, Janzen, & Hallwachs, 2004; Silva, de Abreu, Orlando, Wisniewski, & Santos-Wisniewski, 2014; Smith et al., 2012; Winterbottom, Hanner, Burridge, & Zur, 2014). DNA barcoding data also contribute substantially to understanding the evolutionary relationships within a given community (Kress et al., 2009).
As molecular information makes the interpretation of phylogenetic relationships easier and more reliable (Webb, Ackerly, McPeek, & Donoghue, 2002), DNA barcoding data could also be useful in plant community ecology, which concerns the relationships among coexisting species. Since its recent establishment, the DNA barcoding library has become a standard metric for biological conservation assessment (Faith, 1992, 2008; Forest et al., 2007). DNA sequence data provide an evolutionary dimension to diversity estimates, which is of great value for comparing diversity and establishing protected areas across the landscape.
The development of sequencing technologies and their broad use in species identification, biodiversity assessment, and ecological research have greatly promoted DNA barcoding studies.
Tropical coral islands represent a unique ecosystem: (a) they are far from continental ecological systems and have clear oceanic boundaries; (b) their species composition can be very different from that of the mainland; (c) they often cover small geographical areas, where the species pool may be drawn either from closely related species or from distantly related clades; (d) they are of particular conservation and scientific interest in the global inventory of biodiversity (Monaghan, Balke, Pons, & Vogler, 2006), and a comprehensive understanding of their biodiversity and ecology is badly needed because of increasing anthropogenic disturbance.
The flora of tropical oceanic islands has rarely been studied. The extreme difficulty of exploring and sampling plants in these areas may be the major limiting factor. The high cost of exploring these islands, combined with unstable weather conditions, also contributes to the poor knowledge of their plant biodiversity. Although both the Chinese-language Flora Reipublicae Popularis Sinicae (FRPS) and the English-language Flora of China have been fully compiled (Wu, Raven, & Hong, 1994), collections of plant specimens from the oceanic islands are still rare. (Pagel, 1999). Unfortunately, the full diversity of species in most regions is still unknown and information on DNA sequences is extremely poor, so the mechanisms that maintain biodiversity patterns have remained elusive (Purvis & Hector, 2000). Our study fills the gap in the barcode library for tropical coral islands in the Pacific Ocean and recovers the lineage relationships of this unique flora. It provides a precise snapshot of plant biodiversity on tropical oceanic coral islands, and the DNA sequence data could contribute to a comprehensive understanding of the biodiversity patterns of island floras in future studies.

| Taxon sampling and data collection
We conducted our field work for the DNA samples on the Xisha Islands from 2014 to 2017. We sampled 312 individuals representing 155 species in 120 genera of 42 angiosperm families, of which 132 species were sampled at least twice. Because intraspecific variation is expected to be low given the similar geography and climate of the islands, we sampled only two individuals for most species. Where possible, we sampled individuals of the same species from different islands to ensure that the two individuals represent distinct lineages. We also explored the islands in both the dry and wet seasons so that the samples bore flowers and fruits and could be correctly identified based on morphology. All voucher specimens were deposited at the Herbarium of the South China Botanical Garden (IBSC). A list of the plant samples and the GenBank accessions is provided in Supporting Information Table S1.

| DNA extraction, PCR, and sequencing
Total plant DNA was extracted from dried leaves preserved through silica gel desiccation using the CTAB method (Doyle & Doyle, 1987).
The ITS sequences were first aligned automatically in Geneious for all samples. We then retained three conserved regions across all samples and partitioned the remaining variable regions by family. We applied two analytical methods to assess barcoding resolution: a sequence similarity method (BLASTn searches) and a phylogeny-based method. The BLASTn method (Altschul, Gish, Miller, Myers, & Lipman, 1990) was run in BLAST+ 2.6.0; a query sequence was scored as correctly identified when its own species had the highest bit score among all candidates.
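The bit-score criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used in the study: it assumes BLAST+ tabular output in the default `-outfmt 6` column order, a hypothetical sequence ID convention of `Genus_species|voucher`, that each query's hit to itself is ignored, and that a tie for the top bit score with another species is reported as ambiguous. The toy hits are invented.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def species_of(seq_id):
    # Hypothetical ID convention: "Genus_species|voucher".
    return seq_id.split("|")[0]


def score_identifications(tabular_text):
    """Label each query 'correct', 'ambiguous', or 'incorrect'."""
    # For each query: (top bit score so far, species reaching that score).
    best = defaultdict(lambda: (float("-inf"), set()))
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        if row["qseqid"] == row["sseqid"]:
            continue  # assumption: a sequence's hit to itself is ignored
        score = float(row["bitscore"])
        top, species = best[row["qseqid"]]
        if score > top:
            best[row["qseqid"]] = (score, {species_of(row["sseqid"])})
        elif score == top:
            species.add(species_of(row["sseqid"]))

    calls = {}
    for query, (_, species) in best.items():
        own = species_of(query)
        if species == {own}:
            calls[query] = "correct"      # own species alone has the top score
        elif own in species:
            calls[query] = "ambiguous"    # tied with at least one other species
        else:
            calls[query] = "incorrect"
    return calls


def _row(query, subject, bitscore):
    # Only the three fields used above vary; the other nine are placeholders.
    return "\t".join([query, subject] + ["0"] * 9 + [str(bitscore)])


# Invented toy hits, for illustration only.
toy_hits = "\n".join([
    _row("Pisonia_grandis|A", "Pisonia_grandis|A", 1100),  # self-hit, skipped
    _row("Pisonia_grandis|A", "Pisonia_grandis|B", 1090),
    _row("Pisonia_grandis|A", "Suriana_maritima|A", 500),
    _row("Sida_sp1|A", "Sida_sp1|B", 1000),
    _row("Sida_sp1|A", "Sida_sp2|A", 1000),
    _row("Sida_sp2|A", "Sida_sp1|A", 1000),
    _row("Sida_sp2|A", "Suriana_maritima|A", 480),
])

calls = score_identifications(toy_hits)
success_rate = 100.0 * sum(c == "correct" for c in calls.values()) / len(calls)
```

Queries whose only hit is themselves do not appear in the result at all; whether such cases should count as failures is a design choice the text does not specify.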

| PCR and sequencing success rates
The PCR and sequencing success rates were generally high for the three regions. rbcL exhibited the highest amplification success rate (99%), followed by matK (97%) and ITS (96%).

| Species discrimination
Our results showed that the two analytical methods gave similar trends among the three barcodes. BLASTn analysis provided the highest success rate of species-level resolution with ITS (95%), followed by matK (87%), and then rbcL (85%) (Table 3). All markers delivered 100% correct assignments to genus and family. We assessed the discriminatory power of each DNA barcode by evaluating the percentage of species recovered as monophyletic in MP trees (Figure 2).
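The monophyly criterion can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a rooted tree written as nested Python tuples, hypothetical leaf labels of the form `Genus_species|voucher`, and that only species with at least two sampled individuals are evaluated. A species counts as resolved when some clade contains all of its individuals and nothing else. The toy tree is invented.

```python
from collections import defaultdict


def leaf_sets(tree):
    """Return (all leaves under this node, leaf sets of every clade below it)."""
    if isinstance(tree, str):          # a leaf is a plain label
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves = frozenset()
    clades = []
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)              # the clade rooted at this node
    return leaves, clades


def species_of(label):
    # Hypothetical label convention: "Genus_species|voucher".
    return label.split("|")[0]


def percent_monophyletic(tree, min_samples=2):
    """Percentage of multiply sampled species that form an exclusive clade."""
    leaves, clades = leaf_sets(tree)
    clade_set = set(clades)
    members = defaultdict(set)
    for leaf in leaves:
        members[species_of(leaf)].add(leaf)
    # Assumption: species with fewer than min_samples individuals are skipped.
    evaluated = {sp: frozenset(m) for sp, m in members.items()
                 if len(m) >= min_samples}
    resolved = sorted(sp for sp, m in evaluated.items() if m in clade_set)
    pct = 100.0 * len(resolved) / len(evaluated) if evaluated else 0.0
    return pct, resolved


# Invented toy tree: Sida_sp2 is not monophyletic because Sida_sp3 nests in it.
toy_tree = (
    (("Sida_sp1|A", "Sida_sp1|B"),
     ("Sida_sp2|A", ("Sida_sp3|A", "Sida_sp2|B"))),
    ("Pisonia_grandis|A", "Pisonia_grandis|B"),
)

pct, resolved = percent_monophyletic(toy_tree)
```

In this toy case two of the three evaluated species are resolved; the singleton `Sida_sp3` is excluded from the denominator under the stated assumption.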
In the single-marker analyses, all three DNA barcodes showed high rates of species resolution. ITS was the most successful at species-level resolution (96%), while rbcL and matK resolved 87% and 89% of species, respectively. In addition, the combinations rbcL + ITS (97%) and matK + ITS (97%) gave the highest rates of species resolution, followed by rbcL + matK (92%). The three-barcode combination (rbcL + matK + ITS) (97%) did not improve species resolution compared with the two-barcode combinations of ITS with either rbcL or matK. The families Poaceae, Fabaceae, Asteraceae, Euphorbiaceae, and Malvaceae each had more than 10 species.
The genus Sida L. of Malvaceae had seven species, and the gen-

| DISCUSSION
Primer universality and species identification are two crucial criteria for an ideal DNA barcode. The three DNA barcodes showed high rates of amplification and sequencing success, among which rbcL had the best universality. For matK, three pairs of primers were used because of the lack of universal primers for broad plant taxa (Costion et al., 2011; Kress, Wurdack, Zimmer, Weigt, & Janzen, 2005; Parmentier et al., 2013). Additionally, we failed to obtain 13 matK sequences, of which 46% belong to Poales. ITS showed excellent universality (PCR success rate: 96%; sequencing success rate: 98%), and fungal contamination was detected in only seven samples (2.2%) of 312 individuals.
Both analytical methods showed that each of the three barcodes alone could identify >85% of species and that ITS performed best. In the tree-based method, combining ITS with the plastid DNA barcodes improved species resolution only marginally over ITS alone (97% vs. 96%), while the combination of the two plastid DNA barcodes resolved fewer species than ITS alone. Each of the MP trees had considerably high terminal branch support. Generally, nuclear DNA (nrDNA) evolves rapidly and possesses more variation that can differentiate closely related species (Kress et al., 2005; Nieto Feliner & Rosselló, 2007; Sass, Little, Stevenson, & Specht, 2007 proposed based on the considerable universality and relatively high discriminatory power (Burgess et al., 2011; CBOL, 2009; Fazekas et al., 2008; Kress & Erickson, 2007; Kress et al., 2009; Parmentier et al., 2013; Pei et al., 2015). In some cases, however, it is especially difficult to distinguish closely related species with plastid DNA barcodes.
ITS has been commonly used in plant molecular systematic investigations owing to its high rate of species resolution (Alvarez & Wendel, 2003; Chen et al., 2010; Kress et al., 2005; Li et al., 2011; Yao et al., 2010), but it has not been widely applied in DNA barcoding studies because of several limitations: incomplete evolutionary history, fungal contamination, and poor primer universality (Hollingsworth, Graham, & Little, 2011). ITS can be difficult to amplify and sequence from diverse samples because of fungal contamination and divergent paralogous copies caused by incomplete concerted evolution (Baldwin et al., 1995; Cullings & Vogler, 1998; Hollingsworth et al., 2011; Zhang, Wendel, & Clark, 1997). In this study, our results indicated that ITS could be a better choice than rbcL and matK for barcoding the flora of the tropical coral islands of Xisha. First, ITS showed outstanding primer universality. The flora of the Xisha Islands is mainly composed of herbs (78%) and lianas (11%) (Figure 3). The primer universality of ITS in our study (94%) is much higher than in the study of 285 tropical trees reported by Gonzalez et al. (2009) (41%) and in a sample of 531 woody species published by Liu et al. (2015) (73%). The dominance of herbs and lianas on the Xisha Islands might contribute to the high primer universality. Second, endophytic fungi are considered to have a great effect on plant DNA barcoding when using ITS in different ecosystems (Hollingsworth et al., 2011). However, fungal contamination could be much lower in the vegetation of tropical coral islands than in other vegetation types.
In this study, fungal contamination of ITS was detected in only seven samples (2.2%). Finally, we analyzed the taxonomic structure of plants on the Xisha Islands. Interestingly, the genera/species (77%) and monotypic genera/genera (86%) ratios are extraordinarily high (Figure 4). Among the 42 families and 120 genera from the Xisha Islands in this study, 20 families and 103 genera contain only one species. relationships of co-occurring species and the species assembly within community when combining more information including species abundance on each island and the functional traits of all the species in the future. In addition, the taxonomic structure of flowering plants on the Xisha Islands indicates that biodiversity loss could be more likely to happen on tropical coral islands. As a result, we need to pay more attention to biodiversity conservation in this unique ecosystem.
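The percentages quoted for the taxonomic structure and for fungal contamination follow directly from the counts given in the text. The short check below is only a worked illustration; the variable names are ours, and "single-species genera" here means genera represented by one species in this sample.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Counts as reported in the text.
n_species = 155
n_genera = 120
single_species_genera = 103
contaminated_samples = 7
n_individuals = 312

genera_per_species = percent(n_genera, n_species)                  # about 77%
single_genus_share = percent(single_species_genera, n_genera)      # about 86%
contamination_rate = percent(contaminated_samples, n_individuals)  # about 2.2%

print(f"genera/species: {genera_per_species:.0f}%")
print(f"single-species genera/genera: {single_genus_share:.0f}%")
print(f"ITS fungal contamination: {contamination_rate:.1f}%")
```

The 2.2% figure matches seven of the 312 sampled individuals, which is why the contamination rate is best read per sample rather than per species.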
The nucleotide information for the dominant or constructive species (such as Suriana maritima, Guettarda speciosa, and Pisonia grandis) and the rare and endangered species (such as Pemphis acidula and Eulophia graminea Lindl. [Orchidaceae]) in this study may be of significance for ecological restoration and species conservation.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
T.T., D.Z., and S.L. conceived the ideas; T.T., S.L., X.Q., X.C., X.L., and J.L. conducted the flora exploration and taxon sampling; S.L. and Z.Z. performed the PCR experiments, and S.L. analyzed the data; S.L. wrote the first draft of the manuscript; S.L., M.S., T.T., and D.Z. contributed to finalizing the manuscript.

DATA ACCESSIBILITY
A list of the plant samples and their GenBank accession details are provided in Supporting Information Table S1.