Association between blooming time and climatic adaptation in Prunus mume

Abstract Prunus mume Sieb. et Zucc. is an important fruit crop of the subtropical region, originating in China. It blooms earlier than other deciduous fruit trees, but different regions have different blooming periods. The time of anthesis is related to the dormancy period, and a certain amount of chilling promotes bud break and blooming. To identify the relationship between blooming time and the climatic adaptation of P. mume cultivars in China, the nuclear and chloroplast genomes of 19 cultivars from the main cultivation areas of P. mume in China were resequenced. The average depth of coverage was 34X–76X, and a total of 388,134 single nucleotide polymorphisms were located within the coding regions of the gene (CDs). Additionally, the 19 cultivar accessions were divided into three groups based on their blooming time: early, mid, and late. Associated with the blooming time groups, 21 selective sweep regions were identified, which could provide evidence supporting the possible model of P. mume domestication originating due to natural selection. Furthermore, we identified a flowering gene, FRIGIDA‐LIKE 3 (FRL3), seems to affect the blooming time and the climatic adaptation of P. mume cultivars. This study is a major step toward understanding the climatic adaptation of P. mume cultivars in China.

stimulation toward dormancy. P. mume has a wide range of chilling requirements from 479 CU to 1,323 CU corresponding with cultivars from Southern China to Northern China (Gao et al., 2012). The chilling requirements of the varieties in southern China such as Guangdong and Fujian province are early while that of varieties in central China such as Jiangsu and Zhejiang province are late. The varied blooming time of varieties may be due to different climatic conditions in different regions (Zhuang, Shi, Gao, Zhang, & Zhang, 2013).
DORMANCY-ASSOCIATED MADS box6 (PmDAM6), a MADS box gene, was a candidate for growth inhibition and dormancy-controlling gene (Sasaki et al., 2011). PmDAM3, PmDAM5, and PmDAM6 expressions are closely associated with dormancy release in both flower and vegetative buds. PmDAM1 and PmDAM6 could form heteromeric complexes with C-repeat binding factor 5 (CBF5), a cold response signal factor. PmCBF1, PmCBF3, and PmDAM4 recognized the promoter of PmDAM6 by the alternative binding sites. Moreover, PmDAM6 could interact with a homolog of SUPPRESSOR OF OVEREXPRESSION OF CONSTANS1 (PmSOC1; Kitamura, Takeuchi, Yamane, & Tao, 2016).
The interactions of these genes were key factors for early dormancy release and blooming time in P. mume, which made it sensitive to temperature changes, resulting in short dormancy and early flowering (Kitamura et al., 2016;Zhao et al., 2018). Other than these regulators, there are many other genes have been reported to be punitively involved in dormancy release, such as RGL2 (Lv, Huo, Wen, Gao, & Khalil-Ur-Rehman, 2018) and hormone-related genes (Wen et al., 2016).
PmRGL2 (REPRESSOR-OF-GA-LIKE2), a DELLA protein, could play a negative role in bud dormancy release by regulating the GA biosynthetic enzymes (Lv et al., 2018). Gibberellin (GA 4) has a significant effect on promoting dormancy release in flower buds of P. mume Zhuang et al., 2015). Abscisic acid (ABA) is another plant hormone involved in regulating the onset of dormancy and in maintaining the dormant state (Karssen, Brinkhorst-van der Swan, Breekland, & Koornneef, 1983;Wen et al., 2016). Furthermore, the QTL analyses localized the significant QTLs controlling leaf bud chilling requirements and heat requirements, leafing date, and PmDAM6 expression in leaf buds to a region in linkage group LG4, which suggests that this locus controls dormancy release, bud break, and PmDAM6's downregulation in P. mume leaf buds (Kitamura et al., 2018). QTLs for chilling requirements and blooming time were identified on the LG4 of almond (Sánchez-Pérez, Dicenta, & Martínez-Gómez, 2012) and sweet cherry (Castède et al., 2014). The QTLs for apricot blooming initiation time were located on LG1 and LG4 (Dirlewanger et al., 2012;Olukolu et al., 2009;Socquet-Juglard et al., 2013). In peach, major QTLs for chilling requirements and blooming time were detected on LG1, LG4, and LG7 (Bielenberg et al., 2015;Fan et al., 2010;Zhebentyayeva et al., 2014). Genome-wide association studies for CR in peach identified seven association peaks, located on LG 1, 3, 7, and 8. The strong association peak on LG 1 overlapped with a known major CR QTL (qCR1a) and co-localized with the EVG locus underlying the evergrowing peach dormancy mutation (evg). DAM1-6 were identified in qCR1a conferring a nondormancy phenotype in evg peach (Li et al., 2019). There were three most significant QTLs associated with 2.8, 1.8, and 1.0 days bloom delay, respectively, in sour cherry (Prunus cerasus L.). These QTLS were also demonstrated to have additive effects on delaying blooming date for both individual and multiple QTLs (Cai et al., 2018).
Next-generation sequencing (NGS) technologies are faster and cheaper than Sanger sequencing, which is frequently used in plant studies and allows for a deeper genome variant analysis (Deschamps & Campbell, 2010;Jackson, Iwata, Lee, Schmutz, & Shoemaker, 2011).
So far, genome sequencing and evolutionary genetics have provided information about the origin, evolution (Ellegren, 2014;Sedivy, Wu, & Hanzawa, 2017;Velasco, Hough, Aradhya, & Ross-Ibarra, 2016;Wu, Terol, et al., 2018;Yu et al., 2018), and domestication (Akagi, Hanada, Yaegaki, Gradziel, & Tao, 2016;Myles et al., 2011;Qiu et al., 2015;Velasco et al., 2016;Wu, Wang, et al., 2018). The first 237M long genomic map of P. mume was constructed in 2012, and the actual size of the genome of P. mume was estimated to be about 280M. The processes of chromosome fusion and fragmentation in three Genera of Malus, Fragaria, and Prunus were analyzed . Furthermore, Zhang et al constructed the phylogenetic tree of P. mume cultivars by using related species of Prunus armeniaca and Prunus persica as reference. After resequencing and genome assembly with P. armeniaca and P. persica and in conjunction with the published genomes of P. persica and P. mume, GWAS analysis identified multiple quantitative trait locus regions and found that the MYB108 gene was associated with flower color .
To identify the climatic adaptation and relationship among the domesticated P. mume in China, 19 cultivated P. mume accessions from Yunnan, Sichuan, Guizhou, Jiangsu, Zhejiang, Hunan, Guangdong, Fujian, and the Taiwan provinces, the main cultivation area of P. mume in China, were resequenced and their accessions were locally grown for many years to be strong representatives of the cultivated P. mume. Based on analyses of population genetics and the evolution of cultivated P. mume, and selective sweeps associated with the analysis of the blooming time of the related genes, this study aimed to propose a model to explain the relationship between the climatic adaptation and blooming time of cultivated P. mume.

| Sampling information and genome resequencing
In this present study, a total of 19 P. mume accessions were collected and resequenced. All the accessions were from nine provinces in China

| Investigation of the blooming time of Prunus mume accessions
Because of the local climatic conditions, such as temperature and humidity are very important factors for the blooming time and the chilling hours is one of the most important conditions for the adaptation of P. mume. According to the phenological period of each accession and the temperature data of each meteorological station over the past 60 years from the National Centers for Environmental Information (NCEI; gis.ncdc.noaa.gov), we roughly investigate the blooming time of these 19 varieties of P. mume used in this study.

| Chloroplast genome resequencing
Chloroplast DNA was extracted from the fresh leaves of each accession using a modified CTAB method. A DNA concentration >50 ng/µl was measured using a NanoDrop spectrophotometer and fragmented through sonication. Then, fragmented DNA was purified and endrepaired, and the size was determined through gel electrophoresis.
The PCR products were constructed as short-insert (300 bp) libraries using Illumina Nextera XT and then sequenced using the HiSeq ×10 platform (Illumina). Raw reads were filtered using NGSQC Tool kit v2.3.3 to obtain high-quality reads, and then, the chloroplast genomes were assembled by NOVOPlasty using clean data and annotated with CpGAVAS (Hu et al., 2016). The chloroplast genome of P. persica (NCBI Accession number NC_014697.1) was used as the reference genome in this study. The MIcroSAtellite identification tool (MISA) was used to analyze the SSR (simple sequence repeat) from the chloroplast genome.

| Genome mapping and variant calling
Bwa software (Li & Durbin, 2009) was used to align the reads obtained from resequencing onto the P. mume reference genome , and subsequent mutation analysis was performed. Based on the position of clean reads on the reference genome, the sequencing depth, genomic coverage, and other information from each sample were measured and the mutation was detected. Single nucleotide polymorphism (SNP) detection was primarily implemented using GATK (McKenna et al., 2010). Based on the location results of clean reads in the reference genome, using SAMtools  to mark duplicates, GATK was used for local realignment, base recalibration, and other pretreatments to ensure the detection of SNP accuracy and then use GATK for SNP and Indel detection, filtering, and obtaining the final SNP loci. SNP and small Indel annotation were performed using SnpEff (Cingolani et al., 2012), which predicted the effects of mutations. The structural variation (SV) was detected by BreakDancer (Fan, Abbott, Larson, & Chen, 2014 Genes and Genomes (KEGG), and these annotations were performed according to the method of previous reports (Conesa & Gotz, 2008).
The following population genetic summary statistics were then calculated for each dataset: nucleotide diversity (π; Tajima, 1989), heterozygosity (Hetobs), the inbreeding coefficient (F IS ), and differentiation, as given by F ST . Nucleotide diversity is related to expected heterozygosity and is an overall measure of genetic variation.

| Evolution analysis
To acquire more intuitive information of the evolution analysis, a principal components analysis (PCA) was performed using GCTA software, and the main clustering element was obtained. Correlation coefficients (r 2 ) of the alleles of the 19 varieties were calculated using PLINK software with a pair-wise algorithm to measure the linkage disequilibrium (LD) value.

| Population genetics analysis, identification of selective sweeps, and gene flow analysis
ADMIXTURE was used to infer population structure among the 19 samples, based on the SNPs. The population groups (K) varied from 2 to 10. Based on the SNPs of 19 accessions, using the peach as an exogenous reference species, a phylogenetic tree was constructed and PCA analysis was performed. The phylogenetic analysis was performed using FastTree software with the neighbor-joining method.
To identify the genomic regions that may have been subjected to selection during domestication, we calculated nucleotide diversity (π), genetic differentiation (F ST ), and the Watterson estimator (θ π ) using a sliding-window approach (50-kb windows sliding in 10-kb steps) with SNPs of the whole genome. The log2θ π ratio was calculated as the nucleotide diversity values for the cultivars with late-blooming and early blooming time. The regions with significantly late F ST values (in the 5% right tail of the empirical distribution of F ST values) and a significant reduction in diversity (in the 5% right tail of the empirical distribution of the log2θ π ratio) were considered candidate selective sweeps.
Genes in the candidate selective sweeps were grouped into 1-Mb nonoverlapping segments throughout the genome, according to their physical locations. Adjacent segments and segments with no more than 1-Mb were separated and merged, while segments containing no more than five genes were discarded. GO term enrichment analyses were conducted for the genes in each merged segment along the genome using GO::TermFinder.
To estimate the gene flow between the high, mid, and early blooming time of P. mume, we used Treemix version 1.13 to investigate the gene flow between groups/subgroups.

| Blooming time-related genes analysis
Phenotypic data of blooming time were used for the association study. For the grouping of blooming time (Table 1), "low," "mid," and "high" groups were used.  (Martinez et al., 2018). Raw p-values were adjusted for multiple testing using the Benjamini-Hochberg procedure (Benjamini & Hochberg, 1997), and a significant association was based on a false discovery rate threshold of 0.01.

| Blooming time and climate warming in the cultivated area of Prunus mume
Based on flowering time and the hours at 0-7.2°C, we divided the 19 varieties into three groups of blooming time: early, mid, and late ( Libo; R15, R16, and R17 were from Hangzhou; and R18 and R19 were from Changde (Table 1).

| Genome, chloroplast genome resequencing, and genetic analysis of Prunus mume
Genomes of the 19 cultivated P. mume accessions were resequenced using Illumina HiSeq ×10. A total of 239.33 GB of data were obtained, and the Q30 was over 82.24%, with an average mapping rate of 93.9% (Additional file: Tables S1 and S2). The average depth of coverage was 34X-76X, and the genome coverage was 87.86% (at least one base was covered; Additional file: Table S3).
A total of 38,321,207 SNPs were identified (Additional file: Table   S4). A total of 388,134 SNPs were located within the coding regions of the gene (CDs). We also identified 71,265 INDEL and 36,659 SV in the CDs (Table 2). They are filtered to only those that have a potential functional impact.
The average number of nucleotide diversities produced by these mutations was 20,739 among these 19 cultivated accessions (Table 2), and the summary of genetic diversity in P. mume cultivars is shown in Table 3.
A summary of the population genetic statistics is presented in

| Population phylogenetic analysis of Prunus mume cultivars
A subset of 388,134 SNPs was screened in greater detail to construct a neighbor-joining tree, using the peach genome as an outgroup. The 19 cultivars were clustered into two independent clades after branching from the peach (Figure 1a). To get a better understanding of the phylogenetic relationships of P. mume, we assembled and analyzed the chloroplast genome of P. mume. Chloroplast genome resequencing of the 19 accessions was completed, and 239.3 GB of clean data was obtained, in which the percentage of the Q30 base was more than 82.24%. The chloroplast assembly results from 17 samples were very good, and they were able to produce a full-circle figure with zero gaps. In addition, two samples of R16 and R17 could not be assembled into a circle, but could form better scaffolds (Additional file: Figure S1). Furthermore, the phylogenetic tree of the chloroplast showed the same cluster pattern as the genome phylogenetic tree (Figure 1b).
We used the program ADMIXTURE to fit a model of admixture, in which an individual's genome is composed of sites from up to K ancestral populations. We explored K = 1 through six ancestral populations ( Figure 2) to investigate how assumptions regarding  Linkage disequilibrium (LD) analysis showed that genomes have relatively short LD distances and relatively rapid LD decays ( Figure 3a). The average r 2 value among P. mume SNPs, corresponding to the LD levels of the population, was relatively low. The average distance over which LD decayed to ~50% of its maximum value in cultivated P. mume was very short. Figure 3b-d showed the distribution of each accession in PCA space.

| Natural selection of Prunus mume in the process of domestication
To further study the genes in the transfer process, we performed selective clearance analysis. Because blooming time between the EARLY and LATE groups was significantly different, we used the EARLY and LATE-blooming time groups to perform the selective sweep analysis (Figure 4a). Twenty-one selective sweep regions and a total of 283 genes had been found in 19 selective sweep regions (  Table 4, and harbored 371 genes ( Figure 4b, Table S5). The identified selective sweeps were enriched with genes associated with post-translational modification, protein turnover, chaperones, replication, recombination and repair, transcription, translation, ribosomal structure and biogenesis, cell cycle control, cell division, chromosome partitioning, RNA processing and modification, chromatin structure and dynamics, cytoskeleton, energy production and conversion, amino acid transport and metabolism, nucleotide transport and metabolism, carbohydrate transport and metabolism, lipid transport and metabolism, inorganic ion transport and metabolism, signal transduction mechanisms, intracellular trafficking, secretion and vesicular transport, secondary metabolites biosynthesis, transport, and catabolism ( Figure 4a). These biological processes are usually involved in natural selection during the process of domestication. According to the gene flow analysis, no gene flow occurred between the late and early cultivars (Figure 4c).

| Genes related to blooming time for the climatic adaptation of cultivated Prunus mume
According to the blooming time groups, we identified a total of 127 SNPs, and 54 genes associated with these SNPs (SNPs within the gene or 2 kb upstream or downstream of the gene; Figure 5, Table S6).

| High genetic diversity in the fruiting of Prunus mume
Prunus is a large genus in the Rosaceae family, with more than LG numbering in Dirlewanger' paper (Dirlewanger et al., 1998), the numbering below the chromosome is LG numbering in Zhang's paper .  noted that their genetic diversity was 2.01 × 10 −3 , which is lower than that of our data. The main reason for this difference may be that the accessions Zhang et al. used were ornamental P. mume, but our accessions were fruiting P. mume. It was also indicated that natural selection played an essential role during the process of the domestication of fruiting P. mume, but for the ornamental P. mume, artificial selection might have been the main factor during domestication.
Many ancient books in China also record the ornamental P. mume cultivars and planting modes, but there was scarcely any reference record of the planting modes of fruiting P. mume. including almonds (Velasco et al., 2016), grapes (Myles et al., 2011), and apples (Gross, Henk, Richards, Fazio, & Volk, 2014), lack domestication bottlenecks, but maintain much of their ancestral genetic diversity (Velasco et al., 2016). Moreover, in this study, the analysis of selective sweep regions and gene flow also indicated that the fruiting P. mume barely experienced artificial selection. Altogether, this suggests that fruiting P. mume has high genetic diversity, with less domestication and artificial selection. Kitamura et al. (2018) localized the significant QTLs controlling the leaf bud blooming time to a region on linkage group 4 (equal to LG3 in our study); they thought that this locus controlled dormancy release in P. mume leaf buds (Kitamura et al., 2018). In our study, the sweep tor must be considered, as the light intensity and light quality may be different between regions (Vanhaelewyn et al., 2016). To survive, the plant must adapt to the new environment. After receiving signals from environment, some transcription factors are involved in this adaptation, like TGA2, which is a SA-responsive and NPR1-dependent transcription activator (Fan & Dong, 2002); transcription factor MYB36-like orchestrates Casparian strip formation to offer the potential for improved water and nutrient use efficiencies and enhanced resistance to abiotic stresses (Kamiya et al., 2015). Transcription factor-like protein DPB is involved in the regulation of the G1/S transition and increases the DNA binding activity of E2F proteins after hetero-dimerization (Kosugi & Ohashi, 2002). Transcription factor Puralpha 1 specifically binds the purine-rich double-stranded telomeric repeated sequence 5′-AAACCCTAA-3′ found in promoter telo boxes (Tremousaygue, Manevski, Bardet, Lescure, & Lescure, 2010) and the ethylene-responsive transcription factor RAP2-12. Finally, mostly the protein and enzyme genes, such as photosystem II core complex proteins psbY (Plochinger, Schwenkert, Sydow, Schroder, & Meurer, 2016), protein phytochrome kinase substrate 4 (PKS4) (Fankhauser & Christie, 2015), and light-inducible protein CPRF2, participate in the light reaction (Monir & Zhu, 2018). Abscisic acid-insensitive 5-like protein 1 (Wang, Li, Mao, Li, & Jing, 2016), ethylene-responsive transcription factor RAP2-12 (Kosmacz et al., 2015), auxin-induced protein AUX28like , and auxin-induced protein 22D-like (Han et al.,

TA B L E 4
The location of the selective sweep regions between the groups with early and late-blooming time
We also used GWAS to analyze the genes associated with

| A proposed model for the climatic adaptation of cultivated Prunus mume
To explore the phylogenetic relationships of these 19 cultivated P. mume varieties, we performed a phylogenetic analysis of all identified SNPs, using the peach as an outgroup (Figure 1a). In the phylogenetic trees, the 19 accessions divided into two subgroups. In the first subgroup, R01 and R02, the varieties from Jiangsu province, were grouped. These two varieties are particularly close to the outgroup (Figure 1). It also seems that the Jiangsu varieties are relatively After this, we used the Bayesian clustering program ADMIXTURE, changing K progressively from 2 to 10. When K = 2, all the cultivars separated clearly into two parts. One part included the varieties from the mainland and Fujian, while the other included the coastal region varieties. When K = 3, these varieties split into three parts. The cultivars of the coastal region, except for Fujian, still formed a group-like structure similar to K = 2, but the mainland varieties divided into two parts: one part was a cultivar from Jiangsu and Zhejiang, and the other part was the remaining varieties. Based on the phylogenetic trees, PCA analysis, and population structure analysis, we assume that the transfer direction of cultural P. mume had three directions. The first direction is from Yunnan, to Sichuan, Guizhou and Hunan provinces, close to the origin area (southwest China) of P. mume. The second direction is a transfer to Fujian, Guangdong, and Taiwan provinces, located to the southeast from Hunan. The last direction is from Hunan to Jiangsu and Zhejiang provinces, in mid-eastern China. In the PCA and ADMIXTURE analyses, the Sichuan and Fujian varieties were all clustered together, and Hunan province may be the hub in the transfer process of P. mume (Figure 7).

| CON CLUS ION
In this study, a genome variation mapping 19 P. mume accessions, collected from nine provinces in China, was performed. We found F I G U R E 7 Climatic adaptation process of cultivated Prunus mume that fruiting P. mume has higher genetic diversity than ornamental P. mume. Associated with the blooming time data, 21 selective sweep regions, with a total of 283 genes involved in transcription, translation, post-translational modification, and mostly metabolism, were identified. There was no gene flow between the late and early cultivars, which provides evidence to support a possible model of P. mume domestication that might include natural selection. Furthermore, we identified a total of 127 SNPs and 54 genes associated with the blooming time, and the flowering gene FRL3, which could affect the blooming time and the climatic adaptation of P. mume cultivars. This study is a major step toward understanding the climatic adaptation of P. mume cultivars in China and further contributes a molecular foundation for the origination and evolution of P. mume.

CO N FLI C T O F I NTE R E S T
None of the authors have any competing interests in the manuscript.

AUTH O R CO NTR I B UTI O N S
TS, WL, HL, and XH performed the experiments. ZN prepared the materials; TS and WL wrote the manuscript; HG analyzed the bioinformatics data; SI modified the language; ZG designed the experiments.

DATA AVA I L A B I L I T Y S TAT E M E N T
The sequence data of P. mume genome resequencing involved in this study are being deposited in NCBI (SRA accession is PRJNA561464).
All other relevant data (Tables S5 and S6, Figures S1 and S2)