A near‐complete genome sequence of mungbean (Vigna radiata L.) provides key insights into the modern breeding program

Mungbean (Vigna radiata L.), a fast‐growing legume species, is an important source of carbohydrates and proteins in developing countries of Asia. Here, we constructed a near‐complete genome sequence of mungbean with a scaffold N50 value of 5.2 Mb and only a 0.4% gap, with a total scaffold size of 475 Mb. We identified several misassembled pseudomolecules (Chr03, Chr04, Chr05, and Chr08) in the previous draft assembly; Chr03, Chr04, and Chr08 were assembled into one chromosome, and Chr05 was broken into two chromosomes in the improved reference genome assembly, thus providing more accurate linkage information to breeders. Additionally, using an ultra‐high‐resolution linkage map constructed based on resequencing data, we identified several quantitative trait loci (QTLs) and the underlying candidate genes affecting synchronous pod maturity (SPM). Mungbean homologs of two soybean ([Glycine max (L.) Merr.] flowering genes, E3 (phytochrome A) and J (early flowering 3), were identified as candidate genes for the QTLs, and the candidate genes for plant height, node number, and SPM showed critical nucleotide substitutions between the reference cultivar and other genotypes (landraces and wild accessions). Based on the analysis of genetic diversity among 276 accessions collected from 23 countries, we identified 36 selective sweep regions and observed that the overall genetic diversity of cultivars decreased to 30% of that in wild accessions postdomestication. The near‐complete genome sequence of mungbean represents an important resource for genome‐assisted improvement in the mungbean breeding program.


INTRODUCTION
Mungbean (Vigna radiata L.) is an annual self-pollinated species belonging to the papilionoid subfamily of the Fabaceae (2n = 2x = 22) (Kang et al., 2014). It is a versatile legume crop that is widely consumed in Asia. As a protein source, the production of mungbean seed proteins requires fewer resources than that of animal proteins. Moreover, mungbean is an excellent source of folate and other vitamins when consumed as vegetable sprouts (Shohag et al., 2012). The ability of mungbean to host nitrogen-fixing bacteria improves soil fertility and reduces the greenhouse gas emission footprint commonly found in agricultural production (Nair et al., 2012). Despite the effect of mungbean on the diet of a sizable proportion of the world population, genomic information for molecular breeding in this species is lacking compared with other legumes such as soybean [Glycine max (L.) Merr.] and chickpea (Cicer arietinum L.) (Schmutz et al., 2010;Varshney et al., 2013). A draft mungbean genome sequence constructed previously covers 80% of the estimated genome size but still contains gaps in 7.2% of the total scaffold, and only 239 out of 2,748 scaffolds have been mapped to pseudochromosomes (Kang et al., 2014). This draft reference genome has helped breeders to conduct quantitative trait locus (QTL) mapping or map-based cloning, and to transfer QTL information from soybean through translational genomics . However, high-quality reference sequence map data are crucial for forward genetics approaches to determine the exact location of genes and mutations underlying desirable phenotypes.
Mungbean is characterized by nonsynchronous pod maturity, i.e., continued flowering and pod production. Therefore, a single plant must be harvested multiple times to reduce yield loss. However, this costs extra labor and time. Additionally, each harvest must be conducted carefully to avoid damage to plants, which makes mechanical harvesting difficult. Although determinate growth habit has been characterized in mungbean (Li et al., 2018), the genetic basis of SPM remains unresolved. Synchronous pod maturity (SPM) is a complex trait affected by multiple factors, such as flowering time, maturity time, inflorescence structure, and plant architecture (Ha et al., 2020a(Ha et al., , 2020b. Quantitative trait locus analyses of these factors in mungbean are limited, especially those that

Core Ideas
• A near-complete reference genome sequence of mungbean was constructed. • Misassembled pseudomolecules in the previous assembly were corrected. • QTLs were identified based on an ultra-highresolution genetic map constructed by resequencing. • Genetic factors affecting synchronous pod maturity were identified. • Genetic diversity among 276 accessions from 23 countries were analyzed.
utilize high-density single nucleotide polymorphism (SNP) markers. Only a few traits, such as flowering and disease resistance in mungbean, have been studied using this method (Hwang et al., 2017;Schafleitner et al., 2016). Sequencing platforms capable of producing long reads have been utilized to reduce gaps and improve coverage in genome assemblies (English et al., 2012). Here, we report a nearcomplete genome sequence of mungbean using long sequence reads as well as an ultra-high-resolution genetic map obtained from whole genome resequencing data of a genetic mapping population of mungbean. We took advantage of the resulting combination of a high-quality reference genome and a dense linkage map to identify QTLs and the underlying candidate genes associated with agronomically important traits that affect SPM. We also examined the genetic diversity of 275 wild and cultivated mungbean accessions. Our results provide high-quality genomic information for the improvement of all Vigna species, especially mungbean.

Plant materials and phenotyping
One hundred eighty-seven F 10 plants of a recombinant inbred line (RIL) population were generated from F 2 seed using single-seed descent derived from a cross between V. radiata cultivar VC1973A and landrace V2984. The RIL population were planted at the Seoul National University Experimental Farm in Suwon, Korea (37˚16′12.7″ N; 126˚59′19.2″ E). To analyze QTLs, various agronomic traits were measured including plant height, node number, branch number, flower initiation, and SPM. Plant height, node number, and branch number were measured when the first flower appeared and vegetative growth stopped. The number of branches with two or more nodes generated on the main stem were evaluated. To measure flower initiation, the date of the appearance of the first flower was recorded for each line. To measure SPM, 10 plants of each line were harvested eight times at weekly intervals between 26 Aug. and 14 Oct. 2016, and the sum of the two highest grain yields from two consecutive weeks was divided by the total yield (Ha et al., 2020a). Theoretically, a SPM value of 1 indicates that all pods mature at the same time.

Genome assembly and annotation
Vigna radiata var. radiata, the pure line VC1973A, was sequenced for genome assembly using the PacBio RS II platform and data were deposited in the National Center for Biotechnology Information sequence read archive database under the accession number SRRS9994113 (Supplemental Table S1). Raw sequence reads were error-corrected and trimmed by Canu (Version 1.0) (Koren et al., 2017). The corrected reads were assembled into contigs using Falcon (Version 0.3.0) (Chin et al., 2016). The contigs were scaffolded with Illumina mate-pair reads with library sizes of 5, 10, and 40 kb using SSPACE (Version 3.0) (Boetzer et al., 2011). The scaffolds were anchored to pseudochromosomes using two genetic maps by ALLMAPS (Tang et al., 2015). Gaps in the superscaffolds were filled by Illumina short reads using Gapfiller (Version 1.10) (Boetzer & Pirovano, 2012). Illumina reads were obtained from the National Center for Biotechnology Information under the accession number JJMO00000000 (Kang et al., 2014). The assembled genome was annotated via the pipeline described by Ha et al. (2019) based on transcriptome data from JJMO00000000. To examine the mungbean genome assembly, core eukaryotic genes from CEGMA (Version 2.5) (Parra et al., 2007) were mapped to the assembly using BLAST (Version 2.2.31) (Camacho et al., 2009). In addition, single-copy orthologous genes of eukatyota and embryophyta selected from BUSCO (Simão et al., 2015) were mapped to the mungbean genome assembly. Default settings were used to run software programs.

Genetic map construction and QTL analysis
Illumina short reads (SRR10083737-SRR10083923) generated from 187 RILs with the Illumina HiSeq 4000 platform were used for SNP analysis (Supplemental Table S3). Reads were mapped to scaffolds using BWA (Version 0.7.15) (Li, 2013). The SNPs were called from the mapped reads and filtered using SAMtools (Version 1.3) (Li et al., 2009) using the following criteria: mapping quality ≥ 30, depth ≥ 5, heterozygosity ≤ 10, and missing data ≤ 12. The detected SNPs were used to construct a genetic map using JoinMap (Version 4.1) (Van Ooijen, 2006). To construct robust pseudochromosomes, a secondary genetic map was constructed (Supplemental Table S5) using the same method but with a different set of SNP markers with mapping quality ≥30, depth ≥5, heterozygosity ≤0, and missing data ≤18.
The QTL analysis was conducted using the primary genetic map composed of 8,966 SNP markers (Supplemental Table  S4). The QTL positions were identified by inclusive composite interval mapping for additive QTLs using QTL ICIMapping (Version 4.0.6) (Meng et al., 2015). To determine the significance thresholds for QTLs, a logarithm of odds (LOD) score was calculated with 1,000 replications of a permutation test, with 99% confidence. Two significant QTLs, with the highest LOD scores, were reported per trait.

Genome sequence comparison
A total of 100 evenly distributed SNP markers per chromosome from the previous mungbean genome assembly were mapped to the newly assembled chromosomes using ±100 bp of the sequence flanking each marker using BLAST. Only the markers with 90% or higher identity were used to compare the two genome assemblies. Genes from the previous mungbean assembly were aligned to the genome of close relatives including V. angularis (Kang et al., 2015), Medicago truncatula Gaertn. (Tang et al., 2014), and C. arietinum (Varshney et al., 2013) using BLAST. To study synteny and collinearity between mungbean and other species, BLAST results were analyzed using MCScanX . Syntenic blocks containing ≥20 genes were used to compare the two species.

Genotyping-by-sequencing and variant calling
A total of 275 mungbean accessions, including 233 cultivars and 42 wild accessions, and one accession of V. glabrescens Maréchal, Mascherpa & Stainier (outgroup), collected from 23 countries (Supplemental Table S9), were grown and used for DNA extraction. Library construction for genotyping-bysequencing (GBS) was performed as described previously (Elshire et al., 2011). Sequence reads generated using the Illumina HiSeq 2000 platform were mapped to the mungbean reference genome using BWA (Li, 2013), and variants were identified using bcftools call command in SAMtools (Li et al., 2009). The variants were filtered using vcftools (Danecek et al., 2011) and SnpEff (Cingolani et al., 2012). Only variants with a quality score greater than 30, sequencing depth of more than two reads per site, and a minimum of four reads per heterozygous variant were used for further analysis.
The circular plot for variant number across chromosomes was drawn in Circos (Krzywinski et al., 2009).

Phylogenetic and population structure analyses
A neighbor-joining tree was constructed based on the dissimilarity matrix generated using filtered SNPs with DARwin6 (Perrier & Jacquemoud-Collet, 2015). The tree was drawn and edited using Figtree (Version 1.4.2) (http://tree.bio.ed.ac.uk) and iTOL (Letunic & Bork, 2006). Population structure analysis was performed using a random sample of 1,000 SNPs with STRUCTURE (Version 2.3.4) (Pritchard et al., 2000). The number of clusters was set from K = 2 to K = 7, with 50,000 burn-in and 100,000 Markov Chain Monte Carlo replications. The optimal K value was identified according to the methods of Evanno et al. (2005). To observe the dominant cluster in each country of origin, the population structure data of each accession were grouped according to the country, and average q-values (the proportion of the SNPs from each group) were calculated. Principle component analysis (PCA) was performed using GenAlEx (Version 6.5) (Peakall & Smouse, 2006).

Linkage disequilibrium profiling and identification of selective sweep regions
Linkage disequilibrium (LD) was calculated based on R 2 statistics (Hill & Robertson, 1968) implemented in vcftools (Danecek et al., 2011), and the calculation was performed separately for wild and cultivated mungbean accessions. The R 2 values between markers were then binned and averaged for each 10,000-bp increment. The LD decay cutoff was set at 0.2, and the exact base pair position for the cutoff was calculated based on the trendline equation.
The fixation index (Fst) and cross-population composite likelihood ratio (XP-CLR) were calculated to identify selective sweep regions (Chen et al., 2010). A 100-kb window was analyzed for XP-CLR, and Fst analysis was based on markers. Segments with the top 5% values from each program were searched, and intervals that overlapped with each other were identified as selective sweep regions. Gene ontology (GO) analysis was performed by submitting orthologous Arabidopsis genes, identified by blastp, to BiNGO (Maere et al., 2005). Soybean QTLs were searched in regions homologous to mungbean selective sweep regions based on syntenic relationship identified using MCScanX .

Mungbean genome assembly and genetic map construction
The genome of mungbean cultivar VC1973A was sequenced using the PacBio platform at 68.21X coverage, and the resulting long sequence reads were de novo assembled into a nearcomplete reference genome (Supplemental Table S1). The primary contigs consisted of 1,511 sequences with an N50 value of 2.8 Mb, which is significantly fewer and longer compared with the 25,922 contigs with N50 value of 42 kb in the previous draft assembly (Kang et al., 2014). The total length of contigs also increased from 431 to 473 Mb (Table 1; Supplemental Table S2), which represents 87.1% of the predicted genome size of 543 Mb (Arumuganathan & Earle, 1991). These contigs were then used to generate 487 scaffolds with a scaffold N50 value of 5.2 Mb, a notable improvement from the 2,748 scaffolds with an N50 value of 1.5 Mb in the previous assembly. The new scaffolds also contained 0.4% gaps compared with the 7.2% gaps found in the previous draft assembly.
To assemble the scaffolds into pseudochromosomes, a new genetic linkage map was constructed using Illumina resequencing data of 187 RILs sequenced at an average coverage of 12.15X (Supplemental Tables S3 and S4). A highresolution genetic map consisting of 8,966 markers was used to place 110 scaffolds into 11 pseudochromosomes (Supplemental Table S5; Supplemental Figure S1). Gaps were further filled and assembly errors (0.02% error rate) were corrected using Illumina short reads to generate the final pseudochromosomes of 476 Mb. While the previous draft assembly contained only 22,427 predicted genes, the new assembly contained 30,958 high-confidence genes, of which 29,792 (96%) were annotated (Kang et al., 2014). The predicted genes showed ∼98% identity to 248 core eukaryotic genes (Table 1).

Improvement of the mungbean reference genome sequence
Using data from the previous draft assembly, 100 evenly distributed SNP markers per chromosome (Chr) were mapped to the newly assembled chromosomes. Out of 1,100 markers on 11 chromosomes, 959 markers were mapped with 90% or higher identity (Supplemental Table S6). A total of 91, 79, and 41 markers from Chr03, Chr04, and Chr08, respectively, of the previous assembly were mapped to Vr04 in the new assembly ( Figure 1).
Approximately 35% of the 85 markers on Chr05 of the previous assembly were mapped to Vr10, while 54% of

F I G U R E 1 Syntenic relationship between
Vigna radiata and its close relatives. Evenly distributed 100 markers per chromosome of the previous assembly (center) were mapped to the new assembly (left column). Some markers from chromosome (Chr) 03, Chr04, and Chr08 of the previous assembly were mapped to Vr04 of the new assembly as indicated by blue lines. Markers from Chr05 of the previous assembly that were mapped to Vr10 and Vr11 of the new assembly are indicated by red lines. Gene-based synteny analysis between mungbean and Cicer aretinum, Medicago truncatula, and V. angularis (right column) shows that these close relatives have the same syntenic pattern with the new mungbean assembly. Black arrow head indicates the location of DF3-1 and DFF3-1 those markers are mapped to Vr11 in the new assembly ( Figure 1; Supplemental Table S6). To resolve the discrepancies between the old and new genome assemblies, syntenic relationships of genes on Chr03, Chr04, Chr05, and Chr08 in the old genome assembly were analyzed in the close relatives of mungbean, including chick pea (C. arietinum), barrel clover (M. truncatula), and adzuki bean (V. angularis). A large proportion of genes on Chr03, Chr04, and Chr08 in the old mungbean assembly converged to a single chromosome in the three related species: in Chr03 of C. arietinum, Chr07 of M. truncatula, and Chr04 of V. angularis (Figure 1; Supplemental Table S7). On the other hand, genes on Chr05 in the old mungbean assembly mapped to two different chromosomes in the related species (Chr01 and Chr05 of C. aretinum, Chr02 and Chr03 of M. truncatula, and Chr05 and Chr09 of V. angularis) (Figure 1; Supplemental Table S7). These syntenic relationships between mungbean and its close relatives indicate that the new genome assembly is more reliable than the previous assembly.
In a previous QTL study, days-to-flowering3-1 (DF3-1) and days-to-first-flowering3-1 (DFF3-1) were mapped to Chr03, one of the misassembled pseudomolecules in the previous draft (Hwang et al., 2017). Such misassemblies on Chr03, Chr04, Chr05, and Chr08 in the previous assembly were corrected in the new assembly. Thus, the new assembly provides bonafide linkage information to breeders, which is crucial for marker-assisted breeding and map-based cloning.

Molecular diversity among Asian mungbean accessions
The new mungbean genome assembly was used as a reference to assess genetic variation among 276 accessions, includ-ing 233 cultivars, 42 wild mungbean accessions, and one V. glabrescens accession (outgroup), collected from 23 countries (Supplemental Table S9). Raw sequence reads were generated via GBS at a sequencing depth of 4.3-13X, depending on the accession (Supplemental Table S10). The distribution of nucleotide variations was relatively even across the genome (Supplemental Figure S6), with an average nucleotide diversity (π) of 7.63 × 10 −6 among wild accessions and 2.33 × 10 −6 among cultivated accessions. Phylogenetic analysis confirmed this reduction of diversity among cultivated mungbean accessions (Supplemental Figure S7). The mungbean cultivar JP231223 from India was the closest to wild accessions, and the cluster closest to wild accessions was dominated by accessions from India and neighboring regions including Pakistan, Myanmar, and China (Supplemental Figure S8). This supports the hypothesis that mungbean was first domesticated in India (Fuller, 2007). Phenotypically, JP231223 has smaller seeds and lower yield than average, but it is not an outlier when compared with other cultivated accessions. This is consistent with the fact that this accession shares more genetic similarity with other cultivated mungbeans than with the nearest wild accession in this study.
A PCA of selected mungbean cultivars with reliable origin data showed some correlation between population group membership and geographical origin. The largest principle component (PC) value (PC1) showed a strong correlation with latitude (Pearson's r = −.74, p < 2.2 × 10 −16 ) (Figure 4a), and a color chart representing sorted PC1 values on the country of origin showed a gradient of PC1 from high latitude countries to tropical countries (Figure 4b). Additionally, the population structure of wild accessions was more complex than that of cultivated accessions, as expected (Figure 4c). When cultivated accessions were grouped according to their country of origin, the genetic background of one subgroup was predominant in accessions that originated from nearby areas (Figure 4d).

Selective sweeps in mungbean cultivars
Based on sequence variation among cultivated and wild accessions, we identified chromosome segments under selective pressure during domestication. A total of 36 selective sweep regions, comprising 224 genes, were detected in mungbean (Figure 2; Supplemental Table S11). The number of genes in each selective sweep region varied from 1 to 14. The GO analysis of these genes showed enrichment in developmental processes such as embryonic development, unidimensional cell growth, and reproductive developmental process. Based on the syntenic relationships, we identified soybean genomic regions homologous to mungbean selective sweep regions (Figure 2). Multiple soybean QTLs related to yield, morphology, and biotic and abiotic resistance The Plant Genome F I G U R E 3 Genetic and morphological differences among VC1973A, V2984, and wild mungbeans. (a) The location of single nucleotide polymorphisms (SNPs) and insertions and deletions resulting in the losses of stop codons and frameshifts. Grey and pink rectangles indicate exons and untranslated regions. Blue arrow heads indicate SNPs. (b) Sequence comparison among VC1973A, V2984, and wild mungbean species. VC1973A, the mungbean reference cultivar, is highlighted in green. (c) Morphological changes among wild mungbean, landrace, and elite cultivar. VC1973A is shorter and has fewer nodes and branches than V2984. The candidate genes identified in this study are listed below the traits, and the genes with critical SNPs between VC1973A and V2984 with wild species are indicated in red. phyA, phytochome A; PKp1, plastidial pyruvate kinase were identified in syntenic regions (https://www.soybase.org) (Supplemental Table S12), indicating that the associated traits have been positively selected in mungbean during domestication. The selective sweep region chr7-1 and mungbean QTL SPM7-1 were located close to each other on Vr07 (∼0.3 Mb apart) (Supplemental Table S11; Table 2). In the soybean homologous region of chr7-1 and SPM7-1, a soybean QTL of seed_weight_50-16 was reported previously (Supplemental Table S12) (Kato et al., 2014). In mungbean, early and even pod maturity were reported to positively affect grain yield, indicating that QTL information on soybean homologous regions can be transferred to mungbean selective sweep  Table S12) (Chen et al., 2008).

DISCUSSION
In this study, we constructed a near-complete reference genome sequence of mungbean using long PacBio reads and an ultra-high-resolution linkage map constructed using resequencing data (Table 1). Compared with the previous draft assembly of the mungbean genome constructed using Illumina short reads, with possible polymerase chain reaction bias (Ha et al., 2019), and linkage maps based on the GBS strategy with nonuniform sequence coverage (Kang et al., 2014), the quality of the new mungbean reference genome sequence is significantly improved. Because highquality reference genome sequences of several economically important legume species have been updated, the accumulated genomic information on related legume species can be used to validate our assembly data without molecular cytogenetic approaches such as fluorescence in situ hybridization (Chamala et al., 2013;Hoang et al., 2020;Yang et al., 2012). Syntenic analysis with C. arietinum, M. truncatula, and V. angularis provided strong evidence that Chr03, Chr04, Chr05, and Chr08 were misassembled in the previous draft assembly, and these sequences have been corrected in the new assembly. The improved mungbean genome assembly will serve as a reference assembly for Vigna species and provides high-quality genomic information for comparative and translational genomics among legume species. Although determinate vegetative growth habit has been acquired during domestication, SPM has not yet been achieved in mungbean accessions, including in elite cultivars (Li et al., 2018). In soybean, the initiation of flowering marks the end of the juvenile growth period and subsequently affects plant morphology and grain yield . In mungbean, because indeterminate inflorescences are generated from each node on branches and the upper stem, plant morphology critically influences flowering capacity and eventually affects SPM and grain yield (Figure 3c). VC1973A, which is shorter and has fewer numbers of nodes and branches than V2984, is known to have higher SPM than V2984. Therefore, to elucidate the QTLs that influence SPM, all traits that are affected by the length of the juvenile growth phase, such as plant height, flower initiation, node number, branch number, and SPM itself, need to be evaluated.
In soybean, which is the best-characterized model legume related to mungbean, 10 major loci (E1-E9 and J) are involved in the regulation of flowering . To date, genes underlying E1-E4, E9, and J have been cloned and functionally characterized in soybean (Guo et al., 2015;Lu et al., 2017;Watanabe et al., 2009;Xia et al., 2012;Zhao et al., 2016). In this study, we identified mungbean homologs (Vradi09g00002773 and Vradi11g00000623) of phyA (E3) and ELF3 (J) as the candidate genes underlying Height4-1, SPM4-1 and Node4-1, and Node11-1, respectively (Table 2; Figure 5). The loss-of-function allele of phyA leads to a short juvenile growth phase and early flowering in soybean under long-day conditions due to photoperiod insensitivity (Watanabe et al., 2009). Functional characterization shows that ELF3 delays maturity in soybean, leading to increased plant height, node number, internode length, and grain yield under short-day conditions (Lu et al., 2017). Thus, VrELF3 (Vradi11g00000623) was identified as the candidate gene for Node11-1 (Table 2).
VrPhyA was previously identified as a candidate gene underlying mungbean QTLs DF3-1 and DFF3-1 (Supplemental Figure S5) (Hwang et al., 2017). In the current study, the QTL interval of DF3-1 and DFF3-1 was split into two intervals by Height4-1, Node4-1 and SPM4-1 QTL cluster and FI4-1 QTL. VrPhyA (Vradi09g00002773) was mapped within the interval containing Height4-1, Node4-1, and SPM4-1, while the interval intersecting the flowering QTL FI4-1 contained the bonsai gene (Vradi04g00002764), a homolog of AT1G73177, which is responsible for inflorescence development and shoot elongation (Saze & Kakutani, 2007). Although the bonsai mutant in petunia (Pentunia Juss.) regulates MADS-box genes, such as AP1, and resembles the phenotype of the leafy mutant in Arabidopsis, the bonsai gene was not investigated separately in the previous flowering QTL study because it was located in the same marker interval as the VrPhyA gene ( Figure 5) (Hwang et al., 2017;Schorderet et al., 2018). Our results indicate that mungbean homologs of bonsai and soybean E3 and J genes regulate juvenile growth traits, such as plant height and node number, as well as flowering initiation, thus affecting the degree of synchronicity in pod maturity (Figure 3).
Branch3-1 contained the AP2/ERF gene (Vradi03g00002294), a homolog of AtERF4 (AT3G15210), which negatively regulates ethylene and abscisic acid responses (Liu et al., 2019). Ethylene responsive factors function downstream of the ethylene signaling pathway in plant development and hormone signaling (Müller & Munné-Bosch, 2015). Pyruvate kinase (Vradi07g00000916) was identified at SMP7-1. The Arabidopsis homolog of PKp1 affects seed filling and embryo development (Baud et al., 2007). Only one gene (Vradi05g00000394) was identified at Height5-1, which is a homolog of leucine-rich repeat receptor-like kinase (LRR-RLK; AT1G06840). Although the function of LRR-RLK has not been characterized in any model species, it represents a large gene family predominantly involved in developmental processes and hormone perception (Dufayard et al., 2017). Interestingly, all six genes identified at FI9-1 were R genes. Because a shorter life cycle decreases the possibility of exposure to diseases and pathogens, vegetative lifespan has been associated with defense alleles. For example, expression levels of immunity-related genes were associated with flowering time in the Arabidopsis natural population (Glander et al., 2017). In soybean, QTLs for disease resistance are reportedly correlated with maturity (Sun et al., 2013), thus supporting the association between R genes and flowering time in mungbean.
The candidate genes showed different patterns of nucleotide diversity in landraces and wild accessions of mungbean compared with the reference cultivar (Supplemental Table S13). Landraces and wild mungbean accessions carry frameshift or stop-loss mutations in phyA, kinase, and PKp1 compared with the elite cultivar (Figure 3b), and diverse nucleotide variations were found in the upstream region, 5′ untranslated regions (5′UTR), exons, introns, 3′UTR, and downstream regions of bonsai, ELF3, and ERF4 in landraces and wild accessions (Supplemental Table S13). These genetic variations may have been caused by modern breeding. Our results can be used by breeders to improve mungbean elite cultivars with higher synchronicity of pod maturity (Figure 3c).
Using the near-complete mungbean reference genome, analysis of nucleotide variations in the natural population provided key insights into mungbean domestication, geographical isolation, and genetic admixture. Domestication of mungbean resulted in a dramatic reduction in genetic diversity. The level of nucleotide diversity in cultivated mungbean accessions (π = 2.33 × 10 −6 ) was only 30% of that in wild accessions (π = 7.63 × 10 −6 ) (Supplemental Figure S6). This reduction in nucleotide diversity due to domestication is worse than that in soybean, where the diversity in landraces and elite cultivars is approximately 47 and 35% of that in wild accessions, respectively (Zhou et al., 2015). Linkage disequilibrium also decayed over a longer distance in cultivated mungbean accessions than in wild accessions (decay cutoff at 71.5 and 2.5 kb, respectively) (Supplemental Figure  S9). This study illustrates that wild mungbean germplasm represents a key genetic resource for breeding (Muñoz et al., 2017).
Population structure analysis of mungbean cultivars indicated that geographical isolation occurred following the introduction of different mungbean cultivars in different locations across Asia. Because latitude affects day length and temperature, which have profound effects on growth and flowering, it is likely that this genetic differentiation was driven by selection for good agronomic performance at different latitudes ( Figure 4b). This suggests that exchanges of germplasms with different genetic backgrounds among countries will also improve the genetic diversity of mungbean cultivars (Figure 4d). Accurate genomic information is required for crop improvement. Here, we developed a near-complete genome sequence of mungbean, which can serve as a reference genome sequence for Vigna species. Genomic information of mungbean accessions have revealed that mungbean cultivars still have much potential to be improved. Additional phenotypic data from the 276 accessions will enable future genome-wide association study. Along with the expression profiles of the candidate genes, cross-validation of QTL and genome-wide association study results will help to validate the function of the candidate genes. The genetic and genomic data reported here will serve as excellent resources for genome-assisted breeding to improve crop quality in Asia.

A C K N O W L E D G M E N T S
This work was carried out with the support of Cooperative Research Program for Agriculture Science and Technology Development (Project No. PJ01582901), Rural Development Administration, Republic of Korea.

D A T A AVA I L A B I L I T Y S T A T E M E N T
Genome assembly and annotation data are available at http: //plantgenomics.snu.ac.kr/.

C O N F L I C T O F I N T E R E S T
The authors have no conflict of interest to declare.