Genomic Resequencing Unravels the Genetic Basis of Domestication, Expansion, and Trait Improvement in Morus atropurpurea

Abstract Mulberry is an economically important plant in the sericulture industry and traditional medicine. However, the genetic and evolutionary history of mulberry remains largely unknown. Here, this work presents the chromosome‐level genome assembly of Morus atropurpurea (M. atropurpurea), originating from south China. Population genomic analysis using 425 mulberry accessions reveals that cultivated mulberry is classified into two species, M. atropurpurea and M. alba, which may have originated from two different mulberry progenitors and have undergone independent and parallel domestication in north and south China, respectively. Extensive gene flow is revealed between different mulberry populations, contributing to genetic diversity in modern hybrid cultivars. This work also identifies the genetic architecture of flowering time and leaf size. In addition, the genomic structure and evolution of sex‐determining regions are identified. This study significantly advances the understanding of the genetic basis and domestication history of mulberry in the north and south, and provides valuable molecular markers of desirable traits for mulberry breeding.


Content
Table S4. The statistics of annotated genes by different databases of M. atropurpurea
Table S5. The content of major TE subfamilies in the updated genome of M. atropurpurea (Female)
The male plant was used for analysis.

Figure S3. The genome size estimate of M. atropurpurea. (a) Genome size of M. atropurpurea was measured by flow cytometry with Zea mays L. as an internal reference. The x-axis is the relative DNA quantity (Da). The y-axis is the number of counted cells. (b-c) The genome size was estimated from the distribution of 19-mer frequencies in the sequencing reads. The x-axis is the depth (X), and the y-axis is the proportion of K-mers at that depth, that is, the frequency at that depth divided by the total frequency over all depths. Overall, 35 Gb of data were retained for 19-mer analysis. The main distribution peak had approximately 71× coverage, and the genome size was estimated to be 308 Mb (genome size = K-mer number / peak depth). The small peak at half the main peak depth (approximately 36× coverage) indicates an intermediate heterozygosity rate in the genome. K-mers with a depth greater than twice the main peak depth (approximately 142× coverage) represent repetitive sequences. (d) Estimate of mulberry genome size based on 19-mer statistics.
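The formula quoted in the caption (genome size = K-mer number / peak depth) can be illustrated with a minimal sketch. The histogram below is toy data, not the study's 19-mer counts, and the function name and the `min_depth` cutoff for discarding error K-mers are assumptions made for illustration. As a consistency check on the caption's own figures, a 308 Mb genome at a peak depth of 71 implies roughly 2.2 × 10^10 K-mers.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    min_depth is an assumed cutoff: lower depths are treated as
    sequencing-error k-mers and ignored.
    """
    usable = {d: c for d, c in histogram.items() if d >= min_depth}
    # Main peak: the depth with the most distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences: each distinct k-mer counts 'depth' times.
    total_kmers = sum(d * c for d, c in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram with a main peak at 71, a half-depth (heterozygous)
# shoulder near 36, and a small repeat tail at 142.
toy_histogram = {1: 900, 2: 300, 35: 20, 36: 30, 37: 20,
                 70: 80, 71: 100, 72: 80, 142: 10}
size, peak = estimate_genome_size(toy_histogram)
print(f"peak depth = {peak}, estimated size = {size:.1f} k-mer positions")
```

Dedicated tools fit a statistical model to the histogram rather than taking a single peak, so this sketch only mirrors the simple ratio stated in the caption.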

Figure S4. Collinearity analysis of M. atropurpurea. (a-b) Validation of Hi-C-assisted pseudochromosome assembly by calculating the thermal interaction correlation for 'Tang 10' and 'huiqiu1'. The coordinates of the dots represent the physical locations (x-axis) and map locations (y-axis) of the markers. (c) Collinearity analysis of M. atropurpurea with M. alba. x-axis, M. alba; y-axis, M. atropurpurea.

Figure S5. Chromosome synteny and structural variation between M. atropurpurea and M. alba.

Figure S6. Schematic representation of syntenies among M. atropurpurea, Populus trichocarpa and grape genomes. Each line represents a syntenic region. Red lines highlight the one-to-one syntenic relationships between M. atropurpurea and grape or poplar.

Figure S7. Density distribution of (a) 4DTv (fourfold synonymous third-codon transversion) and (b) Ks for paralogous genes in M. atropurpurea. The color code of the curves is shown in the inset.

Figure S9. Functional enrichment analysis of significantly (P < 0.05) expanded and contracted gene families in M. atropurpurea. (a-b) Gene Ontology (GO) functional analysis of (a) expanded gene families and (b) contracted gene families. (c-d) Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of (c) expanded gene families and (d) contracted gene families. The male plant was used for analysis.
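Enrichment of a GO term or KEGG pathway in a gene set is commonly assessed with a one-sided hypergeometric test. The caption does not state which test was used here, so the following is only a sketch of that common approach; the function name and the toy counts are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric P value, P(X >= k).

    N: annotated genes in the background
    K: background genes carrying the term
    n: genes in the tested set (e.g. an expanded family set)
    k: tested genes carrying the term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers only: 8 of 40 tested genes carry a term found in
# 100 of 2000 background genes.
p_toy = hypergeom_enrichment_p(k=8, n=40, K=100, N=2000)
print(f"enrichment P = {p_toy:.3g}")
```

In practice the P values for all tested terms would also be corrected for multiple testing before applying a threshold such as 0.05.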

Figure S10. Rate of change of the admixture cross-validation (CV) error for K values ranging from 1 to 12.
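The caption does not define "change rate". One plausible reading, assumed here, is the relative change in CV error between consecutive K values. The sketch below uses made-up CV errors, and the function name is hypothetical.

```python
def cv_error_change_rate(cv_errors):
    """Relative change in CV error from each K to the next.

    cv_errors maps K -> cross-validation error. The result maps each K
    (except the smallest) to (error[K] - error[K-1]) / error[K-1].
    This definition is an assumption; the source does not spell it out.
    """
    ks = sorted(cv_errors)
    return {
        k: (cv_errors[k] - cv_errors[prev]) / cv_errors[prev]
        for prev, k in zip(ks, ks[1:])
    }


# Toy CV errors, not values from the study.
toy_cv = {1: 0.60, 2: 0.50, 3: 0.45, 4: 0.44, 5: 0.46}
rates = cv_error_change_rate(toy_cv)
best_k = min(toy_cv, key=toy_cv.get)  # K with the lowest CV error
for k in sorted(rates):
    print(f"K={k}: change rate {rates[k]:+.3f}")
print("lowest CV error at K =", best_k)
```

A change rate that turns positive marks the point where adding another cluster stops improving the fit.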

Figure S11. Admixture of mulberry accessions based on different numbers of clusters (K = 2-5). Population structure at K = 1-12, in which each individual is represented by a vertical color-coded column indicating the subgroup to which the mulberry accession belongs and the proportion of its genetic components derived from ancestral populations.

Figure S12. Relatedness estimation between kinship and population structure. PI_HAT indicates the proportion of identity by descent (IBD).
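PI_HAT is the column name PLINK uses for the estimated proportion of the genome shared IBD, computed as P(IBD=2) + 0.5 × P(IBD=1). Assuming the caption follows that convention (the tool is not named in this section), the relationship can be sketched as follows; the function name is hypothetical.

```python
def pi_hat(z0, z1, z2):
    """Proportion IBD from the probabilities of sharing 0, 1 or 2 alleles IBD.

    Follows the PLINK convention PI_HAT = P(IBD=2) + 0.5 * P(IBD=1).
    """
    if abs(z0 + z1 + z2 - 1.0) > 1e-6:
        raise ValueError("IBD probabilities must sum to 1")
    return z2 + 0.5 * z1


# Theoretical expectations for a few relationships.
print("unrelated:       ", pi_hat(1.0, 0.0, 0.0))
print("parent-offspring:", pi_hat(0.0, 1.0, 0.0))
print("full siblings:   ", pi_hat(0.25, 0.5, 0.25))
print("duplicate/clone: ", pi_hat(0.0, 0.0, 1.0))
```

For a clonally propagated crop, accession pairs with PI_HAT close to 1 are candidates for duplicated or renamed material.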

Figure S13. Comparison, within the phylogenetic tree, of the mulberry accessions from this study with the accessions published by Jiao et al. (2020).

Figure S14. Analysis of gene flow among different populations and geographic areas in mulberry. (a) Inferred population splits and migrations among different geographic areas of China and (b) among different genetic populations of mulberry accessions, from the TreeMix analysis. The orange lines show potential migration events among groups. Groups: M. atropurpurea (including the Landrace1, Landrace2, and MECMA (elite cultivars) groups), CIH (interspecific hybrids from China), JIH (interspecific hybrids from Japan), MA (M. alba), and MM (M. multicaulis). (c) Population splits and migrations among accessions from different provinces of China. The arrows indicate the direction of gene flow.

Figure S15. Estimates of the effective population size (Ne) for each subgroup of M. alba (MM and MA) and M. atropurpurea (Landrace1, Landrace2, and MECMA) using SMC++. The generation time (g) was set to 1 year and the mutation rate (µ) to 7 × 10⁻⁹ per site per generation.

Figure S16. Demographic history of mulberry. (a) Comparison of effective population size change over time of mulberry from Southeast Asian countries with MAT. MAT, M. atropurpurea. (b) Comparison of effective population size change over time of mulberry from Japan and Argentina with MAM. MAM, M. alba (MA) and M. multicaulis (MM).

Figure S17. Quantile-quantile plots for the GWAS of key agronomic traits in the mulberry population. The horizontal axis shows −log10-transformed expected P values, and the vertical axis shows −log10-transformed observed P values. (a) Leaf size. (b) Leaf weight. (c) Flowering time. (d) Sex.
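The axes described in the caption can be computed as in the sketch below. Under the null hypothesis, P values are uniform on (0, 1), so the i-th smallest of n P values is compared with an expected quantile. The (i − 0.5)/n plotting position used here is one common convention and is an assumption, as are the function name and the toy P values.

```python
from math import log10


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs for a Q-Q plot.

    The i-th smallest observed P value (i = 1..n) is paired with the
    expected uniform quantile (i - 0.5) / n. Points are ordered from the
    most to the least significant observation.
    """
    observed = sorted(p_values)
    n = len(observed)
    return [
        (-log10((i + 0.5) / n), -log10(p))  # i is 0-based here
        for i, p in enumerate(observed)
    ]


# Toy P values, not GWAS results from the study.
pts = qq_points([0.5, 0.05, 0.005, 0.9])
for expected, obs in pts:
    print(f"expected {expected:.3f}  observed {obs:.3f}")
```

Points lying along the diagonal indicate that the test statistics are well calibrated, while an early, genome-wide departure above it suggests uncorrected population structure.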