Genome‐wide association study of soluble solids content, flesh color, and fruit shape in citron watermelon

Fruit quality traits are crucial determinants of consumers’ willingness to purchase watermelon produce, making them major goals for breeding programs. There is limited information on the genetic underpinnings of fruit quality traits in watermelon. A total of 125 citron watermelon (Citrullus amarus) accessions were genotyped using single nucleotide polymorphism (SNP) markers generated via whole‐genome resequencing. A total of 2,126,759 genome‐wide SNP markers were used to uncover marker‐trait associations using single‐ and multi‐locus genome‐wide association study (GWAS) models. High broad‐sense heritability for fruit quality traits was detected. Correlation analysis among traits revealed positive relationships, with the exception of the correlation between fruit diameter and fruit shape index (ratio of fruit length to fruit diameter), which was negative. A total of 37 significant SNP markers associated with soluble solids content, flesh color, fruit length, fruit diameter, and fruit shape index traits were uncovered. These peak SNPs explained 2.1%–23.4% of the phenotypic variation, indicating the quantitative inheritance of the evaluated traits. Candidate genes relevant to fruit quality traits were uncovered on chromosomes Ca01, Ca03, Ca06, and Ca07. These significant molecular markers and candidate genes will be useful in marker‐assisted breeding of fruit quality traits in watermelon.

Soluble solids content and flesh color are two of the major internal fruit quality characteristics that influence consumer choice (Kaiser, 2012; Maynard, 2001). Fruit flesh color is a prominent quality trait in watermelon with red, yellow, and orange being the most consumed types worldwide (Wehner, 2008). Both soluble solids content and flesh color are polygenically inherited traits (Bang et al., 2010; Hashizume et al., 2003). Broad-sense heritability for soluble solids content ranging from moderate to high in inter-specific bi-parental populations generated from C. lanatus and C. amarus has been reported (Ren et al., 2014). Genomic regions controlling soluble solids content and flesh color on chromosomes 2, 4, 5, and 10 in C. lanatus have been reported (Fall et al., 2019; Guo et al., 2019; Wu et al., 2019). Fruit shape is an important horticultural trait in commercial watermelon breeding. Fruit shape can be used to categorize watermelon into market classes as shape greatly influences the number of fruits that can be packed into standard boxes before shipment and marketing of the produce (Kaiser, 2012; Wehner, 2008). The oval or blocky-oval fruit shape is preferred for packing in standard shipment boxes, especially in the seedless watermelon industry (Maynard, 2001; Wehner, 2008). For diploid watermelon markets and where consumers prefer large fruit size, elongate fruit shape is more common (Kaiser, 2012; Wehner, 2008). Fruit shape is a function of length, diameter, and ratio of length to diameter or fruit shape index. In watermelon, fruit shape can be classified as round, oval, or elongate (UPOV, 2013; Wehner, 2008; Wehner et al., 2001). Fruit shape is quantitatively inherited with genomic regions on chromosomes 1, 2, 3, and 6 reported in previous genetic studies (Dou et al., 2018; Guo et al., 2019; Wu et al., 2019).
The continuous improvement of DNA sequencing platforms and cost reductions have facilitated the development of chromosome-scale genomes and high-density molecular markers in watermelon (Guo et al., 2019; Katuuramu et al., 2022; Wu et al., 2019). Availability of these genomic resources has enabled implementation of genome-wide association mapping to uncover genomic regions and causal polymorphisms underlying economically important traits in watermelon. The genome-wide association study (GWAS) trait mapping methodology utilizes densely genotyped diverse germplasm with historical meiotic events and high allelic diversity to uncover marker-trait associations (Ingvarsson & Street, 2011; Korte & Farlow, 2013; Zhu et al., 2008). In this project, 125 citron watermelon accessions were extensively phenotyped for fruit quality traits (soluble solids content, flesh color, and fruit shape) and GWAS was applied to these phenotypes to uncover marker-trait associations using both single- and multi-locus models.

Core Ideas
• A panel of citron watermelon accessions was extensively phenotyped for fruit quality traits over two field growing seasons.
• Using 125 accessions and 2,126,759 genome-wide SNP markers, three GWAS models were fitted to discover marker-trait associations.
• Genomic loci and candidate genes were identified and will be useful in the improvement of watermelon for fruit quality traits.

Plant materials, cultural practices, and experimental design
A total of 125 citron watermelon accessions were received from the USDA germplasm repository in Griffin, GA, and used in this research. This germplasm has been described and used previously in a downy mildew-citron watermelon pathosystem genetic study (Katuuramu et al., 2022). The genotypes were grown in the field over 2 years (2019 and 2022) at the U.S. Vegetable Laboratory (USVL) research site located in Charleston, SC. The soil type is mostly Yonges loamy fine sand with a pH of 6.0, organic matter of 0.9%, and cation exchange capacity of 4.7 meq 100 g⁻¹. In both years, the field site was double-cultivated using a tractor-mounted offset disc harrow (John Deere, Model 425) and a tractor-mounted Perfecta field cultivator (Unverferth) to loosen the soil and destroy any winter/spring vegetation before forming raised beds and laying of the plastic mulch. Raised field beds were created using the Kennco superbedder (Kennco Manufacturing Co.). The field beds were 121.9-m long, 91.4-cm wide, and 20.3-cm high. The distance between beds was 3.4 m. Subsurface drip irrigation was applied using drip tapes placed centrally underneath the plastic mulch on all raised beds prior to transplanting to maintain adequate soil moisture. The drip tape specifications were 16-mm hose diameter, 0.20-mm wall thickness, 30-cm emitter spacing, with an emitter flow rate of 1.0 L h⁻¹ when the pressure regulator was set to 8 psi (Aqua-Traxx; Toro Agricultural Irrigation). The beds were sprayed with herbicides and the next day they were covered with a black and white totally impermeable film (TIF) plastic mulch (0.03-mm thick; Polygro LLC) using the Kennco superbedder prior to transplanting.
Seeds for each accession were sown directly into Metro-Mix 360 soilless media (Sun Gro Horticulture) in 36-cell vegetable propagation trays (120 cm³ cell size; T.O. Plastics Inc.) in the greenhouse at the USVL. Four-week-old seedlings were transplanted by hand to the field plots. In both years, there were three plants per genotype replicated three times. Within each genotype, individual plants were spaced 1.8 m apart, while the space between genotypes was 2.7 m. Fertilization was conducted via injection into the drip irrigation system with approximately 168 kg ha⁻¹ of nitrogen applied over the whole growing season. Weed management was accomplished using a combination of pre-transplanting herbicide application, plastic mulch, and weekly hand pulling. The experimental design in both 2019 and 2022 field seasons was a randomized complete block design with three replications.

Phenotypic data collection and statistical analyses
Citron watermelon fruits were harvested at physiological maturity based on the presence of a brown and dry tendril at the node bearing the fruit, dull and waxy fruit surface, and light-colored ground-spot on the fruit (Maynard, 2001). The majority of the citron watermelon genotypes exhibited a concentrated (uniform) fruit-set pattern, and fruit harvest for the entire trial was completed over 3 weeks. Three fruits per genotype within each replication were sampled for evaluation of fruit quality traits. Fruit samples were phenotyped based on the fruit quality traits descriptor list for watermelon (UPOV, 2013; Wehner, 2008; Wehner et al., 2001). Fruits were cut open longitudinally (from point of pedicel attachment to blossom end apex) and evaluated for soluble solids content, flesh color, fruit length (distance from pedicel to apex), fruit diameter (distance across widest part of fruit [transverse section] from edge to edge), fruit shape index, and fruit shape. Fruit cores were removed from the center of the cut fruit surface and crushed to extract juice for soluble solids content analysis. A drop of juice from each citron watermelon accession was placed on a hand-held digital refractometer to record soluble solids content in % Brix (ATAGO). Flesh color was visually scored as red, yellow, orange, green, or white. To assess fruit length and fruit diameter, cut fruits for every genotype were photographed using a Sony Cyber-shot DSC-HX400V camera equipped with a 24- to 1200-mm ZEISS lens (SONY). Photographs (RGB images) were then processed using Fiji (ImageJ) software to calculate fruit length and fruit diameter in millimeters (Schindelin et al., 2012). Fruit shape index was computed as the ratio of length to diameter. Fruit shape index distribution was used to delineate fruit shape as round (<1.1), oval (1.1–1.3), or elongate (>1.3) (UPOV, 2013; Wehner, 2008; Wehner et al., 2001).
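The fruit shape index and the three shape classes described above can be sketched in a few lines of Python. This is only an illustration, not the authors' workflow: the function names are hypothetical, and treating the boundary values 1.1 and 1.3 as oval is an assumption, since the text gives the thresholds but not how ties were handled.

```python
def fruit_shape_index(length_mm: float, diameter_mm: float) -> float:
    """Return the ratio of fruit length to fruit diameter."""
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    return length_mm / diameter_mm


def classify_fruit_shape(shape_index: float) -> str:
    """Map a shape index to round (<1.1), oval (1.1-1.3), or elongate (>1.3).

    Assumption: the boundary values 1.1 and 1.3 are counted as oval.
    """
    if shape_index < 1.1:
        return "round"
    if shape_index <= 1.3:
        return "oval"
    return "elongate"


# Hypothetical fruit: 180 mm long and 120 mm wide gives an index of 1.5.
example_index = fruit_shape_index(180.0, 120.0)
example_shape = classify_fruit_shape(example_index)
```

With these illustrative measurements the index exceeds 1.3, so the fruit falls in the elongate class.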
Statistical analyses on the fruit quality data were performed using PROC MIXED in SAS v9.4 (SAS Institute, 2013). The statistical model for PROC MIXED was given as follows:

$$Y_{ijk} = \mu + G_i + S_j + GS_{ij} + \mathrm{rep}(S)_{jk} + \varepsilon_{ijk}$$

where $Y_{ijk}$ is the fruit quality trait value of the ith genotype in the kth replication of the jth field season, $\mu$ is the grand mean, $G_i$ is the fixed effect of the ith genotype, $S_j$ is the random effect of the jth field season, $GS_{ij}$ is the random interaction term between the ith genotype and the jth field season, $\mathrm{rep}(S)_{jk}$ is the random effect of kth replication within the jth field season, and $\varepsilon_{ijk}$ is the residual error. The raw traits data were transformed to best linear unbiased estimates (BLUEs) using the restricted maximum likelihood (REML) method in SAS v9.4 (SAS Institute, 2013) based on the above-mentioned statistical model. The BLUEs were then used as trait input files to execute GWAS. Pearson correlation among traits was performed using PROC CORR in SAS v9.4 (SAS Institute, 2013). Variance components for estimating broad-sense heritability were calculated using PROC VARCOMP in SAS v9.4 using the REML method (SAS Institute, 2013). Broad-sense heritability on an entry-mean basis was calculated using the mathematical formula proposed in Holland et al. (2003) as follows:

$$H^2 = \frac{\mathrm{Var}(G)}{\mathrm{Var}(G) + \dfrac{\mathrm{Var}(GS)}{s} + \dfrac{\mathrm{Var}(\mathrm{Error})}{sr}}$$

where Var(G) is genotypic variance, Var(GS) is the genotype by field season variance, and Var(Error) is the residual error variance. The denominators s and r represent the number of field seasons and replications, respectively.
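As a rough illustration of the entry-mean heritability calculation, the sketch below assumes the usual form in which the genotype-by-season variance is divided by the number of seasons (s) and the residual variance by the product of seasons and replications (s × r). The function name and the variance component values are hypothetical; they are not estimates from this study.

```python
def entry_mean_heritability(var_g: float, var_gs: float, var_error: float,
                            n_seasons: int, n_reps: int) -> float:
    """Broad-sense heritability on an entry-mean basis.

    H2 = Var(G) / (Var(G) + Var(GS)/s + Var(Error)/(s*r))
    """
    phenotypic_variance = (
        var_g
        + var_gs / n_seasons
        + var_error / (n_seasons * n_reps)
    )
    return var_g / phenotypic_variance


# Hypothetical variance components with s = 2 seasons and r = 3 replications,
# matching the trial layout described in the text.
h2_example = entry_mean_heritability(
    var_g=10.0, var_gs=1.0, var_error=3.0, n_seasons=2, n_reps=3
)
```

Here the denominator is 10 + 1/2 + 3/6 = 11, so the toy estimate is 10/11, or about 0.91. Averaging over more seasons or replications shrinks the non-genetic terms and pushes the estimate toward 1.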

Genotyping, kinship, and population structure
The citron watermelon collection was genotyped using whole-genome resequencing. Details on genomic DNA isolation, sequencing technologies, and single nucleotide polymorphism (SNP) calling have been presented previously (Katuuramu et al., 2022). Briefly, whole-genome resequencing data for the 125 citron watermelon (C. amarus) accessions were obtained using Illumina NovaSeq 6000 sequencing technology (Illumina). Shotgun genomic libraries were sequenced on a single lane using the Illumina NovaSeq 6000 machine to generate 150 bp paired-end reads. Reads were filtered and trimmed as presented previously in Katuuramu et al. (2022). Variant discovery and quality control were conducted using the GATK v3.6 best practices workflow (DePristo et al., 2011; McKenna et al., 2010; van der Auwera et al., 2013). After filtering for minor allele frequency (≥0.05), a total of 2,126,759 SNPs remained and were used in population structure, relatedness, and marker-trait association analyses. A kinship matrix to account for relatedness in GWAS was created using the VanRaden method in GAPIT v3.0 within R v4.2.0 (VanRaden, 2008; Wang & Zhang, 2021). Population structure was assessed using neighbor-joining clustering and principal component analysis (PCA). The phylogenetic tree matrix was created by invoking the neighbor-joining tree algorithm in TASSEL v5.2 software on the command-line (Bradbury et al., 2007). The neighbor-joining dendrogram was displayed using FigTree software (http://tree.bio.ed.ac.uk/software/figtree/). Principal component (PC) analysis to control for population structure was conducted using the prcomp function in R v4.2.0 within GAPIT v3.0 (R Core Team, 2022; Wang & Zhang, 2021). The optimum number of PCs to include in the GWAS models for every trait was determined using the scree plot and Bayesian information criterion (Schwarz, 1978).
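To make the minor allele frequency filter and the VanRaden kinship calculation more concrete, here is a small pure-Python sketch. It assumes genotypes are coded as 0/1/2 counts of one allele and uses the commonly cited first VanRaden formulation, G = ZZ' / (2 Σ p(1 − p)); the GAPIT implementation may differ in details such as missing-data handling. All function names and the toy genotype matrix are hypothetical.

```python
def allele_frequencies(genotypes):
    """Frequency of the counted allele per SNP (rows = accessions, values 0/1/2)."""
    n_accessions = len(genotypes)
    n_snps = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_accessions)
        for j in range(n_snps)
    ]


def filter_by_maf(genotypes, min_maf=0.05):
    """Keep SNP columns whose minor allele frequency is at least min_maf."""
    freqs = allele_frequencies(genotypes)
    keep = [j for j, p in enumerate(freqs) if min(p, 1 - p) >= min_maf]
    return [[row[j] for j in keep] for row in genotypes], keep


def vanraden_kinship(genotypes):
    """Kinship as Z Z' / (2 * sum(p * (1 - p))), with Z = genotypes - 2p."""
    freqs = allele_frequencies(genotypes)
    scale = 2 * sum(p * (1 - p) for p in freqs)
    centered = [[g - 2 * p for g, p in zip(row, freqs)] for row in genotypes]
    n = len(centered)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Toy matrix: 4 accessions x 3 SNPs; the third SNP is monomorphic (MAF = 0).
toy_genotypes = [[0, 2, 0], [2, 0, 0], [0, 2, 0], [2, 0, 0]]
filtered, kept_columns = filter_by_maf(toy_genotypes, min_maf=0.05)
kinship = vanraden_kinship(filtered)
```

In this toy example the monomorphic SNP is dropped by the filter, and accessions with identical genotypes receive a positive kinship value while accessions with opposite genotypes receive a negative one.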

Genome-wide association study
Three models were used to perform GWAS: mixed linear model (MLM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK) (Huang et al., 2018; Liu et al., 2016; Yu et al., 2006). The MLM is a single-locus model testing one marker at a time during analysis, while both FarmCPU and BLINK are multiple loci models (Huang et al., 2018; Liu et al., 2016; Yu et al., 2006). The MLM, first published in 2006, incorporates kinship and population structure matrices to control for false negatives and positives. Its single-marker testing makes it computationally intensive and limits its statistical power (Wang & Zhang, 2021; Yu et al., 2006). FarmCPU uses forward-backward stepwise regression to incorporate multiple markers as covariates to limit confounding between markers and kinship (Liu et al., 2016). FarmCPU incorporates a fixed-effects model and a random-effects model in an iterative fashion resulting in increased statistical power, computational efficiency, and control for both false positives and negatives (Liu et al., 2016; Wang & Zhang, 2021). BLINK runs two fixed-effect models: the first one tests for associations between markers and trait values with pseudo quantitative trait nucleotides (QTNs; i.e., SNP markers previously associated with the trait) as covariates. In the second step, BLINK optimizes the selection of QTNs to be included as covariates for the first step using the Bayesian information criterion. BLINK has increased statistical power, is computationally efficient for big genotypic datasets with millions of SNP markers, and better controls both false positives and false negatives compared to the preceding models such as FarmCPU and MLM (Huang et al., 2018; Liu et al., 2016; Wang & Zhang, 2021).
Genome-wide Manhattan and quantile-quantile (QQ) plots to visualize marker-trait associations and model fitting were created using the CMplot package in R v4.2.0 (R Core Team, 2022; Yin, 2022). Model-trait associations exhibiting irregular deviations in QQ plots are not presented in the results. Correction for multiple marker-trait association tests was accomplished using the false discovery rate method at an alpha level of 5% (Benjamini & Hochberg, 1995). Population structure, kinship, and GWAS analyses were implemented in GAPIT v3.0 within R v4.2.0 (Lipka et al., 2012; R Core Team, 2022). The likelihood ratio-based R² statistic was used to calculate the phenotypic variation explained (PVE) by the peak SNP markers for every trait (Sun et al., 2010). Comparison of the allelic effects of the significant SNP markers on the phenotypic means was performed using the Student's two-tailed t test and visualized in R v4.2.0 (R Core Team, 2022). Candidate genes were selected within a ±85 kb interval from the peak SNP for the traits of interest using the released USVL 246-FR2 C. amarus genome (http://cucurbitgenomics.org/). The search window was chosen based on the linkage disequilibrium pattern present in this citron watermelon collection (Katuuramu et al., 2022). Putative candidate genes were those whose annotation, description, and gene ontology functions are relevant to molecular control of the traits evaluated in this research.
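The false discovery rate correction mentioned above is a step-up procedure: p values are ranked, and every test up to the largest rank k whose p value satisfies p ≤ (k/m)·α is declared significant. The sketch below is a minimal standalone illustration of that procedure; it is not the GAPIT code, and the function name and example p values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests significant at FDR level alpha."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p values for five marker-trait tests.
calls = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6], alpha=0.05)
```

Because the procedure is step-up, a test with a p value above its own rank threshold can still be declared significant when a higher-ranked test passes; the second assertion in a quick check such as `benjamini_hochberg([0.04, 0.045, 0.049])` shows this behavior.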

Phenotypic summary statistics, heritability, and trait correlations
Soluble solids content varied from 0.9% to 6.8% Brix with a 7.6-fold variation and a broad-sense heritability of 0.97 (Table 1; Figure 1). Five genotypes had a red flesh color, 25 were yellow, 15 had orange flesh, seven were green, and 73 accessions had white flesh color (Figure 1). Fruit flesh color had a 0.96 broad-sense heritability estimate. Fruit length ranged from 88.5 to 242.9 mm with a 2.7-fold variation and had a heritability of 0.94 (Table 1; Figure 1). Fruit diameter varied from 90.5 to 209.1 mm with a 2.3-fold variation and a heritability of 0.89 (Table 1; Figure 1). Fruit shape index ranged from 0.7 to 1.9 with a 2.7-fold variation and had a heritability estimate of 0.93 (Table 1; Figure 1). Based on the fruit shape index distribution, a total of 25 genotypes had an elongate fruit shape, 34 were oval, and 66 accessions had a round fruit shape (Figure 1). Correlation analysis revealed presence of low to moderate relationships among traits (Table 2). Soluble solids content was positively correlated to fruit length (r = 0.50) and fruit diameter (r = 0.41; Table 2). Fruit length was positively correlated to fruit diameter (r = 0.38) and fruit shape index (r = 0.67; Table 2). Fruit diameter was negatively correlated to fruit shape index (r = −0.40; Table 2).
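The fold-variation figures quoted above are consistent with a simple ratio of the maximum to the minimum trait value, which is assumed here to be how they were derived. The short sketch below recomputes them from the reported ranges; the function and variable names are illustrative only.

```python
def fold_variation(minimum: float, maximum: float) -> float:
    """Ratio of the largest to the smallest observed trait value."""
    return maximum / minimum


# Trait ranges as reported in the text (minimum, maximum).
trait_ranges = {
    "soluble solids content (% Brix)": (0.9, 6.8),
    "fruit length (mm)": (88.5, 242.9),
    "fruit diameter (mm)": (90.5, 209.1),
    "fruit shape index": (0.7, 1.9),
}

folds = {
    trait: round(fold_variation(low, high), 1)
    for trait, (low, high) in trait_ranges.items()
}
```

Rounded to one decimal place, the ratios come out at 7.6, 2.7, 2.3, and 2.7, matching the values given for the four traits.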

Population structure
Neighbor-joining phylogenetic tree and PCA based on SNP markers were used to determine clustering of the genetic variation present in the C. amarus collection (Figure S1). There was subtle genetic clustering within the core collection especially since 112 (89.6%) of the genotypes were collected/received from Southern Africa, three were from Asia, five were collected from Europe, and five were received from North America (Table S1; Figure S1). The first and second PCs accounted for 22.1% and 4.7% of the genetic variation present in the C. amarus core collection, respectively (Figure S1).
Candidate genes relevant to sugar transport and accumulation as well as fruit development and fruit shape were identified. Candidate gene CaU06G09710 was located 83.9 kb downstream of the peak SNP S6_11610218 associated with soluble solids content on chromosome Ca06 and encodes a sugar transporter. A significant SNP S3_30466436 located on chromosome Ca03 was associated with fruit length, fruit diameter, and fruit shape index (Table 3). This SNP was consistently detected by the three GWAS models of BLINK, FarmCPU, and MLM. The SNP was located inside candidate gene CaU03G18580 that codes for an IQ domain protein. Candidate gene CaU01G17460 was located 82.5 kb downstream of the significant SNP S1_27855991 associated with fruit diameter on chromosome Ca01 and codes for a WD40 repeat protein. Another peak SNP S7_2706920 associated with fruit diameter is co-located with candidate gene CaU07G02850. The CaU07G02850 candidate gene was found 22.6 kb downstream of the significant SNP and codes for a squamosa promoter binding protein.
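The ±85 kb candidate-gene search interval can be illustrated with a short sketch. It assumes that SNP identifiers follow the pattern S<chromosome>_<position in bp>, which is inferred from names such as S6_11610218 rather than stated in the text, and the helper names are hypothetical.

```python
WINDOW_BP = 85_000  # +/- 85 kb search interval around each peak SNP


def parse_snp_id(snp_id: str):
    """Split an ID such as 'S6_11610218' into (chromosome number, position in bp).

    Assumption: IDs follow the pattern S<chromosome>_<position>.
    """
    chromosome, position = snp_id.lstrip("S").split("_")
    return int(chromosome), int(position)


def candidate_window(position_bp: int, window_bp: int = WINDOW_BP):
    """Return the (start, end) of the search interval, clipped at zero."""
    return max(0, position_bp - window_bp), position_bp + window_bp


def within_window(distance_bp: float, window_bp: int = WINDOW_BP) -> bool:
    """True if a gene at this absolute distance from the SNP is inside the window."""
    return abs(distance_bp) <= window_bp


chromosome, position = parse_snp_id("S6_11610218")
window = candidate_window(position)
# The sugar transporter gene is reported 83.9 kb downstream of this SNP.
sugar_transporter_in_window = within_window(83_900)
```

For the soluble solids peak SNP on Ca06, the interval spans 11,525,218 to 11,695,218 bp, and a gene 83.9 kb away falls just inside it; the 82.5 kb and 22.6 kb distances reported for the other candidate genes are also within the window.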

Allelic effects of the significant SNPs on the fruit quality traits
Allelic effects of the peak SNP markers on the fruit quality traits were assessed (Figures 7 and 8). The AA allele of SNP marker S1_13515500 on chromosome Ca01 resulted in higher soluble solids content (mean = 2.2% Brix) compared with a mean of 1.6% Brix among genotypes with the homozygous GG alternative allele (p = 9.7 × 10⁻⁴; Figure 7). Genotypes with the GG allele for SNP marker S1_15333555 on chromosome Ca01 had a lower mean soluble solids content of 1.5% Brix compared to accessions carrying the TT allele with a mean of 2.2% Brix (p = 5.3 × 10⁻⁵; Figure 7). For the CC and TT alleles of SNP markers S6_11610218 and S9_38139503 on chromosomes Ca06 and Ca09, there were no significant differences in soluble solids content (Figure 7). For fruit flesh color, the major allelic class CC of SNP marker S1_5486012 on chromosome Ca01 was predominantly associated with white flesh color (56.5%), while genotypes with the minor allele TT had mostly orange or yellow flesh colors (Table S2). The major allele AA of SNP S2_6968474 on chromosome Ca02 was associated mostly with white fleshed accessions (43.7%), while the minor allele GG was associated with yellow (14.6%) and white (16.5%) fruit color phenotypes. The major allele GG of SNP marker S10_15722135 on chromosome Ca10 was largely associated with white flesh color, while minor allele TT was associated with orange (10.5%) and yellow (8.9%) flesh colors (Table S2). The TT major allele of SNP S11_5350066 on chromosome Ca11 was largely associated with white flesh color (52.4%), while genotypes carrying the GG minor allele were mostly yellow and white fleshed. For SNP marker S11_20091563 on chromosome Ca11, genotypes with the CC major allele were predominantly associated with white flesh color (58.5%), while accessions carrying the minor allele TT were mostly yellow fleshed (Table S2).

TABLE 3 Details of the peak single nucleotide polymorphisms (SNPs) associated with soluble solids content, flesh color, fruit length, fruit diameter, and fruit shape index traits across the 125 citron watermelon genotypes evaluated over two field seasons at the U.S. Vegetable Laboratory research station in Charleston, SC.

For fruit length, genotypes with the AA allele for SNP marker S3_30466135 on chromosome Ca03 had longer fruits (mean = 180.7 mm) compared with the GG allele (mean = 138.9 mm; p = 3.1 × 10⁻⁹; Figure 7). The CC allele of SNP marker S3_30466436 on chromosome Ca03 resulted in shorter fruits (mean = 139.4 mm) compared to the TT allele with a mean of 180.5 mm at a p value of 2.9 × 10⁻⁹ (Figure 7). For SNP marker S3_30466640, genotypes with the CC allele had longer fruits (mean = 180.7 mm) compared to accessions with the GG allele (mean = 138.7 mm; p = 2.9 × 10⁻⁹; Figure 7). A haplotype combination involving allelic states AA, TT, and CC of S3_30466135, S3_30466436, and S3_30466640 SNP markers, respectively, on chromosome Ca03 corresponded to longer fruit length, and there were 40 accessions carrying the AATTCC haplotype in the citron watermelon germplasm collection. For alleles CC and GG of SNP marker S10_17320207 on chromosome Ca10, there was no significant difference in fruit length (Figure 7).
For fruit diameter, the GG allele of SNP marker S1_27855991 on chromosome Ca01 resulted in larger diameter values (mean = 153.6 mm) compared to the TT allele (mean = 120.1 mm; p = 3.8 × 10⁻⁶; Figure 8). Accessions with the CC allele of SNP marker S3_30466436 on chromosome Ca03 had larger fruit diameter readings (mean = 154.3 mm) compared to accessions carrying the TT allele (mean = 135.4 mm; p = 3.1 × 10⁻⁴; Figure 8). Genotypes with the CC allele for SNP S8_12874511 had larger fruit diameter readings (mean = 165.5 mm) compared to those with the TT allele (mean = 126.5 mm; p = 2.8 × 10⁻¹⁷; Figure 8). For fruit shape index, the CC allele for SNP marker S3_30466436 on chromosome Ca03 resulted in a smaller value (mean = 0.8) compared to the TT allele (mean = 1.4; p = 9.7 × 10⁻²⁵; Figure 8). Genotypes with the CC allele for SNP marker S3_30466640 on chromosome Ca03 had a higher fruit shape index value (mean = 1.4) compared to accessions with the GG allele (mean = 0.9; p = 5.3 × 10⁻²⁴; Figure 8). For the CC and TT alleles of SNP marker S10_5269197 on chromosome Ca10, there were no significant differences in fruit shape index values (Figure 8).
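The allelic comparisons above rest on Student's two-sample t test. As a minimal sketch of the underlying statistic, the code below computes the pooled-variance t value and its degrees of freedom using only the Python standard library. The function name and the two small groups of Brix values are hypothetical and are not data from this study; converting t to a two-tailed p value needs the t distribution, which the standard library does not provide, so that step is left out.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    # Pooled variance weights each sample variance by its degrees of freedom.
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical % Brix values for accessions grouped by allele at one SNP.
allele_one = [2.0, 2.2, 2.4]
allele_two = [1.4, 1.6, 1.8]
t_value, degrees_of_freedom = pooled_t_statistic(allele_one, allele_two)
```

With these toy values the group means differ by 0.6% Brix and the statistic is about 3.67 on 4 degrees of freedom; larger groups with the same spread would give a larger t value and hence a smaller p value.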

DISCUSSION
Improving fruit quality traits is one of the major goals for commercial and public watermelon breeding programs. Consumer choice and repeat purchasing of watermelon produce is greatly influenced by fruit sweetness, flesh color, and fruit shape (Kaiser, 2012; Maynard, 2001). The screening of the citron watermelon collection revealed variations within the fruit quality traits. High broad-sense heritability was detected for all traits suggesting presence of genetic contributions to the observed variation. High heritability estimates for soluble solids content and flesh color have been reported previously (Ren et al., 2014; Wehner et al., 2017). Significant correlations were identified that ranged from low to moderate indicating that the traits could be combined in multiple ways during mating designs for fruit quality improvement. The positive correlation among soluble solids content, fruit length, and fruit diameter suggests that it is feasible to increase flesh sweetness while simultaneously selecting for larger fruit size in watermelon breeding and cultivar development. Previous research efforts have also presented positive correlations among soluble solids content, fruit length, and fruit diameter traits in a field-based watermelon trial (Correa et al., 2020).
Fruit diameter was negatively correlated to fruit shape index, implying that increasing diameter resulted in fruits with round fruit shape or, conversely, that shorter fruit diameter resulted in elongate fruit shape. A total of 37 SNPs significantly associated with soluble solids content, flesh color, fruit length, fruit diameter, and fruit shape index were detected using BLINK, FarmCPU, and MLM models, and they appear to confer small genetic effects with PVE ≤ 23.4%. The small effects of the peak SNPs highlight the complexity of the genetics underlying the evaluated fruit quality traits. Utilization of these three models in the analysis helped account for differences in the underlying phenotypic variation and genetic architecture of the fruit quality traits. The MLM method was limited in the number of traits where significant marker-trait associations were detected. No peak SNPs were found for soluble solids content and fruit flesh color when using MLM. This can be attributed to the low statistical power and poor control of false negatives common to the MLM method compared to multiple-locus models such as BLINK and FarmCPU (Huang et al., 2018; Liu et al., 2016; Wang & Zhang, 2021; Yu et al., 2006). There was concordance across models where five significant SNP markers were detected by BLINK and FarmCPU over all fruit quality traits. One SNP marker significant for fruit length, diameter, and fruit shape index was detected by MLM, BLINK, and FarmCPU models. This consistency across models suggests that these genomic regions are crucial in the genetic control of the fruit quality traits. Identification of molecular markers that influence fruit quality traits and understanding the genetic mechanism will be crucial in marker-assisted trait improvement. Genomic loci underlying soluble solids content and flesh color on chromosomes 2, 4, 5, and 10 in C. lanatus have been reported (Fall et al., 2019; Guo et al., 2019; Wu et al., 2019). The present research uncovered loci for soluble solids and flesh color on chromosomes Ca01, Ca02, Ca06, Ca09, Ca10, and Ca11. The new peak SNPs uncovered in this research will be useful in marker-assisted fruit quality trait improvement in watermelon.
Genomic loci associated with fruit shape traits (length, diameter, and fruit shape index) were uncovered on chromosomes Ca01, Ca02, Ca03, Ca04, Ca05, Ca07, Ca08, and Ca10. Previous studies reported genomic regions for fruit shape on chromosomes 1, 2, 3, and 6 (Dou et al., 2018; Guo et al., 2019; Wu et al., 2019). All the peak SNPs for fruit length on chromosome Ca03 were located at approximately 30.5 Mb and were also associated with fruit diameter and fruit shape index, including the most significant SNP, suggesting presence of shared genetic control for these three traits. Guo et al. (2019) reported that five SNPs on chromosome 3 (at approximately 31.1 Mb) were associated with fruit shape in C. lanatus. This region on chromosome Ca03 warrants further investigation since it had peak SNPs consistently detected by BLINK, FarmCPU, and MLM models and co-located with a fruit shape candidate gene.

FIGURE 7 Boxplot showing allelic effects of the peak single nucleotide polymorphism (SNP) markers associated with soluble solids content and fruit length traits in the 125 citron watermelon genotypes based on Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK), fixed and random model circulating probability unification (FarmCPU), and mixed linear model genome-wide association study (MLM GWAS) models.
Four candidate genes relevant to fruit sweetness (measured by soluble solids content) and fruit shape traits (length, diameter, and fruit shape index) co-located with the peak SNPs generated from GWAS. The functional annotation of the candidate genes was as follows: CaU06G09710 (sugar transporter), CaU03G18580 (IQ domain protein), CaU01G17460 (WD40 repeat protein), and CaU07G02850 (squamosa promoter binding protein). Translocation and accumulation of sugars from the source (e.g., photosynthetically active tissues) to sink organs such as fruits relies heavily on sugar transporter proteins (Kuhn & Grof, 2010; Williams et al., 2000). Moreover, overexpression of a tonoplast sugar transporter in watermelon resulted in elevated sugar levels in the fruit (Ren et al., 2018). The peak SNP markers and sugar transporter gene will be useful in manipulation of the fruit sink strength ultimately improving fruit sweetness demanded by watermelon consumers.
Three candidate genes (CaU03G18580 [IQ domain protein], CaU01G17460 [WD40 repeat protein], and CaU07G02850 [squamosa promoter binding protein]) co-located with the peak SNPs associated with fruit length, diameter, and fruit shape index. In watermelon fruit, development processes involve elongation and enlargement. The IQ domain proteins play crucial roles in plant organ development and shape as well as tolerance to drought stress (Duan et al., 2017; Lazzaro et al., 2018). Previous GWAS and integrated omics reports showed IQ domain proteins to be involved in fruit shape and elongation in watermelon and tomatoes (Dou et al., 2018, 2022; Guo et al., 2019; Rodriguez et al., 2011). The WD40 repeat proteins play important roles in many plant development processes including cell division, floral development, and meristem organization (Stirnimann et al., 2010). Squamosa promoter binding proteins play critical roles in plant architecture, flower and fruit development, and fruit ripening among other plant developmental phase transitions across many plant species (Chen et al., 2010). These candidate genes warrant further extensive analysis to determine their molecular and functional mechanisms in the control of fruit quality traits.

FIGURE 8 Boxplot showing allelic effects of the peak single nucleotide polymorphism (SNP) markers associated with fruit diameter and fruit shape index traits in the 125 citron watermelon genotypes based on Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK), fixed and random model circulating probability unification (FarmCPU), and mixed linear model genome-wide association study (MLM GWAS) models.

CONCLUSION
In this study, extensive phenotyping of fruit quality traits in a citron watermelon collection and trait mapping using GWAS were performed. Significant variation for fruit quality traits was detected in the citron watermelon germplasm. Results from GWAS revealed genomic loci underlying fruit quality traits. Candidate genes relevant to fruit quality traits and co-located with the significant SNPs are provided. Phenotypic variation will be useful in parental selections for crossing blocks to improve watermelon fruit quality traits. The significant SNP markers and candidate genes will be crucial in development of molecular tools to improve watermelon fruit quality traits via marker-assisted breeding.

Manhattan and quantile-quantile (QQ) plots for soluble solids content based on (a) fixed and random model circulating probability unification (FarmCPU) and (b) Bayesian-information and linkage-disequilibrium iteratively nested keyway genome-wide association study (BLINK GWAS) models across the 125 citron watermelon genotypes using best linear unbiased estimates (BLUEs) from the two field growing seasons. The red solid horizontal line is a false discovery rate (FDR) cutoff of α = 0.05.

Manhattan and quantile-quantile (QQ) plots for fruit flesh color based on (a) fixed and random model circulating probability unification (FarmCPU) and (b) Bayesian-information and linkage-disequilibrium iteratively nested keyway genome-wide association study (BLINK GWAS) models across the 125 citron watermelon genotypes using best linear unbiased estimates (BLUEs) from the two field growing seasons. The red solid horizontal line is a false discovery rate (FDR) cutoff of α = 0.05.

Manhattan and quantile-quantile (QQ) plots for fruit length based on (a) fixed and random model circulating probability unification (FarmCPU), (b) Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK), and (c) mixed linear model genome-wide association study (MLM GWAS) models across the 125 citron watermelon genotypes using best linear unbiased estimates (BLUEs) from the two field growing seasons. The red solid horizontal line is a false discovery rate (FDR) cutoff of α = 0.05.

FIGURE 5 Manhattan and quantile-quantile (QQ) plots for fruit diameter based on (a) fixed and random model circulating probability unification (FarmCPU), (b) Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK), and (c) mixed linear model genome-wide association study (MLM GWAS) models across the 125 citron watermelon genotypes using best linear unbiased estimates (BLUEs) from the two field growing seasons. The red solid horizontal line is a false discovery rate (FDR) cutoff of α = 0.05.

Manhattan and quantile-quantile (QQ) plots for fruit shape index based on (a) fixed and random model circulating probability unification (FarmCPU), (b) Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK), and (c) mixed linear model genome-wide association study (MLM GWAS) models across the 125 citron watermelon genotypes using best linear unbiased estimates (BLUEs) from the two field growing seasons. The red solid horizontal line is a false discovery rate (FDR) cutoff of α = 0.05.

TABLE 1 Descriptive summary statistics, variance components, and broad-sense heritability of soluble solids content, fruit length, fruit diameter, and fruit shape index traits across the 125 citron watermelon genotypes evaluated over two field seasons at the U.S. Vegetable Laboratory research station in Charleston, SC.

TABLE 2 Correlation among soluble solids content, fruit length, fruit diameter, and fruit shape index traits across the 125 citron watermelon genotypes evaluated over two field seasons at the U.S. Vegetable Laboratory research station in Charleston, SC.

FIGURE 1 Histogram showing the distribution of soluble solids content, fruit flesh color, fruit length, fruit diameter, fruit shape index, and fruit shape traits across the 125 citron watermelon genotypes averaged over two field growing seasons at the U.S. Vegetable Laboratory research station in Charleston, SC.

The Plant Genome
DATA AVAILABILITY STATEMENT
The original contributions presented in this research are available within the article and the supplementary files. The genomic datasets used in this research can be accessed at different websites: http://cucurbitgenomics.org/; http://cucurbitgenomics.org/ftp/genome/watermelon/USVL246/; http://cucurbitgenomics.org/ftp/reseq/watermelon/Amarus_reseq/. Any reasonable inquiries/requests can be made to the corresponding author.