Prototype for developing SNP markers from GWAS and biparental QTL for rice panicle and grain traits

There is a large gap between genomewide association studies (GWAS) and developing markers that can be used in marker‐assisted selection (MAS) schemes for cultivar improvement. This study is a prototype for developing markers using segregating single nucleotide polymorphisms (SNPs) for panicle architecture and grain shape traits identified by GWAS in the Rice Diversity Panel‐1 and colocalized in QTL regions revealed by linkage mapping in the Estrela × NSFTV199 rice (Oryza sativa L.) population. Markers were developed from sequence variants suitable for reliable detection in regions surrounding the most significant SNPs identified in GWAS. Once developed, the markers were validated in three Japonica subspecies biparental populations, used to improve QTL mapping resolution, and employed to select potential parents for use in MAS. All marker alleles segregated in the rice tropical japonica subpopulation.


INTRODUCTION
The Rice Diversity Panel-1 (RDP1) was developed for genomewide association studies (GWAS) to explore the diversity in rice (Oryza sativa L.) for rice improvement (Eizenga et al., 2014; McCouch et al., 2016). The RDP1 was phenotyped for 27 yield-related traits and genotyped with approximately 700,000 single nucleotide polymorphisms (SNPs). The panel represents the two subspecies, Indica composed of the indica and aus subpopulations and Japonica composed of the aromatic, temperate japonica, and tropical japonica subpopulations. Since most U.S. rice cultivars are Japonica, particularly tropical japonica (Lu et al., 2005), biparental recombinant inbred line (RIL) mapping populations were developed from RDP1 Japonica accessions that were phenotypically and genotypically diverse to validate yield-related RDP1 GWAS-QTL. To this end, initial quantitative trait locus (QTL) mapping of the Estrela × NSFTV199 RIL population revealed 70 RIL-QTL associated with 14 of these yield-related traits (Eizenga et al., 2019b). With the aim of designing DNA markers for these yield-related traits that would be useful to breeding programs, we examined segregating SNPs in the overlapping GWAS-QTL for RIL-QTL with high R² and determined there were 11 RIL-QTL with overlapping GWAS-QTL located in six different chromosome regions. The objectives of this study were to present a prototype for developing markers based on these segregating SNPs, test these markers in other RIL populations, and use these markers to improve QTL mapping resolution and for germplasm selection.

SNP selection and marker development
To select SNPs, six regions with overlapping GWAS-QTL and SC14-QTL and colocalized candidate genes on chromosomes (chr.) 3, 4, 5, 7, 8, and 9 were examined (Table 1; Eizenga et al., 2019b). The Primer3web v4.1.0 software (Untergasser et al., 2012; Koressaar & Remm, 2007) and the assay design service provided by 3CR Biosciences (Welwyn Garden City, Hertfordshire, UK) were used to design the SNP markers. Method details are in Supplemental Material S1, and the sequences for the two SNP-specific primers and the shared reverse primer are in Supplemental Table S1. For the SC14 RILs and parents, genomic DNA isolation was described by Eizenga et al. (2019b). For the SC11 and SC17 RILs, genomic DNA extraction was performed according to Xin et al. (2003).

Core Ideas
• We present a prototype for SNP marker development from colocalized GWAS-QTL and RIL-QTL regions.
• New SNP markers were developed that are tightly linked to QTL for use in marker-assisted selection.
• 18 new SNP markers were validated in three Japonica subspecies biparental populations.

Data analyses
For analysis, best linear unbiased prediction (BLUP) estimates were calculated for the replicated SC14 trait data collected over 2 yr according to Eizenga et al. (2019b). For the SC11 and SC17 RIL single plant traits, the actual value was used, and for the parents, a mean was calculated based on five random plants.
To evaluate the association of the individual markers with the individual traits, a one-way ANOVA was conducted using JMP 14.0.0 (SAS Institute, 2015). Marker alleles showing significant segregation with the trait at a threshold probability of p < .05 were compared by Student's t-test to validate the statistical significance, and heterozygous alleles were treated as missing data. The CORR procedure in SAS 9.4 (SAS Institute, 2012) was used to calculate Pearson's correlation coefficients among the traits within each population. Allele effects were calculated from the differences in allele means for each trait that was significant by the Student's t-test calculated in JMP 14.0.0 (SAS Institute, 2015). The RDP1 allele effects were calculated from the allele mean differences obtained from the GWAS in TASSEL version 5 (Bradbury et al., 2007; Eizenga et al., 2019b). To determine the reference allele frequency within the five O. sativa subpopulations, the RiceVarMap v2.0 database (Zhao et al., 2015) was examined for the specific SNP, and if the SNP was absent, the Rice SNP-Seek database was inspected (Mansueto et al., 2017).
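The marker-trait test and allele effect described in this paragraph can be illustrated with a minimal Python sketch. It is not the JMP or SAS workflow used in the study; the function name `allele_t_test`, the genotype codes "A" and "B", and the example values are hypothetical. It computes a pooled-variance (Student's) t statistic for the two homozygous allele classes, drops every other call (including heterozygotes) as missing, and reports the allele effect as the difference in allele means. A p-value would additionally require the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def allele_t_test(genotypes, trait_values, ref="A", alt="B"):
    """Compare trait means of two homozygous allele classes.

    Any genotype call other than `ref` or `alt` (for example a
    heterozygote) is treated as missing data and ignored.
    The codes "A" and "B" are illustrative assumptions.
    """
    ref_vals = [y for g, y in zip(genotypes, trait_values) if g == ref]
    alt_vals = [y for g, y in zip(genotypes, trait_values) if g == alt]
    n1, n2 = len(ref_vals), len(alt_vals)
    if n1 < 2 or n2 < 2:
        raise ValueError("each allele class needs at least two observations")

    m1, m2 = mean(ref_vals), mean(alt_vals)
    df = n1 + n2 - 2
    # Pooled variance, as assumed by Student's (equal-variance) t-test.
    pooled = ((n1 - 1) * variance(ref_vals) + (n2 - 1) * variance(alt_vals)) / df
    if pooled == 0:
        raise ValueError("no within-class variation; t statistic is undefined")
    se = sqrt(pooled * (1 / n1 + 1 / n2))

    return {
        "allele_effect": m1 - m2,  # difference in allele means (ref minus alt)
        "t": (m1 - m2) / se,
        "df": df,
    }


# Hypothetical data: seven lines, one heterozygote ("H") that is ignored.
result = allele_t_test(
    ["A", "A", "A", "B", "B", "B", "H"],
    [10, 12, 14, 20, 22, 24, 100],
)
print(result)
```

With two allele classes, the one-way ANOVA F statistic equals the square of this t statistic, so the two tests agree on significance.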
To improve the SC14 QTL map resolution, a new linkage map was developed that included the 18 SNP markers and "Glab" on chr. 5 (1.00 Mb) (Eizenga et al., 2019a), according to Eizenga et al. (2019b). Using QGene 4.4.0 (Joehanes & Nelson, 2008), the QTL analysis was re-run with the scan interval set at 0.1 cM to better delineate the QTL region that mapped to chromosome regions where these additional SNP markers were located.

Validation in Japonica biparental populations
Based on the markers segregating in the parents (Supplemental Table S1), the SC14 RILs were genotyped with all 18 markers, SC11 RILs with 8 markers and SC17 RILs with 10 markers. As validation, t-tests were conducted for the panicle architecture, grain, and agronomic traits phenotyped, namely 16 traits in the SC14 RILs and 13 traits in the SC11 and SC17 RILs (Supplemental Table S2). This confirmed the SC14 marker-trait associations of the 15 markers with all the traits where the GWAS SNPs colocalized with RIL-QTL (Supplemental Figure S1; Supplemental Table S3). SC11 validated the DB80 (chr. 7) association with panicle architecture traits, as well as seed weight and length, and the DB88 association with number of seeds (florets) per panicle. Additionally, SC17 confirmed the DB83 and DB84 (chr. 4) associations with flag leaf width and the DB87 (chr. 9) associations with plant height and number of seeds (florets) per panicle. Also, there was an association between DB69 (chr. 8) and panicle length, supporting the GWAS-QTL reported by Zhong et al. (2021).
For the grain traits, associations between the chr. 3 markers (DB71 to DB75) and grain length were validated in all three populations, but the association with 100-seed weight was only found in SC14. The association between the chr. 5 markers (DB88, DB76 to DB79) and grain width was highly significant for all three populations except between DB88 and SC17. Of note, weaker associations were observed between some of these chr. 5 markers and grain length across the three populations.
For the three markers (DB6, DB46, DB91) developed from GWAS-QTL, no marker had an association with the GWAS-targeted trait, but DB6 (chr. 1) was weakly associated (p < .05) with 100-seed weight across all three populations and with plant height in two populations. These associations were substantiated by the overlapping RIL-QTL and the candidate genes OsARF4 (auxin response factor 4), affecting grain weight and size, and the "green revolution" gene, SEMI-DWARF1 (SD1), affecting plant height (Murai et al., 2011).
The correlations between the phenotypic traits within each population (Supplemental Table S4) confirmed most of the marker trait associations noted for each marker across the traits evaluated. The direction of the allele effects corresponding to the variant in each biparental parent was verified (Supplemental Table S5) to be in the same direction as the allele effects of the variants in RDP1 GWAS QTL (Supplemental Table S6).
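As an illustration of the within-population trait correlations referred to here, the following is a minimal Python sketch of pairwise Pearson correlation coefficients. It stands in for, and is not, the SAS CORR procedure used in the study; the function names, trait names, and values are hypothetical, and missing observations are assumed to be coded as `None` and dropped pairwise.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists, skipping pairs with None."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return sxy / sqrt(sxx * syy)


def trait_correlations(traits):
    """Return {(trait_1, trait_2): r} for every pair of traits in one population."""
    return {
        (t1, t2): pearson_r(traits[t1], traits[t2])
        for t1, t2 in combinations(sorted(traits), 2)
    }


# Hypothetical trait values for four lines of one population.
example = {
    "grain_length": [1, 2, 3, 4],
    "grain_width": [2, 4, 6, 8],
    "seed_weight": [4, 3, 2, 1],
}
for pair, r in trait_correlations(example).items():
    print(pair, round(r, 3))
```

Strongly correlated traits are expected to show associations with the same markers, which is the sense in which the correlations support the marker-trait associations.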
The reference allele frequency within the RDP1 or rice minicore accessions was compared to the allele frequency among the five rice subpopulations using either the RiceVarMap v2.0 (4,726 accessions; Zhao et al., 2015) or SNP-Seek (3,000 accessions; Mansueto et al., 2017) database (Supplemental Table S6). Within the RDP1, DB69 and DB74 only segregated in the Japonica accessions and DB46, DB77, DB78, and DB87 only segregated in tropical japonica, which was similar to allele frequencies observed in the aforementioned collections. Previously, Eizenga et al. (2019b) noted almost all the segregating SNPs identified as potential markers were predominately found in Japonica accessions, which is supported by these observations in the larger O. sativa collections where the allele frequency was moderate in tropical japonica, but within at least one of the other four subpopulations, the allele frequency was highly skewed toward either the reference or alternate variant. In fact, only the DB76 alleles had a moderate allele frequency across all five subpopulations. These observations confirm the importance of considering the targeted subpopulation and subspecies when developing and utilizing markers.
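The subpopulation comparison above can be sketched in Python as follows. This is an illustrative sketch only, not a query against RiceVarMap or SNP-Seek: the input format, the function names, and the 0.1 and 0.9 cutoffs used to label a frequency "skewed" are assumptions introduced here, since the text does not define numeric thresholds.

```python
from collections import defaultdict


def ref_allele_frequency(calls, ref):
    """Reference allele frequency per subpopulation.

    `calls` is an iterable of (subpopulation, allele) tuples, one per
    accession; accessions with a missing call (None) are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # subpopulation -> [ref count, total]
    for subpop, allele in calls:
        if allele is None:
            continue
        counts[subpop][1] += 1
        if allele == ref:
            counts[subpop][0] += 1
    return {subpop: n_ref / total for subpop, (n_ref, total) in counts.items()}


def classify(freq, low=0.1, high=0.9):
    """Label a frequency as 'skewed' or 'moderate' (cutoffs are assumptions)."""
    return "skewed" if freq <= low or freq >= high else "moderate"


# Hypothetical calls for a single SNP with reference allele "A".
calls = [
    ("tropical japonica", "A"),
    ("tropical japonica", "G"),
    ("indica", "A"),
    ("indica", "A"),
    ("indica", None),
]
for subpop, freq in ref_allele_frequency(calls, ref="A").items():
    print(subpop, freq, classify(freq))
```

A marker whose reference allele frequency is moderate in the target subpopulation but skewed elsewhere would be informative mainly for crosses within that target subpopulation.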

Utilization in QTL mapping and germplasm development
To demonstrate the utility of these markers for improving the SC14 QTL mapping resolution in the targeted regions, the original 132 markers, 18 new SNP markers, and "Glab" were used. Compared with the original mapping (Eizenga et al., 2019a, 2019b), one new grain length QTL, qHULGRLG1, was uncovered; 14 of the 18 SNPs improved the resolution of 34 SC14-QTL on chr. 3, 4, 5, 7, and 9; and 41 SC14-QTL mapped to the same regions (Supplemental Figure S2; Supplemental Table S7). Only DB69 (chr. 8) did not improve the resolution. Of note, DB6 (chr. 1) improved QTL resolution even though it was developed from a rice minicore GWAS-QTL for alkali spreading value, whereas DB46 and DB91 did not improve resolution because there were no overlapping SC14 QTL.
Recently, the effectiveness of these SNP markers was verified as part of the selection of two tropical japonica germplasm lines, SC14_166, released with increased panicle branching and seeds per panicle, and SC14_072, released with an extra-long grain (Eizenga et al., in press). Breeders using these lines in marker-assisted selection (MAS) schemes can use the chr. 4 (DB83, DB84) and chr. 7 (DB80) markers to introgress the panicle architecture traits from SC14_166, and the chr. 3 (DB71 to DB75) and chr. 5 markers (DB88, DB76, DB77, DB78) to select for the desired grain shape.

CONCLUSION
As a prototype for converting GWAS results into useful markers, we developed a suite of 18 SNP markers from segregating SNPs identified in GWAS-QTL. Of these, 15 markers targeted panicle architecture and grain traits colocalized with RIL-QTL. These markers were validated by t-tests in three Japonica RIL populations, and allele effects predicted the same size and direction as GWAS allele effects. Based on the allele distribution across subpopulations, these markers will likely be most useful when selecting within tropical japonica or Japonica germplasm. All 18 markers are available to use as MAS tools in breeding efforts, especially when pyramiding various trait combinations to develop higher-yielding cultivars.

ACKNOWLEDGMENTS
We gratefully acknowledge the superb technical assistance of Ms. Teresa A. Hancock in leading all phases of the phenotypic data collection and the initial data manipulation for the RIL populations, which was supported in part by the National Science Foundation (US)-Plant Genome Project: "The Genetic Basis of Transgressive Variation in Rice" (Award no. 1026555) through subaward monies to co-PI, GCE. Mention of a trademark or proprietary product does not constitute a guarantee or warranty of the product by the U.S. Department of Agriculture and does not imply its approval to the exclusion of other products that also can be suitable. USDA is an equal opportunity provider and employer. All experiments complied with the current laws of the United States, the country in which they were performed.

CONFLICT OF INTEREST
The authors declare no conflict of interest with this research.