Using Heading date 1 preponderant alleles from indica cultivars to breed high‐yield, high‐quality japonica rice varieties for cultivation in south China

Summary Heading date 1 (Hd1) is an important gene for the regulation of flowering in rice, but its variation in major cultivated rice varieties, and the effect of this variation on yield and quality, remain unknown. In this study, we selected 123 major rice varieties cultivated in China from 1936 to 2009 to analyse the relationship between the Hd1 alleles and yield‐related traits. Among these varieties, 19 haplotypes were detected in Hd1, including two major haplotypes (H8 and H13) in the japonica group and three major haplotypes (H14, H15 and H16) in the indica group. Analysis of allele frequencies showed that secondary branch number was the major breeding aim for Chinese indica varieties. Among the five major haplotypes, SNP316(C‐T) was the only difference between the two major japonica haplotypes, and SNP495(C‐G) and SNP614(G‐A) were the major SNPs in the three indica haplotypes. Association analysis showed that H16 is the most preponderant allele in modern cultivated Chinese indica varieties. Backcrossing this allele into the japonica variety Chunjiang06 improved yield without decreasing grain quality. Therefore, our analysis offers a new strategy for utilizing these preponderant alleles to improve the yield and quality of japonica varieties for cultivation in the southern areas of China.


Introduction
Rice (Oryza sativa L.) is an important cereal crop and provides the staple food for more than half of the world's population (Tian et al., 2009). Achieving increases in grain yield in rice is a major focus in agriculture (Ren et al., 2018; Yuan, 2014). Heading date (flowering time) is one of the most important agronomic traits for regional adaptation and grain yield, and is affected by genetic and environmental factors (Li et al., 2015). Selection for the optimal flowering time for a particular region will make maximum use of temperature and sunlight conditions to improve yield potential (Izawa, 2007; Zheng et al., 2016). Therefore, detailed knowledge of the genetic factors for heading date will increase our understanding of the adaptive mechanisms of rice varieties and enable breeders to design appropriate genotypes for specific environments (Putterill et al., 2004; Zheng et al., 2016).
Rice is a short-day (SD) flowering plant; its heading is promoted under short photoperiod conditions. In recent years, the molecular genetic pathway for SD photoperiodic regulation in cultivated rice has been well characterized. OsGI (a homolog of Arabidopsis thaliana GIGANTEA) is responsible for integrating light signals with the circadian clock, and regulates the expression of Heading date 1 (Hd1, an ortholog of CONSTANS in Arabidopsis) and OsMADS51 (Hayama et al., 2002, 2003; Kim et al., 2007; Takahashi et al., 2009). Hd1, which encodes a zinc-finger-type transcriptional activator with a CCT domain, promotes flowering under SD conditions and represses flowering under long-day (LD) conditions by regulating the expression of Heading date 3a (Hd3a, an ortholog of FLOWERING LOCUS T in Arabidopsis; Kojima et al., 2002; Yano et al., 2000). OsMADS51, which encodes a type I MADS-box protein and functions upstream of Early heading date 1 (Ehd1), promotes flowering under SD conditions (Kim et al., 2007). Ehd1 is a B-type response regulator that promotes flowering by regulating the expression of Hd3a independently of the Hd1 pathway (Doi et al., 2004). RICE FLOWERING LOCUS T1 (RFT1) functions as a floral activator, belongs to the rice FT-like gene family and is the closest homolog of Hd3a (Komiya et al., 2008). RFT1 and Hd3a are both mobile flowering signals, but RFT1 functions under LD conditions, whereas Hd3a functions under SD conditions (Komiya et al., 2008). Recent research demonstrated that RFT1 and Hd3a functionally diverged to control flowering time under LD and SD conditions, partly via a fine-tuned epigenetic mechanism (Li et al., 2015; Sun et al., 2012).
Natural genetic variation provides an opportunity to identify key alleles associated with traits that can be selected to improve agronomic characteristics of crops (Lu et al., 2013; Zhu et al., 2008). For example, the pleiotropic gene Ghd7, which affects flowering time, plant height and spikelet number per panicle, was shown to contain an important single nucleotide polymorphism (SNP) that affects these three related traits in rice (Lu et al., 2012). Two SNPs in maize (Zea mays) Dwarf8 were shown to be independently associated with flowering time and plant height (Thornsberry et al., 2001). As an important heading date gene, Hd1 alleles have been widely used in traditional rice breeding and are also good targets for molecular marker-assisted selection (MAS) breeding.
China has a long history of rice cultivation and a broad range of rice cultivation regions (from 18°N to 53°N), which contribute to a diversity of flowering time variants. However, the utilization of preponderant alleles of Hd1 in modern cultivated Chinese varieties is still unclear. In this study, we selected 123 major rice varieties cultivated in China from 1936 to 2009 in order to: (i) evaluate their genetic diversity and population structure; (ii) identify the key Hd1 alleles affecting yield-related traits; and (iii) utilize the preponderant alleles of Hd1 in rice breeding to improve yield and quality. The results of this study will provide valuable data and a new strategy for generating varieties more suited to specific regions, potentially improving yield.

Results
Wide variation of yield-related traits was noted in the 123 major rice varieties cultivated in China
We selected 123 major rice varieties cultivated in China from 1936 to 2009 because they were the main cultivars of their time. These 123 varieties, comprising 54 japonica and 69 indica varieties, were used to analyse the relationship between the Hd1 alleles and eight yield-related traits (Figure 1a, Table S1). A wide variation in heading date was observed among the 123 varieties, ranging from 54 to 117 days in Hangzhou (an LD site) and 66 to 100 days in Hainan (an SD site; Figure 1b). The photoperiod analysis showed that most of the indica varieties were photoperiod insensitive, whereas most japonica varieties were photoperiod sensitive (Figure 1c, d). The other seven yield-related traits were also highly diverse among the collection. For example, tiller number (TN) ranged from 5.6 to 14.2 in Hangzhou and 5.5 to 12.0 in Hainan. Grain number per plant (GNPP) ranged from 59.5 to 247.3 in Hangzhou and 71.4 to 207.2 in Hainan. Thousand grain weight (TGW) ranged from 17.6 to 28.8 in Hangzhou and 19.6 to 33.4 in Hainan (Table 1).

SSR diversity and population structure
Twenty-eight polymorphic SSR markers, randomly distributed on the 12 rice chromosomes, were selected to evaluate the genetic diversity of the 123 cultivated varieties. A total of 128 alleles were amplified, and each SSR marker contained between 2 and 9 alleles, with an average of 4.57 alleles per marker (Table S2).
The population structure of the 123 cultivated varieties based on the SSR genotype showed that the highest log-likelihood scores of the population structure were observed when the number of populations was set at two (K = 2), suggesting that these varieties can be classified into two subpopulations (Figure S1a). A neighbour-joining tree of the 123 cultivated varieties was constructed based on Nei's genetic distance, which indicated that the rice accessions could be well differentiated into japonica and indica groups (Figure S1a).
Hd1 exhibited a high degree of nucleotide polymorphism and protein diversity
We analysed the Hd1 coding region (1825 bp, which includes one intron and two exons) in the 123 cultivated varieties. The Hd1 sequence exhibited a high degree of nucleotide polymorphism. Thirteen SNPs and 10 insertions/deletions (InDels) were detected in the coding region of Hd1, of which 3 SNPs were found in the zinc finger domain and 1 InDel was found in the CCT domain (Figure 2). The Hd1 sequence of Nipponbare (NPB, haplotype 1, H1) was defined as the reference sequence. A total of 19 haplotypes, named H2 to H20, were identified in the 123 cultivated varieties (Figure 2). Haplotypes H4, H5, H8-13, H17 and H20 belong to the japonica subpopulation, while haplotypes H2, H3, H6, H7, H14-16, H18 and H19 belong to the indica subpopulation. The most prevalent haplotypes were H8, H13, H14, H15 and H16, which included 30, 9, 12, 26 and 24 varieties, respectively. Each of the remaining haplotypes was found in four or fewer varieties (Figure 2, Table S1). When the mean values of the eight target traits were compared among plants carrying the five major Hd1 haplotypes, significant differences were detected among the five haplotypes for all traits except TN in Hangzhou and TGW in both locations (Table 2).
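The haplotype assignment described above can be illustrated with a minimal Python sketch: varieties that share an identical profile of variant calls across the polymorphic sites are placed in the same haplotype. The variety names, the three sites and the alleles below are hypothetical placeholders, not data from Table S1, and the function name `call_haplotypes` is introduced here only for illustration.

```python
from collections import defaultdict


def call_haplotypes(variant_calls):
    """Group varieties that share an identical variant profile.

    variant_calls maps a variety name to a tuple of allele calls, one per
    polymorphic site (SNP or InDel), always in the same site order.
    Returns a list of (profile, [varieties]) sorted by group size, largest first.
    """
    groups = defaultdict(list)
    for variety, profile in variant_calls.items():
        groups[tuple(profile)].append(variety)
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical calls at three sites (illustrative only).
calls = {
    "var_A": ("C", "C", "G"),
    "var_B": ("T", "C", "G"),
    "var_C": ("C", "C", "G"),
    "var_D": ("C", "G", "A"),
}

haplotypes = call_haplotypes(calls)
for profile, members in haplotypes:
    print(profile, len(members), members)
```

Haplotypes carried by only a few varieties can then be filtered out by group size, in the same spirit as the restriction to the five major haplotypes in the later analyses.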

The Hd1 ind haplotypes displayed signatures of artificial selection
In order to identify which Hd1 haplotypes were the major types used in different years, we divided the 123 varieties into three groups according to their period of release: 1936-1969, 1970-1993 and 1993-2009. Among the five major haplotypes, the Hd1 jap haplotype H8 maintained a high selection frequency in all three groups, with allele frequencies of 50%, 61% and 60%, respectively (Figure 4a, b). Moreover, the expression level of H8 changed very little among the three groups (Figure 4c). For Hd1 ind haplotypes, distinct differences were noted among the three groups. For example, H15 was the major haplotype from 1936 to 1969, but H14 and H16 were the major haplotypes from 1993 to 2009 (Figure 4a). The allele frequencies of H14, H15 and H16 were 0%, 67% and 10% in the 1936-1969 group, and 31%, 15% and 50% in the 1993-2009 group, respectively (Figure 4b). Interestingly, the expression of Hd1 ind alleles decreased in H15 and H16 (Figure 4c). These results indicated that the preponderant Hd1 allele has changed in recent decades in Chinese indica varieties.
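A minimal Python sketch of the per-period frequency calculation follows. The records are hypothetical and do not reproduce the counts behind Figure 4. The reported frequencies appear to have been computed within each subspecies group (the H8 and H15 values for 1936-1969 sum to more than 100%), so the sketch assumes the caller passes in records for one subspecies at a time.

```python
from collections import Counter


def haplotype_frequencies(records):
    """Return {period: {haplotype: frequency}} from (period, haplotype) pairs.

    Frequencies are relative to the number of records in each period, so the
    caller decides the denominator (for example, indica varieties only).
    """
    by_period = {}
    for period, haplotype in records:
        by_period.setdefault(period, Counter())[haplotype] += 1
    return {
        period: {hap: n / sum(counts.values()) for hap, n in counts.items()}
        for period, counts in by_period.items()
    }


# Hypothetical indica records (illustrative only).
records = [
    ("1936-1969", "H15"), ("1936-1969", "H15"),
    ("1936-1969", "H16"), ("1936-1969", "H14"),
    ("1993-2009", "H16"), ("1993-2009", "H16"),
    ("1993-2009", "H14"), ("1993-2009", "H15"),
]

freqs = haplotype_frequencies(records)
for period, table in freqs.items():
    print(period, {hap: round(f, 2) for hap, f in table.items()})
```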
We further compared the mean values of the eight target traits among these three indica haplotypes; significant differences were detected in secondary branch number (SBN), indicating that secondary branch number was the major aim in the breeding of Chinese indica cultivars (Table 3).

Haplotype 16 was the preponderant allele in Chinese indica breeding
An association analysis between Hd1 haplotypes and eight yield-related traits was conducted to identify SNP-trait associations separately using a general linear model (GLM), which accounted for population structure data (Table S1). Five major haplotypes, H8, H13, H14, H15 and H16, were analysed. Other haplotypes were excluded because of a limited number of varieties (four or fewer). Three SNPs (SNP 316 (C-T), SNP 495 (C-G) and SNP 614 (G-A)) were found in the five haplotypes (Figure 2). SNP 316 (C-T) is a C/T mutation located 316 bp downstream of the ATG initiation site, which was found in the japonica group (Figure 2). SNP 316 (C-T) showed a significant association with HD, SBN, GNPP, TGW and GWSP in the 2016 LD experiment (Table 4). SNP 495 (C-G) and SNP 614 (G-A) are major SNPs in the indica group. SNP 495 (C-G) only showed a significant association with HD in the 2016 LD experiment, while SNP 614 (G-A) exhibited a significant association with SBN, GNPP and GWSP in the 2016 LD experiment and with TGW in the LD and SD conditions (Table 4). SNP 495 (C-G) is the only difference between haplotypes H14 and H15, but there was no significant difference among the eight yield-related traits between these haplotypes. However, a significant difference among the eight yield-related traits was found between haplotype H16 and H14 or H15 (Table S3); H16 differs from H14 by one base substitution and a four-base (AAAG) deletion, and from H15 by the four-base (AAAG) deletion only.
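The idea behind a GLM that tests a SNP while accounting for population structure can be sketched in plain Python as a partial F-test: a reduced model (intercept plus a structure covariate Q) is compared with a full model that adds the SNP. This is only an illustration of the principle, not TASSEL's implementation; the function names, the single Q column and the toy values are assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(columns, y):
    """Residual sum of squares of an ordinary least-squares fit of y on columns."""
    k = len(columns)
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * v for a, v in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * columns[i][r] for i in range(k)) for r in range(len(y))]
    return sum((v - f) ** 2 for v, f in zip(y, fitted))


def snp_f_statistic(trait, snp, q):
    """Partial F statistic (1 and n - 3 d.f.) for adding the SNP to trait ~ 1 + Q."""
    n = len(trait)
    ones = [1.0] * n
    rss_reduced = rss([ones, q], trait)
    rss_full = rss([ones, q, snp], trait)
    return (rss_reduced - rss_full) / (rss_full / (n - 3))


# Hypothetical data: SNP coded 0/1, Q = membership in one subpopulation.
trait = [10.2, 11.0, 14.1, 15.3, 10.8, 13.9, 14.6, 11.4]
snp = [0, 0, 1, 1, 0, 1, 1, 0]
q = [0.95, 0.90, 0.10, 0.05, 0.85, 0.20, 0.15, 0.90]
print(snp_f_statistic(trait, snp, q))
```

A p-value would follow from the F distribution with 1 and n - 3 degrees of freedom; that step is omitted here to keep the sketch free of external libraries.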

Improvement of Hd1 alleles in indica-japonica breeding
Due to the influence of photosensitivity, the heading date of japonica cultivars was gradually shortened as rice cultivation was extended to the southern areas of China, which resulted in a decrease in grain yield. Therefore, it is necessary to introduce the heading-related genes of indica varieties into japonica varieties to prolong the heading date. Based on the above results on the preponderant Hd1 alleles of indica and japonica varieties, we hybridized the japonica variety Chunjiang06 (CJ06, carrying the same Hd1 allele as haplotype H8) with the indica variety Taichung native 1 (TN1, carrying the same Hd1 allele as haplotype H16) and screened the progeny (Figure 5a). A BC4F5 line (Q77) containing the Hd1 TN1 fragment in the CJ06 background was selected (Figure 5b, c). In a comparison of yield-related traits among CJ06, Q77 and TN1 plants, we found that the HD of Q77 was 7 days longer than that of CJ06 (the recipient parent); GNPP, PBN and SBN were significantly increased by 29.50, 2.31 and 3.77, respectively; and grain weight per plant (GWPP) and GWSP were significantly increased by 18.50 and 0.54, respectively. In contrast, there was no significant difference in these traits between Q77 and TN1 (Figure 5d-i).
Table 3 Comparison of eight yield-related traits among three major Hd1 ind haplotypes in varieties cultivated in 1936-1969 and 1993-2009
Changing the heading date in rice may affect grain quality (Cho et al., 2013). Therefore, we measured the quality-related traits (amylose content, AC; gel consistency, GC; gelatinization temperature, GT; and the eating and cooking qualities, ECQ, index) of CJ06 (high quality), Q77 and TN1 (low quality). There were no significant differences between Q77 and CJ06 for AC, GC and the ECQ index, but there was a significant difference between Q77 and TN1 for these traits (Figure 6a, c, d). There were no significant differences among CJ06, Q77 and TN1 for GT (Figure 6b). The results indicated that the introduction of the Hd1 ind allele of haplotype H16 into CJ06 did not change the quality of the grain. Therefore, it is feasible to use the Hd1 ind allele of haplotype H16 to prolong the heading date of a japonica variety, thereby increasing grain yield without changing quality.

Discussion
Rice has an estimated 8000- to 10 000-year history of domestication and breeding (Doebley et al., 2006; Takahashi et al., 2009). Human selection and adaptation to diverse environments have resulted in numerous cultivars (Khush, 1997). Natural genetic variation for heading date has been extensively documented and provides valuable material for improving adaptation to local environments (Goretti et al., 2017). Although different heading date alleles have been applied in traditional rice breeding programmes, the best gene pyramiding strategy at the molecular level is still unclear. In this study, we selected and sequenced Hd1 from 123 major rice varieties cultivated from 1936 to 2009 in China to identify the key SNPs in Hd1 that affect yield-related traits. Then, we utilized the preponderant alleles of Hd1 to improve heading date-, yield- and quality-related traits.

The diverse Hd1 alleles display significant indica-japonica differentiation
Hd1 is an important gene for the control of flowering time, and its sequence has evolved a high degree of polymorphism during rice domestication (Takahashi et al., 2009; Wei et al., 2014; Zheng et al., 2016). Takahashi et al. (2009) identified 17 Hd1 allele types using 64 core rice cultivars worldwide. Zheng et al. (2016) reported 39 allele types from 154 rice germplasms. However, these reports did not point out whether Hd1 has preponderant alleles in indica-japonica rice breeding. Our results show that there are obvious differences in the preponderant Hd1 alleles between indica and japonica varieties (Figure 4a). Furthermore, our results clearly demonstrated that the preponderant alleles of Hd1 jap did not change much in the breeding of modern Chinese varieties, but did change notably in Hd1 ind (Figure 4a, b). In addition, we found that the increase in grain yield in modern indica varieties is mostly due to the selection for SBN (Table 3). This result indicates that the rice breeding model in China has focused on the high-yield model with large panicles, and increasing the number of secondary branches is the key way to generate rice with large panicles.

Association analysis and utilization of preponderant alleles provide a critical strategy for rice improvement
Candidate gene-based association mapping takes advantage of recombination events in a natural population to resolve complex trait variation to individual nucleotides (Zhu et al., 2008). The selection and utilization of preponderant alleles have become an important way to cultivate high-yielding varieties. For example, Lu et al. (2012) identified a preponderant allele (S_555) of Ghd7 using 104 rice accessions. Utilization of the alleles (S_555) can decrease plant height and allow the plant to be more resistant to lodging, while not influencing heading date and yield traits in indica subspecies. In our study, we successfully used the preponderant alleles of Hd1 ind haplotype H16 to prolong the heading date of a japonica variety CJ06 (Figure 5), indicating that this is a feasible approach to improving the heading date in japonica varieties.
Japonica varieties are mainly cultivated in the middle and lower Yangtze region and northeast China, but this area is only one-fourth of the total rice planting area of China. The biggest problem with growing japonica varieties in lower latitudes is that they flower too early and do not reach their full potential yield and quality. Therefore, an important aim of rice breeding is to prolong the heading date of japonica varieties appropriately without affecting grain yield and quality. Our results demonstrated that the introduction of the preponderant Hd1 ind allele of haplotype H16 into a japonica cultivar significantly increased the yield per plant but did not change the quality-related traits (Figures 5 and 6). This result implies that japonica varieties can be cultivated in the south of China by introducing the Hd1 ind allele of haplotype H16 to improve heading date and yield while maintaining quality.

Experimental procedures
Plant materials and plant growth condition

Evaluation of traits
Heading date (HD) was defined as the number of days from sowing to the appearance of the first panicle. Tiller number (TN), primary branch number (PBN), secondary branch number (SBN) and grain number per plant (GNPP) were measured 25 days after heading. GNPP was scored as the grain number of the tallest panicle. Excluding the two marginal plants on each side, ten independent plants were used to score the phenotypic data sets. Thirty-five days after flowering, seeds from each plant were harvested and weighed after physicochemical properties had stabilized. Grain weight was calculated on the basis of 200 grains and converted to TGW. Grain weight per plant (GWPP) was the weight of all seeds harvested from a single plant. Grain weight per single panicle (GWSP) was the weight of the seeds harvested from a single tiller. For quality-related traits, 15 grains of milled rice were selected for measuring gelatinization temperature (GT), and 15 g of grain was ground to flour to measure amylose content (AC) and gel consistency (GC). AC, GC and GT were measured according to the procedures in Leng et al. (2013). The eating and cooking qualities (ECQs) index was measured according to the procedure in Zeng et al. (2017).
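The conversion from a 200-grain sample to TGW is a simple rescaling, sketched below in Python. The function name, the assumption that weights are recorded in grams and the example values are illustrative only; averaging over the ten scored plants is shown as one plausible way to summarize a variety.

```python
from statistics import mean


def thousand_grain_weight(sample_weight_g, grains_in_sample=200):
    """Scale the weight of a counted grain sample (grams assumed) to 1000 grains."""
    return sample_weight_g * 1000 / grains_in_sample


# Hypothetical 200-grain weights (g) for ten plants of one variety.
sample_weights = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8, 5.1, 5.0, 4.9, 5.0]
tgw_per_plant = [thousand_grain_weight(w) for w in sample_weights]
print(round(mean(tgw_per_plant), 2))
```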

DNA extraction, PCR and sequence analysis
Genomic DNA was extracted from fresh leaves of each plant using the cetyltrimethylammonium bromide (CTAB) method (Murray and Thompson, 1980). The polymorphic simple sequence repeat (SSR) markers, randomly distributed across the 12 rice chromosomes, are listed in Table S2. PCRs for amplification of the SSRs were conducted according to the methods in Jin et al. (2010). The products were run on an 8% denaturing polyacrylamide gel at 120 V for 100 min, and the gel was visualized using silver staining. Hd1, including the 1825-bp coding region, was amplified from genomic DNA using KOD plus (TOYOBO, Tokyo, Japan). PCRs were conducted using standard PCR protocols. The primers used for PCR and sequencing are listed in Table S4. The initial genomic sequence of Hd1 was assembled using DNAStar software (DNAStar Inc., Madison, WI).

Genetic diversity and population structure analysis
PowerMarker V3.25 was used to analyse the genetic diversity, including the number of alleles per locus, major allele frequency, gene diversity and polymorphism information content (PIC) values (Liu and Muse, 2005). Nei's distance (Nei et al., 1983) was calculated and used for the unrooted phylogeny reconstruction using the neighbour-joining method as implemented in PowerMarker, with the tree viewed using MEGA 4.0 (Tamura et al., 2007). The population structure among the 123 rice varieties was analysed from the genotype data using STRUCTURE V 2.3.4 (Pritchard and Wen, 2004). The number of populations (K) was set from 2 to 10, with five independent runs for each value of K, each consisting of a burn-in of 10 000 iterations followed by 100 000 iterations. The optimum value of K was selected following Evanno et al. (2005).
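For readers unfamiliar with the diversity statistics named above, the following Python sketch computes gene diversity and PIC for one marker using the common textbook definitions, gene diversity = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². PowerMarker's own estimators may apply sample-size corrections, so the values are not guaranteed to match its output exactly. The allele labels are hypothetical, and each inbred variety is assumed to contribute a single allele per marker.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele at one marker."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (textbook definition)."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


# Hypothetical allele calls for one SSR marker across eight varieties.
marker_calls = ["a1", "a1", "a2", "a3", "a1", "a2", "a1", "a3"]
freqs = allele_frequencies(marker_calls)
print(len(freqs), max(freqs), gene_diversity(freqs), pic(freqs))
```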

RNA preparation and qRT-PCR analysis
Total RNA was extracted using the RNeasy plant mini kit (Qiagen, Valencia, CA) following the manufacturer's instructions. RNA preparations were treated with DNase I (Takara, Tokyo, Japan) to remove traces of DNA contamination. Reverse transcription was conducted with the ReverTra Ace qPCR-RT Kit (TOYOBO). After synthesis, the cDNA was diluted fivefold in TE buffer, and 1 μL was used for quantitative PCR using the Fast SYBR Green Master Mix (Applied Biosystems, Carlsbad, CA) and gene-specific primers (Table S4) in an ABI7900 analyser (Applied Biosystems, Foster City, CA).

The development of chromosome segment substitution lines
The TN1/CJ06 chromosome segment substitution lines (CSSLs) were obtained according to Su et al. (2011). The plants carrying the TN1 genotype at the flanking region of Hd1 were selected to cross with the CJ06 variety for five rounds of backcrossing. A total of 71 SSR markers were used to screen for the genetic background (Ren et al., 2016). The SSR markers RM539 and RM454 were used to identify the plants containing the TN1 genotype in backcross progeny.
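As a rough guide to what repeated backcrossing achieves, the expected proportion of recurrent-parent genome after n backcrosses, ignoring selection and linkage around the target locus, is 1 - (1/2)^(n+1). The Python sketch below evaluates this expectation and shows one hypothetical way to summarize background marker screening; the function names and the marker calls are illustrative and are not taken from the paper.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent-parent genome fraction after n backcrosses (no selection)."""
    return 1 - 0.5 ** (n_backcrosses + 1)


def background_recovery(marker_calls, recurrent="CJ06"):
    """Fraction of background markers carrying the recurrent-parent genotype."""
    return sum(call == recurrent for call in marker_calls) / len(marker_calls)


for n in range(6):
    print(n, expected_recurrent_fraction(n))

# Hypothetical background screen: 70 of 71 markers match the recurrent parent.
calls = ["CJ06"] * 70 + ["TN1"]
print(round(background_recovery(calls), 3))
```

Marker-based background screening, as done here with SSR markers, is what confirms the recovery in an actual line, since individual plants deviate from the expectation.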

Statistical analysis
Analysis of variance was performed using Microsoft Excel 2003. Duncan's multiple comparison test was performed using SAS 8.0 software. Hd1 sequences were aligned using ClustalX 2.1, and the alignment results were input into TASSEL. A general linear model (GLM), which accounted for population structure (Q), was fitted in TASSEL V 3.0 for association analysis.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Table S1 Basic information for 123 major rice varieties cultivated in China.
Figure S1 Population structure and unrooted neighbor-joining trees of 123 major rice varieties cultivated in China.