Nonfunctional alleles of long‐day suppressor genes independently regulate flowering time

Abstract Due to the remarkable adaptability to various environments, rice varieties with diverse flowering times have been domesticated or improved from Oryza rufipogon. Detailed knowledge of the genetic factors controlling flowering time will facilitate understanding the adaptation mechanism in cultivated rice and enable breeders to design appropriate genotypes for distinct preferences. In this study, four genes (Hd1, DTH8, Ghd7 and OsPRR37) in a rice long‐day suppression pathway were collected and sequenced in 154, 74, 69 and 62 varieties of cultivated rice (Oryza sativa) respectively. Under long‐day conditions, varieties with nonfunctional alleles flowered significantly earlier than those with functional alleles. However, the four genes have different genetic effects in the regulation of flowering time: Hd1 and OsPRR37 are major genes that generally regulate rice flowering time for all varieties, while DTH8 and Ghd7 only regulate regional rice varieties. Geographic analysis and network studies suggested that the nonfunctional alleles of these suppression loci with regional adaptability were derived recently and independently. Alleles with regional adaptability should be taken into consideration for genetic improvement. The rich genetic variations in these four genes, which adapt rice to different environments, provide the flexibility needed for breeding rice varieties with diverse flowering times.


INTRODUCTION
Yield improvement has always been an important goal of scientific research. Improvement in crop productivity requires optimal flowering time for wide geographical adaptation as well as for various management conditions (Izawa 2007; Wu et al. 2013). Oryza rufipogon, the common ancestor of cultivated rice, grows in the tropics; yet current rice varieties have been domesticated to grow widely from 53°N to 40°S (Khush 1997; Brambilla and Fornara 2014). Owing to their remarkable adaptability to various environments, rice varieties with diverse flowering times have been domesticated or improved from their wild ancestor. For example, the two major rice subspecies, indica and japonica, were domesticated to adapt to different ecological and geographical environments (Chou 1948; Caicedo et al. 2007). Some elite rice cultivars flower extremely early with weak photoperiod sensitivity in order to adapt to short summer growing seasons (Fujino and Sekiguchi 2005; Wei et al. 2008; Li et al. 2013). Optimal flowering time is an important breeding objective, which enables rice to adapt to seasonal changes and make maximum use of temperature and sunlight resources (Izawa 2007). Detailed knowledge of the genetic factors that control rice flowering time will increase our understanding of the adaptive mechanisms in cultivated rice and enable breeders to design appropriate genotypes for distinct preferences (Putterill et al. 2004).
Recent advances in flowering-time research in rice have identified a more complex and unique flowering pathway compared with that of Arabidopsis, the model long-day plant. The conserved Heading date 1 (Hd1) and the unique Early heading date 1 (Ehd1) are two major floral signal integrators that receive multiple signals from other genes to control the expression of the florigens, Heading date 3a (Hd3a) and RICE FLOWERING LOCUS T 1 (RFT1) (Tsuji et al. 2011). Hd1 suppresses flowering in long-day (LD) conditions, but activates it in short-day (SD) conditions (Yano et al. 2000). Ehd1 encodes a B-type response regulator that may not have an ortholog in the Arabidopsis genome (Doi et al. 2004). Ehd1 acts in an Hd1-independent flowering pathway that promotes flowering in both LD and SD conditions, and is a critical convergence point of regulation by multiple signaling pathways (Doi et al. 2004). In SD conditions, Ehd1 is controlled by OsGIGANTEA (OsGI), Early heading date 2 (Ehd2), and OsMADS51. In LD conditions, rice flowering is regulated by both the LD-activation pathway and the LD-suppression pathway (Tsuji et al. 2011). OsMADS50, Ehd2, Early heading date 3 (Ehd3) and Early heading date 4 (Ehd4), which promote flowering by directly or indirectly upregulating Ehd1 expression, constitute an LD-activation pathway in rice (Tsuji et al. 2011; Gao et al. 2013). On the other hand, Ehd1 expression is inhibited by a number of negative regulators, including Ghd7 (for Grain number, plant height, and heading date 7), DTH8 (for days to heading on chromosome 8) and OsPRR37 (for Oryza sativa Pseudo-Response Regulator 37). Together with Hd1, these flowering-time suppressor genes may constitute an LD-suppression pathway in rice (Tsuji et al. 2011; Koo et al. 2013).
Rice is a facultative short-day plant with a critical day length response and flowers rapidly in SD conditions (Tsuji et al. 2011). However, the growing area of domesticated rice, whose ancestral species grows in the tropics (SD conditions), was extended to high latitudes (LD conditions) following domestication and improvement (Izawa 2007). Domesticated rice can now flower under non-inductive LD conditions (Komiya et al. 2008). Interestingly, many studies have found that the genes that suppress flowering under LD conditions were important for the expansion of rice to high latitudes (Komiya et al. 2008; Xue et al. 2008). For example, molecular and association analysis revealed that natural variation in Hd1 alleles contributes to variation in flowering time and plays an important role in the regional adaptation of rice (Komiya et al. 2008; Wei et al. 2013). Allelic variation at the Ghd7 locus increases grain yield by adapting to long growing seasons and plays a key role in the adaptability of cultivated rice on a global scale (Xue et al. 2008; Lu et al. 2012). Sequence analysis of OsPRR37 variants demonstrated that the variation contributes to the northward expansion of rice cultivation (Koo et al. 2013). However, how the natural variation of these loci is associated with flowering time in rice accessions remains unknown. In addition, their evolutionary patterns and relative importance within different rice cultivation areas are still not clear. Exploration of these issues will facilitate an understanding of the comprehensive role of each gene in flowering-time regulation and provide an efficient way to improve ecological adaptation.
In this study, four genes (Hd1, DTH8, Ghd7 and OsPRR37) in the LD-suppression pathway of rice were collected and sequenced in the germplasm of 154, 74, 69 and 62 varieties, respectively, of cultivated rice (Oryza sativa) to reveal: (i) the level and pattern of sequence variation of the four loci between two rice subspecies, (ii) the relationship between polymorphism in the four loci and the natural variation in flowering time, and (iii) the roles of the four loci in flowering-time regulation and the regional adaptability of rice.

Genetic variation in rice
We sequenced and collected 359 sequences of four flowering suppressor loci under LD conditions from the representative germplasm (154, 74, 69 and 62 varieties for Hd1, DTH8, Ghd7 and OsPRR37, respectively) of the two cultivated subspecies. The total lengths of the aligned coding regions for Hd1, DTH8, Ghd7 and OsPRR37 were 2,028, 903, 774 and 2,229 bp, respectively. The schematic diagrams and polymorphic sites of all four loci are shown in Figure 1. For the four nuclear loci, the species-wide number of polymorphic sites varied from 12 (Ghd7) to 35 (Hd1) and the number of alleles ranged from 7 (Ghd7) to 39 (Hd1) in cultivated rice.
For the DTH8 coding region (903 bp), 14 SNPs and six indels were detected in the 62 varieties (Figure 1B). DTH8 has a simple structure, with a single exon and no intron. Of these polymorphic sites, three indels and three substitutions resulted in changes to four amino acids. Nine alleles, designated DTH8-1 to DTH8-9, were constructed based on these polymorphic sites. DTH8-1, DTH8-5, DTH8-6 and DTH8-7 were the most prevalent alleles, present in 19, 6, 12 and 12 varieties, respectively. Together, these alleles accounted for most of the accessions of cultivated rice (80.64%). Six alleles (66.67%) were shared by both cultivated subspecies; the exceptions were DTH8-2, DTH8-8 and DTH8-9.
Twenty-nine SNPs and only one indel were detected in the 2,229 bp coding region of OsPRR37 (Figure 1D). Of these polymorphic sites, one indel and three SNPs resulted in amino acid changes. Seventeen OsPRR37 types were classified according to these variations, designated OsPRR37-1 to OsPRR37-17. The most prevalent alleles were OsPRR37-8, OsPRR37-14 and OsPRR37-17, which were represented by 6, 14 and 19 varieties, respectively, while the other alleles were rare types represented by only one to five varieties. Fourteen indica varieties (35%) possessed the OsPRR37-14 allele with an 8 bp deletion, while 19 japonica varieties (65.52%) carried the OsPRR37-17 allele.
Since the number of polymorphic sites is heavily dependent on sample and sequence size, we further computed the nucleotide diversity to compare genetic diversity in the four loci. Standard statistics of sequence polymorphism for the four loci are shown in Table S1. The subspecies-wide levels of silent nucleotide variation of exons varied across loci from 0.00239 (OsPRR37) to 0.00522 (Hd1) in indica and from 0.00201 (OsPRR37) to 0.00485 (Hd1) in japonica. Average silent nucleotide variation of Hd1 was significantly higher than that of the other loci (P < 0.05, for both π and θ).
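The two diversity statistics named above can be illustrated with a minimal Python sketch. It assumes an alignment of equal-length sequences without gaps and computes per-site nucleotide diversity (π, mean pairwise differences) and Watterson's θ from the number of segregating sites. The toy sequences and function names are hypothetical, and the sketch does not restrict the calculation to silent sites as the analysis above does.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """pi: average number of pairwise differences, divided by alignment length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs) / length

def watterson_theta(seqs):
    """theta_W per site: S / (a_n * L), with a_n = sum of 1/i for i = 1..n-1."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n / len(seqs[0])

# Hypothetical toy alignment (not data from this study).
toy_alignment = ["ATGCATGCAT", "ATGCATGCAT", "ATGAATGCAT", "ATGAATGTAT"]
pi_value = nucleotide_diversity(toy_alignment)
theta_value = watterson_theta(toy_alignment)
```

Because π weights variants by their frequency while θ depends only on the count of segregating sites, comparing the two across loci is less sensitive to sample size than comparing raw numbers of polymorphic sites.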

Nonfunctional alleles diversified flowering time of rice
Contribution of function or loss of function of the four loci to flowering-time variation
Alleles with a premature stop codon or a change in the open reading frame were considered potentially nonfunctional. We performed an association analysis to further understand the relationship between function or loss of function of the four loci and flowering-time variation. However, population genetic structure can result in false associations between genetic markers and phenotypes. The well-documented genetic differentiation between the two subspecies (Second 1982; Garris et al. 2005; Zhu et al. 2007) is likely to be the main source of genetic structure within the collection of varieties. We therefore divided the cultivated rice into two groups, indica and japonica, to reduce the effect of population structure on the association analysis. Significant correlations in both indica (Hd1: r = 0.48, P = 1.98 × 10⁻⁵; DTH8: r = 0.31, P = 0.02; Ghd7: r = 0.64, P = 5.97 × 10⁻⁵; OsPRR37: r = 0.58, P = 2.85 × 10⁻⁶) and japonica (Hd1: … × 10⁻⁶) were found between function or loss of function of the four loci and the variation in flowering time. At the same time, flowering time in varieties carrying nonfunctional alleles was significantly shorter than that of varieties with functional alleles (Figure 2 and Table S2). Significant differences in flowering time were found between rice varieties with functional and nonfunctional alleles (P = 4.39 × 10⁻⁶ for Hd1; P = 0.027 for DTH8; P = 5.96 × 10⁻⁵ for Ghd7; P = 2.74 × 10⁻⁵ for OsPRR37; Figure 2).
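The criterion stated above (a premature stop codon or a change in open reading frame) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the classification procedure used in the study: the function name and toy sequences are hypothetical, the reading frame is assumed to start at the first base, and in-frame indels and amino acid substitutions are not evaluated.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def is_potentially_nonfunctional(reference_cds, allele_cds):
    """Flag an allele whose coding sequence is frameshifted or truncated.

    Assumption: both inputs are coding sequences in frame from the first
    base; alignment gap characters ('-') are removed before checking.
    """
    ref = reference_cds.upper().replace("-", "")
    alt = allele_cds.upper().replace("-", "")

    # A net length change that is not a multiple of 3 shifts the reading frame.
    if (len(alt) - len(ref)) % 3 != 0:
        return True

    # A stop codon anywhere before the final codon truncates the protein.
    codons = [alt[i:i + 3] for i in range(0, len(alt) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])

# Hypothetical toy sequences (not alleles from this study).
reference = "ATGGCTGCTTAA"
synonymous = "ATGGCCGCTTAA"       # same length, no premature stop
premature_stop = "ATGTAAGCTTAA"   # stop codon at the second position
frameshift = "ATGGCGCTTAA"        # 1 bp deletion
```

Under this rule an 8 bp deletion, such as the one reported for OsPRR37-14, would be flagged because 8 is not a multiple of 3.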
For the Hd1 locus, a diverse collection of 154 O. sativa accessions was surveyed; nonfunctional alleles were detected in 74 (48%) of these accessions. Eight alleles were represented by more than five samples. Of these, six were nonfunctional alleles, and varieties carrying each of these nonfunctional alleles had a shorter growing period (Figure 2). Average flowering time (FT) of accessions with nonfunctional alleles was about 76 d, which was significantly shorter (P = 4.39 × 10⁻⁸) than that of the other cultivated rice groups (99 d). Of the total FT diversity, about 28% was attributable to functional differences in the Hd1 locus.
Four nonfunctional DTH8 alleles were found in 13 accessions, accounting for 19% of the total accessions. DTH8-4 and DTH8-5 were each carried by more than five accessions. Varieties with the DTH8-4 allele had a significantly shorter growing period (FT = 63 d; P = 0.0001), but no difference in flowering time was observed between varieties with DTH8-5 and other alleles. However, for functional and nonfunctional groups, average FT differed significantly (P = 0.027) at 76 and 99 d, respectively. Furthermore, about 16% of the total FT difference was attributable to divergence between functional and nonfunctional groups.
Ghd7 is a major QTL for flowering time under LD conditions (Xue et al. 2008). Of the 74 varieties, 12 (16.21%) had the nonfunctional Ghd7-7 allele, and all accessions with the Ghd7-7 allele grew in temperate areas. Average FT in these samples was 58 d, approximately two-thirds (60.42%) of that of the other varieties. Function or loss of function of the alleles accounted for 37% of the total FT variation.
OsPRR37 was nonfunctional in 41 cultivated accessions, which accounted for 59% of the total population. OsPRR37-14, OsPRR37-15, OsPRR37-16 and OsPRR37-17 were the nonfunctional alleles in the population. Average FT in varieties with these alleles was 78, 96, 97 and 97 d, respectively. Average FT in accessions with nonfunctional alleles was about 86 d, which differed significantly (P = 2.74 × 10⁻⁵) from that of varieties containing functional alleles (104 d). Function or loss of function of the alleles accounted for 34% of the total FT variation.
Distribution and origin of the nonfunctional alleles
Nonfunctional alleles of the four loci were observed in both indica and japonica (Figure 1). Fifty-two indica and 21 japonica varieties had Hd1 nonfunctional alleles; nine indica and three japonica varieties had DTH8 nonfunctional alleles; two indica and 10 japonica varieties carried Ghd7-7, the only Ghd7 nonfunctional allele; for the OsPRR37 locus, nonfunctional alleles were detected in 19 indica and 22 japonica varieties. These results suggest that loss … However, nonfunctional alleles of the four loci exhibited different geographical distribution patterns, and the nonfunctional alleles of each locus have distinct geographic origins (Figure 3). Varieties with nonfunctional Ghd7 and DTH8 alleles are locally distributed in rice-growing areas. Most of the varieties with Ghd7-7 grow in northern China, the Korean Peninsula and Japan, and a number of DTH8 nonfunctional alleles were found in varieties grown in east China. Varieties with Hd1 and OsPRR37 nonfunctional alleles were distributed throughout the rice-growing areas (Figure 3). Yet, cultivars with different nonfunctional Hd1 or OsPRR37 alleles were mapped to distinct regions of the rice-growing areas. Consistent with this, the Mantel test detected a significant association between geographical distance and genetic distance (Fst/(1 - Fst)) in varieties with nonfunctional alleles for Hd1 (r = 0.62, P = 0.019) and OsPRR37 (r = 0.83, P = 1.23 × 10⁻⁴), but not in varieties with functional alleles for the four loci (Figure S2; Hd1: r = 0.12, P = 0.17; DTH8: r = 0.21, P = 0.09; Ghd7: r = 0.24, P = 0.08; OsPRR37: r = 0.03, P = 0.51), suggesting that the isolation-by-distance model applied only to populations with nonfunctional Hd1 and OsPRR37 alleles. Network trees representing phylogenetic relationships in the four loci are shown in Figure 4.
Except for OsPRR37, the most striking feature of the networks for the four loci (Figure 4) is that alleles with high frequency are functional types, with most old enough to have developed a star-like branching structure (phylogenetic cluster). We found significant differences among the four loci in phylogenetic incompatibility (Hd1: P = 0.01; DTH8: P = 0.04; Ghd7: P = 0.02; OsPRR37: P = 0.01) using the JE method (Jakobsen and Easteal 1996). The results identified that three of the most frequent functional alleles corresponded to three major lineages of Hd1, and the other three genes possessed two major lineages. All of these functional alleles were located at the central position (Figure 4). The intermediate relationship of the functional alleles of the four loci suggests that these functional alleles might be ancestral (Figure 4). In addition, the nonfunctional alleles were distributed in different lineages. Twenty-three of the 27 nonfunctional alleles of the four loci were located at the tips (Figure 4), which indicates that the nonfunctional alleles may have originated independently and recently (Bandelt et al. 1999).

DISCUSSION
Nonfunctional alleles of LD-suppressor genes diversified flowering time in rice
We compared the coding region sequences of four suppressor genes for flowering time from 62 to 154 rice landraces and found many loss-of-function mutations in all four genes. Significant correlations were detected between function or loss of function of the four loci and variation in flowering time. In addition, varieties with nonfunctional alleles flowered significantly earlier than those with functional alleles under LD conditions (Figure 2). These results indicate that loss-of-function mutations in rice play an important role in the regulation of flowering time under LD conditions. Flowering time is one of the best-studied ecologically significant traits under natural or artificial selection for adaptation of plants to specific natural environments. Some studies have found that loss of gene function is an important factor that regulates flowering time in many species. Of the flowering genes identified in Arabidopsis, FRIGIDA and FLOWERING LOCUS C are two suppressor genes of the vernalization pathway, and the loss of function of these two genes appears to underlie the extensive natural variation of flowering time for local adaptation (Le Corre et al. 2002; Balasubramanian et al. 2006). In wheat, one allele with a transposable element (TE) insertion had the lowest expression level among the Ppd-D1 alleles, and this allele may be responsible for the longest growth period and for the adaptation to northern latitudes and higher altitude regions (Gao et al. 2009). In addition, large deletions in the VRN-1 first intron are associated with the spring growth habit in barley and wheat (Fu et al. 2005). Loss of gene function may change the regulatory network with minimal side effects, which may be an important reason why many loss-of-function mutations were detected in the regulatory network of flowering time (Hottes et al. 2013).
Hd1, DTH8, Ghd7 and OsPRR37 have different genetic effects on flowering time. Ghd7 is the most important gene for adaptation to the northern region, and the function or loss of function of Ghd7 explains 37% of the variation of flowering time in rice. OsPRR37 and Hd1, which explain 34% and 28% of the variation in flowering time in rice, respectively, are widely used in rice domestication and breeding, although cultivated rice varieties from different geographical regions contain different nonfunctional alleles at both loci. The function or loss of function of DTH8 explains 19% of the variation in flowering time and only fine-tunes flowering time in rice. Alleles associated with regional adaptability should be taken into consideration for genetic improvement. Elucidation of the evolutionary patterns of these nonfunctional alleles, as well as the relationship between them and the geographical distribution of rice, provides opportunities to tailor rice environmental adaptability to suit diverse agricultural demands.
Nonfunctional alleles of LD-suppressor genes originated independently and recently
Based on multiple lines of evidence, we argue that the nonfunctional alleles of all four suppressor genes, which independently regulate flowering times in local rice varieties, originated from different ancestors during rice domestication or breeding under LD environments. Firstly, a clear regional distribution pattern in these nonfunctional alleles was observed. The distribution of the nonfunctional alleles of DTH8 and Ghd7 affected the distribution of local rice varieties, while that of the other two genes (Hd1 and OsPRR37) was closely correlated with latitude (Figure 3). Secondly, network studies suggested that the nonfunctional alleles were derived independently from distinct ancestral alleles with distant relationships (Figure 4). Thirdly, most nonfunctional alleles in different lineages were located at the tips (Figure 4); in addition, many phylogeographic studies have found that ancient alleles should be located at the center of the network tree, whereas recent alleles that are locally distributed geographically should be at the tips of the network tree. Taken together, our geographic analysis and network studies suggest that the nonfunctional alleles of these suppressor genes with regional adaptability were derived recently and independently.
The rich genetic variation in these four genes, which can adapt rice to different environments, provides the flexibility needed for designing various flowering times in rice. However, the mechanisms by which these genes regulate flowering time and their origin patterns in rice germplasm are still not thoroughly understood. Selection of genes in the photoperiod pathway from wild or landrace rice may enable cultivated rice varieties to adapt to different photoperiods along latitudes and farming systems (Ross-Ibarra et al. 2007). Artificial selection would leave evolutionary footprints in the genomes of cultivated rice (Sabeti et al. 2006; Ross-Ibarra et al. 2007). With wild rice samples included, these genes in cultivated rice will be analyzed in future studies and their evolutionary patterns compared with those of wild rice. This will shed new light on the domestication mechanism in rice, including the origin and formation of domestication traits and the specific pressures that affect these domestication traits, which can then be applied to design varieties of the desired quality.

Plant materials
The sequences of Hd1, DTH8, Ghd7 and OsPRR37 (Table S3) were collected from five studies reported by Wei et al. (2013), Fujino et al. (2010), Wei et al. (2008), Lu et al. (2012) and Koo et al. (2013). To ensure that the samples covered all distribution areas and rice subgroups, we further sequenced 5, 8, 7 and 14 accessions for Hd1, DTH8, Ghd7 and OsPRR37, respectively. The varieties for phenotype identification of Ghd7 and DTH8 were grown in Wuhan and Nanjing in China, respectively, while the varieties for OsPRR37 were grown in Korea. For the varieties used to assess Hd1, we collected the phenotypes from the Chinese Crop Germplasm Information System in Beijing, China. All phenotypes of the 359 accessions were obtained under LD conditions.
DNA extraction, PCR amplification and sequencing
Fresh leaves were collected from field-grown plants and genomic DNA was extracted using the hexadecyl-trimethyl ammonium bromide method as described by Ge et al. (1999). We amplified only the coding region for each locus. Sequences were amplified from genomic DNA using LA Taq (Takara). Table S4 and Figure S1 provide a list of all primers used for polymerase chain reactions (PCRs) and sequencing. PCR amplification methods follow those in Zheng and Ge (2010). All accessions were sequenced on an ABI3730XL automatic sequencer (Applied Biosystems). Since heterozygous individuals may exist for a small number of cultivars (Wei et al. 2013), PCR fragments of heterozygous samples were cloned into EASY vectors (Transgen, China) and three cloned DNA fragments were sequenced for each individual. Taq errors occur randomly, so polymorphisms shared by more than two clones are unlikely to be artificial (Eyre et al. 1998; Zheng and Ge 2010). To further verify singleton sites, we re-sequenced individuals that contained singletons in the alignments and obtained four clones after the second round of PCR.

Statistical analysis
In this study, we classified haplotypes as functional or nonfunctional based on previous studies (Wei et al. 2008; Fujino et al. 2010; Lu et al. 2012; Koo et al. 2013; Wei et al. 2013). The correlation between the functional status of these alleles (functional or nonfunctional) and flowering time was examined by the Spearman correlation coefficient test. The differences in flowering time among varieties with different alleles were examined by ANOVA for each locus and, if significant (P < 0.05), the Duncan multiple range test and critical test were conducted. A hierarchical analysis of flowering-time differences was performed for the four loci by one-way ANOVA, which estimated the relative contribution of the function or loss of function of the suppressor genes to flowering-time variance. Statistical analysis was performed in R 2.5.1.
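The two statistics described above can be sketched in plain Python. The sketch computes a Spearman rank correlation between a 0/1 functional-status code and flowering time, and the share of flowering-time variance between groups (between-group sum of squares over total sum of squares, often called eta squared). Using eta squared for the "relative contribution" is an assumption about how that quantity was derived; the data and function names are hypothetical, and the original analysis was run in R, not with this code.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def eta_squared(groups):
    """Between-group sum of squares divided by total sum of squares."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Hypothetical data: 1 = functional allele, 0 = nonfunctional; FT in days.
status = [1, 1, 1, 0, 0, 0]
flowering_days = [100, 98, 102, 76, 74, 78]
rho = spearman(status, flowering_days)
explained = eta_squared([[100, 98, 102], [76, 74, 78]])
```

With only two groups per locus, the one-way ANOVA reduces to a two-sample comparison, so the explained share is driven by the gap between the functional and nonfunctional group means relative to the spread within each group.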

Phylogeographic analysis
Initial sequence data were assembled with Contig-Express and aligned using ClustalX 1.81 (Thompson and Toga 1997). Sequences were manually edited with DAMBE (Xia and Xie 2001). Insertions/deletions (indels) were included in the analysis. An allele network tree was constructed with the Median-Joining model in Network version 4.0 (Bandelt et al. 1999). Gaps longer than one base were treated as a single mutation. The Mantel test implemented in Arlequin version 3.5 was used to examine the correlation between geographical and genetic distance (Slatkin 1993).
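The logic of a Mantel test on linearized genetic distance can be sketched as a simple permutation procedure in Python. This is an illustrative sketch only, not the Arlequin implementation: the matrices, the function names, the number of permutations and the one-sided P value convention are all assumptions made for the example.

```python
import random

def linearized_fst(fst):
    """Convert Fst to the linearized distance Fst / (1 - Fst)."""
    return fst / (1.0 - fst)

def upper_triangle(matrix, order):
    """Flatten the upper triangle after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def mantel_test(geo, gen, permutations=999, seed=0):
    """Return (r, one-sided P) by permuting population labels of `gen`."""
    identity = list(range(len(geo)))
    geo_flat = upper_triangle(geo, identity)
    observed = pearson(geo_flat, upper_triangle(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        shuffled = identity[:]
        rng.shuffle(shuffled)
        if pearson(geo_flat, upper_triangle(gen, shuffled)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical example: four populations placed along a line.
positions = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in positions] for a in positions]
fst = [[0.05 * d for d in row] for row in geo]  # toy pairwise Fst values
gen = [[linearized_fst(v) for v in row] for row in fst]
r_obs, p_value = mantel_test(geo, gen)
```

Rows and columns are permuted together because the pairwise distances within a matrix are not independent observations, which is why an ordinary correlation test on the flattened matrices would not be appropriate.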

SUPPORTING INFORMATION
Additional supporting information may be found in the online version of this article at the publisher's web-site.
Figure S1. Schematic diagrams of four nuclear loci and locations of the regions sequenced. Exons are shown as black boxes; thin lines between black boxes refer to introns. Locations of primers for each fragment are sketched above the diagrams using red arrows.
Figure S2. Distribution map of rice landraces carrying the functional alleles of (A) Hd1, (B) DTH8, (C) Ghd7 and (D) OsPRR37. Each circle represents one accession. Each color represents a different haplotype, except that alleles with fewer than three varieties are shown in pink.
Table S1. Summary of nucleotide polymorphism of four loci
Table S2. The differences between the functional and nonfunctional alleles by one-way ANOVA
Table S3. Sample list
Table S4. The primer sequences used in this study