Overexpression of PIF4 affects plant morphology and accelerates reproductive phase transitions in soybean

Phytochrome-interacting factor 4 (PIF4) acts as a signalling hub for integrating multiple environmental cues such as light and temperature. While the function of PIF4 in the model plant Arabidopsis

Further, soybean is a major legume whose nitrogen-fixing ability has led to its use in rotation agriculture for replenishing soil nitrogen (Córdova et al., 2019; Drinkwater et al., 1998; Lawn & Brun, 1974). Hence, soybean breeding programmes continuously aim to develop improved cultivars for better regional adaptability (Fehr, 2007; Hammond et al., 1972; Hartman et al., 2005; Li, Xin, et al., 2017). The photoperiod critically determines the onset of floral evocation in soybean, and soybean cultivars are classified into different maturity groups (ranging from 000 to 10) based on their daylength requirements to attain maturity (Yang et al., 2019). Short days promote flowering in soybean, but not all varieties have an obligate short-day requirement for flowering. Therefore, the soybean crop shows vast phenotypic diversity and plasticity (Tasma et al., 2001; Yang et al., 2019).
Phytochrome-interacting factor 4 (PIF4), belonging to the PIF subfamily of basic helix-loop-helix (bHLH) transcription factors, has been studied in the model plant Arabidopsis thaliana for its role in integrating light and temperature inputs (Leivar & Monte, 2014; Leivar et al., 2008). Upon exposure to red light, PIF4 binds to the light-activated (Pfr) form of phytochrome B (phyB) and is degraded via the 26S proteasomal degradation pathway. Light-induced degradation and dark-induced stabilisation of PIF4 regulate the transcription of thousands of downstream genes responsible for shade response, plant architecture, and biosynthesis of growth hormones (Bernardo-García et al., 2014; Franklin, 2009; Franklin et al., 2011; Koini et al., 2009; Lorrain et al., 2008; de Lucas et al., 2008). Among the plant hormones, gibberellic acid (GA) is known for synchronising growth and floral transition events in Arabidopsis (Eriksson et al., 2006; Xu et al., 2014). DELLA proteins, repressors of GA signalling and inactivators of PIF4, act at the interface of the PIF4 and GA signalling pathways, converging light and gibberellin responses to optimise growth in changing environments (de Lucas et al., 2008). Further, auxins act in flower initiation, floral organogenesis and post-reproductive processes (Vanneste & Friml, 2009). PIF4 controls the biosynthesis of a vital auxin, indole-3-acetic acid (IAA), by directly activating the promoters of TRYPTOPHAN AMINOTRANSFERASE OF ARABIDOPSIS 1 (TAA1) and CYP79B2 at higher temperatures (Franklin et al., 2011). PIF4 is also known to directly activate the mobile florigen FLOWERING LOCUS T (FT) in short days and warm ambient temperature conditions (Kumar et al., 2012). While the functional characterisation of PIF4 in Arabidopsis has provided valuable insights into the molecular control of light and temperature perception, detailed investigations are warranted to understand the role of PIF4 homologs in soybean (Jung et al., 2012).
Legumes, including soybean, have unique floral complexities; hence, it is challenging to translate information gained from Arabidopsis research to soybean (Jung et al., 2012; Liew et al., 2014; Wong et al., 2009, 2011). Arabidopsis is a facultative long-day plant, while soybean prefers short days for flowering (Liew et al., 2014). Also, soybean has a paleopolyploid genome resulting from two whole-genome duplication events, leading to multiple copies of flowering genes (Schmutz et al., 2010). For example, while two homologs of the Arabidopsis florigen FT, GmFT2a and GmFT5a, are known to promote photoperiodic flowering in soybean (Kong et al., 2010), another FT homolog, GmFT1a, controls floral reversion (Liu et al., 2018). Floral reversion is a unique floral complexity in which photoperiod-sensitive soybean varieties revert from flowering upon exposure to long photoperiods (Liu et al., 2018). Interestingly, the CRYPTOCHROME (CRY) homolog GmCRY1a, and not the other homolog GmCRY2a, is responsible for blue light-mediated floral initiation in soybean (Zhang et al., 2008). Hence, duplicated gene copies may play a key role in shaping the distinctive flowering characteristics of soybean cultivars (Cai et al., 2019).
The role of PIF genes in monocot crops such as rice and maize has been reported previously (Kumar et al., 2016; Xie et al., 2019). In rice, PIF gene homologs have been designated as OsPIL11, 12, 13, 14, 15 and 16 (Cordeiro et al., 2016). OsPIL14 interacts with phytochrome B, while OsPIL15 controls the tiller angle in rice (Cordeiro et al., 2016; Xie et al., 2019). Functional divergence of maize PIF3 was shown by testing the specificity of its interaction with the Pfr form of the maize phytochrome B (PHYB) homolog ZmPHYB2; another homolog, ZmPHYB1, did not interact with ZmPIF3 (Kumar et al., 2016). PIF3 has also been characterised for involvement in shade avoidance responses in Medicago sativa, an important perennial legume species (Lorenzo et al., 2019). In tomato, SlPIF4 controls hypocotyl elongation in warm temperatures (Hayes, 2019). Further, SlPIF1a and SlPIF3 control fruit ripening, while SlPIF4 controls pigmentation in tomato (Gramegna et al., 2019).
Earlier, we reported seven copies of PIF4 (GmPIF4a-g) in soybean, with GmPIF4b identified as the most likely candidate involved in floral transition based on its expression profile in inductive short days and its ability to induce early flowering in wild-type (WT) Arabidopsis (Arya et al., 2018). Further, complementation experiments showed that GmPIF4b partially rescued the compact rosette and stunted hypocotyl phenotypes of the Arabidopsis pif4-101 mutant (Arya et al., 2018). GmPIF4b protein was found to be regulated diurnally in long and short photoperiods (Arya et al., 2018). However, functional characterisation of PIF4 in soybean has not been reported. Thus, we employed a constitutive overexpression approach to characterise the function of GmPIF4b in the short-day cultivar Bragg. We report here that transgenic soybean lines carrying the 35S::GmPIF4b::polyA construct exhibited changes in plant morphology and also showed early onset of flowering, with an accelerated transition from the early pod formation stage to the full maturity stage.

GmPIF4 protein sequences against Arabidopsis PIF4
Multiple sequence alignment was performed using the ClustalW algorithm to compare the active phytochrome-binding domains of soybean PIFs. The Arabidopsis PIF4 sequence was also included for reference. The alignment was visualised using Jalview software (Waterhouse et al., 2009).

| cis-element search in the promoter regions of soybean PIF4s
2.4 kb upstream (5' end) region of GmPIF4s' coding sequence was analysed for cis-elements associated with light, temperature, hormonal and meristem controls. Most of the regulatory sequences are located upstream of the first "ATG" codon near the transcription start site; hence, 2.4 kb region upstream of first ATG was used in this study (Juven-Gershon & Kadonaga, 2010). cis-elements were searched in the PlantCARE database (Lescot et al., 2002). Information on plant cis-acting regulatory elements like enhancers and repressors is stored in the PlantCARE. cis-elements are represented as consensus sequences, positional matrices and individual sites on the promoter. The database also stores information about the binding sites of transcription factors, their position and site on the promoter, and functional annotations of the cis-elements of interest (Lescot et al., 2002).
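The idea of scanning a fixed upstream window for short recurring motifs can be illustrated with a minimal Python sketch. This is not how PlantCARE works internally (it uses its own curated matrices and annotations); the helper names `upstream_region` and `find_motif` are hypothetical, and the only biological assumption is the commonly cited G-box core sequence CACGTG.

```python
# Simplified stand-in for a cis-element search; names are hypothetical
# and this does not reproduce PlantCARE's scoring or annotations.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def upstream_region(genomic: str, atg_index: int, length: int = 2400) -> str:
    """Return up to `length` bases immediately upstream of the first ATG."""
    start = max(0, atg_index - length)
    return genomic[start:atg_index]


def find_motif(promoter: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based position, strand) for each exact motif match.

    A palindromic motif (such as CACGTG) is reported once, on '+'.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(promoter) - len(motif) + 1):
        window = promoter[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    toy_promoter = "TTCACGTGAACACGTG"  # invented sequence, not soybean data
    print(find_motif(toy_promoter, "CACGTG"))
```

Counting the returned hits per promoter gives the kind of per-gene motif tally reported for each element; real analyses would also need degenerate consensus matching, which this exact-match sketch omits.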

| Analysis of gene structures of soybean PIF4s
Genomic and CDS sequences of soybean and Arabidopsis PIF4s were retrieved from the Phytozome database, and the Gene Structure Display Server (GSDS) tool (http://gsds2.cbi.pku.edu.cn) was used for visualising the exon-intron structure of soybean and Arabidopsis PIF4 genes (Hu et al., 2015). Sequences were compiled in a '.txt' file and uploaded to the GSDS server to obtain line diagrams of the gene structures.
The full-length coding sequence of GmPIF4b (Glyma.14G032200.1) was cloned downstream of the 35S promoter in the cloning vector pRT-101 (Töpfer et al., 1987), and the resulting construct (35S::GmPIF4b::polyA) was transferred to the binary vector pUQC10255 (from The University of Queensland) for Agrobacterium-mediated (strain EHA105) transformation of soybean. Soybean was transformed using the protocol detailed in Method 1. The steps of soybean transformation and shoot regeneration from transgenic calli are shown in Figures S1 and S2. T-DNA insertion in the transgenic plants was confirmed by genomic PCR of the Bar gene using an initial denaturation at 94°C for 2 min, 36 cycles of 94°C for 30 s, 63°C for 30 s and 72°C for 1 min, and a final extension at 72°C for 10 min. Segregation analysis of the progeny was performed with a glufosinate resistance test, in which 50 mg/L of bialaphos (glufosinate) was applied to the leaves of T2 transgenic plants to distinguish resistant from susceptible plants. A chi-square test was performed to assess goodness of fit to a 15:1 ratio in T2. (In T2, the trait is expected to segregate in a 9:3:3:1 ratio, with 15 (9 + 3 + 3) parts of the population resistant and 1 part susceptible.) The glufosinate test also helped in determining the homozygosity of the progenies. A line was considered homozygous if 100% of its progeny was resistant to glufosinate.
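As an illustration of the goodness-of-fit step, the following Python sketch computes a chi-square statistic against a 15:1 expectation and the corresponding p-value for one degree of freedom. The counts (23 resistant, 2 susceptible) are invented for demonstration and are not the values in Table S2; the function names are hypothetical. The p-value uses the identity that, for one degree of freedom, the chi-square survival function equals erfc(sqrt(x/2)).

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for counts vs. an expected ratio.

    Returns the statistic and the list of expected counts.
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Hypothetical counts for 25 T2 seedlings (resistant, susceptible).
    chi2, expected = chi_square_gof([23, 2], [15, 1])
    print("expected counts:", expected)
    print("chi-square:", round(chi2, 4), "p-value:", round(p_value_df1(chi2), 4))
```

With only 25 seedlings the expected number of susceptible plants is below two, so the chi-square approximation is rough; an exact binomial test would be a reasonable alternative in that situation.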

| Estimation of copy number in transgenic soybean plants
A previously published protocol for copy number estimation by qPCR in transgenic soybeans was followed (Li, Cong, et al., 2017). A standard curve was generated by plotting the Ct values (the amplification cycle at which the fluorescent signal is detected in qPCR) against the log of DNA copy number in a sample dilution series. The lectin1 gene of soybean was used as an endogenous control, and the bar gene was used for copy number estimation. The ratio of bar:lectin1 was determined for each line (transgenic and wild type) by using the mathematical formula; where n is the dilution factor used to generate the standard curve, Ct bar is the Ct value of bar gene amplification, intercept bar is the intercept of the standard curve of bar gene amplification, and slope bar is the slope of the standard curve of bar gene amplification. Similarly, Ct lectin1 is the Ct value of lectin1 gene amplification, intercept lectin1 is the intercept of the standard curve of lectin1 gene amplification, and slope lectin1 is the slope of the standard curve of lectin1 gene amplification. The amplification conditions used were: 94°C for 2 min, 40 cycles of 94°C for 30 s, 60°C for 30 s and 72°C for 1 min, and a final extension at 72°C for 10 min. Primers used were as follows: For Bar amplification,
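A minimal Python sketch of how a bar:lectin1 ratio can be derived from two standard curves is given below. It assumes one common form of the calculation, in which each standard curve is Ct = slope × log_n(quantity) + intercept, so that the starting quantity is n raised to (Ct − intercept)/slope; this form is an assumption consistent with the symbols defined above, not a confirmed transcription of the published formula. All function names and numbers are hypothetical.

```python
def starting_quantity(ct, intercept, slope, n=10.0):
    """Invert a standard curve Ct = slope * log_n(quantity) + intercept.

    Assumed form; n is the dilution factor used as the logarithm base.
    """
    return n ** ((ct - intercept) / slope)


def bar_to_lectin_ratio(ct_bar, intercept_bar, slope_bar,
                        ct_lectin, intercept_lectin, slope_lectin, n=10.0):
    """Ratio of estimated bar to lectin1 starting quantities."""
    bar = starting_quantity(ct_bar, intercept_bar, slope_bar, n)
    lectin = starting_quantity(ct_lectin, intercept_lectin, slope_lectin, n)
    return bar / lectin


if __name__ == "__main__":
    # Invented Ct values and curve parameters, for illustration only.
    ratio = bar_to_lectin_ratio(24.0, 38.0, -3.32, 25.0, 38.0, -3.32)
    print("bar:lectin1 ratio:", round(ratio, 2))
```

In this sketch the two genes may have different slopes and intercepts, which is why each Ct is converted through its own curve before the ratio is taken.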

| Analysis of pod set in transgenic soybean plants following intermittent exposure to long days
Soybean plants growing in short days (8 h light, 16 h dark) at 25°C and a light intensity of 400 µmol·m⁻²·s⁻¹ were transferred to long days (16 h light, 8 h dark) at the full-bloom stage (R2). The total number of flowering nodes was counted before exposure to long days, and the number of flowering nodes giving rise to pods was counted after exposure to long days.

| Quantification of GmPIF4b, GmFT2a and GmFT5a transcripts in WT and transgenic soybean lines
(Liew et al., 2017). Mean fold change in gene expression was calculated using the 2^(−ΔΔCT) method (Livak & Schmittgen, 2001). The primers used were as follows: For Actin amplification,

Binding Domain in soybean PIF4s
Active phytochrome-binding domains, APA and APB of PIF proteins, mediate interactions with the Pfr forms of phytochrome A and B, respectively (Figure 1a) (Huq & Quail, 2002; Leivar & Monte, 2014). These short stretches are present at the N terminus and are characteristic of PIF proteins (Leivar & Monte, 2014). Multiple sequence alignment of soybean PIF4s and Arabidopsis PIF4 (AtPIF4) proteins provided insights into the sequence structure of soybean PIF proteins. GmPIF4a, b, c, d, f and g have conserved APB domains, whereas GmPIF4e lacks the APB domain. Further, the amino acid L (leucine) is present at the 37th position in all soybean PIF4s except GmPIF4e, whereas the Arabidopsis APB domain contains Q (glutamine) at this position (Figure 1b). The substitution suggests a change in the structure of the APB domain in soybean PIFs, since L is a hydrophobic amino acid and Q is a polar amino acid with proton accepting and donating properties (Figure 1b). Further, GmPIF4c has R (arginine) and GmPIF4d has K (lysine) at the 39th site, whereas all other soybean PIF4s and Arabidopsis PIF4 have Q (Figure 1b).

| Intron-Exon organisation of soybean PIF4s
The gene structure analysis revealed variable intron-exon structures of Arabidopsis and soybean PIF4s (Figure 1c). AtPIF4 has six CDS and five introns. GmPIF4c and GmPIF4d have a similar organisation with seven CDS and six introns; GmPIF4a and GmPIF4b are made of eight CDS and seven introns (Figure 1c). This information is significant because the structure of a mature mRNA depends upon the splicing of introns. Further, GmPIF4a, GmPIF4c, GmPIF4f and GmPIF4g also have alternative transcripts or splice variants (Table S1). One splice variant has been detected for GmPIF4a and GmPIF4c, five for GmPIF4f and three for GmPIF4g. Splice variants result from differential regulation of mRNA splicing, so that a single gene can yield several protein products, putatively with different biological functions (Chaudhary et al., 2019).

| Analysis of cis-regulatory elements in the promoter regions of soybean PIF4s
Short recurring sequences known as cis-elements are often present upstream of the transcription start site and are putative binding sites of important transcription factors or regulatory molecules (D'Haeseleer, 2006). Analysis of promoter regions (2.4 kb upstream of the first ATG) revealed the presence of cis-elements related to light signalling, gibberellin control, auxin response, stress and defence response and metabolite synthesis pathways (Figure 2). G-BOX, BOX-4, I-BOX and GT-1 motifs were the abundant light response elements in the promoters of Arabidopsis and soybean PIF4s. Seven G-BOX elements are present in Arabidopsis PIF4, while six are present in the promoter regions of GmPIF4a and GmPIF4e. The number of G-BOX sites drops to four in GmPIF4b, three in GmPIF4c, and one in GmPIF4d and GmPIF4g (Figure 2a). The G-BOX is one of the most common light response elements found in the promoters of light-signalling genes and represents a significant consensus sequence in terms of transcription factor binding (Ezer et al., 2017). Further, the frequency of repetition often defines the extent of binding of a transcription factor to its site for activation or repression of a gene (Espley et al., 2009). The GARE motif is an important consensus sequence in gibberellin-responsive genes (Bastian et al., 2010). The Arabidopsis PIF4 promoter contains two GARE motifs, while GmPIF4d contains three. One GARE motif is present in GmPIF4a, GmPIF4c and GmPIF4e, and none in GmPIF4b, GmPIF4f and GmPIF4g. However, the TATC box, another gibberellin control element, is present at one site in GmPIF4b (Figure 2b). Diversity in consensus sequences can determine the specificity of interactions of transcription factors for the same biological function and can aid in recognition of one gene copy from another (Biłas et al., 2016). An important auxin response motif known as the TGA motif is found in Arabidopsis PIF4, which is indicative of its role in regulating auxin biosynthesis. The Arabidopsis PIF4 promoter contains one TGA element, GmPIF4b and GmPIF4d contain one each, and no TGA element is present in the other GmPIF4s, suggesting their divergence from regulating auxin pathways (Figure 2b).
Plant senescence responses often involve the critical role of abscisic acid, and ABRE is an essential motif for abscisic acid control (Song et al., 2016). Five ABRE motifs are present in Arabidopsis PIF4, seven in the GmPIF4a and GmPIF4e promoters, three in GmPIF4c, and none in GmPIF4b, GmPIF4d, GmPIF4f and GmPIF4g (Figure 2c). Stress-inducible genes often have ABRE motifs in their promoters, which is suggestive of their role in controlling stress responses (Narusaka et al., 2003). MBS, a cis-element related to drought response, is located in the promoters of all GmPIF4s except GmPIF4f and GmPIF4g (Figure 2c).
Unique endosperm, seed and meristem gene expression response elements are found in soybean PIF4s, and these elements are absent in the promoter of Arabidopsis PIF4. CAT box, which is an important meristem expression motif is also present in GmPIF4a and GmPIF4c (Figure 2d). Developing plant embryos get their nutrients from the endosperm tissue, and GCN4 motif is located in the promoters of genes responsible for endosperm specific expression (Wu et al., 1998;Yoshihara et al., 1996). GCN4 motif is present in GmPIF4a, GmPIF4c and GmPIF4e, indicating their putative role in endosperm-related gene expression response (Figure 2d). Cis-element analysis of soybean PIF4s reflects their diversity in controlling plant growth and development. The presence of unique motifs in soybean PIF4s supports the divergence of these genes in controlling diverse functions in soybean.

| Soybean transformation and transgene copy number detection in transgenic soybean lines carrying 35S::GmPIF4b::polyA construct
The overexpression approach has been widely used for the characterisation of unknown genes in crops (Saijo et al., 2000; Wang et al., 2015). Here, we used Agrobacterium-mediated transformation to generate transgenic soybean plants carrying the 35S::GmPIF4b::polyA construct. Regenerated transgenic plants were grown under glasshouse conditions until maturity. Out of nine lines regenerated, only three produced viable seeds; these were designated Line 1, Line 2 and Line 3. This generation of regenerated plants was designated T0. Seeds obtained from T0 produced T1 plants. The number of T1 seeds obtained was not sufficient for segregation analysis; hence, segregation analysis was performed in T2 progeny. Twenty-five seeds of T2 progeny were used for determining the number of glufosinate-resistant and susceptible plants. Glufosinate-susceptible leaves turned yellow at the site of glufosinate application, while resistant leaves remained green (Figure 3a). The chi-square test showed that Line 1 and Line 2 fit the expected 15:1 ratio of trait segregation in T2, suggesting the insertion of one transgene in the genome, but Line 3 deviated from the 15:1 ratio, reflecting the possibility of a higher copy number (Table S2). PCR analysis of genomic DNA showed amplification of the Bar gene in Lines 1, 2 and 3 (Figure 3b). The structure of the 35S::GmPIF4b::polyA construct is shown as a line diagram (Figure 3c). Additional evidence to confirm the copy number in transgenic lines was obtained by qPCR analysis of genomic DNA obtained from T0 plants. The ratio of bar:lectin1 was calculated using the standard curve equations obtained for WT and transgenic Lines 1, 2 and 3. The results revealed that Line 1 and Line 2 had one transgene each, while Line 3 had two transgenes (~1.7) inserted in the genome. Equations of the standard curves and linear regression values (R²) for bar and lectin1 (control) are shown in Table S3.
The standard curve is a straight-line plot of Ct values against the logarithm of DNA copy number per dilution, while R² is the coefficient of determination, which reflects how well the line fits the data (Li, Cong, et al., 2017). The R² values for bar gene amplification were 0.9911, 0.9919 and 0.9606 for Lines 1, 2 and 3, respectively (Table S3).
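The relationship between a standard curve and its R² value can be sketched in a few lines of Python. The example fits Ct against log10 of copy number by ordinary least squares and reports slope, intercept and R²; the dilution series is invented, the function name is hypothetical, and log base 10 is an assumption made for illustration.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct against log10(copy number).

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, cts))
    ss_tot = sum((y - mean_y) ** 2 for y in cts)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Invented ten-fold dilution series with slightly noisy Ct values.
    copies = [1e1, 1e2, 1e3, 1e4]
    cts = [35.1, 31.6, 28.5, 25.0]
    slope, intercept, r2 = fit_standard_curve(copies, cts)
    print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.4f}")
```

An R² close to 1 indicates that the Ct values fall nearly on a straight line across the dilution series, which is the property the reported values describe.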

| Overexpression of GmPIF4b affects plant morphology in transgenic soybean
Average plant height (measured as the length of the primary stem), leaf surface area, number of branches, flowering time and number of pods were recorded for transgenic soybean plants against WT. Average plant height was significantly reduced in Lines 1, 2 and 3 as compared to the WT soybean plants (Figure 4a). Further, leaf surface area was also reduced in transgenic plants (Figure 4b).
Average plant height of Lines 1, 2 and 3 was 32.14, 23.04 and 23.25 cm, respectively, while the average plant height for WT was 43.25 cm (Figure 4c). An average reduction of 17.125 cm was observed in the height of transgenic plants. The third trifoliate leaf was used for comparing the leaf surface area of WT and transgenic lines. The area of the third trifoliate leaf was reduced by 26.75 cm² in Line 1, 24.98 cm² in Line 2 and 24.5 cm² in Line 3 as compared to the WT (Figure 4d). The total number of branches per plant was also significantly reduced in transgenic lines. Lines 1, 2 and 3 produced an average of 6.75 (~7), 6 and 6 branches per plant, respectively, while WT plants produced an average of 7.5 (~8) branches per plant (Figure 4e). In terms of days to flowering, Line 1 flowered eight days earlier than WT, and Lines 2 and 3 flowered 13 days earlier than WT (Figure 4f). The average number of pods differed significantly from WT only in Line 2, which produced an average of 3.6 (~4) more pods than WT (Figure 4g). Hence, constitutive overexpression of GmPIF4b significantly affected plant morphology and flowering time in the short-day soybean cultivar Bragg.

| Overexpression of GmPIF4b accelerates reproductive phase transition in transgenic soybean
Reproductive phases of soybean are designated as Rn, where n denotes the number allocated to a specific reproductive stage (Table S4). R2 corresponds to flowering at full bloom (Fehr & Caviness, 1977). After R2, soybean plants start producing pods and progress to full maturity (R8) (Fehr & Caviness, 1977). The time taken for the transition from R2 to R8 was recorded, with the intermediate stages (R4, R6 and R7) as separate points (Figure 5). Transgenic soybean plants carrying the 35S::GmPIF4b::polyA construct showed an accelerated transition from R2 to R4 as compared to the WT. While WT plants were at R6 (full seeds with green pods), the transgenic lines had already started attaining R7, with several pods turning dark brown (Figure 5a). At full maturity, WT (Bragg) pods were pale to light brown, while transgenic pods developed a dark or deep brown colour (Figure 5b). Dark brown pods are often found in wild soybean varieties (He et al., 2015). Interestingly, WT seeds had a dark hilum, and transgenic seeds developed a clear hilum (Figure 5c). WT plants attained R4 in an average of 67 days, while Lines 1, 2 and 3 attained R4 in 60.5, 56.85 and 56.75 days, respectively (Figure 5d). Lines 1, 2 and 3 reached R6 in an average of 80.83, 77.57 and 76.75 days, respectively, while WT plants attained R6 in 87.4 days (Figure 5e). Similarly, Lines 1, 2 and 3 reached R7 and R8 faster than WT, attaining R8 in 102.5, 102.1 and 99 days, respectively, compared with 109 days for WT (Figure 5f,g).

| Analysis of the effect of sub-optimal photoperiod, intermittent exposure to long days, in transgenic lines
Termination of flowering and reduced yield have been reported in late-maturity soybean varieties (Han et al., 1998; Kato et al., 2015). Unfavourable or sub-optimal photoperiod is one of the factors that can lead to abscission of flowers in late-maturity soybean varieties (Han et al., 1998; Kato et al., 2015; Liu et al., 2018). To test whether a sub-optimal photoperiod can affect flowering and yield, WT and transgenic plants growing in short days were exposed to 10 long days at the full-bloom (R2) stage. Transgenic plants produced more pods at flowering nodes than the WT after interruption with long days (Figure 6a). An average of 7.8-8.1 nodes were present in transgenic lines and 8.8 nodes in WT plants at the full-bloom stage (before exposure to long days) (Figure 6b). These differences were not significant. However, in WT, the number of flowering nodes was reduced significantly after exposure to long days (putatively due to termination of flowers), and poorly developed pods were observed (Figure 6a). An average of six flowering nodes gave rise to healthy pods filled with seeds in transgenic lines, whereas an average of three flowering nodes gave rise to poorly developed and empty pods in WT plants (Figure 6a,c). The differences in the number of pods produced were highly significant (Figure 6c).
Figure 5 (caption, panels d-g): (d) Bar graph representing data (average no. of days) to attain R4. (e) Bar graph representing data (average no. of days) to attain R6. (f) Bar graph representing data (average no. of days) to attain R7. (g) Bar graph representing data (average no. of days) to attain R8. n = 4-6. Error bars represent standard deviations. Student's t-test was used for calculating significant differences, which are indicated with asterisks (*) for p < 0.05 and (**) for p < 0.01. Phenotypes of all transgenic lines have been compared to the wild-type phenotype.
to compare the expression of two main soybean florigens, GmFT2a and GmFT5a, in WT and transgenic plants, as GmFT2a and GmFT5a are the prime florigens controlling photoperiodic floral induction in soybean (Kong et al., 2010). Quantitative PCR was employed to determine the transcript levels of GmFT2a, GmFT5a and GmPIF4b in WT and transgenic soybean plants carrying the 35S::GmPIF4b::polyA construct. GmPIF4b levels were significantly elevated in transgenic plants. Compared to WT, the mean fold change in the transcript expression of GmPIF4b was 4.02 for Line 1, 4.61 for Line 2 and 15.91 for Line 3 (Figure 7a). Transcript levels of GmFT2a and GmFT5a were also significantly elevated in Lines 1, 2 and 3. The mean fold changes in GmFT2a transcript levels were 4.49, 2.01 and 7.93 for Lines 1, 2 and 3, respectively (Figure 7b). Further, the mean fold changes in GmFT5a transcript levels were 4.49, 2.21 and 4.71 for Lines 1, 2 and 3, respectively (Figure 7c).
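Fold changes of this kind come from the 2^(−ΔΔCT) method (Livak & Schmittgen, 2001), which can be sketched in Python as follows. The Ct values are invented, the function name is hypothetical, and the use of a single reference gene (for example Actin) with roughly 100% amplification efficiency is an assumption of the method as sketched here.

```python
def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed per condition.
    ddCt = dCt(sample) - dCt(control).
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Invented Ct values: target gene in a transgenic line vs. WT,
    # each normalised to a reference gene measured in the same sample.
    fc = fold_change(ct_target_sample=22.0, ct_ref_sample=18.0,
                     ct_target_control=24.0, ct_ref_control=18.0)
    print("fold change vs WT:", fc)
```

A target that crosses the threshold two cycles earlier in the transgenic sample than in WT, with an unchanged reference gene, therefore corresponds to a four-fold increase.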

| DISCUSSION
Soybean is a major legume crop used widely for oil and fodder (El-Shemy, 2011). Genes that integrate light and temperature signals are of particular interest due to their role in controlling flowering, maturity and yield (Balasubramanian et al., 2006; Franklin et al., 2011; Kumar et al., 2012). Multiple sequence alignment of soybean PIF4s with Arabidopsis PIF4 revealed the presence of the APB domain in all soybean PIF4s except GmPIF4e (Figure 1b). PIF4 is an important player in light and temperature perception, and the presence of seven homologs of PIF4 in soybean may point towards the diversity of their functions (Arya et al., 2018). Analysis of cis-elements in the promoter regions of Arabidopsis and soybean PIF4s showed that essential light signalling, hormone biosynthesis and plant stress-related elements are present at different sites (Figure 2). The presence of light response elements such as G-BOX and I-BOX, gibberellin control elements such as the GARE motif, and various stress-responsive elements is suggestive of the conserved function of soybean PIF4s (Figure 2). In this study, changes in plant architecture were observed upon constitutive overexpression of GmPIF4b in the short-day soybean cultivar Bragg (Figure 4). Plant height was significantly reduced in transgenic soybean (Figure 4a,c). This could be due to attenuation of the GA biosynthesis pathway, as external application of GA can increase the lower internodal length in soybean in short days (Mislevy, Boote, & Martin, 1988). Recently, Chen et al. (2020) also reported a reduction in plant height upon overexpression of the soybean APETALA1 gene, GmAP1a, which is an essential floral integrator (Kaufmann et al., 2010). Chen et al. (2020) reported that the expression levels of key GA biosynthesis and GA-responsive genes were lower in GmAP1a overexpression lines than in the WT (Williams 82) plants.
In our experiment, transgenic soybean plants overexpressing GmPIF4b also exhibited reduced leaf surface area as compared to the WT (Figure 4b,d). PIF4 mediates the shade avoidance response in Arabidopsis, and a reduction in leaf surface area is a typical phenotype of plants growing under dense vegetation, ensuring optimum allocation of resources to favour the growth of reproductive structures (Franklin, 2008; Gommers et al., 2013; Keiller & Smith, 1989; Lorrain et al., 2008). Early flowering in transgenic soybean upon overexpression of GmPIF4b is suggestive of a conserved phenotypic response, as overexpression of PIF4 in Arabidopsis also results in an accelerated transition to flowering for achieving reproductive success (Galvão et al., 2015; Galvão et al., 2019; Kumar et al., 2012). Further, the absence of an effect on pod set in transgenic Line 1 and Line 3 indicates that overexpression of GmPIF4b could affect the phase transition without compromising yield (Figure 4g).
Change in pod colour from tan in WT to dark brown in transgenic plants reflects that overexpression of GmPIF4b could affect the molecular pathways responsible for imparting pod colour in soybean (Figure 5a,b). Previous reports suggest two genetic loci, L1 and L2, are associated with the inheritance of pod colour in soybean and generally, wild soybean plants attain black or dark pod colour (He et al., 2015). The expression of Glyma19g27460 gene was upregulated in black pods. Hence, Glyma19g27460, which encodes for a SANT (an acronym for Swi3, Ada2, N-Cor and TFIIIB) superfamily Myb domain protein was reported to be the most likely candidate for L1 locus (He et al., 2015). SANT protein domains help in the association of chromatin remodelling proteins with the histones (Boyer et al., 2004).
Change in hilum colour from dark brown in WT to clear white in transgenic plants over-expressing GmPIF4b was a novel and exciting observation (Figure 5c) because soybean breeders have used hilum colour as a genetic marker in soybean crosses (Bhatt & Torrie, 1968). Hilum colour also acts as a classification factor in the choice of soybean varieties to be grown by farmers, for industrial use and customer satisfaction purposes (Araujo et al., 2019).
Abscission or termination of flowering can result in declined yields in determinate soybean varieties (Kato et al., 2015). Bragg, which is a determinate variety, served as a perfect model for studying the effect of sub-optimal photoperiod on flowering. Transgenic soybean plants over-expressing GmPIF4b exhibited better pod production when plants growing in short days were transferred to long days at full-bloom stage. This result indicated that constitutive overexpression of GmPIF4b putatively reduced abscission of flowers in long photoperiod leading to higher pod set in transgenic lines as compared to the WT (Figure 6a-c).
Quantitative RT-PCR results showed that the transcripts of the soybean florigens GmFT2a and GmFT5a were elevated in transgenic lines (Figure 7b,c). Kong et al. (2010) reported five pairs of FT homologs in soybean: GmFT1a and 1b, GmFT2a and 2b, GmFT3a and 3b, GmFT4a and 4b, and GmFT5a and 5b. GmFT2a and GmFT5a were upregulated in short days (Kong et al., 2010). Further, ectopic expression of GmFT2a and GmFT5a resulted in premature flowering in Arabidopsis. Their study also showed that the transcripts of GmFT2a and GmFT5a accumulated in short days, but the levels of GmFT2a dropped in the trifoliate leaves when soybean plants were treated with long days, indicating that GmFT2a expression is more sensitive to photoperiod (Kong et al., 2010). Elevated expression of GmFT2a and GmFT5a likely led to early flowering in transgenic soybean over-expressing GmPIF4b in this study (Figure 7b,c). It is also possible that overexpression of GmPIF4b resulted in attenuation of plant hormone biosynthesis pathways, such as those of auxin and GA, which control the induction of flowering in the meristem (Aloni et al., 2006; Eriksson et al., 2006; Galvão et al., 2015). Moreover, it is well established that flowering is a polygenic trait (Zan & Carlborg, 2019), and more than one factor is likely to have contributed to the onset of early flowering in transgenic soybeans over-expressing GmPIF4b.
Overall, GmPIF4b putatively controls plant architecture in vegetative stages (plant height and leaf area) and time to flower as evident by accelerated phase transitions in transgenic soybean over-expressing GmPIF4b (Figures 4 and 5, Table 1). No significant effect on the yield of two transgenic lines indicated that it is possible to accelerate phase transitions (vegetative to reproductive), without compromising the final yield, by altering expression of genes that control the integration of environmental cues (Figure 4). This study also points out that GmPIF4b can act as a prime candidate gene to be targeted for developing soybean varieties with better regional adaptability.