• Open Access

A comparative study of seed yield parameters in Arabidopsis thaliana mutants and transgenics

Authors

  • Inge Van Daele,

    1. Department of Plant Systems Biology, VIB, Gent, Belgium
    2. Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
    3. Plant Sciences Unit, Growth and Development Research Area, Institute for Agricultural and Fisheries Research (ILVO), Melle, Belgium
    • These authors contributed equally to this work.

    • Present address: Department of Molecular and Cellular Interactions, VIB, 1050 Brussels, Belgium.

  • Nathalie Gonzalez,

    1. Department of Plant Systems Biology, VIB, Gent, Belgium
    2. Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
    • These authors contributed equally to this work.

  • Ilse Vercauteren,

    1. Department of Plant Systems Biology, VIB, Gent, Belgium
    2. Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
  • Lien de Smet,

    1. Department of Plant Systems Biology, VIB, Gent, Belgium
    2. Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
    • Present address: Department of Molecular Biotechnology, Ghent University, 9000 Gent, Belgium.

  • Dirk Inzé,

    1. Department of Plant Systems Biology, VIB, Gent, Belgium
    2. Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
  • Isabel Roldán-Ruiz,

    1. Plant Sciences Unit, Growth and Development Research Area, Institute for Agricultural and Fisheries Research (ILVO), Melle, Belgium
  • Marnik Vuylsteke

    Corresponding author
    1. Department of Plant Systems Biology, VIB, Gent, Belgium
    2. Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, Belgium
      (fax +32 9 331 38 09; email marnik.vuylsteke@psb.vib-ugent.be)


Summary

Because seed yield is the major factor determining the commercial success of grain crop cultivars, there is great interest in better understanding the genetic factors underlying this trait. Although many studies, mainly in the model plant Arabidopsis thaliana, have reported transgenes and mutants with effects on seed number and/or seed size, knowledge about seed yield parameters remains fragmented. This study investigated the effect of 46 genes, in gain- and/or loss-of-function situations, with a total of 64 Arabidopsis lines being examined for seed phenotypes such as seed size, seed number per silique, number of inflorescences, number of branches on the main inflorescence and number of siliques. Sixteen of the 46 genes, examined in 14 Arabidopsis lines, were reported earlier to directly affect seed size and/or seed number or to indirectly affect seed yield through their involvement in biomass production. Other genes involved in vegetative growth, flower or inflorescence development or cell division were hypothesized to potentially affect the final seed size and seed number. Analysis of this comprehensive data set shows that of the 14 lines previously described to be affected in seed size or seed number, only nine showed a comparable effect. Overall, this study provides the community with a useful resource for identifying genes with effects on seed yield and candidate genes underlying seed QTL. In addition, this study highlights the need for more thorough analysis of genes affecting seed yield.

Introduction

Given the rapid increase in the world population and the need for bio-energy as an alternative for fossil fuel, increasing crop yield is a major challenge. Because seeds are the major source of the world’s food calories, one way to meet this challenge is to create crops producing a higher seed yield. Seed size and number are the two main components contributing to seed yield. However, the genetic factors and molecular mechanisms controlling these important agronomic parameters are still poorly understood.

Final seed size is achieved through the coordinated growth of three elements that develop simultaneously in the seed: the zygotic tissues, comprising the diploid embryo and the triploid endosperm resulting from a double fertilization, and the seed coat, developing from the maternal integument. These different growth processes are genetically controlled, and several specific regulators have been identified (Sun et al., 2010). Any perturbation of one of these growth processes by modification of the expression of specific gene regulators might affect the final seed size. The role of the endosperm in determining seed size was revealed by three genes that function in the same genetic pathway: HAIKU1 (IKU1), HAIKU2 (IKU2) and MINISEED3 (MINI3) encoding a VQ motif (FXhVQChTG) protein, a leucine-rich repeat (LRR) receptor kinase and a WRKY transcription factor, respectively. Down-regulation of these three genes leads to precocious cellularization of the endosperm and subsequently reduces seed size at maturity (Garcia et al., 2003; Luo et al., 2005; Wang et al., 2010). In these three mutants, embryo and maternal integument tissues are also smaller, underlining the developmental interaction between the endosperm and the integument. Conversely, prolonged cellularization of the endosperm leads to larger seeds: in apetala2 (ap2) mutants, larger embryo sacs are formed because of an extended endosperm growth and the embryos grow longer and contain more and larger cells (Jofuku et al., 2005; Ohto et al., 2005). In addition, the epidermal cells of the seed coat in these mutants are larger, showing that the AP2 transcription factor also acts maternally on seed mass control (Jofuku et al., 2005; Ohto et al., 2005, 2009). The role of the integument in determining the final seed size was revealed by mutation studies on TRANSPARENT TESTA GLABRA2 (TTG2) and AUXIN RESPONSE FACTOR2 (ARF2), both encoding transcription factors.
The ttg2 mutant shows reduced integument cell elongation, which negatively affects endosperm growth and subsequently leads to reduced seed size (Garcia et al., 2005). Conversely, arf2 mutants show enlarged integuments containing more cells, thereby creating a large cavity after pollination and allowing extended embryo development (Schruff et al., 2006). Overexpression of AINTEGUMENTA (ANT), encoding an AP2 transcription factor, increases cell proliferation, resulting in larger ovules and seeds containing larger embryos (Mizukami and Fischer, 2000).

Seed number, often negatively correlated with seed size (Alonso-Blanco et al., 1999), is the second most important seed yield-contributing parameter. The role of inflorescence architecture and floral development in determining the final seed number has been exemplified by a study on DWARF4, encoding a cytochrome P450 enzyme involved in brassinosteroid biosynthesis, and by AUXIN-REGULATED GENE CONTROLLING ORGAN SIZE (ARGOS) in Arabidopsis. Plants overexpressing DWARF4 have an increased number of branches and siliques, resulting in higher seed numbers (Choe et al., 2001), whereas plants overexpressing ARGOS carry more seeds per silique (Broman et al., 2003; Hu et al., 2003).

Some of the genes enhancing seed yield upon overexpression, such as DWARF4, ANT and ARGOS, have a positive effect on vegetative growth as well (Choe et al., 2001; Hu et al., 2003; Mizukami and Fischer, 2000). In addition, the increased number of cells underlying the increased vegetative growth in ANT overexpressing plants, as well as in the ap2 or arf2 mutants (Mizukami and Fischer, 2000; Ohto et al., 2005; Schruff et al., 2006), suggests the importance of cell division in seed development.

Despite the availability of detailed phenotypic information for numerous mutants affecting seed parameters and biomass, it is difficult to compare the results generated in different studies because distinct seed parameters were often measured under different growth conditions. Therefore, the aim of this study is to create a global and coherent view on the molecular control of seed yield in Arabidopsis and identify new potential seed yield regulators by performing a comparative phenotypic analysis of 64 transgenic lines, grown under the same environmental conditions and phenotyped using a common standard protocol. For this purpose, we conducted a comprehensive literature search for seed yield mutants. We also included several transgenics reported to produce enlarged leaves (Gonzalez et al., 2009) and a number of core cell cycle gene mutants. The effect of down-regulation or overexpression of the 46 genes selected on seed production was investigated by quantifying total seed weight and seed yield parameters such as seed size, seed number per silique, number of inflorescences, number of branches on the main inflorescence and number of siliques. We also measured increments of total rosette area across a time span of 4 weeks. Analysis of this comprehensive data set indicates that for only 9 of 14 lines reported earlier to affect seed size or number, comparable effects could be measured. In addition, we identified novel genes affecting seed yield parameters. The joint analysis of these gene effects in a single study is very helpful to obtain insight into the relative importance of genetic factors determining seed yield and, eventually, to use this knowledge to improve grain yield of commercial crops.

Results

Gene selection and experimental setup

Forty-six Arabidopsis genes, three of which were orthologues of rice genes and one of a poplar gene, were identified from the literature as affecting seed size, seed number, vegetative growth, flower or inflorescence development, or cell division and analysed in this comparative study in Arabidopsis (Table 1; Table S1).

Table 1.   Description of the genes selected for the comparative seed parameter analysis. The 46 genes selected belong to at least one of the five following categories: seed size/number, enhanced vegetative growth, flower or inflorescence development, cell cycle and other. The effect of the mutation [either loss-of-function (LOF) or gain-of-function (GOF)] on seed number, seed size or vegetative growth as reported in literature is indicated as + (positive), − (negative), none or na (not analysed). For some genes, the effect was observed in another plant species (poplar or rice).
*Arabidopsis plants overexpressing the poplar EIF5A3 gene produce more seeds and have increased vegetative growth.
Gene name | Gene ID | Effect on seed number | Effect on seed size | Effect on vegetative growth | Flower or inflorescence development | Cell cycle | Other | GOF | LOF | Reference

ANTAT4G37750++   x (Mizukami and Fischer, 2000)
ARF2AT5G62000++    x(Schruff et al., 2006)
ARGOSAT3G59900+na+   x (Hu et al., 2003)
DWARF4AT3G50660+None+   x (Choe et al., 2001)
EIF5A3AT1G13950na*Nonena*   x (Ma et al., 2010)
GA20OX1AT4G25420+na+   x (Huang et al., 1998)
HSD1AT5G50600+na+   x (Li et al., 2007)
GASA4AT5G15230na+nax  x (Roxrud et al., 2007)
GASA4AT5G15230+Nonex   x(Roxrud et al., 2007)
AHK (AHK2, AHK3, AHK4)AT5G35750 AT1G27320 AT2G01830+    x(Riefler et al., 2006)
AP2AT4G36920+    x(Ohto et al., 2005)
CKX1AT2G41510+   x (Werner et al., 2003)
CKX3AT5G56970+   x (Werner et al., 2003)
DWARF11 (rice)AT5G14400nana    x(Tanabe et al., 2005)
GW2 (rice)AT1G17145na+na    x(Song et al., 2007)
GW2 (rice)AT1G78420nana   x (Song et al., 2007)
IKU2AT3G19700NoneNone    x(Luo et al., 2005)
MINI3AT1G55600NoneNone    x(Luo et al., 2005)
E2FaAT2G36010nana+ x x (De Veylder et al., 2002)
AN3AT5G28640nana+   x (Horiguchi et al., 2005)
ATHB16AT4G40060Nonena+   x (Wang et al., 2003)
AVP1AT1G15690nana+   x (Li et al., 2005)
BRI1AT5G39400nana+   x (Wang et al., 2001)
CLE26AT1G69970nana+   x (Strabala et al., 2006)
EXP10AT1G26770nana+   x (Cho and Cosgrove, 2000)
GRF1AT2G22840nana+   x (Kim et al., 2003)
GRF2AT4G33740nana+   x (Kim et al., 2003)
GRF5AT3G13960nana+   x (Horiguchi et al., 2005)
NAC1AT1G56010nana+   x (Xie et al., 2000)
CLV1AT1G75820nananax   x(Ottoline Leyser and Furner, 1992)
CLV2AT1G65380nananax   x(Kayes and Clark, 1998)
ERL1AT5G62230nanaNonex   x(Shpak et al., 2004)
ERL2AT5G07180nanaNonex   x(Shpak et al., 2004)
KNAT1AT4G08150nanaNonex   x(Douglas et al., 2002; Venglat et al., 2002)
MAX4AT4G32810nanax   x(Auldridge et al., 2006; Sorefan et al., 2003)
REVAT5G60690nananax   x(Otsuga et al., 2001)
CDKB1.1AT3G54180nanaNone x  x(Boudolf et al., 2004)
DEL1AT3G48160nana x x (Vlieghe et al., 2005)
DPAAT5G02470nanaNone x x (De Veylder et al., 2002)
KRP1AT2G23430nanana x  xDr. Lieven De Veylder
KRP2AT3G50630nana x x (De Veylder et al., 2001)
WEE1AT1G02970nanaNone x  x(De Schutter et al., 2007)
CKX2AT2G19500naNone  xx (Werner et al., 2003)
CKX4AT4G29740naNone  xx (Werner et al., 2003)
LEC2AT1G28300nanana  x x(Stone et al., 2001)

Eighteen genes have previously been described to affect seed size or seed number (Table 1). For example, Arabidopsis plants overexpressing GIBBERELLIC ACID-STIMULATED ARABIDOPSIS 4 (GASA4), AINTEGUMENTA (ANT) or CYTOKININ OXIDASE/DEHYDROGENASE 1 and 3 (CKX1 and CKX3) produce larger seeds (Mizukami and Fischer, 2000; Roxrud et al., 2007; Werner et al., 2003).

We also included negative regulators of seed growth. Larger seeds have been observed in A. thaliana sensor histidine kinases 2, 3 and 4 triple mutant (ahk2-ahk3-ahk4), AUXIN RESPONSE FACTOR 2 mutant (arf2) and APETALA2 mutant (ap2) (Ohto et al., 2005; Riefler et al., 2006; Schruff et al., 2006). In rice, loss-of-function of the RING-type protein E3 ubiquitin ligase GW2 results in enhanced grain weight, whereas overexpression of GW2 has a negative effect on seed size (Song et al., 2007). We identified two putative orthologs of GW2 in Arabidopsis: AT1G17145 and AT1G78420.

Rice dwarf11 mutants show reduced seed length (Tanabe et al., 2005). As no knockout mutants are obtainable in Arabidopsis, we generated Arabidopsis plants overexpressing AT5G14400, the ortholog of dwarf11. In Arabidopsis, gasa4, iku2 and mini3 mutants produce smaller seeds (Luo et al., 2005; Roxrud et al., 2007). The gasa4 mutant has also been shown to produce more seeds. This positive effect on seed number was also observed when the genes DWARF4, GA20OX1, HYDROXYSTEROID DEHYDROGENASE 1 (HSD1) and ARGOS are overexpressed (Choe et al., 2001; Hu et al., 2003; Huang et al., 1998; Li et al., 2007). Ectopic expression of the poplar EIF5A3 gene in Arabidopsis leads to the production of more seeds (Ma et al., 2010). In this study, overexpression of the Arabidopsis EIF5A3 gene was investigated. Six (ANT, CKX1, CKX3, ahk2-ahk3-ahk4, arf2 and ap2) of the eight Arabidopsis or rice transgenics described earlier to positively affect seed size also produced fewer seeds (Mizukami and Fischer, 2000; Ohto et al., 2005; Riefler et al., 2006; Werner et al., 2003). Not all seed yield parameters are described for all the transgenics affecting seed number or size.

Some of the lines for which a positive effect on seed number or size has been reported, such as the overexpressors of ANT, DWARF4, HSD1, ARGOS, GA20OX1 and EIF5A3 and the arf2 mutant, were also described to display an enhanced vegetative growth (Broman et al., 2003; Choe et al., 2001; Huang et al., 1998; Li et al., 2007; Mizukami and Fischer, 2000; Schruff et al., 2006). Therefore, 11 lines previously reported to produce larger vegetative structures (Table 1) were additionally selected to assess their effect on seed yield parameters.

An increased seed number can result from an increased number of branches, flowers or ovules, meaning that any gene involved in the regulation of one of these processes, for example MAX4 that was reported to affect shoot branching (Sorefan et al., 2003), could affect the final number of seeds. Therefore, eight genes involved in inflorescence or flower development (Table 1) were selected, and their effects were analysed.

In mutants affected in seed size, an alteration in the number of cells is often observed, suggesting that cell proliferation is an important biological process involved in the regulation of seed development. Therefore, seven core cell cycle mutants, including the lines affected in the expression of the E2Fa transcription factor or its co-activator DPa (De Veylder et al., 2002), were included in this comparative analysis (Table 1).

In total, 64 lines corresponding to 46 genes were collected or produced (Table 2; Table S2). For nine genes, both the overexpressing and the knockout lines were analysed. For each of these lines, the total seed weight (TSW) and the following parameters contributing to seed yield were measured: seed size (SS), number of inflorescences (INFN), number of branches on the main inflorescence (BRAN), number of siliques (SILN), seed number per silique (SN) and rosette area (RA). RA was measured twice a week throughout a growth period of 4 weeks preceding bolting to avoid biases in RA caused by variation in flowering time.

Table 2.   Percentage compared to wild-type of total seed yield (TSW), seed size (SS), number of inflorescences (INFN), number of branches (BRAN), number of siliques (SILN), number of seeds in ten siliques (SN) and rosette area (RA) for 67 Arabidopsis thaliana gain-of-function (GOF) and loss-of-function (LOF) lines

Assessing and comparing experimental variability

Because of the large number of lines to be analysed, the seed yield parameters and RA were measured across 14 experiments (11 with Col-0 background, three with Ler background). To determine the experimental variation for these parameters, we used the coefficient of variation (%CV), defined as the standard deviation expressed as a percentage of the mean. Only the wild-type Col-0 data were considered for this analysis as this genotype was analysed in 11 of the 14 experiments.
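
The %CV definition above can be illustrated with a minimal Python sketch. The function name and the example values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption; the log transformation applied to some parameters before calculation is not reproduced here.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation: standard deviation as a percentage of the mean.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("%CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical silique counts for wild-type plants (not data from this study)
example_siln = [40, 50, 60]
example_cv = percent_cv(example_siln)  # mean 50, standard deviation 10
```

A lower %CV across experiments indicates that a parameter is measured more reproducibly.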

Coefficient of variation values associated with SILN, SN and SS are <15% (Figure 1a) indicating a higher reproducibility compared with BRAN, INFN and TSW. The reproducibility of RA was higher at later time points (Figure 1b).

Figure 1.

 Box plots displaying the differences in coefficient of variation (%CV) among (a) number of branches (BRAN), number of inflorescences (INFN), number of siliques (SILN), seed number per silique (SN), seed size (SS) and total seed weight (TSW), measured in the control line Col-0 for each of the 11 related experiments; and (b) the rosette area (RA) measured in the control line Col-0 twice a week throughout a growth period of 4 weeks preceding bolting. INFN, BRAN, SILN, SN and RA data were log transformed before calculation. The distribution of %CV reflects the variation among the 11 experiments. The whiskers extend only to the most extreme data values within the inner ‘fences’, which are at a distance of 1.5 times the interquartile range beyond the quartiles, or the maximum value if it is smaller. Outliers, positioned beyond the whiskers, are plotted as a cross.
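
The whisker rule described in the caption can be sketched as follows. This is an illustration only: the function name and input values are hypothetical, and the quartile method (Python's default "exclusive" method) is an assumption, since the plotting software used for Figure 1 may compute quartiles differently.

```python
import statistics


def box_plot_summary(values, k=1.5):
    """Quartiles, whisker ends and outliers using inner fences at k * IQR."""
    data = sorted(values)
    # Assumption: default 'exclusive' quartile method of statistics.quantiles
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers stop at the most extreme values that lie within the fences
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical %CV values, one per experiment (not data from this study)
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 100])
```

In this example the value 100 lies beyond the upper fence and would be drawn as a cross, while the upper whisker ends at 7.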

Significant genotypic variation for seed yield parameters, rosette area and growth

To account for the experimental variability in the assessment of the genotype effects on the various seed yield parameters, we performed a combined analysis, also called meta-analysis, by fitting a linear mixed model to the data. The meta-analysis revealed that variation among the 64 lines was highly significant (P < 0.001) for each seed yield parameter measured. Genotype was also a highly significant factor (P < 0.001) for final RA and for rosette growth over time.
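
The model specification is not spelled out in this section. One common formulation for such a combined analysis, given here only as an assumed sketch and not as the model actually fitted, treats genotype as a fixed effect and experiment as a random effect:

```latex
y_{ijk} = \mu + g_i + e_j + \varepsilon_{ijk},
\qquad e_j \sim \mathcal{N}(0, \sigma_e^2),
\qquad \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ijk}$ is the value of a seed yield parameter for plant $k$ of genotype $i$ in experiment $j$, $\mu$ is the overall mean, $g_i$ is the fixed effect of genotype $i$, $e_j$ is the random effect of experiment $j$, and $\varepsilon_{ijk}$ is the residual. All symbols are introduced for illustration. Under this form, the test for genotypic variation corresponds to testing whether all $g_i$ are equal.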

We compared all lines to either Col-0 or Ler for each seed yield parameter measured (Table 2; Table S2). To compensate for the large number of comparisons made, the cut-off for significance was set to α = 0.01. Therefore, we will mainly discuss the significant (P < 0.01) data showing a different line mean compared with the wild type.
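
As an illustration of how a line mean can be expressed relative to the wild type and screened against the cut-off, a minimal Python sketch follows. The function names and numbers are hypothetical, and the sketch works on plain means, whereas the comparisons in this study come from the fitted mixed model.

```python
ALPHA = 0.01  # significance cut-off stated in the text


def percent_of_wild_type(line_mean, wild_type_mean):
    """Express a line mean as a percentage of the wild-type mean (wild type = 100)."""
    if wild_type_mean == 0:
        raise ValueError("wild-type mean must be non-zero")
    return 100.0 * line_mean / wild_type_mean


def percent_change(line_mean, wild_type_mean):
    """Difference from the wild type in percent (positive = larger than wild type)."""
    return percent_of_wild_type(line_mean, wild_type_mean) - 100.0


def is_significant(p_value, alpha=ALPHA):
    """True when the P-value falls below the chosen cut-off."""
    return p_value < alpha


# Hypothetical means (not data from this study)
relative = percent_of_wild_type(12.5, 10.0)  # 125.0, i.e. 125% of wild type
change = percent_change(12.5, 10.0)          # 25.0, i.e. 25% larger
```

The two conventions are worth keeping apart when reading the results: a value of 125% of wild type and a 25% increase describe the same difference.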

Trait correlations

The total number of seeds (approximated by TSW in this study) produced by an Arabidopsis plant can be partitioned into different parameters, namely the number of seeds per silique (estimated as SN), the number of siliques per inflorescence (not estimated in this study) and the total number of inflorescences (approximated by INFN). Only SN correlated significantly (r = 0.33; P = 0.014) with TSW, whereas the correlation between TSW and INFN was positive, but not significant (r = 0.25; P = 0.074) (Figure 2).

Figure 2.

 Correlation plot of the seed yield parameters measured. Positive correlations are indicated in yellow-red, negative correlations in cyan-blue. P-values are given in the corresponding cells.

SN correlates strongly and negatively with SS (r = −0.47; P < 0.001), indicating that the number of seeds developed in a silique strongly and negatively affects the final seed size. Such a trade-off has frequently been found between seed size and number of seeds produced by a plant (Alonso-Blanco et al., 1999). SILN associates strongly with BRAN (r = 0.58; P < 0.001) and INFN (r = 0.44; P < 0.001), which is expected given the branch and inflorescence structure of A. thaliana, and the way SILN, BRAN and INFN were measured.
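
The correlation coefficients reported above can be illustrated with a short Python sketch. It assumes a Pearson product-moment correlation on paired line means, which this section does not state explicitly; the function name and the example values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical line means showing a perfect size/number trade-off (not study data)
seed_number = [1, 2, 3, 4]
seed_size = [8, 6, 4, 2]
r = pearson_r(seed_number, seed_size)
```

A negative r, as in this constructed example, corresponds to the trade-off between SN and SS described above.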

Seed size and seed number

Fourteen (ANT, arf2, ARGOS, DWARF4, GA20OX1, HSD1, GASA4, gasa4, ahk2-ahk3-ahk4, ap2, CKX1, CKX3, iku2 and mini3) of the 64 lines phenotyped in this study, corresponding to 16 genes, were previously described to positively or negatively affect seed size or seed number in Arabidopsis (Table 2; Table S2).

Only four (arf2 and ap2 mutants, CKX1 and CKX3 overexpressors) of the seven lines (ANT, CKX1, CKX3, ahk2-ahk3-ahk4, arf2, ap2, GASA4) previously shown to produce larger seeds in Arabidopsis display the same phenotype in this study (Table 2; Table S2). Indeed, the two ap2 mutants analysed, ap2-7 and ap2-11, showed 62% (P < 0.001) and 40% (P < 0.001) larger seeds than the wild type, respectively. A decrease in TSW for these two mutants, probably due to a decrease in SILN (Table 2; Table S2), could be observed. Similarly, the arf2 mutant seeds are 36% (P < 0.001) larger than those of the wild type, but the arf2 mutants have fewer branches and fewer siliques containing fewer seeds. Finally, a 57% increase (P < 0.001) in SS in the CKX1 overexpressing line (Table 2), and to a lesser extent in the CKX3 overexpressor (8%, P < 0.001), was observed. BRAN and SILN were decreased in both genotypes, whereas TSW and, to a lesser extent, SN were decreased only in the CKX3 overexpressor.

Regarding the lines iku2, mini3 and gasa4 for which a decrease in seed size has been reported in Arabidopsis, indeed, iku2 and mini3 mutants showed significantly smaller seeds (42% (P < 0.001) and 15%–36% (P < 0.001), respectively; Table 2) compared with control plants. A decrease in the BRAN was also observed. As a consequence, the TSW was reduced in these mutants. In one of the mini3 mutants (SALK_056336), however, an opposite phenotype was observed for these two parameters. Interestingly, overexpression of IKU2 under the control of the seed-specific promoter PHAS leads to a slight increase in SS (5%, P < 0.001). This effect was however not significant in the 35S:IKU2 line.

In rice, overexpression of GW2 results in smaller seeds. Two putative orthologs, AT1G17145 and AT1G78420, were cloned and overexpressed in Arabidopsis. Surprisingly, AT1G17145 overexpression resulted in larger seeds (10%, P < 0.001) (Table 2, Figure 3). However, in this line, the TSW is decreased (62%, P < 0.001) because of a decrease in seed number as the plant produces fewer branches and fewer seeds per silique.

Figure 3.

 Dry seeds of (a) Col-0, and (b) GRF5 and (c) GW2 ortholog/AT1G17145 overexpressing plants.

Five lines (DWARF4, GA20OX1, HSD1, ARGOS and gasa4) previously reported to produce more seeds were analysed. Under the growth conditions we applied, only the genotype overexpressing ARGOS showed an increase in TSW (18%, P < 0.05), probably due to the observed increase in BRAN and SN, as already described (Broman et al., 2003). Although the TSW was not increased in the GA20OX1 overexpressing line, the BRAN and SILN were significantly increased, but the SN was significantly decreased. The seeds of this genotype were 5% (P < 0.001) larger than those of the control. In the gasa4 mutant, a slight, nonsignificant increase in TSW (9%), probably resulting from the increase in BRAN and SILN, was observed. Arabidopsis plants expressing the poplar EIF5A3 gene were reported to produce more seeds, but when the Arabidopsis gene is overexpressed in Arabidopsis, a 36% (P < 0.001) decrease in TSW is observed, probably resulting from the decreased SN (10%, P < 0.001). However, these plants produce seeds 5% (P < 0.001) larger than controls. In addition to the arf2 and ap2 mutants and the CKX1 and CKX3 overexpressors, which produce fewer seeds, we could also confirm the decreased number of seeds in the triple mutant ahk2-ahk3-ahk4, which has fewer inflorescences, branches and siliques.

Leaf growth

For seven of 14 lines previously described to produce either more or larger seeds, larger vegetative structures were reported as well (Table 1). Therefore, we additionally analysed 11 lines described earlier to enhance vegetative plant growth to identify novel potential regulators of seed development (Table 2; Table S2). Some of these lines were reported to flower later [arf2, ARGOS, CLE26 (Hu et al., 2003; Schruff et al., 2006; Strabala et al., 2006)] or earlier [GA20OX1, ATHB16 (Huang et al., 1998; Wang et al., 2003)] than control plants. The AVP1, GA20OX1, AN3 and EXP10 lines showed significantly (P < 0.05) increased RA over time compared with the control line. The analysis of the seed yield parameters revealed that the genotypes overexpressing GRF5, BRI1 and GRF1 produce larger seeds (26% (P < 0.001), 12% (P < 0.001) and 10% (P < 0.01), respectively) compared with control plants (Table 2; Table S2, Figure 3). The increase in SS in GRF5 overexpressing plants was associated with a 31% (P < 0.001) decrease in TSW because of a decrease in INFN, BRAN, SILN and SN. In contrast, the GRF1 overexpressing line showed an increased TSW (116%, P < 0.05).

Inflorescence and flower development

Inflorescence architecture and flower development can affect the number of seeds produced by a plant. Therefore, the effect on seed yield-contributing parameters of seven genes described to be involved in the regulation of these processes was investigated (Table 1). The clv1 mutants in Col-0 and Ler background produced more seeds per silique (39% (P < 0.001) and 31% (P < 0.001), respectively). However, because of a decrease in SILN, this genotype did not show an increase in TSW. In the clv2 mutant, a slight, but significant increase in SS (8%, P < 0.001) was observed. TSW, however, was not different from that in the wild-type plants. Reduction of MAX4 expression led to an increase in SILN (105%, P < 0.01), confirming its effect on inflorescence architecture (Auldridge et al., 2006). A decrease in TSW (81%, P < 0.001), however, caused by a decrease in SN (78%, P < 0.001), was observed.

Cell cycle

Cell proliferation is an important process in the regulation of seed development. Here, we investigated the effect of the misexpression of seven core cell cycle genes on seed number and size (Table 1). No major significant changes in TSW were observed (Table 2; Table S2). However, seeds harvested from the del1 loss-of-function plants showed an 11% (P < 0.001) increase in size compared with control plants. In WEE1 knockout and KRP2 overexpressing plants, the BRAN was increased (125% (P < 0.01) and 136% (P < 0.001), respectively). In this latter genotype, the SILN was also increased (139%, P < 0.001), but the SN was reduced (10%, P < 0.001) compared with control plants.

Mapping genes relative to QTL affecting seed size, length and number

So far, only a few quantitative trait loci (QTL) studies (Alonso-Blanco et al., 1999; Herridge et al., 2011) have been conducted in A. thaliana to discover genes that regulate seed size and seed number. QTL mapping procedures are used to predict genomic regions containing functionally important naturally occurring genetic variation. However, the gene(s) responsible and the functional change(s) ultimately must be identified, requiring either further fine mapping of the QTL region, which is a time-consuming process, or complementation with genome-wide association (GWA) studies by extensively genotyping large samples of unrelated individuals. Both approaches can profit from focusing on a selection of genes with known or potential functions in the trait of interest, instead of anonymous genome-wide markers.

We have repeated the QTL analysis on recombinant inbred lines (RILs) obtained from the cross between the small seeded Landsberg erecta (Ler) accession and the large seeded Cape Verde Islands (Cvi) accession and mapped QTL for seed weight (SW), length (SL) and seed number per fruit (SN) (Figure 4). The number and effects of QTL identified are very similar to those reported earlier by Alonso-Blanco et al. (1999) and partially overlap with QTL identified by Herridge et al. (2011). So far, the genes/alleles underlying these QTL have not been determined. We have mapped the 25 and 26 genes identified in this study to affect significantly (P < 0.05) SS and SN, respectively, relative to the QTL for SW, SL and SN (Figure 4). For example, GA20OX1, CKX3, AP2 and ATHB16 mapped at 1–7 Mb away from the QTL peak at 58 cM on chromosome 4; EIF5A3, AVP1, GW2, EXP10 and AHK mapped at 3–8 Mb away from the QTL peak at 11 cM on chromosome 1. Further support for their nomination as candidate gene could be obtained by the nature of sequence polymorphisms between Ler and Cvi (http://signal.salk.edu/atg1001), further fine mapping of the QTL and/or by GWA studies.
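
Positioning genes relative to a QTL peak can be sketched in Python as below. The sketch assumes that the peak has already been converted from a genetic position (cM) to a physical position (bp), a step not described in this section; the function name, gene names and coordinates are hypothetical placeholders, not positions from this study.

```python
def genes_near_peak(genes, peak_chrom, peak_bp, max_distance_mb):
    """List (gene, distance in Mb) for genes within a window around a QTL peak.

    `genes` maps a gene name to (chromosome, physical position in bp).
    Results are sorted with the nearest gene first.
    """
    hits = []
    for name, (chrom, pos_bp) in genes.items():
        if chrom != peak_chrom:
            continue  # only genes on the same chromosome as the peak
        distance_mb = abs(pos_bp - peak_bp) / 1e6
        if distance_mb <= max_distance_mb:
            hits.append((name, distance_mb))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical gene positions, purely illustrative
example_genes = {
    "geneA": (4, 12_000_000),
    "geneB": (4, 2_000_000),
    "geneC": (1, 12_500_000),
}
near = genes_near_peak(example_genes, peak_chrom=4, peak_bp=10_000_000,
                       max_distance_mb=7.0)
```

Genes returned by such a filter are only positional candidates; as noted above, sequence polymorphisms, fine mapping or GWA studies are needed to support them.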

Figure 4.

 Map position of genes identified to affect significantly (P < 0.05) seed size (SS) and seed number per silique (SN) relative to the QTL for the seed weight (SW), seed length (SL) and seed number (SN), segregating in the Ler×Cvi RIL mapping population, obtained from composite interval mapping. The red horizontal line is the genome-wide significance threshold (α = 0.05). The horizontal green bar indicates significant parameter-specific effects. Increasing effects by the Cvi allele are indicated in yellow-red (Ler in cyan-blue).

Discussion

Although seed parameters such as size and number have been studied for many years in Arabidopsis, the fragmented knowledge about seed yield remained to be assembled into a global and coherent picture. Our analysis of 64 genotypes representing 46 genes either in gain- and/or loss-of-function situations meets this challenge. This study revealed extensive variation for total rosette area and the five seed yield parameters studied.

Comparing seed phenotypes obtained in independent experiments should be considered with caution, even when using plants of the same genotype

It has long been assumed that if changing the expression of a gene results in a phenotype, the genotype will display a similar phenotype when grown in comparable conditions in different laboratories. This is the tacit premise underlying many gain- or loss-of-function studies. However, small variations in growing conditions and handling of plants can account for significant differences in observed phenotypes in independent laboratories (Massonnet et al., 2010). This was also noticed in this study: only for nine (ahk2-ahk3-ahk4, ap2, arf2, gasa4, iku2 and mini3 mutants, and ARGOS, CKX1 and CKX3 overexpressors) of the 14 lines (ANT, arf2, ARGOS, DWARF4, GA20OX1, HSD1, GASA4, gasa4, ahk2-ahk3-ahk4, ap2, CKX1, CKX3, iku2 and mini3) previously described to affect seed size or number, comparable effects were observed. For GA20OX1 and HSD1, seeds from the original lines were obtained: lines overexpressing GA20OX1 showed a lower SN than the wild type, which is opposite to what has been reported. Overexpressing HSD1 did not affect SS and SN at all. For two overexpression lines (DWARF4 and GASA4), the difference in effects could be explained by the difference in genetic background: in this study, DWARF4 and GASA4 were overexpressed in a Col-0 background, while results reported were obtained from overexpression in a Wassilewskija background. Regarding ANT, although the transgene is clearly overexpressed in both studies, effects on SS and SN were different. These differences in results underscore the need for a higher experimental reproducibility to enable comparison of results produced by different laboratories or in different experiments (Schilling et al., 2008). This can be achieved by a better description, monitoring and a more precise control of the environmental plant growth conditions (Massonnet et al., 2010).

The analysis of lines described to increase vegetative growth identified GRF5, BRI1 and GRF1 as positive regulators of seed yield

Several lines positively affected in seed size or number were also reported to show an increase in leaf area. To test whether the inverse also holds true, 18 lines reported to produce larger leaves were included in this study. Interestingly, three genotypes overexpressing GRF1, GRF5 and BRI1 produced larger seeds. Plants overexpressing the putative transcription factors GRF1 and GRF5 were shown to produce larger cotyledons containing larger and more cells, respectively (Alonso et al., 2003; Gonzalez et al., 2009). The increased seed size in these plants could possibly result from the production of larger embryos containing larger cotyledons. These genotypes, however, produce fewer siliques containing fewer seeds. The role of brassinosteroids in seed development has been reported previously. In the Arabidopsis shrink1-dominant (shk1-D) mutant, in which overexpression of the P450 monooxygenase family gene CYP72C1 results in lower levels of the endogenous brassinosteroid, seeds are smaller (Takahashi et al., 2005). In rice, a defect in brassinosteroid biosynthesis caused by down-regulation of DWARF11 leads to the production of smaller seeds, whereas an increase caused by overexpression of a sterol C-22 hydroxylase gene leads to larger seeds (Tanabe et al., 2005; Wu et al., 2008). In our study, no increase in seed size was observed by overexpressing the ortholog of DWARF11 in Arabidopsis. However, the finding that plants overexpressing BRI1, which encodes the brassinosteroid receptor, produce larger seeds emphasizes the involvement of this hormone and its importance in seed size control.

Among the different genes involved in cell cycle regulation, DEL1, encoding an inhibitor of endoreduplication (Vlieghe et al., 2005), leads to the production of larger seeds when down-regulated. During early (nuclear) endosperm development, mitosis is uncoupled from cytokinesis, and up to approximately 200 nuclei are generated within the single-celled embryo sac (Sabelli and Larkins, 2009). Down-regulation of DEL1 could inhibit cell division but increase DNA content, thereby leading to an increased endosperm cell size and seed size, as a correlation between ploidy level and size has previously been reported in several plant species (Galbraith et al., 2001; Sugimoto-Shirasu and Roberts, 2003). Whether the phenotype of this mutant results from extended growth of the endosperm or of the embryo remains to be determined.

The analysis of genes described to affect seed parameters identified GW2 as a positive regulator of growth

In rice, the overexpression of the RING-type protein E3 ubiquitin ligase GW2 negatively affects seed size, while its loss-of-function enhances grain weight (Song et al., 2007). We identified two putative orthologs of GW2 in Arabidopsis, AT1G17145 and AT1G78420. Misexpression of these two GW2 orthologs in Arabidopsis, however, did not affect seed size. In contrast, down-regulation of AT1G78420 positively affects the number of seeds per silique, resulting in an increased total seed weight. In addition, plants overexpressing AT1G78420 showed an increased number of inflorescences, branches and siliques, as well as larger leaves. This latter observation suggests GW2 as a potential new regulator of growth.

In conclusion, this joint analysis of various known and unknown gene effects resulted in a better insight into the relative importance of genetic factors determining seed yield and allowed the identification of new genes affecting either seed yield or biomass.

Experimental procedures

Plant material

Forty-six genes reported to be involved in seed yield, vegetative growth, cell cycle or inflorescence architecture regulation were selected (Table 1). Arabidopsis lines (Col-0 or Ler background) over- or underexpressing these genes were collected or generated.

Received lines

For 29 of the 46 genes, 34 lines were kindly provided (Table S1). Seeds of clv1-1 (CS45) and clv2-1 (CS46) mutant lines and T-DNA SALK lines were obtained from the Nottingham Arabidopsis Stock Centre (NASC), UK (http://signal.salk.edu/cgi-bin/tdnaexpress) (Alonso et al., 2003). Zygosity was confirmed by PCR using specific border primers and the LB-6313R T-DNA primers.

Cloned lines

The open reading frames of AINTEGUMENTA, ARGOS, CLE26, eIF5A, ERL1, ERL2, GASA4, GW2 orthologs (AT1G17145 and AT1G78420), KNAT1, LEC2, MINI3, NAC1 and REV were PCR-amplified from cDNA synthesized from RNA extracted from A. thaliana Columbia (Col-0) 12-day-old seedlings, flowers or siliques. DWARF11, EXP10 and IKU2 were amplified from DNA of Col-0 seedlings. PCR fragments were cloned into an entry vector using a BP reaction and subsequently cloned into the appropriate destination vector for expression in plants by an LR reaction according to the manufacturer’s protocol (Invitrogen, Paisley, UK). The destination vectors used in this study were pB7WG2 or pK7WG2 (Karimi et al., 2002), containing the CaMV 35S promoter, and/or pPphasGW3’arc5I, containing a seed-specific promoter pPhas (Chandrasekharan et al., 2003) (Table 2; Table S2).

The overexpression vectors containing the DWARF4 or the GRF2 gene were kindly donated by Dr. Kenneth A. Feldmann (Choe et al., 2001) and Dr. Hans Kende (Kim and Kende, 2004), respectively.

The different constructs were transformed into Col-0 by flower dip. T2 plants harbouring one insertion locus were selected, and five independent T3 plants displaying 100% resistance on Basta or Kanamycin selection were assumed to be homozygous for the transgene and were selected for further analyses.
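The text does not state how lines with a single insertion locus were recognized. One common approach, shown here only as an assumed illustration, is to score resistant and sensitive seedlings on selection and test the counts against the 3:1 ratio expected for one dominant locus. The function name and the seedling counts are hypothetical.

```python
# Chi-square goodness-of-fit test against a 3:1 resistant:sensitive ratio.
# Assumption: one dominant resistance locus segregating in a selfed hemizygote.

CHI2_CRITICAL_DF1_ALPHA_005 = 3.841  # critical value, 1 degree of freedom, alpha = 0.05


def chi_square_3_to_1(resistant, sensitive):
    """Return the chi-square statistic for a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


# Hypothetical seedling counts for two transformed lines
for name, res, sens in [("line_A", 148, 52), ("line_B", 188, 12)]:
    stat = chi_square_3_to_1(res, sens)
    consistent = stat < CHI2_CRITICAL_DF1_ALPHA_005
    print(name, round(stat, 2), "consistent with one locus" if consistent else "deviates from 3:1")
```

A line whose progeny all survive selection in the next generation, as described above for the T3 plants, would then be taken as homozygous.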

qRT-PCR

Expression levels of all SALK lines and five independent in-house transformed lines per construct were measured by qRT-PCR (Figure S1), using the appropriate primers (Table S3). CBP20, CDKa1 and CKA2 were used as internal control genes for normalization. T-DNA lines and a single overexpression line per construct displaying a significant deviation in expression relative to the wild type were selected for further phenotyping.
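The normalization formula is not given in the text. As a rough sketch, assuming the widely used comparative Ct (2^-ΔΔCt) approach with 100% amplification efficiency and the mean Ct of the three control genes as reference, relative expression could be computed as follows. All Ct values and the function name are hypothetical.

```python
import statistics


def relative_expression(ct_target, ct_refs, ct_target_wt, ct_refs_wt):
    """Fold change of a target gene relative to wild type (2^-ddCt).

    ct_refs and ct_refs_wt hold the Ct values of the internal control genes
    (for example CBP20, CDKa1 and CKA2) in the line and in the wild type.
    Assumes perfect doubling per cycle, which is a simplification.
    """
    delta_line = ct_target - statistics.mean(ct_refs)
    delta_wt = ct_target_wt - statistics.mean(ct_refs_wt)
    return 2.0 ** -(delta_line - delta_wt)


# Hypothetical Ct values: the target amplifies two cycles earlier in the line
fold = relative_expression(
    ct_target=22.0, ct_refs=[18.1, 19.0, 18.4],
    ct_target_wt=24.0, ct_refs_wt=[18.1, 19.0, 18.4],
)
print(fold)
```

Averaging Ct values across reference genes corresponds to taking the geometric mean of their expression levels, which is why the mean is taken before exponentiation.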

Experimental design and data collection

Phenotyping all lines with either Ler or Col-0 as background was performed across three and 11 experiments, respectively, over a 2-year time span. All experiments were set up with a completely randomized design and had either the wild-type line Col-0 or Ler in common. Prior to each experiment, lines were multiplied in parallel under identical growth conditions to minimize the confounding effect of seed factors (e.g. time of harvest, quality, etc.) on the genetic factors underlying variation in seed and biomass parameters. Surface-sterilized seeds were sown onto round Petri plates containing 100 mL of sterile medium consisting of 4.3 g/L MS supplemented with 0.5 g/L MES, 100 mg/L myo-inositol, 10 g/L sucrose and 10 g/L agar. Seeds were stratified at 4 °C for 3 days before being transferred to standard growth conditions (16 h day regime, 21 °C) for 8 days. Next, 25 plants per line were transferred to soil in individual 5-cm pots and placed at 21 °C under either a long-day regime (16 h light) for seed-related analysis, or a short-day regime (16 h dark) for rosette image analysis. An ARACON (BETATECH bvba, Gent, Belgium) was placed when the primary inflorescence was approximately 10 cm in height.

Twenty plants per line were measured to estimate a mean value for the following traits: seed size (SS), total seed weight (TSW), number of inflorescences (INFN), number of branches (BRAN), number of siliques (SILN), seed number per silique (SN) and rosette area (RA). In each plant, INFN, BRAN and SILN were recorded on the main inflorescence 35 days after germination. Plants were harvested at maturity, and the seeds were cleaned using a sieve. Per plant, an average SS was estimated from a sample of 500 seeds imaged with a document scanner and analysed using ImageJ software (Herridge et al., 2011). For each plant, TSW (mg) was measured by weighing the harvested seeds, and a mean number of seeds per silique was estimated by counting the number of seeds yielded by ten siliques using the Elmor C3 Counting Machine (Elmor). RA (cm²) was measured twice a week throughout a growth period of 4 weeks before bolting started (from stage 1.02 to stage 1.20; Boyes et al., 2001) using the TraitMill™ (Reuzeau et al., 2006).

Data analysis

We used the coefficient of variation (%CV), defined as the standard deviation expressed as percentage of the mean, to compare the experimental variation in the series of seed yield traits (measured in different units), and in the series of RA repeated measurements data (having the same units, but running at different levels of magnitude). We calculated the %CV for the data obtained for the control line Col-0 for each of the 11 related experiments. As INFN, BRAN, SILN and SN data consist of counts, and RA data were skewed, these data were log transformed before analysis.
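The %CV definition above can be illustrated with a short sketch. The measurements below are made-up values for a control line, not data from this study, and the function name is only illustrative.

```python
import statistics


def percent_cv(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical control-line measurements of two traits recorded in different units
seed_size_mm2 = [0.110, 0.115, 0.108, 0.112, 0.109]
total_seed_weight_mg = [210.0, 260.0, 190.0, 240.0, 225.0]

# %CV is unitless, so the two traits can be compared directly
print(round(percent_cv(seed_size_mm2), 1))
print(round(percent_cv(total_seed_weight_mg), 1))

# Rescaling the unit (mg to g) leaves %CV unchanged
total_seed_weight_g = [w / 1000.0 for w in total_seed_weight_mg]
print(round(percent_cv(total_seed_weight_g), 1))
```

Because the statistic is a ratio, it is insensitive to the unit or magnitude of a trait, which is what makes it suitable for comparing variation across traits and across RA time points.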

As all experiments were designed to have a similar treatment structure, we used REML as implemented in Genstat v13 (Payne, 2010) to perform a combined analysis, also called meta-analysis, of INFN, BRAN, SILN, SS, SN, TSW data and RA repeated measurements data obtained in the 11 related experiments. The following generalized linear mixed model (GLMM) (random terms underlined) $Y_{ijk} = \mu + \text{genotype}_j + \underline{\text{experiment}_k} + \underline{\text{error}_{ijk}}$ was fitted to the INFN, BRAN, SILN, SN, SS and TSW data, partitioning phenotypic variation into fixed genotype effects and random experiment and error effects. $Y_{ijk}$ is the phenotype of the $i$-th plant from genotype $j$ analysed in experiment $k$; $\mu$ is the overall mean of the phenotypes obtained for all lines considered; random effects in the model were assumed to be independent and normally distributed with a mean of zero and variance $\sigma^2_x$, where $x$ = experiment and error. For INFN, BRAN, SILN and SN data, a logarithm-based link function was incorporated. The significance of the comparisons between the line and the control line REML means was assessed by a t-test using the VMCOMPARISON procedure in Genstat v13 (Payne, 2010). The cut-off for significance was set to α = 0.01 to compensate for the large number of comparisons made.
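To make the structure of this model concrete, the sketch below simulates data with a fixed genotype effect, a random experiment effect and a random error, and then contrasts a line with the wild type grown in the same experiments. It is not a REML fit and does not reproduce the Genstat analysis. All effect sizes, names and the seed are hypothetical.

```python
import random
import statistics


def simulate(mu, genotype_effects, n_experiments, n_plants, sd_exp, sd_err, seed=0):
    """Draw Y = mu + genotype_j + experiment_k + error_ijk for every plant."""
    rng = random.Random(seed)
    records = []
    for k in range(n_experiments):
        exp_effect = rng.gauss(0.0, sd_exp)  # random effect shared within experiment k
        for genotype, g_eff in genotype_effects.items():
            for _ in range(n_plants):
                y = mu + g_eff + exp_effect + rng.gauss(0.0, sd_err)
                records.append((genotype, k, y))
    return records


def within_experiment_contrast(records, line, control):
    """Average over experiments of (line mean - control mean)."""
    diffs = []
    for k in sorted({e for _, e, _ in records}):
        line_y = [y for g, e, y in records if g == line and e == k]
        ctrl_y = [y for g, e, y in records if g == control and e == k]
        if line_y and ctrl_y:
            diffs.append(statistics.mean(line_y) - statistics.mean(ctrl_y))
    return statistics.mean(diffs)


# Hypothetical setting: the line exceeds the wild type by 1.5 units
data = simulate(mu=10.0, genotype_effects={"WT": 0.0, "line_A": 1.5},
                n_experiments=11, n_plants=20, sd_exp=2.0, sd_err=1.0)
print(within_experiment_contrast(data, "line_A", "WT"))
```

Because the experiment effect is shared by all plants in an experiment, it cancels in the line versus wild-type difference, which is one reason a common wild-type line in every experiment is useful.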

A combined analysis of the repeated RA measurements data obtained in the 11 experiments was performed using the residual maximum likelihood (REML) as implemented in Genstat v13 (Payne, 2010). The following linear mixed model $Y_{ijkt} = \mu + \text{genotype}_j + \text{time}_t + \text{genotype.time}_{jt} + \text{experiment}_k + \text{error}_{ijkt}$ was fitted to the RA data, where $Y_{ijkt}$ is the log base 10 transformed RA data of the $i$-th plant from genotype $j$ measured at time point $t$ (= 1–7 for all genotypes except CKX3, CKX4, EIF5A, MINI3, Rev6, SALK_049859, SALK_106927 and SALK_150003, where $t$ = 1–5) and analysed in experiment $k$, and $\mu$ is the overall mean of the phenotypes obtained for all lines considered across all time points. Random effects in the model were assumed to be independent and normally distributed with a mean of zero and variance $\sigma^2_x$, where $x$ = experiment and error. Times of measurement were considered to be equally spaced, and various ways of modelling the correlation structure (uniform, autoregressive order 1 (AR1) and 2 (AR2), and antedependence order 1 and 2) were compared in the REML framework as implemented in Genstat v13 (Payne, 2010). Selection of AR1 as the best-fitting model was based on a likelihood ratio test (LRT) statistic and the Akaike information criterion (AIC). As residuals from the analysis indicated increasing variance over time, this was modelled directly by specifying that heterogeneity is to be introduced into the model. Significance of the fixed main and interaction effects was assessed by an F-test. Fitting linear contrasts among the lines and the wild-type line in the REML analysis of repeated measurements was performed using the VTCOMPARISON procedure in Genstat v13 (Payne, 2010). Again, the cut-off for significance was set to α = 0.01 to compensate for the large number of comparisons made.
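As an illustration of two of the ingredients named above, the sketch below builds an AR1 correlation matrix for equally spaced time points and compares candidate models by AIC and by a likelihood ratio statistic. The log-likelihood values, parameter counts and function names are hypothetical and are not taken from the Genstat output.

```python
def ar1_correlation(n_times, rho):
    """AR1 structure: correlation between time points t and s is rho ** |t - s|."""
    return [[rho ** abs(t - s) for s in range(n_times)] for t in range(n_times)]


def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(log_lik_full, log_lik_reduced):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (log_lik_full - log_lik_reduced)


# Correlation decays with the lag between measurements
for row in ar1_correlation(4, 0.8):
    print([round(r, 3) for r in row])

# Hypothetical (log-likelihood, number of covariance parameters) per structure
candidates = {"uniform": (-520.0, 2), "AR1": (-498.0, 2), "AR2": (-497.5, 3)}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)

# AR1 is nested within AR2, so the two can also be compared with an LRT
print(lrt_statistic(candidates["AR2"][0], candidates["AR1"][0]))
```

With these made-up numbers the extra AR2 parameter improves the likelihood too little to offset its AIC penalty, which mirrors the kind of reasoning behind preferring AR1.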

QTL Mapping

The seed size (SS), seed length (SL), seed number per fruit (SN) and marker data for 162 Ler×Cvi RILs come from the analysis described in Alonso-Blanco et al. (1999). We used the QTL menu in GenStat 14 (VSN International) to explore the QTL for SS, SL and SN. In a preliminary search for QTL, we tested the association of individual loci with the trait every 5 cM along the genome, using the procedure commonly known as simple interval mapping (SIM). In a second step, we tested for QTL at particular positions after correcting for QTL elsewhere in the genome, as identified in the preliminary analysis. This procedure is commonly known as composite interval mapping (CIM). The genome-wide type I error rate was set to α = 0.05. Genes were physically mapped relative to the QTL position using the physical position of the 99 markers.

Acknowledgements

We thank Benjamin Laga and Marc Bots from Bayer CropScience for their helpful suggestions and Annick Bleys for assistance with preparing the manuscript. We thank Joost Keurentjes for providing us with the physical positions of the markers. This work was supported by the Inter-University Attraction Poles Programme (IUAP VI/33), initiated by the Belgian State Science Policy Office, and a Methusalem Grant of the Flemish Government at Ghent University.
