HPLC metabolite profiles
To determine whether metabolite profiles were transmitted from parents to offspring, the aromatic compounds present in apical tissues of the F1 families 001 and 002 (Experimental procedures), and of their parents, Populus deltoides cv. S9-2, P. nigra cv. Ghoy and P. trichocarpa cv. V24, were analyzed by reverse-phase HPLC. The UV/visible absorption spectra of the chromatogram peaks suggested the presence of simple phenolics and benzoic acid derivatives as well as phenylpropanoids and flavonoids. The metabolite profiles of the three parents differed both qualitatively and quantitatively. Characteristic chromatogram peaks of each parent could be traced in the chromatograms of their respective progeny, indicating the inheritance of parent-specific compounds in the offspring. Based on analysis of the 15 most abundant chromatogram peaks, the mean broad-sense heritability was shown to vary between 0.55 and 0.82, depending on the quantification method used (Experimental procedures).
Because flavonoids are abundantly present in apical tissues of Populus (Greenaway et al., 1992) and have characteristic UV/visible absorption spectra, and because the structure of the flavonoid biosynthetic pathway is well described, this pathway is an excellent model for evaluating the feasibility of genetical metabolomics of a complex biosynthetic pathway. A total of 29 flavonoids could be clearly distinguished in all individuals of family 001. The chromatograms of family 002 revealed 39 flavonoids. Thirteen of these 39 flavonoids were undetectable in 25–50% of the family 002 individuals, and for some of them chi-squared tests hinted at a 1:1 or 1:3 Mendelian segregation. However, no significant mQTL could be detected for any of the 13 compounds using single-trait QTL analysis of population 87002, suggesting that these flavonoids were below the detection limit rather than absent in part of the family.
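The segregation test mentioned above can be illustrated with a minimal sketch. It assumes a two-class scoring (detected versus undetected) and a chi-squared goodness-of-fit test with one degree of freedom; the function name and the counts are hypothetical and are not taken from the study.

```python
import math


def segregation_chi2(n_detected, n_undetected, expected_ratio):
    """Chi-squared goodness-of-fit test (1 df) for a two-class segregation.

    expected_ratio is given as (detected, undetected), e.g. (1, 1) or (3, 1).
    Returns the test statistic and its p-value.
    """
    total = n_detected + n_undetected
    weight = sum(expected_ratio)
    expected = [total * part / weight for part in expected_ratio]
    observed = [n_detected, n_undetected]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: compound detected in 70 of 100 individuals.
    for ratio in [(1, 1), (3, 1)]:
        chi2, p = segregation_chi2(70, 30, ratio)
        print(f"expected {ratio[0]}:{ratio[1]}  chi2 = {chi2:.2f}  p = {p:.4f}")
```

A large p-value means only that the observed counts are compatible with the tested ratio; as noted above, an apparent segregation can also arise when a compound falls below the detection limit.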
Of the 26 flavonoid peaks that were present in all individuals of family 002, spiking indicated that 15 were also found in family 001. Because MultiQTL accepts only a restricted number of traits (see below), and to allow comparison of the results for families 001 and 002, which share the common female parent P. deltoides cv. S9-2, all subsequent analyses were focused on these 15 common peaks (Figure 2).
Figure 2. HPLC separation of one individual of population 87001. Single-wavelength chromatograms were taken at 287, 340, 355 and 365 nm to integrate the flavonoids with a UV/visible absorption spectrum similar to that of a flavanone/dihydroflavonol, a flavone, rutin and a flavonol respectively. The 15 flavonoids in common between families 001 (populations 87001 and 95001) and 002 (populations 87002 and 95002), indicated at their respective wavelengths, are shown in red.
Flavonoid concentration distributions
The concentration distributions of all 15 flavonoids present in both families were unimodal and in most cases skewed to the right. Most of these flavonoids were clearly present in the common P. deltoides cv. S9-2 parent, as expected, but often not detected in the P. nigra cv. Ghoy or in P. trichocarpa cv. V24 parents (Table 1). The concentrations of some of the compounds were higher in the hybrids than in either parent. This was apparent for flavanone 5 in both families and for rutin 12 in family 002, and was confirmed for both by the additional investigation of nine ramets of each parent and four different cultivars of each of the three poplar species. This phenomenon, called chemical over-expression, is thought to result from the obstruction or elaboration of a pathway in the F1 hybrids, leading to the accumulation of intermediates or (new) end products respectively (Orians, 2000).
Table 1. Descriptive statistics of flavonoid levels (ng mg−1 dry weight)
|Name||Class (compound no.)||P.d.||P.n.||P.t.||87001||95001||87002||95002|
|Pinobanksin||Dihydroflavonol (1)||615||ND||31.1||761 (400)||1070 (700)||68.7 (69)||14.5 (17)|
|Eriodictyol||Flavanone (2)||1120||11.8||ND||697 (370)||917 (610)||193 (170)||55.2 (64)|
|Pinostrobin||Flavanone (3)||2.04||ND||ND||2.97 (2.1)||4.91 (3.8)||0.342 (0.39)||0.289 (0.31)|
|Pinobanksin 3-acetate||Dihydroflavonol (4)||673||ND||ND||307 (210)||339 (570)||95.4 (87)||36.5 (41)|
|Unknown||Flavanone* (5)||ND||ND||ND||205 (110)||142 (120)||237 (210)||78.1 (118)|
|Apigenin||Flavone (6)||12.6||29.3||ND||63.4 (37)||73.8 (612)||59.3 (59)||3.7 (33)|
|Unknown||Flavone (7)||133||ND||26.3||175 (170)||141 (120)||257 (250)||83.0 (93)|
|Unknown||Flavone (8)||103||ND||ND||65.7 (47)||29.4 (30)||17.4 (25)||10.5 (22)|
|Galangin||Flavonol (9)||253||ND||ND||315 (170)||473 (320)||62.2 (73)||22.7 (25)|
|Kaempherol||Flavonol (10)||227||14.0||0.752||142 (90)||207 (140)||225 (210)||65.7 (79)|
|Quercetin||Flavonol (11)||107||18.8||10.8||131 (100)||211 (160)||198 (170)||60.9 (91)|
|Rutin||Flavonol (12)||ND||2440||ND||841 (550)||837 (590)||141 (130)||88.4 (93)|
|Galangin 3-methyl ether||Flavonol (13)||12.0||ND||ND||16.5 (8)||20.5 (12)||2.76 (2.4)||1.48 (5.5)|
|Quercetin 3-methyl ether||Flavonol (14)||172||3.42||ND||658 (460)||75.0 (58)||46.9 (39)||38.5 (53)|
|Unknown||Flavonol (15)||102||1.22||ND||54.8 (31)||67.6 (47)||54.3 (45)||16.1 (17)|
Because the 15 flavonoids are synthesized by the same biosynthetic pathway (Figure 1), their levels were expected to be highly correlated. Analysis of their correlation coefficients may suggest groups of flavonoids whose levels are co-regulated and, additionally, reaction steps for which mQTL may be found. To investigate which of the 15 flavonoid concentrations were correlated, the 105 possible pairwise correlation coefficients were calculated based on the peak height/dry weight (PH/DW) (Table S1). No negative correlations were found. Correlation networks were subsequently generated for the highly correlated (r > 0.80) flavonoid concentrations in each population. Figure 3 shows that both general and family-specific associations were evident.
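The construction of such a network can be sketched as follows. This is a minimal illustration, not the software used in the study: it computes all pairwise Pearson coefficients and keeps the pairs exceeding the threshold as edges. The function names and the toy values are hypothetical, and the graph layout step is omitted.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(levels, threshold=0.80):
    """Return all pairwise correlations and the pairs with r above the threshold.

    levels maps a compound label to its PH/DW values across individuals.
    """
    r = {
        (a, b): pearson(levels[a], levels[b])
        for a, b in combinations(sorted(levels), 2)
    }
    edges = [pair for pair, value in r.items() if value > threshold]
    return r, edges


if __name__ == "__main__":
    # 15 compounds give 15 * 14 / 2 = 105 unordered pairs.
    print(math.comb(15, 2))
    # Hypothetical values for three compounds measured in five individuals.
    toy = {
        "A": [1.0, 2.0, 3.0, 4.0, 5.0],
        "B": [2.1, 3.9, 6.2, 8.0, 9.9],
        "C": [5.0, 1.0, 4.0, 2.0, 3.0],
    }
    r, edges = correlation_edges(toy)
    print(edges)
```

Each retained pair becomes an edge between two vertices; any force-directed layout can then be applied for display.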
Figure 3. Metabolite correlation networks. Correlation networks, obtained by applying the Fruchterman–Reingold 2D algorithm, are shown for each population and were generated by using the levels of the 15 flavonoids in common between families 001 and 002 and applying a threshold correlation coefficient (r > 0.8). Vertices and edges represent flavonoids and strong correlations respectively. The Pearson product–moment correlation coefficients are listed in Table S1.
For the two families 001 and 002, the correlation networks showed a strong association between the levels of quercetin 11 and quercetin 3-methyl ether 14 (Figure 3), the substrate and product of flavonol 3-O-methyltransferase (F3OMT) respectively. The same enzyme converts galangin 9 to galangin 3-methyl ether 13, whose abundances were also highly correlated in all populations except 95002.
In all populations, except 95001, eriodictyol 2, galangin 9, kaempherol 10 and the unknown flavonol 15 were mutually highly correlated (Figure 3). This result indicates a strong association between the flavanone/dihydroflavonol and flavonol branches of flavonoid biosynthesis, represented by eriodictyol 2 and by galangin 9, kaempherol 10 and the unknown flavonol 15 respectively. Notably, the correlation networks did not show a strong correlation between the levels of any of the flavones, i.e. apigenin 6 and the unknown flavones 7 and 8, and the levels of either flavanones/dihydroflavonols or flavonols. Also, the two flavanones, pinostrobin 3 and the unknown flavanone 5, were not consistently correlated with the level of any other flavonoid.
In addition to general associations, family-specific associations between flavanone/dihydroflavonol and flavonol biosynthesis also prevailed in the correlation networks. In family 001, the concentrations of pinobanksin 1, eriodictyol 2 and galangin 9 were highly correlated, whereas strong correlations between the levels of eriodictyol 2, pinobanksin 3-acetate 4, kaempherol 10, quercetin 11, quercetin 3-methyl ether 14 and the unknown flavonol 15 were prominent in family 002 (Figure 3). Family 002 was further characterized by a high correlation between the levels of the flavone 7 and rutin 12 (Figure 3).
Taken together, both general and family-specific correlations were found. Within each family, most correlations were consistently observed in both populations. A closer examination of the correlation networks in each population did not reveal groups of flavonoids of a given class, i.e. no mutually highly correlated clusters were found that contained all flavones, all flavonols or all flavanones. In contrast, both general and family-specific correlations pointed to a tightly associated biosynthesis of specific flavanones/dihydroflavonols and flavonols.
QTL analysis of flavonoid concentrations
To reveal loci that control the flux within flavonoid metabolism, we searched for mQTL affecting the different flavonoid concentrations. Because multiple traits were measured and the different flavonoids were often highly correlated, a multi-trait approach based on maximum-likelihood interval mapping (multi-trait interval mapping, mIM) was applied. However, the higher the number of traits in a multi-trait approach, the higher the probability that multiple loci along the chromosome affect the multivariate trait and the higher the chance of detecting so-called ‘ghost’ QTL caused by the interfering effect of linked loci (Jiang and Zeng, 1995; Knott and Haley, 2000; Korol et al., 2001; Martínez and Curnow, 1992). Therefore, single-trait QTL analysis with both regression and a non-parametric Wilcoxon test was performed as an alternative and complementary approach. Furthermore, because ratios of compound concentrations are more robust than individual metabolite levels (Fiehn, 2003; Morreel et al., 2004; Steuer et al., 2003), we calculated the 105 possible ratios between the peak heights of all 15 flavonoids in populations 87001 and 87002, logarithmically transformed them to so-called log ratios (Birks and Kanowski, 1993) and used them for univariate or single-trait QTL analysis. This strategy increases the chance of detecting mQTL that control the differential synthesis of two intermediates present in the same pathway. The QTL results obtained by the different methods (mIM and single-trait QTL analyses of log ratios) are presented in Table 2, Figure 4 and Tables S2 and S3. From these data, robust mQTL were assigned based on the criteria explained in the Experimental procedures. The mQTL of flavonoid concentration levels that were obtained for populations 87001 and 87002 are given below.
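The log-ratio traits described above can be sketched as follows. The base of the logarithm is not stated here, so base 10 is an assumption, and the function name and toy peak heights are hypothetical. The sketch also assumes strictly positive peak heights, which holds for compounds detected in every individual.

```python
import math
from itertools import combinations


def log_ratios(peak_heights):
    """Compute log10(height_i / height_j) for every unordered pair of compounds.

    peak_heights maps a compound number to its peak heights across individuals.
    Returns a dict keyed by (i, j) with one log ratio per individual.
    """
    ratios = {}
    for i, j in combinations(sorted(peak_heights), 2):
        ratios[(i, j)] = [
            math.log10(a / b) for a, b in zip(peak_heights[i], peak_heights[j])
        ]
    return ratios


if __name__ == "__main__":
    # Hypothetical peak heights for 15 compounds in two individuals.
    toy = {n: [float(n), 2.0 * n] for n in range(1, 16)}
    traits = log_ratios(toy)
    print(len(traits))  # 15 compounds give 105 log-ratio traits
```

Each of the resulting traits can then be submitted to single-trait QTL analysis in the same way as an individual concentration. Swapping numerator and denominator only changes the sign of a log ratio, so one orientation per pair is sufficient.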
Table 2. Quantitative trait locus-associated LOD scores and flavonoid ratios affected by the QTL
|Genetic map||LG||LOD (mIM)||LOD (mIM)||Ratios (univariate)|
|Family 001|| ||87001||95001||87001|
|P.d.||XIII||11.0a||15.2a||1/3a, 2/3a, 4/3b, 5/3b, 6/3a, 9/3a, 10/3a, 11/3a, 12/3a, 13/3a, 14/3a, 15/3a|
| ||XV|| || ||13/10b|
|P.n.||III||11.3b||11.9a||1/11a, 2/11a, 6/11a, 9/11a, 10/11a, 13/11a, 15/11a, 6/14a, 9/14b, 13/14a, 15/14a|
| ||XIII||18.2a||9.5b||1/4a, 2/4a, 6/4a, 9/4a, 10/4a, 13/4a, 15/4a|
|Family 002|| ||87002||95002||87002|
|P.d.||XIII||11.0a|| || |
|P.t.||IV||19.8a||12.1a||1/5a, 2/5a, 3/5a, 4/5a, 7/5a, 9/5a, 10/5a, 12/5a, 13/5a, 14/5b, 15/5a, 12/7a|
| ||V||13.8a|| ||8/6b, 15/6a|
| ||XIII|| ||10.4a|| |
Figure 4. Metabolite quantitative trait loci position. The presented mQTL were obtained by mIM with the peak height/dry weight (μV mg−1) of the 15 flavonoids in common between families 001 and 002. Map positions (cM) and marker names (Cervera et al., 2004) are given below each LG. The QTL likelihood map and the probability that the mQTL is present in each marker interval as determined by bootstrapping are shown in red and yellow respectively. Scales presenting the LOD score and the probability of occurrence in a marker interval are on the left and right side of each QTL likelihood map respectively. The position of the maximum LOD score is indicated by a triangle. In each QTL likelihood profile, the flavonoid(s) whose concentration(s) is (are) affected by the mQTL is (are) noted. The horizontal line below the QTL likelihood profile represents the physical length in megabases (Mb) of the assembled LG. Positions of candidate genes are indicated by green triangles, and the positions of the genetic markers by thin lines. In addition, the bivariate standard normal distribution of quercetin 11 and quercetin 3-methyl ether 14 is presented for each genotype group of the mQTL at LG III of the Populus nigra genetic map. Concentration values of quercetin 11 were logarithmically transformed, and those of quercetin 3-methyl ether 14 by taking the square root, and are given on the abscissa and ordinate respectively. AT, acyl-CoA-dependent acyltransferase; CHS, chalcone synthase; F3′H, flavonoid 3′-hydroxylase; F3OGlcT, flavonol 3-O-glucosyltransferase; FS, flavone synthase; OMT, O-methyltransferase.
In family 001, multi-trait interval mapping (mIM) analysis revealed two mQTL on the genetic map of P. nigra cv. Ghoy (hereafter designated P.n. map), on LG XIII and on LG III, and one mQTL on the map of P. deltoides cv. S9-2 (designated P.d. map), on LG XIII. The highest LOD score (18.2) was observed at marker E32F4211 on LG XIII of the P.n. map (Figure 4, likelihood map in red). Bootstrapping results (Figure 4, yellow bars) indicated an almost 80% chance that the mQTL occurred in the 13 cM marker interval e33g3405r–E32F4211. Examination of the log ratios that were affected by the mQTL (Table 2) revealed the importance of this locus for the abundance of pinobanksin 3-acetate 4 (Table S3). In agreement, the highest value (24%) for the variance explained by the mQTL, as determined by mIM, was associated with pinobanksin 3-acetate 4 (Table S2).
A second mQTL on the P.n. map was located on LG III and reached its maximum (LOD 11.3) in the interval E43G4113r–e46g1504. The mQTL had a 60% probability of occurrence in this 10 cM interval based on bootstrapping results (Figure 4, yellow bars). Although the highest values for the variance explained by the mQTL (Table S2) were associated with quercetin 11 and quercetin 3-methyl ether 14 in population 87001 only, all single-trait mQTL for log ratios involving either quercetin 11 or quercetin 3-methyl ether 14 co-localized with the mQTL predicted by mIM (Table S3). Therefore, we concluded that the mQTL is involved in the biosynthesis of the latter two compounds.
As described above, the levels of quercetin 11 and quercetin 3-methyl ether 14 were strongly correlated in all populations. In the case of population 87001, both compounds were isolated from the remainder of the correlation network because the abundance of neither one strongly correlated with any of the other 13 flavonoid levels (Figure 3). If the mQTL on LG III, obtained by mIM, were indeed implicated in the strong association between the levels of quercetin 11 and quercetin 3-methyl ether 14, as suggested by the single-trait mQTL analyses of the log ratios, it would explain a major part of their concentration co-variance. Indeed, the transformed concentration levels of quercetin 11 and quercetin 3-methyl ether 14 had an initial correlation coefficient of 0.89 in population 87001, whereas the correlation was reduced to 0.58 when the mQTL effect was eliminated (Figure 4). mIM indicated that 19% and 14% of the concentration variances of quercetin 11 and quercetin 3-methyl ether 14, respectively, were explained by this mQTL. Taken together, these data point to an mQTL on LG III that affects the concentrations of both quercetin 11 and quercetin 3-methyl ether 14.
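One simple way to see how removing a locus effect can lower a correlation is sketched below. It assumes that the mQTL effect is eliminated by subtracting the mean of each genotype group from both traits before the correlation is recomputed; this is an illustrative assumption rather than a description of the procedure actually used, and all names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def remove_group_means(values, genotypes):
    """Subtract the genotype-group mean from each value (residuals of the locus effect)."""
    sums, counts = {}, {}
    for value, genotype in zip(values, genotypes):
        sums[genotype] = sums.get(genotype, 0.0) + value
        counts[genotype] = counts.get(genotype, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, genotypes)]


if __name__ == "__main__":
    # Hypothetical transformed levels of two compounds in eight individuals,
    # with two genotype classes at the locus.
    genotypes = ["a"] * 4 + ["b"] * 4
    x = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
    y = [2.0, 1.0, 4.0, 3.0, 12.0, 11.0, 14.0, 13.0]
    r_raw = pearson(x, y)
    r_adjusted = pearson(
        remove_group_means(x, genotypes), remove_group_means(y, genotypes)
    )
    print(f"raw r = {r_raw:.2f}, after removing the locus effect r = {r_adjusted:.2f}")
```

When a locus shifts both traits in the same direction, part of their co-variance lies between genotype groups, so the within-group correlation is lower than the raw correlation.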
The mQTL on LG XIII of the P.d. map was specifically associated with the abundance of pinostrobin 3, and the maximum LOD score (11.0) was found in the interval e40g0109–E33F3406r (Figure 4). Approximately 30% of the variance in pinostrobin 3 concentrations was explained by this mQTL.
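The relationship between a LOD score and the proportion of variance explained can be illustrated with a simplified single-trait, single-marker regression. This is not the multi-trait maximum-likelihood procedure used for mIM; it assumes that genotype classes are observed directly at a marker and uses the standard regression approximation LOD = (n/2) log10(RSS0/RSS1). The function name and the toy values are hypothetical.

```python
import math


def marker_lod(trait, genotypes):
    """LOD score and explained variance for a trait regressed on marker genotype.

    RSS0 is the residual sum of squares around the overall mean (no QTL);
    RSS1 is the residual sum of squares around the genotype-class means.
    """
    n = len(trait)
    grand_mean = sum(trait) / n
    rss0 = sum((v - grand_mean) ** 2 for v in trait)

    groups = {}
    for value, genotype in zip(trait, genotypes):
        groups.setdefault(genotype, []).append(value)
    rss1 = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss1 += sum((v - group_mean) ** 2 for v in values)

    lod = (n / 2.0) * math.log10(rss0 / rss1)
    variance_explained = 1.0 - rss1 / rss0
    return lod, variance_explained


if __name__ == "__main__":
    # Hypothetical trait values for eight individuals in two genotype classes.
    trait = [1.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0, 6.0]
    genotypes = ["a"] * 4 + ["b"] * 4
    lod, r2 = marker_lod(trait, genotypes)
    print(f"LOD = {lod:.2f}, variance explained = {100 * r2:.0f}%")
```

Under this approximation the LOD score depends on both the sample size and the explained variance, so similar LOD scores in populations of different size need not reflect similar effect sizes.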
In family 002, four mQTL were detected using mIM, one of which was found in both populations of this family, i.e. in 87002 and 95002 (Table S2). This mQTL was located on LG IV of the genetic map of P. trichocarpa cv. V24 (P.t. map), between the markers e39g0319 and E39G0325 (Figure 4). Bootstrapping results indicated a >80% probability that the mQTL was located in this interval, which is less than 10 cM. An LOD score of 19.8 was obtained. When the QTL results of the log ratios were surveyed (Table 2), the mQTL significantly affected only the concentration of the flavanone 5, explaining approximately 44% of its concentration variance (Table S2).
Notably, all mQTL were involved in the biosynthesis of one or two flavonoids that were only moderately correlated with all other flavonoid levels, as revealed by the correlation networks (Figure 3). Furthermore, no mQTL were detected for the total peak height relative to the dry weight, taken as an estimate of the total amount of aromatics.
Identification of candidate genes
Homologues of all known flavonoid biosynthesis genes were searched for in the poplar genome to identify possible candidate genes for the detected mQTL. The number of homologues found for each structural gene is shown in Table S4. For chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H) and flavonoid 3′-hydroxylase (F3′H), only one homologue was detected in the poplar genome, i.e. on LG X, LG V and LG XIII respectively. Two to five homologues were found for the remaining flavonoid biosynthesis genes, and they were distributed over the genome.
To determine whether any of the flavonoid biosynthesis gene homologues (Figure 1) were present in the mQTL regions, amplified fragment length polymorphism (AFLP) markers in these regions were sequenced and mapped in silico on the poplar genome sequence (Figure 4). Interestingly, LG III contained three CHS homologues that were located in the 90% confidence interval of the mQTL, namely between markers PMGC486c and e40g0213. In addition, an F3′H, a flavone synthase (FS) and two flavonol 3-O-glucosyltransferase (F3OGlcT) homologues were found on LG XIII. No homologues of flavonoid biosynthesis genes were detected on LG IV (Figure 4).
Because we hypothesize that the mQTL on LG XIII controlling the abundance of pinostrobin 3 and pinobanksin 3-acetate 4 correspond to a methyltransferase and an acetyltransferase, respectively (see Discussion), hidden Markov model (HMM) profiles of O-, C-, N- and S-methyltransferases and of CoA-dependent O-acyltransferases were constructed and used to search for homologues in the poplar genome (see Experimental procedures). One O-methyltransferase homologue and three CoA-dependent O-acyltransferase homologues were mapped to LG XIII (assembly version 1.0, June 2004; Figure 4).