Eucalypts are one of the most planted tree genera worldwide, and there is increasing interest in marker-assisted selection for tree improvement. Implementation of marker-assisted selection requires a knowledge of the stability of quantitative trait loci (QTLs). This study aims to investigate the stability of QTLs for wood properties and growth across contrasting sites and multiple pedigrees of Eucalyptus globulus.
Saturated linkage maps were constructed using 663 genotypes from four separate families, grown at three widely separated sites, and were employed to construct a consensus map. This map was used for QTL analysis of growth, wood density and wood chemical traits, including pulp yield.
Ninety-eight QTLs were identified across families and sites: 87 for wood properties and 11 for growth. These QTLs mapped to 38 discrete regions, some of which co-located with candidate genes. Although 16% of QTLs were verified across different families, 24% of wood property QTLs and 38% of growth QTLs exhibited significant genotype-by-environment interaction.
This study provides the most detailed assessment of the effect of environment and pedigree on QTL detection in the genus. Despite markedly different environments and pedigrees, many QTLs were stable, providing promising targets for the application of marker-assisted selection.
There has long been interest in the use of molecular markers to assist in selection in plant and animal breeding (Stuber et al., 1982; Strauss et al., 1992; Weller, 2001). Often referred to as ‘marker-assisted selection’ (MAS), this involves selecting for specific traits directly on the basis of molecular marker genotypes, rather than by traditional phenotypic selection. Eucalyptus is a forest tree genus of global economic importance as raw material for pulp, paper, solid wood products and, increasingly, as a source of carbon-neutral renewable energy. The application of MAS to genetic improvement in forest trees, such as Eucalyptus, presents certain distinct challenges relative to crops and livestock, which are related to their outcrossing breeding systems and the fact that most tree species have been domesticated for a few generations at most (Butcher & Southerton, 2007). Nonetheless, the long generation times, coupled with often poor juvenile/mature trait correlations (Pelgas et al., 2011), mean that early selection via MAS would be particularly beneficial in forest trees, and their recent domestication means that trees retain a wealth of genetic variation that can be exploited (Butcher & Southerton, 2007).
Proposed techniques for providing marker–trait associations for MAS include quantitative trait loci (QTL) and association genetic analyses, and, recently, genomic selection. When DNA markers were first developed, it was envisaged that QTLs would be rapidly used as direct tools for early selection (Stuber et al., 1982; Tanksley et al., 1989). However, inherent limitations of the QTL technique, and the experimental design and/or genetic material used by most QTL studies, have largely restricted the application of QTLs for MAS to direct selection for large effect alleles with simple inheritance, such as those for disease resistance in apple (Bus et al., 2009) and barley (Zhong et al., 2006). A fundamental limitation of the QTL technique is that linkage disequilibrium generally decays much more rapidly in breeding populations than in the biparental families used for QTL detection, so that marker–trait associations often do not hold outside of the families used for detection (Strauss et al., 1992). Furthermore, the effects of specific loci often vary in different genetic backgrounds and environments (Neale & Savolainen, 2004; Symonds et al., 2005; Pelgas et al., 2011), yet QTL studies are predominantly conducted in a single or very few biparental families and sites, further limiting the generality of their findings.
Some of the limitations which prevent the direct use of QTLs for MAS can be overcome by association genetic analysis. The major differences between association and QTL studies are the much higher resolution and greater allelic diversity captured by association genetics (Ingvarsson & Street, 2011). However, despite a growing body of single nucleotide polymorphism (SNP)–trait associations, the application of candidate gene-based association genetics to MAS in both plants and livestock has been limited, largely reflecting the cost and difficulties in adequately characterizing the genetic architecture underlying polygenic traits (Hayes et al., 2009; Jannink et al., 2010; Grattapaglia & Resende, 2011). In the case of forest trees, an important limitation to such characterization is the rapid decay of linkage disequilibrium, which, so far, has prohibited the use of the genome-wide association genetic approaches commonly employed in humans and model species (Neale & Kremer, 2011). As a result, association studies in forest trees have used a candidate gene approach. The selection of appropriate candidate genes is a major challenge and obviously vital for the success of gene-based association studies.
The rapid development of high-throughput and cost-effective genotyping technologies is making genome-wide studies in non-model organisms more feasible, which may circumvent the need for associations with known genes. Genomic selection was first proposed for livestock improvement (Meuwissen et al., 2001) and is claimed to have revolutionized dairy cattle breeding, following the experimental application of the technique in the USA, New Zealand, Australia and the Netherlands (Hayes et al., 2009). It has since attracted widespread interest for breeding crop plants (Heffner et al., 2009; Jannink et al., 2010), fruit (Kumar et al., 2011) and forest trees (Grattapaglia & Resende, 2011; Isik et al., 2011; Iwata et al., 2011; Resende et al., 2012a,b; Zapata-Valenzuela et al., 2012; Denis & Bouvet, 2013). In brief, this approach bases selection on genomic breeding values calculated from densely spaced markers across the genome (Meuwissen et al., 2001). In contrast with QTL and association genetic approaches, genomic selection uses all genomic information, regardless of effect size or the requirement of meeting significance thresholds, potentially capturing all of the loci that contribute to variation in a given trait. However, despite promising results from experimental studies, further investigation into the accuracy of prediction models in different populations (e.g. interspecific hybrids relative to pure species) and environments is needed before the technique is applied operationally in tree breeding (Isik et al., 2011; Resende et al., 2012a,b).
The ongoing evolution of more sophisticated techniques for the detection and application of marker–trait associations to selection is clearly making MAS more achievable. However, regardless of the technique used to detect marker–trait associations, knowledge concerning the nature and extent of genotype-by-environment interaction (G × E) at the molecular level is crucial for the broad success of MAS (Liu et al., 2006; Collard & Mackill, 2008; Hayes et al., 2009; Resende et al., 2012a,b). The existence of G × E in plants and its potential significance for breeding have long been realized (Burdon, 1977; Cooper & DeLacy, 1994). However, it is only more recently that researchers have begun to quantify G × E at the molecular level in crops (e.g. Messmer et al., 2009; Crossa et al., 2010) and forest trees (Jermstad et al., 2003; Rae et al., 2008; Novaes et al., 2009; Thumma et al., 2010a; Pelgas et al., 2011). Generally, the importance of G × E is known to be relatively high for traits of low heritability, such as growth and yield. However, the importance of G × E for physical and chemical wood properties of tree species is the subject of some debate. Within Eucalyptus, the consensus from the few quantitative genetic studies is that G × E for such traits mostly involves heterogeneity of variance, rather than changes in the ranking of genotypes, across environments and, as such, is of limited practical importance for breeding (Raymond, 2002). However, few past studies in E. globulus have had access to sufficient samples of control pollinated or clonal material grown on multiple sites to adequately assess levels of G × E for wood properties (Lima et al., 2000; Volker, 2002; Costa e Silva et al., 2004, 2009). 
The few QTL and association studies able to investigate this issue in trees at the molecular level have found that a substantial proportion of loci exhibit variable expression in different environments for a range of wood property traits, in Populus (Novaes et al., 2009) and Eucalyptus (Southerton et al., 2010; Thumma et al., 2010a), highlighting the importance of testing the effect of specific loci and alleles in different environments.
Eucalyptus globulus is the premier species for pulp-wood plantations in temperate regions world-wide (Potts et al., 2004). The key selection traits for a pulp-wood breeding objective in E. globulus are volume growth, basic density and pulp yield (McRae et al., 2004). Chemical traits, such as the content of cellulose, lignin and extractives and the ratio of syringyl to guaiacyl (S : G) subunits of lignin, are also of economic interest, as they affect pulp yield, as well as the cost and efficiency of the pulping process (Stackpole et al., 2011). A growing number of studies have considered QTLs for wood properties in the genus (e.g. Grattapaglia et al., 1996; Verhaegen et al., 1997; Bundock et al., 2008), including chemical properties (Thamarus et al., 2004; Rocha et al., 2007; Freeman et al., 2009; Thumma et al., 2010b; Gion et al., 2011). However, most have considered a single pedigree and site only, and those which have used populations in multiple sites generally lack the same families planted across sites (e.g. Thumma et al., 2010b), or sufficient individuals to detect QTLs within a single site (e.g. Thamarus et al., 2004; Bundock et al., 2008). Further, the lack of orthologous markers has limited comparison between the majority of QTL studies in eucalypts to the linkage group level (e.g. Thamarus et al., 2004; Freeman et al., 2009; Thumma et al., 2010b; Gion et al., 2011), again prohibiting detailed comparison of QTL expression across genetic backgrounds and environments.
Recent technological developments in the genus, including the release of the E. grandis genome sequence and the development of high-throughput genotyping using diversity array technology (DArT; Sansaloni et al., 2010; Steane et al., 2011) markers, are revolutionizing eucalypt genomic studies. The majority of DArT markers have been sequenced and annotated on the genome, greatly facilitating the identification of candidate genes near QTLs and the high-resolution comparison of results across studies. Furthermore, DArT markers display a largely homogeneous distribution across the E. grandis reference genome, providing excellent coverage for linkage mapping studies (Petroli et al., 2012). This project exploits this new DArT technology and aims to investigate the genetic architecture underlying growth and wood properties across a broad range of genetic material in E. globulus, and the extent to which QTLs are stable in different environments and pedigrees.
Materials and Methods
Genetic material and trial designs
Linkage mapping and QTL analyses were conducted in four unrelated E. globulus Labill. families, including three full-sib F1 families and a full-sib outbred F2. In combination, these families sampled a diverse section of the natural distribution of E. globulus (Table 1). Each F1 pedigree was planted at two sites, Manjimup in the south-west of Western Australia and Branxholme in western Victoria, in September 2001 (Table 2). The most obvious environmental difference between the trial sites was the mean annual rainfall, with the more productive Western Australian site receiving almost 50% more than the Victorian site (Table 2). The three mapping families were part of a broader trial comprising nine F1 families, each planted at both sites. Each site was planted with trees arranged in (5 × 5) pedigree plots. The trial was designed as a modified ‘row column design’ at the plot level (Williams et al., 2002), where over-represented families were randomly assigned to plot positions of under-represented families. As only a subset of the planted families was studied, row and column terms were not included in the statistical model, and plot within pedigree was treated as a completely random term. For each F1 pedigree, 92 individuals per site were used in this study (with the exception of pedigree 4 in Victoria, n = 91 individuals) selected from four to five plots per pedigree, depending on survivorship. Plots were selected at random, except that those at the edges of the plantation were avoided. The F2 mapping pedigree included 112 genotypes, each of which was replicated clonally (two trees per genotype). It was planted in a field trial at Woolnorth in north-west Tasmania in May 1998. The ortet and ramet representing each genotype were assigned to separate replicates at random, and all clonally replicated individuals were used in this study (for full details of the trial design, see Milgate et al., 2005).
The procedure for DNA extraction and quantification used for all families, as well as the genotyping of microsatellite (or simple sequence repeat, SSR) and amplified fragment length polymorphism (AFLP) markers in the F2, is described in Freeman et al. (2006). The genotyping of DArT markers is described in Hudson et al. (2012), which includes the maps generated in this study. Twenty-one cell wall candidate genes were also mapped in this study. These were selected from association studies in E. nitens (Southerton et al., 2010; Thumma et al., 2010a) and E. globulus (S. Thavamanikumar & L. McManus, unpublished). Specifically, candidate genes were mapped which were significantly associated with Kraft pulp yield (KPY) and/or cellulose content across at least two populations of E. nitens (RAC7, CCoAMT2, EXP1, SAM1, CAD, DHY, UP3, EXGT1, CNX1, CSLA9, MYB1, COBL4, HB1; Southerton et al., 2010; Thumma et al., 2010a). Except for the CCR gene (see 'Discussion'), the remaining candidate genes (AQP1, NAP1, CSA3, KOR, PCBER, SUSY3, CDPK) were selected as a result of their association with various growth and wood properties in at least one population of E. globulus (S. Thavamanikumar, unpublished).
Candidate genes were placed on the linkage maps using two different methods. The majority were placed at the DArT marker in the consensus map closest to the homologous gene in the E. grandis genome sequence, using the GBrowse function of EucGenIE (http://www.eucgenie.org/; Hefer et al., 2011). Hudson et al. (2012) demonstrated a high degree of synteny and collinearity between saturated linkage maps constructed in E. grandis and E. globulus, suggesting that the genomes of these two species are very similar and that information from the E. grandis genome sequence is readily transferable to E. globulus. A subset of candidate genes (AQP1, NAP1, CSA3, KOR, PCBER, COBL4) was placed by genotyping SNPs within the genes in the mapping families exhibiting polymorphism, using cleaved amplified polymorphic sequence (CAPS) markers. Gene regions containing the SNPs of interest were first amplified using PCR, performed in 15-μl reactions. Each reaction contained 1 μl of c. 5 ng μl−1 genomic DNA, 1 × PCR buffer (Bioline 5 × MangoTaq™ colourless reaction buffer; Bioline (Aust) Pty Ltd, Alexandria, Australia), 1.5 mM MgCl2, 0.27 mM of each deoxynucleoside triphosphate (dNTP), 1 unit of DNA polymerase (Bioline MangoTaq™) and 0.47 μM of each of the forward and reverse primers (Supporting Information Table S1). Amplification conditions were as follows: an initial denaturation step at 94°C for 5 min, followed by 30 amplification cycles (30 s at 94°C; 30 s at annealing temperature (50°C for AQP1 and PCBER, 55°C for COBL4 and CSA3, and 60°C for NAP1 and KOR); 30 s at 72°C), followed by a final extension step at 72°C for 5 min. After verifying amplification via agarose gel electrophoresis, 5 μl of raw PCR product from each individual was digested in a final volume of 30 μl containing 1 × buffer and 2 units of restriction enzyme. Reactions were incubated overnight at the digestion temperature recommended for each enzyme by the supplier. 
Digestion products were resuspended in 1 × loading buffer (15% Ficoll, 0.05% xylene cyanol, 0.05% bromophenol blue) and separated by agarose gel electrophoresis. Details of the enzymes and specific reaction conditions for markers are given in Table S1.
QTL analysis was conducted using a consensus linkage map of the four families employed in this study (Fig. S1), allowing an assessment of the QTL homology between each pedigree. Joinmap 4 (Van Ooijen, 2006) was used for the construction of all maps, as it is able to integrate data with various segregation types and recombination estimates from a variety of sources into a single map. Linkage maps were constructed primarily from DArT markers (Sansaloni et al., 2010; Steane et al., 2011), but also included microsatellites and candidate genes. In the case of the F2, DArT and candidate gene markers were added to the map previously constructed from AFLP and microsatellites (Freeman et al., 2006).
In summary, the mapping procedure involved the construction of individual parental maps, which were merged to form sex-averaged maps for each pedigree, which, in turn, were subsequently merged to form the overall consensus. The construction of parental maps followed an iterative approach, as described by Freeman et al. (2006), but, in this case, mapping was conducted in three rounds, adding: (1) all markers except for DArT and AFLP segregating 3 : 1; (2) all AFLP and DArT segregating 3 : 1 with a high-quality score (reproducibility ≥ 90; see Sansaloni et al., 2010); (3) the remaining DArT markers. Parental maps were merged on the basis of biparentally segregating markers to form a high-density (498–695 markers per pedigree) sex-averaged map for each pedigree. The overall consensus map was then constructed using a subset of markers from these sex-averaged maps. Markers were selected from each pedigree (at c. 2.5–5-cM intervals) to provide even and full coverage of each linkage group, giving preference to ‘anchor’ markers mapped in the greatest number of families. The consensus map used for QTL analyses comprised 463 markers, including 392 DArT, 35 SSR, 30 AFLP and six candidate gene SNPs. The remaining candidate genes were placed after QTL analyses.
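The marker-selection step for the consensus map can be illustrated with a short sketch. It is not the procedure used by the authors, whose exact selection rules are not given; it assumes a simple greedy walk along one linkage group that picks, within each 2.5–5-cM window, the marker mapped in the most families. The `Marker` type, the `thin_markers` function and the example markers are hypothetical.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    name: str
    position_cm: float
    n_families: int  # number of families in which the marker was mapped


def thin_markers(markers, min_gap=2.5, max_gap=5.0):
    """Greedily select markers roughly min_gap..max_gap cM apart.

    Within each window the marker mapped in the most families is
    preferred (an 'anchor'); ties go to the marker nearest the last pick.
    If the window is empty, the nearest marker beyond min_gap is taken.
    """
    ordered = sorted(markers, key=lambda m: m.position_cm)
    if not ordered:
        return []
    selected = [ordered[0]]  # always keep the first marker of the group
    while True:
        last = selected[-1].position_cm
        ahead = [m for m in ordered if m.position_cm >= last + min_gap]
        if not ahead:
            break
        window = [m for m in ahead if m.position_cm <= last + max_gap]
        if window:
            choice = max(window, key=lambda m: (m.n_families, -m.position_cm))
        else:
            choice = ahead[0]
        selected.append(choice)
    return selected


# Hypothetical linkage group, for illustration only.
example = [
    Marker("m1", 0.0, 2), Marker("m2", 1.0, 4), Marker("m3", 3.0, 1),
    Marker("m4", 4.5, 3), Marker("m5", 6.0, 1), Marker("m6", 12.0, 2),
]
chosen = [m.name for m in thin_markers(example)]
```

In this toy example, m4 is preferred over m3 because it is mapped in more families, and m6 is taken across a gap because no marker falls in the next window. A real implementation would also need to guarantee coverage of the distal end of each linkage group, which this sketch does not.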
In the parental maps of each pedigree, linkage groups were selected at a minimum (independence) logarithm of the odds ratio (LOD) of 3. Marker order within linkage groups was determined using the default Joinmap 4 settings and a maximum χ2 goodness of fit jump threshold of 2.0. In order to produce maps with robust marker order, collinearity of shared markers for each linkage group was established between parental maps before constructing the sex-averaged map in each pedigree and between the sex-averaged maps before constructing the overall consensus map. After constructing the sex-averaged map for each pedigree and the consensus map, the order of markers was again checked against component maps and markers were excluded (or the order of component maps fixed) to maximize agreement with the component maps.
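As a rough guide to what a LOD threshold of 3 means, the sketch below computes the textbook two-point LOD score for a pair of markers in a phase-known, backcross-type configuration. This is an illustration only: Joinmap 4 groups markers using an independence LOD, which is calculated differently, and the function name and example counts are hypothetical.

```python
import math


def two_point_lod(recombinants: int, total: int) -> float:
    """Two-point LOD for linkage in a phase-known backcross-type cross.

    Compares the likelihood at the maximum-likelihood recombination
    fraction (recombinants / total, capped at 0.5) with the likelihood
    under free recombination (0.5).
    """
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need total > 0 and 0 <= recombinants <= total")
    theta = min(recombinants / total, 0.5)

    def log10_term(count: int, prob: float) -> float:
        # Convention 0 * log(0) = 0, which avoids log10(0) when theta is 0.
        return 0.0 if count == 0 else count * math.log10(prob)

    log_linked = (log10_term(recombinants, theta)
                  + log10_term(total - recombinants, 1.0 - theta))
    log_free = total * math.log10(0.5)
    return log_linked - log_free


# Ten informative meioses with no recombinants give a LOD just above 3.
lod_tight = two_point_lod(0, 10)
```

A LOD of 3 corresponds to the linked hypothesis being 1000 times more likely than free recombination, which in this simple setting needs only ten recombinant-free meioses.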
Assessment of phenotypic traits
Stem diameter and wood property traits were assessed at 7 yr of age in each pedigree and site. Stem diameter over bark (mm) of the main stem was measured at breast height (1.3 m; DBH). In order to assess wood properties, cores, 12 mm in diameter, were taken from bark to bark through the centre of each tree at a height of c. 110 cm and from a constant aspect within each trial. For the F1 families, each core was sawn in half longitudinally: one half was used to measure basic density and the other half to assess chemical wood properties. For the F2 pedigree, two separate cores were taken and one was used for each assay. Basic density was determined by the water displacement method (TAPPI, 1989). Each core was air dried, and then ground to wood meal (as described by Poke et al., 2005) for the chemical analyses. Chemical wood properties (cellulose, klason lignin and extractives content, as well as the predicted pulp yield and lignin S : G ratio) were assessed indirectly by near-infrared (NIR) spectroscopy, using a Bruker spectrometer (model Vector 22-N; Bruker Optik GmbH, Ettlingen, Germany). With the exception of the lignin S : G ratio, the NIR models for klason lignin and extractives (Poke et al., 2005), as well as cellulose and predicted pulp yield (Downes et al., 2011), have previously been validated independently in E. globulus studies. The models for klason lignin and extractives (F2 pedigree only), as well as cellulose content (all families), were also re-validated in the present study. Ten per cent of the cores for each pedigree (and each pedigree × trial site in the case of F1 families) were analysed using wet chemistry for model validation. Predicted and laboratory values for cellulose were highly correlated (R² = 0.83, 0.79 and 0.87 for the F2 and F1 families in Western Australia and Victoria, respectively). Predicted and laboratory values were also highly correlated for extractives and klason lignin content in the F2 family (R² = 0.90 and 0.66, respectively).
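The validation statistic can be sketched as follows. The text does not state how R² was calculated, so this example assumes the squared Pearson correlation between NIR-predicted and wet-chemistry values; the function name and the numbers are hypothetical.

```python
def r_squared(predicted, observed):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined for a constant sequence")
    return sxy ** 2 / (sxx * syy)


# Hypothetical values for four cores, for illustration only.
nir_predicted = [1.0, 2.0, 3.0, 4.0]
wet_chemistry = [1.0, 3.0, 2.0, 4.0]
validation_r2 = r_squared(nir_predicted, wet_chemistry)
```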
Quantitative genetic analyses
The quantitative genetic analysis of the F2 pedigree is described by Freeman et al. (2009); the following applies to the F1 families only. In order to test for the significance of pedigree, site and their interaction effects, a mixed model was fitted for each trait using the PROC MIXED procedure of SAS (version 9.2; SAS Institute Inc., 2010). The mixed model incorporated the effects of site (fixed), family (fixed) and site × family (fixed). Plot within family within site was fitted as a random term, which was used as the error to test the fixed effects. G × E may be caused either by heterogeneity of variances/scale effects across sites, and/or changes in the ranking of genotypes, of which only the latter will significantly influence any breeding strategy. In order to test for significant changes in pedigree ranking, for each trait, least-square means were generated and multiple pairwise comparisons were performed following Tukey–Kramer adjustment.
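Written out, the mixed model described above might take the following form. The notation is not taken from the original text and is introduced here only for illustration.

```latex
y_{ijkl} = \mu + S_i + F_j + (SF)_{ij} + p_{k(ij)} + e_{ijkl},
\qquad p_{k(ij)} \sim N(0, \sigma^2_p), \quad e_{ijkl} \sim N(0, \sigma^2_e)
```

Here y_{ijkl} is the trait value of tree l in plot k of family j at site i, S_i, F_j and (SF)_{ij} are the fixed site, family and site × family effects, p_{k(ij)} is the random effect of plot within family within site, and e_{ijkl} is the residual. Under this formulation, the fixed effects are tested against the plot term rather than against the tree-level residual.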
QTL analysis was conducted with MAPQTL 6.0 (Van Ooijen, 2009), using the consensus linkage map for all analyses. In order to examine the pedigree and site stability of the detected QTLs, analyses were conducted separately in each F1 pedigree site by site, in each pedigree (combined across sites) and combined across all families and sites. To simplify the comparison of QTL results between families, interval mapping was used for all analyses; multiple-QTL model (MQM) mapping was additionally used in the analysis combined across all families and sites in order to minimize the confidence intervals surrounding these key QTLs. All analyses used the default parameters of MAPQTL 6.0 (Van Ooijen, 2009). For the F1 families, the experimental design cofactors plot (for the analysis in each pedigree by site) and plot and site (in all other analyses) were fitted to remove some of the heterogeneity within and between sites. Putative QTLs were declared at two different levels: significant (chromosome-wide type I error < 0.05) and suggestive (chromosome-wide type I error < 0.1). The LOD thresholds for significant and suggestive QTLs were determined by permutation testing for each level of analysis (1000 replications; Churchill & Doerge, 1994). It is probable that some QTLs at the suggestive threshold are false positives, but these are presented to the mapping community as suggested by Van Ooijen (1999) to aid comparative QTL mapping. The MQM procedure in the analysis combined across families followed that described by Freeman et al. (2008), except that it used the regression algorithm (now available in MAPQTL 6.0; Van Ooijen, 2009). Many of the QTLs detected in the different analyses are likely to be independent estimations of the same QTL. Accordingly, for comparative purposes, QTLs detected in different families, and between analyses within the F1 families, were merged when their peak was within 15 cM of another QTL for the same trait (following Brown et al., 2003).
In keeping with this definition of what constitutes a discrete QTL between analyses, QTLs were represented by arbitrary 15-cM confidence intervals (Brown et al., 2003). Within each family, QTL stability across sites was assessed by the significance of the site × QTL term in the mixed model (already described, but substituting family with QTL) and by comparing the significance of QTL effects in the analysis within each F1 pedigree site by site vs combined across sites.
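The permutation approach to setting chromosome-wide LOD thresholds (Churchill & Doerge, 1994) can be illustrated with a minimal sketch. It is not the MAPQTL implementation: it assumes a simple single-marker test with genotype classes coded as integers in place of interval mapping, and all function names and data structures are hypothetical.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD from a one-way comparison of genotype class means."""
    n = len(phenotypes)
    if n == 0 or n != len(genotypes):
        raise ValueError("genotypes and phenotypes must be equal, non-zero length")
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for g in set(genotypes):
        ys = [y for gi, y in zip(genotypes, phenotypes) if gi == g]
        mean_g = sum(ys) / len(ys)
        rss_marker += sum((y - mean_g) ** 2 for y in ys)
    if rss_null == 0:
        return 0.0  # no phenotypic variation to explain
    if rss_marker == 0:
        return math.inf  # marker explains all of the variation
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(marker_genotypes, phenotypes,
                          alpha=0.05, n_perm=1000, seed=0):
    """Chromosome-wide LOD threshold at type I error rate alpha.

    Phenotypes are shuffled relative to genotypes n_perm times; the
    maximum LOD over all markers on the chromosome is recorded each
    time, and the threshold is taken as approximately the (1 - alpha)
    quantile of those maxima.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(g, shuffled) for g in marker_genotypes))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]
```

Calling `permutation_threshold` with `alpha=0.05` and `alpha=0.1` would give thresholds analogous to the significant and suggestive levels described above, with `n_perm=1000` matching the number of replications reported.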
Results
Quantitative genetic analysis
In the cloned F2 pedigree, all wood property traits exhibited highly significant variation between genotype means (P < 0.001), whereas growth traits exhibited less significant variation (P < 0.05; see Freeman et al., 2009). For the F1 families, each of the traits measured exhibited significant variation, between families and between sites in each family (Table 3). Trees on the drier site (Victoria) were slower growing (32.4%) and had higher density (12.4%) and extractives content (25.6%), but lower pulp yield (6.3%), cellulose (4.9%) and klason lignin content (3.9%), and S : G lignin ratio (11.1%), than those on the wetter site (Western Australia). Based on F values, site was the major factor affecting all traits. With the exception of klason lignin, the site-by-family interaction (G × E) was significant for all traits. The strongest G × E was for DBH, for which the F value for the site-by-family interaction was comparable with that for the family effect. For the wood property traits, the F value for the family effect was substantially greater than that for the site-by-family interaction (Table 3), suggesting that the relative performance of each family was less affected by site for these traits.
Table 3. Genotype-by-environment interaction (G × E) in the F1 mapping families of Eucalyptus globulus
Despite large differences in site means for growth and all wood property traits, as well as statistically significant G × E for all traits except klason lignin, the relative rankings of families were stable across sites for pulp yield, cellulose and klason lignin content, as well as S : G lignin ratio. For density and extractives, there was a slight change in the relative ranking of families, although this change was not significant (results not shown). The difference in family rankings between sites was significant for DBH, largely because of the poor performance of family 1 in the drier, slower growing Victorian trial site (Fig. 1).
In total, 117 QTLs were found across all analyses (27, 32, 34 and 24 in the F1 families 1, 4, 5 and the F2 family, respectively). Of these, 19 were judged to be redundant as QTLs for the same trait occurred in the same location in more than one pedigree, reducing the total to 98 QTLs (Table 4). The 98 QTLs mapped to 38 discrete regions (Fig. 2), reflecting the inter-related nature of many of the traits. Fourteen of the 21 candidate genes in this study (RAC7, CAD, DHY, UP3, EXGT1, CNX1, HB1, AQP1, CSA3, NAP1, PCBER, COBL4, SUSY3 and CDPK) mapped to 10 of these genomic regions (Fig. 2). Fifteen of the 38 non-overlapping QTL regions were supported in the analyses combined across all families and sites (Table 4; Fig. 3). As these analyses employed a more conservative significance threshold, the great majority of QTLs significant at this level affected trait variation across different sites and/or families. Each QTL explained a small to moderate proportion of the phenotypic variation present, and this value was influenced strongly by the sample size of the different analyses (Table S2), as is known to be the case in QTL studies (Beavis, 1998).
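The redundancy step, in which QTLs for the same trait with peaks within 15 cM of one another are treated as a single QTL, can be sketched as below. This is an illustrative reading of the merging rule rather than the authors' code: it assumes chained merging of neighbouring peaks on the same linkage group and reports the mean peak position. The function names and the example records are hypothetical.

```python
from collections import defaultdict


def _summarise(trait, linkage_group, cluster):
    """Collapse one cluster of (peak_cm, family) hits into a merged QTL."""
    mean_peak = sum(peak for peak, _ in cluster) / len(cluster)
    families = tuple(sorted({family for _, family in cluster}))
    return (trait, linkage_group, mean_peak, families)


def merge_qtls(qtls, max_distance_cm=15.0):
    """Merge QTLs for the same trait whose peaks lie within max_distance_cm.

    qtls is an iterable of (trait, linkage_group, peak_cm, family) tuples.
    Peaks are sorted along each linkage group and chained: a peak joins
    the current cluster if it is within max_distance_cm of the previous
    peak in that cluster.
    """
    by_key = defaultdict(list)
    for trait, linkage_group, peak_cm, family in qtls:
        by_key[(trait, linkage_group)].append((peak_cm, family))

    merged = []
    for (trait, linkage_group), hits in sorted(by_key.items()):
        hits.sort()
        cluster = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - cluster[-1][0] <= max_distance_cm:
                cluster.append(hit)
            else:
                merged.append(_summarise(trait, linkage_group, cluster))
                cluster = [hit]
        merged.append(_summarise(trait, linkage_group, cluster))
    return merged


# Hypothetical QTL records, for illustration only.
detected = [
    ("density", 2, 10.0, "F1-1"),
    ("density", 2, 22.0, "F2"),
    ("density", 2, 60.0, "F1-4"),
    ("pulp yield", 2, 12.0, "F1-5"),
]
discrete = merge_qtls(detected)
```

In this toy example, four detected QTLs reduce to three: the two density QTLs at 10 and 22 cM are merged and attributed to both families, whereas the pulp yield QTL at 12 cM remains separate because merging applies only within a trait.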
Table 4. Quantitative trait loci (QTLs) for growth and wood properties across families and sites in Eucalyptus globulus (see Table 3 for trait acronyms)
For each QTL in the F1 families, the logarithm of the odds ratio (LOD) score is shown for the analysis within each site and pooled across sites, where significant. In the individual site analyses, ‘w’ and ‘v’ indicate significance in Western Australia and Victoria, respectively. Chromosome-wide significance levels for each QTL: suggestive: s, P < 0.1; significant: *, P < 0.05; **, P < 0.01; ***, P < 0.001.
cM reports QTL position from the analysis combined across families, where significant; otherwise, the mean position of significant results is reported.
The significance of QTL × E at each QTL. Where a QTL is significant in multiple families, the number in parentheses indicates which family exhibits significant QTL × E.
Eleven QTLs influenced DBH, distributed across all linkage groups, except for 1, 6 and 9. Multiple regions affected DBH on linkage groups 7, 8 and 11 (Fig. 2). A single QTL for DBH was identified in the F2 family (Table 4). Across all the analyses, 16 QTLs in total were identified for density, two to seven of which were detected in any one pedigree (Table 4). The QTLs for density were located on all linkage groups, except 1, 4 and 7 (Fig. 2). In six different linkage groups, more than one QTL affected density. Twelve to 17 QTLs were identified for each of the chemical wood property traits, with an average of 14.2 QTLs/trait. QTLs for chemical traits were found in multiple locations on all linkage groups, and ‘hotspots’ influencing multiple chemical traits within and between families occurred on most groups (Fig. 2).
Stability across sites
The stability of QTLs across sites can be quantified in a number of different ways. In the analysis undertaken in each F1 pedigree, site by site, there were 22, 22 and 27 QTLs detected in families 1, 4 and 5, respectively. Of these, just one QTL per pedigree was significant at both sites, each with consistent allelic effects at both sites. However, of the QTLs detected in the single-site analyses, 45%, 54% and 30% were also detected in the analysis combined across sites within families 1, 4 and 5, respectively. A less conservative measure of site stability is the test for QTL-by-site interaction. Under this test, an average of 74.2% of QTLs were site stable (i.e. did not exhibit a significant interaction with site), including 61.5% of the QTLs for growth, 68.7% of those for wood density and 78.8% of the QTLs for wood chemistry traits.
Discussion
The use of multiple pedigrees allowed the detection of substantially more of the genetic architecture underlying the variation in growth and wood properties than in previous studies. There are likely to be hundreds to thousands of genes influencing growth and wood properties in forest trees (e.g. Dillon et al., 2010), in line with the findings of powerful genome-wide association studies for quantitative traits in humans (Visscher, 2008) and crop plants (Buckler et al., 2009). However, QTL studies in tree genera, such as Eucalyptus, Pinus and Populus, have typically found fewer than five QTLs for individual growth and wood property traits (e.g. Thamarus et al., 2004; Pot et al., 2006; Rae et al., 2008; Thumma et al., 2010b; Gion et al., 2011). By contrast, a total of 98 QTLs were located in this study, 11–17 of which were identified for each trait.
Despite a growing number of studies presenting evidence for the influence of specific genes on phenotypic trait variation in forest trees, research in this area is still very much in its infancy. In this context, co-location between QTLs and candidate genes can provide valuable evidence to support the influence of candidate genes on phenotypic trait variation. The co-locations in this study included lignin biosynthetic pathway genes CAD and PCBER with QTLs for S : G and klason lignin content, respectively. A highly significant association between PCBER and klason lignin content was also detected in an E. globulus association study (S. Thavamanikumar, unpublished). Genes involved in transcription/gene activation also co-located with QTLs for density (RAC7 and HB1). Similarly, Thumma et al. (2010a) found significant associations between HB1 and density, as well as between RAC7 and micro-fibril angle (which is strongly correlated with density), in E. nitens. Other notable co-locations included QTLs for pulp yield with cellulose synthase genes CSA3 and COBL4. The latter was significant in the analysis combined across all families and sites, and co-located exactly with QTLs for cellulose, as well as pulp yield, in separate families. Genes coding for cell wall/membrane proteins, AQP1 and NAP1, also co-located with QTLs for extractives, consistent with the significant effect of these genes on pulp yield and various chemical traits, respectively, in an E. globulus association genetic study (S. Thavamanikumar, unpublished). Independent support for the influence of these candidate genes is an important step toward MAS; however, further verification and broader characterization of allelic effects at these loci will be necessary before they can be confidently employed.
QTL stability across families
As is often the case in QTL studies, only a moderate proportion (0–31%) of the QTLs for a particular trait were verified at the family level. Direct comparison of cross-family QTL stability between different studies is complicated by differing cross types, sample sizes, phenotypic traits and analysis techniques. In addition, the relatively small sample size in the F2 and each F1 family per site would probably lead to an underestimate of cross-family QTL stability in this study. Nonetheless, the proportion of QTLs verified between families is broadly comparable with observations in other studies in forest trees (Brown et al., 2003; Thamarus et al., 2004; Pelgas et al., 2011). For example, 51 QTLs were detected for micro-fibril angle, wood density and percentage late wood across two unrelated populations of Pinus taeda, each containing c. 450 individuals planted on adjacent sites (Brown et al., 2003). Of these, 8% of the QTLs for wood density and proportion of late wood were shared between the two populations. In E. globulus, Thamarus et al. (2004) detected 13 QTLs for wood density, pulp yield and micro-fibril angle in one F1 population (n = 148), using phenotypic data averaged over multiple sites. Three of these QTLs (23%) were independently validated in a second F1 population (n = 135) grown at the same sites. Consistent with the present study, QTLs for wood density were more often verified across families than were those for chemical traits or micro-fibril angle in P. taeda (Brown et al., 2003) and E. globulus (Thamarus et al., 2004), which may reflect the higher heritability of density compared with other traits (Stackpole et al., 2011). The difficulty in validating a given QTL between families may reflect factors including a lack of segregation, type I error, type II error and differing QTL effects caused by G × E or epistasis.
Although it was difficult to validate QTLs for a given trait across families, many QTLs were supported by co-location with those for different traits within or between families, resulting in all QTLs mapping to just 38 key genomic regions (see later). The finding of common loci influencing the variation in multiple traits, often including traits which are not strongly phenotypically correlated, is consistent with other QTL (e.g. Brown et al., 2003; Pot et al., 2006; Ukrainetz et al., 2008; Thumma et al., 2010b; Gion et al., 2011) and association (e.g. Thumma et al., 2010a) studies in tree species. Co-location of QTLs for diverse traits may reflect the location of pleiotropic regulators influencing many developmental and/or biosynthetic pathways (Kirst et al., 2005), or linked clusters of genes (Breitling et al., 2008). Given the strong phenotypic and genotypic correlations between most of the traits examined in this study (not shown) and in past studies (e.g. Stackpole et al., 2011), it is likely that many of the co-locations between QTLs for these traits reflect pleiotropy (Rae et al., 2008; Thumma et al., 2010b).
QTL stability across sites
The demonstration of significant G × E for all traits, except Klason lignin, in this study is an important finding, as there is very little information regarding G × E for wood chemical traits in the genus. Furthermore, most studies have used open pollinated material and are therefore likely to underestimate the level of additive G × E because of the inclusion of site-stable inbreeding effects, especially for growth (Hodge et al., 1996; Costa e Silva et al., 2011). Nonetheless, consistent with the present study, significant G × E and changes in family rank are commonly noted for DBH, whereas wood density generally exhibits less significant G × E and relatively constant rankings across sites, in Eucalyptus (Muneri & Raymond, 2000; Costa e Silva et al., 2006) and Pinus (Raymond, 2011). Similarly, for predicted pulp yield, Raymond et al. (2001) found significant G × E and changes in rank at the subrace level, but concluded that G × E was not of great significance for breeding as genetic correlations between the sites studied were very high and the family-by-site interaction was not significant. For the wood property traits in this study, the lack of significant changes in family rank between sites supports the conclusion that G × E at the family level is unlikely to be a major problem for attempts to breed germplasm which will perform well in different sites (Muneri & Raymond, 2000). This finding is significant as the rainfall at the two sites examined is close to the extremes in which E. globulus plantations are grown in Australia (Costa e Silva et al., 2006).
By contrast, studies investigating clonal material replicated across sites have found more pronounced G × E in Populus (Novaes et al., 2009) and Pinus (Resende et al., 2012a). The discrepancy between results dealing with control pollinated families vs those in clonal material may be explained by the fact that family (or population)-by-environment interactions are likely to be less pronounced than variation between identical genotypes at different sites because of the buffering effect of genetic variation within families (and populations; Lima et al., 2000). As E. globulus is deployed as families (which have historically been open pollinated, but are increasingly control pollinated) and clones are not often used (Potts et al., 2008), this would bode well for buffering the impact of G × E.
This is one of the few studies to report QTL-by-environment interaction (QTL × E) for wood properties in forest trees, building on similar findings from earlier studies. Specifically, in a P. taeda F2 pedigree, 25% of significant QTLs for wood chemistry (Sewell et al., 2002) and 40% of QTLs for wood specific gravity (Groover et al., 1994) exhibited G × E. Such interactions for wood properties have also been demonstrated by QTL studies in Populus (Novaes et al., 2009) and association studies in Pinus (Dillon et al., 2010) and Eucalyptus (Southerton et al., 2010; Thumma et al., 2010a; S. Thavamanikumar, unpublished). In the present study, significant G × E was found for QTLs influencing all traits. The relative proportion of QTLs which exhibited significant interactions for the different traits was broadly consistent with the quantitative genetic findings from this and past studies, and the QTL results in P. taeda (Groover et al., 1994; Sewell et al., 2002), with 38.5% of QTLs for DBH exhibiting significant environmental interaction, compared with 31.3% of QTLs for density and 20.6% for wood chemical traits.
This study provided support for the interactive nature of several loci displaying G × E in past studies. For example, Thumma et al. (2010a) reported that roughly one-half of the SNPs associated with cellulose and pulp yield in E. nitens exhibited significant allelic effects in opposing directions across different sites. All six candidate genes exhibiting G × E in that study were mapped here, and two of them co-located with QTLs that had significant G × E and/or ‘variable QTL effects across sites’ (i.e. greater significance in the analysis of a family at a single site than across both sites combined). For example, Thumma et al. (2010a) found that a SNP in CAD, an important gene in the monolignol biosynthesis pathway, exhibited significant G × E for cellulose content. In this study, CAD co-located with QTLs for the S : G lignin ratio in family 1 and DBH in family 4. Both QTLs exhibited variable effects in the different trial sites and the QTL for DBH exhibited significant G × E (Table 4). Similarly, in a P. taeda association study (Yu et al., 2006) a specific CAD allele affected growth and wood density, with variable phenotypic effects in different genetic backgrounds and environments. The present study also provides positional support for G × E at the COBRA-like locus COBL4 (Qiu et al., 2008), the Arabidopsis homologue of which has been implicated in cellulose deposition (Brown et al., 2005). In E. nitens, COBL4 influenced cellulose content and pulp yield in multiple populations, although the allelic effects were reversed in one population (Southerton et al., 2010; Thumma et al., 2010a). In the present study, COBL4 co-located with QTLs for wood density and the content of cellulose and extractives in family 5. All of these QTLs exhibited variable effects at the different trial sites and the QTL for density showed significant G × E.
In conclusion, this study will help to capitalize on the recently released genome sequence by identifying genomic regions influencing growth and wood properties in Eucalyptus. Many of the QTL regions were consistent across multiple families and/or coincided with previously reported QTLs and candidate genes. The majority of QTLs detected did not show significant G × E, despite large site effects on all traits and statistically significant family effects and family-by-environment interactions. Nevertheless, G × E was clearly evident for some QTLs, particularly for growth. The fact that some of these QTLs appear to reflect loci previously reported to exhibit G × E suggests that site-specific performance could be enhanced by using MAS to select different genotypes at these loci for deployment in specific regions, as is performed with conventional phenotypic selection.
This research was funded by the Co-operative Research Centre for Forestry and an Australian Research Council grant (LP0884001). Emlyn Williams and Gavin Moran (CSIRO) designed the F1 field trials. Gunns Ltd., Timbercorp and Western Australian Plantation Resources (WAPRES) planted and managed the trials in Tasmania, Victoria and Western Australia, respectively. This project utilized the University of Tasmania's Central Science Laboratory (CSL) equipment and the assistance of CSL staff member Adam Smolenski. We thank Paul Tilyard, James Marthick, Desmond Stackpole, Rebecca Jones, Matthew Hamilton and Sascha Wise for technical assistance. We also thank Bala Thumma for providing the gene model codes from the E. grandis genome sequence and Chris Harwood for overall project management and for comments on the manuscript.