Genomic selection for resistance to spruce budworm in white spruce and relationships with growth and wood quality traits

Abstract With climate change, the pressure on tree breeding to provide varieties with improved resilience to biotic and abiotic stress is increasing. As such, pest resistance is of high priority but has been neglected in most tree breeding programs, given the complexity of phenotyping for these traits and delays to assess mature trees. In addition, the existing genetic variation of resistance and its relationship with productivity should be better understood for their consideration in multitrait breeding. In this study, we evaluated the prospects for genetic improvement of the levels of acetophenone aglycones (AAs) in white spruce needles, which have been shown to be tightly linked to resistance to spruce budworm. Furthermore, we estimated the accuracy of genomic selection (GS) for these traits, allowing selection at a very early stage to accelerate breeding. A total of 1,516 progeny trees established on five sites and belonging to 136 full‐sib families from a mature breeding population in New Brunswick were measured for height growth and genotyped for 4,148 high‐quality SNPs belonging to as many genes along the white spruce genome. In addition, 598 trees were assessed for levels of AAs piceol and pungenol in needles, and 578 for wood stiffness. GS models were developed with the phenotyped trees and then applied to predict the trait values of unphenotyped trees. AAs were under moderate‐to‐high genetic control (h 2: 0.43–0.57) with null or marginally negative genetic correlations with other traits. The prediction accuracy of GS models (GBLUP) for AAs was high (PAAC: 0.63–0.67) and comparable or slightly higher than pedigree‐based (ABLUP) or BayesCπ models. We show that AA traits can be improved and that GS speeds up the selection of improved trees for insect resistance and for growth and wood quality traits. Various selection strategies were tested to optimize multitrait gains.


| INTRODUC TI ON
Coevolution of herbivores and host plants has led to complex defense mechanisms with insect pests as an important selection force (Züst et al., 2012). The temporal and spatial presence or absence of herbivores in a sympatric area yields immediate ecological and evolutionary changes (Agrawal, Hastings, Johnson, Maron, & Salminen, 2012;Sork, Stowe, & Hochwender, 1993), which can create distinct geographic pattern of adaptive and resistance traits (Parent et al., 2017;Züst et al., 2012). This standing genetic variation in resistance to insect herbivory presents a unique opportunity for breeding programs in crops and trees to enhance resistance of planting stock, which will secure future crop or fiber yield and preserve plantation investments.
Over the last half century, tree breeding has been successful at providing many countries with genetically improved material for their reforestation programs (Mullin et al., 2011). Although it is possible to improve tree species simultaneously for several traits, the focus has generally been on growth traits. Other economically important traits, such as wood properties and insect resistance, are more costly and difficult to assess and require longer time frames.
Despite its success, conventional tree breeding is intrinsically slow and takes decades before genetically improved stock is available for reforestation. For boreal conifers, it can take up to 30 years to complete a breeding cycle of recurrent selection of superior parent trees and testing their progeny. Moreover, the breeders' paradigm until recently assumed that the climate and ecological site conditions were stable over time so that local seed sources were better adapted. Under this paradigm, it was possible to develop genetically improved material within breeding zones of stable climate conditions. However, this paradigm no longer holds with climate change.
Over the long term, natural tree populations may have the potential for evolutionary response to climate change (Alberto et al., 2013).
However, in the short term, tree breeders need more advanced methods and approaches to facilitate the simultaneous evaluation of a variety of ecologically and economically important traits in selected trees, and speed up the selection of superior trees. Such approaches are critical to maintain health and productivity of forest plantations.
Genomic selection (GS) was proposed by Meuwissen, Hayes, and Goddard (2001) as an alternative to conventional breeding based on pedigree selection and relies on genome-wide distributed markers to model the entire complement of QTL effects across the genome to estimate genomic estimated breeding values (GEBVs) of individuals. It is an attractive approach because GS models combine genotypic and phenotypic data collected on individual in genetic tests that can be used to predict breeding values of genotyped, but unphenotyped, individuals or young seedlings without having to test them in the field. Moreover, GS models can be developed using a relatively small subset of individuals for traits that are cumbersome or expensive to assess in the field, such as pest or drought resistance. These models can then be used to predict the breeding values of a large numbers of unphenotyped candidates in order to make selections at high intensity. In forest trees, genomic breeding tools have been developed over the last decade that allow for accelerated screening and selection in large breeding populations and for combining different growth, wood quality, and resistance traits (Bartholomé et al., 2016;Beaulieu, Doerksen, MacKay, Rainville, & Bousquet, 2014;Grattapaglia & Resende, 2011;Lenz et al., 2017;Lenz, Nadeau, Mottet, et al., 2020;Li et al., 2019). Assessing resistance traits is becoming a pressing issue with new conditions of climate and environmental change. It follows that if GS is deployed economically on large number of candidates, multitrait selection combining growth, wood quality, and resistance to pest and other stress becomes more achievable than with conventional breeding (Lenz, Nadeau, Mottet, et al., 2020).
White spruce is a boreal conifer with a natural distribution that covers almost all of Canada and some northern states in the USA.
It occurs on a variety of soils and under a wide range of climatic conditions (Farrar, 1995). It is also one of the most planted conifers in eastern Canada. Breeding programs have been in place over the last 50 years and have mainly focused on growth and adaptive traits (Mullin et al., 2011), which vary widely at the phenotypic level (Li, Beaulieu, Corriveau, & Bousquet, 1993;Nienstaedt, 1985). Over the last decades, emphasis has also been put on wood quality traits Corriveau, Beaulieu, & Daoust, 1991;Lenz, Cloutier, MacKay, & Beaulieu, 2010), because faster growth is generally negatively correlated with important wood quality traits such as density and mechanical properties (Corriveau et al., 1991;Ivkovich, Namkoong, & Koshy, 2002). Maintaining wood quality is thus important to meet the lumber grading construction standards.
Climate change is demanding white spruce breeding programs to accelerate breeding cycles to develop improved genetic material that will be adapted to future climatic conditions and better resistance to increasing occurrence of insect pests and diseases. Warmer temperatures, for instance, would lift the climate barriers to population growth or range expansion of native or invasive forest pests, potentially resulting in severe outbreaks (Gauthier, Bernier, Kuuluvainen, Shvidenko, & Schepaschenko, 2015). Insect outbreaks are already among the major natural disturbances that can cause widespread damages to natural forests and plantations. Spruce budworm (Choristoneura fumiferana (Clem), SBW) is a native insect prevalent in North Eastern America and a defoliator that periodically attacks spruce forests in eastern Canada (Blais, 1983;Dupont, Bélanger, & Bousquet, 1991;Gray & MacKinnon, 2006). It mainly feeds on balsam fir (Abies balsamea (L.) Mill.) and on spruces such as white spruce (Picea glauca (Moench) Voss) and to a lesser extent on black spruce (Picea mariana (Mill.) BSP). Molecular mechanisms of resistance of white spruce to SBW and its genetic control have recently been reported (Mageroy et al., 2015(Mageroy et al., , 2017Parent et al., 2017). Using a functional genomics approach, these authors reported that the expression levels and function of a β-glucosidase gene, Pgβglu-1, were underpinning natural resistance to SBW in mature white spruce trees. The encoded Pgβglu-1 enzyme catalyzes the cleavage of acetophenone sugar conjugates, picein and pungenin, to release the corresponding acetophenone aglycones (AAs), piceol and pungenol (Mageroy et al., 2015). Pungenol and piceol aglycones commonly accumulate in spruce foliage in the form of the corresponding glycosides, pungenin and picein, but these glycosides appear to be inactive against the insect until they are cleaved by the β-glucosidase Pgβglu-1, which releases the active aglycones. These findings open the possibility to breed for resistance to SBW in spruces by selecting for elevated concentrations of the active aglycones in tested material.
Here, we present a proof-of-concept study of GS for enhancing resistance to defoliation by an herbivory insect and evaluating prospects for multitrait selection implicating resistance, growth, and wood quality. The objectives of the present study were to: (a) estimate the genetic control of AA concentrations related to SBW resistance in white spruce and better understand their genetic relationships with growth and wood quality traits; (b) evaluate the performance of genomic selection models for each of these traits; and (c) estimate the expected genetic gains for AAs alone and in conjunction with the other traits in various multitrait selection schemes.

| Genetic material and phenotyping
Data were collected in white spruce selection plantations established in New Brunswick, Canada, and managed by the New Brunswick Tree Improvement Council (NBTIC). These plantations regrouped full-sib families that were produced by controlled pollinations between 212 elite parents (120 females and 118 males). Families were evaluated in 22 tests established on five sites in as many different localities (Table   S1). For the present study, we sampled 1,310 progeny trees from 136 full-sib families (see Table 1 for descriptive statistics) on which phenotypic measurements were taken. Most families were present on two sites, but a few of them (27) were sampled from a single site.
Families were planted in selection plots of 36-48 trees and were present in only one of the tests present on any given site. The spacing of the selection plantation trials was about 2 m × 2 m with regular suppression of competing vegetation. Total tree height and diameter at breast height (DBH) were measured in 2017. Depending on the test, the age of the trees varied from 16 to 28 years (Table S1), which was taken into account in the statistical analyses. The volume of the 1,310 progeny trees measured for height and diameter at breast height (DBH) was estimated using Honer's equation (Honer, Ker, & Alemdag, 1983). All these trees were genotyped to obtain genomic profiles (see below). Two hundred and eleven (211) additional trees of the same breeding generation previously selected in the same tests were grafted and used to establish breeding gardens. These trees were also sampled and genotyped, but no phenotypic measurements were collected for them.
Acoustic velocity was recorded as a wood quality trait for a subset of 578 progeny trees out of the 1,310 that were measured for height and DBH. Acoustic velocity is a proxy for wood stiffness or modulus of elasticity (MOE) measured at standing trees (Lenz, Auty, Achim, Beaulieu, & Mackay, 2013). We used the Hitman ST300 tool (Fibre-Gen, New Zealand) to obtain acoustic velocity measurements of each tree. The probes were inserted approximately 4 cm into the stem and aligned vertically around breast height at a distance of approximately 70-80 cm apart. For each tree, three readings were taken on the north face of each stem and then averaged.
Measurements were taken in 2018 during a period when there were no severe temperature differences that might have affected velocity measurements (Gao, Wang, Wang, & Allison, 2013).
Quantitative liquid chromatography-mass spectrometry (LC-MS) analysis of acetophenones piceol, pungenol, and picein was conducted from needle samples for the subset of 598 progeny trees. In addition to determining the concentrations of the two acetophenone aglycones piceol and pungenol found in needles and known to be the active compounds related to spruce budworm

| SNP genotyping
Previous to this study, a large registry of gene SNP markers  was constructed for white spruce from extensive EST resequencing and by using the reference white spruce gene catalog GCAT based on full-length cDNAs for aligning sequences and identifying SNPs (Rigault et al., 2011). A total of 213 K high-confidence annotated SNP markers in ~15K genes were discovered and released in the public domain . These markers were used to build the white spruce genotyping chip to obtain a multilocus genomic profile for each of the white spruce trees analyzed here.
Using replicated control samples on each sample plate, the reproducibility rate of the assay was estimated at 99.98%. After removing the nonsegregating markers and applying in-house quality control filters on each SNP such as low rate of missing data (<15%, average of ~0.4%), a minor allele frequency (MAF) ≥0.01, an absolute value of fixation index |Fe| < 0.50, low genotyping error rate (<2%, average of 0.02%) as estimated from replicated control samples, and testing for Mendelian segregation of each SNP within full-sib families as previously described (Lenz, Nadeau, Azaiez, et al., 2020), high-quality genotyping data from 4,148 SNPs were retained for GS analyses.
All trees from the progeny tests and the breeding gardens (total of 1,521 trees) were successfully genotyped. Five progeny trees were filtered out due to low call rate, because of insufficient good-quality DNA. Thus, GS predictions could be made for 1,516 white spruce trees. Missing genotypes (only 0.2% of genotypes) were imputed using a k-nearest neighbor method based on linkage disequilibrium (LD-kNNi) with the software LinkImpute (Money et al., 2015). An accuracy of 0.79 for imputed genotypes was estimated by randomly masking 10,000 genotypes.

| Relationship matrices
All analyses were performed in the R v.3.6.1 environment (R Core Team, 2019). A pedigree-based relationship matrix (A) was first computed based on the registered pedigree information using the function "ainverse" of the R package ASReml-R v.4.1 (Butler, Cullis, Gilmour, & Gogel, 2017). For use in genomic selection (GS) models, the realized genomic relationship matrix (G) was computed with the "A.mat" function of the R package rrBLUP (Endelman & Jannink, 2012) with the default options, which is equivalent to the formula described by VanRaden (2008).

| Heritability and genotype-by-environment interactions
For each trait, variance components, heritability, and breeding values were estimated using the conventional pedigree-based (ABLUP) or the genomic-based (GBLUP) individual-tree mixed models in where y is the phenotype; β represents vectors of fixed effects including the overall mean, the site, the test within site, and the age of the trees fitted as a covariate; a is the random additive genetic effect, with a ∼ N 0, 2 a A ; and e is the residual term, with e ~ N(0, R), where R is a block diagonal matrix specifying a heterogeneous error variance structure for the five sites. For the ABLUP method, the matrix A is the pedigree-based relationship matrix, which was replaced by the realized genomic relationship matrix G for the GBLUP method (a ∼ N 0, 2 a G ). The matrices X and Z 1 are incidence matrices of their corresponding effects. The interaction of site with additive genetic effects, or genotype-by-environment interactions, could not be accurately estimated because the majority of families were present in only one or two sites.
The dominance effect was evaluated using the following additive-dominance model: where y, β, a, and e are as defined in Equation 1, d is the random dominance genetic effect estimated using the family source of variation ( 2 f ) for the ABLUP model, with d ∼ N 0, 2 f I , or using the realized dominance relationship matrix (G Dom ) calculated following Vitezica, Varona, and Legarra (2013), with d ∼ N 0, 2 d G Dom for the GBLUP model. The matrix I is an identity matrix. To test the hypothesis of greater than zero variance for additive and dominance effects (H 0 : σ 2 = 0; H 1 : σ 2 > 0), we performed a likelihood-ratio test between the full model (Equation 1 or 2) and a reduced model without the effect to be tested.

| Correlations between traits
To estimate phenotypic and genetic correlations between traits, bivariate models were run for all pairs of traits in ASReml-R. The following additive-only model was fitted: where y 1 and y 2 are the stacked vectors of phenotypic observations for trait 1 and trait 2 respectively; β is the vector of fixed effects, including an overall mean for each trait, the site within trait, the test within site and within trait, and the age of the trees within trait; a(t) is the random additive effect within trait, with a(t) ∼ N(0, V A ⊗ A); and e is the residual error, with e ∼ N(0, I e ⊗ V R ). The matrix A (ABLUP) was replaced by the realized genomic relationship matrix (G) for the GBLUP method. The total genetic correlations (additive and dominance effects) were estimated using the following model: (1) y = X + Z 1 a + e, (r a , r d , and r e , respectively) and unique variances for each trait (i.e., CORGH variance structure in ASReml). To facilitate convergence, we provided starting values for the variance components in V A , V D , and V R matrices that were taken from the results of the single-trait models (Equation 1 or 2). For r a , r d , and r e , the starting value were set to 0 (i.e., no correlation), but this parameter was allowed to vary in the model. The additive genetic correlation between traits was directly given by the estimated parameter r ⋀ a from model [8], and the total genetic correlation was calculated from model [9] as: are the residual variance of trait 1 and 2, respectively. The significance of the genetic correlations was tested by performing a likelihood-ratio test between the full model in Equation 8 or 9 and a reduced model assuming r a = 0 and r d = 0 (i.e., diagonal V A and V D matrices). The significance of the phenotypic correlations was tested by performing a likelihood-ratio test between the full model in Equation 8 or 9 and a reduced model assuming no correlation between traits (i.e., fixing the parameters r a = 0, r d = 0, and r e = 0).

| Cross-validations
The prediction accuracy of conventional pedigree-based (ABLUP) and genomic selection (GBLUP) models was estimated using the following procedure. For growth traits, the full set of 1,310 phenotyped individual trees was randomly split into 10 folds, each containing ~1/10th of the trees from each family. For each round of CV, nine folds (~1,179 trees) were used in model training, which was used to predict the breeding or genetic values, depending on the model used, for the remaining fold (~131 validation trees). This 10-fold cross-validation was repeated 10 times, for a total of 100 models for each trait. For acetophenone aglycones and acoustic velocity, respectively 598 and 578 progeny trees were successfully phenotyped, and thus, each fold used for model training contained respectively 540 and 520 trees, with the remaining phenotyped trees for these traits set aside for model validation.
The predictive ability (PA) of the models was evaluated as the Pearson correlation coefficient between the predicted breeding or genetic values of the validation trees and the adjusted phenotypes (y*). Adjusted phenotypes were obtained by taking the residuals (y* = e) of a model that included an overall mean, the site, the test within site, and the age fixed effects as described in Equation 1 (Perez & de los Campos, 2014), but with a and d being the random additive and random dominance effects of markers, respectively. The incidence matrix Z 1 ij for the additive effects a denotes genotypes for individual i at marker j coded as the number of the rare allele (e.g., 0, 1, 2). The incidence matrix Z 2 ij for the dominance effects d denotes genotypes for individual i at marker j coded as 0 for homozygous and 1 for heterozygous genotypes (Toro & Varona, 2010).
Both a and d were modeled under BayesCπ, in which only a proportion π of markers have an effect, while a proportion (1 − π) of marker effects are shrunk toward zero. This is modeled by assigning a prior for marker effects (a and d) that is a mixture of a point of mass at zero and a Gaussian slab (Habier, Fernando, Kizilkaya, & Garrick, 2011).
The parameter π is treated as unknown and is assigned a beta prior ∼ Beta p 0 , 0 , with p 0 > 0 and 0 ∈ 0, 1 . We used the default starting parameters provided by BGLR. Similar to ABLUP and GBLUP, we fitted a heterogeneous error variance structure for the five sites in BGLR. Each model was run for 50,000 iterations and a thinning interval of 20, with the first 15,000 iterations discarded as a burn-in. The convergence of the models was verified by running 2 MCMC chains for each trait and by using Gelman-Rubin diagnostic plots (shrink factor < 1.1) implemented in the R package coda (Plummer, Best, Cowles, & Vines, 2006). Genomic estimated breeding values were obtained by summing over the effects of all markers estimated using the additive-only model (Equation 1) where a ⋀ j is the estimated additive effect of the jth marker. The genomic estimated genetic values were estimated from the additive-domi- is the estimated dominance effect of the jth marker. We evaluated the PA and PACC of the BayesCπ model using the same procedure as ind (additive-dominance model), we used the GBLUP heritability estimates so that PACC for ABLUP, GBLUP, and BayesCπ can be compared on equal grounds.

| Narrow-sense heritability estimates
Acetophenone aglycone data showed substantial variation among trees and families ( Figure 1). They were not normally distributed and needed square root transformation in order to comply with the assumptions of normal distribution and homogeneity of the variances of modeling errors. The results of the analyses when only additive effects were considered (Equation 1) are presented in Table 2, and the additive genomic relationships among all the individuals (G-matrix) can be seen on Figure S1. The narrow-sense heritability estimates (h ⋀ 2 ind ) were significantly different from zero for all traits under both ABLUP and GBLUP models (Table 2). They were moderate to high (above 0.50 with ABLUP and 0.40 with GBLUP) for piceol, pungenol, and acoustic velocity, while those of growth traits were low to moderate (between 0.10 and 0.26). In addition, it should be noted that h ⋀ 2 ind estimates were generally lower for GBLUP than for ABLUP. This is because the GBLUP method leads to less biased estimates of quantitative genetic parameters than ABLUP due to the use of more  Table 1 for full description of traits.
accurate estimates of relatedness between trees by relying on their genomic profiles (de Almeida Filho et al., 2009).

| Additive and phenotypic correlations between traits
As anticipated, growth traits were positively correlated with each other at the phenotypic level, and phenotypic correlations between growth traits and picein, piceol, and pungenol were mostly low and negative under the GBLUP model (r ⋀ P = −0.13 to 0.04) ( Table S2). The additive genetic correlations between the acetophenone aglycones and growth traits were also low, negative, and generally not significant, except for a significant correlation between height and piceol under the GBLUP model (r ⋀ a = −0.38 ± 0.15) ( Table 3). Genetic relationships between growth traits and acoustic velocity were also low and not significant. On the other hand, piceol and pungenol concentrations were highly positively correlated (r ⋀ a = 0.60 ± 0.09), whereas picein and pungenol concentrations were negatively correlated (r ⋀ a = −0.65 ± 0.07) (Table 3). Altogether, the results did not show any large adverse genetic correlations between the levels of the needle aglycones piceol and pungenol on the one hand, and acoustic velocity or growth traits on the other hand.

| Broad-sense heritability estimates and genotypic correlations
ABLUP and GBLUP models that included both the additive and the dominance effects (AD) (Equation 2) were used to obtain estimates of total genetic values. Dominance genomic relationships among all the individuals (D-matrix) are presented in Figure S2. Narrow-sense, dominance, and broad-sense heritability estimates can be found in Table 4. When comparing the estimates of narrow-sense heritability obtained with the additive effects models ( Table 2) to those of AD effects models (Table 4), we found a large reduction in additive genetic variance for growth traits under the AD model, which was especially apparent for the ABLUP models. For these traits, most of the genetic variance was assigned dominance variance under the AD model. For the other traits, both estimates of narrow-sense heritability were of the same order of magnitude, especially for piceol and pungenol, for which dominance effects appeared to be low.
Dominance effects were weakly significant only for height growth under the ABLUP model, and for height, DBH, volume, and acoustic velocity under the GBLUP model. AIC and BIC values obtained for each model are presented in Table S4. The AIC showed that the fit of the model was similar between the additive-only effects models and the AD effects models, except for those traits where dominance was significant. However, the BIC was most of the time at the advantage of the additive effects model, except for height where the GBLUP AD model provided a better fit.
The broad-sense heritability estimates (H  Table 5), as well as the phenotypic correlations (Table   S3), between traits were generally of the same order of magnitude and in the same direction than the additive genetic correlations and the phenotypic correlations estimated with the additive-only effects models (Table 3 and Table S2).

| Additive effects
Results of the cross-validation of additive-only effects prediction models (Equation 1) for both the pedigree-based (ABLUP) and the marker-based (GBLUP) models are presented in Table 6. The predictive ability (PA) varied from about 0.40 to 0.48 for aglycones and acoustic velocity, and from about 0.11 to 0.26 for growth traits.
Higher PA values for aglycones and acoustic velocity were anticipated given the higher heritability estimates for these traits. PA was of the same order of magnitude between the ABLUP and the GBLUP models, but it was slightly larger for three of the six traits (DBH, TA B L E 3 Estimates of the additive genetic correlation among the growth, wood quality, and spruce budworm resistance traits obtained with the ABLUP (above diagonal) and GBLUP (below diagonal) models [8] by considering additive effects only. Standard errors are in parentheses  Table 1

| Additive and dominance effects
Cross-validation of AD effects selection models (Equation 2) were used to estimate the precision of genetic value estimates for both pedigree-based (ABLUP) and marker-based (GBLUP) models. A comparison of the PA of the additive-only (Table 6) and the AD effects TA B L E 4 Narrow-sense, dominance, and broad-sense heritability estimates for the growth, wood quality, and spruce budworm resistance traits obtained with the ABLUP and GBLUP models [2] by considering additive and dominance (AD) effects. Standard errors are in parentheses  Table 1 for full description of traits.

TA B L E 5
Estimates of the total genetic correlation among the growth, wood quality, and spruce budworm resistance traits obtained with the ABLUP (above diagonal) and GBLUP (below diagonal) models [9] by considering additive and dominance (AD) effects. Standard errors are in parentheses  Table 1 for full description of traits. b The model did not converge. models (Table 7) indicates that they were similar, with values varying from 0.11 to 0.46, and 0.12 to 0.47, respectively, for the ABLUP models, and from 0.14 to 0.48, and 0.15 to 0.48, respectively, for the GBLUP models. However, after standardizing PA by the square root of the GBLUP broad-sense heritability estimates, the prediction accuracy (PACC) estimates obtained with the AD effects models were slightly lower than those obtained with the additive-only effects models, even for the traits that showed significant dominance effects (Table 4). Similar to additive-only effects models, the lowest estimates were obtained with the method based on PA∕ � H ⋀ 2 ind and varied from 0.27 to 0.68 for ABLUP and from 0.33 to 0.65 for GBLUP (Table 7). Slightly higher accuracies were obtained for all traits, except for piceol, under GBLUP as compared to ABLUP. The prediction accuracy based on true genetic values (TGVs) ranged from 0.60 to 0.97, depending on the method used and the how the true genetic values were estimated. When PACC was estimated using TGVs, we observed the same general trend as for the additive-only effects models (Table 6). For GBLUP and using the TGVs obtained from the GBLUP model, the accuracy of the AD model was lower than that of the additive-only effects model. However, the opposite was observed for the ABLUP model and using the TGVs obtained from the ABLUP model, with the AD model having a higher accuracy than the additive-only effects model. The accuracies of the ABLUP AD model using TGVs from ABLUP were unexpectedly high (above 0.90 for most traits and up to 0.97) and were probably overestimated.
Regarding the BayesCπ additive and dominance effects models, the trends found ( TA B L E 6 Predictive ability (PA) and prediction accuracy (PACC) for the growth, wood quality, and spruce budworm resistance traits obtained with the additive-only effects models ABLUP, GBLUP, and BayesCπ (Equation 1) proportion of marker having an additive effect for pungenol and piceol (π = 0.41 and 0.55, respectively) or dominance effect (π = 0.37 and 0.53, respectively) was in the same range as for growth traits (additive: π = 0.36-0.43; dominance: π = 0.38-0.44), indicating a similar gene control implicating a large number of genes for all traits.

| Genetic gain expectations from genomic selection
We estimated the genetic gains that could be expected from the selection of the top 5% trees for acetophenone aglycones based on either their GEBVs for the establishment of seed orchards, or GEGVs in the context of multiclonal/multivarietal forestry (Park, Beaulieu, & Bousquet, 2016). To do so, and given that all 1516 trees of the multisite selection population were previously genotyped, we inferred breeding and genetic values for the 916 unphenotyped candidate trees for acoustic velocity and acetophenone aglycones by using GS models built from the 598 trees that were both genotyped and phenotyped, thus allowing selection to be applied to the whole selection population. Thus, the 5% selection intensity corresponded to the selection of the top 76 trees among the 1,516 trees for which genomic profiles were obtained. The gains expected from the selection of each trait considered individually are presented on the diagonal of both subtables (Table 8). It can be seen that the gains are slightly higher when the selection is based on GEGVs as compared to GEBVs. Overall, selection at this intensity for aglycones would generate large gains (~37% for piceol and ~45%-47% for pungenol) and would generally not translate into lost in growth traits (maximum loss of 0.4 to 0.6% in volume with selection made for pungenol concentration). Loss in wood stiffness, as measured by acoustic velocity, would be slightly higher, but still not above 1.1%. The 76 progeny trees selected for superior piceol concentration belonged to 21 full-sib families, whereas those selected for high pungenol concentration belonged to 27 full-sib families out of the 136 families tested. However, selection for higher height growth would generate losses of up to 10%-12% and 4.8%-5.4% in piceol and pungenol concentrations, respectively. A selection for volume resulted in smaller losses in piceol and pungenol. Independent culling for the two aglycones would generate 36% and 27% gains in piceol and pungenol,

| D ISCUSS I ON
The most recent spruce budworm outbreaks in the Canadian boreal forest  affected close to 85 million hectares, killing about half of the host trees and leading to a commercial volume loss of 3-68 m 3 /ha dependent on the intensity of the outbreak (Gray & MacKinnon, 2006). Breeding for SBW resistance could help avoid economic loss in forest plantations in the future.
The acetophenone aglycones piceol and pungenol were found to play a major role in SBW resistance in white spruce, and underlying resistance mechanism was discovered in recent years as reviewed in Parent et al. (2020). Although aglycone concentrations in leaf material were shown to be under significant genetic control (Méndez-Espinoza et al., 2018), it was unclear whether accelerated breeding through GS would be possible and would lead to meaningful increment of leaf aglycones in breeding populations.
Here, we present a proof of concept for using genomic selection in the context of improving resistance against insect defoliation in a conifer and by focusing on needle compounds shown to be directly related to insect resistance. We show that the moderate-to-high heritability and high accuracy of GS models led to large expected genetic gains in the concentrations of piceol and pungenol in white spruce needles.

| Strong genetic control of SBW resistance traits makes them candidate for genomic selection
The narrow-sense heritability estimates obtained using the addi- In the present study, we found higher narrow-sense heritability estimates for piceol, pungenol, and picein using ABLUP, but comparable and more realistic estimates using GBLUP. These results confirm that the genetic control of piceol and pungenol is moderate to strong in white spruce and that higher levels of accumulation of these compounds in needles could be achieved through breeding and selection.
Our results are also in line with those previously reported for other quantitative traits. Hence, Li et al. (1993) (Lenz et al., 2013). Globally, all of these estimates are of the same magnitude showing that the genetic control of tree height in white spruce is generally low to moderate. For acoustic velocity, Lenz et al. (2013) reported narrow-sense heritability estimates of 0.38 in white spruce, which is slightly lower than estimates obtained in the current study. Lenz, Nadeau, Azaiez, et al. (2020) analyzed acoustic velocity data collected in both full-sib and polycross white spruce progeny tests. Similar narrow-sense estimates of 0.41 and 0.35 were found for the full-sib and the polycross tests, respectively. Thus, there is agreement between these studies that acoustic velocity is under moderate genetic control in white spruce and, on average, higher than that for growth traits but lower than that for acetophenone aglycones.
It appears that the ABLUP and GBLUP models did not succeed to separate well the additive and the dominance genetic effects, which was especially apparent for the ABLUP AD models. Hence, under the ABLUP AD models, the portion of phenotypic variation due to additive genetic effects (i.e., the narrow-sense heritability) dropped to near zero and all of the genetic variance was assigned to dominance for growth traits, indicating that additive and dominance effects were confounded. Using ABLUP, dominance was only significant for height and the standard errors of dominance estimates were large for all traits. In contrast, with GBLUP significant dominance was found for the three growth traits and acoustic velocity, with much smaller standard errors, which is in line with results reported by Muñoz et al. (2014) for tree height in loblolly pine (Pinus taeda L.) using a clonal trial made of 951 individuals belonging to 61 full-sib families. These authors showed that, compared with using pedigree information, use of the realized genomic relationships in GBLUP models resulted in a substantially more precise separation of additive and nonadditive components of genetic variance. The experimental design available in the present study was not designed to obtain accurate estimates of dominance variance since each family was present in only one test (one family plot) per site in our sampled dataset, and some tests contained a small number of families. Thus, it is likely that the dominance effects of specific crosses were in part confounded with the test effects and the within-test microenvironment differences between family plot effects. However, we consider that the results of the present study and those of previous studies support the idea that the genetic control of the acetophenone aglycones in white spruce is moderate to high and that substantial genetic gains can be expected from the selection of superior genotypes and their vegetative propagation through somatic embryogenesis or cuttings or a combination of both (Park et al., 2016).
Broad-sense heritability was also estimated for white spruce tree height by Park, Weng, and Mansfield (2012). They reported estimates of H ⋀ 2 i = 0.35 for tree height at age 14 using 340 clones from a population of 75 full-sib-families established in three locations in New Brunswick. Their estimate is in line with that obtained in the present study. Wahid et al. (2012) also reported heritability estimates for white spruce tree height using data collected in two clonal tests established in Quebec using 52 clones from 14 full-sib families. Their estimates were slightly lower than ours with values of 0.10 and 0.26, which might be partly due to ontogenic effects in younger material. More studies would be needed to confirm this hypothesis.

| Weak evidence of a trade-off between growth and acetophenone aglycones traits
In the present study, at both the additive genetic and the phenotypic levels, we found highly significant positive correlations between piceol and pungenol concentrations in needles and negative correlations between the concentration of picein and that of pungenol.
Similar relationships between piceol and pungenol concentrations were also found at both the additive genetic and phenotypic levels (r ⋀ a , r ⋀ p ~ 0.80 ± 0.05), by Méndez-Espinoza et al. (2018). However, they did not report any significant negative relationship between the acetophenone glucoside picein and pungenol concentrations in needles, contrary to us. This significant negative relationships are also in contradiction with Mageroy et al. (2015) who reported that high levels of picein were not related to the accumulation of acetophenone aglycones. This discrepancy could be due to various reasons such as differences in the provenance of the population assessed, differences in age of the material sampled (47 years versus 16-28 years in the present study), or samples having not been collected at the same time in both studies (summer versus autumn). The total genetic correlation coefficients (additive-dominance models) between the acetophenone aglycones estimated in the present study are of the same order of magnitude than those previously reported in Méndez-Espinoza et al. (2018). However, their total genetic correlation coefficients were slightly smaller than their additive genetic correlation coefficients, contrary to the trend observed in the present study using either the ABLUP or the GBLUP models.
We also found that the acetophenone aglycone concentrations in needles were either not correlated or marginally negatively correlated with the other traits. This is also in some contrast with those reported previously by Méndez-Espinoza et al. (2018). These The general lack of trade-off between growth and acetophenone aglycones (except for a weak negative genetic correlation between height and piceol concentration) observed from previous studies, at the phenotypic level by Lamara et al. (2018) and at both the phenotypic and genetic levels by Méndez-Espinoza et al. (2018), was confirmed in this dataset. However, this topic needs further investigation across the sympatric area of SBW and white spruce in order to better understand the ecological and evolutionary implications of the resistance mechanism and coevolution.

Additive genetic effects
Because the true breeding values of the tested material are unknown, one preferred method to estimate predictive accuracy is to calculate the Pearson correlation coefficient between the predicted breeding values and the adjusted phenotypes, and then standardize by the square root of heritability (Lenz, Nadeau, Azaiez, et al., 2020;Lenz, Nadeau, Mottet, et al., 2020). Using this approach, we found moderate-to-high predictive accuracies for both ABLUP and GBLUP models, but GBLUP slightly outperformed ABLUP for DBH, volume, and pungenol concentration. Most importantly, our results indicate that genomic selection was highly precise for aglycones as we found the highest accuracies for piceol and pungenol, and for a variety of sites given that the five different test sites were simultaneously considered in the model. This increase in predictive accuracy is likely due to low genotype-by-environment interactions across sites for these traits as reported previously (Méndez-Espinoza et al., 2018).
Our dataset did not allow obtaining accurate estimates of such interactions and to separate them from the average across-site effects given the unbalanced representation of families across test sites.
In this study, estimates of predictive accuracies based on "true breeding values" (TBV) obtained from the models with 100% of phenotypes were also estimated (Table 6) because this approach is very common in the literature (e.g., Beaulieu, Doerksen, Clément, et al., 2014;Chen et al., 2018;Lenz et al., 2017). We found higher predictive accuracy values using this approach, by about 1%-124% depending on the trait, especially when the same model was used to estimate reference true breeding values (i.e., ABLUP predicted versus ABLUP 100% phenotypes, or GBLUP predicted versus GBLUP 100% phenotypes). So far, such approaches based on TBVs tended to overestimate predictive accuracy, and thus, we did not put emphasis on these results.
The genetic control of plant defense sometimes conforms to an oligogenic model, especially in the case of resistance to pathogens (Walters, 2011). To test this hypothesis for acetophenone aglycones involved in herbivory defense against SBW, we used the BayesCπ approach to verify whether it would allow identifying a small number of markers with large effects conferring higher prediction accuracies than that of the GBLUP model, which considers that a quantitative trait is under the control of a large number of loci with small effects.
With BayesCπ, the proportion of markers that had significant effects on aglycone additive-only genetic variance was large (40%-56%) and in the same range than that of the other traits. Moreover, the prediction accuracy of the BayesCπ models was slightly lower than that of the GBLUP models, not higher (Table 6). Taken together, these two trends do not support an oligogenic model of gene effects for piceol and pungenol. This is not unexpected given the skewed distributions of metabolite values (Figure 1), which did not reflect a bimodal or oligomodal distribution, but a more continuous distribution indicative of multiple gene effects. The boxplot of marker effects obtained with BayesCπ shows only one marker having a marginally larger effect, for pungenol than the other SNP markers ( Figure S3a).
But it was still accounting for a small fraction of the total genomic estimated breeding values of the trees, and the relationship of marker effects between piceol and pungenol followed the general polygenic pattern between both traits ( Figure S3b). Given that marker effects estimated with genomic selection methods are not independent and that trees were related in this advanced-breeding population, other materials and approaches such as association and QTL mapping would help further characterize the genomic architecture of these traits. Indeed, a previous association study for piceol and pungenol on white spruce identified a large number of candidate genes with likely implications in the genetic control of these metabolites (Lamara et al., 2018), in accord with the trends observed here.

Additive-dominance genetic effects
In the present study, we did not find marked differences in predictive ability between the additive-only and additive-dominance effects models, and even slightly lower predictive accuracies for the last ones. Such lack of difference in the results between these two types of models has already been noted especially for fusiform rust resistance (oligogenic trait) in a structured breeding population of loblolly pine (Pinus taeda L.). Indeed, de Almeida Filho et al. (2016) showed that both models were largely similar in predictive ability and that the inclusion of dominance effects in the model improved only slightly the predictions for tree height, a polygenic trait. Simulating six traits with different genetic architectures (polygenic and oligogenic), and levels of dominance effects, these authors also showed that AD models improved markedly predictive accuracies only when traits were affected by larger dominance effects (d 2 > 0.2). Bouvet, Makouanzi, Cros, and Vigneron (2016) reported similar finding for 32-month height of Eucalyptus urophylla x E. grandis hybrids clonally replicated on three sites. Absence of improvement was also observed in other organisms, such as dairy cattle, for milk production (Ertl et al., 2014).
In this study, significant dominance effects, though still rather low (d ⋀ 2 ind = 0.10-0.25 using GBLUP), were detected for growth traits and acoustic velocity under the GBLUP model. However, dominance effects were not important in white spruce for aglycones related to resistance to spruce budworm, similar to what was found for resistance to fusiform rust in loblolly pine (de Almeida Filho et al., 2016).
For tree height in white spruce, Weng, Park, Krasowski, Tosh, and Adams (2008) reported that the percentage of the phenotypic variance due to dominance varied between 10% and 15% between age 5 and 14 years. These results are in line with those of the large majority of studies made on a variety of growth and wood traits in pines, showing a net excess of additive over dominance variance (see table A1 in Bouvet, Saya, & Vigneron, 2009). Thus, the dominance effects in white spruce might not be strong enough to realize any advantage for AD over additive-only effects models in terms of prediction accuracy. Indeed, de Almeida Filho et al. (2016) indicated that it is only once the percentage of phenotypic variance explained by the dominance effects reaches 20% that the AD models provide more accurate predictions. Furthermore, dominance effects may not well be estimated due to partial confounding with the test and family plot effects in this study.
Using the BayesCπ additive-dominance model, we found similar or slightly lower predictive accuracies than the GBLUP AD model for acetophenone aglycones, and the proportion of markers with nonzero effect (37%-55%) again did not support a hypothesis of oligogenic control of these traits ( Figure S4a,b).
Overall, we conclude that the additive GBLUP model was sufficient to obtain highly accurate predictions of the concentrations of piceol and pungenol in needles using only a small subset (598 trees out of 1,310) of the breeding population. These very encouraging results should allow for significant cost savings by reducing phenotyping costs for such metabolites implicated in pest resistance, which are cumbersome and expensive to assess.

| Important genetic gain for acetophenone aglycones would increase SBW resistance in future plantations
From this dataset, important genetic gains of more than 35 and 45% for piceol and pungenol concentrations in needles, respectively, were expected when low-intensity selection (top 5%) was based on the respective traits. Besides genetic control, those high percentages are related to the skewed trait distribution of needle concentrations of piceol and pungenol in the population and the fact that less than a fifth of individuals carried meaningful concentrations of those molecules (Figure 1). The effects of piceol and pungenol on larvae survival and development using artificial diets were previously reported (Delvas, Bauce, Labbé, Ollevier, & Bélanger, 2011). Our population mean values for piceol and pungenol concentrations in needles are in the same range as their reference dose (2 × piceol + 2 × pungenol), which lead to 50% less survival of larvae, significant lower purple mass, and a longer development time compared with the control of no piceol and pungenol in the artificial diets. Hence, selecting individuals with strong breeding or genetic values for concentrations in piceol and pungenol should increase significantly SBW resistance in the breeding material. Although SBW would likely feed on more resistant trees, food consumption of larvae will be lower due to antidigestive or toxic effects of the more resistant foliage (Despland et al., 2011), and hence lead to less damage. In addition, bioassays may underestimate the effect of piceol and pungenol as their maximum concentrations in most testing designs are significantly lower than the reported levels in resistant trees Parent et al., 2020). Before budburst, SBW feeds on foliage from the previous year. Furthermore, as the high levels of acetophenone aglycones are usually preserved in the needles of the previous year in resistant phenotypes, it negatively affects early larvae development and survival (Parent et al., 2017). This is in contrast with insects used in experimental rearing trials that initially grow up under rather perfect controlled conditions.
In multitrait selection scenarios, loss of needle concentrations in one or both of piceol and pungenol should be avoided in order to keep trees with adequate levels of acetophenone aglycones leading to meaningful resistance in the breeding population. In this study, we were able to identify correlation breakers, that is, progeny trees that combined good growth and high levels of acetophenone aglycones leading to genetic gain for both categories of traits. Overall, a balanced gain in concentrations of both acetophenone aglycones piceol and pungenol, as well as moderate gain in growth, should be the ideal compromise scenario to look for when breeding for improved SBW resistance in white spruce. Additional field testing would be advisable to help determine the appropriate weights that should be given to each trait and to evaluate the effective gains obtained from selecting more resistant white spruce trees to SBW attacks in terms of reduced defoliation and higher growth in plantation site conditions.

| CON CLUDING REMARK S
We have argued that with climate change and its potential effects on pest distributions and host-pest relationships, pest resistance should be of high priority in tree breeding programs and that the existing genetic variation for resistance traits and relationships with productivity traits should be better understood for their consideration in multitrait breeding. In this study, we evaluated the prospects for genetic improvement of the levels of acetophenone aglycones in white spruce needles, which have been shown to be tightly linked to resistance to spruce budworm. We showed that aglycones were under moderate-to-strong genetic control with null or marginally negative genetic correlations with other traits. The prediction accuracy of GS models for these metabolites was also high and comparable with that of pedigree-based models. These are very encouraging result for their genetic improvement with regard to enhancing white spruce resistance to spruce budworm at a very early stage and without the need to test and phenotype all candidates. GS can thus be deployed early to conduct such selection or to infer breeding and genetic values of unphenotyped material, thus saving time and costs. Finally, we showed that various selection strategies involving GS can be used to simultaneously improve productivity and adaptation traits such as pest resistance, which are essential to face climate change pressures and help maintain health and productivity of forest plantations.

ACK N OWLED G EM ENTS
The authors would like to thank Shona Millican (JDI) for organization

CO N FLI C T O F I NTE R E S T
None declared.

DATA AVA I L A B I L I T Y S TAT E M E N T
In order to comply with Intellectual Property Policies (IPP) of participating governmental institutions in this work, the supporting phenotyping data are not deposited into a public domain database.
The original data are stored in the institutions' databases and may be shared upon request to the corresponding author according to our IPP. The SNP genotyping data are provided as Supporting Information. The SNP genotyping dataset will be deposited into the public domain once the manuscript is accepted.