Genetic improvement of the chemical composition of Scots pine (Pinus sylvestris L.) juvenile wood for bioenergy production

Chemical composition is one of the key characteristics that determines wood quality and in turn its suitability for different end products and applications. The inclusion of chemical compositional traits in forest tree improvement requires high‐throughput techniques capable of rapid, non‐destructive and cost‐efficient assessment of large‐scale breeding experiments. We tested whether Fourier‐transform infrared (FTIR) spectroscopy, coupled with partial least squares regression, could serve as an alternative to traditional wet chemistry protocols for the determination of the chemical composition of juvenile wood in Scots pine for tree improvement purposes. FTIR spectra were acquired for 1,245 trees selected in two Scots pine (Pinus sylvestris L.) full‐sib progeny tests located in northern Sweden. Predictive models were developed using 70 reference samples with known chemical composition (the proportion of lignin, carbohydrates [cellulose, hemicelluloses and their structural monosaccharides glucose, mannose, xylose, galactose, and arabinose] and extractives). Individual‐tree narrow‐sense heritabilities and additive genetic correlations were estimated for all chemical traits as well as for growth (height and stem diameter) and wood quality traits (density and stiffness). Genetic control of the chemical traits was mostly moderate. Of the major chemical components, highest heritabilities were observed for hemicelluloses (0.43–0.47), intermediate for lignin and extractives (0.30–0.39), and lowest for cellulose (0.20–0.25). Additive genetic correlations among chemical traits were, except for extractives, positive while those between chemical and wood quality traits were negative. In both groups (chemical and wood quality traits), correlations with extractives exhibited opposite signs. Correlations of chemical traits with growth traits were near zero. 
The best strategy for genetic improvement of Scots pine juvenile wood for bioenergy production is to decrease and stabilize the content of extractives among trees and then focus on increasing the cellulose:lignin ratio.


1 | INTRODUCTION
Wood is a natural organic material that has been utilized by humans for millennia in many aspects of their daily lives. Its abundance, versatility and environmental sustainability make it the material of choice for a broad range of purposes in both raw (fuel, construction timber, furniture, tools, fence posts, utility poles) and processed (paper, textile fabrics, bioethanol) forms.
Wood is composed of four major chemical components: cellulose, hemicelluloses, lignin, and extractives (Poletto, Zattera, & Santana, 2012). The first three, collectively called lignocellulose, are polymeric macromolecules that constitute wood's structural components. Cellulose microfibrils, together with hemicellulosic chains, are strong in tension and are embedded in a matrix of lignin that resists compression. In conifer species, cellulose, hemicelluloses and lignin commonly represent 40%-50%, 25%-35% and 18%-35% of the total dry weight of stem wood, respectively, jointly making up 90%-96% (Pettersen, 1984; Stevanovic, 2016). Wood extractives, which form the remainder, represent a large and heterogeneous group of low-molecular-weight organic and inorganic compounds (Ekeberg, Flaete, Eikenes, Fongen, & Naess-Andresen, 2006; Fengel & Wegener, 1989), many of which are also used commercially in a number of industrial areas (Nisula, 2018). Along with anatomical structure, the chemical composition of wood is the main determinant of wood mechanical properties and, in turn, of wood quality.
The term "wood quality" does not have a single definition, but it can be simplified to properties that reflect the degree of excellence of wood in one way or another, depending on its intended utilization (Barnett & Jeronimidis, 2003). For instance, it will be mainly characterized by stiffness, strength, density and the presence and extent of reaction wood in construction lumber (Fundova, Funda, & Wu, 2018, 2019; Ramage et al., 2017), while fiber length and chemical composition, such as the ratio of carbohydrates to lignin and other wood components, will be more important in the pulp and paper industries and in biofuel production (Wegner, Skog, Ince, & Michler, 2010).
In the context of a rapidly changing global climate driven by unprecedented greenhouse gas emissions, biofuel from wood is becoming an appealing carbon-neutral energy resource that could potentially replace fossil fuels in the future. Biofuels have been predominantly produced from plant species such as corn, sugarcane and soybean, whose tissues contain high proportions of starch and/or oil (Schubert, 2006). However, lignocellulosic biomass produced by forest trees has been receiving considerable interest in this regard too (Binder & Raines, 2009; Pandey, Larroche, Ricke, Dussap, & Gnansounou, 2011) and is forecast to play an important role in satisfying the rising demand for transportation fuels (Sticklen, 2010). Keeping pace with this demand requires generating satisfactory amounts of wood with suitable properties, in particular with suitable chemical composition.
Provided that chemical compositional traits are heritable and exhibit sufficient genetic variation, they can be included in forest tree breeding programs as target traits and improved via recurrent selection. However, the target traits need to be correlated with some proxy (selection) traits that can be rapidly, inexpensively and non-destructively measured on standing trees, preferably at young ages, because standard wet chemistry protocols (SPPBTC, 2003, 2009; TAPPI, 1991, 2002) are laborious and expensive and thus cannot be applied for large-scale evaluations of breeding populations. Infrared (IR) spectroscopic techniques are among the most promising alternatives in this context (Gebreselassie et al., 2017; Hein & Chaix, 2014; Lepoittevin et al., 2011; Pot et al., 2002; Schimleck, 2008). In particular, near-infrared (NIR) and Fourier-transform infrared (FTIR) spectroscopies (reviewed by Conrad & Bonello, 2016; Xu, Yu, Tesso, Dowell, & Wang, 2013) have been successfully applied for rapid screening of the chemical composition of wood in a number of forest tree species (Cozzolino, 2014). Here, only a small subset of samples (albeit as representative of the population under study as possible) undergoes accurate and precise determination of the chemical composition in the wet lab, while the composition of the remaining samples is predicted from their IR spectral profiles using multivariate regression modeling (Cozzolino, 2014; SAS, 2008; Zhou, Jiang, Cheng, & Via, 2015). FTIR spectroscopy is highly sensitive and accurate and provides information-rich spectra with a number of sharp peaks (Faix, 1992; Hergert, 1971), many of which can be directly related to the presence of one or more chemical components in question. It thus appears to be an ideal alternative to the traditional wet chemistry protocols, especially when limited variation in the chemical composition is anticipated among samples.
The chemical composition of wood follows a general pattern but varies at both inter- and intra-specific levels as well as among different parts and/or age classes of the same trees (Pettersen, 1984; Räisänen & Athanassiadis, 2013). One significant source of variation is the developmental stage of the tree, which determines whether juvenile or mature wood is produced. In conifer species, the transition between them typically occurs between 10 and 20 years of age (Hayatgheibi et al., 2018; Jozsa & Middleton, 1994), but both the starting point and duration are highly variable and may differ among wood traits even within the same tree (Hodge & Purnell, 1993; Yang & Benson, 1997).
Mature wood is more desirable because it is denser, has longer tracheids and contains less lignin and extractives (Loo, Tauer, & McNew, 1985; Sykes, Isik, Li, Kadla, & Chang, 2003). However, its proportion at harvest has gradually declined because forest tree improvement, at least until recently, primarily focused on increasing stem volume (Wilhelmsson & Andersson, 1993) while traits related to wood quality (and to the transition age from juvenile to mature wood) were not considered. Consequently, improved trees reach merchantable dimensions sooner and can thus be harvested earlier than their unimproved counterparts (Petty, MacMillan, & Steward, 1990; Zhou & Smith, 1991), but the transition age has remained unchanged. For instance, rotation periods of loblolly and radiata pines (Pinus taeda L. and Pinus radiata D. Don) in the US and Australia, respectively, were shortened by about 50% to 20-30 years compared with natural stands (Gapare, Wu, & Abarquez, 2006; Pearson & Gilmore, 1980), thus leaving trees with too little time to produce mature wood.
Large genetic variation in the chemical compositional traits has been observed in juvenile wood of several pine species (Shupe, Choong, & Yang, 1996; Sykes, Li, Isik, Kadla, & Chang, 2006). Since these traits are directly linked with the usability of wood for different applications, including pulp, paper and biofuel production, it is appealing to incorporate them in forest tree breeding programs. In this study we intended to (a) quantify the extent of additive genetic variation in growth, wood quality and chemical compositional traits in juvenile wood of Scots pine (Pinus sylvestris L.), (b) estimate all traits' narrow-sense heritabilities, (c) determine the magnitude and direction of phenotypic and additive genetic correlations between all pairs of the studied traits and (d) evaluate the potential of the chemical compositional traits for genetic improvement via selective breeding to produce wood materials with desired chemical compositions.

| Sample population
Samples for this study were selected in two Scots pine (Pinus sylvestris L.) full-sib progeny tests "Skorped" (411-2-H72-Skorped-Y; latitude 63.3444°N, longitude 17.6417°E, altitude 330 m a.s.l.) and "Vännäsby" (411-3-V73- The tests were part of a broader progeny test series and consisted of 199 and 197 controlled crosses, respectively, from seed orchard #411 "Domsjöägnet", three controlled crosses from other seed orchards, six seed stand seedlots and three provenances of lodgepole pine (Pinus contorta Douglas ex Loudon), all planted as 1-year-old seedlings in paper pots. The test sites were divided into 210 and 208 postblocks, respectively, each consisting of 40 trees (4 columns by 10 rows with a spacing of 2.2 m in each direction). The total number of planted trees was 8,390 and 8,320 on the two sites. A subset of 1,245 trees, representing 105 full-sib families (629 and 616 trees, respectively) that were planted at both sites, was included in this study. At each site, 85 families were represented by at least five trees while 20 families were represented by at least 10 trees, provided that enough surviving trees existed on the sites to meet this condition.

| Growth traits
Height (HGT) and diameter at breast height (DBH) at age 28 were obtained from Skogforsk who had measured the plantations in the fall of 2000. Merchantable volume (VOL) was calculated as a function of height and diameter following Brandel (1990) as:

| Wood quality traits
Density

Wood density (RES) was estimated from records obtained in the fall of 2015 using a portable micro-drill Resistograph IML-RESI PD300 (Instrumenta Mechanic Labor), which measures drilling resistance of wood with a resolution of 10 points per mm. Each standing tree was drilled bark to bark c. 1.3 m above ground, with a reasonable distance kept from knots and observable stem damage. The drilling was performed in one direction only (from northeast and west at Skorped and Vännäsby, respectively), as no significant differences had been observed between measurements in two perpendicular directions (Fundova, Funda, & Wu, 2018). When resistograms (drilling resistance values plotted against penetration depths), as immediately displayed on the instrument's screen, exhibited inconsistencies or large deviations from the expected pattern, measurements were repeated. Raw resistograms were adjusted (debarked and linearly detrended to correct for needle friction) following Fundova et al. (2018), and RES was subsequently calculated as the mean of all drilling resistance values along a given resistogram.

Dynamic modulus of elasticity

Acoustic velocity (VEL) was recorded on standing trees in the fall of 2015 using a portable device Hitman ST300 (Fibregen; Carter, Briggs, Ross, & Wang, 2005). The device records the time of flight of mechanically induced dilatational stress waves between two Monitran MTN/P100 accelerometers that are attached to probes hammered into a tree's stem at c. 45-degree angle. Distance between probes, measured with ultrasonic sensors, was maintained between 0.5 and 1.2 m while the lower probe was situated at c. 0.6 m above ground, depending on the occurrence of knots. Measurements were taken in the same directions as the resistograph measurements, i.e., from northeast and west at Skorped and Vännäsby, respectively. The acoustic velocity value entered into the formula below for a given tree was the average of two series of eight successive records, which were taken to account for variation among records on the same tree (Paradis, Auty, Carter, & Achim, 2013). When estimates from the two series differed by more than c. 3%, a third series was generated. Wood stiffness, expressed as the dynamic modulus of elasticity (MOE_d; GPa), was calculated following Bucur (2006) as:

$$\mathrm{MOE}_d = \rho \cdot \mathrm{VEL}^2$$

where VEL is the acoustic velocity (km/s) and ρ is the wood density (kg/m³). Unitless resistograph-based density (RES) was used as a surrogate for wood density, as it had been reported to be highly correlated with x-ray density, with the phenotypic and additive genetic correlation coefficients reaching 0.72 and 0.96, respectively (Fundova et al., 2018). The original resistograph density values were divided by four so that they were scaled down to approach actual density values expressed in kg/m³ (Fundova, Hallingbäck, Jansson, & Wu, 2020). Note that the measurements of VEL and the subsequent calculation of MOE_d were only conducted at the Vännäsby site.
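The stiffness calculation described above can be illustrated with a minimal Python sketch. It assumes the standard relation in which dynamic MOE equals density times squared velocity, with explicit unit conversion to GPa. All function names are hypothetical, and the way the c. 3% disagreement between series is measured (relative to the mean of the two series) is an assumption, not a detail stated in the text.

```python
def density_from_resistograph(res):
    """Scale unitless resistograph density (RES) down by four, as described
    in the text, to approximate density in kg/m^3."""
    return res / 4.0


def mean_velocity(series_a, series_b):
    """Average of the means of two series of acoustic velocity records (km/s)."""
    mean_a = sum(series_a) / len(series_a)
    mean_b = sum(series_b) / len(series_b)
    return (mean_a + mean_b) / 2.0


def series_disagree(series_a, series_b, tolerance=0.03):
    """True if the two series means differ by more than `tolerance`.
    Assumption: the difference is expressed relative to their common mean."""
    mean_a = sum(series_a) / len(series_a)
    mean_b = sum(series_b) / len(series_b)
    return abs(mean_a - mean_b) / ((mean_a + mean_b) / 2.0) > tolerance


def dynamic_moe_gpa(vel_km_s, density_kg_m3):
    """Dynamic modulus of elasticity in GPa from velocity (km/s) and density (kg/m^3)."""
    vel_m_s = vel_km_s * 1000.0              # km/s -> m/s
    moe_pa = density_kg_m3 * vel_m_s ** 2    # kg/m^3 * (m/s)^2 = Pa
    return moe_pa / 1e9                      # Pa -> GPa
```

For example, a velocity of 4.0 km/s and a density of 400 kg/m³ give 6.4 GPa, which lies within the MOE_d range reported for Vännäsby.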

| Chemical compositional traits
In spring 2016, bark to bark increment cores were extracted from standing trees at breast height using a 5 mm core borer (Haglöf), powered by a battery-operated portable device, and were stored in paper straws at c. +23°C and 30% relative humidity. The extraction followed the drilling trajectory of the resistograph measurements performed in the previous season, with sufficient distance (5-10 cm) maintained between them to avoid their crossing. Annual rings 2-6 counted from the pith, representing juvenile wood and corresponding to tree age of c. 9-13 years, were subsequently isolated for chemical and FTIR analyses. Wood samples were ground using a Retsch MM400 ball mill (Retsch GmbH) in two cycles of 40 s each, with a 2 min gap between them to avoid overheating, and either used as such for chemical analyses or manually fine-ground with IR spectroscopy grade KBr (Sigma-Aldrich) to produce a homogenized mix for FTIR analyses. The manual grinding was performed using an agate pestle and mortar at a weight ratio of 1 unit of wood powder to c. 55 units of KBr (Gorzsás & Sundberg, 2014).
The chemical analyses were performed at MoRe Research (Örnsköldsvik) on a subset of 70 trees (34 from Vännäsby and 36 from Skorped), which were selected with the aim of covering as much phenotypic variation as possible in wood density and stiffness as estimated earlier with the resistograph and Hitman, respectively. The content of carbohydrates was determined following the protocol SCAN-CM 71:09 (SPPBTC, 2009). Monosaccharides glucose, xylose, mannose, galactose, and arabinose, denoted hereafter as GLU, XYL, MAN, GAL, and ARA, respectively, were quantified by Dionex ICS-5000 ion chromatography (Thermo Scientific Inc.). The ratio of cellulose (CEL) to hemicelluloses (HEM) was derived from the relative content of GLU and MAN following the formula developed by Sjöström (1993) for softwoods as CEL = GLU − MAN/3 and HEM = 1 − CEL. Total lignin (LIG) was quantified as the sum of Klason and acid-soluble lignin following TAPPI's protocols 222 om-02 (TAPPI, 2002) and UM 250 (TAPPI, 1991), respectively. Analysis of extractives (EXT) followed the Soxhlet extraction protocol SCAN-CM 67:03 (SPPBTC, 2003), with a 9:1 ratio of cyclohexane:acetone. The original protocols were slightly modified to account for low amounts of wood material (c. 200 mg per sample).
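As an illustration of the cellulose-to-hemicelluloses split described above, the following Python sketch applies CEL = GLU − MAN/3 and HEM = 1 − CEL. It assumes that "relative content" means each monosaccharide's share of the summed five monosaccharides; that interpretation, and the function name, are assumptions made for this example only.

```python
def cellulose_hemicellulose_split(glu, man, xyl, gal, ara):
    """Return (CEL, HEM) as fractions of total carbohydrates.

    Inputs are monosaccharide amounts in any common unit. Assumption:
    relative content is the share of the sum of the five monosaccharides.
    """
    total = glu + man + xyl + gal + ara
    if total <= 0:
        raise ValueError("total carbohydrate content must be positive")
    glu_rel = glu / total
    man_rel = man / total
    cel = glu_rel - man_rel / 3.0   # glucose not attributed to glucomannan
    hem = 1.0 - cel                 # remainder assigned to hemicelluloses
    return cel, hem
```

With GLU = 60, MAN = 15, XYL = 15, GAL = 5 and ARA = 5 (arbitrary illustrative values), the function returns a cellulose fraction of 0.55 and a hemicellulose fraction of 0.45.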
FTIR measurements of all 1,245 juvenile wood samples included in this study were performed at the Vibrational Spectroscopy Core Facility of Umeå University using a Bruker IFS 66v/S vacuum bench spectrometer (Bruker Optics) in a 16-unit automatic diffuse reflectance carousel (Harrick Scientific Products Inc.). Spectra were collected over the range of 5,200-400 cm⁻¹ at 4 cm⁻¹ spectral resolution, employing a zero filling factor of 2 and a Blackman-Harris three-term apodization function. Each sample was scanned 128 times in order to attain good signal to noise ratios. Raw spectra were exported using OPUS 7.0 (Bruker Optics) and standardized for subsequent prediction model calibration using an open source graphical user interface available at https://www.umu.se/en/research/infrastructure/visp/. The standardization comprised trimming of the spectra to retain only the so-called "fingerprint" region of 1,869-771 cm⁻¹, baseline correction via asymmetrical least squares fitting (Eilers, 2004), normalization using either total area (TAN) or area minimum-maximum (AMM1 and AMM2) normalization, and smoothing by Savitzky-Golay filtering (Savitzky & Golay, 1964).
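Three of the standardization steps listed above (trimming, total area normalization and Savitzky-Golay smoothing) can be sketched in plain Python as follows. This is not the graphical interface used in the study: the function names are hypothetical, total area is approximated here by the simple sum of absorbances, the smoothing window (five points, quadratic polynomial) is an arbitrary choice, and baseline correction is omitted for brevity.

```python
FINGERPRINT_MIN = 771.0    # cm^-1, lower bound of the retained region
FINGERPRINT_MAX = 1869.0   # cm^-1, upper bound of the retained region

# Classic 5-point quadratic Savitzky-Golay smoothing weights (divide by 35).
SG_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)


def trim_to_fingerprint(wavenumbers, absorbances):
    """Keep only the points whose wavenumber lies in the fingerprint region."""
    kept = [(w, a) for w, a in zip(wavenumbers, absorbances)
            if FINGERPRINT_MIN <= w <= FINGERPRINT_MAX]
    return [w for w, _ in kept], [a for _, a in kept]


def total_area_normalize(absorbances):
    """Scale a spectrum so that its absorbances sum to one.
    Assumption: total area is approximated by the plain sum."""
    total = sum(absorbances)
    if total == 0:
        raise ValueError("spectrum has zero total area")
    return [a / total for a in absorbances]


def savitzky_golay_5pt(values):
    """Smooth with a 5-point quadratic Savitzky-Golay filter.
    The two points at each end are left unsmoothed."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / 35.0
    return smoothed
```

A useful sanity check on the filter is that it leaves a straight line unchanged, since a quadratic fit reproduces any linear signal exactly.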

| Model development
The chemical composition of all 1,245 juvenile wood samples was predicted using partial least squares regression (PLSR) models developed by Funda, Fundova, Gorzsás, Fries, and Wu (2020) in SAS 9.4 (SAS Institute Inc.) based on spectral and chemical compositional data obtained for the selected 70 samples. Standardized FTIR spectra within 771-1,869 cm⁻¹ served as predictor variables (in total 570 variables, each representing absorbance intensity at a given wavenumber) while the nine chemical compositional traits LIG, CEL, HEM, GLU, MAN, XYL, GAL, ARA, and EXT, expressed as percentages of the total dried wood content, served as response variables. Each response variable was modeled separately, and the models were validated using a split-sample cross-validation test, in which groups of every seventh observation were excluded from calibration data sets. The normalized root mean squared error of predictions (RMSEP), provided by SAS 9.4, was used as the benchmark statistic during calibration. Vandervoet's randomization-based model comparison test (Vandervoet, 1994) was applied as the primary criterion for model selection.
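The split-sample validation scheme described above, in which every seventh observation is held out in turn, can be sketched as follows. This is a hedged illustration, not the SAS procedure used in the study: a one-predictor least squares line stands in for the PLSR model, the RMSEP shown is the plain (unnormalized) root mean squared error, and all function names are hypothetical.

```python
import math


def split_sample_folds(n_obs, step=7):
    """Fold k holds observations k, k + step, k + 2 * step, ..."""
    return [list(range(start, n_obs, step)) for start in range(step)]


def rmsep(observed, predicted):
    """Root mean squared error of prediction (not normalized)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(xs, ys):
    """Stand-in model: ordinary least squares with a single predictor."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


def cross_validate(xs, ys, fit, predict, step=7):
    """Predict each held-out group from a model calibrated on the rest."""
    predictions = [None] * len(ys)
    for held_out in split_sample_folds(len(ys), step):
        held = set(held_out)
        train = [i for i in range(len(ys)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            predictions[i] = predict(model, xs[i])
    return rmsep(ys, predictions)
```

With 70 reference samples and a step of seven, this yields seven groups of ten observations, so each calibration set contains 60 samples.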

| Quantitative genetic analyses
All statistical analyses related to the estimation of quantitative genetic parameters were performed in the statistical package ASReml 4 (VSN International Ltd.). The two progeny tests Vännäsby and Skorped were analyzed separately in single-site analyses and, with all data combined, in a multisite analysis. All growth, wood quality and chemical compositional traits (in total 14 variables) were fitted to the following linear mixed model:

$$y_{ijklm} = \mu + g_i + g_j + s_{ij} + t_l + p_k(t_l) + e_{ijklm}$$

where $y_{ijklm}$ is the observation on the mth offspring of the ith and jth parents growing in the kth plot on the lth site, $\mu$ is the overall mean of a given response variable, $g_i$ and $g_j$ are the random general combining ability (GCA) effects of the ith and jth parents, respectively, $s_{ij}$ is the random specific combining ability (SCA) effect for the cross between parents i and j (Griffing, 1956), $t_l$ is the fixed effect of the lth site, $p_k(t_l)$ is the random effect of the kth plot nested within the lth site and $e_{ijklm}$ is the random error term, specific to the ijklmth individual. In the single-site analysis, the site-specific terms $t_l$ and $p_k(t_l)$ were replaced with $p_k$, which is the random effect of the kth plot. Individual-tree narrow-sense heritabilities for each response variable were estimated using variance components obtained from univariate analyses as:

$$h_i^2 = \frac{\sigma_A^2}{\sigma_P^2} = \frac{4\sigma_g^2}{2\sigma_g^2 + \sigma_s^2 + \sigma_e^2}$$

where $\sigma_g^2$, $\sigma_s^2$ and $\sigma_e^2$ denote GCA, SCA and residual variance, respectively, and the numerator and denominator represent additive genetic and phenotypic variances. Standard errors of the heritability estimates were calculated following Taylor series expansion incorporated in the ASReml software (Gilmour, Gogel, Cullis, Welham, & Thompson, 2015). Phenotypic and genetic correlations were calculated as:

$$r = \frac{\sigma_{xy}}{\sqrt{\sigma_x^2 \, \sigma_y^2}}$$

where $\sigma_x^2$ and $\sigma_y^2$ are the phenotypic or additive genetic variances for traits x and y, respectively, and $\sigma_{xy}$ is the phenotypic or additive genetic covariance between the traits estimated by fitting a bivariate mixed model (Gilmour et al., 2015). Expected genetic gain for direct selection ($G_{A_x}$) was calculated as:

$$G_{A_x} = i \, h_{i_x} \, \sigma_{A_x}$$

where i is the selection intensity (i = 2.665), $h_{i_x}$ is the square root of the narrow-sense heritability and $\sigma_{A_x}$ is the additive genetic standard deviation for trait x. The correlated response of the target trait y ($CR_y$) due to selection for trait x was calculated as:

$$CR_y = i \, h_{i_x} \, r_{A_{xy}} \, \sigma_{A_y}$$

where $\sigma_{A_y}$ is the additive genetic standard deviation for the target trait y and $r_{A_{xy}}$ is the additive genetic correlation between the selection trait x and the target trait y.
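The quantities defined in this section can be computed from estimated variance components with a few lines of Python. The sketch below is illustrative only. It assumes the usual full-sib form in which additive variance is four times the GCA variance and phenotypic variance is twice the GCA variance plus the SCA and residual variances; the function names are hypothetical, and in the study the components themselves were estimated in ASReml.

```python
import math


def narrow_sense_heritability(var_gca, var_sca, var_resid):
    """Individual-tree heritability from GCA, SCA and residual variances.
    Assumption: additive = 4 * GCA, phenotypic = 2 * GCA + SCA + residual."""
    additive = 4.0 * var_gca
    phenotypic = 2.0 * var_gca + var_sca + var_resid
    return additive / phenotypic


def correlation(cov_xy, var_x, var_y):
    """Phenotypic or additive genetic correlation from (co)variances."""
    return cov_xy / math.sqrt(var_x * var_y)


def expected_gain(h2_x, sigma_a_x, intensity=2.665):
    """Expected gain from direct selection on trait x, in trait units."""
    return intensity * math.sqrt(h2_x) * sigma_a_x


def correlated_response(h2_x, r_a_xy, sigma_a_y, intensity=2.665):
    """Response in target trait y when selecting on trait x, in units of y."""
    return intensity * math.sqrt(h2_x) * r_a_xy * sigma_a_y
```

The default intensity of 2.665 is the value given in the text. Dividing a gain by the trait mean would express it as a percentage, but how the reported percentages were scaled is not specified here, so that step is left out.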

| Predictive PLSR models
Only one model that performed best in terms of RMSEP was retained for a given response variable (Table 1). The overall predictive power of the models was good, with RMSEP values ranging from 0.302 for EXT to 0.812 for ARA (average RMSEP for all models was 0.613). With the exception of ARA (R² = .679), the explained response variation exceeded 75%. The number of significant factors retained following Vandervoet's test ranged from 1 to 9 (Table 1). FTIR spectra standardized following TAN and AMM gave rise to the most accurate models in five and four response variables, respectively, although the differences in the attained RMSEPs between the best and second best models were minor, on average less than 2%. Following outlier analysis, up to four individuals were removed from the calibration data set. A summary of the performance of all models used in this study is provided in Table 1.

| Analysis of the chemical composition
The chemical composition (Table 2) was highly variable among the 70 sampled trees included in wet chemistry analyses. The major chemical components LIG, CEL, HEM, and EXT were determined to constitute on average 27.1%, 32.6%, 22.9%, and 7.4% of total dry weight, respectively. The highest variability was observed in EXT, which ranged from 2.0% to 19.8%, with the coefficient of variation (CV) exceeding 50%. The sum of the total assigned content approached 90%; the unassigned portion that includes mainly pectin, proteins and inorganic compounds ranged from 2.8% to 14.8%.
Considerable variation in the chemical composition was also observed among the 1,245 predictions. While the means and standard deviations were comparable with those obtained from the wet chemistry analyses, the value ranges were greater among the predictions in all nine chemical traits, in particular in EXT, LIG, GLU, and CEL that had the range 0.8%-29.6%, 9.9%-34.2%, 23.6%-45.8%, and 22.8%-41.5%, respectively. EXT also exhibited the greatest phenotypic variation of all traits (CV = 56.3%). Descriptive statistics of the chemical composition determined from wet chemistry analyses and predicted from FTIR spectra are visualized in Figure 1 and summarized in Table 2.

TABLE 1 Performance of partial least squares regression models for predicting the chemical composition of juvenile wood from standardized FTIR spectra (570 predictor variables representing absorbance intensities at wavenumbers 1,869-771 cm⁻¹)

| Growth and wood quality traits
Descriptive statistics of growth and wood quality traits for all 1,245 trees are shown in Table 3. Substantial variation was observed for DBH (range 8.3-26.6 cm; CV = 19.2%) and especially for VOL (21.7-305.2 dm³; CV = 41.1%). Wood quality traits exhibited some variation too, with VEL the least (2.7-4.7 km/s; CV = 7.6%) and MOE_d the greatest (3.5-12.4 GPa; CV = 20.7%). Note that MOE_d estimates shown in Table 3 were calculated from density values provided by the resistograph and scaled down to 25%; therefore, they might lie slightly above or below true values, depending on the actual scaling coefficient.

| Genetic variation and narrow-sense heritabilities
Coefficients of additive genetic variation (CV_A) for growth, wood quality and chemical compositional traits along with their narrow-sense heritabilities (h²_i), estimated separately for each site as well as jointly for both sites, are presented in Table 4.
CV_A exhibited a consistent pattern across traits at the two sites, with values ranging from 2.7% for MAN at Vännäsby to 36.1% for EXT at Skorped. VOL, MOE_d and EXT exhibited the highest CV_A within the respective trait groups while HGT, VEL, and CEL and GLU exhibited the lowest.
Assessed based on the magnitude of standard errors (Porth et al., 2013), all heritabilities were significant. Heritabilities for growth traits ranged from 0.16 for DBH to 0.38 for HGT in the single-site analyses (both at Skorped), and from 0.10 (also for DBH) to 0.33 for HGT in the multisite analysis. Wood quality traits exhibited the highest heritabilities, exceeding 0.5 for the three traits RES, VEL and MOE_d at Vännäsby, with RES reaching appreciable levels also at Skorped and on both sites combined, 0.30 and 0.42, respectively. The genetic control of chemical compositional traits was mostly moderate and, except for MAN and GAL, also consistent between sites. These two traits differed in their heritabilities between sites by 0.19 and 0.17, respectively, with higher values being attained consistently at Skorped; the remaining

FIGURE 1 Descriptive statistics of the chemical composition of juvenile wood determined using wet chemistry analyses (green boxes; 70 samples) and predicted from standardized Fourier transform infrared spectra (blue boxes; 1,245 samples

| Genetic and phenotypic correlations
Additive genetic (r_A) and phenotypic (r_P) correlations between all pairs of traits are presented in Tables S1-S3. Mostly positive genetic correlations were observed between traits within all three trait groups, i.e., growth, wood quality and chemical compositional traits, and this pattern was consistent across sites as well as in the multisite analysis. The only exceptions to this pattern were strongly negative correlations of EXT with all other chemical compositional traits, ranging from −0.60 to −1.07, and close-to-zero correlations of MAN and XYL with GAL at Skorped and MAN with GAL in the multisite analysis. Within the group of wood quality traits, a strongly positive genetic correlation of 0.81 was observed between RES and MOE_d at Vännäsby; however, this estimate might be inflated, as MOE_d was calculated from resistograph density; that between independently measured RES and VEL was moderate (+0.53). Genetic correlations between traits across groups showed variable patterns. Those between growth and wood quality traits were either negative (DBH and VOL with RES, VEL and MOE_d at Vännäsby) or near zero (all other instances). Those between growth and chemical compositional traits were mostly negligible, with only a few exceptions at Vännäsby (DBH, HGT and VOL with LIG, and HGT also with GLU and CEL), Skorped (HGT with HEM, MAN and XYL, and all growth traits with GAL and ARA) and in the multisite analysis (all growth traits with GAL), which were significant in either direction. An interesting pattern was revealed for relationships between chemical compositional traits and wood quality traits. RES, which was measured at both sites, was either negatively correlated (six, four and six trait pairs for the three models, respectively) or uncorrelated with eight of the nine chemical compositional traits (this group included LIG and all of the carbohydrate traits), while it was in all instances positively correlated with EXT (r_A = 0.45, 0.35 and 0.40, respectively). Furthermore, the two wood-stiffness related traits VEL and MOE_d, which were measured at Vännäsby, followed a similar pattern to that of RES, reaching negative correlations with five (VEL) and six (MOE_d) chemical compositional traits while their correlations with EXT were moderately positive (0.42 and 0.51, respectively).
For easier interpretation, all additive genetic (a) and phenotypic (b) correlation coefficients are visualized in heat maps (Figure 2) following a cluster analysis. All heat maps, referring to Vännäsby (left), Skorped (middle) and the multisite analysis (right), exhibited a similar and strong correlation pattern as described earlier, showing (1)

| Genetic gain and correlated response to selection
Expected genetic gain and correlated genetic response to indirect selection at a selected proportion of 1% (selection intensity i = 2.665) are presented in Table 5. Genetic improvement following direct selection was estimated to range from 5% for DBH and CEL to 56% for EXT, and considerable gains were predicted for VOL (16%) and wood quality traits, reaching over 11% for RES and 33% for MOE_d.
Selection for either of the wood quality traits (RES and MOE_d) would result in a positive and favorable response of the other trait: selection for RES would improve stiffness by almost 26% while selection for MOE_d would improve density by 13%. It would at the same time decrease LIG by 3.6%-6.5%, but also slightly decrease CEL (by c. 2%) and substantially increase EXT (by 24%-35%). Selection for VEL alone would result in an improvement of both target wood quality traits, with 8% gain attained for density and 29% for stiffness. Selection for growth traits would have a negligible effect on chemical compositional traits and a small but unfavorable effect on MOE_d.
Selection for higher CEL would not affect growth traits but it would slightly decrease wood quality traits (RES by 3% and MOE_d by 7%), increase LIG and HEM (by 5%) and substantially reduce EXT (by 40%). The reduction in EXT could be even stronger (56%) if selection were targeted against EXT, while the increase in CEL would be nearly the same. This strategy would, however, result in a larger increase in LIG than the strategy of targeting higher CEL (c. 3% difference). Furthermore, selection against EXT would reduce wood density (by 4%) and stiffness (by 13%).

| DISCUSSION
In forest tree improvement programs, the successful use of IR spectroscopic techniques depends on (a) how well the industrially important mechanical and/or chemical properties that are to be improved via selective breeding can be predicted from IR spectra; (b) the extent of genetic variation present in these traits and their genetic control, i.e., their narrow-sense heritabilities; and (c) knowledge of the genetic relationships among these traits.

| Model reliability
The overall performance of our predictive models was good, with the attained RMSEP and R²_y values conforming to our expectations based on earlier literature.
For cellulose (trait CEL): The predictive model for CEL, the most important component of lignocellulosic biomass in relation to bioethanol and pulp and paper production, reached an RMSEP of 0.72 and R²_y of .81. By comparison, Acquah, Via, Fasina, and Eckhardt (2016) obtained an R² of .72 in their study of loblolly pine biomass acquired from harvesting operations in southern USA, using first-derivative treated FTIR spectra of the fingerprint region, although with a relatively small ratio of performance to deviation (RPD; 1.61). Similar power for predicting this component was reported by Bjarnestad and Dahlman (2002) in a study focusing on the characterization of hardwood and softwood pulps originating from different manufacturing processes. Toivanen and Alen (2006) did not provide a direct estimate for CEL in Scots pine, but their RMSEP for GLU reached 1.7, the highest of all studied traits. Nuopponen, Birch, Sykes, Lee, and Stewart (2006) modeled the chemical composition in Sitka spruce (Picea sitchensis (Bong.) Carrière), Scots pine and 24 different hardwood species and obtained RMSEPs for CEL of 3.3 and 2.8 when most of the mid-infrared (MIR) spectral region and only five principal wavenumbers were included as predictor variables, respectively.
For hemicelluloses (trait HEM): Model performance for predicting HEM was nearly the same as for CEL in this study, with RMSEP and R²_y reaching 0.72 and .79, respectively; however, this component seems to be more difficult to model, at least when only the fingerprint region (c. 1,800-700 cm⁻¹) is retained for model building. For instance, Acquah et al. (2016) observed that R² increased by 10% when full MIR spectra were included, as opposed to the scenario in which all non-fingerprint regions had been cut off. This suggests that non-fingerprint regions might contain relevant information on the presence and content of HEM, in particular as the differences in R² were much smaller for the other three major components. This hypothesis is consistent with Zhou, Jiang, Via, Fasina, and Han (2015), who obtained superior models for HEM compared to those for CEL when spectral data acquired over the whole MIR range of 4,000-650 cm⁻¹ were included; it could not be verified, however, because the fingerprint region alone was not analyzed separately.
For total lignin (trait LIG): In this study, LIG could be predicted from spectral data with a higher accuracy than either of the two major carbohydrate components. The RMSEP for LIG was one-third lower than that for CEL (0.48 vs. 0.72), and this observation was in congruence with several earlier studies (e.g., Nuopponen et al., 2006). Accurate models for predicting LIG were constructed by He and Hu (2013) in a study involving 147 woody species from China and also by Jiang et al. (2014) in an NIR-based study of pine lumber.
For extractives (trait EXT): The model for EXT reached the highest predictive power (RMSEP = 0.30) of all chemical compositional traits included in this study. This high power was likely driven by a band position near 1,693 cm⁻¹, whose absorbance intensity exhibited the strongest association with this composite trait, with correlation coefficients ranging from 0.91 to 0.97 depending on the normalization method applied (Funda et al., 2020). Furthermore, less than 5% of the response variation remained unexplained by the model. A high predictive power for EXT was also reported by Meder, Gallagher, Mackie, Bohler, and Meglen (1999) and Nuopponen et al. (2006) (in the latter study, EXT were referred to as wood resin) and, when only the major wood components were taken into account, EXT performed best in Toivanen and Alen (2006), whose model also explained nearly 97% of the response variation.
Based on the above studies, there seems to be a general pattern in the predictability of the four major components from FT-MIR spectra, with lower accuracy attained for carbohydrates and the highest for lignin and, specifically, extractives. As one possible explanation, Acquah et al. (2016) attributed these results to the similar molecular makeup of polysaccharides versus the specific chemical structures of lignin and extractives, which give rise to highly distinctive patterns of IR absorption bands. Moreover, this pattern seems to hold true for NIR spectra as well: for instance, Acquah, Via, Fasina, and Eckhardt (2015) obtained for loblolly pine logging residues the same ranking in the predictability of major wood components (EXT > LIG > CEL > HEM), and a similar predictability ranking (LIG > EXT > HEM > CEL) has been found for four different hardwood species. Both studies used FT-NIR spectroscopy.
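The model-quality statistics compared throughout this section can be made concrete with a short sketch. The Python snippet below computes RMSEP, a coefficient of determination and RPD from reference and predicted values. It assumes one common set of definitions (R² as one minus the ratio of residual to total sum of squares, RPD as the standard deviation of the reference values divided by RMSEP); the cited studies may have used slightly different variants, and the numbers are hypothetical, not data from this study.

```python
import numpy as np

def prediction_metrics(y_ref, y_pred):
    """Return (RMSEP, R^2, RPD) for reference values and model predictions.

    Assumed definitions (other variants exist in the literature):
      RMSEP = sqrt(mean((y_ref - y_pred)^2))
      R^2   = 1 - SS_residual / SS_total
      RPD   = sample SD of reference values / RMSEP
    """
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_ref - y_pred
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_ref - y_ref.mean()) ** 2)
    rmsep = np.sqrt(np.mean(residuals ** 2))
    r2 = 1.0 - ss_res / ss_tot
    rpd = np.std(y_ref, ddof=1) / rmsep
    return rmsep, r2, rpd

# Hypothetical wet-lab cellulose contents (%) and spectra-based predictions
y_ref = [40.0, 42.0, 44.0, 46.0, 48.0]
y_pred = [41.0, 41.0, 45.0, 45.0, 49.0]

rmsep, r2, rpd = prediction_metrics(y_ref, y_pred)
print(f"RMSEP = {rmsep:.2f}, R2 = {r2:.3f}, RPD = {rpd:.2f}")
```

Because RMSEP is expressed in the units of the trait, it is not directly comparable across components with different value ranges, which is one reason RPD and R² are usually reported alongside it.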

| Genetic control of growth, wood quality and chemical compositional traits
The genetic control of economically important growth (height, stem diameter, volume) and timber quality (density, stiffness, microfibril angle) traits has been well documented in a number of forest tree species (Chen et al., 2014, 2015; El-Kassaby, Mansfield, Isik, & Stoehr, 2011; Hayatgheibi, Fries, Kroon, & Wu, 2017; Hong, Fries, & Wu, 2014; Isik, Li, & Frampton, 2003; Ivkovich, Namkoong, & Koshy, 2002; Pot et al., 2002; Wu et al., 2008). However, less focus has been dedicated to the study of chemical compositional traits, although these participate in forming the overall quality of wood and in turn in determining its suitability for different end products and industrial applications.
Forest trees generally encompass substantial genetic variation in growth and wood quality traits. Their genetic control is usually rather weak for the former and moderate for the latter. Narrow-sense heritabilities (h²_i), similar to those obtained in this study, have been reported for Scots pine's growth traits (Fries, 2012; Fundova et al., 2018; Hong et al., 2014) and slightly higher for Scots pine by Fundova et al. (2020) and Haapanen, Velling, and Annala (1997) specifically for DBH. In our study, heritability of wood density (RES) obtained at Skorped (0.30) was in congruence with that reported by Fries (2012) and Haapanen et al. (1997), whereas estimates closer to the value obtained in the multisite analysis (0.42) were reported by Hong et al. (2014) and Fundova et al. (2018). Heritabilities of VEL measured on standing trees as well as of MOE_d were higher compared to those obtained in other studies on Scots pine (Fundova, Funda, & Wu, 2019; Hong et al., 2014).
Genetic control of the chemical compositional traits does not seem to exhibit a consistent pattern across and within species, as their narrow-sense heritabilities have been reported to range from weak to high, with the majority of them being moderate (0.2 < h²_i < 0.5). Most h²_i obtained in this study were moderate, both at the two sites individually and in the multisite analysis, and their standard errors were within a reasonable range of 0.06-0.12, reaching on average 29.6% of the heritability estimates. Only three monomeric sugars, GLU, MAN and GAL, exhibited low h²_i at Vännäsby. Somewhat lower estimates were reported by Sykes et al. (2006) in a study of juvenile and transition wood in loblolly pine, using the third and eighth annual rings from the pith at breast height, respectively. Their chemical compositional traits CEL and LIG exhibited weak individual-tree heritabilities (0.15 and 0.12) and were associated with relatively high standard errors (0.14 and 0.17, respectively). The high errors were attributed to a small number of parents included in the experiment and possible random genetic drift due to sampling bias. A comprehensive study was provided by Pot et al. (2002), who applied FTIR spectroscopy for estimating genetic parameters of chemical compositional traits in maritime pine (Pinus pinaster Ait.). They observed moderate to high heritabilities for holocellulose (i.e., total carbohydrates comprising CEL and HEM) and LIG (h² = 0.47 and 0.36, respectively); however, after decomposing holocellulose into the two components, only CEL exhibited moderate genetic control (h² = 0.34) while no significant genetic control was detected for HEM, which was under the strongest genetic control in this study. Furthermore, Pot et al. (2002) found no genetic control for EXT. In another study on maritime pine, in which the chemical composition was predicted from NIR spectra, Lepoittevin et al. (2011) obtained substantially lower narrow-sense heritabilities for CEL (0.08) and LIG (0.05 and 0.25 at two different sites), but the moderate genetic control of EXT (0.35) was in congruence with our estimate as well as with estimates obtained in several other studies (Cown, Young, & Burdon, 1992; Lepoittevin et al., 2011; Zhou, Li, Huang, Chen, & Lin, 2000). Moderate to high genetic control of EXT was also reported in Scots pine by Partanen, Harju, Venalainen, and Karkkainen (2011) and Fries, Ericsson, and Gref (2000). Outside the world of conifers, Porth et al. (2013) observed higher h²_i for most chemical compositional traits determined through traditional wet lab approaches in a black cottonwood (Populus trichocarpa Torr. & Gray) population from western Canada and the US. Their estimates were 0.42, 0.35 and 0.65 for CEL, HEM and LIG, respectively, while those for the five monomeric sugars were on average more than twofold higher than those obtained in this study. Standard errors associated with their estimates were, however, also substantially higher, on average 0.16 versus 0.07 over eight chemical compositional traits included in both studies.
Chemical compositional traits exhibited low genetic variation except for GAL and EXT, for which the coefficients of additive genetic variation (CV_A) were three and seven times higher compared to other traits within their trait group, respectively. Slightly lower CV_A were observed by Ukrainetz, Kang, Aitken, Stoehr, and Mansfield (2008) and Pot et al. (2002) for some of these traits.
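For readers less familiar with the genetic parameters discussed in this section, the sketch below shows how narrow-sense heritability, the coefficient of additive genetic variation and the relative standard error of a heritability estimate are commonly defined. It assumes the textbook forms h² = σ²_A / σ²_P and CV_A = 100 × σ_A / trait mean; the variance components, trait mean and standard error are hypothetical placeholders, and the mixed model actually used to estimate the variance components in this study is not reproduced here.

```python
import math

def narrow_sense_heritability(var_additive, var_phenotypic):
    """h^2 = additive genetic variance / total phenotypic variance."""
    return var_additive / var_phenotypic

def cv_additive(var_additive, trait_mean):
    """CV_A (%) = 100 * additive genetic standard deviation / trait mean."""
    return 100.0 * math.sqrt(var_additive) / trait_mean

def relative_se(estimate, standard_error):
    """Standard error expressed as a percentage of the estimate."""
    return 100.0 * standard_error / estimate

# Hypothetical values for a trait expressed in % of dry weight
var_a = 0.36        # additive genetic variance (assumed)
var_p = 1.20        # total phenotypic variance (assumed)
trait_mean = 40.0   # trait mean (assumed)
se_h2 = 0.09        # standard error of the heritability estimate (assumed)

h2 = narrow_sense_heritability(var_a, var_p)
cva = cv_additive(var_a, trait_mean)
rel = relative_se(h2, se_h2)
print(f"h2 = {h2:.2f}, CV_A = {cva:.1f}%, SE = {rel:.0f}% of h2")
```

The example illustrates why a trait can be moderately heritable and still show low CV_A: heritability scales the additive variance by the phenotypic variance, whereas CV_A scales the additive standard deviation by the trait mean.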

| Genetic and phenotypic correlations
The knowledge of the magnitude of additive genetic correlations among traits is essential for understanding the correlated response of different traits to selection (Cheverud, 1988), that is, for quantifying how selection for one trait (or a group of traits) affects the performance of another trait. This question is particularly relevant in forest tree improvement, as some economically important traits that influence the overall economic value of trees or their products such as growth and wood quality traits are often negatively correlated (El-Kassaby et al., 2011; Fries, 2012; Hong et al., 2014; Li & Wu, 2005; Pot et al., 2002; Zhang & Morgenstern, 1995), and thus, their simultaneous genetic improvement might be challenging. In this study, most of the genetic correlations within groups of growth and wood quality traits were positive and significant, although both their direction and magnitude were driven by the mutual dependencies in their estimations, as VOL was calculated from HGT and DBH while MOE_d was calculated from VEL and RES. Moderate positive genetic correlation between VEL and RES at Vännäsby was in line with other studies (Fundova et al., 2019). Correlations between growth and wood quality traits were weakly negative or nonsignificant. Additive genetic correlations between any two traits among the nine chemical compositional traits were in nearly all cases strong and positive. This is in accordance with expectations for the relationships of CEL and HEM with their respective monosaccharide constituents (although Porth et al., 2013 reported no significant genetic relationship and a significant but weak phenotypic relationship between CEL and GLU in black cottonwood), but unlike many other studies (Lepoittevin et al., 2011; Pot et al., 2002; Sykes et al., 2003, 2006) as well as Porth et al. (2013), we observed a positive and strong correlation between CEL and LIG that held for both of the progeny tests included in this study, with genetic correlations ranging from 0.74 to 0.80, while those reported in the aforementioned studies were −0.98, −1.03, −0.99, −0.73 and −0.45, respectively (the first being determined in a clonal trial). One possible explanation for this discrepancy could be the great variation in the content of EXT among trees included in this study (range 0.8%-29.6% as predicted from FTIR spectra or 2.0%-19.8% as determined in the wet lab; CV_P = 56.3% and 53.6%, respectively). Since the chemical compositional traits are expressed in % (or in g × kg⁻¹ of total dry weight), and thus sum up to no more than 100%, one trait (or a group of traits) can only exist at the expense of another trait. In this study, the high variation in EXT strongly influenced the remaining components CEL, HEM and LIG, as their value ranges and CVs were substantially smaller, and perhaps masked true relationships among them. For instance, Lepoittevin et al. (2011) observed a much lower variation in EXT on a site with half-sib progenies compared with our results (mean value = 6.7%, CV_P = 19.0%, CV_A = 11.3%).
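The masking effect proposed above can be illustrated with a small simulation. The sketch below is not an analysis of the study's data: it generates hypothetical trees in which cellulose and lignin are negatively related on an extractive-free basis, adds a highly variable extractive fraction, and then recomputes the correlation on a whole-wood percentage basis. All distributions and parameter values are assumptions chosen only for illustration, and the correlations are phenotypic, not additive genetic.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_trees = 5000

# Hypothetical extractive-free composition (fractions of extractive-free wood).
# Cellulose and lignin are constructed to be NEGATIVELY related on this basis.
cel_noise = rng.normal(0.0, 0.01, n_trees)
cel_ef = 0.45 + cel_noise
lig_ef = 0.30 - 0.8 * cel_noise + rng.normal(0.0, 0.004, n_trees)

# Hypothetical, highly variable extractive fraction of whole wood (1%-30%).
ext = rng.uniform(0.01, 0.30, n_trees)

# Whole-wood basis: every non-extractive component is diluted by (1 - ext).
cel_total = cel_ef * (1.0 - ext)
lig_total = lig_ef * (1.0 - ext)

r_ef = np.corrcoef(cel_ef, lig_ef)[0, 1]
r_total = np.corrcoef(cel_total, lig_total)[0, 1]
r_cel_ext = np.corrcoef(cel_total, ext)[0, 1]

print(f"CEL-LIG, extractive-free basis: {r_ef:+.2f}")
print(f"CEL-LIG, whole-wood basis:      {r_total:+.2f}")
print(f"CEL-EXT, whole-wood basis:      {r_cel_ext:+.2f}")
```

Under these assumed settings, the CEL-LIG correlation is negative on the extractive-free basis but positive on the whole-wood basis, while both components are negatively correlated with EXT. This is the same qualitative pattern as the one reported here, which is why a large spread in EXT could plausibly conceal an underlying trade-off between CEL and LIG.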
The first breeding step for increasing CEL in juvenile wood within the studied material therefore seems to be to decrease EXT and then focus on improving the CEL:LIG ratio. A considerably lower content of extractives was observed among the same trees in mature wood (Funda et al., 2020), and thus, mature wood's correlation matrix might exhibit a different pattern, more closely reflecting the results reported in studies that did not include extractives in their analyses. Sykes et al. (2006) and Porth et al. (2013) obtained negative correlations between CEL and LIG in their studies of loblolly pine and black cottonwood, respectively, but these variables could not be linked to EXT because EXT were not determined as a separate trait. Genetic correlations among individual monomeric sugars were all positive and, judging by the magnitude of their standard errors, in most cases also significant. On the other hand, no clear correlation pattern was observed in black cottonwood (−0.57 for GAL-ARA to 0.67 for MAN-XYL; Porth et al., 2013) and Douglas-fir (Pseudotsuga menziesii (Mirbel) Franco; −1.00 for GLU-ARA to 0.51 for GLU-MAN; Ukrainetz et al., 2008).
In our study, additive genetic correlations between growth and chemical compositional traits were mostly negligible or very weak. The only somewhat meaningful correlations were observed between growth traits and LIG at the Vännäsby site, which were weakly to moderately positive. Similar results based on phenotypic correlations were reported for Douglas-fir by Ukrainetz et al. (2008), who observed that larger trees (with greater height and diameter) contained more lignin and fewer carbohydrates. Although Ukrainetz et al. (2008) did not study CEL individually, the strongly negative genetic correlations observed between growth traits and GLU (−0.65 for HGT to −0.74 for VOL) might suggest a similar relationship between growth traits and CEL too, as most of the glucose in Scots pine is present in the form of cellulose while only a smaller portion exists in the hemicelluloses. A similar observation was reported by Pot et al. (2002) in maritime pine, whose genetic correlation between HGT and LIG was +0.40 while that between HGT and CEL was −0.37. Costa e Silva (1998) also reported positive genetic correlations between LIG and growth traits in a clonal test with Sitka spruce (+0.47 with DBH and +0.34 with HGT). In this study, growth traits and CEL were practically uncorrelated, as in Sykes et al. (2006) or Lepoittevin et al. (2011) in loblolly and maritime pines, respectively; however, the above results indicate that there might exist an unfavorable genetic correlation pattern in growth versus chemical compositional traits from the perspective of tree breeding that aims to increase volume production but, at the same time, improve the CEL:LIG ratio in wood.
Genetic correlations between chemical compositional traits and wood quality traits were studied, e.g., by Ukrainetz et al. (2008), Pot et al. (2002), Sykes et al. (2003), Porth et al. (2013) and Lepoittevin et al. (2011). Positive genetic correlations between CEL and x-ray-based wood density were found by Pot et al. (2002) in maritime pine and Porth et al. (2013) in black cottonwood (+0.62 and +0.74, respectively) as well as by Sykes et al. (2003) in loblolly pine (+0.56), who used volumetric density. In contrast, Ukrainetz et al. (2008) found no significant relationship between GLU and x-ray density. Correlations between LIG and wood density were mostly negative: moderate estimates of −0.54 and −0.42 were reported by Pot et al. (2002) and Porth et al. (2013), respectively. A weakly negative relationship was also observed between LIG and Pilodyn-based density by Costa e Silva et al. (1998) in a clonal test with Sitka spruce. In this study, weak to moderate negative genetic correlations were obtained for resistograph density (and stiffness-related traits at Vännäsby) with LIG as well as with all carbohydrate traits (both poly- and monosaccharides); EXT represented the only exception, as they were positively correlated with RES on both sites as well as with VEL and MOE_d at Vännäsby. This might suggest that the presence of extractives, or at least some of their components, influences either wood density per se or the drill needle penetration during wood density measurements with the resistograph.

| Genetic gain and correlated response to selection
Genetic gains predicted for the studied traits were small for DBH and CEL, moderate for VOL and RES and high for MOE_d and EXT. The correlated genetic response to indirect selection showed that selection for higher CEL would result in increased LIG, as these two components follow the same trajectory due to the positive and strong additive genetic correlation between them. Similarly, it would result in increased HEM. On the other hand, a positive consequence of this breeding strategy would be a substantial reduction in the mean value of EXT, which, together with LIG, represent undesired chemical components, in particular when wood is intended for paper or bioethanol production. Such a selection strategy would have a slightly detrimental effect on wood quality (it would result in a c. 3% loss in density and 7% in stiffness), but it would not compromise tree growth, as both DBH and VOL would remain nearly unaffected. Alternatively, due to the very wide range in EXT observed among the studied trees (~29%), and given the genetic relationships among CEL, LIG and EXT (positive between CEL and LIG and negative between EXT and the other two), the best strategy for genetic improvement of the chemical composition of Scots pine juvenile wood suitable for the abovementioned purposes appears to be to target selection against EXT first, and to focus on increasing CEL under the constraint of no increase in LIG later, once both the mean and range of EXT have been reduced to satisfactory levels.
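The logic of direct and correlated responses described above can be sketched with the standard quantitative-genetic prediction equations, R_x = i × h²_x × σ_Px for the trait under selection and CR_y = i × h_x × h_y × r_A × σ_Py for a correlated trait. The snippet below is only an illustration under these textbook formulas: the selection intensity, heritabilities, genetic correlations and phenotypic standard deviations are hypothetical, and the exact procedure used to predict gains in this study may differ.

```python
import math

def direct_response(i, h2_x, sd_p_x):
    """Expected gain in trait x under direct selection: R_x = i * h2_x * sd_p_x."""
    return i * h2_x * sd_p_x

def correlated_response(i, h2_x, h2_y, r_a, sd_p_y):
    """Expected change in trait y when selecting on trait x:
    CR_y = i * h_x * h_y * r_A * sd_p_y, with h = sqrt(h2)."""
    return i * math.sqrt(h2_x) * math.sqrt(h2_y) * r_a * sd_p_y

# Hypothetical inputs (not estimates from this study)
i = 1.0  # assumed standardized selection intensity

gain_x = direct_response(i, h2_x=0.25, sd_p_x=1.5)
# A trait positively correlated with x moves in the same direction ...
cr_pos = correlated_response(i, h2_x=0.25, h2_y=0.36, r_a=0.8, sd_p_y=2.0)
# ... while a negatively correlated trait moves in the opposite direction.
cr_neg = correlated_response(i, h2_x=0.25, h2_y=0.36, r_a=-0.5, sd_p_y=2.0)

print(f"direct gain in x:          {gain_x:+.3f}")
print(f"correlated change (r_A>0): {cr_pos:+.3f}")
print(f"correlated change (r_A<0): {cr_neg:+.3f}")
```

The sign of the correlated change follows the sign of r_A, which is why selection for higher CEL is expected to raise LIG and HEM (positive r_A) while lowering EXT (negative r_A).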

| Practical implications and conclusion
FTIR spectroscopy, coupled with multivariate regression modeling, has proved to be a promising technique for Scots pine breeding as it can rapidly, inexpensively, non-destructively, and with good accuracy determine the chemical composition of wood in a large number of samples. Unlike wet lab protocols, it requires only fractional amounts of wood material (~5 mg) and thus offers the possibility of skipping the extraction of increment cores from most trees included in the genetic evaluation. This would considerably reduce the stress load imposed on trees, as increment core borers typically create large holes in their stems, which might jeopardize young trees' physical stability as well as increase the risk of fungal infections, in particular in trees that are to be maintained in their plantations or stands for a long time, e.g., until the rotation age. The resistograph appears to be an ideal tool in this regard, especially when wood density is already involved in the evaluation, as wood shavings produced during drill needle penetration can be collected and later utilized in the FTIR analysis.
The extent of the additive genetic variation observed among the studied progeny in the four major chemical compositional traits, i.e., the proportion of cellulose, hemicelluloses, lignin and extractives, in Scots pine juvenile wood, along with their moderate genetic control, indicates that the studied progeny tests possess good potential for future improvement via selective breeding. It seems most appropriate to commence the genetic improvement by decreasing and stabilizing the content of extractives among trees and then focus on increasing the ratio of cellulose to lignin, which is desirable for the paper producing industry as well as for the conversion of wood into biofuel products. Since extractives are moderately to highly heritable, it might also be of practical interest to investigate in more detail the predictability of individual extractive components to determine which one(s) underlie their high correlation with FTIR absorbance intensities.