Genome‐wide quantitative trait loci detection for biofuel traits in switchgrass (Panicum virgatum L.)

Switchgrass (Panicum virgatum L.) has been identified as a potential feedstock for cellulosic ethanol production in the United States because of its high biomass yield and adaptation to marginal lands. Composition of the cell wall plays an important role in bioethanol conversion. A total of 209 pseudo‐F1 testcross progenies obtained from a biparental cross, AP13 × VS16, were grown at three locations from 2008 to 2011. Near‐infrared spectroscopy was used to estimate cell wall composition from biomass harvested at maturity. A linkage map of the pseudo‐F1 testcross was constructed with 8,757 SNPs developed by genotyping‐by‐sequencing. Quantitative trait loci (QTL) analysis was performed on eight lignocellulosic traits, namely, Klason lignin, sugar, glucose, xylose, hexose, ethanol, hexosoic ethanol, and cell wall ethanol conversion percentage. A total of 327 QTL were recorded for the eight lignocellulosic traits. We have identified 111 major regions in the switchgrass genome that underlie these lignocellulosic traits. By scanning the genome sequence for genes flanking the QTL peaks, we identified 45 important genes that are involved in lignin biosynthesis, carbohydrate and sugar metabolism, and other biological and cellular functions. Identification of valuable genes associated with QTL, along with pleiotropic effects of a significant number of QTL, suggests that simultaneous selection and genetic improvement of these traits are possible using marker‐assisted selection.

ethanol from lignocellulosic feedstock (Dien et al., 2006; Vogel & Jung, 2001). Lignin is synthesized in the phenylpropanoid biosynthetic pathway and is covalently linked with plant cell wall carbohydrates, thus making them less accessible for hydrolysis by fermentation enzymes for ethanol production. An RNAi-mediated downregulation of the switchgrass caffeic acid O-methyltransferase gene (COMT), which encodes a key enzyme in the lignin biosynthesis pathway, resulted in a ~38% increase in ethanol production due to the decrease in acid-soluble lignin content and syringyl to guaiacyl (S/G) lignin monomer ratio (Fu, Mielenz, et al., 2011). An increase in ethanol production has also been reported in COMT knockdown lines in switchgrass (Dumitrache et al., 2017). In a cinnamyl alcohol dehydrogenase (CAD) RNAi line, an increase in the release of sugars, especially glucose, with a decrease in lignin content was observed (Saathoff, Sarath, Chow, Dien, & Tobias, 2011). Reduced lignin concentration and increased sugar content in an R2R3-MYB (PvMYB4) transcription factor knockout switchgrass line have been reported (Shen et al., 2013). Overexpressed MYB4 transgenic lines showed an increase in ethanol production (Dumitrache et al., 2017). Decreased lignin content and increased sugar content were reported due to overexpression of the microRNA miR156 (Fu et al., 2012).
Klason lignin (KL) is defined as the major lignin content of most biomass samples, which is the sum of acid-soluble lignin and acid detergent lignin (ADL). Estimates of KL are usually higher than those of ADL, and KL is the more accurate lignin assessment method for forage plants (Hatfield, Jung, Ralph, Buxton, & Weimer, 1994; Jung, Varel, Weimer, & Ralph, 1999). In grasses, the estimated value of KL is two to four times greater than that of ADL (Hatfield & Fukushima, 2005; Hatfield et al., 1994).
In addition to lignin, other cell wall constituents affecting the biochemical conversion of sugar to ethanol are sugar (SUG), glucose (GLC), xylose (XYL), hexose (HEX), hexosoic ethanol (HEXE), and cell wall ethanol conversion percentage (CWEP). Previous reports show that lignin is the major inhibitor of the release of glucose in switchgrass and sorghum (Dien et al., 2008; Xu et al., 2011), whereas the release of the pentose sugar XYL is not much affected by lignin. However, a negative correlation of KL with XYL and positive correlations of GLC with HEX and XYL were reported (Schmer et al., 2012). A positive correlation of ethanol production with sugar and negative correlations with KL and XYL were reported in switchgrass (Vogel, Casler, & Dien, 2017).
Most of the research on lignocellulosic traits has focused mainly on reducing lignin content and altering cell wall composition through transgenic approaches. Apart from the deregulation of the transgenic plants, the other bottleneck in developing transgenic switchgrass is that low lignin transgenic lines may not always exhibit normal growth and development under field conditions (Chen et al., 2010; Fu et al., 2012; Shen et al., 2012). For example, low root growth has been reported in MYB4 transgenic lines (Baxter et al., 2015), which can result in poor agronomic performance and low biomass yield. Exploitation of natural genetic variation in the species has been suggested for the genetic improvement of cell wall recalcitrance in switchgrass (Casler & Boe, 2003; Casler & Vogel, 1999). Genetic improvement of lignocellulose traits through exploring natural genetic variation in conventional breeding as well as genomic-assisted breeding programs has not been elucidated fully in switchgrass. Recently, nine genomic regions controlling sugar and 14 associated with lignin content were identified using simple sequence repeat (SSR) markers in a switchgrass biparental mapping population (Serba et al., 2016). In addition, several genes involved in carbohydrate metabolism, plant development and defense, and transcription factors were detected in the vicinity of these quantitative trait loci (QTL).
We initiated a project for the identification of molecular markers associated with lignin and other cell wall composition traits to improve biofuel production in switchgrass. With this objective, we performed QTL mapping for eight lignocellulose traits: KL, SUG, GLC, XYL, HEX, ethanol (ETOH), HEXE, and CWEP. Here, we report putative candidate genes associated with the QTL in an inter-ecotype pseudo-F1 testcross population. Manipulation of some of the new candidate genes might help research communities further improve biofuel conversion in switchgrass.

| MATERIALS AND METHODS
In all, 349 pseudo-F1 testcross progenies of a switchgrass biparental cross, AP13 × VS16, were evaluated in Ardmore (Ard) and Burneyville (RR), OK, and at Athens, GA, during 2008-2011. The experiments were laid out in an R-256 honeycomb design (Fasoulas & Fasoula, 1995) with four replications at the Oklahoma locations and in a randomized complete block design at Athens, GA. Details of the experimental design, phenotypic data collection, subsampling, and biomass processing are as described in Serba et al. (2013, 2015). Biomass (including leaves, stems, and panicles) was harvested every year after senescence, either by hand or using a customized silage chopper (John Deere Forage Harvester C1200). In general, all the plants senesced uniformly after the first killing frost in the fall. Total fresh biomass data were collected. Details on biomass harvesting and sampling can be obtained from Serba et al. (2016).
Sub-samples of biomass were collected at harvest, dried in forced-air ovens at 60°C, and ground to a 1 mm particle size using a Thomas Model 4 Wiley® mill (Thomas Scientific). Ground samples were scanned using a Foss Model DS 2500 near-infrared spectrometer (FOSS NIR Systems, Inc.). Forage composition for biofuel traits was determined using the near-infrared spectroscopy (NIRS) prediction equation. Trait values were calculated for KL, SUG, GLC, XYL, HEX, ETOH, HEXE, and CWEP. All these traits are considered important characteristics for the production of biofuel from switchgrass feedstock. ANOVA and correlation coefficients were calculated in SAS 9.5 using the Proc GLM and Proc Corr procedures, respectively.
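The pairwise trait correlations described above can be illustrated with a short sketch. It is written in Python with NumPy rather than SAS Proc Corr, and the function name `trait_correlations` and the toy values are hypothetical, invented only to show the calculation; they are not data from this study.

```python
import numpy as np

def trait_correlations(values):
    """Pearson correlation matrix for a (genotypes x traits) array."""
    data = np.asarray(values, dtype=float)
    # rowvar=False tells NumPy that each column (trait) is a variable
    # and each row (genotype) is an observation.
    return np.corrcoef(data, rowvar=False)

# Hypothetical toy values for three traits measured on five genotypes.
# Columns 0 and 1 rise together; column 2 falls as column 0 rises.
toy = np.array([
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 7.0],
    [3.0, 5.0, 6.0],
    [4.0, 8.0, 4.0],
    [5.0, 11.0, 1.0],
])

corr = trait_correlations(toy)
print(np.round(corr, 2))
```

The result is a symmetric matrix with ones on the diagonal; the sign of each off-diagonal entry gives the direction of the relationship between two traits.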
Quantitative trait loci mapping was performed in Windows QTL Cartographer v2.5 using the means as well as the best linear unbiased prediction (BLUP) values of the traits. We used the genetic linkage map generated from 8,757 haplotype SNP markers that was described in Ali et al. (2019). For this QTL analysis, the 209 progenies that had genotype calls for at least 50% of the 8,757 markers were used. QTL were identified and reported using the composite interval mapping approach (Zeng, 1994). Based on 1,000 genome-wide permutations, a logarithm of odds (LOD) threshold of 2.5 was used to declare QTL. A stepwise regression method was used to measure additive genetic variation and phenotypic variation explained (PVE) for the peak marker of each QTL.
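The idea behind a permutation-based LOD threshold can be sketched as follows. This is a simplified illustration and not the Windows QTL Cartographer procedure: it uses single-marker regression instead of composite interval mapping, it assumes markers coded 0/1, and the significance level (`alpha=0.05`) is an assumption because the text does not state one. All function names and the simulated data are hypothetical.

```python
import numpy as np

def single_marker_lod(genotypes, phenotype):
    """LOD per marker from simple regression of phenotype on a 0/1 marker code.

    genotypes: (n_individuals, n_markers); phenotype: (n_individuals,).
    Uses LOD = (n / 2) * log10(RSS_null / RSS_marker) = -(n / 2) * log10(1 - r^2).
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = y.size
    yc = y - y.mean()
    gc = g - g.mean(axis=0)
    rss0 = np.sum(yc ** 2)            # residual sum of squares with no QTL
    sxy = gc.T @ yc
    denom = np.sum(gc ** 2, axis=0) * rss0
    # Squared correlation per marker; monomorphic markers are given r^2 = 0.
    r2 = np.divide(sxy ** 2, denom, out=np.zeros_like(denom), where=denom > 0)
    r2 = np.clip(r2, 0.0, 1.0 - 1e-12)
    return -(n / 2.0) * np.log10(1.0 - r2)

def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide threshold: (1 - alpha) quantile of the maximum LOD under shuffling."""
    rng = np.random.default_rng(seed)
    y = np.asarray(phenotype, dtype=float)
    max_lods = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling phenotypes breaks any marker-trait association
        # while keeping the marker correlation structure intact.
        max_lods[i] = single_marker_lod(genotypes, rng.permutation(y)).max()
    return float(np.quantile(max_lods, 1.0 - alpha))

# Simulated toy data (invented): 100 progenies, 50 markers, true effect at index 10.
rng = np.random.default_rng(1)
geno = rng.integers(0, 2, size=(100, 50))
pheno = 2.0 * geno[:, 10] + rng.normal(0.0, 1.0, size=100)

lod = single_marker_lod(geno, pheno)
threshold = permutation_threshold(geno, pheno, n_perm=200)
print(f"max LOD {lod.max():.1f} at marker {lod.argmax()}, threshold {threshold:.2f}")
```

Taking the maximum LOD over the whole genome in each permutation is what makes the threshold genome-wide, since it controls the chance of any false positive across all markers tested.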
The physical map of the switchgrass genome was scanned 50 kb upstream and downstream of the major QTL peak markers, and annotated genes within these regions were identified using switchgrass v4.1 annotation information (https://phytozome.jgi.doe.gov). QTL positions were drawn on the linkage maps using MapChart software (Voorrips, 2002), with boxes and whiskers calculated as the width (cM) of the QTL region at 1- and 2-LOD values down from the peak LOD scores, respectively, as reported previously (Khanal, Navabi, & Lukens, 2015; Zhang et al., 2017).
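The 50 kb window scan around a peak marker can be sketched as a simple interval query. The sketch assumes that a QTL name such as GLC_c3K_8731090 encodes the trait, the chromosome, and the physical position (bp) of the peak marker; that reading of the naming scheme is an assumption, and the `Gene` records, gene identifiers, and coordinates below are hypothetical and not taken from the v4.1 annotation.

```python
from dataclasses import dataclass

WINDOW_BP = 50_000  # 50 kb on each side of the peak marker, as stated in the text

@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int  # bp
    end: int    # bp

def parse_qtl_name(name):
    """Split a name like 'GLC_c3K_8731090' into (trait, chromosome, position)."""
    trait, chrom, pos = name.split("_")
    if chrom.startswith("c"):
        chrom = chrom[1:]
    return trait, chrom, int(pos)

def genes_near_peak(genes, chrom, peak_bp, window_bp=WINDOW_BP):
    """Return genes on `chrom` that overlap [peak - window, peak + window]."""
    lo = max(0, peak_bp - window_bp)
    hi = peak_bp + window_bp
    return [g for g in genes if g.chrom == chrom and g.start <= hi and g.end >= lo]

# Hypothetical annotation records, invented for illustration.
annotation = [
    Gene("gene_A", "3K", 8_700_000, 8_703_500),  # inside the window
    Gene("gene_B", "3K", 8_790_000, 8_792_000),  # just outside the window
    Gene("gene_C", "1N", 8_730_000, 8_731_500),  # different chromosome
]

trait, chrom, peak = parse_qtl_name("GLC_c3K_8731090")
hits = genes_near_peak(annotation, chrom, peak)
print(trait, chrom, peak, [g.gene_id for g in hits])
```

A gene is counted when any part of it overlaps the window; requiring the whole gene to fall inside the window would be a stricter alternative.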

| RESULTS
Analysis of variance showed significant variation among genotypes of the pseudo-F1 testcross population for KL, SUG, GLC, XYL, HEX, ETOH, HEXE, and CWEP (Table 1). In addition to the genotypic variation, there were significant genotype × environment interactions for the traits. Frequency distributions were nearly normal for all the traits in the population (Figure 1a-h). The parental and population mean data are presented in the frequency graph. Detailed year-location range and mean values are provided in Table S1. The ranges exceeded the average of one or both parents, suggesting that the progenies exhibited transgressive segregation. Significant positive correlations were observed among KL, SUG, GLC, XYL, HEX, and HEXE, with Pearson's correlation coefficients ranging from 0.34 to 0.91 (Table 2). However, these traits showed negative correlations with both ETOH and CWEP. The correlation between ETOH and CWEP was positive and significant.
QTL for GLC (n = 37) explained up to 44.22% PVE. The additive effects varied from −3.8 to 3.7. GLC_c3K_8731090 had the highest PVE, followed by GLC_c7K_3305721 (28.87%) and GLC_c1N_47555984 (28.14%). All three of these QTL showed positive additive effects. In all, 24 of the QTL showed positive effects, whereas 13 showed negative effects. There were 22 QTL detected for XYL; their PVE ranged from 15.2% to 38.3% and their additive effects ranged from −2.4 to 3.3. Of these, 12 were positive and 10 were negative effect QTL. XYL_c3N_51470215 had the highest PVE among positive effect QTL.
In all, 19 major effect QTL mapped for HEX explained up to 26% of the phenotypic variation, with additive effects ranging from −8.5 to 8.7. Among those, 12 QTL showed positive additive effects, of which HEX_c6N_60565874 showed 26.2% PVE and an additive effect of 8.7. In all, 21 QTL were mapped for ETOH with PVE ranging from 15.1% to 29.8%. The additive effect of the QTL ranged from −2.5 to 2.2; 13 were positive and eight were negative effect QTL. ETOH_c8K_6960501 was mapped with the highest LOD score (6.5), showing the highest PVE (29.8%) and an additive effect of 1.4.
In all, 17 main effect QTL were detected for HEXE with PVE up to 31.7%. Among positive effect QTL, HEXE_c9K_37611668 showed the highest PVE (22.7%) with an additive effect of 2.0 and a LOD score of 4.9. The additivity varied from −2.0 to 4.7. Eight QTL showed positive, whereas nine showed negative additive effects. A total of 17 major QTL were identified for CWEP with PVE ranging from 15.1% to 27.7%. CWEP_c8K_8471907 showed the highest PVE with a LOD value of 6.0 and additivity of 1.8. CWEP_c6K_28896378 showed the highest additive effect of 3.3 with PVE of 26.92% and LOD of 5.8. The additivity ranged from −2.9 to 3.3, with 11 QTL exhibiting positive additive effects and six exhibiting negative additivity.
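The PVE and additive-effect values reported above come from regression at the peak marker of each QTL. As a rough illustration of how such quantities relate to the data, the sketch below regresses a phenotype on a single marker coded 0/1. It is a simplification under stated assumptions: the slope here is the difference between the two genotype class means, which may be scaled differently from the additive effects reported by QTL Cartographer, and the function name and toy values are hypothetical.

```python
import numpy as np

def marker_effect_and_pve(marker, phenotype):
    """Regress phenotype on one 0/1 marker; return (slope, PVE in percent)."""
    g = np.asarray(marker, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    gc = g - g.mean()
    yc = y - y.mean()
    slope = (gc @ yc) / (gc @ gc)      # least-squares effect of the marker
    residual = yc - slope * gc
    # PVE is the regression R^2 expressed as a percentage.
    pve = 100.0 * (1.0 - (residual @ residual) / (yc @ yc))
    return float(slope), float(pve)

# Hypothetical toy data: four progenies, two per genotype class.
marker = [0, 0, 1, 1]
phenotype = [1.0, 2.0, 3.0, 5.0]

effect, pve = marker_effect_and_pve(marker, phenotype)
print(f"effect = {effect:.2f}, PVE = {pve:.1f}%")
```

A positive slope means that the allele coded 1 is associated with a higher trait value, and a negative slope means the opposite, which is how positive and negative effect QTL are distinguished.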

| Co-localized and pleiotropic QTL
Many of the aforementioned QTL for each of the biofuel traits were detected repeatedly across years and locations. These QTL either occupied the same location in the switchgrass genome or overlapped. In all, 54 (15.3%) repeat QTL were recorded for all of the traits (Table S2). GLC showed the highest number of repeat QTL (n = 13), whereas only two repeat QTL were observed in HEX and HEXE. A significant number of QTL (172 of 327 QTL, or 52.6%) showed pleiotropic effects, that is, the same QTL appeared for different traits (Table S5). GLC had the highest number of pleiotropic QTL (n = 32), whereas ETOH had the lowest (n = 17).
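One way to flag repeat and pleiotropic QTL is to test whether their map intervals overlap on the same linkage group. The sketch below assumes that overlap of intervals is the criterion, which is one reading of the description above and not a documented rule from this study. The `QTL` record, the interval coordinates, and the pairing logic are hypothetical; only the environment labels (Ard2011, UGA2011, RR2008) follow the naming used in the text.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class QTL:
    trait: str
    env: str         # location-year label, e.g. "Ard2011"
    lg: str          # linkage group, e.g. "1K"
    start_cm: float
    end_cm: float

def overlaps(a, b):
    """True when two QTL lie on the same linkage group and their intervals intersect."""
    return a.lg == b.lg and a.start_cm <= b.end_cm and b.start_cm <= a.end_cm

def classify_pairs(qtls):
    """Split overlapping pairs into repeat (same trait) and pleiotropic (different traits)."""
    repeat, pleiotropic = [], []
    for a, b in combinations(qtls, 2):
        if not overlaps(a, b):
            continue
        if a.trait == b.trait and a.env != b.env:
            repeat.append((a, b))
        elif a.trait != b.trait:
            pleiotropic.append((a, b))
    return repeat, pleiotropic

# Hypothetical intervals, invented for illustration only.
qtls = [
    QTL("SUG", "UGA2011", "1K", 87.0, 98.0),
    QTL("GLC", "Ard2011", "1K", 90.0, 105.0),
    QTL("GLC", "RR2008", "1K", 95.0, 101.0),
    QTL("KL", "Ard2011", "5N", 10.0, 20.0),
]

repeat, pleiotropic = classify_pairs(qtls)
print(len(repeat), "repeat pair(s);", len(pleiotropic), "pleiotropic pair(s)")
```

Counting pairs, as done here, is not the same as counting distinct QTL; a tally of QTL that show pleiotropy would instead collect the unique members of the pleiotropic pairs.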

| QTL from BLUP values
Quantitative trait loci analysis was performed from BLUP values, calculated across all of the locations and years, because we observed significant genotype × year, genotype × location, and genotype × year × location interactions. A total of 42 QTL were identified from all of the traits analyzed with LOD, PVE (%) and additivity varying from 2.5 to 6.7, 11.6 to 30.5, and −3.6 to 2.2, respectively (

| Mapping and annotation of major QTL
We positioned 111 major QTL (10 for KL, 22 for SUG, 23 for GLC, 15 for XYL, 9 for HEX, 7 for HEXE, 15 for ETOH, and 10 for CWEP) throughout the switchgrass genome (Figure 2). QTL boxes and whiskers were calculated as the width of the QTL at 1 and 2 LOD values down from the peak LOD scores, respectively. The highest number of major QTL (15) was mapped on linkage group (LG) 5N, followed by 12 QTL on LG 6N and eight QTL each on LGs 1K, 3K, and 4N. It is interesting to note that LG 5N harbors at least one major QTL for each of the cell wall components studied (Table S2).
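The 1-LOD and 2-LOD drop intervals used for the boxes and whiskers can be sketched as a walk outward from the peak of a LOD profile. This is a minimal illustration that reports interval limits at the scanned positions without interpolating between them; the function name and the LOD profile are hypothetical, and MapChart itself only draws intervals that are supplied to it.

```python
import numpy as np

def lod_support_interval(positions_cm, lod, drop):
    """Return (left, right) map positions where LOD stays within `drop` of the peak."""
    pos = np.asarray(positions_cm, dtype=float)
    scores = np.asarray(lod, dtype=float)
    peak = int(np.argmax(scores))
    cutoff = scores[peak] - drop
    left = peak
    # Extend leftward while the neighbouring position is still above the cutoff.
    while left > 0 and scores[left - 1] >= cutoff:
        left -= 1
    right = peak
    # Extend rightward in the same way.
    while right < scores.size - 1 and scores[right + 1] >= cutoff:
        right += 1
    return float(pos[left]), float(pos[right])

# Hypothetical LOD profile along one linkage group, invented for illustration.
positions = [80, 85, 90, 95, 100, 105, 110]
profile = [1.0, 3.6, 4.6, 5.5, 4.8, 3.6, 1.2]

box = lod_support_interval(positions, profile, drop=1.0)      # 1-LOD interval
whisker = lod_support_interval(positions, profile, drop=2.0)  # 2-LOD interval
print("box:", box, "whisker:", whisker)
```

Because the 2-LOD cutoff is lower than the 1-LOD cutoff, the whisker interval always contains the box interval.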
LG 6N harbored QTL for six of the eight traits. SUG, GLC, HEX, and HEXE QTL localized in a particular region of LG 1K. Seven QTL (three SUG, two GLC, one HEX, and one HEXE) overlapped one another within the 87-105 cM region of LG 1K (Figure 2). A total of 936 annotated genes were co-localized with 103 QTL in the switchgrass genome for all the biofuel traits studied (Table S7). Each of these QTL had at least one annotated gene. QTL GLC_c1K_71432572 (Ard2011) and HEXE_c4N_53973685 (Ard2011) had the highest number of annotated genes (27), followed by SUG_c1K_71440139 (UGA2011, 26 genes) and ETOH_c9N_553950 (RR2008, 24 genes). In all, 11 QTL had only one annotated gene.
We identified 45 candidate genes that are involved in lignin biosynthesis, carbohydrate metabolism, and other important biological and cellular functions (Table 4). Genes annotated as 4-alpha-galacturonosyltransferase related (GAUT1), trehalose-6-phosphate synthase (TPS), and protein stay-green 1 were found associated with KL QTL. Important genes associated with SUG QTL were 4-coumarate:CoA ligase 3 (4CL3), CAD, alpha-amylase 2, and endo-1,4-beta-glucanase. TPS and 4CL3 were also found to be associated with GLC QTL. Pectin methylesterase (PME)/plant invertase and PME inhibitor were associated with XYL QTL. Cellulase/endoglucanase was associated with HEX QTL. We also identified a number of different types of transcription factors for these traits.

| DISCUSSION
A pseudo-F1 testcross population was generated from a biparental cross between two diverse switchgrass ecotypes: genotype AP13 from the lowland ecotype (female parent) crossed to VS16 from the upland ecotype (male parent used for the development of the prediction equation . The global H value of the NIRS equation is less than 3, which indicates reliability of the data (R. Mitchell, personal communication, March 12, 2020). We also compared the total crude protein predicted by the NIRS equation to that of the dry combustion method (Leco Co. CNH-600 elemental analyzer) used in a switchgrass population grown in the Noble Research Institute, LLC greenhouse, and obtained a high correlation (r² = .834). Together, these results suggest that the data generated in this study are fairly reliable. The genotypes of this population showed significant variation and transgressive segregation for all of the biofuel traits studied. The two parents were very distinct in morphological and phenological traits (Casler & Jung, 1999; Hopkins, Vogel, Moore, Johnson, & Carlson, 1995; Serba et al., 2013, 2015). It is possible that favorable alleles from both parents combined in some hybrid progenies and made them better quality feedstock than either parent. Transgressive segregation indicates wider genetic variation and unique gene expression in the progenies compared to the parents; thus, efforts can be made to select the most desirable progenies from the population. Biofuel traits are highly variable and largely dependent on environmental cues and the developmental stages of the plants. Combined analysis of the data showed significant genotype × environment interactions for these traits. Significant genotype × environment interactions have also been reported for cellulose, hemicellulose, and ethanol yield among switchgrass genotypes (Hopkins et al., 1995; Schmer et al., 2012).
Quantitative trait loci analyses using location-year data identified 327 QTL. Moreover, reappearance of 15.3% of the QTL across years for all of the traits suggested stable expression of the QTL across environments. Existence of 72.22% BLUP QTL, especially for KL, SUG, GLC, and XYL, also provides evidence of the quality and reproducibility of the QTL mapping. More than half of the QTL exhibited pleiotropic effects among different traits. A number of pleiotropic QTL showed favorable additive effects, which implies the possibility of simultaneous selection for reduced recalcitrance, especially reduced lignin content, and increased sugar and bioethanol traits.
Conversion of biomass to ethanol largely depends on cell wall carbohydrates, lignin, and hydroxycinnamates. Cell wall components can be genetically modified using both conventional (Casler & Jung, 1999; Sarath et al., 2011; Vogel et al., 2013) and molecular breeding methods including genomic selection (Baxter et al., 2015; Cass et al., 2015; Lipka et al., 2014; Rancour, Hatfield, Marita, Rohr, & Schmitz, 2015; Saathoff et al., 2011; Vogel et al., 2017). KL was negatively correlated with ethanol production, which is in agreement with previous findings (Vogel et al., 2017). It is interesting to note that KL showed a positive correlation with sugar traits. During QTL analysis of cell wall components in maize, Barrière, Méchin, Denoue, Bauland, and Laborde (2010) hypothesized that the biosynthesis of the lignin components, ADL, is partially independent, which can be a reason for the correlation of KL with sugar. ADL was also correlated with cell wall digestibility and ethanol yields in genetically related switchgrass plants (Vogel et al., 2017).
Quantitative trait loci mapping in this population using SSR markers identified GLC QTL in LGs 3a (3K), 3b (3N), 4a (4K), 5a (5K), 7b (7N), and 9a (9N) and XYL QTL in 2b (2K), 3a (3K), and 5a (5K; Serba et al., 2016). These results are in agreement with the findings of this study. Moreover, we reported additional major QTL for GLC, XYL, or SUG traits in all of the LGs. Identification of additional QTL in this study might be due to the use of a high-density linkage map with greater genome coverage. In the same study, lignin QTL were mapped in LGs 1K, 2K, 3K, 4N, 5N, 7K, 7N, 9N, and 9K. We detected major QTL for KL in LGs 2N, 3N, 4K, 5N, 6K, 8K, and 9N. Except for LGs 5N and 9N, the KL QTL identified in this study reside in different LGs from the lignin QTL of Serba et al. (2016). These discrepancies might be due to (a) the low-density SSR map used by Serba et al. versus the high-density SNP map used in this study; and (b) different methods of lignin concentration measurement, that is, MBMS versus NIRS prediction equations.
We identified a larger number of QTL associated with biofuel traits than previous studies, especially on LG 5N, which harbored QTL for all eight biofuel traits studied, and on LG 6N, which possessed QTL for six of the eight traits. In a recent study of switchgrass biofuel traits, Ramstein et al. (2018) reported QTL for carbon and ash content on LG 5N. Scanning the genes residing on LGs 5N and 6N shows that a number of genes with known functions in lignin biosynthesis; cellulose, glucan, and xyloglucan metabolism; pectin modification; and terpenoid synthesis and metabolism are present on these two important switchgrass chromosomes. QTL for SUG, GLC, and HEX are positioned in one region of LG 1K. These important genomic regions and genes can be targets for future studies.
Annotation of the genes flanking the QTL peaks provided information on the important genes that might be involved, especially in lignin biosynthesis and in carbohydrate and sugar metabolism. Major genes in the monolignol biosynthesis pathway such as 4CL3, CAD, and CCR were co-localized with QTL for sugar, GLC, and KL traits. Downregulation or knockout of 4CL1 (Xu et al., 2011) and CAD decreased lignin content and increased sugar content in switchgrass. 4CL2, CCR1, and MYB transcription factor genes were found associated with lignin QTL in maize (Barrière et al., 2010; Barrière, Méchin, Lefevre, & Maltese, 2012) and with lignin and sugar traits in switchgrass (Serba et al., 2016). Downregulation of CCR genes resulted in a 56% reduction in lignin content in poplar (Leple et al., 2007; Ralph et al., 2008). GAUT1 is one of the members of the plant cell wall pectin biosynthetic complex. An Arabidopsis thaliana GAUT1 mutant leads to efficient sugar release in bioconversion processes (Atmodjo et al., 2011). An increase in ethanol production of up to 21% was observed recently in GAUT4 knockdown transgenic switchgrass lines (Dumitrache et al., 2017). PME demethoxylates pectin in the cell wall (Kauss & Hassid, 1967; Roberts, 1990) and alters pectin chemistry (Micheli, 2001; Tieman, Harriman, Ramamohan, & Handa, 1992). GAUT1 and PME were found associated with KL and XYL QTL, respectively. Cellulase and alpha-amylase genes were found associated with hexose and sugar traits. TPS is involved in trehalose synthesis in plants. The Arabidopsis tps1 mutant showed altered expression of genes involved in sugar and pectin metabolism (Gómez, Baud, Gilday, Li, & Graham, 2006). In this study, we observed that the KL and GLC QTL harbor TPS.
Several genes associated with QTL for biofuel traits in this study have already been targeted for manipulation through RNAi and/or genome editing technology (Park et al., 2017;Saathoff et al., 2011;Shen et al., 2013;Xu et al., 2011). Successful implementation of the projects will result in improved switchgrass cultivars suitable for bioethanol production. Genes such as PME and TPS, which are associated with biofuel trait QTL, might be additional targets for altering cell wall components. Further studies of these QTL and associated candidate genes would be valuable in marker-assisted selection (MAS) in switchgrass for improved bioethanol production.

| CONCLUSION
Analyses of a pseudo-F1 switchgrass population generated from a biparental cross, AP13 × VS16, showed wide genetic variation and transgressive segregation for a number of biofuel traits, namely, KL, SUG, GLC, XYL, HEX, HEXE, and CWEP. This study identified several new QTL for biofuel traits that were not reported previously. QTL analyses identified 111 genome-wide major QTL regions associated with these traits. More than 50% of the QTL showed pleiotropic effects with desirable allelic effects. Annotation of genes flanking the QTL peak markers opens new avenues for manipulating important genes for lignin and carbohydrate metabolism to improve the biofuel traits of switchgrass. Moreover, a number of candidate genes identified in this study fall outside the list of genes that have been targeted for manipulation through RNAi or genome editing technologies to improve switchgrass recalcitrance traits. The results suggest that the application of these QTL and associated markers in the genetic improvement of recalcitrance traits through MAS will improve the genetic gain for bioenergy traits in switchgrass breeding.