Gene‐based mapping of trehalose biosynthetic pathway genes reveals association with source‐ and sink‐related yield traits in a spring wheat panel

Abstract Trehalose 6‐phosphate (T6P) signalling regulates carbon use and allocation and is a target to improve crop yields. However, the specific contributions of trehalose phosphate synthase (TPS) and trehalose phosphate phosphatase (TPP) genes to source‐ and sink‐related traits remain largely unknown. We used enrichment capture sequencing on TPS and TPP genes to estimate and partition the genetic variation of yield‐related traits in a spring wheat (Triticum aestivum) breeding panel specifically built to capture the diversity across the 75,000 CIMMYT wheat cultivar collection. Twelve phenotypes were correlated to variation in TPS and TPP genes including plant height and biomass (source), spikelets per spike, spike growth and grain filling traits (sink) which showed indications of both positive and negative gene selection. Individual genes explained proportions of heritability for biomass and grain‐related traits. Three TPS1 homologues were particularly significant for trait variation. Epistatic interactions were found within and between the TPS and TPP gene families for both plant height and grain‐related traits. Gene‐based prediction improved predictive ability for grain weight when gene effects were combined with the whole‐genome markers. Our study has generated a wealth of information on natural variation of TPS and TPP genes related to yield potential which confirms the role for T6P in resource allocation and in affecting traits such as grain number and size confirming other studies which now opens up the possibility of harnessing natural genetic variation more widely to better understand the contribution of native genes to yield traits for incorporation into breeding programmes.


Table S1
Results from the epistasis, signature of selection, heritability per gene and gene family, and gene-based prediction.

Table S3
List of variants significantly associated with source-and sink-related traits from the single variant analysis.
Methods S1 Partitioning the heritability per single (local) gene.  of pairwise SNP LD against distance in base pairs (bp). Curves show nonlinear regression of r 2 on distance.  Expected −log10�p�
We fitted the following mixed linear model: where ̂ was a vector of phenotypic values, was the vector of fixed effects (with and without PC adjustment), was the vector of random additive genetic effects of the global genomewide SNP markers, was the vector of random additive genetic effects of the local gene region markers, and was a vector of random residuals. The incidence matrices for , , and were , , and , respectively. The distributions of random effects were assumed to be ~( , 2 ), ~( , 2 ), and ~( , 2 ), where was the identity matrix, was the global additive GRM calculated using = ′ / , and is the local additive GRM calculated the same way as . is a n × g matrix of scaled and centered markers from n individuals and g is the total number of markers. To build the matrix we used a genotypic incidence matrix coded as 2 for homozygote A1A1, 1 for heterozygote A1A2, and 0 for homozygote A2A2. We extracted genomic estimates of the additive global whole genomic ( 2 ), local gene ( 2 ), and residual ( 2 ) variances, enabling the calculation of local gene heritability as ℎ 2 = 2 /( 2 + 2 + 2 ) and global whole genomic heritability as ℎ 2 = 2 /( 2 + 2 + 2 ). We tested the presence of regional/local gene variance ( 2 ) using a likelihood ratio test [ = −2ln ( 0 / 1 )], where 0 and 1 are the likelihood values for the hypothesis of absence ( 0 : 2 = 0 ) or presence ( 1 : 2 > 0) of regional variance, respectively. We adjusted the P-values for multiple comparisons to control for type I error at α = 0.05 using the Bonferroni procedure (the number of genes tested was considered to set the threshold).

Methods S2. Partitioning the heritability of TPS and TPP gene family
We combined three kernels into the MLM to estimate the proportion of the phenotypic variance explained of TPS and TPP gene families. We used the variants of each gene family (TPS and TPP) to build a local GRM and combined with the effects of the global GRM from the genomewide markers.
We fitted the following mixed linear model: where ̂ was a vector of phenotypic values, was the vector of fixed effects, was the vector of random additive genetic effects of the global genome-wide SNP markers, and were the vectors of random additive genetic effects of the local TPS and TPP gene family markers, and was a vector of random residuals. The incidence matrices for , , , and were , , , and , respectively. The distributions of random effects were assumed to be ~( , 2 ), ~( , 2 ), ~( , 2 ), and ~( , 2 ), where was the identity matrix, was the global additive GRM calculated using = ′ / , and and are the local additive GRM calculated the same way as . is a n × g matrix of scaled and centered markers from n individuals and g is the total number of markers. To build the matrix we used a genotypic incidence matrix coded as 2 for homozygote A1A1, 1 for heterozygote A1A2, and 0 for homozygote A2A2. We extracted genomic estimates of the additive global whole genomic ( 2 ), local TPS and TPP gene family ( 2 and 2 ), and residual ( 2 ) variances, enabling the calculation of local TPS gene family heritability as ℎ 2 = 2 /( 2 + 2 + 2 + 2 ), local TPP gene family heritability as ℎ 2 = 2 /( 2 + 2 + 2 + 2 ), and global whole genomic heritability as ℎ 2 = Methods S3. Gene-based predictive models We predicted the phenotypic traits using the trehalose biosynthetic pathway genes (TPS and TPP kernels) individually and jointly with the whole genome markers (35K SNP Chip). We used population structure variables (matrix of zeros and ones based on Molero et al. (2019) group clustering) as fixed covariates in the model (Lyra et al., 2018). Predictive ability (r) was calculated as the Pearson correlation between adjusted values and genomic estimated breeding values in 50 replications from independent validation scenarios (Albrecht et al., 2014), randomly sampling 75% of the genotypes (n=110) to form a training set, while the remaining 25% (n=37) were used as a validation set. We applied Fisher's Z transformation of the predictive abilities and compared them among models using Tukey's test at α=0.05. All prediction analyses were performed using the BGLR R package (Perez & de los Campos, 2014), using 60 000 Markov Chain Monte Carlo (MCMC) iterations, with 15 000 iterations for burn-in, and keeping only one from every five consecutive iterations to minimize autocorrelation.
First, we used the additive genome-wide marker effects as predictors by fitting the following GBLUP model: where ̂, , , and are the same as those defined in the Eq. (1).
Second, we used the local additive TPS gene family effects as predictors by fitting the following GBLUP model: where ̂, , , and are the same as those defined in the Eq.
(2). The single kernel TPP gene family was used the same way as Eq. (4).
Finally, we combine all kernels (global whole genomic, TPS, and TPP genic effects) in the GBLUP model using the same model as Eq. (2).
Observation. For the Methods S1-S3, we used the phenotypic traits of 147 individuals (only these genotypes were available in the 35K SNP Chip ID). Also, genes with only one variant were removed from the analyses. For the Methods S1 and S2, we used the complete set of individuals (149 lines), and elite and exotic subpopulations independently.