Keywords:

  • candidate-gene-based association mapping;
  • Chinese white poplar;
  • gene-derived simple sequence repeat (SSR);
  • linkage population;
  • PtoCesA gene family

Summary

  1. Top of page
  2. Summary
  3. Introduction
  4. Materials and Methods
  5. Results
  6. Discussion
  7. Acknowledgements
  8. References
  9. Supporting Information
  • Chinese white poplar (Populus tomentosa), an important commercial tree species for timber and pulp production in northern China, has been used to examine the individual genes and allelic diversity responsible for complex traits controlling growth and lignocellulosic biosynthesis. Taking advantage of the low degree of linkage disequilibrium (LD) within P. tomentosa association populations, we examined associations between 15 cellulose synthase (PtoCesA) genes and traits including growth and wood properties.
  • Thirty-six novel simple sequence repeat (SSR) markers within PtoCesA genes were detected by re-sequencing and genotyped in an association population (460 individuals). Single-marker and haplotype-based LD approaches were used to identify significant marker–trait associations. Family-based linkage studies and real-time PCR testing were conducted to validate the functional significance of SSR variation.
  • Fifteen single-marker associations from seven PtoCesA genes and nine haplotype-based associations within six genes were identified in the association population (false discovery rate Q < 0.05). Next, five SSR marker–trait associations (Q < 0.05) from four PtoCesA genes were successfully validated in a linkage mapping population (1200 individuals).
  • The results imply a functional role for these genes in mediating wood properties, demonstrating the potential of combining single-marker and haplotype-based LD approaches to detect functional allelic variation underlying quantitative traits in a low-LD population.

Introduction

Wood formation distinguishes trees from herbaceous plants, and represents a major metabolic sink for woody plants, as trees convert much of their photosynthesized products into woody tissues, which make up c. 20% of the total terrestrial carbon storage (Schlesinger & Lichter, 2001; Li et al., 2006). Woody tissues are composed of various biopolymers; cellulose and lignin supply mechanical strength to secondary walls, and hemicelluloses form cross-links among cellulose microfibrils. These polymers provide an enormous, renewable feedstock for pulp and paper, biofuel, and solid wood products (Mellerowicz & Sundberg, 2008).

Cellulose is the most abundant biopolymer in trees; its biosynthesis is catalyzed by cellulose synthase (CesA) and involves the synthesis and assembly of β-1,4-glucan chains by cellulose synthase complexes (CSCs) and the orderly deposition of the chains to form microfibrils in cell walls (Somerville, 2006). Previous studies have indicated that the proteins for cellulose biosynthesis are encoded by CesA genes, a multigene family with many members (Li et al., 2006; Suzuki et al., 2006). Plant CesA genes were first identified in cotton (Gossypium hirsutum) fibers (Pear et al., 1996). There are at least 10 CesA genes in Arabidopsis, 12 in rice (Oryza sativa) and at least nine in maize (Zea mays); distinct CesA genes dominate cellulose synthesis in different types of cell wall (Pear et al., 1996; Richmond & Somerville, 2000; Tanaka et al., 2003; Burton et al., 2004; Persson et al., 2007).

Recently, the Populus trichocarpa genome has been sequenced and annotated (http://genome.jgi-psf.org/Poptr1/Poptr1.home.html), and databases of expressed sequence tags (ESTs) from different developmental stages of wood formation are being rapidly generated (Sterky et al., 2004; Tuskan et al., 2006). Poplar (Populus) has emerged as a model species for the identification of CesA homologs and exploration of the mechanisms of cellulose biosynthesis (Djerbi et al., 2004; Zhang et al., 2010b). The first CesA gene from trees was isolated from aspen (Populus tremuloides) by Wu et al. (2000). Since then, 17 members of the CesA gene family have been cloned from aspen and hybrid aspen (Populus tremula × P. tremuloides) (Djerbi et al., 2004; Joshi et al., 2004; Supporting Information Table S1). Eighteen CesA genes (17 protein sequences) have been identified in P. trichocarpa (Suzuki et al., 2006; Table S1). In addition, several CesA genes are specifically expressed during primary or secondary wall synthesis in several plant species, such as Arabidopsis, black cottonwood (Populus trichocarpa) and loblolly pine (Pinus taeda L.) (Djerbi et al., 2005; Nairn & Haselkorn, 2005; Suzuki et al., 2006; Atanassov et al., 2009; Song et al., 2010). To further explore the functions of the CesA genes in wood formation, 17 CesA (PtoCesA) genes have been isolated from Populus tomentosa (Chinese white poplar; Table S1).

Populus tomentosa, which belongs to the section Populus in the genus Populus, is an important commercial tree species for timber and pulp production in northern China. A vast amount of genetic variation has arisen during the evolution of P. tomentosa, as is evident in the natural populations (Zhang et al., 2007); this variation provides a potential source of beneficial alleles for marker-assisted breeding for improvement of wood fiber traits. Ongoing research has also examined the structure of the natural populations. Huang (1992) was the first to provide climatic regionalization in the distribution zones of P. tomentosa and show that three climatic zones can be treated as genetic regions. A P. tomentosa population of 460 individuals has been divided into 11 subpopulations using 20 genomic microsatellites in the model-based program STRUCTURE (Du et al., 2012), and this population structure information was used in the association analysis in this study.

Linkage disequilibrium (LD)-based association mapping provides a valuable opportunity to identify the natural allelic variation responsible for a particular phenotype (Thumma et al., 2005). Tree species are ideal for the fine mapping of candidate genes and functional analysis of gene variants (Ingvarsson, 2005; Wegrzyn et al., 2010). Recently, advances in high-throughput marker technologies and new genomic resources have enabled a closer examination of the number and effect of candidate genes related to traits of interest, through complex trait dissection using LD mapping (Nordborg et al., 2002; Ingvarsson et al., 2008; Eckert et al., 2009). Wood-quality traits are quantitative traits controlled by multiple genes, with a moderate to high degree of heritability (Thumma et al., 2010). Significant associations between single nucleotide polymorphisms (SNPs) within candidate genes affecting wood formation have been established for forest trees (Thumma et al., 2005, 2009; Gonzalez-Martinez et al., 2007; Wegrzyn et al., 2010). In marker-assisted selection (MAS) breeding, simple sequence repeat (SSR) markers are ideal because they are hypervariable, codominant, and highly informative (Varshney et al., 2005). Unlike random genomic SSR markers, gene-derived SSR markers include microsatellites exclusively within candidate genes, including promoters, 5′ untranslated regions (UTRs), 3′ UTRs, introns, splice sites and exons (Varshney et al., 2005). Indeed, the presence of SSRs in the coding and/or regulatory regions can alter function, transcription or translation (Li et al., 2004; Varshney et al., 2005).

In this study, 36 polymorphic SSR markers developed from the PtoCesA gene family were used for single-marker and haplotype-based association mapping to explore allelic effects on natural variation in growth and wood-property traits in P. tomentosa. Furthermore, we also present experimental evidence to confirm the power of LD mapping to identify useful alleles located within functional genes controlling phenotypic traits.

Materials and Methods

Plant materials

Association population

In 1982, the Institute of Chinese White Poplars (Beijing Forestry University, Beijing, China) assembled a collection of 1047 individuals from the entire natural distribution region of Chinese white poplar (Populus tomentosa Carrière), covering an area of 1 million km² (30–40°N, 105–125°E; Zhang et al., 2010b). Root segments from this collection were used to establish a clonal arboretum using a randomized complete block design with three replications in Guan Xian County, Shandong Province, China (36°23′N, 115°47′E). In this study, a set of 460 unrelated individuals of P. tomentosa from the collection, representing all of the original provenances, was randomly sampled for association analysis.

Linkage population

In this study, 1200 hybrid individuals were randomly selected from 5000 F1 progeny produced by a controlled cross between two elite poplar parents, clone ‘YX01’ (Populus alba × Populus glandulosa) as the female and clone ‘LM 50’ (P. tomentosa) as the male; both parents belong to the section Populus. The progeny were grown in 2008 in the Xiao Tangshan horticultural fields of Beijing Forestry University, Beijing, China (40°2′N, 115°50′E) using a randomized complete block design with three replications and were subsequently used for linkage analysis of phenotypic traits.

Phenotypic data

The 460 individuals of the association population were scored on the basis of seven quantitative traits, with at least three ramets per genotype. The growth traits, including tree height (H), diameter at breast height (D), and stem volume (V), were measured in 2009 using the methods described by Zhang et al. (2006). Wood-property traits included microfiber angle (MFA) and holocellulose, α-cellulose, and lignin contents. First, wood cores were collected from each tree at a height of 1.35 m above ground level, and the variation in MFA in these cores was characterized using an X-ray powder diffractometer (Philips, Eindhoven, the Netherlands). The wood cores were then ground into wood meal, in which the holocellulose, α-cellulose, and lignin contents were measured using near-infrared reflectance spectroscopy (NIRS), as described by Schimleck et al. (2004).

The same seven phenotypic traits were measured in all three replicates of the 1200 clones in the hybrid population in 2010, using the methods described in the preceding paragraph for the association population. SAS for Windows, version 8.2 (SAS Institute, Cary, NC, USA) was used for analysis of variance (ANOVA) and to estimate phenotypic correlations for these seven traits.

DNA extraction, SSR discovery, and genotyping

Total genomic DNA was isolated from young leaves using the DNeasy Plant Mini Kit (Qiagen China, Shanghai, China) following the manufacturer's protocol. The genomic DNA sequences of 17 PtoCesA genes were obtained from the Joint Genome Institute (JGI) Database (http://genome.jgi-psf.org/Poptr1_1/Poptr1_1.home.html; Suzuki et al., 2006; Kumar et al., 2009). In total, 105 903 bp of genomic DNA sequence from these 17 unique PtoCesA genes, with an average of 4645 bp per gene, was obtained by re-sequencing 40 P. tomentosa individuals, and the gene length ranged from 4462 bp (PtoCesA2) to 7633 bp (PtoCesA17; Table S1). We detected 36 polymorphic SSR loci within the PtoCesA gene family using SSRIT software (http://www.gramene.org/db/markers/ssrtool; Temnykh et al., 2001), with the criterion that the minor allele frequency was ≥ 5% (Fig. 1). Detailed information on these 17 candidate genes and their homologous reference genes in Populus species, and the 36 screened SSR markers, is presented in Tables S1 and S2.

Figure 1. The positions and polymorphisms of simple sequence repeat (SSR) loci (e.g. C4-SSR1 and C4-SSR4) located in different regions of the cellulose synthase gene 4 (PtoCesA4) in Populus tomentosa individuals. C4-SSR1 was located in the promoter region of PtoCesA4, with the number of repeats of the motif (AT) ranging from 8 to 34; C4-SSR4 was in an intron region, with the number of repeats of the motif (TACTGC) ranging from 2 to 6.

The SSR amplification reaction and PCR were conducted following the procedure of Zhang et al. (2010a). After confirmation of PCR amplification on a 1.5% agarose gel, the PCR products were separated by capillary electrophoresis using an ABI3730xl DNA Analyzer (Applied Biosystems, Carlsbad, CA, USA). The analysis of polymorphic loci was performed with GeneMapper v4.0 software (Applied Biosystems) using the LIZ 600 size standard (Applied Biosystems). Subsequently, MICRO-CHECKER 2.2.3 (http://www.microchecker.hull.ac.uk/) was used for identifying and correcting genotyping errors (van Oosterhout et al., 2004).

Real-time PCR testing

Real-time PCR (RT-PCR) was performed using cDNA samples, which were reverse-transcribed from the total RNA in mature xylem tissue of P. tomentosa individuals (10 individuals per group, and each tree was homozygous for the particular haplotype). The quantitative PCR program and the generated real-time data analysis were performed as described by Zhang et al. (2010a). The specific primer pairs were individually designed for the PtoCesA genes (depending on the single-marker and haplotype-based associations) and an internal control (Actin) using Primer Express 3.0 software (Applied Biosystems). Primer details are shown in Table S3.

Data analysis

Genetic diversity, Hardy–Weinberg equilibrium (HWE) and LD tests

The summary statistics for population diversity, including the observed number of alleles per locus (NA), polymorphism information content (PIC), expected heterozygosity (HE), and Wright's inbreeding coefficient (FIS), were calculated using POPGENE version 1.32 (Yeh et al., 1999). HWE tests were performed using the software Arlequin version 3.11 (http://cmpg.unibe.ch/software/arlequin3/), and the Bonferroni correction was applied for multiple testing. Patterns of LD were investigated among SSR loci from 15 PtoCesA candidate genes (Table S2). The squared correlation of allele frequencies, r² (Hill & Robertson, 1968), was used to test the LD between pairs of SSR markers, with 10⁵ permutations, using the software package TASSEL version 2.0.1 (http://www.maizegenetics.net/).
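
As a rough illustration of the r² statistic named above, the following sketch computes it for two biallelic loci from phased haplotype counts. This is a simplification and not the TASSEL procedure: SSR loci are multiallelic, and neither the handling of multiple alleles nor the permutation test is reproduced here. The function name and argument layout are hypothetical.

```python
def r_squared(n_ab, n_aB, n_Ab, n_AB):
    """Squared allele-frequency correlation between two biallelic loci.

    Each argument is the count of one phased two-locus haplotype; the first
    letter refers to locus 1 and the second to locus 2 (hypothetical input).
    """
    total = n_ab + n_aB + n_Ab + n_AB
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_Ab + n_AB) / total   # frequency of allele A at locus 1
    p_B = (n_aB + n_AB) / total   # frequency of allele B at locus 2
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_AB - p_A * p_B          # disequilibrium coefficient D
    return d * d / denom
```

Under this definition, complete association between the two loci gives r² = 1 and independent loci give r² = 0.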

Single-marker analysis

In the association population (discovery population), all association tests between the 36 SSR markers and seven traits were conducted using the unified mixed-model method (MLM) with 10⁴ permutations in the software package TASSEL version 2.0.1 (Yu et al., 2006; Bradbury et al., 2007). The effects of all the genotype classes in each SSR marker (Table S2) were tested by performing a χ² test at the 0.01 probability level, but the rare genotypes (the percentage of minor genotypes < 5% and the null allele) in each marker were removed from the genotype effect analysis. The MLM can be described as follows: y = μ + Qv + Zu + e, where y is a vector of phenotype observations, μ is a vector of intercepts, v is a vector of population effects, u is a vector of random polygene background effects, e is a vector of random experimental errors, Q is a matrix defining the population structure from STRUCTURE, and Z is a matrix relating y to u. Var(u) = G = σₐ²K, with σₐ² as the unknown additive genetic variance and K as the kinship matrix (Yu et al., 2006). In this Q + K model, the relative kinship matrix (K) was obtained using the method proposed by Ritland (1996), which is built into the program SPAGeDi, version 1.2 (Hardy & Vekemans, 2002), and the population structure matrix (Q) was identified based on the significant subpopulations (K = 11; Du et al., 2012), as assessed according to the statistical model described by Evanno et al. (2005), using 20 neutral genomic SSR markers. Corrections for multiple testing of smoothed P-values for all associations were performed using the positive false discovery rate (FDR) method with 10⁴ permutations (Storey & Tibshirani, 2003).
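
The multiple-testing step can be sketched as follows. This is not the Storey & Tibshirani (2003) procedure used in the study; it is the simpler Benjamini–Hochberg adjustment, which coincides with the q-value only under the conservative assumption that every tested hypothesis is null. The function name is hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

An association would then be retained when its adjusted value falls below the chosen threshold (0.05 in this study).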

Inheritance of all significant SSR loci identified in the association population was examined in the F1 hybrid population (validation population) by performing a χ² test at the 0.01 probability level, and the SSR markers following Mendelian expectations (P ≥ 0.01) were then used in single-marker analysis in this hybrid population (excluding the genotype data involving the null allele at each locus). Significant SSR loci were detected by fitting the data to the model y = μ + m_i + e_ij, where y is the trait value, μ is the mean, m_i is the genotype of the ith marker, and e_ij is the residual associated with the jth individual in the ith genotypic class. The per cent phenotypic variance explained by the most significant marker was calculated, and the FDR method was used to perform a correction for multiple testing (Storey & Tibshirani, 2003).
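
The percentage of phenotypic variance explained under the one-way model above can be illustrated with a minimal sketch, assuming it is taken as the between-genotype sum of squares divided by the total sum of squares. The source does not state the exact estimator, so this is an assumption, and the function name and input layout are hypothetical.

```python
def percent_variance_explained(trait_by_genotype):
    """R^2 (%) for the one-way model y = mu + m_i + e_ij.

    trait_by_genotype maps a genotype label to the list of trait values
    observed for individuals carrying that genotype (hypothetical input).
    """
    groups = [ys for ys in trait_by_genotype.values() if ys]
    values = [y for ys in groups for y in ys]
    if not values:
        raise ValueError("no observations supplied")
    grand_mean = sum(values) / len(values)
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    if ss_total == 0:
        raise ValueError("trait has no variance")
    # Between-genotype sum of squares: class size times squared mean deviation.
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups
    )
    return 100.0 * ss_between / ss_total
```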

Haplotype analysis

Haplotype (a block of linked, ordered markers) frequencies were estimated from the locus genotypes, and tests of haplotype association with the trait values were carried out using the software FAMHAP version 19 (http://famhap.meb.uni-bonn.de/index.html; Becker & Knapp, 2004; Herold & Becker, 2009). FAMHAP estimates haplotype frequencies by maximum likelihood. Singleton alleles were ignored when constructing the haplotypes, and haplotypes with a frequency < 5% were also discarded. The input consisted of genotype matrices with structure analysis matrices (Q) and phenotypic value matrices, and the significance of the haplotype associations was determined based on 10⁴ permutation tests. A correction for multiple testing was performed using the positive FDR method (Storey & Tibshirani, 2003).
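
The 5% frequency filter can be sketched as below, under the simplifying assumption that phased haplotypes are already known. FAMHAP instead estimates haplotype frequencies by maximum likelihood from unphased genotypes, which this sketch does not attempt. The function name and the haplotype labels in the usage note are hypothetical.

```python
from collections import Counter


def common_haplotypes(haplotypes, min_freq=0.05):
    """Return {haplotype: frequency} for haplotypes at or above min_freq.

    haplotypes is a list with one label per observed chromosome
    (hypothetical input; phase is assumed to be known).
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items() if c / total >= min_freq}
```

For example, with 100 chromosomes carrying labels `h1` (48 copies), `h2` (28), `h3` (21) and `h4` (3), only `h1`, `h2` and `h3` would be kept.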

Modes of gene action

The modes of gene action were quantified using the ratio of dominance (d) to additive (a) effects estimated from least-square means for each genotypic class. Partial or complete dominance was defined as values in the range 0.50 < |d/a| < 1.25, whereas additive effects were defined as values in the range |d/a| ≤ 0.5. Values of |d/a| > 1.25 were equated with under- or overdominance. Details of the algorithm and formulas for calculating gene action were previously described (Eckert et al., 2009; Wegrzyn et al., 2010).
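
These thresholds can be expressed as a short sketch. The definitions of d and 2a follow the footnote to Table 2; the function names are hypothetical, and placing a ratio of exactly 1.25 in the under- or overdominance class is an assumption, since the text leaves that boundary unassigned.

```python
def dominance_ratio(mean_BB, mean_Bb, mean_bb):
    """Ratio d/a computed from the three genotype-class trait means."""
    a = abs(mean_BB - mean_bb) / 2.0            # half the homozygote difference
    if a == 0:
        raise ValueError("homozygote means are identical; d/a is undefined")
    d = mean_Bb - 0.5 * (mean_BB + mean_bb)     # heterozygote minus midpoint
    return d / a


def gene_action(ratio):
    """Classify a d/a ratio using the thresholds given in the text."""
    r = abs(ratio)
    if r <= 0.5:
        return "additive"
    if r < 1.25:
        return "partial or complete dominance"
    return "under- or overdominance"            # r == 1.25 placed here by assumption
```

Applied to the raw lignin means reported in the Results for the three C10-SSR2 genotypes (24.72%, 24.58% and 23.83%), `dominance_ratio` gives about 0.69, which falls in the dominance range; Table 2 reports 0.629 for this marker, a value based on least-square means rather than raw means.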

Results

Phenotypic data distribution and correlations

In the association population, holocellulose and α-cellulose contents ranged from 64.13% to 87.40% (mean 73.58%) and from 40.63% to 47.74% (mean 44.53%), respectively. Descriptive statistics of the trait distributions are presented in Table S4. In the hybrid population, the two parent lines showed significant differences in all seven measured traits. The trait measurements of all 1200 F1 progeny were intermediate between those of the two parents for six traits, excluding holocellulose content; the F1 progeny had higher holocellulose contents than either parent. Table S5 shows the descriptive trait-distribution statistics of the F1 population. As expected, the frequency distributions for each trait measured in these two populations followed an approximately normal distribution (data not shown).

The wood and growth traits in the association population showed significant correlations (Table 1). Of these, lignin content was significantly negatively correlated with holocellulose content (P < 0.01), α-cellulose content (P < 0.01), and D (P < 0.05); similar results were observed in the hybrid population (Table 1). In addition, lignin content was significantly and positively correlated with MFA (P < 0.01), and significant positive pairwise correlations were observed between holocellulose and α-cellulose contents and D in the association population (P < 0.01). The details of the phenotypic correlations among these traits in these two populations are shown in Table 1.

Table 1. Estimates of phenotypic correlations (R) for these seven traits in the Populus tomentosa association (above diagonal) and linkage mapping (below diagonal) populations

| Traits | Lignin | Holocellulose | α-cellulose | MFA | H | D | V |
|---|---|---|---|---|---|---|---|
| Lignin | 1 | −0.223** | −0.254** | 0.202** | 0.061 | −0.150* | 0.068 |
| Holocellulose | −0.173** | 1 | 0.622** | 0.018 | −0.014 | 0.160** | 0.168** |
| α-cellulose | −0.385** | 0.305** | 1 | −0.005 | −0.095 | 0.192** | −0.072 |
| MFA | 0.012 | 0.054 | −0.281* | 1 | 0.018* | −0.011 | −0.001 |
| H | 0.004 | 0.088* | 0.054 | 0.033 | 1 | 0.638** | 0.690** |
| D | −0.210* | 0.045 | 0.007 | −0.002 | 0.516** | 1 | 0.944** |
| V | −0.025 | 0.057 | −0.017 | 0.015 | 0.750** | 0.954** | 1 |

H, tree height; D, the diameter at breast height; V, stem volume; MFA, microfiber angle.
*, P < 0.05; **, P < 0.01.

SSR loci detection, HWE, and LD test

In total, 36 novel polymorphic SSR markers (minor allele frequency ≥ 5%) were developed from 15 candidate genes of the PtoCesA gene family, with an average density of one SSR every 2.62 kb. No polymorphic microsatellite loci were found in PtoCesA1 and PtoCesA5 (Table S2). Of these SSRs, c. 62% were derived from intron regions and 22% from promoter regions, and the numbers of microsatellite markers detected in the 5′ UTR, exon, and 3′ UTR regions were 3, 2, and 1, respectively (Table S2). A survey of the repeat motif types indicated that dinucleotide repeats were the most abundant (42%), followed by trinucleotide repeats (31%) (Table S2). One hundred and forty-four alleles were identified among the 460 samples, and the NA ranged from 2 to 8 with an average of 4.0 (Table S2). The mean PIC and HE of the loci were 0.436 and 0.505, respectively (Table S2). The HWE test of the 36 microsatellites indicated that seven loci departed from HWE (P < 0.01); however, no individual locus deviated significantly from HWE after applying the Bonferroni correction for multiple testing (Table S2). In agreement with the tests for HWE, all FIS values were small (mean FIS = −0.065) and did not suggest the occurrence of inbreeding in our samples.
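
For readers unfamiliar with the two diversity statistics, the sketch below shows the textbook definitions computed from the allele frequencies at one locus: HE = 1 − Σpᵢ² and PIC = 1 − Σpᵢ² − Σ_{i<j} 2pᵢ²pⱼ². POPGENE may apply sample-size corrections that are not reproduced here, and the function names are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one locus."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC for one locus: HE minus sum over allele pairs of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross
```

For a locus with two equally frequent alleles these definitions give HE = 0.5 and PIC = 0.375; PIC is never larger than HE.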

All r² values were pooled to assess the overall behavior of LD within the PtoCesA genes. Figure 2 shows that a large number of SSRs were in linkage equilibrium (r² < 0.3; P < 0.001) across the sequenced regions. LD among SSR loci within a candidate gene was limited and did not extend over the entire gene region. However, the average decay distance associated with LD within the PtoCesA genes was not calculated because of the limited number of SSR markers in this study. Several loci within the same candidate gene were in significant LD, such as markers C12-SSR1, C12-SSR2, and C12-SSR3 in PtoCesA12 (r² > 0.6; P < 0.001; Fig. 2).

Figure 2. Pairwise linkage disequilibrium (LD) (r²) between simple sequence repeat (SSR) markers located in the cellulose synthase (PtoCesA) genes in Populus tomentosa. A large number of SSRs were in linkage equilibrium (r² < 0.3, P < 0.001); LD among SSR loci within a candidate gene was limited and did not extend over the entire gene region, and several loci within the same candidate gene were in significant LD, such as markers C12-SSR1, C12-SSR2, and C12-SSR3 in PtoCesA12 (r² > 0.6; P < 0.001).

Summary of single-marker and haplotype-based associations

Single SSR marker–trait associations

All 252 (36 SSRs × 7 traits) single-marker association tests were conducted using the MLM with 10⁴ permutations. In all, 24 associations were significant at the threshold of P < 0.05, representing 16 SSR loci from nine PtoCesA genes (Table 2). Multiple test corrections for all 24 associations reduced this number to 15 at a significance threshold of Q < 0.05. These loci explained a small proportion of the phenotypic variance, ranging from 2.9% to 8.7% (Table 2). Of these, four SSR markers were associated with holocellulose. Lignin and α-cellulose had three significant associations each; two MFA associations and one association each with the H, D, and V traits were observed in the association population (Q < 0.05; Table 2). The 15 associations represent 10 SSR loci from seven PtoCesA genes. Several of the 10 SSR markers exhibited significant associations with more than one trait, consistent with the extent of codominance, and also suggesting a pleiotropic effect of these loci on certain traits (Tables 1, 2). For two of the 15 associations, the mode of gene action is consistent with overdominance (|d/a| > 1.25); the remaining 13 associations were split between modes of gene action that were partially to fully dominant (0.50 < |d/a| < 1.25, n = 7) or codominant (|d/a| ≤ 0.5, n = 6; Table 2). The majority of gene effects explained a small to moderate fraction of the phenotypic variation.

Table 2. Summary of significant simple sequence repeat (SSR) marker–trait pairs from the association test results in the Populus tomentosa discovery (association population, N = 460) and validation (linkage mapping population, N = 1200) populations after correction for multiple testing errors. Columns prefixed 'Assoc.' refer to the association population and columns prefixed 'Linkage' to the linkage mapping population.

| Trait | Gene symbol | Locus | Assoc. P-value | Assoc. df error | Assoc. R² (%) | Assoc. Q-value | Assoc. d/a [a] | Assoc. 2a/sp [b] | Linkage P-value | Linkage df error | Linkage R² (%) | Linkage Q-value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lignin | PtoCesA3 5′ UTR | C3-SSR1 | 0.0300 | 453 | 4.6 | ≥ 0.05 | 0.343 | 0.875 | | | | |
| Lignin | PtoCesA3 intron 3 | C3-SSR2 | 0.0021 | 448 | 5.1 | 0.0350 | 0.463 | 0.898 | 0.0020 | 1187 | 4.0 [c] | 0.0337 |
| Lignin | PtoCesA10 exon 3 | C10-SSR2 | 0.0001 | 450 | 7.2 | 0.0172 | 0.629 | 1.469 | 0.0017 | 1163 | 6.6 | 0.0308 |
| Lignin | PtoCesA10 intron 3 | C10-SSR3 | 0.0250 | 446 | 3.5 | ≥ 0.05 | −0.273 | 3.919 | | | | |
| Lignin | PtoCesA17 intron 2 | C17-SSR1 | 0.0016 | 435 | 2.9 | 0.0329 | −0.285 | 3.887 | | | | |
| Holocellulose | PtoCesA2 intron 4 | C2-SSR1 | 0.0014 | 439 | 5.0 | 0.0311 | 0.606 | 0.237 | 0.0014 | 1185 | 5.1 | 0.0267 |
| Holocellulose | PtoCesA3 intron 3 | C3-SSR2 | 0.0030 | 446 | 4.3 | 0.0413 | −0.571 | 0.184 | | | | |
| Holocellulose | PtoCesA10 intron 3 | C10-SSR3 | 0.0023 | 441 | 8.7 | 0.0366 | −0.534 | 0.362 | | | | |
| Holocellulose | PtoCesA12 promoter | C12-SSR1 | 0.0010 | 445 | 6.2 | 0.0241 | 0.447 | 0.491 | | | | |
| Holocellulose | PtoCesA12 promoter | C12-SSR2 | 0.0260 | 451 | 5.0 | ≥ 0.05 | 0.081 | 0.550 | | | | |
| α-cellulose | PtoCesA2 intron 4 | C2-SSR1 | 0.0030 | 437 | 6.8 | 0.0463 | 0.844 | 0.352 | 0.0010 | 1170 | 5.8 | 0.0201 |
| α-cellulose | PtoCesA4 intron 1 | C4-SSR2 | 0.0042 | 436 | 6.5 | 0.0487 | −1.244 | 0.662 | 0.0400 | 1175 | 4.0 | ≥ 0.05 |
| α-cellulose | PtoCesA10 3′ UTR | C10-SSR4 | 0.0010 | 436 | 4.1 | 0.0244 | −0.467 | 0.720 | | | | |
| α-cellulose | PtoCesA12 intron 6 | C12-SSR4 | 0.0150 | 449 | 5.0 | ≥ 0.05 | 0.432 | 1.208 | | | | |
| MFA | PtoCesA4 intron 8 | C4-SSR3 | 0.0255 | 439 | 3.3 | ≥ 0.05 | 1.459 | 0.313 | | | | |
| MFA | PtoCesA4 intron 11 | C4-SSR4 | 0.0001 | 445 | 6.0 | 0.0172 | −0.991 | 0.276 | 0.0023 | 1180 | 7.3 | 0.0350 |
| MFA | PtoCesA17 intron 2 | C17-SSR1 | 0.0039 | 434 | 5.9 | 0.0442 | 0.353 | 0.597 | | | | |
| H | PtoCesA9 promoter | C9-SSR1 | 0.0450 | 425 | 8.2 | ≥ 0.05 | 0.328 | 0.580 | | | | |
| H | PtoCesA16 intron 13 | C16-SSR3 | 0.0022 | 448 | 5.7 | 0.0358 | 1.296 | 0.633 | | | | |
| D | PtoCesA4 intron 1 | C4-SSR2 | 0.0042 | 442 | 3.3 | 0.0487 | 1.459 | 0.313 | | | | |
| D | PtoCesA8 intron 12 | C8-SSR3 | 0.0360 | 435 | 7.0 | ≥ 0.05 | −0.405 | 0.609 | 0.0310 | 1163 | 6.3 | ≥ 0.05 |
| D | PtoCesA10 3′ UTR | C10-SSR4 | 0.0200 | 439 | 5.9 | ≥ 0.05 | −0.115 | 0.406 | | | | |
| V | PtoCesA2 intron 4 | C2-SSR1 | 0.0011 | 436 | 6.2 | 0.0252 | −0.780 | 0.464 | | | | |
| V | PtoCesA10 3′ UTR | C10-SSR4 | 0.0240 | 439 | 8.8 | ≥ 0.05 | 0.520 | 0.802 | | | | |

H, tree height; D, the diameter at breast height; V, stem volume; MFA, microfiber angle; N, number of trees sampled; P-value, significance level for association (significance is P ≤ 0.05); R², percentage of the phenotypic variance explained; Q-value, a correction for multiple testing (false discovery rate FDR (Q) ≤ 0.05).
[a] d/a = 2 × d/2a (the ratio of dominance (d) to additive (a) effects); 2a, calculated as the difference between the phenotypic means observed within each homozygous class (2a = |G_BB − G_bb|, where G_ij is the trait mean in the ijth genotypic class); d, calculated as the difference between the phenotypic mean observed within the heterozygous class and the average phenotypic mean across both homozygous classes (d = G_Bb − 0.5(G_BB + G_bb), where G_ij is the trait mean in the ijth genotypic class).
[b] sp, standard deviation for the phenotypic trait under consideration. Details of the algorithm and formulas for calculating gene action have been described by Eckert et al. (2009) and Wegrzyn et al. (2010).
[c] Represents different allele/genotype effects at the same locus for the same trait between the discovery and validation populations.

Haplotype–trait associations

For the haplotype-based associations, 44 regions (amplicons) from 10 PtoCesA genes were analyzed, and the number of haplotypes per region varied from 2 to 13 with an average of 7.0. Twelve significant regions from six unique genes were identified using the software FAMHAP version 19, with a significance threshold of P < 0.05 (details not shown). Multiple test corrections reduced this number to nine regions, which derived from six genes and included 69 haplotypes, at a significance threshold of Q < 0.05 (Table 3). Eighteen haplotypes were significantly associated with five phenotypic traits (all traits except D and V; Q < 0.05), and eight single-marker associations (Q < 0.05) strongly supported the haplotype-based associations for the same traits (Tables 2, 3).

Table 3. List of haplotypes with significant associations with wood quality and growth traits in the Populus tomentosa association population (N = 460) after a correction for multiple testing (false discovery rate FDR (Q) ≤ 0.05). For each amplicon, the first row gives the marker combination that defines the haplotypes; the rows beneath it list the significant haplotypes and their frequencies.

| Amplicon | Trait | P-value | Q-value | Haplotypes | Significant haplotypes | Haplotype frequency | Single-marker associations [a] |
|---|---|---|---|---|---|---|---|
| PtoCesA3 | Lignin | 0.0012 | 0.0249 | 8 | (C3-SSR1)-(C3-SSR2) | | C3-SSR2 (lignin, Q = 0.0350) |
| | | | | | (CT)9-(CT)4 | 0.28 | |
| | | | | | (CT)6-(CT)3 | 0.21 | |
| PtoCesA10 | Lignin | 0.0015 | 0.0271 | 10 | (C10-SSR1)-(C10-SSR2)-(C10-SSR3) | | C10-SSR2 (lignin, Q = 0.0172) |
| | | | | | (CAAACA)3-(TGA)3-(TA)5 | 0.37 | |
| | | | | | (CAAACA)4-(TGA)4-(TA)7 | 0.18 | |
| | | | | | (CAAACA)5-(TGA)4-(TA)7 | 0.14 | |
| PtoCesA2 | Holocellulose | 0.0031 | 0.0403 | 6 | (C2-SSR1)-(C2-SSR2) | | C2-SSR1 (holocellulose, Q = 0.0311) |
| | | | | | (TTAA)3-(CAA)6 | 0.40 | |
| | | | | | (TTAA)5-(CAA)6 | 0.11 | |
| PtoCesA3 | Holocellulose | 0.0010 | 0.0221 | 5 | 0 | / | C3-SSR2 (holocellulose, Q = 0.0413) |
| PtoCesA12 | Holocellulose | 0.0005 | 0.0187 | 12 | (C12-SSR1)-(C12-SSR2)-(C12-SSR3) | | C12-SSR1 (holocellulose, Q = 0.0241) |
| | | | | | (TTA)6-(ATT)4-(AT)5 | 0.18 | |
| | | | | | (TTA)9-(ATT)3-(AT)5 | 0.33 | |
| PtoCesA4 | α-cellulose | 0.0040 | 0.0426 | 7 | (C4-SSR1)-(C4-SSR2) | | C4-SSR2 (α-cellulose, Q = 0.0487) |
| | | | | | (AT)15-(CTT)5 | 0.22 | |
| | | | | | (AT)24-(CTT)7 | 0.16 | |
| | | | | | (AT)30-(CTT)3 | 0.12 | |
| PtoCesA10 | α-cellulose | 0.0010 | 0.0221 | 6 | (C10-SSR2)-(C10-SSR3) | | C10-SSR4 (α-cellulose, Q = 0.0244) |
| | | | | | (TGA)4-(TA)4 | 0.62 | |
| | | | | | (TGA)4-(TA)7 | 0.09 | |
| PtoCesA4 | MFA | 0.0029 | 0.0395 | 5 | (C4-SSR3)-(C4-SSR4) | | C4-SSR4 (MFA, Q = 0.0172) |
| | | | | | (GCCATGTAAAGAA)3-(TACTGC)4 | 0.28 | |
| PtoCesA9 | H | 0.0040 | 0.0426 | 10 | (C9-SSR1)-(C9-SSR2)-(C9-SSR3) | | / |
| | | | | | (AT)18-(CT)6-(GC)3 | 0.20 | |
| | | | | | (AT)18-(CT)6-(GC)5 | 0.08 | |
| | | | | | (AT)20-(CT)4-(GC)5 | 0.15 | |

H, tree height; D, the diameter at breast height; V, stem volume; MFA, microfiber angle; P-value, the significance level for haplotype-based association (significance is P ≤ 0.05).
[a] Single-marker associations with the lowest Q-value (FDR Q ≤ 0.05) relating to the significant haplotype–trait association.
/, no data were identified in this study.

Confirmation of association studies in a linkage mapping population

Thirty-three of the 36 genic SSR markers followed Mendelian expectations (P ≥ 0.01), with a segregation ratio close to 1 : 2 : 1 for eight SSR loci, 1 : 1 for 10 loci, and 1 : 1 : 1 : 1 for 15 markers. The 16 significant SSR markers (P < 0.05; Table 2) identified in the association population were all in accordance with Mendelian expectations (P ≥ 0.01), and no novel allele was discovered in the hybrid population. Therefore, single-marker association analysis (112 tests; 16 SSRs × 7 traits) was conducted in this linkage mapping population, and we initially observed seven marker–trait associations (P < 0.05; Table 2). A multiple test correction reduced this number to five (Q < 0.05; Table 2), and the proportion of phenotypic variation explained varied from 4.0% to 7.3% (Table 2). No SSR associations with growth traits were identified in the validation population (Table 2), which is consistent with the hypothesis that growth traits have relatively low heritability compared with wood-property traits (Thumma et al., 2010). The significant markers identified for three growth traits in the association population (Table 2) are probably derived from a causal relationship between wood-property and growth traits, and may represent false positive associations (Dillon et al., 2012).
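
The test of Mendelian segregation can be sketched as a χ² goodness-of-fit test against the expected ratio. The P-values below use closed-form expressions for the χ² survival function, so only 1, 2 or 3 degrees of freedom are supported, which covers the 1 : 1, 1 : 2 : 1 and 1 : 1 : 1 : 1 ratios mentioned above. The function name is hypothetical, and the genotype counts in the usage note are invented for illustration.

```python
import math


def segregation_test(observed, expected_ratio):
    """Chi-square goodness of fit; returns (statistic, P-value)."""
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have equal length")
    df = len(observed) - 1
    if df not in (1, 2, 3):
        raise ValueError("only two, three or four genotype classes are supported")
    scale = sum(observed) / sum(expected_ratio)
    stat = sum(
        (o - r * scale) ** 2 / (r * scale)
        for o, r in zip(observed, expected_ratio)
    )
    # Closed-form survival functions of the chi-square distribution.
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p = math.exp(-stat / 2.0)
    else:
        p = math.erfc(math.sqrt(stat / 2.0)) + math.sqrt(
            2.0 * stat / math.pi
        ) * math.exp(-stat / 2.0)
    return stat, p
```

For example, hypothetical counts of 30, 50 and 20 tested against 1 : 2 : 1 give a statistic of 2.0 and P ≈ 0.37, so such a marker would be treated as following Mendelian expectations at the 0.01 level.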

Two significant SSR markers (C3-SSR2 and C10-SSR2) explaining 4.0–6.6% of the phenotypic variance for lignin content were identified in the validation population (Table 2). In the association population, the differences in lignin content of C3-SSR2 genotypes were significant (24.70% for (CT)4/(CT)4, 23.95% for (CT)4/(CT)3, and 23.06% for (CT)3/(CT)3), which was consistent with the additive effects of gene action on lignin content. However, the (CT)4 allele observed in the association population was not found in the validation population, and two parental genotypes ((CT)5/(CT)3 and (CT)5/(CT)5) segregated in the linkage population. The differences in lignin content among the three genotypes (two significant) of marker C10-SSR2 were 24.72% for (TGA)4/(TGA)4, 24.58% for (TGA)4/(TGA)3, and 23.83% for (TGA)3/(TGA)3, indicating that patterns of gene action are consistent with dominant effects on lignin content (Table 2). Three haplotypes in the same amplicon of PtoCesA10 were significantly associated with lignin composition, which was supported by the significant marker C10-SSR2 (Table 3 and Fig. 3a). This marker was validated in the linkage mapping population (R² = 6.6%; Table 2), and the mean values showed significant differences between genotypes, indicating that the allelic effect of C10-SSR2 is consistent in both association and validation populations (Fig. 3).
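The distinction between additive and dominant gene action drawn above can be made concrete by computing the ratio of the dominance deviation (d) to the additive effect (a) from the three genotype means. This is a sketch under stated assumptions: the function names are ours, and the 0.5 cutoff is a commonly used convention that we assume here for illustration, not a criterion stated in this section.

```python
def dominance_ratio(hom_high, het, hom_low):
    """Return d/a from the means of the two homozygotes and the heterozygote."""
    a = (hom_high - hom_low) / 2.0         # additive effect
    if a == 0:
        raise ValueError("homozygote means are identical; d/a is undefined")
    d = het - (hom_high + hom_low) / 2.0   # deviation of het from the midpoint
    return d / a

def gene_action(ratio, cutoff=0.5):
    """Classify gene action; the cutoff is an assumed convention."""
    return "additive" if abs(ratio) <= cutoff else "dominant"

# Lignin content means (%) quoted in the text for the association population
c3 = dominance_ratio(24.70, 23.95, 23.06)    # C3-SSR2
c10 = dominance_ratio(24.72, 24.58, 23.83)   # C10-SSR2
```

For C3-SSR2 the heterozygote lies almost exactly midway between the homozygotes (d/a of about 0.09), whereas for C10-SSR2 it lies close to the (TGA)4/(TGA)4 class (d/a of about 0.69), in line with the additive and dominant interpretations given above.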

Figure 3. Haplotype and single-marker associations with lignin content are illustrated for the cellulose synthase gene 10 (PtoCesA10) in Populus tomentosa. (a) The genotypic effects of the three significant haplotypes (P < 0.05) of PtoCesA10 are shown. The haplotypes reveal significantly different mean phenotypic values for lignin content. The effects of three markers (two significant) are also shown. C10-SSR2 located in the coding region (PtoCesA10 exon 3) was exclusively associated with lignin content; the (TGA)4 allele at this marker causes an insertion mutation adding Asp to the amino acid sequence. The other marker, C10-SSR3, was significantly associated with holocellulose (P < 0.05) and lignin contents (P < 0.05 and 0.05 < Q < 0.1). All three markers were in linkage disequilibrium (LD). (b) The C10-SSR2 marker associated with lignin content was also uniquely detected in the F1 population; the mean value of each genotype was intermediate between the values of the two parents. However, the genotypic effects of the two genotypes were significantly different: the lignin content was elevated in the group with (TGA)4/(TGA)3 but was reduced in the group with (TGA)3/(TGA)3.

For the α-cellulose and holocellulose content traits, we observed that the significant marker C2-SSR1 for α-cellulose content was similarly associated with holocellulose content in the discovery and validation populations. In the association population, the differences in holocellulose content were significant (72.70% for (TTAA)3/(TTAA)3, 72.89% for (TTAA)3/(TTAA)4, 73.75% for (TTAA)3/(TTAA)5, and 73.68% for (TTAA)5/(TTAA)5). The same patterns were found for α-cellulose content (44.56, 44.49, 45.23, and 45.36%, respectively), suggesting that gene action was consistent with dominant effects for these two traits (Table 2). The (TTAA)5 allele in marker C2-SSR1 is the minor allele for the holocellulose trait, and the same alleles in C2-SSR1 were also detected in the validation population for the α-cellulose and holocellulose traits (data not shown).

C4-SSR4 was associated with MFA in the association population (6.0%, Q < 0.05) and was also successfully validated in the linkage mapping population (Tables 2, 3). Heterozygotes (TACTGC)5/(TACTGC)4 for the marker differed by > 1° in MFA from either homozygote class. One of the five individual haplotypes in the amplicon of PtoCesA4 was significant for MFA, with a high MFA of 19.2° (Table 3), and the same allelic effect of C4-SSR4 was identified in the validation population.
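The Q-value thresholds applied to the marker–trait tests in this section reflect a correction for multiple testing. The exact procedure is not restated here, so the sketch below uses the Benjamini–Hochberg step-up adjustment as a stand-in for illustration; the P-values are hypothetical and the function name is ours.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values from four marker-trait tests (not real data)
raw = [0.01, 0.04, 0.03, 0.005]
q = bh_adjust(raw)
significant = [qi < 0.05 for qi in q]
```

In a real analysis the list would hold all 112 tests (16 SSRs × 7 traits); an adjusted value can only stay equal to or exceed its raw P-value, which is why seven nominal associations can shrink to five after correction.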

Allelic relative expression based on real-time PCR

To test whether these significant allelic associations affect the relative mRNA expression levels for these genes, we quantified the mRNA levels among groups with different genotypes or haplotypes. In total, 19 tests (nine haplotypes and 10 individual markers) representing seven PtoCesA genes (Table 3) were used to quantify the mRNA levels for these genes among different groups. Measurement of differential expression by real-time PCR indicated that only two candidate genes (PtoCesA10 and PtoCesA12) had different expression levels among the different groups (Fig. 5a,b). The mRNA products of the PtoCesA10 transcripts were detected among three groups with different haplotypes (10 individuals per group; each individual selected was homozygous for the particular haplotype). The highest relative expression level of mRNA products (0.9376) was in the group with (CAAACA)5-(TGA)4-(TA)7, followed by (CAAACA)4-(TGA)4-(TA)7 (0.7168), and (CAAACA)3-(TGA)3-(TA)5 (0.2641) (Fig. 5a). The mRNA products of the PtoCesA12 transcripts were detected in two groups representing two significant haplotypes, with some differences in expression levels. A relatively high expression level (0.8513) was detected in the group with (TTA)9-(ATT)3-(AT)5, while a lower expression level (0.3910) was detected in the group with the other significant haplotype (Fig. 5b).

Discussion

Linkage disequilibrium in trees

For association mapping, understanding the patterns of LD in the species under consideration is an important prerequisite, because the rate of decay of LD is needed to determine whether genome-wide associations are feasible or whether a candidate gene-based approach has to be considered. Previous studies have generally suggested a very low LD in trees; for example, Brown et al. (2004) found a rapid decline in LD within several kilobases in loblolly pine. Similar findings of limited LD were reported for candidate genes in other species of conifers (Dvornyk et al., 2002; Neale & Savolainen, 2004; Krutovsky & Neale, 2005; Gonzalez-Martinez et al., 2007). LD analysis in the Cinnamoyl CoA Reductase (CCR) gene and a gene encoding a COBRA-like protein (EniCOBL4A) in Eucalyptus nitens showed that LD does not extend over the entire gene (Thumma et al., 2005, 2009). In Populus, previous studies based on SNP markers have indicated that a rapid decay of LD occurs within just 300–1700 bp in candidate genes among related species of Populus (Ingvarsson, 2005; Ingvarsson et al., 2008; Xu et al., 2009; Wegrzyn et al., 2010), which is consistent with the LD decay observed in some PtoCesA genes using SSR markers in this study, indicating the potential of association genetics to identify the genes responsible for variation in key traits. However, the assessment of LD using genic SSR markers only applies to gene regions in this species, and may not be applicable at the genome-wide level. To date, all previous studies on LD decay in related species of Populus have focused on gene regions; whether there is LD in nongenic regions in Populus remains to be seen (Ingvarsson, 2005; Ingvarsson et al., 2008; Wegrzyn et al., 2010).
The greater resolution power of SSRs in the detection of LD, compared with biallelic SNPs, has been demonstrated in other species (for a review, see Abdurakhmonov & Abdukarimov, 2008), suggesting the possibility of using a combination of SNP and SSR markers in LD mapping. These observations can be readily explained by noting that different marker types capture different historical information in a genome because of dissimilar mutation rates (e.g. SNP vs SSR or AFLP vs SSR) and nonuniform LD distribution among the chromosomes; in addition, population background is also a key factor influencing LD (Neale & Savolainen, 2004).

LD has been found to decline rapidly in some PtoCesA genes using limited, but evenly spaced markers (minor allele frequency was above c. 20%; Andreescu et al., 2007), suggesting that the resolution of marker–trait associations may be high in this study. The LD does not extend over the entire gene region, demonstrating that a candidate-gene-based LD approach may be the best way to understand the molecular basis underlying quantitative variation in this species (Thumma et al., 2005).
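As an illustration of how pairwise LD of the kind discussed above is quantified, the squared allele-frequency correlation (r²) between two loci can be computed from phased haplotypes. The sketch below is deliberately simplified: it treats each locus as biallelic by contrasting one focal allele against all others, whereas SSR loci are multiallelic, and the haplotype data and function name are hypothetical.

```python
def r_squared(haplotypes, allele_a, allele_b):
    """r^2 between focal alleles at two loci, from phased two-locus haplotypes."""
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h == (allele_a, allele_b)) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a focal allele is fixed or absent; r^2 is undefined")
    return d * d / denom

# Hypothetical phased haplotypes at two SSR loci (repeat counts as alleles)
complete_ld = [(6, 4)] * 50 + [(9, 3)] * 50
no_ld = [(6, 4)] * 25 + [(6, 3)] * 25 + [(9, 4)] * 25 + [(9, 3)] * 25

r2_complete = r_squared(complete_ld, 6, 4)
r2_none = r_squared(no_ld, 6, 4)
```

Plotting such r² values against the physical distance between marker pairs gives the decay curve from which statements such as "LD does not extend over the entire gene region" are drawn.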

Comparison and identification of associations in P. tomentosa

To date, the candidate-gene-based association approach has been widely used to identify candidate gene alleles associated with growth and wood properties in several tree species (Thumma et al., 2005, 2009; Gonzalez-Martinez et al., 2007; Dillon et al., 2010, 2012; Wegrzyn et al., 2010; Beaulieu et al., 2011; Sexton et al., 2011). Generally, in a high-LD species, the power of a single-marker association test is often limited because LD information contained in flanking markers is ignored. Intuitively, haplotypes (a collection of ordered markers) may be more powerful than individual, nonordered markers (Akey et al., 2001). However, the comparison of single-marker and haplotype-based associations in this low-LD tree species demonstrated that the effect of the haplotype is mainly derived from the significant individual marker, and haplotype analysis may not be more powerful than single-marker analysis, although several haplotype-based associations (P < 0.05) were identified in the absence of significant single markers. Therefore, both single-marker and haplotype-based LD analyses should be evaluated when judging significant associations.

Besides the direct coding of CesA subunit proteins, genetic evidence suggests that CesA genes may participate in the pathway(s) of lignin and C6 sugar formation (Song et al., 2010; Wegrzyn et al., 2010); also, other candidate genes have been implicated in the pathway of cellulose biosynthesis, although whether they are directly involved has not been verified (Szyjanowicz et al., 2004; Coleman et al., 2009; Wegrzyn et al., 2010). Cell wall synthesis is coordinated with several other biological processes, and the genes in these shared pathways often are functional homologs (Persson et al., 2005; Beaulieu et al., 2011).

Lignin is a complex phenolic heteropolymer, and plays a key role in plant structure by providing strength, rigidity, and hydrophobicity to xylem cell walls (Demura et al., 2002). Marker C3-SSR2 and a haplotype-based association representing PtoCesA3 were highly significantly (P < 0.05) associated with lignin content, but the association was not truly validated in the linkage population (Table 2). Microarray profiling across developing xylem in Populus (Persson et al., 2007; Rajangam et al., 2008) showed that the PtiCesA8 gene (97% identity at the protein level with PtoCesA3) was strongly expressed during secondary cell wall deposition. Genes encoding lignin monomer-polymerizing laccases and lignin monomer synthesis enzymes are among the genes most closely co-expressed with AtCesA8 (Persson et al., 2005). Significant individual SNP associations in CesA1A (96–97% identity at the protein level with PtoCesA3) with lignin have been identified in black cottonwood (Populus trichocarpa) (Wegrzyn et al., 2010). Therefore, it is essential to expand our understanding of the action of PtoCesA3.

In PtoCesA10, the marker C10-SSR2 located in the coding region (PtoCesA10 exon 3) was uniquely associated with lignin content. The (TGA)4 allele is the minor allele in marker C10-SSR2, and it produces an insertion mutation, adding an Asp to the amino acid sequence (Fig. 3a). The results of association and validation (Table 2, Fig. 3a) strongly suggest that C10-SSR2 may be a functional polymorphism that is in or near a locus involved in the control of lignin content. On further analysis of the protein structure encoded by PtoCesA10, we found that the inserted amino acid (AA) lies 15 AAs away from the zinc-binding domain, which has been shown to be involved in CESA protein–protein interactions (Joshi et al., 2004), suggesting that the AA insertion may be associated with the zinc-binding domain for regulation of gene expression related to lignin composition. This conjecture was also supported by the significant expression differences among three groups of trees (Fig. 5a). Lignin deposition is largely associated with secondary wall formation, and genes linked to the lignin-related pathway for suberin synthesis are highly co-expressed with secondary cell wall AtCesA7 (91% identity at the protein level with PtoCesA10; Persson et al., 2005). This result is also in agreement with those of previous studies showing that PtiCesA7-A (96% identity at the protein level with PtoCesA10) is specifically expressed in the secondary cell wall (Rajangam et al., 2008; Song et al., 2010). Similarly, Wegrzyn et al. (2010) identified significant SNP and haplotype-based associations in CesA1B (a PtoCesA10 homolog) with lignin composition in black cottonwood.

Microfibril orientation may be related to the rate of cellulose synthesis (Paredez et al., 2006; Rajangam et al., 2008; Beaulieu et al., 2011). C4-SSR4, a noncoding marker within PtoCesA4, was the only single-marker association identified with MFA, and illustrated a pattern of gene action consistent with additive effects (Table 2). Both single-marker and haplotype-based association results demonstrated that allelic polymorphism at C4-SSR4 could be linked to some co-expressed processes involved in microfibril orientation. Furthermore, this marker was also detected in the validation population, with significant differences in MFA among three genotypes (data not shown). AtCesA6 (94% identity at the protein level with PtoCesA4) has been shown to affect both microtubule and microfibril orientation (Paredez et al., 2006). This research may provide a possible path for exploration of the genetic basis of microfibril orientation, but the role of the variant in controlling the trait needs further testing.

Holocellulose is the total polysaccharide fraction of the secondary xylem cell walls and is composed of cellulose and hemicelluloses; it makes up c. 80% of the secondary xylem tissue (Li et al., 2006). In the present study, the differences in holocellulose content for marker C12-SSR1 were significant among three of four genotypes (73.15% for (TTA)6/(TTA)4, 73.20% for (TTA)6/(TTA)6, 74.05% for (TTA)9/(TTA)6, and 74.38% for (TTA)9/(TTA)9), illustrating that patterns of gene action are consistent with additive gene effects (Fig. 4). Significant differences in holocellulose content for two haplotypes in PtoCesA12 were shown (73.17% for (TTA)6-(ATT)4-(AT)5 and 74.22% for (TTA)9-(ATT)3-(AT)5) (Table 3, Fig. 4), which was supported by two adjacent markers (C12-SSR1 and C12-SSR2) in high LD (r² > 0.8; P < 0.001; Fig. 2). RT-PCR testing also indicated that the levels of relative expression were significantly different between these two haplotypes (Fig. 5b). These observations reveal the potential importance of this gene in the variability of holocellulose content. CesA6-related genes, which are homologous to PtoCesA12, have been identified and are expressed during cellulose biosynthesis or deposition in Arabidopsis and P. trichocarpa (Desprez et al., 2007; Persson et al., 2007). However, this marker was not successfully validated in the linkage population. Numerous reasons have been proposed to explain why some true associations may not be replicated across independent data sets, including sample size, variability in phenotype definitions, genetic heterogeneity, environmental interactions, age-dependent effects, and gene–gene interactions (Neale & Savolainen, 2004; Greene et al., 2009; Beaulieu et al., 2011; Dillon et al., 2012). Generally, wood traits are expected to be influenced by many genes, each with a small effect; gene–gene interactions are also likely to be of critical importance.
Additionally, populations with different genetic and environmental backgrounds may have unfavorable pedigree linkage disequilibrium and phenotypic variation (Neale & Savolainen, 2004). For example, the related species P. alba × P. glandulosa is the female parent in this validation population; diverse gene-by-environment interactions within and between sites were not accounted for; and the association result may be a false positive. These reasons might explain the ‘lack of validation’ for this important association in the linkage mapping population.

Figure 4. Haplotype and single-marker associations with holocellulose content are illustrated for the cellulose synthase gene 12 (PtoCesA12) in Populus tomentosa. The genotypic effects of the two significant haplotypes (P < 0.05) of PtoCesA12 are shown. The haplotypes yield significantly different mean phenotypic values for holocellulose content. The effects of three markers (one significant) are also shown. C12-SSR1 located in the PtoCesA12 promoter region was significantly associated with holocellulose content. The other marker C12-SSR2 was associated with holocellulose content at P < 0.05 and 0.05 < Q < 0.1, while C12-SSR3 was not significantly associated with this trait. All three markers were in linkage disequilibrium (LD) with one another.

Figure 5. Relative transcript levels for candidate genes in different groups representing different significant haplotypes (the error bars represent + SD). (a) The relative levels of cellulose synthase gene 10 (PtoCesA10) transcripts in three groups involving a total of 30 Populus tomentosa individuals. (b) The relative mRNA levels of PtoCesA12 in two groups representing two significant haplotypes.

Variations in the quantity and quality of cellulose in plants are suspected to be primarily a result of enzymatic activities of different types of cellulose synthases (CesAs; Atanassov et al., 2009; Kumar et al., 2009). The CesA7 genes PtiCesA7-A and AtCesA7, homologs of PtoCesA2, are expressed in developing xylem tissue undergoing secondary wall thickening in Populus or in the xylem of Arabidopsis (Suzuki et al., 2006; Atanassov et al., 2009). In PtoCesA2, the result that marker C2-SSR1 was associated with α-cellulose and holocellulose content traits in both discovery and validation populations (Table 2) suggests that C2-SSR1 may be a functional polymorphism in or near a locus involved in cellulose synthesis during secondary cell wall formation in P. tomentosa. Significant individual SNP markers or haplotype associations have been reported in the homologous gene CesA2A in P. trichocarpa (Wegrzyn et al., 2010).

Factors affecting the power of association mapping

Gene-derived SSR markers may have functional significance in regulating gene expression and function (Li et al., 2004; Varshney et al., 2005). In this study, a vast amount of genetic variation in P. tomentosa natural populations (Zhang et al., 2007) was coupled with a low level of homoplasy of SSR markers derived from conserved gene regions (Li et al., 2004), providing an appropriate tool for candidate-gene-based association studies, although previous studies have reported that size homoplasy of SSR alleles and allele reversion could be a problem in association studies (Ching et al., 2002). The selection of polymorphic genic SSR markers with a low level of size homoplasy, along with SNPs, a traditional marker type used for association analysis, would provide better potential to detect functional allelic variation underlying quantitative traits. In addition, knowledge and selection of optimal candidate genes using different approaches, such as microarray analysis, EST database searches and quantitative trait locus (QTL) mapping, in model or related plant species (Neale & Savolainen, 2004; Thumma et al., 2005) provide an important basis for identifying useful alleles located within functional genes controlling traits of interest. Deviations from HWE for SSR loci can be indicative of genotyping errors, inbreeding, population subdivision, or selection (Balding, 2006). In this study, genotyping errors and inbreeding can be excluded because genotyping errors were corrected and FIS values were low for each locus. Population subdivision is thought to be the most important explanation for deviations from HWE, which is in agreement with the various geographic origins of wild P. tomentosa (Du et al., 2012). Population structure can generate spurious genotype–phenotype associations. Thus, use of the unified mixed-model method (MLM) would improve control of both type I and type II error rates (Yu et al., 2006).
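The FIS values invoked above to rule out inbreeding compare observed with expected heterozygosity at a locus. The following sketch shows the basic calculation, FIS = 1 − Ho/He, on hypothetical genotypes; it omits the small-sample correction that dedicated population-genetics software typically applies, and the function name is ours.

```python
from collections import Counter

def f_is(genotypes):
    """Inbreeding coefficient F_IS = 1 - Ho/He for one locus.

    genotypes: list of (allele1, allele2) tuples, one per diploid individual.
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    h_exp = 1.0 - sum(f * f for f in freqs)   # expected under HWE, uncorrected
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F_IS is undefined")
    return 1.0 - h_obs / h_exp

# Hypothetical SSR genotypes coded by repeat number (not real data)
in_hwe = [(3, 3)] * 25 + [(3, 5)] * 50 + [(5, 5)] * 25
het_deficit = [(3, 3)] * 50 + [(5, 5)] * 50

fis_hwe = f_is(in_hwe)
fis_deficit = f_is(het_deficit)
```

Values near zero, as in the first example, are what "low FIS" refers to; a strongly positive value, as in the second, would indicate a heterozygote deficit of the kind produced by inbreeding or population subdivision.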

Phenotyping is an important part of association mapping for forest trees. A typical association population is usually composed of a diverse set of unrelated individuals at the same location, and to increase precision in phenotypic measurements, one must usually clonally replicate individuals to reduce environmentally induced noise and measurement errors. Hence, in this study, we used a total of 1380 phenotypes (460 genotypes × 3 ramets) to compensate for the deficiency of having a limited number of SSR markers. Furthermore, when the entire collection is replicated across multiple environments, data on replicates of each individual can be combined to produce a phenotype mean value for the accession analysis, which is less influenced by environment or measurement errors (Long & Langley, 1999). Therefore, replication of genotype–phenotype associations is also crucial in association mapping to distinguish false-positive associations and to provide less biased estimates of the size of allelic effects. Additionally, validation of biological function through transgenic experiments and other molecular biology techniques can be used to verify associations (Thumma et al., 2005; Abdurakhmonov & Abdukarimov, 2008). For example, real-time PCR and linkage analysis were employed in this study to confirm the results obtained from association mapping.
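The step of combining clonal replicates into one phenotype value per genotype can be sketched as a simple grouped mean. The identifiers and trait values below are hypothetical and serve only to show the shape of the calculation for a design such as 460 genotypes × 3 ramets.

```python
from collections import defaultdict
from statistics import mean

def clone_means(records):
    """Average ramet-level measurements into one value per genotype.

    records: iterable of (genotype_id, trait_value) pairs, one per ramet.
    """
    by_genotype = defaultdict(list)
    for genotype_id, value in records:
        by_genotype[genotype_id].append(value)
    return {g: mean(values) for g, values in by_genotype.items()}

# Hypothetical lignin content (%) for three ramets of two genotypes
records = [
    ("G001", 24.1), ("G001", 24.5), ("G001", 24.0),
    ("G002", 23.2), ("G002", 23.6), ("G002", 23.1),
]
means = clone_means(records)
```

Averaging over ramets shrinks the environmental and measurement component of each phenotype, which is the reason clonal replication raises the precision of the marker–trait tests.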

Polymorphisms of SSR markers and significant haplotypes representing PtoCesA candidate genes were used to evaluate the functional loci or genes associated with lignocellulosic cell wall development. We demonstrated that the candidate gene-based association approach, along with validation in a large linkage-mapping population and confirmation using real-time PCR testing, can be employed to identify naturally occurring allelic variation in genes associated with important wood-quality traits. This study provides insights into the genetic mechanisms underlying wood development, and identifies particular markers for tree MAS breeding programs with the goals of improving the quality and quantity of wood products.

Acknowledgements

We thank Profs Ronald R. Sederoff and Zhao-Bang Zeng (NC State University, Raleigh, NC, USA) for their detailed comments and specific suggestions for improving the manuscript. This work was supported by the State Key Basic Research Program of China (No. 2012CB114506) and the Project of the National Natural Science Foundation of China (No. 31170622, 30872042).

References

  • Abdurakhmonov IY, Abdukarimov A. 2008. Application of association mapping to understanding the genetic diversity of plan germplasm resources. International Journal of Plant Genomics 2008: 118.
  • Akey JM, Zhang K, Xiong M, Doris P, Jin L. 2001. The effect that genotyping errors have on the robustness of common linkage-disequilibrium measures. The American Journal of Human Genetics 68: 14471456.
  • Andreescu C, Avendano S, Brown SR, Hassen A, Lamont SJ, Dekkers JC. 2007. Linkage disequilibrium in related breeding lines of chickens. Genetics 177: 21612169.
  • Atanassov II, Pittman JK, Turner SR. 2009. Elucidating the mechanisms of assembly and subunit interaction of the cellulose synthase complex of Arabidopsis secondary cell walls. Journal of Biological Chemistry 284: 38333841.
  • Balding DJ. 2006. A tutorial on statistical methods for population association studies. Nature Reviews Genetics 7: 781791.
  • Beaulieu J, Doerksen T, Boyle B, Clement S, Deslauriers M, Beauseigle S, Blais S, Poulin PL, Lenz P, Caron S et al. 2011. Association genetics of wood physical traits in the conifer White Spruce and relationships with gene expression. Genetics 188: 197214.
  • Becker T, Knapp M. 2004. Maximum-likelihood estimation of haplotype frequencies in nuclear families. Genetic Epidemiology 27: 2132.
  • Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. 2007. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23: 26332635.
  • Brown GR, Gill GP, Kuntz RJ, Langley CH, Neale DB. 2004. Nucleotide diversity and linkage disequilibrium in loblolly pine. Proceedings of the National Academy of Sciences, USA 101: 1525515260.
  • Burton RA, Shirley NJ, King BJ, Harvey AJ, Fincher GB. 2004. The CesA gene family of barley (Hordeum vulgare): quantitative analysis of transcripts reveals two groups of co-expressed genes. Plant Physiology 134: 224236.
  • Ching A, Caldwell KS, Jung M, Dolan M, Smith OS, Tingey S, Morgante M, Rafalski A. 2002. SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines. BMC Genetics 3: 114.
  • Coleman HD, Yan J, Mansfield SD. 2009. Sucrose synthase affects carbon partitioning to increase cellulose production and altered cell wall ultrastructure. Proceedings of the National Academy of Sciences, USA 106: 1311813123.
  • Demura T, Tashiro G, Horiguchi G, Kishimoto N, Kubo M, Matsuoka N, Minami A, Nagata-Hiwatashi M, Nakamura K, Okamura Y et al. 2002. Visualization by comprehensive microarray analysis of gene expression programs during transdifferentiation of mesophyll cells into xylem cells. Proceedings of the National Academy of Sciences, USA 99: 1579415799.
  • Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S. 2007. Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proceedings of the National Academy of Sciences, USA 104: 1557215577.
  • Dillon SK, Brawner JT, Meder R, Lee DJ, Southerton SG. 2012. Association genetics in Corymbia citriodora subsp. Variegate identifies single nucleotide polymorphisms affecting wood growth and cellulosic pulp yield. New Phytologist 195: 596608.
  • Dillon SK, Nolan M, Li W, Bell C, Wu HX, Southerton SG. 2010. Allelic variation in cell wall candidate genes affecting solid wood properties in association populations and land races of Pinus radiata. Genetics 185: 14771487.
  • Djerbi S, Aspeborg H, Nilsson P, Sundberg B, Mellerowicz E, Blomqvist K, Teeri TT. 2004. Identification and expression analysis of genes encoding putative cellulose synthases (CesA) in the hybrid aspen, Populus tremula (L.) × P. tremuloides (Michx.). Cellulose 11: 301312.
  • Djerbi S, Lindskog M, Arvestad L, Sterky F, Teeri TT. 2005. The genome sequence of black cottonwood (Populus trichocarpa) reveals 18 conserved cellulosesynthase (CesA) genes. Planta 221: 739746.
  • Du QZ, Wang BW, Wei ZZ, Zhang DQ, Li BL. 2012. Genetic diversity and population structure of Chinese white poplar (Populus tomentosa) revealed by SSR markers. Journal of Heredity. doi:10.1093/jhered/ess061.
  • Dvornyk V, Sirvio A, Mikkonene M, Savolainen O. 2002. Low nucleotide diversity at two phytochrome loci along a latitudinal cline in Pinus sylvestris. Molecular Biology and Evolution 19: 179199.
  • Eckert AJ, Bower AD, Wegrzyn JL, Pande B, Jermstad KD, Krutovsky KV, St Clair JB, Neale DB. 2009. Association genetics of coastal Douglas fir (Pseudotsuga menziesii var. menziesii, Pinaceae). I. Cold hardiness related traits. Genetics 182: 12891302.
  • Evanno G, Regnaut S, Goudet J. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14: 26112620.
  • Gonzalez-Martinez SC, Wheeler NC, Ersoz E, Nelson CD, Neale DB. 2007. Association genetics in Pinus taeda L. I. Wood property traits. Genetics 175: 399409.
  • Greene CS, Penrod NM, Williams SM, Moore JH. 2009. Failure to replicate a genetic association may provide important clues about genetic architecture. PLoS ONE 4: e5639.
  • Hardy OJ, Vekemans X. 2002. SPAGEDi: a versatile computer program to analyze spatial genetic structure at the individual or population levels. Molecular Ecology Notes 2: 618620.
  • Herold C, Becker T. 2009. Genetic association analysis with FAMHAP: a major program update. Bioinformatics 25: 134136.
  • Hill WG, Robertson A. 1968. Linkage disequilibrium in finite populations. Theoretical and Applied Genetics 38: 226231.
  • Huang ZH. 1992. The study on the climatic regionalization of the distributional region of Populus tomentosa. Journal of Beijing Forestry University 14: 2632.
  • Ingvarsson PK. 2005. Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L., Salicaceae). Genetics 169: 945953.
  • Ingvarsson PK, Garcia MV, Luquez V, Hall D, Jansson S. 2008. Nucleotide polymorphism and phenotypic associations within and around the phytochrome B2 locus in European aspen (Populus tremula, Salicaceae). Genetics 178: 22172226.
  • Joshi CP, Bhandari S, Ranjan P, Kalluri UC, Liang X, Fujino T, Samuga A. 2004. Genomics of cellulose biosynthesis in poplars. New Phytologist 164: 5361.
  • Krutovsky KV, Neale DB. 2005. Nucleotide diversity and linkage disequilibrium in cold-hardiness and wood quality-related candidate genes in Douglas-fir. Genetics 171: 20292041.
  • Kumar M, Thammannagowda S, Bulone V, Chiang V, Han KH, Joshi CP, Mansfield SD, Mellerowicz E, Sundberg B, Teeri T et al. 2009. An update on the nomenclature for the cellulose synthase genes in Populus. Trends in Plant Science 14: 248254.
  • Li L, Lu S, Chiang VL. 2006. A genomic and molecular view of wood formation. Critical Review in Plant Science 25: 213233.
  • Li YC, Korol AB, Fahima T, Nevo E. 2004. Microsatellites within genes: structure, function, and evolution. Molecular Biology and Evolution 21: 9911007.
  • Long AD, Langley CH. 1999. The power of association studies to detect the contribution of candidate genetic loci to variation in complex traits. Genome Research 9: 720731.
  • Mellerowicz EJ, Sundberg B. 2008. Wood cell walls: biosynthesis, developmental dynamics and their implications for wood properties. Current Opinion in Plant Biology 11: 293300.
  • Nairn CJ, Haselkorn T. 2005. Three loblolly pine CesA genes expressed in developing xylem are orthologous to secondary cell wall CesA genes of angiosperms. New Phytologist 166: 907915.
  • Neale DB, Savolainen O. 2004. Association genetics of complex traits in conifers. Trends in Plant Science 9: 325330.
  • Nordborg M, Borevitz JO, Bergelson J, Berry CC, Chory J, Hagenblad J, Kreitman M, Maloof JN, Noyes T, Oefner PJ et al. 2002. The extent of linkage disequilibrium in Arabidopsis thaliana. Nature Genetics 30: 190193.
  • van Oosterhout CV, Hutchinson WF, Wills DPM, Shipley P. 2004. Micro-Checker: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes 4: 535538.
  • Paredez AR, Somerville CR, Ehrhardt DW. 2006. Visualization of cellulose synthase demonstrates functional association with microtubules. Science 312: 14911495.
  • Pear JR, Kawagoe Y, Schreckengost WE, Delmer DP, Stalker DM. 1996. Higher plants contain homologs of the bacterial cela genes encoding the catalytic subunit of cellulose synthase. Proceedings of the National Academy of Sciences, USA 93: 1263712642.
  • Persson S, Paredez A, Carroll A, Palsdottir H, Doblin M, Poindexter P, Khitrov N, Auer M, Somerville CR. 2007. Genetic evidence for three unique components in primary cell-wall cellulose synthase complexes in Arabidopsis. Proceedings of the National Academy of Sciences, USA 104: 15566–15571.
  • Persson S, Wei H, Milne J, Page GP, Somerville CR. 2005. Identification of genes required for cellulose synthesis by regression analysis of public microarray data sets. Proceedings of the National Academy of Sciences, USA 102: 8633–8638.
  • Rajangam AS, Kumar M, Aspeborg H, Guerriero G, Arvestad L, Pansri P, Brown CJL, Hober S, Blomqvist K, Divne C et al. 2008. MAP20, a microtubule-associated protein in the secondary cell walls of hybrid aspen, is a target of the cellulose synthesis inhibitor 2,6-dichlorobenzonitrile. Plant Physiology 148: 1283–1294.
  • Richmond TA, Somerville CR. 2000. The cellulose synthase superfamily. Plant Physiology 124: 495–498.
  • Ritland K. 1996. Estimators for pairwise relatedness and individual inbreeding coefficients. Genetical Research 67: 175–185.
  • Schimleck LR, Kube PS, Raymond CA. 2004. Genetic improvement of kraft pulp yield in Eucalyptus nitens using cellulose content determined by near infrared spectroscopy. Canadian Journal of Forest Research 34: 2363–2370.
  • Schlesinger WH, Lichter J. 2001. Limited carbon storage in soil and litter of experimental forest plots under increased atmospheric CO2. Nature 411: 466–469.
  • Sexton TR, Henry RJ, Harwood CE, Thomas DS, McManus LJ, Raymond C, Henson M, Shepherd M. 2011. Pectin methylesterase genes influence solid wood properties of Eucalyptus pilularis. Plant Physiology 158: 531–541.
  • Somerville C. 2006. Cellulose synthesis in higher plants. Annual Review of Cell and Developmental Biology 22: 53–78.
  • Song D, Shen J, Li L. 2010. Characterization of cellulose synthase complexes in Populus xylem differentiation. New Phytologist 187: 777–790.
  • Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Charbonnel-Campaa L, Lindvall JJ, Tandre K, Strauss SH et al. 2004. A Populus EST resource for plant functional genomics. Proceedings of the National Academy of Sciences, USA 101: 13951–13956.
  • Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences, USA 100: 9440–9445.
  • Suzuki S, Li L, Sun YH, Chiang VL. 2006. The cellulose synthase gene superfamily and biochemical functions of xylem-specific cellulose synthase-like genes in Populus trichocarpa. Plant Physiology 142: 1233–1245.
  • Szyjanowicz PM, McKinnon I, Taylor NG, Gardiner J, Jarvis MC, Turner SR. 2004. The irregular xylem 2 mutant is an allele of korrigan that affects the secondary cell wall of Arabidopsis thaliana. Plant Journal 37: 730–740.
  • Tanaka K, Murata K, Yamazaki M, Onosato K, Miyao A, Hirochika H. 2003. Three distinct rice cellulose synthase catalytic subunit genes required for cellulose synthesis in the secondary wall. Plant Physiology 133: 73–83.
  • Temnykh S, DeClerck G, Lukashova A, Lipovich L, Cartinhour S, McCouch S. 2001. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Research 11: 1441–1452.
  • Thumma B, Southerton S, Bell J, Owen J, Henery M, Moran G. 2010. Quantitative trait locus (QTL) analysis of wood quality traits in Eucalyptus nitens. Tree Genetics & Genomes 6: 305–317.
  • Thumma BR, Matheson BA, Zhang D, Meeske C, Meder R, Downes GM, Southerton SG. 2009. Identification of a cis-acting regulatory polymorphism in a Eucalypt COBRA-like gene affecting cellulose content. Genetics 183: 1153–1164.
  • Thumma BR, Nolan MF, Evans R, Moran GF. 2005. Polymorphisms in cinnamoyl CoA reductase (CCR) are associated with variation in microfibril angle in Eucalyptus spp. Genetics 171: 1257–1265.
  • Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A et al. 2006. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313: 1596–1604.
  • Varshney RK, Graner A, Sorrells ME. 2005. Genic microsatellite markers in plants: features and applications. Trends in Biotechnology 23: 48–55.
  • Wegrzyn JL, Eckert AJ, Choi M, Lee JM, Stanton BJ, Sykes R, Davis MF, Tsai CJ, Neale DB. 2010. Association genetics of traits controlling lignin and cellulose biosynthesis in black cottonwood (Populus trichocarpa, Salicaceae) secondary xylem. New Phytologist 188: 515–532.
  • Wu L, Joshi CP, Chiang VL. 2000. A xylem-specific cellulose synthase gene from aspen (Populus tremuloides) is responsive to mechanical stress. Plant Journal 22: 495–502.
  • Xu B, Yang X, Li B, Zhang Z, Zhang D. 2009. Isolation, expression and single nucleotide polymorphism analysis of cellulose synthase gene (PtCesA4) from Populus tomentosa. Scientia Silvae Sinicae 45: 1–10.
  • Yeh FC, Yang RC, Boyle T. 1999. POPGENE version 1.32: Microsoft Windows-based freeware for population genetic analysis, quick user guide. Edmonton, AB, Canada: Center for International Forestry Research, University of Alberta.
  • Yu J, Pressoir G, Briggs WH, Vroh Bi I, Yamasaki M, Doebley JF, McMullen MD, Gaut BS, Nielsen DM, Holland JB et al. 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genetics 38: 203–208.
  • Zhang DQ, Du QZ, Xu BH, Zhang ZY, Li B. 2010a. The actin multigene family in Populus: organization, expression and phylogenetic analysis. Molecular Genetics and Genomics 284: 105–119.
  • Zhang DQ, Yang XH, Zhang ZY, Li B. 2010b. Expression and nucleotide diversity of the poplar COBL gene. Tree Genetics & Genomes 6: 331–344.
  • Zhang DQ, Zhang ZY, Yang K. 2006. QTL analysis of growth and wood chemical content traits in an interspecific backcross family of white poplar (Populus tomentosa × P. bolleana) × P. tomentosa. Canadian Journal of Forest Research 36: 2015–2023.
  • Zhang DQ, Zhang ZY, Yang K. 2007. Identification of AFLP markers associated with embryonic root development in Populus tomentosa. Silvae Genetica 56: 27–32.

Supporting Information

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.

Filename: nph12072-sup-0001-TableS1-S5.doc
Format: Word document
Size: 145K
Description:

Table S1 Details of PtoCesA family members in the Populus genome

Table S2 Thirty-six SSR primer pairs identified from PtoCesA genes in this study

Table S3 The real-time PCR primers used in this study

Table S4 The minimum and maximum values, mean, standard error (SE) and coefficient of phenotypic variation (CV (%)) for each growth and wood property trait measured in the P. tomentosa association population

Table S5 Mean, minimum, and maximum values, standard error (SE) and coefficient of phenotypic variation (CV (%)) of different phenotypic traits measured in the linkage mapping population