Forest genomics grows up and branches out


(*Author for correspondence: tel +1 540 231 3165; fax +1 540 231 3698)

The Forestry Workshop and other talks on forest trees: Plant and Animal Genome XV Conference, San Diego, CA, USA, January 2007

Forest trees have long been a tantalizing target for genetic and genomic studies. They are extremely important from both ecological and economic standpoints, accounting for a large proportion of the biomass in terrestrial systems and providing some of the most valuable commodities in the world economy. However, trees are notoriously difficult to manipulate genetically because of their large size and long generation time, so progress has historically been slow compared with annual herbaceous plants. The genomics era offers new promise of accelerated rates of gene discovery for forest trees, and new insights into the molecular mechanisms underlying tree development, physiology and adaptation (Brunner et al., 2004). As demonstrated at the January 2007 Plant and Animal Genome conference in San Diego, this promise is rapidly being realized in a variety of tree taxa.

The tree life style and genome evolution

A number of developmental and physiological characteristics distinguish trees from annual plants, including the ability to achieve immense size by extensive development of secondary xylem (i.e. wood) and the ability to cope with highly variable biotic and abiotic stresses over a long life span. Studies are beginning to reveal how tree growth habit affects, and is affected by, genome evolution. Analysis of the Populus trichocarpa genome sequence shows a fascinating history of duplication, followed by putative selective retention of specific classes of genes that could be associated with traits advantageous to a long-lived woody perennial (Tuskan et al., 2006). Estimates of substitution rate suggest that the perennial lifestyle, combined with extensive clonal replication, may have slowed the rate of molecular and chromosome-level change during Populus evolution. Thus, the Populus genome may be more similar to ancestral angiosperm genomes than the annual angiosperms that have been sequenced to date.

Although the large size of conifer genomes (c. 75–180 times that of Arabidopsis) presents an obstacle to complete genome sequencing, work presented by Emanuele De Paoli (University of Udine, Italy) showed that new and important insights can be gained from sequencing of large genomic regions in gymnosperms. In particular, he looked at the types, distributions and local arrangements of transposable elements in Picea abies (Norway spruce). Increasing evidence supports the importance of intergenic regions, specifically transposable elements in these regions, in genome evolution (Morgante, 2006). Intergenic regions can vary considerably within a species and may have important regulatory roles. Whereas retrotransposon expansion in angiosperms is relatively recent and still evolving, the spruce genome appears more fixed, with estimates of transposon insertion dating long before the major angiosperm genome expansions. The slower genome evolution in poplar and the apparently reduced evolutionary flexibility of the spruce genome are likely to have important implications for adaptive evolution.

Another question beginning to be addressed is how the tree life history reflects cooption and modification of genes and mechanisms during land plant evolution. For example, a poplar homolog of the Arabidopsis flowering time gene FT not only affects flowering in poplar but also controls bud set induced by the short days of fall (Bohlenius et al., 2006). Class I KNOX homeobox genes regulate the shoot apical meristem in annuals, and Andrew Groover (USDA Forest Service, CA, USA) showed that Populus homologs also regulate the vascular cambium and the differentiation and lignification of cells during wood formation.

Noncoding small RNAs that regulate the expression of other genes potentially have a major role in the evolution of gene regulatory networks in trees. Although broadly conserved microRNAs (miRNAs) were initially identified in plants, accumulating evidence indicates that a large proportion of miRNAs are taxa-specific (Rajagopalan et al., 2006). New, faster sequencing technologies, most notably the massively parallel pyrosequencing method developed by 454 Life Sciences (Margulies et al., 2005; referred to as ‘454 sequencing’), have enabled cost-effective deep sequencing of small RNAs. Ying-Hsuan Sun (North Carolina State University, NC, USA) reported that 454 sequencing for miRNA discovery in poplar has already yielded over 90 000 distinct small RNA sequences, including potentially species-specific miRNAs involved in wood formation.

The long life spans of trees as well as rates of genome evolution can also affect the genes and pathways for defense responses to insects and pathogens. The critical need for research in this area is best exemplified by the effects of exotic pathogens such as Cryphonectria parasitica, which has virtually eliminated the American chestnut that once dominated eastern hardwood forests. John Carlson (Pennsylvania State University, PA, USA) described using 454 technology to generate American and blight-resistant Chinese chestnut ESTs, and plans to use 454 sequencing for single nucleotide polymorphism (SNP) detection to develop markers for blight resistance. This ground-breaking project will be a closely watched test case of the extent to which genomics-guided approaches can help to restore a native species.

From QTLs to adaptive polymorphisms

One of the most successful approaches for identifying the genomic regions that are responsible for phenotypic variation has been quantitative trait locus (QTL) analysis. Such studies have been conducted since the early 1990s in forest trees, and have revealed valuable information about the genetic architecture of a wide variety of complex traits, ranging from well-defined traits such as leaf flavonoid chemistry (Morreel et al., 2006) or disease resistance (Jorge et al., 2005), to extremely complex traits such as growth (Wullschleger et al., 2005) and stress tolerance (Howe et al., 2003). However, since the early days of QTL analysis, there have been doubts about the ultimate utility of this approach in forest trees. QTL mapping in trees is limited to pedigrees, in which high linkage disequilibrium (LD) allows chromosomal regions containing genes influencing phenotypic traits to be identified (Strauss et al., 1992). The same high LD also means that a large number of genes are present in each identified interval, making cloning of the actual gene influencing the phenotype a daunting task.

In recent years, QTL analysis has been successfully combined with other tools from the forest genomics toolbox to narrow the search for candidate genes underlying complex traits. For example, as presented by Gail Taylor (University of Southampton, UK), the European Popyomics project has combined QTL analysis with microarray analyses of gene expression and candidate gene lists to identify genes potentially involved in drought tolerance (Street et al., 2006). Another interesting example of genomics-assisted QTL studies was presented by Gerald Tuskan (Oak Ridge National Laboratory, TN, USA). Populus is dioecious, with separate male and female individuals. Tuskan and colleagues have mapped a locus controlling gender to a portion of the genome with extensive haplotypic diversity, as revealed by the whole genome sequencing project. High-density genetic mapping in this region has demonstrated that recombination is suppressed across this region, which is a typical characteristic of an autosome that is in the process of becoming a heteromorphic sex chromosome (Liu et al., 2004). Amy Brunner (Virginia Tech, VA, USA) reported the use of whole genome microarray analyses across a diverse set of Populus tissues and developmental stages to identify genes that are differentially expressed in developing reproductive buds relative to young vegetative buds. Cross-referencing these genes with the candidate gene lists from the gender intervals has produced a list of 33 candidate genes that can now be functionally characterized for their involvement in gender determination.
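The cross-referencing step described above amounts to intersecting two gene lists. The following is a minimal sketch of that idea only, not the pipeline used in the work reported; the function name and all gene identifiers are hypothetical placeholders.

```python
def cross_reference(expressed_genes, interval_genes):
    """Return genes present in both lists, sorted for stable output."""
    # Set intersection keeps only identifiers found in both inputs.
    return sorted(set(expressed_genes) & set(interval_genes))


# Hypothetical identifiers, for illustration only.
differentially_expressed = ["gene_A", "gene_B", "gene_C", "gene_D"]
gender_interval = ["gene_B", "gene_D", "gene_E"]

candidates = cross_reference(differentially_expressed, gender_interval)
print(candidates)
```

In practice the two lists would come from the microarray analysis and from the genes annotated within the mapped interval, and the identifiers would need to follow the same gene model naming before they can be compared.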

Association mapping approaches have been developed that take advantage of natural forest populations, which typically have very low LD. In contrast to QTL mapping, genetic markers that show significant association with a phenotypic trait are very close to, or in, the gene influencing the trait, but a larger number of markers must typically be surveyed. Association mapping is thus effective for detecting robust associations between candidate genes and phenotypes (Neale & Savolainen, 2004). The pioneering tree association studies have been conducted primarily in conifers, and major candidate gene association studies are ongoing in loblolly pine (Gonzalez-Martinez et al., 2006), radiata pine (Shannon Dillon, CSIRO, Canberra, Australia), and spruce, driven by Canada's Arborea project (Pavy et al., 2007). SNP association studies are also becoming a major component of hardwood research programs. For example, the ambitious EVOLTREE project, as described by Christophe Plomion (INRA, France), will extend work begun in the Popyomics project with Populus and the TREESNIPs project with oak and pine, to examine candidate gene polymorphisms in Populus, pine, and oak.
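The contrast between high LD in pedigrees and low LD in natural populations can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from any of the studies cited: it computes the commonly used r² measure of LD between two biallelic loci from phased haplotypes. The function name ld_r2 and both toy samples are hypothetical.

```python
from collections import Counter


def ld_r2(haplotypes):
    """Return r^2 between two biallelic loci.

    haplotypes: list of (a, b) tuples, one per sampled chromosome,
    with the allele at each locus coded as 0 or 1.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at locus B
    p_ab = counts[(1, 1)] / n                # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                     # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical pedigree-like sample: alleles at the two loci travel together.
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# Hypothetical natural-population sample: alleles are statistically independent.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)] * 3

r2_linked = ld_r2(linked)
r2_unlinked = ld_r2(unlinked)
print(r2_linked, r2_unlinked)
```

When r² is high, a marker some distance from the causal gene still tracks it, which is why few markers suffice in a pedigree but the interval stays large. When r² is near zero, only markers in or very near the gene show an association, which is why association mapping has finer resolution but needs many more markers.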

An approach that is in some ways intermediate between QTL analyses and association studies in pure species is the use of hybrid zones to perform whole genome scans for associations with phenotypic traits that differ between the hybridizing species (Lexer et al., 2003). Stephen DiFazio (West Virginia University, WV, USA) presented results from a North American hybrid zone between Populus angustifolia and P. fremontii that has a moderate LD that should allow whole genome scans using a moderate number of markers. Meanwhile, populations of pure species adjacent to the hybrid zone have extremely low LD for closely linked loci, thus enabling finer association studies for candidate genes from genomic regions identified in the hybrid zone. Patterns of introgression across such hybrid zones also provide unique insights into the genetic forces maintaining differentiation of the species. For example, fragments that introgress more or less than expected under neutral models co-occur with QTL for leaf chemistry traits that have been demonstrated to have profound ecological effects in these hybrid zones (Whitham et al., 2006). Similar results have recently been reported for a European hybrid zone between P. tremula and P. alba (Lexer et al., 2007).


Parallel work in a variety of different angiosperm and gymnosperm taxa is continuing to develop and refine genomic resources for forest trees, and this work is greatly facilitated by recent technological advances. These include the 454 Life Sciences system and emerging platforms from Solexa and ABI, as well as high-throughput SNP genotyping systems such as resequencing arrays and bead array systems. This work is building a strong foundation for comparative tree genomics that will ultimately reveal similarities and differences in the genes and polymorphisms underlying traits important for the tree growth habit, and thus allow adaptive molecular variation to be placed in the broader context of woody plant evolution. The importance of using a combination of approaches and recognizing the conservation of genetic pathways in plant evolution is already being demonstrated in poplar research, and will enable other angiosperm tree taxa to extensively leverage the poplar genome sequence and poplar functional genomics studies. Conifers present the opportunity to study developmental and adaptive traits in trees evolutionarily distant from angiosperms, including identification of ancient shared mechanisms, and those mechanisms that have independently evolved. Sequencing of a significant proportion of a conifer genome would be a substantial benefit not only to conifer research, but also for understanding of the broad patterns of plant evolution and development.


Many thanks to Christophe Plomion, John MacKay, and Tom Richardson for organizing the Forestry Workshop, and to Chung-Jui Tsai for co-organizing the Populus community microarray workshop.