The plant ADH gene family

Authors


For correspondence (fax +1 519 767 0755; e-mail jstromme@uoguelph.ca).

Summary

The structures, evolution and functions of alcohol dehydrogenase gene families and their products have been scrutinized for half a century. Our understanding of the enzyme structure and catalytic activity of plant alcohol dehydrogenase (ADH-P) is based on the vast amount of information available for its animal counterpart. The probable origins of the enzyme from a simple β-coil and eventual emergence from a glutathione-dependent formaldehyde dehydrogenase have been well described. There is compelling evidence that the small ADH gene families found in plants today are the survivors of multiple rounds of gene expansion and contraction. To the probable original function of their products in the terminal reaction of anaerobic fermentation have been added roles in yeast-like aerobic fermentation and the production of characteristic scents that act to attract animals that serve as pollinators or agents of seed dispersal and to protect against herbivores.

Introduction to plant alcohol dehydrogenases

The classic alcohol dehydrogenase (ADH, alcohol:NAD+ oxidoreductase, EC 1.1.1.1) is a Zn-binding enzyme that acts as a dimer and relies on an NAD(P) co-factor to interconvert ethanol and acetaldehyde (and other short linear alcohol/aldehyde pairs). It is a member of the well-studied medium-length dehydrogenase/reductase (MDR) protein superfamily.

ADH-P enzymes were traditionally of interest because of their activity during episodes of oxygen deprivation, as reported more than 50 years ago (Hageman and Flesher, 1960). Under these conditions, ADH acts in the terminal step of anaerobic glycolysis, or fermentation, converting acetaldehyde to ethanol. During the process, NAD+ is regenerated and a limited amount of ATP is produced at a time when normal respiration is disrupted. ADH activity in the kernels and anthers of maize was first reported by Schwartz in 1971, and, more recently, ADH gene expression and ADH activity have been found generally throughout plants growing under ‘normal’ conditions or subjected to various stresses. With the exception of a few Arabidopsis species, all plants analyzed to date carry multiple ADH genes. Especially in eudicots, the tissue-specific patterns of gene expression and enzymatic activities provide evidence of functional specialization.

Plant ADH genes also have a long-standing role in evolutionary studies. Population biologists relied on analysis of model enzymes whose alternative versions – products of different alleles (allozymes) or genes (isozymes) – could be distinguished due to amino acid differences that affected electrical charge; the various forms could be separated by electrophoresis and visualized by histochemical reactions specific to the enzyme. ADH was a popular choice for such studies, and played an important role in development of the disciplines of population biology and evolutionary genetics. Even when methods for direct analysis of DNA became available, ADH genes remained popular for analysis of evolutionary dynamics. Their unique utility was largely supplanted by the availability of full-genome sequences, but, due to the amount of information available from older studies, ADH genes and their products remain significant in genomic, transcriptomic, proteomic and metabolomic research.

Other members of the MDR superfamily are sometimes classed as alcohol dehydrogenases, particularly glutathione-dependent formaldehyde dehydrogenases (GSH-FDHs, EC 1.2.1.1, also called class III ADHs; Chase, 1999) and cinnamyl alcohol dehydrogenases (CAD, EC 1.1.1.195). GSH-FDH enzymes are Zn-binding NAD-dependent enzymes with hydroxymethylglutathione as a preferred substrate (Chase, 1999); on the basis of phylogenetic analyses, it has been concluded that alcohol dehydrogenases evolved from GSH-FDHs (Danielsson and Jörnvall, 1992; Shafqat et al., 1996). CAD is a Zn-binding MDR enzyme that utilizes NADP for reduction of aromatic aldehydes to the corresponding alcohols. Found in plants and prokaryotes, its activity is required for lignin biosynthesis and apparently also for defense-related functions (see Umezawa, 2010). The CAD group includes a variety of protein types, and their classification is less settled than that of ADH-P (e.g. Goffner et al., 1998; Kim et al., 2004). They constitute an ancient group that diverged from the ADH lineage shortly after the separation of plants and fungi (Nordling et al., 2002). Some members of the more distantly related plant short-chain dehydrogenase/reductase (SDR) superfamily also function as alcohol dehydrogenases, but they lack the characteristic catalytic domain and Zn co-factors. These enzymes are typically approximately 250 amino acids long (compared with approximately 370 for MDRs). Short-chain ADH genes have been recovered as cDNAs in leaf (Kim et al., 2009) and fruit (Mánriquez et al., 2006), but their functions in plants are not yet clear.

Enzyme structure and catalytic function

After publication of a 2.4 Å resolution structure for a horse liver ADH (LADH) (Eklund et al., 1976), LADH became the ‘type specimen’ alcohol dehydrogenase, and remains a standard by which other ADH structures are determined. Its usefulness derives in large part from the considerable sequence conservation typical among MDR alcohol dehydrogenases.

In Figure 1, two animal [human (Homo sapiens) and horse (Equus caballus)] and three plant [maize (Zea mays), petunia (Petunia hybrida) and pine (Pinus banksiana)] ADH amino acid sequences are aligned. Residues shared by the animal proteins alone are shown in yellow, those shared by plant proteins are shown in blue, and those common to all five (and to most other plant and animal MDR ADH polypeptides) are shown in green. Totally conserved residues account for more than 40% of the total (based on LADH length), and many of the differences are conservative amino acid substitutions and/or occur in just one of the five polypeptides shown. The homology is especially impressive given the evidence for convergent evolution of plant and animal ADH genes from GSH-FDH after the divergence of plant and animal lineages (Danielsson and Jörnvall, 1992; Shafqat et al., 1996; Dolferus et al., 1997).
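The conservation figure quoted above can be reproduced from any multiple alignment by counting the columns in which every sequence carries the same residue and dividing by the ungapped length of the reference. The following is a minimal sketch under stated assumptions: the function name `conserved_fraction` is hypothetical, and the five short strings are invented toy sequences, not the ADH polypeptides of Figure 1.

```python
def conserved_fraction(aligned, reference_index=0, gap="-"):
    """Fraction of reference residues that lie in fully conserved columns.

    `aligned` is a list of equal-length strings from a multiple alignment.
    A column counts as conserved only if it contains no gap and every
    sequence carries the same residue.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")

    reference = aligned[reference_index]
    # Ungapped length of the reference (LADH plays this role in the text).
    reference_length = sum(1 for residue in reference if residue != gap)
    if reference_length == 0:
        raise ValueError("reference sequence contains no residues")

    conserved = sum(
        1
        for column in zip(*aligned)
        if gap not in column and len(set(column)) == 1
    )
    return conserved / reference_length


# Invented toy alignment (NOT real ADH sequences), for illustration only.
toy_alignment = [
    "MSTAGKVIKC",
    "MATAGKVIRC",
    "MSNTGQVIKC",
    "MST-GQVIRC",
    "MSTAGQVIKC",
]

print(conserved_fraction(toy_alignment))
```

With real data, the aligned strings would come from an alignment program, and the choice of reference sequence determines the denominator.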

Figure 1.

 Alignment of ADH sequences from horse liver (Ec), human (Hs), maize (Zm), petunia (Ph) and pine (Pb) enzymes.
Amino acids shared by the horse and human forms are in yellow, those shared by the three plant sequences are in blue, and sequences conserved amongst all five polypeptides are in green. The major functional domains are indicated, along with the binding sites of catalytic (cross-hatched diamond) and structural (solid diamond) Zn ions. β coils of the NAD-binding Rossmann fold are underlined (Rossmann et al., 1974; Duester et al., 1986). The Genbank accessions for the horse liver, human, maize, petunia and pine enzymes are NP_001075414.1, NP_000658.1, P00333.1, AA074898.1 and AAC49540.1, respectively.

The active ADH-P dimer consists of identical, or, in the case of allozymes and isozymes, nearly identical subunits. As shown in Figure 1, each monomer has two major domains: a substrate-binding or catalytic domain, consisting of an N-terminal region with irregular β-coils and a short C-terminal region, and a co-enzyme-binding domain, comprising a doubled β-sheet known as the Rossmann fold (Rossmann et al., 1974). Subunit folding produces a deep pocket between the catalytic and co-enzyme-binding domains, creating an entry site for the substrate. A three-dimensional image of the horse liver ADH dimer, each subunit complexed with NADH and an alcohol, is shown in Figure 2.

Figure 2.

 Three-dimensional representation of horse liver ADH.
The N-terminal catalytic domain of the monomer on the left is yellow-green and that on the right is blue. The prominent α helices at the bottom left and top right are the C-termini. Co-enzyme binding domains, which include the subunit interaction regions, are in the centre. Zn ions are shown in blue. The enzyme is shown complexed with co-enzyme and substrate. Reproduced with permission from Eklund and Ramaswamy (2008). © Birkhäuser Verlag, Basel, 2008.

Each enzyme subunit binds two Zn ions, one designated ‘catalytic’ and the other ‘structural’, although the latter probably plays a bigger role than implied by that term (Jörnvall et al., 2010). Each Zn ion binds four ligands; in Figure 1, only three ligands are indicated for the catalytic ion because the fourth is a water molecule (or hydroxyl ion) that plays a direct role in the enzymatic reaction. The co-factor typically used by ADH-P enzymes is NAD+/NADH, and the enzymatic reaction can be summarized by the equation:

RCH2-OH + NAD+ ↔ RCH=O + NADH + H+

With the enzyme acting as a dehydrogenase, NAD+ binds first, creating a conformational change in the enzyme that forces expulsion of water from the catalytic site (Dolega, 2010; Plapp, 2010). The substrate then enters the catalytic pocket, where it binds the catalytic Zn ion, replacing the expelled water. A proton dissociates from the hydroxyl group of the alcohol and is transported away from the catalytic site by an amino acid network that relays it to the surface of the enzyme and releases it. Dissociation of this proton from the alcohol facilitates removal of hydrogen from the adjoining carbon atom as a hydride (H−). The hydride is transferred to NAD+, reducing it to NADH. As a result of removal of the two hydrogen atoms, the alcohol is converted to an aldehyde, NAD+ is reduced to NADH, and a proton is added to the surrounding milieu. Formation of NADH causes a conformational change that allows release of the products, and a water molecule enters the catalytic domain to re-form the fourth bond with Zn.

The reduction of aldehyde to alcohol follows the same steps in reverse, but with no prescribed order for substrate and co-enzyme binding. Reactions in both directions are kinetically feasible, with roughly similar Km values, and plants rely on the conversion of alcohol to aldehyde and aldehyde to alcohol. Potential substrates are restricted by the amino acid sequence of the catalytic domain, which determines the shape of the substrate-binding pocket.

In the beginning

Traditionally in biology, a search for the origin of life was a search for the universal ancestor, a cell that gave rise to all extant life forms, a first true cell whose descendants slowly acquired complexity and diverged to become the three major divisions we recognize today: the archebacteria (or Archaea), eubacteria (Bacteria) and eukaryotes (Eucarya). Woese (1998) and Philippe and Forterre (1999) summarized some of the problems with this approach, all emanating from core phylogenetic incongruities. Some archebacterial features, e.g. some aminoacyl-tRNA synthetases, are closer to those of eubacteria, while others are more like those of eukaryotes. Phylogenetic trees based on rRNA sequences cannot separate the three lineages. There are metabolic functions peculiar to the two prokaryotic groups, but phylogenies generally show archebacteria to be specific relatives of eukaryotes (Woese, 1998).

Woese’s contribution was to discard the ‘cell’ paradigm, and with it the concept of vertical inheritance, at least in the early stages of life. Instead he suggested the existence of simple and abundant, somewhat enclosed, systems capable of an elementary form of translation. These pregenotes, or almost-cells, contained very basic metabolic capabilities, some translational machinery, and simple genes organized on the basis of metabolic roles on plasmid-like mini-chromosomes, carried in many copies. Translation was sloppy, so only short polypeptides had much likelihood of being faithfully produced. Boundaries were minimal, and polypeptides, metabolites and chromosomes were readily exchanged within the community of cells. Mutations were rampant; what was useful was likely to be kept, not by an individual but by the community. As replication and translation slowly became more refined, sub-populations within the community began to segregate based on their continued ability to share genomic information and gene products. Lineages, including the three extant domains, eventually emerged, and members came to survive by means of vertical inheritance. Thus none of the groups is descended from another; all shared in the early days of lateral inheritance.

With the tightening of replication and translation, the ability to produce longer peptides, more faithfully translated, would have arisen. As suggested by Rossmann et al. (1974), an early polypeptide to emerge, probably in the pregenote era, formed the so-called Rossmann fold. Imagine, within the pregenote community, protoplasmids with coding capacity for an oligopeptide that formed a short β-coil, with a bit of α-helix at one end (Figure 3). Creation of a nucleotide-binding site required only triplication of that oligopeptide to form a β-sheet consisting of β-α-β-α-β, with the two α-helices on one side of the sheet. Duplication then produced a structure (the Rossmann fold) that could bind both the nicotinamide and adenine moieties of NAD(P).

Figure 3.

 Postulated origin of ADH-P enzymes in the pregenote era from a primordial β coil with a short α-helical tail.
Triplication followed by doubling created the Rossmann fold, capable of binding NAD(P). Fusion with an ancestral GroES gave rise to the superfamily of dehydrogenase/reductases known as MDRs, including GSH-FDH, the direct ancestor of ADH in both plants and animals.

The SDR superfamily is thought to have originated very early. Members consist essentially of a Rossmann fold plus a simple extension at the N-terminus that allows substrate binding (Jörnvall et al., 2010). Most are active as dimers or tetramers, and rely on α-helices of the nicotinamide-binding region for subunit interactions (Jörnvall et al., 1995). Members utilize either NAD or NADP. Best known are the insect alcohol dehydrogenases, but SDR superfamily members include steroid dehydrogenases, isomerases and members of fatty acid synthase enzyme complexes, among others (Kallberg et al., 2002). Evidence for the early origin of the SDR family is provided by its simple architecture, wide occurrence, interaction with RNA nucleotides, and derived dendrograms with deep-branched patterns suggestive of early radiation (Jörnvall et al., 2010).

As illustrated in Figure 3, the MDR parent protein was created by addition to the Rossmann fold of a specific catalytic domain consisting almost entirely of an irregular all-β segment. It has been identified as GroES-like (Murzin, 1996; Taneja and Mande, 1999), on the basis of its similarity to the GroES molecular chaperone, chaperonin-10 (Taneja and Mande, 1999). Again, due to its widespread occurrence in archebacteria, eubacteria and eukaryotes, the ancestral GroES–Rossmann-fold molecule that gave rise to the MDR superfamily is thought to have arisen in the pregenote era (Jörnvall et al., 2010).

The Rossmann fold is found in many of the estimated 1000 protein superfamilies (Chothia, 1992; Murzin, 1996) that are defined by shared structure and ancestry of their members. It is present, for example, in members of the lactate dehydrogenase/malate dehydrogenase and aldehyde dehydrogenase superfamilies, as well as glucose-, glutamate- and glyceraldehyde-3-phosphate dehydrogenases, and the SDR and MDR proteins (Buehner et al., 1973; Rao and Rossmann, 1973; Rossmann et al., 1974; Elder, 2000; Madern, 2002). All these classes of enzymes evolved from a Rossmann fold through addition of domains whose three-dimensional shapes allowed binding to and reactivity with various substrates.

A side note to this story feeds the debate between supporters of the view that introns were present in the earliest genes and proponents of their later acquisition (Elder, 2000). As Duester et al. (1986) pointed out, no intron sites are shared between plant and animal ADH genes, but all existing ADH introns fall between, never within, the nucleotide stretches encoding the β-coils of the Rossmann fold. They suggested that this finding is more consistent with an ancient form in which short functional units, e.g. the β-coils, were separated by non-coding segments that facilitated exon shuffling, giving rise to the Rossmann fold and eventually to the enzymes that carry the domain today, whose genes retain a subset of the primordial introns.

There is general consensus that the MDR superfamily can be divided into eight major families (Riveros-Rosas et al., 2003; Persson et al., 2008). Using the nomenclature and divisions of Persson et al. (2008), they are (i) ADH, (ii) TADH (active as tetramers, previously called YADH based on their prevalence in yeast), (iii) PDH (polyol dehydrogenases, including sorbitol dehydrogenase), (iv) CAD (cinnamyl alcohol and mannitol dehydrogenases), (v) LTD (leukotriene dehydrogenases), (vi) TDH (threonine dehydrogenases), (vii) BPDH (poorly characterized bacterial and plant dehydrogenases), and (viii) YHDH (found only in bacteria so far, and named for one of its members, yhdH, an Escherichia coli protein of unknown function). A number of smaller families, each with fewer than 10 members, also fall into the MDR category (Persson et al., 2008).

Members of the ADH, TADH, PDH and CAD families rely on Zn co-factors, but other MDR members, mostly from bacteria, do not; it is assumed that the Zn-free forms arose first. Association of Zn co-factors with a primitive MDR may have occurred in the early days of atmospheric oxygen, when Zn is likely to have been a preferred co-factor due to its valence stability (Jörnvall et al., 2010). The Zn-containing MDR proteins are generally characterized by less highly conserved catalytic domains, probably because Zn ligands maintain the required structure and thereby reduce the need for amino acid conservation (Jörnvall et al., 2010).

Evolution and structure of ADH-P gene families

It has been recognized for more than a decade that the ADH-P gene family originated from a glutathione-dependent formaldehyde dehydrogenase gene (GSH-FDH), also known as Class III ADH, sometime after the divergence of the plant and animal kingdoms (Figure 3). The argument rests on the presence of GSH-FDH genes in virtually all life forms, and the tight conservation of amino acid sequence (Danielsson et al., 1994): e.g. pea and human GSH-FDHs show 69% amino acid identity, compared with 47% between their respective ADHs, and there are just three changes in 23 amino acids deemed important for substrate and co-enzyme binding (Martínez et al., 1996; Shafqat et al., 1996). Plant and animal ADH genes are agreed to have subsequently undergone convergent evolution (Martínez et al., 1996; Shafqat et al., 1996; Dolferus et al., 1997).

For a generation, molecular evolutionists relied on gene or gene-family phylogenies to derive likely patterns of historical relatedness, including those of plant ADH. Once DNA sequences became available, dendrograms were sometimes based on expressed genes, sometimes on genomic sequences. Examples of two such ADH phylogenies, one created by a maximum-likelihood method (Fukuda et al., 2005) and the other by a neighbor-joining method (Small and Wendel, 2000), are reproduced, in condensed form, in Figure 4. In Figure 4(a), the clustering of lettuce (Lactuca) and potato (Solanum) genes with those of palm (Washingtonia), and the presence of two legume ADH2 genes (Wisteria and Sophora) in the monocot branch, are counter-intuitive. The same is true for the placement of peony (Paeonia) and pear (Pyrus) genes, together with a subset of cotton (Gossypium) ADH genes, in the monocot lineage in Figure 4(b). Such unexpected branching patterns are hardly unusual in gene family phylogenies.

Figure 4.

 Two phylogenies created from ADH coding sequences.
(a) Dendrogram created by the maximum-likelihood method, condensed and redrawn from Fukuda et al. (2005). Note the clustering of lettuce (Lactuca) and potato (Solanum) sequences with those of palm (Washingtonia) in the monocot branch, and the clustering of ADH2, but not ADH1, sequences from two legumes (Wisteria and Sophora) with those of trillium and pear (Pyrus), also in the monocot branch.
(b) Dendrogram created by the neighbor-joining method, condensed and redrawn from Small and Wendel (2000). Note the presence of peony (Paeonia), pear (Pyrus) and two of eight cotton (Gossypium) sequences in the monocot branch.
Both original phylogenies used pine (Pinus banksiana) ADH as an outgroup.

Shortly before publication of the first plant full-genome sequence, Ramsey and Schemske (1998) suggested that the genomes of between 47 and 70% of angiosperm species had undergone polyploidization, and other researchers concluded that the history of plant genomes must have included multiple expansion/contraction events (Clegg et al., 1997; Leitch and Bennett, 1997). Today it is clear that virtually all extant angiosperms are the products of recurring cycles of whole-genome duplication followed by massive gene loss (‘fractionation’) and genome re-organization, interspersed with small-scale duplications and deletions (Leitch and Bennett, 1997; Bowers et al., 2003; Jaillon et al., 2007; Liu and Adams, 2007; Paterson et al., 2010; Tang et al., 2010). It has been suggested that gymnosperm genomes have been much more stable (Leitch and Bennett, 1997): although there are seven ADH genes in P. banksiana, they are divided into two tight linkage groups (Perry and Furnier, 1996), presumably the result of small-scale duplications. Overall, the quirks of plant gene phylogenies are more easily accepted knowing that extant families are the outcomes of cycles of gene expansion and contraction: a slower-evolving (or newer) form could be lost at any time, or maintained in some taxa but not in closely related groups.

Another lesson emphasized by genome sequence analyses is that molecular clocks are very much lineage-specific. The difference in DNA substitution rates in the lineages leading to poplar and Arabidopsis thaliana, for example, is estimated to be six-fold (Tuskan et al., 2006). A. thaliana is unusually fast-evolving, having lost approximately 30% of its genome and undergone nine chromosomal rearrangements since its divergence from A. lyrata (Paterson et al., 2010).
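The practical consequence of lineage-specific clocks can be illustrated with a small calculation. The sketch below assumes the simplest possible model, in which the distance between two sequences accumulates as K = (r_a + r_b) × T, where r_a and r_b are the substitution rates in the two lineages and T is the time since divergence. The function names and all numerical values are hypothetical; only the six-fold ratio between the two rates is taken from the text.

```python
import math


def pairwise_distance(time_years, rate_a, rate_b):
    """Expected substitutions per site separating two lineages after
    `time_years`, under the simple model K = (r_a + r_b) * T."""
    return (rate_a + rate_b) * time_years


def divergence_time(distance, rate_a, rate_b):
    """Invert the same model: T = K / (r_a + r_b)."""
    total_rate = rate_a + rate_b
    if total_rate <= 0:
        raise ValueError("the summed substitution rate must be positive")
    return distance / total_rate


# Hypothetical rates (substitutions per site per year) differing six-fold.
SLOW_RATE = 1.0e-9
FAST_RATE = 6.0e-9
TRUE_TIME = 100e6  # hypothetical divergence time, in years

k = pairwise_distance(TRUE_TIME, SLOW_RATE, FAST_RATE)

# Using the correct lineage-specific rates recovers the true time.
correct = divergence_time(k, SLOW_RATE, FAST_RATE)
# Assuming one universal clock, slow or fast, skews the estimate.
slow_clock = divergence_time(k, SLOW_RATE, SLOW_RATE)
fast_clock = divergence_time(k, FAST_RATE, FAST_RATE)

for label, years in [("lineage-specific", correct),
                     ("slow clock only", slow_clock),
                     ("fast clock only", fast_clock)]:
    print(f"{label}: {years / 1e6:.1f} million years")
```

Under these assumed values, a single slow clock overestimates the divergence time several-fold and a single fast clock underestimates it, which is one reason date ranges for the same event can be so wide.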

The large-scale pattern of events occurring over the past few hundreds of millions of years is summarized in Figure 5; two timescales are included to cover a range of estimates derived from reliance on faster versus slower molecular clocks. Although details vary, the pattern is consistent with all currently sequenced genomes, including those of Arabidopsis (The Arabidopsis Genome Initiative, 2000), cucumber (Cucumis sativus) (Huang et al., 2009), grape (Vitis vinifera) (Jaillon et al., 2007), maize (Zea mays) (Schnable et al., 2009), papaya (Carica papaya) (Ming et al., 2008), poplar (Populus trichocarpa) (Tuskan et al., 2006), rice (Oryza sativa) (Goff et al., 2002; Yu et al., 2002; International Rice Genome Sequencing Project, 2005) and sorghum (Sorghum bicolor) (Paterson et al., 2009). The angiosperm/gymnosperm division is set at around 300 million years ago (MYA) (Bowers et al., 2003). Along the line leading to extant monocots, which diverged from eudicots 175–235 MYA, or perhaps 125–140 MYA (Paterson et al., 2010), a whole-genome doubling designated ‘σ’ occurred approximately 130 MYA, at about the time that the angiosperm radiation began (De Bodt et al., 2005). A more recent doubling in the grass lineage, designated ‘ρ’, occurred roughly 70 MYA, close to the origin of the Poaceae, and 20 million years before divergence of the rice and maize/sorghum lineages (Paterson et al., 2009). Roughly 10 MYA, 2 million years after the divergence of maize and sorghum, maize underwent another whole-genome doubling (Woodhouse et al., 2010).

Figure 5.

 Diagrammatic representation of whole-genome plant duplication (and triplication) events occurring between roughly ten and three hundred million years ago, and identified through genome analyses.
Two timescales are provided at the top to accommodate uncertainties in the dating. Horizontal blue lines represent genomes, vertical lines indicate major divergences, and genome doubling or tripling and subsequent gene loss (fractionation) are indicated by the broadening and narrowing of horizontal lines. Two doublings in the monocot lineage, and three in the lineage leading to Arabidopsis, have been assigned Greek letters as shown (see text for details).

In the eudicot clade, the common ancestor underwent triplication sometime between 100 and 168 MYA, resulting in hexaploidy (De Bodt et al., 2005; Jaillon et al., 2007; Ming et al., 2008; Tang et al., 2008). Evidence for this event, designated ‘γ’, is particularly clear for the grape genome, which has not undergone any whole-genome events since, and appears to have suffered much less gene loss and chromosomal rearrangement than other well-characterized angiosperms (Jaillon et al., 2007). Sometime after divergence of the poplar lineage, dated at 100–120 MYA, another whole-genome doubling, ‘β’, occurred in the eurosids, roughly 66–109 MYA (Bowers et al., 2003). The genome of the poplar ancestor doubled independently approximately 65 MYA (Tuskan et al., 2006). Yet another doubling occurred in the Arabidopsis ancestor after its divergence from the line leading to papaya, an event designated ‘α’ and dated at 20–60 MYA (Blanc and Wolfe, 2004).

Following whole-genome duplications, changes occurred fairly quickly, particularly a return to diploidy for most genes (Paterson et al., 2009, 2010; Tang et al., 2010; Woodhouse et al., 2010). The probability of loss correlated roughly with gene function: genes involved in development, transcriptional regulation and signal transduction were preferentially retained (De Bodt et al., 2005). Ironically, in the Arabidopsis genome, which carries a single ADH-P gene, duplicate genes involved in glycolysis were less likely to be lost than average. It has also been shown that, following the most recent duplications in both Arabidopsis and maize, the genes of one homeologue were much more likely to be lost than those of its partner (Thomas et al., 2006; Woodhouse et al., 2010). The effect has been reproduced in the laboratory: in a study on synthetic Brassica polyploids, Song et al. (1995) concluded that, after genome doubling, the chromosomes from the paternal donor were more likely to undergo fractionation. In this case, the process was rapid, with fractionation apparent within five generations.

Work with a newly doubled genome in cotton has shown that functional diversification of duplicated genes, like the gene loss described above, can occur rapidly. Commercial cotton, Gossypium hirsutum, is traditionally considered to be tetraploid due to the latest whole-genome duplication, roughly 1.5 MYA. Homeologous ADH genes tend to have complementary expression patterns; in floral development, for example, ADHA from the ‘At’ genome is the only form active in pollen, and its homeologue from the ‘Dt’ genome accounts for all ADHA activity in the carpel. In synthetic tetraploid cotton generated from A- and D- genome diploids, the same complementary pattern of homeologue expression is found (Adams et al., 2003; Liu and Adams, 2007). Similar to the induction of fractionation, the act of polyploidization itself is suspected to set in motion epigenetic events that result in diversified gene expression of homeologues (Adams et al., 2003).

In their analysis of Arabidopsis transcription profiles, Blanc and Wolfe (2004) identified several cases in which groups of genes that act coordinately have co-diverged following duplication, a process they call ‘concerted divergence’. Present in a single copy, ADH does not contribute to an example of concerted divergence in Arabidopsis, but in petunia, where duplicated PDC and ADH genes act consecutively in fermentation, ADH1 and PDC2 are exclusively paired for aerobic fermentation in pollen, while ADH2 and PDC1 act together in anaerobic fermentation (Strommer and Garabagi, 2009), similar to the co-divergent gene sets described by Blanc and Wolfe (2004).

Small-scale duplications are also extremely prevalent in the history of the ADH-P family. Recent events can be inferred from the presence of duplicate genes with high sequence homology and tight linkage. At least one identified accession of maize carries an ADH1 duplication, the ADH1-Cm allele (Schwartz and Endo, 1966). Stephanomeria exigua (small wirelettuce) carries between one and three tightly linked ADH1 genes, linked in turn to the ADH2 gene (Roose and Gottlieb, 1980). Tight linkage of very similar ADH genes has also been shown for cotton (Small and Wendel, 2000).

Introns contribute more evidence for patterns of gene evolution. They can be informative by their total absence, taken as evidence for gene acquisition through reverse transcription and insertion, or from the loss of single introns (‘intron exclusion’), which is thought likely to occur through double-strand break repair (Hu, 2006). The standard, presumably ancestral, number of introns in ADH-P genes is nine, located at equivalent positions in ADH genes throughout the plant kingdom. Re-insertion of introns at the specific site from which they were lost is considered highly unlikely, and therefore genes with a fuller intron complement are assumed to more closely reflect the ancestral structure.

The classic intron pattern, together with a number of exceptions, is presented in Figure 6. Descendants in the FDH lineage lack intron VII (Dolferus et al., 1997). Pine (Pinus banksiana) ADH genes retain the full complement (Perry and Furnier, 1996), as do those of most angiosperms. In barley (Hordeum vulgare), however, characterized ADH genes lack intron IX (Trick et al., 1988), as does a wheat (Triticum) ADH gene with approximately 90% nucleotide sequence homology to the barley genes (Mitchell et al., 1989). Amongst eudicots, the Brassicaceae show significant variability. Arabidopsis thaliana lacks introns IV, V and VII. Plants in the closely related genus Leavenworthia have three expressed ADH genes: ADH1, which lacks introns IV, V and VII, ADH2, which also lacks intron VI, and ADH3, which has no introns (Charlesworth et al., 1998). In species of Arabis, another closely related genus, the three ADH genes are similar to those of Leavenworthia: one without introns IV, V and VII, another lacking introns IV–IX, and a third with no introns (Koch et al., 2000). Sequence analyses suggest independent origins of the Leavenworthia and Arabis patterns (Koch et al., 2000), both of which appear to have originated with an Arabidopsis-like ancestral ADH. A Brassica oleracea gene included in the same study carries the Arabidopsis intron pattern, with the additional (and presumably subsequent) loss of intron IX. In Gossypium, the ADHB, C and D loci carry genes with the usual nine introns, but ADHA lacks introns IV and VII (Small and Wendel, 2000). The simplest models for intron loss tend to fit with phylogenies based on relatedness of coding sequences (Koch et al., 2000; Small and Wendel, 2000).
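Because re-insertion of a lost intron at its original site is considered highly unlikely, the intron complements described above can be treated as sets, and a loss-only model reduces to a subset test. The sketch below encodes some of the patterns listed in the text; the gene labels, the helper names (`retained`, `could_derive_by_loss`, `additional_losses`) and the choice of genes are illustrative assumptions. A subset relationship is a necessary condition for derivation by intron loss, not evidence of actual descent.

```python
# Ancestral ADH-P complement: introns I-IX, encoded as integers 1-9.
ANCESTRAL = frozenset(range(1, 10))


def retained(*lost):
    """Introns remaining after the listed introns have been lost."""
    return ANCESTRAL - set(lost)


# Intron complements as described in the text (labels are illustrative).
GENES = {
    "Pinus banksiana ADH": retained(),
    "Hordeum vulgare ADH": retained(9),
    "Arabidopsis thaliana ADH": retained(4, 5, 7),
    "Leavenworthia ADH1": retained(4, 5, 7),
    "Leavenworthia ADH2": retained(4, 5, 6, 7),
    "Leavenworthia ADH3": frozenset(),
    "Brassica oleracea ADH": retained(4, 5, 7, 9),
    "Gossypium ADHA": retained(4, 7),
}


def could_derive_by_loss(ancestor, descendant):
    """True if the descendant pattern is reachable from the ancestor
    pattern by intron loss alone (no re-insertion)."""
    return GENES[descendant] <= GENES[ancestor]


def additional_losses(ancestor, descendant):
    """Introns present in the ancestor pattern but absent in the descendant."""
    return sorted(GENES[ancestor] - GENES[descendant])


arabidopsis = "Arabidopsis thaliana ADH"
for gene in ("Leavenworthia ADH2", "Brassica oleracea ADH", "Gossypium ADHA"):
    if could_derive_by_loss(arabidopsis, gene):
        print(gene, "fits an Arabidopsis-like ancestor; further losses:",
              additional_losses(arabidopsis, gene))
    else:
        print(gene, "retains an intron absent from the Arabidopsis pattern")
```

In this encoding the Leavenworthia and Brassica oleracea patterns are compatible with an Arabidopsis-like ancestor, whereas cotton ADHA retains intron V and therefore must have lost its introns independently.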

Figure 6.

 Variations on the standard ADH-P pattern of nine introns.
Introns are numbered at the top and their positions in selected genes are indicated by vertical black bars. Cotton ADH introns were identified from partial cDNA clones. Sources are presented at the left, either as accession numbers or references, where DOLF indicates Dolferus et al., CHAR indicates Charlesworth et al., MITCH indicates Mitchell et al., and SMALL indicates Small and Wendel.

Summarizing what can be deduced in general about ADH-P family history from this evidence, a GSH-FDH gene duplicated early in the plant lineage, with one copy acquiring substrate affinity for acetaldehyde/ethanol and acting in ethanolic fermentation. Subsequently, plant genomes underwent cycles of duplication and contraction. At some point, sets of duplicated genes acquired sufficient functional diversification, perhaps quite rapidly, for natural selection to act upon. The fundamental ADH1/2 duplication is estimated to have occurred in the grass lineage at about the same time as the ρ doubling event (65–70 MYA) (Gaut et al., 1999). The features associated with angiospermy – flowers, insect pollination and animal-based seed dispersal – were also in place by approximately 65 MYA (De Bodt et al., 2005). Thus genome duplication, distinct monocot ADH1 and ADH2 lineages, and potential new uses for ADH activity all arose in the monocot clade over a fairly narrow period of time. It is likely that the similarly timed β whole-genome duplication, 66–109 MYA, was critical for the generation and subsequent functional divergence of ADH gene pairs in the eudicot lineage.

Since that time, genomes have undergone cycles of expansion and contraction, maintaining small ADH gene families to cover the original need for fermentation and meet additional demands associated with angiospermy. A study comparing regions of rice and sorghum genomes orthologous to the region of maize chromosome 1 carrying the ADH1 gene recreated a lively history of insertions, duplications and deletions (Ilic et al., 2003). Through such changes, amplified over the full chromosome set, the current forms of ADH gene families, together with the rest of the plant genome, have been sculpted.

ADH-P functions and biochemistry

Figure 7 summarizes ADH gene expression patterns found in three monocots and three eudicots for which significant information is available. Due to the histories of independent expansion/contraction events, functional evolution is best considered separately for monocot and eudicot lineages. Contributing to the difficulty in drawing any broad conclusions for either group is the narrow range of taxa represented, with only cereals for the monocots and two solanaceous species for the eudicots. However, some general conclusions can be drawn. The norm (although not in all Arabidopsis spp.) is a small family of two or three ADH genes, following the general pattern for enzyme-encoding genes of plants (Gottlieb, 1982). No unique role for ADH2 forms is obvious in the cereals, but there are clearly unique patterns of expression for gene family members in the Solanaceae.

Figure 7.

 ADH expression patterns for three monocots and three eudicots.
The numbers at the bottom refer to the genes expressed in listed tissues. Significantly higher or lower levels of expression are indicated in bold or parentheses, respectively. Hypoxia is indicated by a symbol. In general, expression in pollen and hypoxic roots of seedlings is higher than in other tissues. Sources are: Schwartz and Endo (1966), Schwartz (1971), Freeling and Schwartz (1973) and Okimoto et al. (1980) (maize); Ricard et al. (1986), Kadowaki et al. (1988), Xie and Wu (1989, 1990) (rice); Hanson et al. (1984) and Mayne and Lea (1984) (barley); Garabagi and Strommer (2004) and Garabagi et al. (2005) (petunia); Tanksley (1979) and Tanksley and Jones (1981) (tomato); Dolferus et al. (1984) (Arabidopsis).

The best known, and perhaps oldest, role for ADH genes in plants is in the hypoxic, or anaerobic, response. Plants respond to hypoxia (marked by a symbol in Figure 7) by induction of ADH activity in specific vegetative tissues. Under normal (‘normoxic’) conditions, glycolysis produces pyruvate, which is shuttled into the mitochondria, in which respiration relies on the presence of oxygen to convert the glycolysis-derived carbons to carbon dioxide, producing water and (most importantly) energy. When sufficient oxygen for normal respiration is not available, e.g. in a flooded field, many plants modify their architecture by elongating their stems and/or creating aerenchyma that deliver oxygen from exposed leaves to submerged plant parts (Bailey-Serres and Voesenek, 2008). In addition, or alternatively, they shift from aerobic to anaerobic glycolysis, i.e. fermentation. Normal patterns of transcription and translation cease and a small number of ‘anaerobic peptides’ are preferentially produced (Okimoto et al., 1980). These proteins primarily represent enzymes that are used in glycolysis and fermentation, including lactate dehydrogenase (Hoffman et al., 1986), pyruvate decarboxylase (PDC) (Laszlo and St Lawrence, 1983) and alcohol dehydrogenase (Okimoto et al., 1980; Sachs et al., 1980).

As shown on the left in Figure 8, when respiration is inhibited, the pyruvate produced by glycolysis is converted by lactate dehydrogenase to lactic acid. PDC and ADH genes produce the enzymes required to divert pyruvate from synthesis of lactic acid to synthesis of ethanol, which is less toxic than either lactic acid or the intermediate acetaldehyde, and is more readily diffusible. The switch to fermentation allows continued metabolism of glucose and ATP regeneration at a low level, and regeneration of NAD+. Although energy is produced at roughly one tenth of the efficiency of oxidative phosphorylation, the limited energy associated with anaerobic glycolysis is considered to play an important role in survival of the cell. The reduced ability of mutant plants lacking functional ADH genes to withstand hypoxic stress has been reported for a number of plant systems, and ADH null mutants of Arabidopsis have significantly less resistance to acetaldehyde (Schwartz, 1964; Harberd and Edwards, 1982; Jacobs et al., 1988). The reduction of acetaldehyde under hypoxic conditions can thus be assumed to represent a key role for ADH in both monocots and eudicots.

Figure 8.

 Known enzymatic roles for alcohol dehydrogenases in plants.
For clarity, the regeneration of NAD+ accompanying the production of lactic acid, ethanol and aroma-associated alcohols is not shown. Abbreviations: LDH, lactate dehydrogenase; PDC, pyruvate decarboxylase; LOX, lipoxygenase; HPL, hydroperoxide lyase; ALDH, aldehyde dehydrogenase; AcCoAS, acetyl CoA synthetase.

In addition to tissues that produce ADH in response to flooding or laboratory manipulation, there are others that are likely to be oxygen-limited, such as those associated with the vasculature, set deep in flower buds, enclosed in seed coats, or in fast-growing meristems. The existence of a chronic state of hypoxia in the root stele was predicted many years ago based on mathematical modeling and the measurement of volatiles (Armstrong and Beckett, 1987; Thomson and Greenway, 1991). Use of histochemical methods has provided evidence for a low level of ADH activity both in the stele and the vasculature of seedling leaves (Foster Atkinson, 1996), and, when young petunia plants are subjected to a hypoxic environment, ADH activity is seen to spread from the vasculature through surrounding tissues (Foster Atkinson, 1996). Low levels of ADH gene expression in the ovary (Garabagi et al., 2005) probably also reflect a state of chronic mild hypoxia, based on a report by Liskens and Schrauwen (1966) indicating that oxygen tension drops suddenly at the base of the style.

Although multiple ADH isozymes are often present in seeds, the ratios of homodimer to heterodimer are skewed from those expected by random dimerization (Tanksley and Jones, 1981; Xie and Wu, 1989; Garabagi et al., 2005). Studies in which ADH activity was followed through seed development provided clear evidence that duplicated genes are active at different times (Torres et al., 1977; Tanksley and Jones, 1981; Hanson et al., 1984), suggesting multiple modes of induction. Developing seeds are likely to experience low oxygen stress, and ADH accumulated during seed development may be important during germination, but ethanolic fermentation may have an additional important role in the mature seed, which remains metabolically active at a low level. Zhang et al. (1994, 1995a,b, 1997) showed that seeds produce acetaldehyde, particularly under low-humidity, low-temperature conditions. Acetaldehyde is especially toxic in seeds, where it tends to form aldehyde–protein adducts that are linked to decreased viability. ADH activity in seeds may thus be critical for limiting this form of acetaldehyde toxicity. The changing patterns of ADH isozyme production that occur during seed maturation may represent enzymological adaptation to distinct needs associated with embryogenesis, seed maturation and viability.
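The expectation against which these skewed isozyme ratios are judged can be made explicit. As a minimal sketch, assuming two subunit types that associate at random and in proportion to their abundance (the function name and the example fractions are illustrative, not values from the cited studies), the expected dimer classes follow a binomial distribution:

```python
def expected_dimer_fractions(p):
    """Expected dimer fractions under random association of two subunit types.

    p is the fraction of subunits contributed by the first gene (0 <= p <= 1).
    Returns (first homodimer, heterodimer, second homodimer).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


if __name__ == "__main__":
    # Equal subunit pools (hypothetical value) give the 1:2:1 pattern.
    print(expected_dimer_fractions(0.5))
```

Departure of the observed homodimer and heterodimer proportions from these values is what is meant by a skewed ratio; the sketch says nothing about the cause of the skew.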

Pollen grains express very high levels of ADH, typically ADH1 forms (Figure 7). It was initially assumed that these high levels of ADH activity were important for regenerating ATP under hypoxic conditions in the anther and/or under the energy-demanding conditions of pollen tube growth (measured at 1 cm/h in maize) (Taylor and Hepler, 1997). Both acetaldehyde and ethanol are produced throughout pollen development (Tadege et al., 1999) and pollen tube growth (Bucher et al., 1995), as expected for a hypoxic response. However, there were problems with assuming that ADH1 activity in the pollen was simply another response to hypoxia. The first problem was that the pollen of adh1 null mutants is not obviously disadvantaged, with alleles from ADH1/adh1 heterozygotes passed to progeny in normal Mendelian ratios (Freeling and Bennett, 1985). A second problem was the recovery of a maize mutant allele of ADH1 that was hypoxically inducible but was not expressed in pollen (Kloeckener-Gruissem and Freeling, 1987), indicating that modes of regulation differ between hypoxic induction and pollen expression; this finding was reinforced by direct demonstration of distinct promoter regions required (McCormick, 1991, 1993).

The role of ADH in pollen grains was explained at least in part by experiments performed by Kuhlemeier and co-workers (Bucher et al., 1995; Tadege et al., 1999; Mellema et al., 2002; Gass et al., 2005). The fate of pyruvate carbon under ‘normoxic’ conditions is passage through the so-called PDH complex to form acetate and then acetyl CoA, for use in the tricarboxylic acid cycle as well as fatty acid and amino acid biosynthesis. Kuhlemeier’s group provided incontrovertible evidence for the existence of a yeast-like ‘PDH bypass’ operating aerobically in pollen. This system uses PDC and aldehyde dehydrogenase (ALDH) to feed pyruvate-derived aldehyde, rather than pyruvate itself, into energy production and biosynthetic pathways, as shown in the centre panel of Figure 8. Under these circumstances, ADH essentially buffers the system, regulating carbon flow into the PDH bypass while protecting the cell from acetaldehyde toxicity. These studies suggested an explanation for some apparently unrelated findings, such as the dependence on ALDH (but not ADH) for normal anther development in maize (Liu and Schnable, 2002), and identification of the rf2 gene, which is required for restoration of fertility in cytoplasmic male-sterile T maize lines, as ALDH (Cui et al., 1996).

More recently, evidence has accumulated indicating that PDH bypass activity also occurs in sporophytic tissues (Wei et al., 2009). The simplest way to demonstrate PDH bypass activity is to feed experimental tissues 14C-labeled ethanol. The ethanol can be converted to acetaldehyde by ADH, but the conversion of pyruvate to acetaldehyde by PDC is irreversible. Recovery of 14C in CO2 or high-molecular-weight compounds is therefore evidence for its passage from acetaldehyde through the PDH bypass. Recovery of 14C in fatty acids has been demonstrated in seedlings, leaves, roots and flowers of Arabidopsis. Moreover, a knockout mutant for acetyl CoA synthetase in Arabidopsis lacks the ability to metabolize the introduced ethanol to fatty acids (Lin and Oliver, 2008), and the pathway is severely depressed in plants with inactive aldh alleles (Wei et al., 2009). A close study of adh null mutants of Arabidopsis revealed elevated levels of pyruvate in ‘normoxic’ roots, evidence that the ADH-dependent PDC/PDH-stat plays a role in controlling the delivery of pyruvate to the PDH complex in roots under normal conditions (Zabalza et al., 2009). In this study, the original evidence was obtained from pea roots rather than Arabidopsis, suggesting that studies of PDH bypass activity could be extended to plants with a family of ADH genes.

High levels of ADH activity have also been reported in the styles of solanaceous flowers, specifically the transmitting tract through which pollen travels to reach the ovary (Figure 7). Transmitting tracts of petunia and tobacco flowers exhibit constitutively high levels of ADH gene expression (Garabagi and Strommer, 2004; Garabagi et al., 2005; Gass et al., 2005), and ADH activity has been shown to be induced by the growing pollen tube in styles of potato (van Eldik et al., 1997). The transmitting tract is unlikely to be hypoxic: although a transient wave of low oxygen tension accompanies the tip of the growing pollen tube, high oxygen tension has been measured in the styles of Hippeastrum (Liskens and Schrauwen, 1966). In addition, during pollen tube growth, the tube excretes ethanol at a concentration predicted to reach 100 mM, as measured in vitro using tobacco pollen (Bucher et al., 1995). It is most likely that ADH produced by the pistil scavenges ethanol released by the pollen tube and converts it to acetaldehyde for delivery into the PDH bypass. Petunia relies on ADH2/ADH3 gene products for this activity.

Attached to the base of the ovary in petunia is a pair of nectaries. They are buried low in the flower in what may be a somewhat hypoxic environment, but, more importantly, they serve as a source of aromatic volatiles for attracting pollinators (Fahn, 1979). These volatiles include C6 and C9 aldehydes and alcohols produced through the lipoxygenase pathway, shown on the right in Figure 8, and depend on ADH for the aldehyde/alcohol conversion. In petunia, these volatiles have been shown directly to excite the antennae of pollinating insects (Hoballah et al., 2005). The petals of petunia have a similar aroma and serve a similar function (Gübitz et al., 2009). In petunia, it is the genes encoding ADH2 and ADH3 that act in the lipoxygenase pathway to produce floral scent (Garabagi and Strommer, 2004; Garabagi et al., 2005).

C6 and C9 volatiles, and in some cases their acetate derivatives, are also important determinants of fruit flavor, which helps to attract animals that serve as agents of seed dispersal (Schwab et al., 2008). The expression of ADH genes is under tight developmental control in ripening fruit. In some cases, including peach (Prunus persica) and apricot (Prunus armeniaca), ADH activity and/or alcohol levels are highest at an early stage in ripening, with derivative esters predominating at maturity (González-Agüero et al., 2009; Zhang et al., 2010). In others, including tomato (Lycopersicon esculentum), grape and melon (Cucumis melo), alcohols are associated with mature fruit. As shown in Figure 7, ripening tomato fruit contains a significant amount of ADH2, but no ADH1. In the fruits of transgenic tomato plants over-expressing ADH2, the levels of C6 alcohols were significantly increased relative to C6 aldehydes, and the fruit were reported to have more of a ‘ripe fruit’ flavor, attributed specifically to increased levels of Z-3-hexenol (Speirs et al., 1998). Similarly, in grape, which has a small family of ADH genes, the fruit produce higher levels of ADH and exhibit increased ADH activity in the late stages of ripening (Tesniere and Verries, 2000). This activity parallels the shift from a predominance of aldehydes to alcohols during berry maturation (Kalua and Boss, 2009).

Volatiles produced by the lipoxygenase pathway are not restricted to flowers and fruit: maceration of leaves leads to the almost immediate release of characteristic scents. While the volatiles involved may number in the hundreds, there is invariably a ‘green note’ scent that is determined largely by the types and ratios of C6 volatiles (Hatanaka, 1993). ADH activity is implicated directly and indirectly in production of the ‘green note’ scent: crushed leaves from adh mutant plants lack aroma (Salas et al., 2005; Nick Bate, Syngenta, personal communication), and ADH activity in petunia is readily observable in trichomes (E. Foster Atkinson and J. Strommer, University of Guelph, unpublished results), where volatiles are typically stored (Pichersky and Gershenzon, 2002). The ‘green note’ scents have been shown to mediate defenses against insect predators, both indirectly through recruitment of herbivore parasites and directly through deterrence of pests (Pichersky and Gershenzon, 2002).

Thus, as shown in Figure 8, the products of plant ADH genes allow ethanolic fermentation under oxygen-limited conditions and elimination of acetaldehyde, participate in aerobic fermentation similar to that of yeast, and produce characteristic scents that attract pollinators and agents of seed dispersal, and volatiles that discourage predation. To meet these demands, regulated expression of ADH genes or combinations of genes occurs under different developmental or stress-related conditions.

For both monocots and eudicots, the earliest ADH function was almost certainly in acetaldehyde reduction. Defense against herbivory came later, as did reliance on scent to attract agents for pollination and seed dispersal. At some point, the less well-understood process of aerobic fermentation arose. A study of ADH genes in monocots by Clegg et al. (1997) indicated a twofold slower rate of sequence change in the ADH1 forms compared with ADH2 forms. A slower rate of change has been suggested to reflect both older enzymatic function (Jörnvall et al., 2010) and stronger functional constraint (Clegg et al., 1997). This finding is consistent with the primary role for ADH1 forms in anaerobic fermentation and the predominant role of ADH1 in cereals.

Based on information obtained from selected tissues in a small subset of monocots, there is no unique role for ADH2, but it has been maintained for an estimated 65 million years (Gaut et al., 1999), and there is enzymatic and structural evidence for functional divergence. Mayne and Lea (1984) reported significantly different Km values for NAD+ for ADH isozymes of barley. Thompson et al. (2007) suggested the occurrence of adaptive evolution of ADH-P genes in maize, rice and barley based on the ratio of non-synonymous:synonymous substitutions in key areas of the enzyme, e.g., the loop around the second Zn ion, the dimerization region, and, to a lesser extent, the active site. The same group also compared predicted 3D structures for a number of ADH enzymes and calculated theoretical isoelectric points. The pI values were higher for ADH1-type isoforms than for their ADH2 counterparts in all cases (mean pI 6.30 versus 5.76; Thompson et al., 2010). The retention of multiple ADH genes and consistent differences in the chemical characteristics of their products suggest functional specialization that remains unidentified.

Evidence for functional specialization is more direct in solanaceous representatives of the eudicots, in which the presumed original need for anaerobic fermentation, together with production of scent through the lipoxygenase pathway, is met by genes designated ADH2 (and ADH3 in the case of petunia). It has been reported for both tomato and grape (Tesniere and Verries, 2000) that the ADH2 dimers associated with mature fruit, like the ADH2 dimers of cereals, have lower theoretical pI values than ADH1 forms. ADH1 activity is largely limited to pollen, where its enzymatic function appears to be associated with balancing carbon needs through aerobic fermentation, and seed, in which its function is unclear. The ADH genes of distantly related cotton exhibit similar patterns of differential function.

The documented roles for ADH-P enzymes suggest some basic biochemical attributes. First, enzymes that evolved to detoxify acetaldehyde and retain that function in the hypoxic response are expected to operate efficiently as reductases. Second, acting as a carbon-stat in the PDH bypass requires reasonable activity for the reverse reaction as well. Third, enzymes acting in the lipoxygenase pathway, specifically the forms found in scent- and flavor-producing tissues, must readily reduce the aldehydes associated with the lipoxygenase pathway. These conditions appear to be generally met.

The measured affinities of plant alcohol dehydrogenases for acetaldehyde are almost invariably higher than those for ethanol. Although the values for specific enzymes may vary as much as 20-fold, Km values for the aldehyde substrate are commonly of the order of 1 mM, while those for ethanol are approximately tenfold higher (see, for example, the measurements for seven plant ADHs in the study by Bicsak et al., 1982). Two exceptions are the well-studied ADH1-S homodimer of maize, which has Km values for acetaldehyde and ethanol of 42 and 24 mM, respectively (Osterman et al., 1993), and three ADH isozymes of wheat, which all have Km values of approximately 10 mM for the reaction in either direction (Suseelan et al., 1987).

Catalytic efficiencies also differ considerably, whether it is the forward and reverse reactions or the reactions of isozymes with the same substrate that are being compared. In the case of maize ADH1-S, the Vmax for acetaldehyde reduction was three times higher than that for the reverse reaction (Osterman et al., 1993). Where kinetic data are available for isozymes, e.g. VvADH2 and VvADH3 of grape (Tesniere and Verries, 2000), a lower Km for one enzyme/substrate pair is typically counterbalanced by a lower Vmax for either the reverse reaction or the alternative isozyme. This association of higher substrate affinity with one isozyme and higher catalytic efficiency with the alternative itself constitutes a measure of functional specialization.
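The trade-off between substrate affinity and maximal rate can be illustrated with the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]). The following is a sketch under simplifying assumptions: single-substrate kinetics with saturating cofactor, the round-number Km values quoted above (about 1 mM for acetaldehyde and about 10 mM for ethanol), and two hypothetical isozymes whose parameters are invented purely for illustration.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate for substrate concentration s (same units as km)."""
    if km <= 0 or s < 0:
        raise ValueError("km must be positive and s non-negative")
    return vmax * s / (km + s)


if __name__ == "__main__":
    # Fractional saturation (vmax = 1) at 1 mM substrate, using the typical
    # Km values quoted in the text: ~1 mM (acetaldehyde), ~10 mM (ethanol).
    print(mm_rate(1.0, 1.0, 1.0))
    print(mm_rate(1.0, 10.0, 1.0))

    # Hypothetical isozymes (invented parameters, arbitrary rate units):
    # A has the lower Km, B the higher Vmax.
    iso_a = {"vmax": 1.0, "km": 1.0}
    iso_b = {"vmax": 3.0, "km": 10.0}
    for s in (0.5, 3.5, 20.0):
        print(s, mm_rate(s=s, **iso_a), mm_rate(s=s, **iso_b))
```

With these invented parameters, the high-affinity form is faster at low substrate concentrations and the high-capacity form is faster at high concentrations, with the two rates equal at 3.5 mM. Real ADH kinetics involve two substrates (the alcohol or aldehyde and the nicotinamide cofactor), so this single-substrate form is only a first approximation.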

Although kinetic analyses of ADH enzymes predicted to act in the lipoxygenase pathway have often relied on acetaldehyde and/or ethanol substrates, ADH forms isolated from ripening fruit have reasonable Km values with C6 aldehyde substrates. The Km of tomato ADH isolated from ripe tomatoes (almost certainly ADH2; Longhurst et al., 1994) was 1 mM for a hexanal substrate (Bicsak et al., 1982); one of three enzymes from olive (Olea europaea) acted on hexanal and E-2-hexenal with especially low Km values of 0.04 and 0.012 mM, respectively (Salas and Sánchez, 1998).

Thus the kinetic data accumulated to date for ADH-P isozymes are consistent with both observed patterns of gene expression and demonstrated enzymatic functions in the hypoxic response, PDH bypass and/or lipoxygenase pathway.

In summary, then, what achievements can we claim for half a century’s work on plant alcohol dehydrogenase gene families and their products? We have very strong evidence regarding their origins and evolution. We know that ADH enzymes play multiple roles in anaerobic fermentation, aerobic fermentation, and the production of scents that discourage predation, attract pollinators and facilitate seed dispersal, and there are clues suggesting that there is more to discover. In selected taxa, we know a great deal about the differential contributions of ADH isozymes to each of the identified roles. With significant contributions from protein chemists, biochemists, population biologists and geneticists, we have obtained an array of information on the ADH gene families and the suite of proteins they produce that has allowed us to develop a level of understanding that probably exceeds what has been learned for any of the thousands of other identified plant gene families.

Acknowledgements

The invitation from the editors to prepare this review, critical reading of the manuscript by Michael McLean (School of Environmental Sciences, University of Guelph), and helpful comments from reviewers are all gratefully acknowledged.
