Evolution of glycosaminoglycans and their glycosyltransferases: Implications for the extracellular matrices of animals and the capsules of pathogenic bacteria


  • Paul L. DeAngelis

    Corresponding author
    1. Department of Biochemistry and Molecular Biology, Oklahoma Center for Medical Glycobiology, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma
    • Dept. of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, 940 Stanton L. Young Blvd., Oklahoma City, OK 73104
    • Fax: (405) 271-3092


Glycosaminoglycans (linear polysaccharides with a repeating disaccharide backbone containing an amino sugar) are essential components of extracellular matrices of animals. These complex molecules play important structural, adhesion, and signaling roles in mammals. Direct detection of glycosaminoglycans has been reported in a variety of organisms, but perhaps more definitive tests for the glycosyltransferase genes should be utilized to clarify the distribution of glycosaminoglycans in metazoans. Recently, glycosyltransferases that form the hyaluronan, heparin/heparan, or chondroitin backbone were identified at the molecular level. The three types of glycosyltransferases appear to have evolved independently based on sequence comparisons and other characteristics. All metazoans appear to possess heparin/heparan. Chondroitin is found in some worms, arthropods, and higher animals. Hyaluronan is found only in two of the three main branches of chordates. The presence of several types of glycosaminoglycans in the body allows multiple communication channels and adhesion systems to operate simultaneously. Certain pathogenic bacteria produce extracellular coatings, called capsules, which are composed of glycosaminoglycans that increase their virulence during infection. The capsule helps shield the microbe from the host defenses and/or modulates host physiology. The bacterial and animal polysaccharides are chemically identical or at least very similar. Therefore, no immune response is generated, in contrast to the vast majority of capsular polymers from other bacteria. In microbial systems, it appears that in most cases functional convergent evolution of glycosaminoglycan glycosyltransferases occurred, rather than direct horizontal gene transfer from their vertebrate hosts. Anat Rec 268:317–326, 2002. © 2002 Wiley-Liss, Inc.

Many organisms produce extracellular polysaccharides. Microbial polysaccharides are probably ancient adaptations because they are present in “living fossils” such as bacterial stromatolites (Kawaguchi and Decho, 2000). Usually a single homogeneous anionic polysaccharide is synthesized in bacteria and simple algae. In microbes, extracellular polysaccharide coatings, called capsules, can play many roles, including: adhering to surfaces or to other cells, preventing desiccation, shielding the microbe from noxious factors, sequestering compounds from the environment, and modulating their host's physiology. From the time bacteria first swept over Earth to the present, a polysaccharide probably helped these microorganisms cling to life.

On the other hand, many multicellular organisms produce extracellular matrices containing several polysaccharides with various properties and functions. It is interesting to speculate that the ancestral eukaryotic cells acquired their polysaccharide-synthesizing machinery from their bacterial endosymbionts. Over the eons, the initial enzymes were then sculpted into a bewildering medley of catalysts. Plants produce a wide variety of neutral and acidic extracellular polymers that are structural components for wood and leafy structures. Mammals and birds usually possess four chemically similar polymers called glycosaminoglycans (GAGs), but it appears that some invertebrates may contain only a subset of these polymers. Certain marine organisms also produce fucan polymers.

In vertebrates, the molecules of the GAG family (hyaluronan, also called hyaluronic acid (HA); heparin/heparan sulfate; chondroitin sulfate; and keratan sulfate) can play structural, adhesive, and/or signaling roles (Laurent and Fraser, 1992; Sugahara and Kitagawa, 2000; Esko and Lindahl, 2001; Toole, 2001). Because of the numerous capabilities of GAG polymers, understanding their history may yield more clues as to their functions in development, health, and disease. In this review we discuss our current understanding of, and speculations about, the evolution of the enzymes that produce the repeating disaccharide backbone of the GAG polysaccharides, called synthases or copolymerases.

Glycosaminoglycan Structures

Glycosaminoglycans are linear polysaccharides composed of repeating disaccharide units containing a derivative of an amino sugar (either glucosamine or galactosamine). This review focuses on the repeating disaccharide GAG backbone and its biosynthesis. Therefore, the GAG nomenclature of heparan, chondroitin, and keratan will be used to describe the various polymers without specification of any post-polymerization modifications.

Hyaluronan, chondroitin, and heparan contain a uronic acid as the other component of the disaccharide repeat, whereas keratan contains a galactose (Table 1). Vertebrates can contain all four types of GAGs, and the polysaccharide chain is often further modified after sugar polymerization. One or more modifications (including O-sulfation of certain hydroxyls, deacetylation, and subsequent N-sulfation or epimerization of glucuronic acid (GlcUA) to iduronic acid (IdoUA)) are found in most GAGs, with the exception of HA (Esko and Lindahl, 2001). A few clever microbes also produce GAG chains, but neither sulfation nor epimerization of these bacterial chains has yet been described. The chondroitin and heparan chains in vertebrates are initially synthesized by elongation of a xylose-containing linkage tetrasaccharide attached to a variety of proteins. Keratan is either O-linked or N-linked to certain proteins, depending on the particular glycoconjugate used. HA and all of the known bacterial GAGs are not part of glycoproteins.

Table 1. Glycosaminoglycan structures, occurrence, and glycosyltransferases*

GAG                       Disaccharide repeat         Organism                        GAG-transferase (family)
Hyaluronan                β3GlcNAc-β4GlcUA            Vertebrates                     Class I HAS1,2,3 (GT2-C)[a]
                                                      Streptococcus                   Class I spHAS (GT2-C)[b]
                                                      Chlorella virus                 Class I cvHAS (GT2-C)
                                                      Pasteurella                     Class II pmHAS (two GT2s)
Chondroitin               β3GalNAc-β4GlcUA[c] (SO4)   Vertebrates, Insecta            ChSy (GT31)
                          β3GalNAc-β4GlcUA            Pasteurella                     pmCS (two GT2s)
                          β3GalNAc-β4GlcUA(F)[d]      Escherichia                     KfoC (two GT2s)
Heparan sulfate/heparin   α4GlcNAc-β4GlcUA[c] (SO4)   Vertebrates, Insecta, Nematoda  EXT 1, 2 (GT47)
Heparosan                 α4GlcNAc-β4GlcUA            Pasteurella                     pmHS (GT2, GT45)
                                                      Escherichia                     KfiA (GT45) + KfiC (GT2)

* Four structurally related polymers are found in vertebrates as well as certain lower life forms and bacteria. Many of the synthases or copolymerases that produce the GAG disaccharide repeats are now known. The enzymes have been classified into various glycosyltransferase families (Campbell et al., 1997), suggesting separate origins and histories for these enzymes. The vertebrate GAGs, except for HA, are often further modified by sulfation (SO4) and/or epimerization after polymerization.
[a] GT2-C, Family 2 glycosyltransferase with a putative chitin synthase-like domain.
[b] The chitin-like region of Streptococcus HAS is not as similar to chitin synthases as the same region of eukaryotic enzymes.
[c] Some percentage of GlcUA converted to IdoUA by epimerase.
[d] In E. coli K4, the polymer has a fructose (F) attached to C3 of the GlcUA groups.

Enzymes That Produce Glycosaminoglycans

Glycosyltransferases are the enzymes that polymerize multiple sugars into chains, or add single sugar molecules onto existing molecules, including carbohydrates, proteins, and lipids. Glycosyltransferases are ancient enzymes, as evidenced by their presence in all forms of life. Indeed, the omnipresent nucleic acids, DNA and RNA, are highly modified phosphate-linked ribose polysaccharides, which qualifies their polymerases as specialized glycosyltransferases!

All known acidic GAG backbones (i.e., all but keratan) are synthesized using UDP (uridine diphospho)-sugar nucleotides at near neutral pH with magnesium or manganese cofactors according to the reaction:

\[
n\,\text{UDP-GlcUA} + n\,\text{UDP-HexNAc} \longrightarrow 2n\,\text{UDP} + [\text{GlcUA-HexNAc}]_n
\]

where HexNAc = N-acetylglucosamine (GlcNAc) or N-acetylgalactosamine (GalNAc).

Representatives of the glycosaminoglycan transferases that form the repeating disaccharide backbones of acidic GAGs in mammals were cloned in 1996–2001. However, no keratan synthase (from any source) has been identified. Even though all three acidic GAGs have a similar repeating disaccharide structure, the three types of mammalian enzymes (HA synthase isozymes (reviewed in Weigel et al., 1997; Spicer and McDonald, 1998), heparan copolymerases or synthases (Cheung et al., 2001; Duncan et al., 2001), and chondroitin synthase (Kitagawa et al., 2001)) have very different amino acid sequences. As these enzymes have only recently been discovered, their complete structure/function relationships are not known. A single gene, and thus a single protein, is thought to be required for catalytic glycosyltransferase activity in all cases. No three-dimensional (3D) structures are available, and these membrane-associated proteins promise to be difficult to study in their native state. Identification of the active sites and elucidation of the reaction mechanism are the focuses of current research.

Bacterial glycosyltransferases for all GAG polymers (except keratan) have been discovered in Escherichia, Streptococcus, and Pasteurella (reviewed in DeAngelis, 2002). In the case of the Pasteurella HA and chondroitin synthases, it was demonstrated that the enzyme is a single polypeptide containing two relatively independent active sites (Jing and DeAngelis, 2000). One site transfers GlcUA, and the other site transfers the hexosamine; single sugars are then added in a stepwise fashion to the nonreducing terminus to build the alternating GAG chain (DeAngelis, 1999a). The newly discovered E. coli chondroitin polymerase, KfoC, is homologous to the pmHAS and pmCS enzymes (∼60% identical) and appears to operate in a similar fashion (Ninomiya et al., 2002) (note that this enzyme was termed a polymerase rather than a synthase, because of its apparent requirement for an acceptor oligosaccharide). On the other hand, heparosan made in E. coli K5 is polymerized by a complex containing at least two polypeptides, KfiA and KfiC, each of which is responsible for a single sugar linkage (Hodson et al., 2000).
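The stepwise, alternating addition described for the Pasteurella synthases can be illustrated with a minimal Python sketch. It is a toy model only, not a description of the enzyme's kinetics: the function names `next_sugar` and `elongate` are hypothetical, the chain is represented as a list read from the nonreducing terminus (index 0) to the reducing end, and the assumption is that each of the two sites acts only when the terminal sugar is the other monosaccharide.

```python
# Toy model (illustrative assumption, not the published mechanism in detail):
# a two-site synthase adds one sugar at a time to the nonreducing terminus.

def next_sugar(terminal, hexosamine):
    """Return the sugar added next, given the current nonreducing-terminal sugar.

    The GlcUA-transfer site acts on a hexosamine terminus; the
    hexosamine-transfer site acts on a GlcUA terminus.
    """
    return hexosamine if terminal == "GlcUA" else "GlcUA"


def elongate(acceptor, n_steps, hexosamine="GlcNAc"):
    """Extend an acceptor oligosaccharide by n_steps single-sugar additions.

    acceptor   -- list of sugars, nonreducing terminus first
    hexosamine -- "GlcNAc" models an HA synthase; "GalNAc" models a
                  chondroitin synthase
    """
    if not acceptor:
        raise ValueError("this sketch requires a non-empty acceptor")
    chain = list(acceptor)
    for _ in range(n_steps):
        # New sugars are attached at the nonreducing terminus (index 0).
        chain.insert(0, next_sugar(chain[0], hexosamine))
    return chain


if __name__ == "__main__":
    # HA-like elongation of a disaccharide acceptor by three sugars.
    print("-".join(elongate(["GlcUA", "GlcNAc"], 3)))
    # Chondroitin-like elongation: only the hexosamine specificity differs.
    print("-".join(elongate(["GlcUA", "GalNAc"], 2, hexosamine="GalNAc")))
```

Switching the `hexosamine` argument is the only difference between the HA-like and chondroitin-like runs, which mirrors the idea that changing the specificity of one site changes the polymer produced.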

Classification of GAG Glycosyltransferases Based on Sequence Differences and Enzymatic Properties

The classification of glycosyltransferases has been based on amino acid sequence differences, postulated reaction mechanisms, and the structure of the carbohydrate product (Campbell et al., 1997). Using sequence alignments and hydrophobic cluster analyses, distinct types of putative catalytic modules are distinguishable. At this point, 56 individual glycosyltransferase (GT) families have been named (http://afmb.cnrs-mrs.fr/CAZY/index.html). Based on this “taxonomic” evidence, the various acidic GAG glycosyltransferases appear to be quite distinct (B. Henrissat, personal communication) (Table 1). The vertebrate, viral, and streptococcal HA synthases (class I synthase) appear to have a single GT2 module melded with a chitin synthase-like domain. On the other hand, the Pasteurella HA synthase (class II HA synthase) has two GT2 modules. The highly homologous Pasteurella chondroitin synthase, pmCS, also has two GT2 modules. Apparently, mutations in one module allow the hexosamine sugar transfer specificity to change in comparison to the HA synthase. The Escherichia chondroitin polymerase, KfoC, which is very similar to the pmHAS and pmCS enzymes, also possesses two GT2 modules. In contrast, the human chondroitin synthase appears to be a distinct class of enzyme designated GT31. The human heparan copolymerases or synthases (EXT proteins) belong to the GT47 family. However, the analogous Pasteurella enzyme contains both a GT2 and a GT45 module.

In mammals, the cellular location of HA biosynthesis differs from that of heparan and chondroitin. HA is produced directly at the plasma membrane, while the latter two GAGs are made in the Golgi apparatus and secreted. A further distinction is that free chains of HA are made, but heparan and chondroitin are attached to a polypeptide to form a glycoprotein or a proteoglycan.

Even though the heparan and chondroitin biosynthetic pathways share many similarities in mammals, the basic enzymology of their synthases differs. All UDP-sugar precursors are alpha-linked. Chondroitin is an entirely beta-linked polymer; therefore, inverting mechanisms are utilized during the sugar transfer. On the other hand, heparan contains alternating alpha- and beta-linkages, implying that both a retaining and an inverting mechanism are involved in synthesis. The production of the two types of anomeric glycosidic bonds probably requires distinct catalytic sites, based on more in-depth work performed with various glycosidases. Many of these small, robust, soluble, degradative proteins have been crystallized with substrate analogs and/or subjected to detailed chemical modification or mutagenesis studies (reviewed in Jedrzejas, 2000). Overall, the taxonomic classification and the differences in enzymatic characteristics suggest that the vertebrate proteins that catalyze the synthesis of the HA, heparan, and chondroitin repeats were constructed by distinct evolutionary paths.
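The inverting-versus-retaining argument above can be written out as a small Python sketch. It assumes only what the text states (all UDP-sugar donors are alpha-linked) and takes the linkages of each disaccharide repeat from Table 1; the names `REPEAT_LINKAGES`, `mechanism`, and `mechanisms_for` are hypothetical and exist only for this illustration.

```python
# Sketch of the reasoning in the text: the donor anomer is alpha, so a beta
# product linkage implies an inverting transfer and an alpha product linkage
# implies a retaining transfer.

DONOR_ANOMER = "alpha"  # all UDP-sugar precursors are alpha-linked

# Linkages of each repeating disaccharide as (anomer, position), from Table 1.
REPEAT_LINKAGES = {
    "hyaluronan": [("beta", 3), ("beta", 4)],    # β3GlcNAc-β4GlcUA
    "chondroitin": [("beta", 3), ("beta", 4)],   # β3GalNAc-β4GlcUA
    "heparan": [("alpha", 4), ("beta", 4)],      # α4GlcNAc-β4GlcUA
}


def mechanism(product_anomer):
    """Classify one sugar transfer by comparing donor and product anomers."""
    return "retaining" if product_anomer == DONOR_ANOMER else "inverting"


def mechanisms_for(gag):
    """Return the sorted set of transfer mechanisms needed to build a GAG repeat."""
    return sorted({mechanism(anomer) for anomer, _position in REPEAT_LINKAGES[gag]})


if __name__ == "__main__":
    for gag in REPEAT_LINKAGES:
        print(gag, mechanisms_for(gag))
```

Under these assumptions, chondroitin and hyaluronan need only inverting transfers, whereas heparan needs one of each, which is the basis for expecting two distinct catalytic sites in the heparan enzymes.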

GAG Analyses Past and Present

The glycosaminoglycans from animal extracts have been studied for decades, since the discoveries of chondroitin in the late 1800s, heparan in 1916, and HA in 1934. Polymers were typically purified by differential extraction and precipitation with various solvents or aliphatic quaternary ammonium compounds. Carbohydrate identification was based on relatively insensitive monosaccharide analyses (which detect only major components and miss labile compounds) or relatively nonspecific, cross-reacting histochemical methods. In particular, the typical cationic dyes and stains detect almost any highly negatively charged component, surface, or tissue. Later, electrophoresis in agarose gels or cellulose acetate membranes was used to analyze intact GAG components (reviewed in Beaty and Mello, 1987; Kodama et al., 1988), but preparations of these polydisperse molecules were often cross-contaminated. Fortunately, various glycosidases with rather narrow GAG substrate specificity have become commercially available (reviewed in Linhardt et al., 1986). These reagents readily discern between polymers that differ only in anomeric linkage or epimers. Current methodologies of high-resolution chromatography, mass spectrometry, fluorophore-assisted electrophoresis, and/or capillary electrophoresis of oligosaccharides generated by specific GAG degradative enzymes have allowed much progress to be made (Imanari et al., 1996; Calabro et al., 2000; Koketsu and Linhardt, 2000; Sturiale et al., 2001; Lamari et al., 2002). However, these analyses of high-molecular-weight polymers still involve fragmentation, separation, and virtual reconstruction to solve the “actual” structure. Nonselective degradation or incomplete digestion (e.g., due to the presence of inhibitors) will confuse the results. Furthermore, it is difficult to dissect the intertwined mixture of polysaccharides and proteins that form the vertebrate matrix one component at a time during histochemistry, without breaking cross-links or junctions. Nuclear magnetic resonance (NMR) spectroscopy of intact (and often viscous) polymers with high-field instruments usually yields broad spectral peaks that do not reveal all the potential complexity of GAGs or the presence of minor components. Overall, GAGs are difficult molecules to characterize in detail.

GAG Polymer Occurrence in Metazoans

Cnidarians were the first animals with muscles and nervous systems. The Hydra is considered to be a representative of early metazoan life. The Hydra was thought to contain heparan proteoglycan, as well as collagen, fibronectin, and laminin, based on experiments using antibody reagents (Sarras et al., 1991). The antibody detects the protein portion of the proteoglycan, so the presence of the actual GAG chain was not verified directly. The presence of HA or chondroitin was either not reported or not tested.

Rahemtulla and Lovtrup (1974a, b, 1975a, b, 1976) reported finding HA and sulfated GAGs (assumed to be heparin/heparan sulfate and/or chondroitin sulfates) in a wide spectrum of animals. Cellulose acetate membrane electrophoresis and monosaccharide analysis were used to identify the polysaccharide components from various organisms. In these reports, a parasitic flatworm (Platyhelminthes) had chondroitin, and a parasitic roundworm (Nematoda) had chondroitin and HA. A freshwater clam (Bivalvia) and the blow fly (Insecta) were reported to have HA, chondroitin, and heparin. The lobster (Crustacea) was reported to have all GAGs, including keratan. Some of the later reports in the series used testicular hyaluronidase (which can also degrade chondroitin) or chondroitinase ABC preparations to assess the nature of the GAG. However, the specific Streptomyces HA lyase (which was not readily available at the time) was not utilized as a control.

Recently, another group detected heparin or heparan sulfate in all marine invertebrates tested, including Cnidaria, Ctenophora, Echinodermata, and Crustacea. These studies utilized specific glycosidases, disaccharide analyses, and blood coagulation bioassays to verify the presence of this GAG (Medeiros et al., 2000). Using less rigorous methodology, chondroitin or chondroitin-like polymers were also reported in most, but not all, organisms.

Recent high-resolution GAG disaccharide analyses of two well-known model eukaryotic organisms with completed genomes, the worm Caenorhabditis elegans (Nematoda) and the fruit fly Drosophila melanogaster (Insecta), suggest that HA is not present (Toyoda et al., 2000). However, heparin/heparan sulfate is found in both organisms. Certain growth factors and signaling factors involved in pattern formation and development in the fly and the mouse both utilize heparan, suggesting that this function is ancient. The fruit fly contains chondroitin sulfate, while the worm contains only unsulfated chondroitin. The latter organism may not have evolved sulfotransferase modification enzymes, or, alternatively, may have lost the enzyme. Squid (Cephalopoda) are also known to contain chondroitin sulfate (Kawai et al., 1966). The arthropods and higher animals have more complicated body plans compared to the worms, and probably require more distinct signals for intercellular communication; therefore, it is logical that these organisms need more GAG species. Overall, it is quite likely that ubiquitous heparan was the original GAG in the metazoan lineage (Fig. 1).

Figure 1.

Hypothetical phylogenetic tree of glycosaminoglycan occurrence. Based on the occurrence of glycosyltransferases as ascertained by molecular biological techniques and modern GAG analyses, we can conclude that the ubiquitous heparin-like polymers were the ancestral GAG in metazoans. Independently evolved genes for the production of chondroitin, and then HA, arose. The first organism to produce chondroitin is not known. The vertebrate HA synthases may have arisen from chitin synthases, but as yet there is no direct evidence (denoted by “??”). Certain pathogenic bacteria have capitalized on the GAGs to infect their nutrient-rich hosts. In most cases, it appears that the microbes reengineered their polysaccharide-producing machinery to synthesize a GAG, but there is a possibility that Gram-positive Streptococcus acquired an HA synthase gene from their vertebrate host. Once a bacterium develops a useful system, the genes are often spread among other microbial species by various transfer mechanisms.

GAG Glycosyltransferase Occurrence in Metazoans

GAG polysaccharides are the products of the catalytic action of enzymes. As described earlier, GAGs can be somewhat difficult to characterize due to their microheterogeneity, polydispersity, and large size. The presence of glycosyltransferase genes offers an alternative method by which to extract their possible evolutionary history.

In the animal kingdom, hyaluronan synthases have only been firmly documented in two out of three branches of chordates (Spicer and McDonald, 1998; Spicer, personal communication). Using degenerate primers and polymerase chain reaction (PCR), a variety of organisms have tested negative for the class I HA synthases, including the sponge Microciona porifera (Porifera), the sea urchin Strongylocentrotus purpuratus (Echinodermata), and the tunicates Ciona intestinalis and Styela plicata (Urochordata) (Spicer, personal communication). Three to four HA synthase isozymes are present in organisms ranging from Amphioxus to human (as discussed in the next section). No animal has been found with a single HA synthase gene to suggest the earliest ancestor or founder.

After searching the genomic sequencing projects for the fly Drosophila (Insecta) and the worm C. elegans (Nematoda) with the BLAST algorithm, we found no candidate HA synthase genes. The lack of recognizable HA synthase genes in the fly genome was hypothesized to be evidence that the HA synthases are relatively recent innovations in the evolution of metazoans (Lee and Spicer, 2000). If the early reports of the presence of HA in some invertebrates are correct, it is possible that another, hitherto unknown glycosyltransferase can make HA polymer. Alternatively, the older methodology may have misidentified another extracted anionic polymer as HA.

The story of the glycosyltransferases involved in heparan and chondroitin biosynthesis is complicated and still unfolding. The identification of the mammalian EXT1 and EXT2 (exostosin) isozymes as heparan synthases or copolymerases, and the identification of several EXTL (exostosin-like) proteins, are recent discoveries (reviewed in Duncan et al., 2001). Genes for similar enzymes were found in both the fly and the worm, as expected from the polysaccharide analyses. In the last year, a mammalian enzyme involved in chondroitin biosynthesis was discovered (Kitagawa et al., 2001). At least one protein in the Drosophila genome has great similarity to the human chondroitin synthase sequence. In the near future, more accurate genomic sequence data from human and mouse should illuminate the relationships of these glycosyltransferase genes based on intron/exon structure.

Fortunately, the heparan and chondroitin synthases are not very similar at the DNA or protein level, which enables facile identification. As more invertebrate genomes are determined, it will be exciting to follow the genes encoding heparan or chondroitin biosynthesis through the tree of life.

Chondroitin and heparan molecules are further processed by sulfation and epimerization during biosynthesis in the Golgi; therefore, more auxiliary enzymes (and more genes) are involved. A number of distinct sulfotransferases that add sulfates to various positions are emerging. The heparan (but not the chondroitin) epimerase has been cloned. In the future, the evolutionary history of these modified GAGs should be more clearly understood due to the development of more markers to distinguish possible relationships. The presence of many modification systems also suggests that these sulfated GAGs are ancient components of metazoans. On the other hand, HA, a relative newcomer to the animal matrix, is not known to be similarly modified. One can guess that HA modification by similar systems will not occur in the near future. It is likely that the plasma membrane localization of HA synthases and direct secretion of the polymer into the matrix preclude HA from these particular modifications (unless the appropriate sulfotransferases, activated sulfate donor, and epimerases stray out of the Golgi apparatus).

Evolution of HA Synthase Isozymes in Chordates

The HAS isozyme radiation must have occurred early in chordate history, because organisms in the Cephalochordata and the Craniata (but not the Urochordata) subphyla possess three or four HAS genes. Three gene duplication events occurred, as assessed by exon/intron boundary and sequence comparisons of genomic DNA (Spicer and McDonald, 1998) (Fig. 2). The genes are scattered on different chromosomes, except for Amphioxus, which has two genes in rather close proximity. Amphioxus has four genes, but only one has been shown to be functional at this stage of study (Spicer, personal communication). The frog (Xenopus) has three functional HAS genes and one apparent pseudogene. Mammals (as represented by mouse and human) have only three recognizable isozymes; apparently, the inactive pseudogene was lost (Spicer and McDonald, 1998).

Figure 2.

Possible evolutionary path of the vertebrate HA synthase isozymes. Based on the intron/exon boundaries and sequence, the HA synthases appear to have undergone three duplication events (Spicer and McDonald, 1998). The founder HA synthase(s) and the first organism (probably an early chordate or a precursor) to produce HA polysaccharide are still unknown (denoted by “??”). At some later time (denoted by “?”), the ancestral HAS gene was duplicated to form the Has1-like lineage and another HAS gene, denoted here as HasX, which led to Has2 and Has3. Only one Amphioxus gene has been verified to be active thus far (A. Spicer, personal communication). One of the amphibian gene products (HA synthase related sequence (Has-rs)) appears to be inactive. This pseudogene was subsequently lost; thus, only three HA synthases are known in mammals.

Utility of Multiple HA Synthases in Mammals

Mammalian cells respond to the components of their extracellular matrix by altering proliferation, migration, or metabolism. The size and the amount of HA polymer are known to alter cellular behavior and proliferation (Laurent and Fraser, 1992; Toole, 2001). For example, high-molecular-weight HA inhibits blood vessel formation, but smaller forms stimulate angiogenesis. HA levels vary in tissues during development and wound healing. A potential mechanism to control HA size and quantity is to vary biosynthesis at the synthase level; differential expression of several synthases with distinct intrinsic enzyme activities could be an important form of regulation. In vitro comparisons of the three mammalian HA synthase isozymes show that the relative specific activities and the size distribution of polymer products are distinct (Itano et al., 1999). Gene knockout studies in mice have shown that loss of the Has2 gene is lethal in utero (Camenisch et al., 2000). Both the Has1 and Has3 null mice are viable and fertile, and have no gross behavioral or morphological phenotypes (McDonald, Spicer, Camenisch, Itano, and Kimata, personal communication).

Potential Origin of the Ancestral Vertebrate HA Synthase Gene

The GT2 family of glycosyltransferases contains many distinct enzymes, including those that make HA, cellulose, and chitin. If this taxonomic classification is a true group, the sheer abundance of distinct enzymes suggests that this particular family is old. UDP-glucose and UDP-GlcNAc were probably among the first sugar nucleotide precursors found in the universal common ancestor. Cellulose, a β4glucose polymer, is a key component of the cell wall of plants and is made by certain bacteria. Chitin, a β4GlcNAc polymer, is very widespread. This polymer is found in the cell walls of many fungi, in parts of mollusks, and in the exoskeletons of arthropods. Multiple chitin synthase isozymes are known in fungi and insects. Thus, chitin synthase was probably one of the original polysaccharide glycosyltransferases in the common ancestor of the metazoan lineage.

Lee and Spicer (2000) speculated that a β3 transferase activity was somehow added to the preexisting β4 transferase activity of a chitin or cellulose synthase to create a HA synthase. It is probably more likely that a chitin synthase isozyme in an organism closely preceding the chordates was mutated to produce a HA synthase; a cellulose synthase was not the immediate ancestral gene (Fig. 1). Several unrelated lines of circumstantial evidence support this hypothesis. No animal is known to produce cellulose or to contain a cellulose synthase gene. Many invertebrates contain chitin polysaccharide, but thus far (with the reported exception of the blenny fish (Misof and Wagner, 1992)), chitin polysaccharide appears to be absent in vertebrates. In contrast, no invertebrates appear to contain HA, while fish and higher animals all possess HA in abundance. With respect to amino acid sequence analysis, the vertebrate HA synthases contain a potential chitin synthase-like region fused to the GT2 module. Both HA and chitin synthases are integral membrane proteins with multiple predicted transmembrane segments. Some groups have reported evidence that HA synthases will synthesize small chitin oligosaccharides under certain circumstances in vitro (Semino and Robbins, 1995; Yoshida et al., 2000). Evidence for small chitin oligosaccharides playing roles in the development of fishes has been presented (Bakkers et al., 1997). Perhaps an examination of other living metazoans preceding the chordate branch will identify a more chitin synthase-like HA synthase, or other intermediates.

GAG Capsules of Bacterial Pathogens

The bacteria with GAG capsules from the genera Pasteurella, Escherichia, and Streptococcus are virulent human and/or animal pathogens. Depending on the host and the strain, these microbes can cause death within a few days after infection. It is very advantageous for the microbe to use GAGs, as evidenced by the much lower virulence of isogenic acapsular mutants in comparison to the wild-type parental strains.

The exact functions of the GAG capsules are still emerging, but the capsules appear to serve as camouflage, protective shielding, and an adhesive device, as well as a means of manipulating host cell behavior (Roberts, 1996; DeAngelis, 2002). The endogenous vertebrate GAGs in the matrix alter cellular behavior, and it is logical to assume that GAG-coated microbes will seize any advantage to aid infection. It was recently reported that group A Streptococcus take advantage of a human HA-signaling pathway to assist bacterial invasion during infection (Cywes and Wessels, 2001).

An added bonus for microbes is that GAGs are relatively nonimmunogenic polymers. Typically, an antibody response is generated against foreign capsular polysaccharides (i.e., those that are not similar to the host molecules). This prevents a subsequent infection by invading microbes of the same capsular type. However, as a result of molecular mimicry, GAG capsules permit multiple successive infections.

Evolution of GAG Synthases of Bacterial Pathogens

Examination of the known bacterial GAG glycosyltransferases suggests that in most cases these genes have arisen separately from the vertebrate genes. It is highly likely that the selective pressure of the vertebrate host defenses attempting to repel microbes has forged bacterial GAG glycosyltransferases.

Gram-negative Carter type A P. multocida possess a HA synthase, pmHAS, that is very distinct at the protein level from the vertebrate and streptococcal HASs (DeAngelis et al., 1998). Furthermore, their intrinsic enzymatic activities also differ: only pmHAS will elongate exogenous HA acceptor oligosaccharides. Based on these differences, the pmHAS enzyme has been designated as a separate group (class II) from all the other known HASs (class I) (DeAngelis, 1999b). The benefit of producing a HA capsule was so great that Pasteurella redesigned an existing glycosyltransferase (or perhaps a pair of glycosyltransferases) to make HA. The plasticity of this type of polypeptide is revealed when one considers the case of type F P. multocida (DeAngelis and Padgett-McCue, 2000). This closely related microbe makes an unsulfated chondroitin capsule utilizing a synthase, pmCS, that is 90% identical to the HA synthase. Mutation of the hexosamine transfer module results in an enzyme that catalyzes the formation of a new polymer. This bacterial chondroitin synthase is not similar at the protein level to the human analog.

The utility of GAGs for enhancing bacterial virulence is further demonstrated by type D P. multocida. These bacteria make an unsulfated heparan (N-acetylheparosan or heparosan) using a distinct enzyme, pmHS, that is not similar to the other Pasteurella synthases or to the animal heparan enzymes (DeAngelis and White, 2001). It is extremely likely that the Gram-negative E. coli K5 dual glycosyltransferase complex of KfiA and KfiC is related to pmHS (Hodson et al., 2000). It is not clear yet whether a fusion or a scission event created a dual-action synthase or two single-action transferases. Again, the sequencing of more genomes should provide additional clues.

It is clear that the origin of Pasteurella HA synthase is quite distinct from the vertebrate HA synthases, but the history of the enzyme counterparts from the Gram-positive Streptococcus groups A and C is both unclear and intriguing. Streptococcus groups A and C produce HA using a synthase with a protein sequence similar to that of the vertebrate enzymes (Fig. 3). This immediately raises the question: was the bacterial HA synthase the result of convergent functional evolution or horizontal gene transfer? The issue has been pondered before (Weigel et al., 1997; Spicer and McDonald, 1998; DeAngelis, 1999a), but unfortunately, there is no unequivocal evidence at this time.

Figure 3.

Schematic organization and alignment of class I HA synthase proteins. A: The streptococcal, vertebrate, and viral HASs are integral membrane proteins that have a similar predicted overall organization (the streptococcal enzyme is shown here because topological organization data have been obtained; Heldermon et al., 2001). A central conserved region of the enzymes is predicted to contain a catalytic domain that resides in the cytosol (Cyto); this region is used in the alignment in panel B. A glycosyltransferase family 2 domain (GT2) and a chitin synthase-like (CHS-like) domain are found overlapping in this region. This putative catalytic domain is flanked on both sides by multiple transmembrane (M) and/or membrane-associated (m) segments, but it also appears to contain an internal membrane-associated region. In comparison to the bacterial enzyme, the longer eukaryotic enzymes possess two additional predicted membrane-bound segments at the carboxyl terminus. B: Certain short amino acid sequence motifs are found in all class I HA synthases. In this Multalin alignment (Corpet, 1988) of the central region (residues ∼200 to ∼400 using the vertebrate enzyme numbering system) of class I HA synthases, the red and green letters indicate 90% or 50%, respectively, identical residues among the proteins (consensus symbols: %, any one of F, Y, W; $, any one of L, M; !, any one of I, V; #, any one of E, D, Q, N). Several residues of the motifs are invariant, but certain other positions differ according to the source of the gene: vertebrates (h, human; m, mouse), Streptococcus (sp, group A; se, su, group C), and Chlorella virus (cv). At this time, it is not known whether functional convergent evolution via distinct historical routes operated, or a horizontal gene transfer event occurred followed by optimization of the usurped enzyme for a particular organism's physiology.

The HA synthases from closely related streptococcal bacteria, the human pathogen group A (S. pyogenes) and the predominantly animal pathogen group C (S. equisimilis and S. uberis), are ∼70% identical to each other at the protein level (DeAngelis et al., 1993; Kumari and Weigel, 1997; Ward et al., 2001). The organization of the capsule operon is not identical among the three isolates, but in all cases the HA synthase gene is followed by a UDP-glucose dehydrogenase gene, which encodes an enzyme that forms the UDP-GlcUA substrate required for HA biosynthesis. The most likely possibility is that a common ancestor of the modern group A and group C streptococci had a single HA synthase gene that subsequently diverged over time.

If an ancestral Streptococcus bacterium did usurp a vertebrate HA synthase coding region, then extensive modification of the synthase to optimize its function in the bacterial cell may have caused the modern streptococcal genes to be only ∼30% identical to the vertebrate genes. Also, there is no counterpart for a large portion of the carboxyl termini of the vertebrate HA synthase in the microbial enzymes (however, a simple deletion event could cause this truncation). Of course, these modifications severely hamper our ability to extract the historical evolutionary relationships. Several short amino acid sequence motifs are common among all class I enzymes. The alignment of the central portions of HA synthases from diverse sources shows the similarities among the enzymes (Fig. 3). The codons for the absolutely identical residues in these motifs are quite similar (e.g., there is one difference per codon, usually at the wobble site) among the streptococcal, amphibian, and human HA synthase genes. This similarity is the strongest evidence supporting the theory that streptococcal bacteria acquired the HA synthase from their vertebrate hosts.
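The codon-level comparison described above can be illustrated with a small tally of where two in-frame coding sequences differ within each codon. This is only a sketch under stated assumptions, not the analysis used in the cited work: the function names are hypothetical, the two sequences are assumed to be already aligned codon for codon without gaps, and the 12-nucleotide toy sequences are invented (both encode Asp-Gly-Glu-Leu in the standard genetic code) rather than drawn from any HA synthase gene.

```python
def codon_differences(cds_a: str, cds_b: str) -> list[list[int]]:
    """For each codon pair, list the 0-based positions (0, 1, 2) that differ.

    Assumes both coding sequences are in frame, gap-free, and of equal length.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be equal-length multiples of three")

    diffs = []
    for i in range(0, len(cds_a), 3):
        a = cds_a[i:i + 3].upper()
        b = cds_b[i:i + 3].upper()
        diffs.append([pos for pos in range(3) if a[pos] != b[pos]])
    return diffs


def summarize(diffs: list[list[int]]) -> dict[str, int]:
    """Count identical codons, codons differing only at the third (wobble) site, and the rest."""
    identical = sum(1 for d in diffs if not d)
    wobble_only = sum(1 for d in diffs if d == [2])
    return {
        "codons": len(diffs),
        "identical": identical,
        "wobble_only": wobble_only,
        "other": len(diffs) - identical - wobble_only,
    }


# Invented sequences; both translate to Asp-Gly-Glu-Leu.
toy_a = "GATGGTGAACTG"
toy_b = "GACGGAGAATTA"
print(summarize(codon_differences(toy_a, toy_b)))
# {'codons': 4, 'identical': 1, 'wobble_only': 2, 'other': 1}
```

A high proportion of identical or wobble-only codons at conserved residues is the kind of pattern the argument relies on, although such a tally alone cannot distinguish shared ancestry from other causes of codon similarity.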

However, there are also arguments against horizontal gene transfer from mammals to bacteria. The biosynthetic operons of Streptococcus groups A and C are similar but not identical to those of other Streptococcus species and Gram-positive bacteria that produce other extracellular polysaccharides. Certain distinct bacterial glycosyltransferases are very similar to HA synthase. For example, virulent S. pneumoniae type 3 produces a β3GlcUA-β4 glucose polymer with an enzyme (Cps3S) that is ∼26% identical to the streptococcal HA synthase. These enzymes are both dual-action glycosyltransferases of about the same size, and are probably related. Would a hypothetical usurped HA synthase mutate over time to form this other enzyme, which produces an immunogenic (thus “flawed”) capsular polymer? Or is drift in the reverse direction to design a stealthier capsular polymer more probable? Furthermore, Streptococcus groups A and C are poorly transformable. The acquisition of a mammalian gene would be difficult without a better gene-swapping mechanism, such as conjugation, transposition, or phage transduction. The latter three mechanisms normally operate only between bacteria, but perhaps the common streptococcal ancestor was more transformation-competent.

Overall, it may be more probable that a glycosyltransferase circulating in the microbial world was modified by functional convergent evolution into a streptococcal HA synthase with similarity to vertebrate enzymes. Without 3D structures or details of the reaction mechanism, the extent of the tinkering required to produce an HA synthase from another glycosyltransferase is not known.

Unexpected HA Synthase in an Algal Virus

A certain Phycodnavirus (the Chlorella virus), which infects freshwater green algae, also possesses an HA synthase gene (DeAngelis et al., 1997). The cvHAS protein sequence is slightly more similar to the vertebrate enzyme than to the streptococcal enzyme. Early in viral infection, the algae are coated with HA fibers. This is the only instance of HA occurring in the plant kingdom, and the first report of a carbohydrate-producing enzyme in a virus. The Chlorella virus is found worldwide and is particularly abundant at certain times of the year (10⁴ viruses per milliliter of lake water). Viral isolates from around the world possess very similar HAS genes. This omnipresence suggests that the gene has been present since the distant past (perhaps when Pangea was intact, unless these seawater-hating, desiccation-sensitive virions hitched rides around the globe in the wet feathers of migrating waterfowl!). The role of HA production during infection is unknown, but speculation abounds. The HA coat may prevent secondary infection, increase stability of the infected cell to increase viral burst size, and/or facilitate interaction with a secondary host. Viruses are well known for their ability to acquire DNA from other sources. The Chlorella virus is a master, as evidenced by its large genome replete with a variety of genes, including multiple UDP-sugar pathway enzymes and a putative chitin synthase (this latter enzyme, however, is not very similar to the HA synthase). It is likely that nucleic acid encoding an HA synthase from a vertebrate was scavenged and then employed by the virus, but more evidence is required on this point.

Conclusions

GAGs and their glycosyltransferases have been helping to shape organisms at the cellular and organismic level since the birth of the metazoan lineage. The evolutionary relationships are far from clear, but because of its presence in virtually all animals, heparan appears to be the primal GAG. Chondroitin, and later HA, arose to extend the repertoire of the extracellular matrix of higher animals. GAG polymers interact in a selective fashion with a wide variety of proteins and factors (reviewed in Day and Prestwich, 2002; Capila and Linhardt, 2002). These binding events yield intercellular adhesion and/or communication systems during development, growth, and adult life.

Many pathogenic microbes key on host matrix molecules to enhance their infectivity. Certain bacteria have gone a step further by actually synthesizing molecules identical or similar to their host matrix. It is possible that cases of both functional convergence and horizontal gene transfer have occurred, but the former mechanism appears to be responsible for the majority of bacterial GAG synthases. With the advent of high-resolution sugar analyses and genomic mining of a multitude of organisms, the evolutionary history of GAGs may be extractable if these molecules are not intertwined too deeply in our past.

Acknowledgments

I thank Dr. Bernard Henrissat for kindly providing details on the classification of glycosyltransferases over the years; Drs. John McDonald and Koji Kimata for sharing preliminary data on the Has1 and Has3 null mice; Dr. Andrew Spicer for generously providing preliminary data on the HA synthase genes of Amphioxus and screens of other invertebrates; and Dr. James Van Etten for engaging in fun discussions about the Chlorella virus.