A CELLULOSE SYNTHASE (CESA) GENE FROM THE RED ALGA PORPHYRA YEZOENSIS (RHODOPHYTA)

Received 8 February 2008. Accepted 23 September 2008.

Abstract

The cell walls of Porphyra species, like those of land plants, contain cellulose microfibrils that are synthesized by clusters of cellulose synthase enzymes (“terminal complexes”), which move in the plasma membrane. However, the morphologies of the Porphyra terminal complexes and the cellulose microfibrils they produce differ from those of land plants. To characterize the genetic basis for these differences, we have identified, cloned, and sequenced a cellulose synthase (CESA) gene from Porphyra yezoensis Ueda strain TU-1. A partial cDNA sequence was identified in the P. yezoensis expressed sequence tag (EST) index using a land plant CESA sequence as a query. High-efficiency thermal asymmetric interlaced PCR was used to amplify sequences upstream of the cDNA sequence from P. yezoensis genomic DNA. Using the resulting genomic sequences as queries, we identified additional EST sequences and a full-length cDNA clone, which we named PyCESA1. The conceptual translation of PyCESA1 includes the four catalytic domains and the N- and C-terminal transmembrane domains that characterize CESA proteins. Genomic PCR demonstrated that PyCESA1 contains no introns. Southern blot analysis indicated that P. yezoensis has at least three genomic sequences with high similarity to the cloned gene; two of these are pseudogenes based on analysis of amplified genomic sequences. The P. yezoensis CESA peptide sequence is most similar to cellulose synthase sequences from the oomycete Phytophthora infestans and from cyanobacteria. Comparing the CESA genes of P. yezoensis and land plants may facilitate identification of sequences that control terminal complex and cellulose microfibril morphology.

Abbreviations: Ccsa, cyanobacterial cellulose synthase; CESA, cellulose synthase catalytic subunit; CR-P, conserved region plant; CSR, class-specific region; EST, expressed sequence tag; HE-TAIL, high-efficiency thermal asymmetric interlaced; UTR, untranslated region

The cell walls of the sporophytes (“conchocelis phase”) of red algae from the genus Porphyra contain cellulose microfibrils, paracrystalline arrays of β-1,4-glucan chains (Mukai et al. 1981). Cellulose microfibrils are also a major component of land plant cell walls. EM has shown that the cellulose microfibrils of Porphyra sp. are ribbon-like with a thickness of 1–1.5 nm and a width that varies from 5 to 70 nm (Tsekos et al. 1999). In contrast, the cellulose microfibrils of land plants have cross-sectional dimensions of about 3.5 × 3.5 nm (Delmer 1999). Cellulose microfibril structure correlates with the morphology of the integral plasma membrane “terminal complexes” (Tsekos 1999), clusters of particles that contain cellulose synthase enzymes (Kimura et al. 1999). Freeze-fracture EM of P. yezoensis has shown that cellulose microfibrils are synthesized in association with distinctive linear terminal complexes (Tsekos and Reiss 1994). In contrast, streptophytes (land plants and charophycean green algae) have rosette terminal complexes consisting of six hexagonally arranged membrane particles. Many observations suggest that the number and arrangement of particles within terminal complexes ultimately determine the structure of the microfibrils they produce (Tsekos 1999). In turn, the arrangement of particles in a terminal complex may be determined, at least in part, by the cellulose synthases themselves (Doblin et al. 2002, Roberts and Roberts 2007). Thus, comparing the cellulose synthase (CESA) genes from red algae to those already characterized in other organisms (Nobles and Brown 2004) may reveal the mechanisms that determine terminal complex assembly.

Common features conserved in all cellulose synthases include three aspartate residues and a QXXRW motif embedded within conserved regions referred to as U1, U2, U3, and U4. These are flanked by two N-terminal and six C-terminal transmembrane helices, consistent with the plasma membrane localization of CESA proteins (Delmer 1999). The CESA proteins of streptophytes also include an N-terminal zinc-binding domain, an insertion (designated the CR-P, for “conserved region-plant”) between the U1 and U2 domains (Pear et al. 1996), and a variable “class-specific region” (CSR) between U2 and U3 (Vergara and Carpita 2001). Insertions between the U1 and U2 domains also occur in CESA proteins from Dictyostelium discoideum (Blanton et al. 2000) and certain cyanobacteria (Nobles et al. 2001), although it is unclear whether these insertions are homologous to the CR-P. To date, the zinc-binding domain and the CSR have been identified only in streptophyte CESA proteins.

While hypotheses on the evolution and phylogeny of cellulose synthases were originally made based on terminal complex ultrastructure (e.g., Brown 1990), the availability of CESA gene sequences provides new data to test these proposals. On the basis of cladograms of cellulose synthase sequences from a wide variety of organisms, Nobles and Brown (2004) suggested a cyanobacterial ancestry for plant cellulose synthase genes. Some cyanobacterial species, such as Nostoc punctiforme, contain two types of cellulose synthase genes designated Ccsa1 and Ccsa2 (Nobles and Brown 2004). In one proposed scenario, both genes could have been introduced into plant ancestors during the endosymbiotic acquisition of chloroplasts, but their integration into the nuclear genome was delayed. As major photosynthetic groups diverged, only one of the two genes was ultimately integrated in the nuclear genome. Nobles and Brown (2004) proposed that in streptophytes, the Ccsa1 gene led to rosettes, while in chlorophyte green algae, the Ccsa2 led to linear terminal complexes. The red alga P. yezoensis, with its linear terminal complexes, provides an attractive organism to test these hypotheses.

In this study, we present the complete sequence of a CESA gene from P. yezoensis, PyCESA1. The structure of the predicted protein is compared to those known from other cellulose-producing organisms, and the results are used to infer aspects of the phylogeny of this widespread gene.

Materials and methods

Culture methods.  The sporophyte (conchocelis) phase of P. yezoensis, strain TU-1, was obtained from Yukihiro Kitade (Hokkaido University) and cultured in filter-sterilized natural seawater supplemented with 1% (v/v) ESS2 stock solution (Kitade et al. 1996). Cultures were maintained at 25 ± 3°C under constant illumination at 15 μmol photons · m⁻² · s⁻¹ with constant aeration and transferred weekly (Asamizu et al. 2003).

Porphyra yezoensis EST index searches. The CESA1 peptide sequence from Physcomitrella patens (AAT48368) was used as a query in a tBLASTn search (Altschul et al. 1990) of the P. yezoensis EST index (http://est.kazusa.or.jp/en/plant/porphyra/EST/). Subsequently, a BLASTn search of the index was conducted using a contig assembled from PCR-amplified P. yezoensis gene fragments and cDNA clone PM006c05 as a query.

Amplification and cloning of genomic sequences.  Genomic DNA was isolated from cultured P. yezoensis cells using a rapid cetyltrimethylammonium bromide extraction method (Lukowitz et al. 1996). The high-efficiency thermal asymmetric interlaced (HE-TAIL) PCR procedure (Michiels et al. 2003) was carried out according to the published protocol except that about 80 ng of P. yezoensis genomic DNA was used as the template for the first round of amplification and Taq DNA polymerase (Eppendorf, Westbury, NY, USA) was used for the third, as well as the first two, PCR rounds. In HE-TAIL PCR, a set of nested, high Tm, gene-specific primers is paired with four different low Tm, degenerate primers through three nested amplifications. In the tertiary reaction, two different gene-specific primers are used to amplify the template produced in the secondary reaction, allowing expected differences in product length to be used as a test for specificity. Nested gene-specific reverse primers were designed using the following criteria (Michiels et al. 2003): 26 or more nucleotides in length, 40%–50% GC content, Tm 70°C or greater, and absence of internal secondary structure. Primers based on the sequence from P. yezoensis cDNA clone PM006c05 were GS1 = GTTCACCTCGTAAAAGGCGGAGAAGAAG, GS2 = AAATGGCGCAGAAGACATATCCGAAGG, GS3 = AGCCAAAACATGGTCGACACAAACGTG, and GS4 = TATCCTCCGTCACAGAGCCGTACACAAA. Primers based on the initial HE-TAIL PCR product were GS5 = TGGTGCCCACAAAGTCGGTCATGTCAA, GS6 = CACAAAGCCGATGTCCCGGTTGATCT, GS7 = ATGTCCGCGTCGAGGAAGAGCACAATCT, and GS8 = TTGTTGATGTTCCCCGCCTTGTTGTGC. Degenerate primers R1–R4 were as described (Michiels et al. 2003). A genomic sequence was then amplified by conventional PCR using a reverse primer based on the 3′ untranslated region (UTR) of cDNA clone PM006c05 (GS9 = CAGTGACGACCTGACTAGCAA) and a forward primer based on HE-TAIL PCR product TAIL5 (GS10 = ATGCTGGTCTTTCGGTTCTTC). The reaction used Taq DNA polymerase, 0.75 M betaine, and an annealing temperature of 56°C.
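The length and GC-content criteria for the gene-specific primers can be checked with a few lines of Python. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the Tm and secondary-structure criteria are deliberately left out because they depend on a melting-temperature model that the text does not specify.

```python
def gc_fraction(primer: str) -> float:
    """Return the fraction of G or C bases in a primer sequence."""
    seq = primer.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def meets_length_and_gc(primer: str, min_len: int = 26,
                        gc_min: float = 0.40, gc_max: float = 0.50) -> bool:
    """Check only the length and GC criteria quoted in the text.

    Tm and secondary structure are NOT evaluated here (assumption: they
    would be checked separately with a dedicated tool).
    """
    return len(primer) >= min_len and gc_min <= gc_fraction(primer) <= gc_max


# GS1 as listed in the text
GS1 = "GTTCACCTCGTAAAAGGCGGAGAAGAAG"
print(len(GS1), gc_fraction(GS1), meets_length_and_gc(GS1))
```

For GS1 this reports a length of 28 nucleotides and a GC fraction of 0.5, which is within the stated 40%–50% window.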

Amplification products were gel purified and cloned into pCR-TOPO 2.1 according to the manufacturer’s instructions (Invitrogen Corp., Carlsbad, CA, USA). Plasmids were sequenced using an Applied Biosystems 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA).

Southern blotting.  Samples of P. yezoensis genomic DNA (6 μg) were digested with 30 units of BamHI, PstI, XhoI, SalI, or SmaI for 4 h. The digests were separated at 9 V in a 6 × 10 cm, 0.7% agarose gel for 18 or 24 h. DNA was transferred to charged nylon membranes using the upward capillary method (Sambrook and Russell 2001). The probe was amplified from cDNA clone PM006c05 using forward primer PF = GGCTCTGTGACGGAGGATAC and reverse primer PR = ACAGGAAGACGACACCAAGG. Blots were hybridized overnight at 58°C with 200 ng of probe, developed using the AlkPhos direct labeling and detection system according to the manufacturer’s instructions (GE Healthcare, Buckinghamshire, UK) and imaged using a Typhoon 9410 variable mode imager (GE Healthcare). The brightness and contrast of the resulting images were enhanced using only global filters.

Phylogenetic analysis.  Sequences were aligned using CLUSTAL-X (Thompson et al. 1997) with the Gonnet protein weight matrix, pair-wise gap opening/extension penalties of 10/0.1 and multiple alignment gap opening/extension penalties of 10/0.2. To facilitate alignment, the highly divergent N-terminus (upstream of the second predicted transmembrane helix) and C-terminus (downstream of the U4 domain) were deleted from each sequence. The alignment was edited using BioEdit (Hall 1999) to remove gaps and segments of uncertain homology (Baldauf 2003). A phylogram was constructed from the aligned sequences using the heuristic search method in PAUP* (version 4.0b10, Sinauer Associates, Sunderland, MA) with all characters given equal weight. The topology was tested with 1,000 bootstrap replicates using the parsimony method. Using the same alignment, a maximum-likelihood (ML) phylogram was constructed with TREE-PUZZLE version 5.2 (Schmidt et al. 2002), and a Bayesian phylogram was constructed with MrBayes version 3.1 (Huelsenbeck and Ronquist 2001, Ronquist and Huelsenbeck 2003), using the Whelan and Goldman amino acid substitution model (Whelan and Goldman 2001) in both cases.
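The gap-removal step can be illustrated with a short Python sketch that drops every alignment column containing a gap character. This is an assumption-laden illustration only: the editing described above was done in BioEdit and also involved judgment about segments of uncertain homology, which is not automated here. The function name and the toy sequences are hypothetical.

```python
def drop_gapped_columns(alignment: dict[str, str], gap: str = "-") -> dict[str, str]:
    """Remove every column in which at least one sequence has a gap."""
    rows = list(alignment.values())
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Indices of columns that are gap-free in every sequence
    keep = [i for i in range(len(rows[0])) if all(row[i] != gap for row in rows)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, not from the paper)
toy = {"seqA": "MK-LV", "seqB": "MKALV", "seqC": "M-ALI"}
print(drop_gapped_columns(toy))
```

In the toy alignment, columns 2 and 3 each contain a gap, so only the first, fourth, and fifth columns are retained.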

Results

A tBLASTn search using a CESA sequence from the moss Physcomitrella patens as a query against the P. yezoensis EST index identified a single EST sequence (AV431832) using an E-value cutoff of 0.1. This sequence included the highly conserved CESA U3 and U4 domains. The conceptual translation of the corresponding cDNA clone (PM006c05) included the expected 3′ transmembrane domain in addition to the U3 and U4 domains. However, the 5′ region of the gene was absent from the cDNA clone.

The 5′ region of the gene was amplified from P. yezoensis genomic DNA in two rounds of HE-TAIL PCR (Michiels et al. 2003). Results of the tertiary amplification of sequences flanking the 5′ region of the PM006c05 sequence (Fig. 1a) show the expected 287 bp differences between the products amplified with GS3/R1 versus GS4/R1 and GS3/R2 versus GS4/R2. When cloned and sequenced, the larger band in lane 5 yielded two different products (TAIL1, EU279854; TAIL2, EU279855), and the single band in lane 6 yielded one product (TAIL3, EU279856; Fig. 2). Results of the tertiary amplification of sequences flanking the 5′ region of TAIL1 (Fig. 1b) show the expected 56 bp difference between the products amplified with GS7/R1 versus GS8/R1 and GS7/R2 versus GS8/R2. The large band in lane 1 (TAIL5, EU279858) and the band in lane 6 (TAIL4, EU279857) each yielded a single product, and the second largest band from lane 1 yielded two products (TAIL6, EU279859; TAIL7, EU279860; Fig. 2). Sequence comparison showed that TAIL1, TAIL2, and TAIL6 are >98% identical in their regions of overlap (Fig. 2). TAIL4 and TAIL5 were >98% identical to each other but differed from TAIL1, TAIL2, and TAIL6. TAIL3 diverged from all other sequences. The last 258 bp of TAIL7 were >98% identical to TAIL1, TAIL2, and TAIL6. However, the first 330 bp were highly divergent from all other sequences. With two exceptions, HE-TAIL PCR products started with the expected degenerate primer and ended with the expected gene-specific primer. TAIL6 ended with GS6 carried over from the secondary reaction instead of GS7 added to the tertiary reaction. TAIL1 started and ended with GS4. GenBank BLASTx searches revealed that all products have high similarity to CESA genes. Sequences translated as open-reading frames with the exception of TAIL3, TAIL4, and TAIL5, which contained frame shifts and internal stop codons, and TAIL7, which showed similarity to sensor histidine kinases within the first 330 bp.

Figure 1.

 DNA fragments amplified from Porphyra yezoensis genomic DNA in the tertiary reaction of high-efficiency thermal asymmetric interlaced (HE-TAIL) PCR. (a) DNA fragments amplified by nested gene-specific primers (GS3 and GS4) based on the sequence of cDNA clone PM006c05 paired with random primers R1–R4. The size differences between fragments in lane 1 versus lane 5 and lane 2 versus lane 6 are consistent with the expected value of 287 bp. Fragments in lanes 5 and 6 were cloned and sequenced. (b) DNA fragments amplified by nested gene-specific primers (GS7 and GS8), based on the sequence of the larger fragment isolated from lane 5 of panel (a), paired with random primers R1–R4. The size differences between fragments in lane 1 versus lane 5 and lane 2 versus lane 6 are consistent with the expected value of 56 bp. Fragments in lanes 1 and 6 were cloned and sequenced.

Figure 2.

 Summary of the relationship of expressed sequence tag (EST) (black arrows with GenBank accession numbers), cDNA (black bar) and conventional (gray bar), and high-efficiency thermal asymmetric interlaced (HE-TAIL) PCR-amplified (gray arrows) sequences to the complete cDNA sequence of PyCESA1 (clone PL006a11). The open arrow represents the portion of TAIL7 that is not similar to the other sequences. Nucleotides corresponding to characterized protein domains (transmembrane helices TMH1–TMH8 and conserved regions U1–U4) are labeled on the complete cDNA sequence. Sequences that differ by <2% at the nucleotide level have been assigned the same letter.

The contig assembled from PM006c05 and sequences generated by HE-TAIL PCR spanned a nearly complete CESA gene, including regions predicted to encode all four conserved U domains and 3′ and 5′ transmembrane domains, but contained deletions and frame shifts. Using primers based on this contig (GS9 and GS10), a genomic sequence (PCR1, EU279861) with no frame shifts or internal stop codons was amplified (Fig. 2). When this sequence was used as a query in a BLASTn search of the Kazusa P. yezoensis EST index, four additional EST sequences (AV432955, AU196444, AU186812, AU192243) were identified (Fig. 2). A sixth EST sequence (AV429626), composed primarily of the 5′ UTR, was identified in a final BLASTn search using AU192243 as a query. These sequences were not identified in the original search because of their lack of similarity to land plant CESA genes. The sequence of the cDNA clone corresponding to AV429626 (PL006a11) contained a start codon in the correct context, a poly(A) tail and sequences predicted to encode all four U domains and eight transmembrane helices and was named PyCESA1 (Fig. 2). All EST sequences and the PCR amplified genomic sequence were identical to PL006a11 in their regions of overlap. An alignment of TAIL4/TAIL5 and PyCESA1 showed that TAIL4/TAIL5 has an insert and a deletion, which result in frame shifts. These, in addition to numerous single nucleotide substitutions, indicate that the TAIL4 and TAIL5 clones represent a pseudogene.
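One of the criteria used above to flag pseudogenes, the presence of internal stop codons in the expected reading frame, can be sketched in Python as follows. This is a minimal illustration that assumes the standard genetic code stop codons (TAA, TAG, TGA) and a known reading frame; the function name and the toy sequences are hypothetical and are not taken from the TAIL clones.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def internal_stop_positions(cds: str, frame: int = 0) -> list[int]:
    """Return 0-based start positions of in-frame stop codons.

    The last complete codon is skipped because a terminal stop is expected
    in an intact coding sequence.
    """
    seq = cds.upper()
    codon_starts = list(range(frame, len(seq) - 2, 3))
    return [i for i in codon_starts[:-1] if seq[i:i + 3] in STOP_CODONS]


# Toy sequences (hypothetical)
intact = "ATGGCTTGGTAA"        # ATG GCT TGG TAA
broken = "ATGGCTTAGTGGTAA"     # ATG GCT TAG TGG TAA
print(internal_stop_positions(intact), internal_stop_positions(broken))
```

A frame shift caused by an insertion or deletion typically shows up in such a scan as one or more premature stops downstream of the shifted position.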

Southern blot analysis was used to estimate the total number of CESA genes in P. yezoensis. For four of the five restriction enzymes used, digestion produced three bands that hybridized strongly with the probe, whereas SmaI produced a very strong band at about 4 kb and a strong band at about 9 kb (Fig. 3). When the gel was run for 24 h, the low molecular mass bands were weak (arrowheads, Fig. 3a). These bands were stronger when the gel was run for 18 h (Fig. 3b). The 888 bp PstI fragment (arrowhead, Fig. 3b) was predicted from the PyCESA1 sequence. Several weaker bands were also identified in each lane.
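Predicting a restriction fragment from a known sequence, as was done for the 888 bp PstI fragment, amounts to an in-silico digest. The sketch below assumes a linear molecule and considers only the top strand; it uses the PstI recognition site CTGCAG with the cut after the fifth base (CTGCA^G). The function name and the toy sequence are hypothetical.

```python
def digest_fragment_lengths(seq: str, site: str = "CTGCAG",
                            cut_offset: int = 5) -> list[int]:
    """Return top-strand fragment lengths after cutting a linear sequence.

    Defaults model PstI (CTGCA^G). Overlapping site matches are scanned
    by advancing one base at a time.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy sequence with two PstI sites (hypothetical, not PyCESA1)
toy_seq = "AAAA" + "CTGCAG" + "TTTTTTTT" + "CTGCAG" + "CC"
print(digest_fragment_lengths(toy_seq))
```

Only the internal fragment, bounded by two sites within the known sequence, has a predictable size on a genomic blot; the sizes of the flanking fragments depend on sites outside the sequenced region.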

Figure 3.

 Southern blot analysis of Porphyra yezoensis genomic DNA (6 μg per lane) digested with the indicated restriction enzymes and separated on 0.7% agarose gels for 24 h (a) or 18 h (b). (a) Weak low-molecular-weight bands are indicated by arrowheads. (b) An 888 bp PstI fragment predicted from the PyCESA1 sequence is indicated by the arrowhead.

Using PyCESA1 as a query, the top BLASTp hits against the nonredundant database in GenBank included 16 cellulose synthase sequences from Phytophthora spp. (four orthologs from each of four species; E-values = 3e−82 to 3e−22) and three cyanobacterial putative cellulose synthase sequences (E-values = 1e−49 to 1e−24). Cellulose synthase sequences from the cellular slime mold D. discoideum (E-value = 3e−28), two species of tunicates from the genus Ciona (E-values = 2e−16 to 6e−16), and various eubacteria and cyanobacteria (E-values ≤ 2e−19) were also among the top hits. A search restricted to eukaryotes produced the 19 eukaryotic sequences above, as well as nine putative glycosyltransferase sequences from ascomycete fungi (E-values = 1e−13 to 2e−08) and various CESA sequences from embryophytes (E-values ≤ 5e−8). When the search was restricted to cyanobacteria, sequences representing both the CcsA1 (E-values = 3e−51 to 2e−26) and CcsA2 (E-values ≤ 2e−15) clades (Nobles and Brown 2004) were identified.

In the alignment constructed using putative cellulose synthase protein sequences from P. yezoensis, Phytophthora infestans (four orthologs), Ciona sp., D. discoideum, nine species of ascomycete fungi, Arabidopsis thaliana, Physcomitrella patens, Mesotaenium caldariorum, cyanobacteria (all three CcsA1 sequences and one CcsA2 sequence), and four species of eubacteria, only the regions surrounding the conserved U1, U2, and U3/U4 domains were well aligned (Appendix S1 in the supplementary material). Although the P. yezoensis, P. infestans, D. discoideum, ascomycete, and cyanobacterial Ccsa1 clade sequences included insertions in the “plant conserved region” between U1 and U2, these were poorly aligned between major taxa (Appendix S1; see also Nobles and Brown 2004, Fig. 5). When poorly aligned sequences were edited from the alignment (Appendix S2 in the supplementary material), 246 characters remained, of which 220 were parsimony informative. The resulting unrooted parsimony phylogram (Fig. 4) has five major clades that intersect in a polytomy in the bootstrap consensus tree. Phylograms created using ML and Bayesian methods had nearly identical topologies, and only the support values are shown in Fig. 4. The ascomycete, eubacterial (including a CcsA2 sequence), tunicate, and streptophyte clades all have strong bootstrap support. The fifth clade, which unites the P. infestans, P. yezoensis, cyanobacterial CcsA1, and D. discoideum sequences, has strong Bayesian support, but weak support using the parsimony and ML methods. The topology of the streptophyte CESA clade is not resolved because most of the divergence between these sequences occurs in the regions that were deleted from the alignment.
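The count of parsimony-informative characters reported above follows a standard definition: a column is informative when at least two different states each occur in at least two sequences. A minimal Python sketch of that definition, assuming a gap-free alignment (as in the edited alignment) and using hypothetical function names and toy data:

```python
from collections import Counter


def is_parsimony_informative(column: str) -> bool:
    """True if at least two states each appear in at least two sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative(alignment: list[str]) -> int:
    """Count parsimony-informative columns in a gap-free alignment."""
    return sum(is_parsimony_informative("".join(col)) for col in zip(*alignment))


# Toy alignment (hypothetical sequences)
toy_alignment = ["MKLV", "MKLV", "MRAV", "MRAI"]
print(count_informative(toy_alignment))
```

In the toy alignment the first column is invariant and the last has a single deviating residue, so only the two middle columns are informative.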

Figure 5.

 Domain structures of representative cellulose synthase proteins from Gossypium hirsutum (U58283.1), Porphyra yezoensis (EU279853), Phytophthora infestans (ABP96904.1), Anabaena variabilis (YP_322086.1), Dictyostelium discoideum (AF163835.1), Ciona intestinalis (AAR89623.1), Aspergillus fumigatus (XP_748682.1), and Gluconoacetobacter xylinus (CAA38487.1), including zinc-binding domain (Zn), transmembrane domains (TMD, black), cellulose synthase conserved regions (U1–U4, gray), conserved region plant (CR-P), class-specific region (CSR), pleckstrin homology domain, and β-glucanase homology domain. Terminal complex (TC) structures for land plants (Delmer 1999), P. yezoensis (Tsekos and Reiss 1994), D. discoideum (Grimson et al. 1996), ascidians (Kimura and Itoh 2004), and G. xylinus (Brown et al. 1976) are also shown.

Figure 4.

 Unrooted parsimony phylogram corresponding to the majority consensus of 1,000 bootstrap replicates of cellulose synthase protein sequences from Porphyra yezoensis (EU279853), Phytophthora infestans (ABP96902.1, ABP96903.1, ABP96904.1, ABP96905.1), Dictyostelium discoideum (AF163835.1), ascomycetes (XP_001557767.1, XP_001273558.1, XP_388067.1, XP_748682.1, XP_001259116.1, XP_001588400.1, BAE61927.1, EAT86578.1), Ciona sp. (AAR89623.1, NP_001041448.1), cyanobacteria (YP_322086.1, NP_487797.1, ZP_00109724.1, ZP_00108526.1), eubacteria (AAC07360.1, NP_436917.1, CAA38487.1), and streptophytes including Mesotaenium caldariorum (AAM83096, AAT48369), Physcomitrella patens (DQ902545, DQ902546, DQ902547, DQ902548, DQ902549, DQ902550, DQ902551), and Arabidopsis thaliana (TAIR locus IDs: At4g32410, At4g39350, At5g05170, At5g44030, At5g09870, At5g64740, At5g17420, At4g18780, At2g21770, At2g25540). Bootstrap values > 50 are shown with maximum-likelihood and Bayesian support values, respectively, shown in parentheses for major nodes. Shaded lines indicate polytomies.

Discussion

Similarity to known cellulose synthases supports the hypothesis that the protein encoded by PyCESA1 is also a cellulose synthase. PyCESA1 is predicted to encode all four CESA catalytic domains (U1–U4) and the expected N- and C-terminal transmembrane domains. Although other members of glycosyltransferase family 2 contain some or all of the U1–U4 domains (Coutinho et al. 2003), the sequences identified in BLASTp searches against GenBank using PyCESA1 as a query were either known or putative cellulose synthases. When the P. yezoensis EST index was queried with cellulose synthase-like (CSL) sequences from P. patens (Roberts and Bushoven 2007), the only sequence identified had the highest similarity to dolichol-phosphate mannosyltransferases (AV434923). No sequences similar to PyCESA1 were identified in BLASTp or tBLASTn searches (expectation value < 0.1) of the complete genome sequence (http://merolae.biol.s.u-tokyo.ac.jp) of the wall-less red alga Cyanidioschyzon merolae (Matsuzaki et al. 2004), which is consistent with a role for PyCESA1 in cell wall biosynthesis. We have designated the putative cellulose synthase gene from P. yezoensis PyCESA1 based on the nomenclature proposed by Delmer (1999) for all cellulose synthase genes and applied consistently to putative cellulose synthase genes from the Viridiplantae (Richmond 2000, Roberts et al. 2002, Nairn and Haselkorn 2005, Roberts and Bushoven 2007). Although applied to the cellulose synthase gene from Ciona intestinalis (Nakashima et al. 2004, Sasakura et al. 2005), the CESA designation has not been used consistently for cellulose synthase genes from other taxa (Blanton et al. 2000, Nobles et al. 2001, Matthysse et al. 2004). PyCESA1 ESTs were present in both sporophyte and gametophyte libraries. This finding is unexpected since cellulose is thought to be present in walls of sporophytes but not gametophytes (Craigie 1990). This may be explained by the fact that the gametophytes used to produce the library were known to be contaminated with a small amount of sporophyte tissue (Asamizu et al. 2003).

The HE-TAIL PCR method (Michiels et al. 2003) was highly effective in amplifying flanking sequences and allowing selection of specific products. By comparing the sizes of products amplified in the tertiary reaction by two different gene-specific primers (Fig. 1), and cloning only the products for which a product differing in size by the expected amount could be identified in the companion reaction, we avoided cloning and sequencing any nonspecific products.
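The selection rule described above, keeping only bands that have a partner of the expected size offset in the companion reaction, can be expressed as a simple pairing check. The sketch below is illustrative only: the function name, the band sizes, and the tolerance are hypothetical, since the text reports only the expected differences (287 bp and 56 bp) and not the measured band sizes or the precision of the gel estimates.

```python
def matched_pairs(lane_a: list[int], lane_b: list[int],
                  expected_diff: int, tolerance: int) -> list[tuple[int, int]]:
    """Pair bands from two companion lanes whose size difference matches.

    A pair is kept when the absolute size difference is within `tolerance`
    base pairs of `expected_diff`.
    """
    return [(a, b) for a in lane_a for b in lane_b
            if abs(abs(a - b) - expected_diff) <= tolerance]


# Hypothetical band sizes (bp) for two companion tertiary reactions
lane_gs3 = [1200, 700]
lane_gs4 = [910, 650]
print(matched_pairs(lane_gs3, lane_gs4, expected_diff=287, tolerance=25))
```

With these made-up values only the 1200/910 pair differs by roughly 287 bp, so only those two bands would be candidates for cloning.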

The P. yezoensis genome contains at least one expressed CESA gene and two CESA pseudogenes based on sequences identified in the EST index, gene fragments cloned by HE-TAIL PCR, and Southern blot analysis. The 20,779 sequences in the P. yezoensis EST index (Nikaido et al. 2000, Asamizu et al. 2003) include six sequences with similarity to plant and microbial cellulose synthase genes. All are identical to the full-length sequence of cDNA clone PL006a11 in their regions of overlap, indicating that they represent a single expressed gene, PyCESA1. Genomic fragments amplified by HE-TAIL PCR represent at least two additional sequences (Fig. 2), both of which appear to be pseudogenes. TAIL4 and TAIL5 represent one pseudogene based on their shared identity of >98% along with the presence of internal stop codons and a frame shift. TAIL3 also contains an internal stop codon and may be a fragment of this same pseudogene. TAIL7 represents a fusion of a partial CESA gene and a histidine kinase gene. TAIL1, TAIL2, and TAIL6 share >98% identity with TAIL7, indicating that they represent the same pseudogene. The differences between these sequences may have been introduced during the three rounds of amplification required in HE-TAIL PCR. Results of Southern blot analysis are also consistent with the presence of three CESA genes (i.e., PyCESA1 and two pseudogenes) in the P. yezoensis genome. Land plants have large CESA families, and rosette terminal complex assembly in Arabidopsis requires three different CESA proteins (Gardiner et al. 2003). P. yezoensis appears to be similar to D. discoideum (Blanton et al. 2000) and C. intestinalis (Nakashima et al. 2004) in having a solitary CESA gene. This finding indicates that, in contrast to the rosettes of seed plants, linear terminal complexes are assembled from a single type of CESA protein.

Comparing the CESA proteins of P. yezoensis and other organisms (Fig. 5) may provide insight into the mechanisms that regulate assembly and function of rosette and linear terminal complexes. Specific association between CESA subunits in vitro (Kurek et al. 2002, Taylor et al. 2003) and disintegration of rosettes in Arabidopsis cesA1 mutants (Arioli et al. 1998) support the hypothesis that CESA proteins determine terminal complex structure. Differences in the susceptibility of rosette and linear terminal complexes to dissociation by cellulose synthesis inhibitors (Mizuta and Brown 1992, Peng et al. 2001, Kiedaisch et al. 2003) are consistent with differences in the way CESA subunits associate within the different types of terminal complexes. In contrast to the rosette-forming CESA proteins of streptophytes, the linear terminal complex-forming CESA proteins of P. yezoensis, Gluconoacetobacter xylinus (Saxena et al. 1990, Wong et al. 1990), D. discoideum (Blanton et al. 2000), and Ciona sp. (Matthysse et al. 2004, Nakashima et al. 2004) lack the zinc-binding domain and the “class-specific” (CSR) insertion between U2 and U3 (Fig. 5). The zinc-binding domain is required for in vitro association of cotton CESA proteins, directly implicating this feature in rosette assembly (Kurek et al. 2002). Although some CESA sequences from Phytophthora sp. have insertions between U2 and U3 (Grenville-Briggs et al. 2008), they do not appear to be homologous to the CSR, and oomycete terminal complexes have not been characterized. Thus, the absence of the CSR may be a general feature of CESA proteins that form linear terminal complexes. CESA proteins from streptophytes, P. yezoensis, Phytophthora sp., some cyanobacteria, and D. discoideum contain insertions between U1 and U2, which are not present in CESA proteins from tunicates, ascomycetes, and eubacteria (Fig. 5). Despite previous claims that proteins from the cyanobacterial CcsA1 clade contain sequences homologous to the CR-P region of streptophyte CESAs (Nobles et al. 2001), our alignments showed at most 10% amino acid identity overall and no regions of more concentrated identity between streptophyte CR-P sequences and other insertions in this region. Thus, although insertions between U1 and U2 are found in CESAs from many taxa, they appear to have had a polyphyletic origin, and their functions remain unknown.

Our results expand an earlier analysis of the prokaryotic ancestry of eukaryotic cellulose synthases (Nobles and Brown 2004) by incorporating the PyCESA1 sequence as well as oomycete and ascomycete sequences that were recently deposited in GenBank. Our analysis indicates that PyCESA1 shares a common ancestor with cellulose synthases from oomycetes, and possibly those from the cyanobacterial CcsA1 clade and D. discoideum. Our analysis does not provide the strong support for a common ancestor of the cyanobacterial CcsA1 clade and vascular plant CESAs shown previously (Nobles and Brown 2004). Using our alignment and Nobles and Brown's (2004) methods (i.e., including sequence segments between U1 and U2 [Appendix S1] and treating gaps as a 21st amino acid), we were able to reproduce their topology with strong bootstrap support (data not shown). However, poor alignment in segments between U1 and U2 gave us little confidence that the characters were homologous, and treating gaps as a 21st amino acid gives extremely high weight to large gaps (Baldauf 2003), some of which are up to 150 amino acids in length. Thus, our analysis is more conservative.

The cellulose synthase phylogeny presented here is consistent with current hypotheses of the origins of the major eukaryotic lineages (Parfrey et al. 2006). Although the red algae are members of the Plantae, many of their genes share more recent common ancestry with those of chromalveolates due to lateral gene transfer coinciding with acquisition of a red algal endosymbiont at the base of that lineage (Li et al. 2006). The clustering of PyCESA1 and P. infestans sequences indicates that the stramenopiles obtained their cellulose synthase genes from this red algal endosymbiont. The lack of introns in PyCESA1 and one of the CESA genes from Phytophthora ramorum is further evidence of their common ancestry. The lack of clustering of cellulose synthase genes from unikonts (i.e., D. discoideum, Ciona sp., and ascomycetes) along with the spotty occurrence of cellulose within this clade is consistent with lateral gene transfer as proposed previously (Blanton et al. 2000, Matthysse et al. 2004, Nakashima et al. 2004). The identification of cellulose synthase sequences in ascomycete fungi is surprising given that their cell walls are thought to lack cellulose (Ruiz-Herrera 1992). As shown previously for Aspergillus (Nobles and Brown 2004), the ascomycete sequences are more similar to cellulose synthases than to other glycosyltransferases, such as chitin synthase. Ascomycete cell walls have not been tested exhaustively for cellulose (Ruiz-Herrera 1992) and thus may contain minor amounts of cellulose or a noncrystalline β-1,4-glucan.

Although our analysis does not resolve a common ancestry of red algal and plant CESAs, the weak clustering of PyCESA1 and the CcsA1 clade of cyanobacterial cellulose synthases does have implications for the origin of plant cellulose synthases. Nobles and Brown (2004, p. 445) proposed “independent functional incorporation of cellulose synthases in the chlorophyte and streptophyte lineages—the independent development of linear and rosette [terminal complexes].” Although their phylogenetic analysis indicated a shared common ancestor of the streptophyte CESAs and the CcsA1 clade of cyanobacterial cellulose synthases, they speculated that the chlorophyte cellulose synthases may share a common ancestor with cyanobacterial cellulose synthases from the CcsA2 clade. Our analysis implies a common ancestry of PyCESA1 and the CcsA1 clade. Given that P. yezoensis has linear terminal complexes and that the rhodophytes are the sister clade of the Viridiplantae, it is more likely that the cellulose synthases that compose the linear terminal complexes of chlorophytes and rhodophytes and the rosettes of the streptophytes all share a common ancestor with the CcsA1 clade.

Acknowledgments

This research was supported by a grant from the Rhode Island Science and Technology Advisory Council and was based in part upon work conducted using the Rhode Island Genomics and Sequencing Center, which is supported in part by the National Science Foundation (MRI Grant No. DBI-0215393 and EPSCoR Grant No. 0554548), the U.S. Department of Agriculture (Grant Nos. 2002-34438-12688 and 2003-34438-13111), and the University of Rhode Island. We thank the Kazusa DNA Research Institute for providing cDNA clones and Yukihiro Kitade for providing cultures of P. yezoensis sporophyte. Tina Sharp and Mary Jane Mello participated in preliminary work on this project.
