Chitinase-like proteins encoded in the genome of the pea aphid, Acyrthosiphon pisum

Authors


Atsushi Nakabachi, Advanced Science Institute, RIKEN, Wako, Saitama 351-0198, Japan. Tel.: +81 48 467 9332; fax: +81 48 462 9329; e-mail: bachi@riken.jp

Abstract

In insects, chitinases play an essential role in the degradation of old exoskeleton and turnover of the gut lining. In silico screening of the entire genome of the pea aphid (Hemimetabola), Acyrthosiphon pisum, detected nine genes encoding putative chitinase-like proteins, including six enzymatically active chitinases, one imaginal disc growth factor, and one endo-beta-N-acetylglucosaminidase. Screening of the genomes of Aedes aegypti, Anopheles gambiae, Apis mellifera, Bombyx mori, Culex quinquefasciatus, Drosophila melanogaster, Nasonia vitripennis, Pediculus humanus corporis, and Tribolium castaneum suggested repeated gene duplications in holometabolous lineages. Quantitative reverse transcription-PCR demonstrated the expression of four and two distinct chitinase-like genes of A. pisum to be highly up-regulated in the embryo and the midgut, respectively, suggesting specific roles in these pea aphid tissues.

Introduction

Chitin, a homopolymer of N-acetyl-beta-D-glucosamine (GlcNAc), is a principal structural component of the exoskeleton and gut lining of insects. Whereas the robust and durable cuticle protects insects from mechanical stress, its rigidity restricts growth. Thus, for growth and development to proceed, the cuticle must be degraded periodically and replaced with newly synthesized material. The chitinases (EC 3.2.1.14), enzymes with chitinolytic activity, play an important role in degrading old cuticles (Merzendorfer & Zimoch, 2003). The chitinases are glycosyl hydrolases that catalyse the random hydrolysis of the beta-(1,4)-glycosidic bonds in chitin. The insoluble, polymeric chitin is digested and becomes soluble, yielding low molecular mass multimers of GlcNAc, such as chitotetraose, chitotriose, and chitobiose (Kramer & Muthukrishnan, 1997; Zhu et al., 2004). The multimers of GlcNAc are subsequently hydrolysed to N-acetylglucosamine by exo-beta-N-acetylglucosaminidase (EC 3.2.1.52) (Filho et al., 2002; Merzendorfer & Zimoch, 2003). Insect chitinases belong to the evolutionarily conserved glycosyl hydrolase family 18 (GH18) (Merzendorfer & Zimoch, 2003; Zhu et al., 2004; Zhu et al., 2008), which includes enzymatically active chitinases, as well as their relatives lacking chitinase activity, such as imaginal disc growth factors (IDGFs), endo-beta-N-acetylglucosaminidases (ENGases), stabilin-1 interacting chitinase-like proteins (SI-CLPs), and chitolectins (Funkhouser & Aronson, 2007). The N-terminal catalytic domain of the GH18 family members is characterized by an eight-stranded beta/alpha-barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel (van Aalten et al., 2001; Aronson et al., 2003). 
Within this barrel, the beta4 strand contains a conserved sequence motif that forms the active site of the enzyme, with glutamic acid being the key residue that donates the proton required for hydrolysing the beta-(1,4)-glycosidic bond in chitin (Shuhui et al., 2009). In the enzymatically inactive GH18 family members, the substitution of this essential glutamic acid accounts for the lack of chitinolytic activity, even though they may still be capable of binding to chitin with high affinity (Funkhouser & Aronson, 2007; Shuhui et al., 2009).

In insects, the chitinases are present mainly in the moulting fluid and midgut to enable periodic shedding of old exoskeleton and turnover of the midgut lining (Merzendorfer & Zimoch, 2003; Zhu et al., 2004; Zhu et al., 2008). Previous studies based on similarity searches detected 16, 16, and 13 genes for chitinase-like proteins in the genomes of Drosophila melanogaster (Diptera), Tribolium castaneum (Coleoptera), and Anopheles gambiae (Diptera), respectively (Zhu et al., 2004; Zhu et al., 2008). This implied that insect chitinase-like proteins are encoded by a rather large and diverse group of genes (Zhu et al., 2008). However, all of these insects with completely sequenced genomes were of the holometabolous lineage, which exhibit complete metamorphosis during development. The newly sequenced genome of the pea aphid, Acyrthosiphon pisum (Hemiptera) (IAGC, 2009), together with that of the human body louse, Pediculus humanus corporis (Phthiraptera), presents the first opportunity to discover the gene inventory of the chitinase-like proteins in hemimetabolous insects.

In this study, the genomes of the pea aphid, as well as all of the other fully sequenced insects, were screened for genes encoding chitinase-like proteins (GH18 family members). The retrieved chitinase-like gene models of A. pisum were compared with those of other insects, and were characterized using structural and molecular phylogenetic analyses. We further analysed the expression profiles of the A. pisum chitinase-like genes in the whole body, the bacteriocyte, the embryo, and the midgut, using the real-time quantitative reverse transcription (RT)-PCR technique.

Results

In silico screening identified nine genes for chitinase-like proteins in the Acyrthosiphon pisum genome

BLASTP searches of the A. pisum RefSeq protein database detected eight gene models for chitinase-like proteins: AcypiCht1 (ACYPI001365, LOC100160032), AcypiCht2 (ACYPI010095, LOC100169480), AcypiCht3 (ACYPI001396, LOC100160065), AcypiCht4 (ACYPI006403, LOC100165452), AcypiCht5 (ACYPI009964, LOC100169337), AcypiCht6 (ACYPI009878, LOC100169240), AcypiCht7 (ACYPI005756, LOC100164767), and AcypiCht8 (ACYPI003866, LOC100162732) (Table 1). For simplicity, these are hereafter referred to as Cht1-Cht8, respectively. BLASTP searches of the A. pisum Gnomon-predicted protein database and TBLASTN searches of the A. pisum genome scaffolds (Acyr_1.0) identified the same candidates.

Table 1. Genes for chitinase-like proteins detected in the Acyrthosiphon pisum genome

| Gene symbol | Gene name | Gene IDs | RefSeq mRNA | RefSeq protein | Protein length (aa) | Conserved E in CR_II | Putative chitinase activity | Note |
|---|---|---|---|---|---|---|---|---|
| AcypiCht1 | Chitinase-like protein 1 | ACYPI001365, LOC100160032 | XM_001943565.1 | XP_001943600.1 | 414 | − | − (IDGF) | |
| AcypiCht2 | Chitinase-like protein 2 | ACYPI010095, LOC100169480 | XM_001943003.1 | XP_001943038.1 | 2098 | + | + | |
| AcypiCht3 | Chitinase-like protein 3 | ACYPI001396, LOC100160065 | XM_001942561.1 | XP_001942596.1 | 1207 | + | + | |
| AcypiCht4 | Chitinase-like protein 4 | ACYPI006403, LOC100165452 | XM_001950345.1 | XP_001950380.1 | 998 | + | + | |
| AcypiCht5 | Chitinase-like protein 5 | ACYPI009964, LOC100169337 | XM_001947381.1 | XP_001947416.1 | (859->) 535 | + | + | 5′-end corrected |
| AcypiCht6 | Chitinase-like protein 6 | ACYPI009878, LOC100169240 | XM_001952683.1 | XP_001952718.1 | (452->) 473 | + | + | 3′-end corrected |
| AcypiCht7 | Chitinase-like protein 7 | ACYPI005756, LOC100164767 | XM_001947852.1 | XP_001947887.1 | 1284 | + | + | |
| AcypiCht8 | Chitinase-like protein 8 | ACYPI003866, LOC100162732 | XM_001945435.1 | XP_001945470.1 | 1581 | − | − | |
| ENGase | Endo-beta-N-acetylglucosaminidase | ACYPI009249, LOC100168559 | XM_001949910.1 | XP_001949945.1 | 528 | + | − (ENGase) | |

Conserved domain (CD-) searches of the A. pisum RefSeq protein database for gene models having glycosyl hydrolase family 18 (GH18) chitinase-like superfamily domain(s) (superfamily cluster accession number: cl10447) detected one additional candidate for a chitinase-like protein [ENGase (ACYPI009249, LOC100168559)] (Table 1). This appeared to be an endo-beta-N-acetylglucosaminidase (ENGase) (EC 3.2.1.96), which hydrolyses the N,N′-diacetylchitobiosyl core of N-glycosylproteins (Suzuki et al., 2002). ENGases belong to the glycosyl hydrolase family 85 (GH85), and are included in the GH18 chitinase-like superfamily, as they are phylogenetically closely related to the GH18 chitinases (Waddling et al., 2000). Hidden Markov Model (HMM) searches of the Gnomon-predicted protein database and ACYPI protein database of A. pisum for gene models with Pfam Glycosyl hydrolase family 18 domain(s) (Glyco_hydro_18_fs.hmm, PF00704) detected no further candidates.
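The way the three searches combine into a single candidate list can be sketched as a set union. The gene IDs below are those listed in Table 1; the exact hit set returned by each individual search is not reported in the text, so the per-search contents (in particular, that the CD-search and HMM search recovered the same eight BLASTP candidates) are assumptions made for illustration only.

```python
# Gene IDs taken from Table 1. Which search returned which ID is partly assumed.
blastp_hits = {
    "ACYPI001365", "ACYPI010095", "ACYPI001396", "ACYPI006403",
    "ACYPI009964", "ACYPI009878", "ACYPI005756", "ACYPI003866",
}
# Assumption: the CD-search recovered the BLASTP candidates plus the ENGase model.
cd_search_hits = blastp_hits | {"ACYPI009249"}
# Assumption: the Pfam HMM search returned no model outside the sets above.
hmm_hits = set(blastp_hits)


def merge_candidates(*hit_sets):
    """Return the union of all hit sets (one entry per gene model)."""
    merged = set()
    for hits in hit_sets:
        merged |= hits
    return merged


candidates = merge_candidates(blastp_hits, cd_search_hits, hmm_hits)
new_from_cd = cd_search_hits - blastp_hits
new_from_hmm = hmm_hits - (blastp_hits | cd_search_hits)

print(len(candidates))       # number of distinct candidate gene models
print(sorted(new_from_cd))   # candidates contributed only by the CD-search
print(sorted(new_from_hmm))  # candidates contributed only by the HMM search
```

Under these assumptions the union contains nine gene models, with the ENGase model as the only one contributed by the CD-search.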

Structure of the aphid chitinase-like proteins

The retrieved nine gene models for the chitinase-like proteins of A. pisum were visually inspected with reference to the full-length cDNA clone sequence information (Shigenobu et al., 2010) and other supportive data, including the orthology-based protein evidence. The structures of Cht5 and Cht6 were corrected based on this manual checking procedure (Table 1). Domain analysis of the A. pisum chitinase-like proteins showed that Cht1, Cht3, Cht5, Cht6, Cht7, Cht8 and ENGase have a single copy each of the GH18 chitinase-like superfamily domain, whereas Cht2 and Cht4 have four and two copies of this domain, respectively (Fig. 1). Signal peptides were detected in Cht1, Cht2, Cht3, Cht4, Cht5 and Cht8, suggesting that they are secreted proteins, as are the canonical chitinases studied to date (Merzendorfer & Zimoch, 2003; Zhu et al., 2008). A single transmembrane region was observed in Cht6. One or more chitin-binding peritrophin-A domains (cl02629: CBM_14; Chitin-binding domain type 2) (Tellam et al., 1999) were detected in Cht2, Cht3, Cht4, Cht5, and Cht8. Three non-peritrophin-type putative chitin-binding domains (cd06918: ChtBD1_like) were detected in Cht7.

Figure 1.

Domain architecture of chitinase-like proteins of Acyrthosiphon pisum. Domain structures of the aphid chitinase-like gene models were analysed by using the NCBI CD-search and the Pfam domain search programs. The presence and location of signal peptides were predicted by using the program SignalP 3.0. Transmembrane regions were analysed using the TMHMM Server ver. 2.0. Closed and open circles above the GH18 domains indicate the presence and absence, respectively, of the conserved glutamic acid residue in CR_II (see Fig. 2). The protein sizes are shown on the right.

Previous studies showed that there are four conserved regions in the amino acid sequences of the GH18 domains of insect chitinases (Kramer & Muthukrishnan, 1997; de la Vega et al., 1998; Zhu et al., 2004). Conserved region I (CR_I) is represented by the sequence KXXXXXGGW, where X is a non-specific amino acid. Conserved region II (CR_II), FDGXDLDWEYP, was shown to be the most important for enzymatic activity. This region forms the catalytic active site of the enzyme, where the residue E is a putative proton donor essential for the chitinase activity. The conserved regions III and IV (CR_III, CR_IV) are MXYDXXG and GXXXWXXDXDD, respectively. The amino acid sequences of the GH18 domains of the aphid chitinase-like proteins are shown in Fig. 2. In Cht1, all of the four regions were observed to be poorly conserved. The key active site glutamate was replaced with glutamine in CR_II of Cht1, suggesting that this protein lacks chitinase activity. In Cht2, the first GH18 domain (Cht2-1) was truncated at the N-terminus and lacked CR_I and CR_II. CR_III was poorly conserved, whereas CR_IV was well conserved in this domain. In the second GH18 domain of Cht2 (Cht2-2), CR_I, CR_III, and CR_IV were well conserved, whereas the key active site glutamate was replaced with asparagine in CR_II. The third GH18 domain in Cht2 (Cht2-3) was somewhat shorter at the C-terminus, and lacked CR_IV. CR_I, CR_II, and CR_III were well conserved in this domain. In the fourth GH18 domain of Cht2 (Cht2-4) all of the four regions were well conserved. Thus, Cht2 appears to be a functional chitinase, having at least two catalytic domains (Cht2-3 and Cht2-4). Essentially, all of the four regions were well conserved in Cht3, two domains in Cht4 (Cht4-1 and Cht4-2), Cht5, and Cht7, suggesting that they have chitinolytic activities. In Cht6, CR_I was poorly conserved, whereas CR_II, CR_III, and CR_IV were moderately well conserved. 
The conservation of the key glutamate in CR_II implies that this protein is also an enzymatically active chitinase. These four regions were poorly conserved in Cht8 as well as ENGase, suggesting that they lack chitinase activity. Whereas all four of the regions were poorly conserved in ENGase, the key glutamic acid residue was conserved in CR_II. This is consistent with the previous findings that this glutamic acid is the proton donor and is essential for the enzymatic activity of the ENGases, as is the case in the chitinases (Waddling et al., 2000). In Cht8, this key glutamate residue was not conserved. The presence/absence of putative chitinase activity in each chitinase-like protein is summarized in Table 1. The inference is based on the presence/absence of the conserved glutamate, together with the inferred phylogenetic position of each protein (see below).
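As a rough illustration of the motif definitions above, the four conserved regions can be written as regular expressions in which X matches any residue. This is only a sketch: the study assessed conservation from an alignment (Fig. 2), not by exact pattern matching, and the test sequence and the function names used here are hypothetical.

```python
import re

# Consensus motifs as given in the text; X stands for any amino acid.
CONSERVED_REGIONS = {
    "CR_I": "KXXXXXGGW",
    "CR_II": "FDGXDLDWEYP",
    "CR_III": "MXYDXXG",
    "CR_IV": "GXXXWXXDXDD",
}


def motif_to_regex(motif):
    """Turn a consensus motif into a regex, treating X as a wildcard."""
    return re.compile(motif.replace("X", "."))


def conserved_regions_present(seq):
    """Report, for each region, whether an exact consensus match occurs."""
    return {
        name: motif_to_regex(motif).search(seq) is not None
        for name, motif in CONSERVED_REGIONS.items()
    }


def catalytic_residue(seq):
    """Return the residue at the proton-donor position of CR_II.

    The E position is relaxed to a wildcard so that substitutions such as
    E->Q can be reported. Returns None if the rest of CR_II is not found.
    """
    match = re.search(r"FDG.DLDW(.)YP", seq)
    return match.group(1) if match else None


# Hypothetical sequence containing all four motifs (not a real protein).
active = "MKAAAAAGGWSSFDGLDLDWEYPTTMAYDLHGQQGAAAWTTDLDD"
# Hypothetical variant carrying an E->Q substitution in CR_II.
inactive = active.replace("WEYP", "WQYP")

print(conserved_regions_present(active))
print(catalytic_residue(active), catalytic_residue(inactive))
```

Exact matching is much stricter than the "well", "moderately well" and "poorly" conserved categories used in the text, so a real analysis would score each region against the alignment instead.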

Figure 2.

Conserved regions in GH18 domains of the aphid chitinase-like proteins. Amino acid sequences of the GH18 chitinase-like superfamily domains of the aphid chitinase-like proteins are aligned with the amino acid sequence of the domain model cd02872 (GH18_chitolectin_chitotriosidase), a representative of the GH18_chitinase-like superfamily. The residues conserved in all, 80%, and 60% of the lineages are shaded in black, dark grey, and light grey, respectively. Four conserved regions (CR_I-IV) that were predicted to be important for enzymatic activity are boxed. Their amino acid sequences are shown above the boxes. The proton donor glutamate, which is essential for the chitinolytic activity, is indicated with an asterisk. Amino acid residues that match the conservation rule are shown in red.

Chitinase-like proteins encoded in other fully-sequenced insect genomes

The genomes of Aedes aegypti (Diptera), An. gambiae (Diptera), Apis mellifera (Hymenoptera), Bombyx mori (Lepidoptera), Culex quinquefasciatus (Diptera), D. melanogaster (Diptera), Nasonia vitripennis (Hymenoptera), P. humanus corporis (Phthiraptera), and T. castaneum (Coleoptera) were screened for genes encoding chitinase-like proteins in the same manner as for A. pisum. The retrieved gene models for chitinase-like proteins are listed in Table S1. In the genomes of Ae. aegypti, An. gambiae, Ap. mellifera, B. mori, C. quinquefasciatus, D. melanogaster, N. vitripennis, P. humanus corporis, and T. castaneum, 31, 27, 16, 12, 34, 18, 15, 10, and 24 chitinase-like genes, respectively, were identified (Table 2). The number of chitinase-like genes tended to be smaller in the hemimetabolous insects (9 and 10 in A. pisum and P. humanus, respectively) than in the holometabolous insects (12–34).

Table 2. Number of genes for chitinase-like proteins in insect genomes

| Species | Chitinase | IDGF | ENGase | Others | Total | Data source (number of predicted gene models) |
|---|---|---|---|---|---|---|
| Acyrthosiphon pisum | 6 | 1 | 1 | 1 | 9 | NCBI RefSeq (10499), Gnomon (37994), ACYPI (34821) |
| Aedes aegypti | 13 | 2 | 2 | 14 | 31 | NCBI RefSeq (16802), VectorBase (16789) |
| Anopheles gambiae | 15 | 2 | 1 | 9 | 27 | NCBI RefSeq (12659), KEGG (12527) |
| Apis mellifera | 8 | 2 | 1 | 5 | 16 | NCBI RefSeq (9171), euGenes (17182) |
| Bombyx mori | 7 | 1 | 1 | 3 | 12 | SilkDB (14623) |
| Culex quinquefasciatus | 13 | 3 | 1 | 17 | 34 | NCBI RefSeq (20299), VectorBase (18883) |
| Drosophila melanogaster | 9 | 6 | 0 | 3 | 18 | NCBI RefSeq (21099), KEGG (14081) |
| Nasonia vitripennis | 11 | 1 | 1 | 2 | 15 | NCBI RefSeq (9373), euGenes (27287) |
| Pediculus humanus corporis | 7 | 0 | 1 | 2 | 10 | NCBI GenBank (10775), VectorBase (10775) |
| Tribolium castaneum | 20 | 2 | 1 | 1 | 24 | NCBI RefSeq (9849), euGenes (16422) |

IDGF, imaginal disc growth factor; ENGase, endo-beta-N-acetylglucosaminidase.

Conservation of the key glutamate in CR_II was checked for all of the retrieved gene models, and the presence/absence of putative chitinase activity was inferred from the same criteria as used for the aphid gene models (Table S1). Ae. aegypti, An. gambiae, Ap. mellifera, B. mori, C. quinquefasciatus, D. melanogaster, N. vitripennis, P. humanus corporis, and T. castaneum appeared to have 13, 15, 8, 7, 13, 9, 11, 7, and 20 genes, respectively, which encode enzymatically active chitinases (Table 2). Again, the number in the two hemimetabolous insects (6 and 7 in A. pisum and P. humanus, respectively) was at the lower end of the range of values.
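The comparison between hemimetabolous and holometabolous insects can be reproduced from the counts in Table 2. The sketch below transcribes those counts, checks that each row sums to its stated total, and reports the range of values in each group; the variable and function names are illustrative only.

```python
# (chitinase, IDGF, ENGase, others, total), transcribed from Table 2.
TABLE2 = {
    "Acyrthosiphon pisum": (6, 1, 1, 1, 9),
    "Aedes aegypti": (13, 2, 2, 14, 31),
    "Anopheles gambiae": (15, 2, 1, 9, 27),
    "Apis mellifera": (8, 2, 1, 5, 16),
    "Bombyx mori": (7, 1, 1, 3, 12),
    "Culex quinquefasciatus": (13, 3, 1, 17, 34),
    "Drosophila melanogaster": (9, 6, 0, 3, 18),
    "Nasonia vitripennis": (11, 1, 1, 2, 15),
    "Pediculus humanus corporis": (7, 0, 1, 2, 10),
    "Tribolium castaneum": (20, 2, 1, 1, 24),
}
HEMIMETABOLOUS = {"Acyrthosiphon pisum", "Pediculus humanus corporis"}
HOLOMETABOLOUS = set(TABLE2) - HEMIMETABOLOUS
CHITINASE, TOTAL = 0, 4  # column indices within each tuple


def inconsistent_rows(table):
    """Return species whose category counts do not add up to the total."""
    return [sp for sp, (*parts, total) in table.items() if sum(parts) != total]


def count_range(table, column, species):
    """Return (min, max) of one column over the given species."""
    values = [table[sp][column] for sp in species]
    return min(values), max(values)


print(inconsistent_rows(TABLE2))
print(count_range(TABLE2, TOTAL, HEMIMETABOLOUS),
      count_range(TABLE2, TOTAL, HOLOMETABOLOUS))
print(count_range(TABLE2, CHITINASE, HEMIMETABOLOUS),
      count_range(TABLE2, CHITINASE, HOLOMETABOLOUS))
```

With only two hemimetabolous genomes available, these ranges describe a tendency and are not a statistical test.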

Phylogenetic positions of the aphid chitinase-like proteins

Molecular phylogenetic analyses were performed to further identify the nine chitinase-like proteins of A. pisum. The amino acid sequences of the highly conserved GH18 chitinase-like superfamily domains of chitinase-like proteins of A. pisum and other fully sequenced insects were aligned and used for the analyses. Phylogenetic trees were inferred by the neighbour joining (NJ), maximum likelihood (ML), and Bayesian (BI) methods. The proteins used for the analyses are listed in Table S1. For Cht2 and Cht4 of A. pisum, domains Cht2-4 and Cht4-1 were used to construct the tree shown in Fig. 3, whereas the other domains (Cht2-1-3 for Cht2; Cht4-2 for Cht4) exhibited similar phylogenetic positions (data not shown). When multiple GH18 domains were detected in chitinase-like proteins of insects other than A. pisum, only the best-aligned domain was used for the analysis.

Figure 3.

Phylogenetic positions of chitinase-like proteins of Acyrthosiphon pisum. A neighbour joining (NJ) tree is shown, and the maximum likelihood (ML) and Bayesian inference (BI) trees exhibited substantially the same topologies. Values at the nodes are bootstrap support percentages over 50% and Bayesian posterior probability percentages over 95% (NJ/ML/BI). Dashes indicate statistical values of less than 50 (NJ/ML) or 95 (BI). Branch length is proportional to the number of amino acid substitutions, which is indicated by the scale bar. The names of proteins or GenBank accession numbers are shown with the organism names. Chitinase-like proteins of A. pisum, Drosophila melanogaster, and Pediculus humanus are shown in red, blue, and green, respectively. Closed and open circles indicate proteins with and without a conserved glutamic acid in CR_II, respectively. IDGF, imaginal disc growth factor; ENGase, endo-beta-N-acetylglucosaminidase; SI-CLP, stabilin-1 interacting chitinase-like protein.
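The node-label convention described in this legend (NJ/ML/BI, with dashes for values below the thresholds) can be expressed as a small helper. This is a hypothetical sketch of the labelling rule only; it is not taken from the software used to draw the tree, and the third example call uses made-up values.

```python
def format_support(nj, ml, bi, bootstrap_min=50, posterior_min=95):
    """Format node support as 'NJ/ML/BI'.

    Bootstrap percentages (NJ, ML) below bootstrap_min and Bayesian posterior
    probability percentages (BI) below posterior_min are replaced by a dash.
    """
    def show(value, minimum):
        return str(value) if value >= minimum else "-"

    return "/".join([
        show(nj, bootstrap_min),
        show(ml, bootstrap_min),
        show(bi, posterior_min),
    ])


print(format_support(94, 86, 96))   # support values quoted for the IDGF clade
print(format_support(98, 100, 95))  # support values quoted for the ENGase clade
print(format_support(45, 72, 90))   # hypothetical weakly supported node
```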

The analyses robustly demonstrated (94% for NJ/86% for ML/96% for BI) that Cht1 of A. pisum is an orthologue of IDGFs, a family of growth factors identified in insects (Fig. 3) (Kawamura et al., 1999). The IDGFs have an eight-stranded alpha/beta barrel fold and are closely related to the GH18 chitinases, although they have an amino acid substitution (typically E->Q in CR_II) known to abolish chitinase catalytic activity. Thus, IDGFs are inferred to have evolved from chitinases to gain new functions as growth factors, interacting with the cell surface glycoproteins involved in growth-promoting processes (Kawamura et al., 1999). The typical E->Q substitution was observed in the aphid Cht1 as described above (Fig. 2), which is consistent with its inferred phylogenetic position.

The analyses verified that the aphid ENGase is an orthologue of insect ENGases. The monophyly of the insect ENGases, including the aphid ENGase, was demonstrated with robust statistical support (98/100/95) (Fig. 3).

Cht2, Cht3, Cht4, Cht5, Cht6 and Cht7 of A. pisum were shown to belong to the large cluster of enzymatically active chitinases and chitinase-like lectins (Fig. 3). Within this cluster, we identified seven statistically supported subclusters, each formed by at least ten proteins derived from at least three insect species (Fig. 3). Groups I-III correspond to the previously reported groups of the same names (Zhu et al., 2008). However, groups IV-VII are not related to the former groupings, because molecular phylogenetic analyses do not support the former group IV, and the former group V corresponds to the IDGFs. Cht2, Cht3, Cht4, Cht5, and Cht6 of A. pisum were shown to belong to groups II, V, III, I, and IV, respectively (Fig. 3; for the detailed statistical values, see supporting information Fig. S1). Group I consists of a single gene from each of the analysed insects except P. humanus, which has two genes in this group. Groups II, III, and IV each consist of a single gene from each of the analysed insects. Group V consists of a single gene from each of the analysed insects, except for An. gambiae (no gene) and Ap. mellifera (two genes). The tree suggests the loss of a gene in the lineage of An. gambiae after its divergence from the common ancestor of mosquitoes, and a single gene duplication in the lineage of Ap. mellifera. Group VI consists of a single gene from each of the analysed insects, except for A. pisum (no gene) and Ae. aegypti (three genes), implying the loss of a gene in the A. pisum lineage and gene duplications in the Ae. aegypti lineage. Group VII included only mosquito genes (3, 5, and 4 from Ae. aegypti, An. gambiae, and C. quinquefasciatus, respectively). The tree suggests repeated duplications of group VII genes in the mosquito lineages, exemplifying the radical expansion of chitinase genes in holometabolous insects, which is discussed below.

Cht2, Cht3, Cht4, Cht5 and Cht6 of A. pisum were thus indicated to be orthologues of Cht3, Cht6, Cht7, Cht5 and Cht11 of D. melanogaster, respectively (Fig. 3). A. pisum was shown to lack the orthologues for Cht2, Cht4, Cht8, Cht9, Cht12, CG5613 and CG8460 of D. melanogaster (Fig. 3). The aphid Cht7 appeared to be closely related to BGIBMGA008709 of B. mori, which is hypothesized to have been laterally transferred from a serratial bacterium or a baculovirus, into the genome of B. mori (Daimon et al., 2003; Daimon et al., 2005). However, phylogenetic analyses using homologous sequences from bacteria, viruses and invertebrates did not resolve the phylogenetic position of the A. pisum Cht7 (data not shown).

The analyses indicated that P. humanus, a hemimetabolous insect that is closely related to A. pisum, possesses a single orthologue each of A. pisum Cht2, Cht3, Cht4, and Cht6, whereas it appeared to have two copies of chitinase-like genes that are closely related to A. pisum Cht5 (group I) (Fig. 3). An orthologue of D. melanogaster Cht2 (group VI), which is absent from A. pisum, was detected in P. humanus. IDGF was not found in P. humanus. A single gene each for ENGase and stabilin-1 interacting chitinase-like protein (SI-CLP) was identified in the P. humanus genome, whereas the latter was absent from A. pisum. One additional gene (EEB14480.1) was found in the P. humanus genome, but its phylogenetic position was not resolved.

Repeated duplications of chitinase-like genes in the holometabolous lineages

The phylogenetic analyses suggested that the genes for the chitinase-like proteins have been repeatedly duplicated in the holometabolous lineages, especially in Diptera and Coleoptera (Fig. 3). Group VII, mentioned above, provides one set of examples. Other examples are found in the IDGF cluster, to which six homologues of D. melanogaster, Chit and IDGF1-5, belong (Fig. 3; for the detailed statistical values, see supporting information Fig. S1). Their branching pattern suggests repeated duplications of IDGF genes in the Drosophila lineage. The tree also suggests duplications of IDGF genes in Ae. aegypti, An. gambiae, C. quinquefasciatus, Ap. mellifera, and T. castaneum, although the homologue number is smaller (three for C. quinquefasciatus, two for the rest) than that of D. melanogaster (Fig. 3; supporting information Fig. S1). Examples of the gene expansions in T. castaneum are denoted in Fig. 3 and Fig. S1. “Tribolium expansion 1” and “Tribolium expansion 2” indicate monophyletic clades that consist of nine and five genes, respectively, from T. castaneum. No genes from any of the other sequenced insects fell within these clades, implying that the genes in each clade arose from a single ancestral gene in the lineage of T. castaneum. Whereas gene duplication at more moderate levels was observed in hymenopteran insects, no such duplication was detected in B. mori, suggesting that expansion of the chitinase-like genes is not a prerequisite for holometabolism.

Expression profiles of genes for chitinase-like proteins in Acyrthosiphon pisum

To examine the expression profiles of the nine chitinase-like genes identified in the genome of the pea aphid, we quantified their transcripts in the whole body, bacteriocyte, embryo and midgut of A. pisum with the real-time quantitative RT-PCR technique (Fig. 4). The analyses demonstrated that the expression levels of Cht2, Cht3, Cht4, and Cht8 were significantly higher in the embryo than in other organs. The transcripts for Cht2, Cht3, Cht4, and Cht8 were 588-, 52.0-, 876-, and 805-fold more abundant in the embryo than in the bacteriocyte, respectively (P < 0.05, Steel-Dwass test). The expression levels of Cht2, Cht3, Cht4, and Cht8 were 30.6-, 11.8-, 5.38-, and 51.1-fold higher in the embryo than in the midgut, respectively (P < 0.05). As the ‘whole body’ samples included embryos, the total volume of which is relatively large (Whitehead & Douglas, 1993), the expression levels in the whole body appeared to be affected by those in the embryos. Still, the transcripts for Cht2, Cht3, Cht4, and Cht8 were 2.34-, 1.69-, 6.81-, and 1.43-fold more abundant in the embryo than in the whole body, respectively (P < 0.05). The expression levels of Cht6 and ENGase were significantly higher in the midgut than in other organs. The transcripts for Cht6 were 3.15-, 7.60-, and 4.96-fold more abundant in the midgut than in the whole body, bacteriocyte, and embryo, respectively (P < 0.05). The expression level of ENGase was 6.56-, 7.34-, and 5.89-fold higher in the midgut than in the whole body, bacteriocyte, and embryo, respectively (P < 0.05).
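The fold differences reported above are ratios of normalized expression levels, where each level is expressed as mRNA copies of the target gene per copy of RpL7 mRNA (Fig. 4). A minimal sketch of that arithmetic follows. The copy numbers are invented for illustration, and taking the ratio of replicate means is an assumption, since the text does not state how replicates were combined; the Steel-Dwass test is not reproduced here.

```python
from statistics import mean


def relative_expression(target_copies, rpl7_copies):
    """Target mRNA copies per copy of the RpL7 reference mRNA."""
    return target_copies / rpl7_copies


def fold_change(levels_a, levels_b):
    """Ratio of mean relative expression in tissue A to that in tissue B."""
    return mean(levels_a) / mean(levels_b)


# Hypothetical replicate measurements: (target copies, RpL7 copies).
embryo_raw = [(1200, 100), (1000, 100), (800, 100)]
bacteriocyte_raw = [(3, 100), (2, 100), (1, 100)]

embryo = [relative_expression(t, r) for t, r in embryo_raw]
bacteriocyte = [relative_expression(t, r) for t, r in bacteriocyte_raw]

embryo_vs_bacteriocyte = fold_change(embryo, bacteriocyte)
print(f"{embryo_vs_bacteriocyte:.3g}-fold higher in the embryo")
```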

Figure 4.

Expression of the Acyrthosiphon pisum genes for chitinase-like proteins. The ivory, blue, green, and orange columns represent expression levels in the whole body, the bacteriocyte, the embryo, and the midgut, respectively; bars, standard errors (n = 6). The expression levels are shown in terms of the mRNA copies of the target genes per copy of the mRNA for RpL7. For each gene, different letters above the columns indicate significant differences at P < 0.05 (Steel-Dwass test).

Discussion

Screening of the genome of the pea aphid, Acyrthosiphon pisum, revealed it to have nine genes for chitinase-like proteins, among which six (Cht2, Cht3, Cht4, Cht5, Cht6 and Cht7) appear to encode true chitinases with chitinolytic activities. Two of the remaining three genes appear to encode an IDGF (Cht1) and an ENGase (ENGase), respectively. Cht8 could be a chitinase-like lectin (Table 1).

Furthermore, screening of the fully-sequenced genomes of other insects revealed that Ae. aegypti, An. gambiae, Ap. mellifera, B. mori, C. quinquefasciatus, D. melanogaster, N. vitripennis, P. humanus corporis, and T. castaneum have 31, 27, 16, 12, 34, 18, 15, 10 and 24 chitinase-like genes, respectively (Table 2). It is noteworthy that the numbers from the two hemimetabolous insects (9 and 10 in A. pisum and P. humanus, respectively) were at the lower end of the range of values (9, 10, 12, 15, 16, 18, 24, 27, 31, 34). Previous studies based on BLAST similarity searches detected 16, 16, and 13 genes for chitinase-like proteins in the genomes of D. melanogaster, T. castaneum, and An. gambiae, respectively (Zhu et al., 2004; Zhu et al., 2008). However, these numbers are based on a more conservative criterion in which candidates need to have at least three of the four conserved regions (CR_I-CR_IV). If this criterion were applied to A. pisum, the number of candidate genes for chitinase-like proteins would be reduced to six, as Cht1, Cht8 and ENGase of A. pisum do not meet this requirement. Conservation of the key glutamate residue and phylogenetic analyses suggested that Ae. aegypti, An. gambiae, Ap. mellifera, B. mori, C. quinquefasciatus, D. melanogaster, N. vitripennis, P. humanus corporis, and T. castaneum have 13, 15, 8, 7, 13, 9, 11, 7, and 20 genes, respectively, which encode enzymatically active chitinases (Table 2). Again, the numbers from the two hemimetabolous insects (6 and 7 in A. pisum and P. humanus, respectively) were at the lower end of the range of values. This appears to be largely explained by expansion of the chitinase genes in holometabolous insects.
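The effect of the more conservative criterion can be illustrated by counting, for each aphid protein, how many of the four conserved regions are retained. The counts below are a simplified reading of the Results text: the best-conserved GH18 domain is used for multi-domain proteins, "well" and "moderately well" conserved regions are counted as present, and "poorly conserved" regions are counted as absent. That coding is an assumption made for this sketch, not a scoring scheme used in the study.

```python
# Number of conserved regions (CR_I-CR_IV) counted as retained per protein.
# Simplified coding of the descriptions in the Results (see lead-in).
CONSERVED_REGION_COUNT = {
    "Cht1": 0,    # all four regions poorly conserved
    "Cht2": 4,    # domain Cht2-4: all four well conserved
    "Cht3": 4,
    "Cht4": 4,
    "Cht5": 4,
    "Cht6": 3,    # CR_I poorly conserved; CR_II-IV moderately well conserved
    "Cht7": 4,
    "Cht8": 0,    # four regions poorly conserved
    "ENGase": 0,  # four regions poorly conserved (key glutamate retained)
}


def passes_conservative_criterion(region_counts, minimum=3):
    """Return proteins retaining at least `minimum` of the four regions."""
    return sorted(name for name, n in region_counts.items() if n >= minimum)


retained = passes_conservative_criterion(CONSERVED_REGION_COUNT)
excluded = sorted(set(CONSERVED_REGION_COUNT) - set(retained))
print(len(retained), retained)
print(excluded)
```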

Molecular phylogenetic analyses suggested that the genes for the chitinase-like proteins have been repeatedly duplicated in holometabolous lineages, especially in Diptera and Coleoptera (Fig. 3). The expansion of the chitinase-like genes in holometabolous lineages might reflect the fact that Holometabola undergo a complete metamorphosis (holometabolism) in which distinctive larval, pupal and adult stages are observed. A wide variety of enzymatically active chitinases as well as other GH18 family members, including growth factors and lectins in the holometabolous lineages, appear to have evolved to facilitate radical metamorphosis, which requires a much more dramatic reconstruction of the body structure during development. However, such a radical gene expansion was not detected in B. mori, indicating that expansion of the chitinase-like genes is not a prerequisite for holometabolism.

Quantitative RT-PCR demonstrated the expression levels of Cht2, Cht3, Cht4, and Cht8 of A. pisum to be significantly higher in the embryo than in other organs (Fig. 4). Among these, Cht2, Cht3, and Cht4 appear to encode true chitinases with chitinolytic activity (Fig. 2, Table 1). Although the synthesis, rather than the degradation, of chitin should be dominant in embryos, apoptotic reconstruction of the body structure does occur in the animal embryo (Baehrecke, 2002). Thus, it is plausible that these chitinases are essential for embryonic development in insects. Cht8 is highly divergent (Fig. 3), and appears to lack chitinolytic activity (Fig. 2, Table 1). Although its function has yet to be determined, the conspicuous up-regulation of Cht8 suggests an essential role in the aphid embryo.

The analyses also revealed that the expression levels of Cht6 and ENGase of A. pisum are significantly higher in the midgut than in other organs. In most lineages of insects, the midgut epithelium is lined by the peritrophic membrane (PM), which consists of chitin, proteins, glycoproteins, and proteoglycans. The PM functions as a permeability barrier between the food bolus and the midgut epithelium, enhancing digestive processes and protecting the brush border from mechanical disruption, as well as from attack by toxins and pathogens (Merzendorfer & Zimoch, 2003). However, hemipteran insects, including aphids, lack a PM. Instead, they have the perimicrovillar membrane (PMM), an extracellular lipoprotein membrane ensheathing the microvilli of midgut cells (Silva et al., 2004). In aphids, the membrane is also referred to as the modified perimicrovillar membrane (MPM) (Cristofoletti et al., 2003). Although the composition of the aphid PMM/MPM is not well known, chitin was detected in the lining of the midgut of the green peach aphid, Myzus persicae (Irving & Fenton, 1996). In insects with a PM, the PM is degraded and replaced periodically, and chitinolytic enzymes play an important role in this turnover, as chitin is an integral part of the PM (Merzendorfer & Zimoch, 2003; Zhu et al., 2008). Thus, the aphid Cht6 may also be essential for degradation and turnover of the midgut lining in A. pisum. Although ENGase is not a true chitinase, this enzyme might also function in the degradation of the chitin that was detected in the aphid midgut. Note that the ENGases (EC 3.2.1.96) are different from the exo-beta-N-acetylglucosaminidases (EC 3.2.1.52), which are well known to work cooperatively with chitinases in degrading chitin.

Experimental procedures

Screening of the Acyrthosiphon pisum genome for genes encoding chitinase-like proteins

The genome of the pea aphid was screened for genes encoding chitinase-like proteins by using Basic Local Alignment Search Tool (BLAST) search, Conserved Domain (CD-) search, and Pfam Hidden Markov Model (HMM) search.

BLAST searches (Altschul et al., 1997) were performed at the website of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/) using amino acid sequences of the chitinase-like proteins of D. melanogaster as queries. The RefSeq protein database and the ab initio Gnomon-predicted protein database of A. pisum were screened using BLASTP. The A. pisum genome scaffolds (Acyr_1.0) were screened using TBLASTN. Default parameters were used for the analyses, with an E-value cutoff of 1.0e-5. Gene models with similarity to the chitinase catalytic domains were chosen as candidates. Gene models with similarity to only the chitin-binding domains were not selected.
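The E-value filtering step can be sketched as follows. The searches in this study were run on the NCBI website; the example below instead assumes that hits are available in the 12-column tabular layout produced by the standalone BLAST+ programs (`-outfmt 6`), and the query name, the second subject identifier and all scores are made up for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, max_evalue=1e-5):
    """Return the sorted subject IDs of hits with E-value <= max_evalue."""
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t"
    )
    return sorted(
        {row["sseqid"] for row in reader if float(row["evalue"]) <= max_evalue}
    )


# Hypothetical hits: one strong match and one that fails the cutoff.
rows = [
    ["DmelCht5", "XP_001947416.1", "55.2", "380", "160", "4",
     "20", "399", "25", "402", "1e-120", "430"],
    ["DmelCht5", "hypothetical_protein_A", "28.0", "60", "40", "2",
     "100", "159", "10", "69", "0.5", "30.1"],
]
example = "\n".join("\t".join(row) for row in rows)

print(filter_hits(example))
```

The subsequent manual step described in the text, keeping only models with similarity to the catalytic domain and not merely to chitin-binding domains, is not captured by an E-value threshold alone.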

CD-searches were performed using the Conserved Domain Database (CDD) and the CD-search tool at NCBI. The RefSeq protein database of A. pisum was screened for gene models that have glycosyl hydrolase family 18 (GH18) chitinase-like superfamily domain(s) (superfamily cluster accession number: cl10447).

Pfam HMM searches were performed with the hmmsearch program of the HMMER package (version: 2.3.2) (http://hmmer.janelia.org/). The Gnomon-predicted protein database of A. pisum and the ACYPI protein database [a comprehensive dataset generated by combining all sets of the ab initio gene predictions of A. pisum using the gene predictor combiner tool GLEAN (Elsik et al., 2007)] were screened for gene models having Pfam Glycosyl hydrolases family 18 domain(s) (Glyco_hydro_18_fs.hmm, PF00704) (E-value cutoff = 1.0e-5).
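How the candidate lists from the three screens were combined is not described above. One plausible bookkeeping step, shown here purely as an assumption, is to pool the gene model IDs and record which method supported each one. The function name, method labels, and gene IDs below are hypothetical.

```python
def merge_candidates(hits_by_method):
    """Map each gene model ID to the sorted list of methods that found it."""
    support = {}
    for method, gene_ids in hits_by_method.items():
        for gene_id in gene_ids:
            support.setdefault(gene_id, set()).add(method)
    return {gene_id: sorted(methods) for gene_id, methods in support.items()}


# Hypothetical screening results; the IDs are placeholders, not real gene models.
EXAMPLE_SCREENS = {
    "blast": {"g1", "g2"},
    "cd_search": {"g1"},
    "pfam_hmm": {"g1", "g3"},
}

if __name__ == "__main__":
    for gene_id, methods in sorted(merge_candidates(EXAMPLE_SCREENS).items()):
        print(gene_id, ", ".join(methods))
```

A table of this kind makes it easy to see which candidates are supported by all three searches and which rest on a single method and may deserve closer manual inspection.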

Structural analysis

The domain structures of the retrieved gene models were analysed using the CD-search at the NCBI website (Marchler-Bauer et al., 2007) and the Pfam domain search (http://pfam.sanger.ac.uk/) (Finn et al., 2008). The presence and location of the signal peptides were predicted using the program SignalP 3.0 (Bendtsen et al., 2004). Protein subcellular localization was predicted with TargetP (Emanuelsson et al., 2000). Transmembrane regions were analysed using the TMHMM Server ver. 2.0 (Krogh et al., 2001).

Screening of other fully-sequenced insect genomes for chitinase genes

Other fully sequenced insect genomes were screened for genes that encode chitinase-like proteins in the same manner as described above. The insects used were Ae. aegypti, An. gambiae, Ap. mellifera, B. mori, C. quinquefasciatus, D. melanogaster, N. vitripennis, P. humanus corporis, and T. castaneum. BLAST searches were performed at the NCBI website, except in the case of B. mori, for which analyses were conducted using SilkDB (http://silkworm.genomics.org.cn/). The RefSeq protein database was screened for gene models with GH18 chitinase-like superfamily domain(s) using CDD and the CD-search tool at NCBI. Pfam HMM searches were performed using the following databases: Ae. aegypti, PEPTIDES-AaegL1.1.fa [VectorBase (http://www.vectorbase.org/)]; An. gambiae, a.gambiae.pep [Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www.genome.jp/kegg/)]; Ap. mellifera, apis.aa [euGenes (http://iubio.bio.indiana.edu:8089/)]; B. mori, silkpep.fa (SilkDB); C. quinquefasciatus, PEPTIDES-CpipJ1.2.fa (VectorBase); D. melanogaster, d.melanogaster.pep (KEGG); N. vitripennis, nasonia.aa (euGenes); P. humanus corporis, PEPTIDES-PhumU1.2.fa (VectorBase); and T. castaneum, tcas3_gleangenes2.aa (euGenes).

Molecular phylogenetic analysis

Amino acid sequences were aligned using ClustalW (Thompson et al., 1994) embedded in MEGA version 4.0 (Tamura et al., 2007), followed by manual refinement. Amino acid sites corresponding to alignment gap(s) were omitted from the data set. Only unambiguously aligned sequences were used for the phylogenetic analysis. The final dataset comprised 374 amino acid positions. Phylogenetic trees were inferred by the neighbour-joining (NJ) (Saitou & Nei, 1987), maximum likelihood (ML) (Felsenstein, 1981), and Bayesian inference (BI) methods (Ronquist & Huelsenbeck, 2003). NJ trees were constructed using the program package MEGA version 4.0 (Tamura et al., 2007). The evolutionary distances were computed using the JTT matrix-based method (Jones et al., 1992) and are shown in units of the number of amino acid substitutions per site. The bootstrap probability for each node was calculated by generating 500 bootstrap replicates (Felsenstein, 1985). ML trees were constructed using RAxML 7.0.4 (Stamatakis, 2006) with 100 replicates using the WAG matrix of amino acid replacements, assuming a proportion of invariant positions and four gamma-distributed rates (WAG+I+gamma model). BI was performed with the program MrBayes version 3.1.2 (Ronquist & Huelsenbeck, 2003) using the WAG+I+gamma model. For the MrBayes consensus trees, 1 000 000 generations were completed, with trees collected every 100 generations.
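The omission of gapped sites amounts to deleting every alignment column in which any sequence carries a gap. The following is a minimal sketch of that step, assuming the alignment is held as a mapping from sequence name to aligned string with "-" as the gap character. The function name and the toy sequences are hypothetical, and the sketch does not reproduce the manual refinement or MEGA's own implementation.

```python
def drop_gap_columns(alignment, gap_chars="-"):
    """Remove every column in which at least one sequence has a gap.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    lengths = {len(seq) for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    width = lengths.pop()
    # A column is kept only if no sequence has a gap character there.
    keep = [i for i in range(width)
            if all(seq[i] not in gap_chars for seq in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in zip(names, seqs)}


# Toy alignment invented for illustration; not sequences from this study.
TOY_ALIGNMENT = {
    "seqA": "MK-LV",
    "seqB": "MKTL-",
    "seqC": "MRTLV",
}

if __name__ == "__main__":
    for name, seq in drop_gap_columns(TOY_ALIGNMENT).items():
        print(name, seq)
```

In the toy alignment, columns 3 and 5 each contain a gap and are removed, leaving three sites per sequence. Applied to the real alignment, the same rule is what reduces the data to the positions retained for tree inference.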

Real-time quantitative reverse transcription-PCR

Strain ISO, a parthenogenetic clone of the pea aphid that is free from secondary symbionts, was used for the analysis. The insects were reared on Vicia faba at 15 °C in a long-day regime of 16 h light and 8 h dark. RNA was isolated from the whole bodies, bacteriocytes, later stage embryos, and midguts of 12–15 day-old (1–3 days after final molt) parthenogenetic apterous adults using TRIzol reagent, followed by RNase-free DNase I treatment. First-strand cDNAs were synthesized using pd(N)6 primer and PrimeScript reverse transcriptase (Takara, Otsu, Shiga, Japan). Quantification was performed with the LightCycler instrument and FastStart DNA MasterPLUS SYBR Green I kit (Roche, Indianapolis, IN, USA), as described previously (Nakabachi et al., 2005). The primers used are listed in Table 3. The running parameters were: 95 °C for 10 min, followed by 45 cycles of 95 °C for 10 s, 55 °C for 5 s, and 72 °C for 4 s. Results were analysed using the LightCycler software version 3.5 (Roche), and the relative expression levels were normalized to mRNA for the ribosomal protein RpL7. Statistical analyses were performed using the Kruskal-Wallis test followed by the Steel-Dwass test.
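The normalization and the first statistical step can be illustrated with a minimal sketch. It assumes that each sample yields one quantity for the target transcript and one for RpL7, and that the relative expression level is their ratio. The function names and all numbers are hypothetical, not data from this study. The Kruskal-Wallis H statistic is computed with the standard tie correction; the p-value and the Steel-Dwass post hoc comparisons are not implemented here.

```python
def normalize_to_reference(target_quantities, reference_quantities):
    """Divide each target quantity by the reference (e.g. RpL7) quantity
    measured in the same sample."""
    if len(target_quantities) != len(reference_quantities):
        raise ValueError("one reference value is needed per target value")
    return [t / r for t, r in zip(target_quantities, reference_quantities)]


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` is a list of lists of observations, one list per group.
    """
    pooled = sorted((value, index)
                    for index, values in enumerate(groups)
                    for value in values)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        average_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += average_rank
        ties = j - i
        tie_term += ties ** 3 - ties
        i = j
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    if correction == 0.0:
        raise ValueError("all observations are identical")
    h = (12.0 / (n_total * (n_total + 1))
         * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
         - 3.0 * (n_total + 1))
    return h / correction


# Hypothetical quantities for one target gene in three tissues.
EXAMPLE_TARGET = {"tissue1": [1.0, 2.0, 3.0],
                  "tissue2": [8.0, 10.0, 12.0],
                  "tissue3": [14.0, 16.0, 18.0]}
EXAMPLE_REFERENCE = {"tissue1": [1.0, 1.0, 1.0],
                     "tissue2": [2.0, 2.0, 2.0],
                     "tissue3": [2.0, 2.0, 2.0]}

if __name__ == "__main__":
    relative = [normalize_to_reference(EXAMPLE_TARGET[t], EXAMPLE_REFERENCE[t])
                for t in sorted(EXAMPLE_TARGET)]
    print(relative)
    print(kruskal_wallis_h(relative))
```

Because the test is rank-based, it makes no assumption that the normalized expression values are normally distributed, which suits the small sample sizes typical of tissue-level expression comparisons.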

Table 3. Primer sets used for quantitative reverse transcription-PCR
Target | Forward primer 5′-3′ followed by reverse primer 5′-3′ (run together) | Amplicon size (bp)
AcypiCht1 | GGGCTGCAAAAGTATGTGGAGTCAGCCAGTTTGGGCACT | 81
AcypiCht2 | TGGGCATGGAAACCTACCACTTTATACTGTCCCGTCAAATGATAGTT | 82
AcypiCht3 | CCAGCCGGTCCCGTATCGCAGTCGTGGGCGATTGT | 81
AcypiCht4 | TGATGGCTCTTCGTGAGAAAAATTGAACGGTGTCGATCCAAA | 82
AcypiCht5 | CTTCATGGCAATTAACTATGGCAGTAAACGACATAGGCCTGGAACGT | 81
AcypiCht6 | CAAGGACGGTTTTGTAACTTTTTCAGTGTCATGGTCAAAAACAGTGGAA | 81
AcypiCht7 | CGTGGGTCAAGTCGATTGAATCCCTGAATCTGCATCGTTGT | 81
AcypiCht8 | GCGTATGGCCAGGATCTTTATCCCTTCTATCGGGGGACAGACA | 100
ENGase | GCATTAAAGCCCTATGACTCCCACCTCCCTGAATGCCAAACTCCATTTT | 91

Acknowledgements

We thank Terence Murphy at the NCBI for his helpful advice. This study was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan (AN).
