Genome‐wide identification of chitin‐binding proteins and characterization of BmCBP1 in the silkworm, Bombyx mori

Abstract The insect cuticle plays important roles in numerous physiological functions and protects the body from pathogen invasion, physical injury and dehydration. In this report, we conducted a comprehensive genome‐wide search for genes encoding proteins with a peritrophin A‐type (ChtBD2) chitin‐binding domain (CBD) in the silkworm, Bombyx mori. One of these genes, which encodes the cuticle protein BmCBP1, was additionally cloned, and its expression and localization during development and molting in B. mori were investigated. In total, 46 protein‐coding genes containing at least one ChtBD2 domain were identified in the silkworm genome, including those encoding 15 cuticle proteins analogous to peritrophins with one CBD (CPAP1s), nine cuticle proteins analogous to peritrophins with three CBDs (CPAP3s), 15 peritrophic membrane proteins (PMPs), four chitinases, and three chitin deacetylases. Microarray analysis indicated that CPAP‐encoding genes were widely expressed in various tissues, whereas PMP genes were highly expressed in the midgut. Quantitative polymerase chain reaction and western blotting showed that the cuticle protein BmCBP1 was highly expressed in the epidermis and head, particularly during molting and metamorphosis. An immunofluorescence study revealed that chitin co‐localized with BmCBP1 at the epidermal surface during molting. Additionally, BmCBP1 was notably up‐regulated by 20‐hydroxyecdysone treatment. These results provide a genome‐level view of the chitin‐binding proteins in the silkworm and suggest that BmCBP1 participates in the formation of the new cuticle during molting.


Introduction
The cuticle forms the insect exoskeleton, which covers the surface of the insect body and plays important roles in growth control, environmental protection and wound healing. During development, the insect cuticle undergoes several rounds of molting to overcome size limitations (Petkau et al., 2012). The cuticle is mainly formed from chitin, small amounts of lipids and cuticle proteins, which blend with chitin to form the natural architecture of biomacromolecules that protects the insect's body (Togawa et al., 2004; Deng et al., 2016). To date, several chitin-binding cuticle proteins have been identified and characterized in various insect species (Andersen et al., 1995; Hamodrakas et al., 2002; Togawa et al., 2004; Iconomidou et al., 2005; Futahashi et al., 2008; Petkau et al., 2012; Mun et al., 2015; Deng et al., 2016).
ChtBD2 was discovered in the insect peritrophic membrane proteins (PMPs); therefore, proteins with the ChtBD2 motif used to be referred to as "peritrophins" and the ChtBD2 motif was termed the peritrophin A-type domain (Tellam et al., 1999; Jasrapuria et al., 2010). An analysis of the sequence features of PMPs shows that the CBDs of these proteins have conserved cysteine residues and non-polar amino acid residues, which might have similar functions in interacting with chitin or proteins (Wang & Granados, 2001). To date, most proteins with the ChtBD2 motif have been identified in the peritrophic membrane (PM) of insect species; examples include peritrophin-44 and peritrophin-48 of Lucilia cuprina (Elvin et al., 1996; Schorderet et al., 1998), Ag-Aper1 of Anopheles gambiae (Shen & Jacobs-Lorena, 1998), and peritrophin-57 and peritrophin-37 of Spodoptera litura. In these peritrophins, the conserved cysteines of the ChtBD2 domain are linked by disulfide bridges to form a binding pocket, in which conserved hydrophobic residues form hydrogen bonds with chitin fibrils. The ChtBD2 domain may influence PM structure and properties, as some peritrophins may interact only with specific chitin conformations (Hegedus et al., 2009). The multiple ChtBD2s of PMPs enable the assembly of proteins and chitin fibrils into membranous structures supported by chitin fibrils in the PM (Wang & Granados, 2001).
Besides the peritrophins, ChtBD2 was also discovered in the obstructor family of invertebrates, whose members contain three ChtBD2 domains and a signal peptide (Behr & Hoch, 2005). Obstructor-A in Drosophila is considered to play an important role in packaging the protein and chitin matrix in apical cells and is necessary for the extracellular matrix in cuticle-forming organs (Petkau et al., 2012; Pesch et al., 2015). Obstructor-E has been reported as an important protein that controls oriented contractility/expandability in Drosophila (Tajiri et al., 2017). The completion of genome sequencing projects for numerous species has facilitated the analysis of species at the genome level. Jasrapuria et al. (2010) conducted a genome-wide bioinformatics search for the genes encoding ChtBD2-containing proteins in the genome of Tribolium castaneum, and classified these genes into three main families. In addition, Tetreau et al. (2015) performed an exhaustive search for genes encoding ChtBD2-containing proteins in the genome of Manduca sexta, and found 53 genes encoding 56 chitin-binding proteins, comprising CPAP1s, CPAP3s, PMPs, chitinases and chitin deacetylases. Members of the CPAP family exhibited differential spatial expression patterns and were widely expressed in various cuticle-forming tissues.
In this study, we performed a genome-level search to identify genes encoding chitin-binding proteins containing ChtBD2 in the silkworm, and examined the expression pattern of these proteins in silkworm tissues. Furthermore, BmCBP1, one of the ChtBD2-containing proteins of the silkworm, was characterized and its expression and localization studied during silkworm development and ecdysis.

Insects and reagents
The silkworm strain Dazao was used in the present study. The silkworm was reared at a temperature of 25 ± 1°C on mulberry leaves. The 4th and 5th instar larvae were used for dissection to isolate the cuticle and various tissues for analyses. RNA extraction was performed using the Total RNA kit (Omega Bio-Tek, Norcross, GA, USA) according to the manufacturer's instructions. The p28 vector (a pET28a-derived vector), T4 DNA ligase, and restriction enzymes were purchased from TaKaRa (Otsu, Japan). All polymerase chain reaction (PCR) primers (Table 1) were synthesized by Shanghai Sangong Co. Ltd. (Shanghai, China).

Identification of genes encoding chitin-binding proteins in the B. mori genome
The identification of genes encoding chitin-binding proteins was performed according to a previously reported method (Tetreau et al., 2015). We conducted an extensive search of the silkworm genome (SilkDB, http://www.silkdb.org/silkdb/) and National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/protein/) to identify all proteins predicted to contain the peritrophin A-type (ChtBD2-type) domain (pfam01607). Protein sequences with the ChtBD2 chitin-binding domain (CBD) from Drosophila and other insects were downloaded from NCBI (http://www.ncbi.nlm.nih.gov/protein/) and utilized as queries to identify genes encoding proteins with ChtBD2s in B. mori. The protein sequences for B. mori identified in the initial search were used as queries for a second round of Basic Local Alignment Search Tool (BLAST) search to identify additional proteins with ChtBD2. This process was repeated until no additional proteins with the ChtBD2 domain could be identified.
Based on their sequence homology and expression distribution in different tissues, the identified CBD-containing proteins were classified into five classes: CPAP1s, CPAP3s, PMPs, chitinases, and chitin deacetylases.
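The repeated search described above is a fixed-point iteration: hits from one round become the queries for the next, and the loop stops when a round yields nothing new. The following is a minimal sketch of that control flow only. The `search` and `has_domain` callables are hypothetical stand-ins for a BLAST run and a pfam01607 domain check, and the toy identifiers are invented for illustration; none of them come from the original pipeline.

```python
from typing import Callable, Iterable, Set


def iterative_search(seed_queries: Iterable[str],
                     search: Callable[[str], Set[str]],
                     has_domain: Callable[[str], bool]) -> Set[str]:
    """Repeat the search until a round adds no new domain-containing hits."""
    found: Set[str] = set()
    frontier = list(seed_queries)
    while frontier:
        new_hits: Set[str] = set()
        for query in frontier:
            for hit in search(query):
                # Keep only hits that are new and carry the domain of interest.
                if hit not in found and has_domain(hit):
                    new_hits.add(hit)
        found |= new_hits
        # Newly found proteins are the queries of the next round.
        frontier = sorted(new_hits)
    return found


# Toy data (hypothetical): which identifiers each query retrieves.
TOY_HITS = {
    "seed_dmel": {"bm_A", "bm_B"},
    "bm_A": {"bm_B", "bm_C"},
    "bm_B": {"bm_A"},
    "bm_C": {"bm_D", "bm_X"},
    "bm_D": set(),
}
# Toy data (hypothetical): identifiers that contain the domain.
TOY_DOMAIN = {"bm_A", "bm_B", "bm_C", "bm_D"}

result = iterative_search(["seed_dmel"],
                          lambda q: TOY_HITS.get(q, set()),
                          lambda h: h in TOY_DOMAIN)
print(sorted(result))
```

In the toy run, `bm_X` is retrieved but discarded because it lacks the domain, and the loop ends once `bm_D` returns no further hits.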

Whole-genome microarray analysis of expression of silkworm ChtBD2 genes
Oligonucleotide microarray data were acquired from the SilkDB (http://www.silkdb.org/microarray/search.php) for a total of 35 genes from ten different tissues (Xia et al., 2007). Gene expression in multiple silkworm tissues from the 3rd day of the 5th instar larvae of Dazao was then investigated. Hierarchical clustering of gene expression patterns was performed using HemI software, as previously reported (Deng et al., 2014).
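To illustrate what hierarchical clustering of expression profiles does, the sketch below implements a small agglomerative procedure in plain Python. It assumes average linkage and a Pearson-correlation distance; the settings actually used in HemI are not stated in the text, and the gene names and expression values are hypothetical toy data.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # a flat profile is treated as uncorrelated
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(profiles):
    """Merge the two closest clusters repeatedly; return the merge steps."""
    clusters = [(name,) for name in profiles]
    merges = []

    def dist(c1, c2):
        # Average pairwise distance between members of the two clusters.
        pairs = [(g1, g2) for g1 in c1 for g2 in c2]
        return sum(pearson_distance(profiles[g1], profiles[g2])
                   for g1, g2 in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical signal values in four tissues: integument, head, midgut, testis.
toy_profiles = {
    "cpap_x": [9.0, 8.0, 1.0, 1.0],
    "cpap_y": [8.0, 9.0, 1.0, 2.0],
    "pmp_z": [1.0, 1.0, 9.0, 8.0],
}
merges = average_linkage(toy_profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The two integument-biased toy genes merge first, and the midgut-biased gene joins last at a much larger distance, which is the pattern a clustered heat map makes visible.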

Protein sequence and phylogenetic analyses
Protein sequence analysis was used to predict molecular weight (MW) and isoelectric point (pI) using the ExPASy proteomics website (http://web.expasy.org). Conserved domains in the protein sequence were identified via NCBI (http://www.ncbi.nlm.nih.gov/) and SilkDB (http://www.silkdb.org/silkdb/). Multiple sequence alignments of proteins were performed using ClustalX (http://clustal-x.software.informer.com/1.8/) and GeneDoc software (http://genedoc.software.informer.com/). ClustalX software was also used to perform multiple sequence alignments prior to phylogenetic analysis. MEGA6.0 software (http://www.megasoftware.net/mega6) was used to construct the phylogenetic tree using the neighbor-joining method. To assess the branch strength of the phylogenetic tree, a bootstrap analysis of 2000 replications was performed. Bootstrap values of no less than 20% are shown on each branch of all trees generated.
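As an illustration of the MW prediction step, the sketch below sums average residue masses and adds one water molecule for the free termini. This is a simplified calculation under stated assumptions, not the ExPASy implementation: the mass table holds standard average residue masses, disulfide bonds and modifications are ignored, and pI prediction is not covered.

```python
# Average residue masses in daltons (amino acid mass minus one water).
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Estimate the average molecular weight of an unmodified protein."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Sanity check on a molecule with a well-known mass (free glycine).
print(round(molecular_weight("G"), 3))
```

For a secreted protein, the signal peptide would normally be removed from the sequence before the mature-protein mass is estimated.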

Bioinformatics Analysis of the BmCBP1 promoter
Bioinformatics analysis of cis-regulatory elements was performed on the 2.0-kb upstream promoter sequence of the BmCBP1 gene, which was obtained from the silkworm genome database (http://www.silkdb.org/silkdb/). Potential ecdysone response elements (EcREs) were predicted via the JASPAR CORE database (http://jaspar.genereg.net/) using Drosophila elements such as Eip74EF and EcR::usp.
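Predicted elements are usually reported in coordinates relative to the transcription start site, with the last upstream base at −1. The sketch below shows that coordinate conversion with a deliberately simplified matcher: it scans the forward strand for an IUPAC consensus string, whereas JASPAR scores position frequency matrices. The motif and sequence are hypothetical toy values, not the Eip74EF or EcR::usp profiles.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def scan_upstream(upstream: str, motif: str):
    """Return (start, end) of forward-strand matches, relative to the TSS.

    `upstream` is written 5'->3' and ends at position -1, the base
    immediately before the transcription start site.
    """
    upstream = upstream.upper()
    n, m = len(upstream), len(motif)
    hits = []
    for i in range(n - m + 1):
        window = upstream[i:i + m]
        if all(base in IUPAC[symbol] for base, symbol in zip(window, motif)):
            # Convert the 0-based index into negative upstream coordinates.
            hits.append((i - n, i + m - 1 - n))
    return hits


# Hypothetical 28-bp upstream fragment containing one copy of a toy motif.
toy_upstream = "ACGT" * 3 + "TGATCC" + "ACGTACGTAC"
toy_hits = scan_upstream(toy_upstream, "TGAWCY")
print(toy_hits)
```

With this convention a 7-bp element spans seven consecutive coordinates, for example −769 to −763. A fuller scan would also search the reverse complement.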

Production and purification of recombinant BmCBP1 and preparation of polyclonal antibody
To obtain the complementary DNA (cDNA) sequence of BmCBP1, total RNA was extracted from the epidermis of 3rd day 5th instar larvae using the Total RNA kit (Omega Bio-Tek, Norcross, GA, USA). Two primers (Table 1) were then used to amplify the open reading frame (ORF). The resulting PCR product was inserted into the protein expression vector p28, between the EcoR I and BamH I restriction sites. Escherichia coli BL21 (DE3) cells were transformed with the recombinant plasmid, and protein expression was induced by adding 1.0 mmol/L isopropyl β-D-1-thiogalactopyranoside at 37°C. The His-tagged fusion protein was purified using Ni-resins (Novagen, Madison, WI, USA). The Bradford assay was used for protein quantification (Bradford, 1976). The BmCBP1 fusion protein was used for production of polyclonal antibodies by ZeHeng Biotech (Chongqing, China).

Real-time quantitative PCR and western blot analysis
Total RNA was extracted from various tissues at the prepupal stage, as well as from the epidermal tissues from the 4th instar larval stage to the 1st day of moth, using the Total RNA kit (Omega Bio-Tek, Norcross, GA, USA). Extracted total RNA was then treated with RNase-free DNase I (TaKaRa, Otsu, Japan). Then, 3 μg of messenger RNA (mRNA) was transcribed into single-strand cDNAs by first-strand cDNA synthesis. Reverse transcription was performed using the Moloney Murine Leukemia Virus reverse transcription kit (Omega Bio-Tek, Norcross, GA, USA) according to the manufacturer's protocol. Real-time quantitative PCR was performed to examine the tissue specificity and the expression pattern at various developmental stages, using the gene-specific primer pairs listed in Table 1. Real-time quantitative PCR was performed using a Premix Ex Taq™ II SYBR RT-PCR kit (TaKaRa, Otsu, Japan) and an ABI Prism 7500 StepOnePlus™ Real-Time PCR System (Applied Biosystems, CA, USA). The relative mRNA expression level of BmCBP1 was analyzed using the 2^−ΔΔCt method. Triplicate experiments were conducted for each sample. The statistical analyses were performed using the non-paired t-test.
For western blot analysis, the epidermis was isolated from silkworms from the 4th instar larval stage to the 1st day of moth. The samples were homogenized in protein extraction buffer (8 mol/L urea, 1 mmol/L dithioerythritol and 10 mmol/L Chaps) and vortexed at 4°C for 2 h. Samples were then centrifuged (Eppendorf) at 10 000 × g at 4°C for 10 min. Supernatants containing soluble proteins were stored at −80°C. The Bradford assay was used for protein quantification.
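The relative quantification step can be sketched as follows, assuming the standard comparative Ct (2^−ΔΔCt) calculation with a reference gene and a calibrator sample. The reference gene, the calibrator and all Ct values are hypothetical placeholders, since the text does not specify them.

```python
from statistics import mean


def fold_change(target_sample, ref_sample, target_calibrator, ref_calibrator):
    """Relative expression by the comparative Ct method.

    Each argument is a list of replicate Ct values. The target gene is
    first normalized to the reference gene within each condition (delta Ct),
    then the sample is compared with the calibrator (delta-delta Ct).
    """
    delta_sample = mean(target_sample) - mean(ref_sample)
    delta_calibrator = mean(target_calibrator) - mean(ref_calibrator)
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical triplicate Ct values for a target gene and a reference gene.
example = fold_change(
    target_sample=[22.1, 21.9, 22.0],
    ref_sample=[18.0, 18.1, 17.9],
    target_calibrator=[25.0, 25.1, 24.9],
    ref_calibrator=[18.0, 17.9, 18.1],
)
print(round(example, 2))
```

In this toy case the target reaches threshold three cycles earlier in the sample than in the calibrator after normalization, which corresponds to an eight-fold higher relative expression. The calculation assumes amplification efficiencies close to 100% for both primer pairs.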

Ecdysteroid treatment
To survey whether 20-hydroxyecdysone (20E) affects the expression of the BmCBP1 gene, 20E was dissolved (10 mg/mL) and diluted to working concentrations with dimethyl sulfoxide (DMSO). Solutions containing 1 μg, 2.5 μg, 5 μg or 10 μg of 20E (Sigma, St Louis, MO, USA) were injected into the silkworm via the spiracle of the 2nd day of 5th instar larvae. The same volume of DMSO was injected into the larvae as a control. After 24 h, the epidermal tissue from each treatment was dissected on ice, and immediately stored in liquid nitrogen for real-time quantitative PCR analysis. The statistical analyses were conducted using non-paired t-test.
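The comparison between a 20E-treated group and the DMSO control can be sketched as an unpaired two-sample t-test. The version below is Student's test with pooled variance; whether the authors used this form or Welch's unequal-variance form is not stated, and the replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical relative expression values from three replicates per group.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
t_stat, dof = unpaired_t(control, treated)
print(round(t_stat, 3), dof)
```

The t statistic is then compared with the t distribution at the given degrees of freedom to obtain a p-value; that lookup is left out here to keep the sketch free of third-party dependencies.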

Immunofluorescence
Immunofluorescence co-localization of chitin and BmCBP1 was performed as described by Deng et al. (2016). Sections of cuticles (thickness, 5 μm) were prepared. The sections were blocked for 1 h in 20 mmol/L phosphate-buffered saline (PBS) containing 5% bovine serum albumin. The sections were then incubated with anti-BmCBP1 antibodies at a dilution of 1 : 300 for 1 h at 37°C, then washed three times with PBST (PBS + 0.5% Triton-X100) buffer for 10 min each. Then, the sections were incubated with Cy3-conjugated anti-rabbit immunoglobulin G (1 : 1000 in blocking buffer) as the secondary antibody for 1 h at room temperature, followed by washing with PBST three times. Next, fluorescein isothiocyanate-conjugated wheat germ agglutinin (WGA 1 : 100; Sigma, St Louis, MO, USA) chitin-binding probe was applied and incubated at room temperature for 1 h. The sections were washed with PBS three times, for 10 min each, and then the nuclei were stained with 4',6-diamidino-2-phenylindole dihydrochloride (DAPI) at a dilution of 1 : 1000, for 30 min. After washing three times in PBS buffer, the sections were observed and photographed under a fluorescence microscope (Olympus FV500, Tokyo, Japan).

Identification and characterization of proteins with the ChtBD2 domain
When ChtBD2-containing proteins from Drosophila melanogaster and other insects were queried against the silkworm genome via a BLAST search, a total of 46 proteins containing at least one ChtBD2 domain were identified in B. mori. These proteins were classified into five classes based on their sequence homology and the tissue specificity of gene expression in the B. mori microarray database: 15 CPAP1s, nine CPAP3s, 15 PMPs, four chitinases, and three chitin deacetylases (Table 2). Compared with the CPAP family, the PMP family exhibits a much larger variation in the number of ChtBD2s, which ranges from one to 42 CBDs, while chitinases and chitin deacetylases contain both the ChtBD2 domain and a catalytic domain. Alignment of the CBDs from the proteins in each family showed that the protein sequences of the five categories are highly variable, whereas six cysteine residues form a motif that is highly conserved across the five categories. The lengths of the CBDs range from 50 to 63 amino acid residues, when counting from the first to the sixth cysteine. The spacing between adjacent cysteines was highly similar across categories, with the second and third cysteines separated by fewer residues than any other adjacent pair (Fig. 1). The cysteines of the ChtBD2 domain, for each category, are consistent with those reported previously for other species, including M. sexta, T. castaneum, and D. melanogaster (Behr & Hoch, 2005; Jasrapuria et al., 2010; Tetreau et al., 2015).
Table 2 Genes encoding proteins with ChtBD2 domains in Bombyx mori.
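The two measurements used above, the span from the first to the sixth cysteine and the number of residues between adjacent cysteines, can be computed as in the following sketch. The toy domain is an artificial sequence built only to show the arithmetic; it is not a B. mori CBD.

```python
def cysteine_spacing(domain_seq: str):
    """Return (gaps, span) for a domain containing exactly six cysteines.

    gaps: residues lying between each pair of adjacent cysteines.
    span: residues from the first to the sixth cysteine, inclusive.
    """
    positions = [i for i, aa in enumerate(domain_seq.upper()) if aa == "C"]
    if len(positions) != 6:
        raise ValueError(f"expected 6 cysteines, found {len(positions)}")
    gaps = [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]
    span = positions[-1] - positions[0] + 1
    return gaps, span


# Artificial domain: six cysteines separated by alanine spacers of known length.
spacer_lengths = [12, 5, 9, 12, 8]
toy_domain = "C" + "".join("A" * n + "C" for n in spacer_lengths)
gaps, span = cysteine_spacing(toy_domain)
print(gaps, span)
```

Here the span equals the six cysteines plus the 46 spacer residues. Applied to aligned CBDs, the same function gives one gap vector per domain, which makes it easy to see which inter-cysteine interval is the shortest.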

Gene expression patterns
Microarray analysis of the expression patterns of 35 genes encoding ChtBD2 proteins in B. mori showed that most CPAPs were clustered together and expressed in the integument and head. In addition, some CPAP genes were also found to be expressed in the Malpighian tubules, A/MSG (anterior/median silk gland), and PSG (posterior silk gland). Some of the PMPs were clustered together and highly expressed in the midgut, while others exhibited divergent expression patterns with low expression in the hemocyte, testis, ovary and Malpighian tubules (Fig. 2).

Expression profiles and phylogenetic analysis of BmCBP1
In order to further examine the function of cuticle-related genes with the ChtBD2 domain in silkworm, BmCBP1 (BmCPAP3-A2), one of the CPAP3s, which showed high expression in the epidermis and accumulated in molting fluid during metamorphosis (Qu et al., 2014), was characterized in this study and its expression and localization were investigated using molecular tools. The results showed that BmCBP1 possesses a signal peptide and three ChtBD2 domains, separated by two linker regions (Fig. 3A). Phylogenetic analysis showed that the CPAP3 family has conserved sequences across different species, including M. sexta, Nasonia vitripennis, T. castaneum, Apis mellifera, D. melanogaster and P. h. corporis. The nine BmCPAP3 proteins from the B. mori genome fell into the seven phylogenetic groups of the CPAP3 family found in different species. BmCBP1 (BmCPAP3-A2) was placed in Group A2 of the phylogenetic tree (Fig. 3B).
Figure caption (partial): … Supplementary Table S1. CPAP3s are grouped into seven different groups: group A1 (light green), group A2 (red), group B (dark gray), group C (dark purple), group D1 (dark blue), group D2 (pink) and group E (orange).
The ChtBD2 chitin-binding domain may act as a conserved basic module for BmCBP1, and its homologs may play a role in the organization of the chitinous cuticle (Petkau et al., 2012). The data from quantitative PCR showed that BmCBP1 is mainly expressed in the larval head and epidermis, and weakly expressed in the fat body (Fig. 4A). The distribution of the protein among the tissues was consistent with the mRNA expression profile, as indicated by western blot analysis (Fig. 4B).

Expression patterns of BmCBP1 in the epidermis at various developmental stages
RNA and protein extracts were obtained from the epidermis to perform a temporal expression analysis from the 4th instar larval stage to the moth stage. The results demonstrated that BmCBP1 expression varies across developmental stages, with dramatic upregulation just before ecdysis and metamorphosis: on the molting day of the 4th instar, on the 2nd day of the wandering stage, and at the late pupal stage. Furthermore, the level of protein expression was consistent with that of mRNA expression. Similar patterns were confirmed at the developmental stages from the 1st day of the 4th instar to the day of eclosion via western blotting (Fig. 5).

The responses of the BmCBP1 gene to 20E
We performed bioinformatics analysis on the upstream promoter sequence of the BmCBP1 gene. Several putative ecdysone response elements, including two EcR::usp elements (−542 to −528; −35 to −21) and two Eip74EF elements (−769 to −763; −506 to −500), were identified in the promoter sequence of BmCBP1 (Fig. 6A). To determine whether 20E could affect the expression of the BmCBP1 gene in vivo, 20E treatment was conducted. Quantitative reverse transcription PCR results for larval epidermis showed that BmCBP1 was notably up-regulated 24 h after 20E treatment compared to the control, and that expression of the BmCBP1 gene increased gradually with increasing 20E dose, which suggested that BmCBP1 was regulated by 20E during molting (Fig. 6B-C).

Immunofluorescence co-localization of chitin and BmCBP1 in the cuticle
In order to explore the function of BmCBP1 during the development of the silkworm, we additionally performed co-localization of BmCBP1 protein and chitin from the 4th instar larval molting stage to the 1st day of 5th instar. At the early stage of the 4th molt of silkworm larva, the chitin showed a clear distribution in the cuticle. As development progressed, the new cuticle acquired shape and a weak chitin signal appeared under the old cuticle. The BmCBP1 protein demonstrated strong expression in the old cuticle and a faint signal during the early stage of molting. The expression of this protein was upregulated during the formation of the new cuticle, at 12 h of the 4th molt. Prior to ecdysis in the silkworm, the chitin was highly enriched in the outer sphere of the old cuticle. Strong expression of BmCBP1 could be detected at 22 h of the 4th molt. Immunofluorescence data revealed that chitin and BmCBP1 shared similar expression patterns in the epidermis throughout molting of the 4th instar. The uniform distribution of BmCBP1 and chitin in the entire cuticle, and their relative abundance, suggested that BmCBP1 participated in the formation of the new cuticle during molting (Fig. 7).

Discussion
In this study, we performed a comprehensive search of the silkworm genome for genes encoding proteins that contain ChtBD2 chitin-binding domains. Our results indicated that the six cysteines of the ChtBD2 chitin-binding domain, for each category, are conserved in insects such as M. sexta and T. castaneum (Jasrapuria et al., 2010; Tetreau et al., 2015). Compared with the 46 chitin-binding proteins in silkworm, 56 chitin-binding proteins are found in M. sexta, which include 26 CPAPs, 17 PMPs, six chitinases and seven chitin deacetylases. Fifty chitin-binding proteins have been identified in T. castaneum, which comprise 18 CPAPs, 11 PMPs, 13 chitinases and chitin deacetylases, and eight proteins classified as miscellaneous (Jasrapuria et al., 2010; Tetreau et al., 2015). The CPAP1 and CPAP3 families have also been found in M. sexta and T. castaneum, and the CPAP family may be derived from a common ancestor (Jasrapuria et al., 2010; Tetreau et al., 2015). The ChtBD2 chitin-binding domain is considered to act as a basic module that has been acquired by other proteins, including enzymes involved in chitin metabolism, to modify their function (Tetreau et al., 2015). It is possible that a higher number of ChtBD2 domains leads to stronger binding to chitin (Jasrapuria et al., 2010).
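The family counts quoted for the three species can be checked against their stated totals with a short tally. The numbers below are copied from the text; the grouping keys are only labels chosen for this sketch.

```python
# Per-species family counts and the total stated in the text.
CBP_COUNTS = {
    "B. mori": ({"CPAP1": 15, "CPAP3": 9, "PMP": 15,
                 "chitinase": 4, "chitin deacetylase": 3}, 46),
    "M. sexta": ({"CPAP": 26, "PMP": 17,
                  "chitinase": 6, "chitin deacetylase": 7}, 56),
    "T. castaneum": ({"CPAP": 18, "PMP": 11,
                      "chitinase + chitin deacetylase": 13,
                      "miscellaneous": 8}, 50),
}


def totals_consistent(counts):
    """Return, per species, whether the family counts add up to the total."""
    return {species: sum(families.values()) == total
            for species, (families, total) in counts.items()}


consistency = totals_consistent(CBP_COUNTS)
for species, ok in consistency.items():
    print(species, "consistent" if ok else "mismatch")
```

All three breakdowns sum to their stated totals, so the cross-species comparison rests on internally consistent counts.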
The homologs of the CPAP-encoding genes were expressed in the head and epidermis on the 3rd day of the 5th instar (Fig. 2). The spatial expression profiles additionally showed that BmCBP1 is mainly expressed in the head and epidermis (Fig. 4). However, the PMP genes, which are mainly expressed in the midgut, differed in expression pattern from the CPAPs, although they are highly similar to the CPAP family and share conserved sequences and domains with it. Elvin et al. (1996) reported that multiple cysteine-rich domains in peritrophin-44 are responsible for binding to chitin. It was additionally reported that this major structural protein of the peritrophic membrane may play a role in the maintenance of peritrophic membrane structure and porosity (Elvin et al., 1996). Although the CPAP and PMP families possess similar domains, they differ in terms of function.
Several cuticle-related proteins were identified from molting fluid in 2014, including chitinase, hexosaminidases, and several CPAP proteins (Qu et al., 2014). BmCBP1, one of these cuticle-related proteins, exhibited an almost eight-fold increase in expression at the pupal-adult stage compared to the larval-pupal stage (Qu et al., 2014). In this study, we found that expression of BmCBP1 increased markedly before larval-pupal and pupal-adult metamorphosis. However, in the quantitative PCR analyses the highest expression of BmCBP1 occurred at the larval-pupal rather than the pupal-adult stage of metamorphosis, which differed from the proteomic result obtained from molting fluid (Qu et al., 2014). Constant accumulation of BmCBP1 from the 1st day to the 9th day of the pupal stage was observed on western blotting. Therefore, we presumed that BmCBP1 protein was expressed in the cuticle during the whole pupal stage and secreted into the molting fluid of the silkworm before eclosion.
BmCBP1 was upregulated during the wandering stage and late pupal stage, which is consistent with the change of ecdysone titer in silkworm (Mizoguchi et al., 2002). The results from the 20E treatment also confirmed that BmCBP1 was dramatically upregulated by 20E. It has been reported that the promoter regions of cuticle protein genes possess ecdysone response elements that mediate the response to ecdysone under regulation by ecdysone-responsive transcription factors (Wang et al., 2009; Ali et al., 2012; Akagi et al., 2013). Bioinformatics analysis of the upstream promoter sequence of the BmCBP1 gene revealed two putative EcR::usp and two Eip74EF elements. In future studies, the putative elements will be deleted or mutated to determine which elements are responsible for the response to 20E in the BmCBP1 promoter.
It has been reported that proteins containing the ChtBD2 domain have chitin-binding activity (Elvin et al., 1996; Tang et al., 2010; Petkau et al., 2012; Chen et al., 2014; Dong et al., 2016; Tajiri et al., 2017). In this study, we demonstrated that BmCBP1 and chitin co-localized in the epidermal layer, and that the BmCBP1 signal appeared in the new cuticle slightly earlier than the chitin signal. These results suggest that BmCBP1 is an important structural component of the new cuticle. In Drosophila, obstructor-A, a member of the CPAP3 family, was confirmed to be involved in the extracellular matrix in cuticle-forming organs. Loss of obstructor-A led to severe defects during cuticle molting and tube expansion (Petkau et al., 2012; Pesch et al., 2015). We also performed RNA interference (RNAi) for the gene encoding BmCBP1 in 4th instar silkworm larvae. However, no obvious phenotypic differences were observed between the BmCBP1 RNAi group and the control (Fig. S1). RNAi has been attempted in many studies of Lepidoptera, but efficient silencing has often proved difficult to achieve (Terenius et al., 2011). The high expression of BmCBP1 during metamorphosis may also make phenotypic differences less apparent in RNAi experiments. Gene knockout may therefore be a more effective tool for studying the function of the BmCBP1 gene in silkworm.