Correspondence: Svend Birkelund, Department of Medical Microbiology and Immunology, The Bartholin Building, University of Aarhus, DK-8000 Aarhus C, Denmark. Tel.: +45 4096 1916; fax: +45 8619 6128; e-mail: email@example.com
The protein composition and N-terminal sequences of proteins in the outer membrane of Chlamydia trachomatis L2 were analysed following isolation of N-terminal peptides using combined fractional diagonal chromatography and identification by liquid chromatography tandem MS. Acetylation of primary amino groups of in vivo generated proteolytic cleavage sites facilitated identification of such sites in known outer membrane proteins (MOMPs). Our results further support a proposed prediction of the topology of the MOMPs. Furthermore, a previously unknown MOMP, CTL0626 (Ct372), was assigned as an MOMP with a carbohydrate-selective porin (OprB) family motif, and the presence of CTL0626 was confirmed using antibodies raised against the protein.
Surface-exposed outer membrane proteins of Chlamydia elementary bodies (EB) are exposed to the host immune system because Chlamydia lack a capsule and the lipopolysaccharide is of the rough type, without an O-chain (Holst et al., 1991). The outer membrane of Chlamydia can be extracted with ionic detergents such as sodium deoxycholate (Jenkin, 1960). The major outer membrane protein (MOMP) is the most dominant protein in this outer membrane complex (COMC) (Caldwell et al., 1981). The size of MOMP is c. 43 kDa with a small variation in size between species and serotypes (Hatch et al., 1981). Additional proteins of 59/62 and 12 kDa were also observed by Hatch et al. (1981) in COMC. The proteins were shown to be rich in cysteine, and reduction was necessary to dissolve the cysteine-rich proteins as well as MOMP from COMC (Hatch et al., 1984). The cysteine-rich protein of 59/62 kDa is named Omp2, and the small cysteine-rich protein that migrates at 12–15 kDa is named Omp3. Omp3 was predicted to be a lipoprotein (Allen et al., 1990) and was experimentally shown to be a lipoprotein in Chlamydia psittaci (Everett et al., 1994). In EB, MOMP trimers are in close association with lipopolysaccharide (Birkelund et al., 1988). MOMP has four variable surface-exposed domains that are immunogenic and carry the serovar-specific epitopes (Stephens et al., 1987). Other proteins present in Chlamydia trachomatis COMC are the nine polymorphic membrane proteins (Pmp) of c. 100 kDa or higher molecular weight. PmpB, D, F, H and G were detected in 1D and 2D gels using MS (Mygind et al., 2000; Shaw et al., 2002). Shaw et al. (2002) found cleavage fragments of PmpB and PmpD. Pmps are autotransporter proteins with a C-terminal antiparallel β-sheet forming a barrel through the outer membrane and an N-terminal part that is exposed on the surface of the bacteria or secreted (Vandahl et al., 2002b; Kiselev et al., 2007).
The MOMPs may be proteolytically cleaved by proteases present in the chlamydial inclusion. The proteolytically most sensitive sites in proteins are stretches of amino acid residues between domains or surface-exposed loops in an integral membrane protein. Therefore, knowledge about proteolytic fragments of a protein can give information about the structure of the protein (Fontana et al., 2004), as shown for Pmp proteins where a protease sensitive part is localized between the C-terminal β-barrel and the N-terminal β-tube structures (Vandahl et al., 2002b; Wehrl et al., 2004).
The aim of the present study was to determine in vivo proteolytic cleavage sites of proteins of COMC of C. trachomatis L2. In contrast to earlier studies, where two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), MS and N-terminal amino acid sequencing were used (Shaw et al., 2002), we used combined fractional diagonal chromatography (COFRADIC) and liquid chromatography tandem MS (LC-MS/MS), a method developed by Gevaert et al. (2003). This technique strongly enriches N-terminal peptides in complete digests of proteomes. In addition to identifying the proteins present in COMC, it identifies the actual N-terminal sequence of each in vivo generated fragment of COMC proteins.
Materials and methods
Cultivation and purification of C. trachomatis
Chlamydia trachomatis L2 434/Bu was cultivated in HeLa 229 cells for 48 h, and EB were purified by density gradient centrifugation. From 25 mg of EB protein, the COMC was extracted with sarkosyl in accordance with Caldwell et al. (1981) with the modification that 1 U mL−1 benzonase (Sigma-Aldrich, St. Louis, MO) and 20 μg mL−1 RNAse (Worthington Biochemical Corporation, Lakewood, NJ) were added to the 2% sarkosyl solution in the presence of 2 mM CaCl2 and 2 mM MgCl2. Approximately 4 mg COMC was obtained. The purity of EB and COMC was tested using negative staining and electron microscopy.
2D gel electrophoresis
2D gel electrophoresis on immobilized pH gradients pH 3.5–10, with silver staining and protein identification was carried out in accordance with Vandahl et al. (2002a).
COFRADIC isolation of N-terminal peptides
COFRADIC was done as described by Gevaert et al. (2003) and full experimental details can be found in that paper. Briefly, cysteines in the COMC proteins were reduced and then alkylated with iodoacetamide. Subsequently, all free α (protein N-terminus) and ε (lysine) amino groups were blocked by acetylation. The sample was then digested with trypsin, which, because the lysine side chains are acetylated, cleaves only after arginine, and the generated peptides were separated by reverse-phase HPLC. A total of 12 so-called primary fractions were collected, in which essentially two different types of peptides are present: peptides with a blocked (acetylated) α-N-terminus (N-terminal peptides) and peptides with a free α-N-terminus (internal peptides). The latter were then reacted and blocked with the hydrophobic reagent 2,4,6-trinitrobenzenesulfonic acid (TNBS). Each TNBS-treated primary fraction was then individually rerun on the same reverse-phase chromatograph. Owing to the strong hydrophobicity of TNBS, TNBS-reacted internal peptides now had longer retention times and segregated from the acetylated N-terminal peptides, which were collected. These N-terminal peptides were then analysed using LC-MS/MS with a Bruker Esquire HCT ion trap MS that was operated as described previously (Juul et al., 2007). As N-terminal proline, pyroglutamic acid or pyrocarbamidomethyl cysteine cannot react with TNBS, internal peptides starting with these residues are collected together with the N-terminal peptides and appear unacetylated.
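The cleavage logic described above can be illustrated with a minimal in silico sketch. It assumes ideal Arg-C specificity (cleavage after every arginine, no missed cleavages), and the function names and the toy sequence are hypothetical, not part of the published workflow.

```python
def argc_digest(sequence):
    """Cut a protein sequence after every arginine (idealised Arg-C specificity)."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue == "R":
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide without a trailing arginine
        peptides.append(current)
    return peptides


def n_terminal_peptide(sequence, start=1):
    """Peptide running from a (neo-)N-terminus at 1-based position `start`
    to the first downstream arginine; this is the peptide that would carry
    the acetylated alpha-amino group."""
    return argc_digest(sequence[start - 1:])[0]


toy_protein = "MKAARGGKLRTT"  # hypothetical sequence, for illustration only
all_peptides = argc_digest(toy_protein)
n_term = all_peptides[0]      # blocked (acetylated) N-terminal peptide
internal = all_peptides[1:]   # free alpha-amino group, would react with TNBS
```

In this picture, a protein cleaved in vivo before position 3 would yield `n_terminal_peptide(toy_protein, 3)` as its acetylated N-terminal peptide.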
The C. trachomatis L2 434/Bu genome sequence (accession number NC_010287) was used to predict ORFs for searching MS/MS data; as control, a shuffled database of the protein sequences was also generated. A locally installed version of the mascot database searching software (Perkins et al., 1999) was used. Incremental searches were performed first in the protein database and then in peptide databases (truncated protein databases) that were derived from this protein database to mimic protein processing in silico and were made using the dbtoolkit software (Martens et al., 2005). The following mascot search parameters were set when searching the protein database. Arg-C was set as the protease with one allowed missed cleavage. Fixed modifications were acetylation of lysine and carbamidomethylation of cysteine. Variable modifications were acetylation of the N-terminus, formylation of the N-terminus of the protein, oxidation of methionine, formation of pyroglutamate and pyrocarbamidomethyl cysteine, and carbamylation of lysine and the N-terminus. Searches in the peptide database were performed using essentially the same set of parameters except that no protease setting was used and formylation was not allowed. Only peptides that ranked first and for which the mascot ion score of the matched MS/MS spectrum exceeded the mascot identity threshold score set at 95% confidence were considered identified.
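How the shuffled control database was produced is not stated. One common approach, shown here only as a hedged sketch, is to shuffle the residues of each protein so that length and amino acid composition are preserved. The function name, the `decoy_` prefix and the toy entry are assumptions for illustration.

```python
import random


def shuffled_decoys(proteins, seed=0):
    """Return a decoy database in which every sequence is a random
    permutation of the corresponding target sequence.

    `proteins` maps accession -> amino acid sequence. A fixed seed keeps
    the decoy database reproducible between runs.
    """
    rng = random.Random(seed)
    decoys = {}
    for accession, sequence in proteins.items():
        residues = list(sequence)
        rng.shuffle(residues)  # in-place permutation of the residues
        decoys["decoy_" + accession] = "".join(residues)
    return decoys


targets = {"toy_protein": "MKKLLKSALLFAATGSALSLQA"}  # hypothetical entry
decoys = shuffled_decoys(targets)
```

Searching the spectra against such a decoy set gives a rough estimate of how many matches above the score threshold arise by chance.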
The part of the oprB gene encoding amino acids 19–442 was amplified from C. trachomatis D DNA with the forward primer 5′-GAC GAC GAC AAG ATG CAG GCT GCA CAC CAT CAC TAT CAC C-3′ and the reverse primer 5′-GAG GAG AAG CCC GGT TTA GAT AGC TAA ATT AGC TCG CAC GCT ATA AAC ACG-3′. The 5′-terminal sequences of the primers (GAC GAC GAC AAG AT and GAG GAG AAG CCC GGT) are ligation-independent cloning (LIC) sites used for cloning the PCR product in the pET-30 Ek/LIC vector in accordance with the manufacturer's instructions (Novagen, Madison, WI). The vector encodes a histidine tag on the protein, which was used for affinity purification (Mygind et al., 1998b). The purified protein was checked by sodium dodecyl sulphate (SDS) PAGE and peptide mass fingerprinting on a MALDI-TOF MS instrument. Two guinea-pigs were immunized three times intramuscularly and three times subcutaneously with 50 μg of protein in Freund's incomplete adjuvant (Difco, Detroit, MI). The antibody was diluted 1 : 200 for immunoblotting.
Other antibodies used were mouse monoclonal antibody (mAb) 32.3 against MOMP, mAb 35.2 against DnaK and a rabbit polyclonal antibody against Omp2 (Lundemose et al., 1990; Birkelund et al., 1994). The secondary antibodies used were goat anti-mouse and anti-guinea-pig IgG conjugated with alkaline phosphatase (Sigma-Aldrich), diluted 1 : 20 000, and goat anti-mouse and anti-guinea-pig IgG conjugated with fluorescein isothiocyanate (FITC) (Jackson ImmunoResearch Laboratories, West Grove, PA), diluted 1 : 100. Immunofluorescence microscopy was performed according to Clausen et al. (1997).
Results and discussion
2D-PAGE analysis of COMC
EB and extracted COMC were separated using 2D-PAGE and the gels were silver stained (Fig. 1). Relative to MOMP, the extraction removed most of the dominant cytoplasmic proteins (DnaK and GroEL) from the EB preparation (Fig. 1a and b). Of known membrane proteins, MOMP, Omp2, PorB, PmpF and PmpG were identified (Fig. 1b). In addition, a protein not previously known to be part of COMC, CTL0887 (Ct623), was identified in two spots. The spots were clearly enriched in the COMC gel compared with the EB gel, indicating that CTL0887 is part of COMC (C. trachomatis L2 434/Bu CTL0887, accession number YP_001654956).
N-terminal COFRADIC analysis of COMC proteins
Using gel densitometry, GroEL and DnaK were <1 : 100 of MOMP, showing a high enrichment of COMC proteins.
The COFRADIC results using a threshold of four spectra and two unique peptides are shown in Table 1. With these thresholds, 25 proteins were identified, including many known COMC proteins. For database searches, acetylation of α-N-termini was set as a potential modification, and therefore if the N-terminal amino acid in a peptide was identified as being acetylated, it could be concluded that the corresponding proteins were cleaved at this position before trypsin digestion. In addition, peptides with unacetylated N-termini were identified because of incomplete TNBS treatment, and the presence of such internal peptides further supported the identification of a protein as being present in COMC (Table 2). Not all N-terminal peptides are identified by COFRADIC, for example peptides shorter than six amino acids or peptides >2500 Da. Therefore, some peptides with known cleavage sites were not identified.
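The identification thresholds and the detectability limits mentioned above can be sketched as follows. This is an illustration only: the thresholds are read as "at least four spectra and at least two unique peptides", peptide masses are computed for unmodified residues (acetyl and carbamidomethyl groups are ignored), and the function names are hypothetical. The spectrum and peptide counts are those quoted in the text.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER


def is_detectable(peptide, min_length=6, max_mass=2500.0):
    """Peptides shorter than six residues or heavier than 2500 Da are
    treated as unlikely to be identified."""
    return len(peptide) >= min_length and peptide_mass(peptide) <= max_mass


def passes_thresholds(spectra, unique_peptides, min_spectra=4, min_unique=2):
    """Assumed reading of the cut-off used for Table 1."""
    return spectra >= min_spectra and unique_peptides >= min_unique


# (spectra, unique peptides) as reported in the text
counts = {
    "MOMP": (236, 14), "Omp2": (135, 38), "CTL0626": (16, 9),
    "CTL0541": (12, 5), "CTL0887": (11, 4), "PorB": (5, 4), "Omp85": (2, 2),
}
identified = [name for name, (s, u) in counts.items() if passes_thresholds(s, u)]
```

Under these assumptions Omp85, with two spectra, falls below the cut-off, which matches its being reported as data not shown.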
Table 1. COFRADIC identification of proteins with more than four spectra and two unique peptides from Chlamydia trachomatis L2 COMC (table body not recovered; column headings include the Signal-P cleavage site).
Table footnotes: † Bacterial inner membrane protein. Low significance of signal-p 3.0 prediction of cleavage site.
The most abundant noncytoplasmic protein fragments identified are from MOMP, Omp2, CTL0626 (Ct372, OprB), PmpG, CTL0541 (Ct289), CTL0887 (Ct623), PmpH and YscC. Frequently, the same peptide was identified multiple times, strengthening the identification of the corresponding protein (Table 1). In addition to N-terminally cleaved fragments, COFRADIC was able to identify the N-terminal sequence of six proteins starting at amino acid 1 or 2 depending on whether N-formylmethionine (fMet) was cleaved off. Owing to the high sensitivity of the COFRADIC method and the inability to completely remove all cytoplasmic proteins during the COMC extraction process (Fig. 1b), some known cytoplasmic proteins contaminating the COMC preparation were also identified (Table 1). Here, highly expressed cytoplasmic proteins such as GroEL, DnaK, RNA polymerase β, elongation factor Tu, and Histone-H1-like proteins Hc1 and Hc2, many of which are known to bind to other proteins (Christiansen et al., 1993; Shaw et al., 2002), were found as contaminants. The N-terminal amino acid sequence was identified for six of these proteins (Table 1).
Porins form a transport channel that allows passive diffusion of small polar molecules through the outer membrane to the periplasmic space. MOMP was identified with 236 spectra and 14 unique peptides. The predicted signal peptidase I cleavage site was not detected but 17 spectra of acetylated peptides starting at amino acids 61, 62 or 63 were obtained (Table 2). This region is in a conserved part of MOMP, and in a model generated by Findlay et al. (2005) amino acids 61–63 were located at the end of a predicted loop, indicating that this part may be accessible to proteolytic cleavage in vivo. This was confirmed by the COFRADIC results.
Other porins in smaller amounts than MOMP are known to be present in COMC including PorB (Fig. 1b) and Omp85 (Kubo & Stephens, 2000; Stephens & Lammel, 2001; Shaw et al., 2002). PorB was detected with five spectra and four unacetylated internal peptides. COFRADIC thus gave no data about its N-terminal amino acid sequence (Table 1). Omp85 was detected with two spectra of two internal peptides (data not shown).
A hypothetical protein CTL0626 (Ct372) was identified with 16 spectra of nine unique peptides. By conserved domain database search, CTL0626 was found to hold a carbohydrate-selective porin family motif (OprB, Ct372) in its C-terminal part (amino acids 169–461) (Wylie et al., 1993; Wylie & Worobec, 1995; Marchler-Bauer et al., 2007). Using blast analysis, well-conserved CTL0626 homologues were found to be present in the other sequenced C. trachomatis serovars (A, D), as well as in Chlamydia muridarum Nigg, Chlamydia pneumoniae, Chlamydia abortus, Chlamydia caviae GPIC and Chlamydia felis Fe/C-56. The OprB domain was clearly the best conserved part and the only part of the protein that matched OprB proteins in other bacteria. The first acetylated N-terminal peptide of CTL0626 was found at amino acid 60 but internal peptides down to amino acid 48 were detected and therefore the N-terminal amino acid should be found further upstream (Table 2). signalp 3.0 did not identify a signal peptidase I cleavage site of CTL0626 (Bendtsen et al., 2004). It was, however, possible to find such a site in OprB of C. muridarum, C. abortus, C. caviae, C. felis and C. pneumoniae. By multiple alignments it was found that compared with the other Chlamydia, C. trachomatis OprB was missing 19 amino acids of the leader sequence. In the C. trachomatis and C. muridarum sequences a GTG codon is present at the position where the ATG codon is present in the other Chlamydia species, 57 nucleotides upstream of the annotated start codon. If this GTG codon was used as alternative start codon, a ribosomal binding site was present in front of GTG (Fig. 2), and the translated sequence with 19 additional amino acids gave a perfect leader sequence prediction. The easygene program also predicted the GTG to be the most likely start codon (score 5 × 10−13) for CTL0626 of C. trachomatis. An indirect proof of the use of the GTG as start codon is the presence of OprB in the COMC. 
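The start codon argument rests on simple arithmetic: 19 additional amino acids correspond to 19 × 3 = 57 nucleotides upstream of the annotated start. The sketch below shows how such an in-frame upstream extension could be checked. The function name is hypothetical and the DNA string is a toy construct, not the CTL0626 sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
START_CODONS = ("ATG", "GTG", "TTG")


def upstream_extension(dna, annotated_start, extension_nt=57):
    """Return (candidate start codon, translated extension) for an ORF
    lengthened by `extension_nt` nucleotides upstream of the 0-based
    `annotated_start`."""
    new_start = annotated_start - extension_nt
    if new_start < 0 or extension_nt % 3 != 0:
        raise ValueError("extension must be in frame and inside the sequence")
    extension = dna[new_start:annotated_start]
    protein = "".join(CODON_TABLE[extension[i:i + 3]]
                      for i in range(0, len(extension), 3))
    codon = extension[:3]
    if codon in START_CODONS:
        # An initiating GTG or TTG is still read as (formyl)methionine.
        protein = "M" + protein[1:]
    return codon, protein


# Toy construct: 2 nt spacer, GTG, 18 alanine codons, annotated ATG, one codon.
toy_dna = "CC" + "GTG" + "GCT" * 18 + "ATG" + "AAA"
codon, leader = upstream_extension(toy_dna, annotated_start=59)
```

A usable extension should begin with a start codon and contain no in-frame stop (`"*"`) before the annotated ATG.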
Therefore, an antibody to a histidine-tagged fusion protein of OprB was generated. The predicted size without leader sequence of C. trachomatis L2 OprB is 49.1 kDa. In immunoblots the antibody reacted with a band that migrated at 46 kDa in both EB and COMC but the reaction was stronger in COMC (Fig. 3). When parallel samples were reacted with antibodies to MOMP, an equally strong reaction was seen in EB and COMC, whereas an antibody to the cytoplasmic antigen DnaK reacted more strongly with EB than with COMC. The data thus further indicated that OprB is present in COMC, and that the alternative start codon is used, because otherwise OprB could not be exported over the cytoplasmic membrane. To analyse the localization of OprB, immunofluorescence microscopy was performed on cell cultures infected for 44 h with C. trachomatis L2 using the same primary antibodies as in immunoblotting (Fig. 3). In Fig. 4 it is seen that the OprB antibody reacts with ring-shaped structures in the chlamydial inclusion (Fig. 4a) similar to the reaction of MOMP (Fig. 4b), and that the DnaK antibody gives filled out structures (Fig. 4c) in accordance with the cytoplasmic localization of DnaK. This indicates that OprB is located in a membrane, but because it was not possible to obtain a reaction with unfixed C. trachomatis L2 attached to the host cell surface with the OprB antibody (data not shown), it was not possible to determine whether OprB is surface localized. CTL0626 (Ct372) OprB was not previously found by proteomic analysis (Shaw et al., 2002). A reason for this might be its high pI of 9.
In all, 135 Omp2 spectra were obtained from 38 unique peptides. Omp2 migrates as a double band in SDS-gels. Using N-terminal sequencing of the bands with Edman degradation, Allen & Stephens (1989) found a sequence starting at amino acid 23 close to the predicted cleavage site (Table 1) and a sequence starting at amino acid 41. Using COFRADIC, peptides starting at amino acids 41, 45, 47 and 53 were identified (Fig. 5, Table 2), indicating that the N-terminus of Omp2 is accessible for proteolysis. The predicted signal peptidase I cleavage site identified by Allen & Stephens (1989) was not found because the resulting N-terminal peptide is too big to be picked up by COFRADIC. Omp2 has a proposed two-domain structure shown by a protease accessible site at approximately amino acid 100 (Ting et al., 1995; Mygind et al., 1998a). Acetylated peptides starting at positions 98 and 99 were identified in agreement with its possible two-domain structure. Other major cleavage sites were identified in the middle of Omp2 at amino acids 235, 236 and 237 as well as 267 and 268. These two sites are found in the middle and just after the first domain of unknown function 11 (DUF11) sequence (Fig. 5). A major area for proteolysis was located in the C-terminal part at amino acids 486 and 487, 494 and 495 and 499, 500 and 501. These sites are found in front of a predicted α-helix showing that Omp2 has multiple domains (Birkelund et al., 2006). COFRADIC thus identified cleavage of Omp2 at amino acid 40, and further cleavage in front of domains was seen, supporting its predicted secondary structure.
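The Omp2 cleavage positions listed above fall into a few regions. A minimal sketch of grouping them is given below. The gap of eight residues used to merge neighbouring positions is an arbitrary assumption chosen for illustration, and the function name is hypothetical.

```python
def cluster_positions(positions, max_gap=8):
    """Group sorted residue positions into clusters in which consecutive
    positions lie no more than `max_gap` residues apart."""
    clusters = []
    for position in sorted(positions):
        if clusters and position - clusters[-1][-1] <= max_gap:
            clusters[-1].append(position)
        else:
            clusters.append([position])
    return clusters


# Start positions of acetylated Omp2 peptides quoted in the text.
omp2_starts = [41, 45, 47, 53, 98, 99, 235, 236, 237, 267, 268,
               486, 487, 494, 495, 499, 500, 501]
regions = cluster_positions(omp2_starts)
```

With this gap the positions collapse into five regions: the N-terminal stretch, the site near residue 100, two sites in the middle of the protein and the C-terminal area.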
Chlamydia trachomatis L2 has nine pmp genes, none of which is truncated, and the genes encode proteins with molecular masses of 95–187 kDa (Thomson et al., 2008). Three Pmps were detected using 2D-gel (Fig. 1b) and seven using COFRADIC: PmpB, C, D, E, F, G and H. All Pmp proteins have predicted cleavage sites for signal peptidase I, but only for PmpF was a peptide corresponding to the signal peptidase I cleavage site at position 25–26 identified (Table 1). For the 157-kDa PmpD the only acetylated peptides found started at amino acids 53 and 63, but the starting point of the C-terminal 80-kDa fragment identified by Kiselev et al. (2007) was not found. Of nine remaining acetylated peptides identified in the Pmp proteins, four were located in the region between the predicted β-barrel embedded in the outer membrane and the passenger domain (Table 2), in agreement with the proposed C. pneumoniae Pmp structure (Vandahl et al., 2002b). In earlier proteomic studies of C. trachomatis L2 EB, PmpB, F, G and H were identified by 2D-PAGE (Shaw et al., 2002). Using ‘shotgun proteomics’, where purified C. trachomatis L2 EB and RB were digested with trypsin and all peptides were determined by MS/MS, peptides from all of the Pmp proteins were identified (Skipp et al., 2005). In agreement with this, COFRADIC identified acetylated peptides from all Pmp proteins except PmpA and PmpI. However, it must be noted that one peptide was identified from PmpI.
Proteins of the type III secretion system
Chlamydia has a type III secretion system (TTSS) by which proteins from the cytoplasm of Chlamydia can be secreted across the eukaryotic cell membrane, or across the chlamydial inclusion membrane into the host cell cytoplasm. Using COFRADIC the Yersinia YscC homologue, the TTSS protein, which is embedded into COMC (Vandahl et al., 2002a), and low calcium response protein D (LcrD) were identified. Chlamydia YscC was identified with internal peptides, and the N-terminal fMet of LcrD was identified. This is in agreement with Plano et al. (1991) who located LcrD from Yersinia pestis to the bacterial inner membrane (Table 1).
Other identified proteins
CTL0541 (Ct289) was identified with 12 spectra and five unique peptides, including a peptide starting at amino acid residue 40, which strongly supports a weak prediction of a signal peptidase I cleavage site between residues 39 and 40 (Tables 1 and 2). CTL0541 has 381 amino acid residues and the mature protein is 37.5 kDa and has an alkaline pI of 9.87. The protein has no motifs or secondary structure predictions that are characteristic for an MOMP. Antibodies generated to the protein did not detect the protein in COMC or EB, indicating that it is present in small amounts. In agreement with this, it was not identified previously using shotgun proteomics (Skipp et al., 2005).
CTL0887 (Ct623) found by 2D-PAGE of COMC was identified with 11 spectra and four unique peptides. It has a predicted leader sequence but the cleavage site was not found by COFRADIC. CTL0887 is a homologue to the 76-kDa C. pneumoniae protein but this protein is not fully characterized (Melgosa et al., 1994).
Comparing EB and COMC protein images by 2D-PAGE gives a visual picture of the proteins present in COMC, and this technique is quantitative. The COFRADIC approach is only semi-quantitative and proteins present in low amounts can be over-represented in spectra due to the presence, localization and accessibility of trypsin sites. This can also result in an abundant protein being undetected.
The COFRADIC approach identified all known MOMPs. In two cases it further identified the N-terminal amino acid sequence after signal peptidase I cleavage identical to the signal-P prediction. COFRADIC data also supported the prediction of an additional loop in MOMP. We further suggest the OprB protein to be an MOMP, based on its detection in both the EB and COMC by immunoblotting and by immunofluorescence microscopy, where it reacted with ring-shaped structures in agreement with its localization in the membrane. Therefore, the translation of OprB is most likely initiated at GTG as an alternative start codon. The strengths of COFRADIC are its sensitivity for protein identification and its potential for identifying in vivo cleavage sites. However, it should be noted that not all known N-terminal amino acid sequences could be determined, as the position of the first arginine residue influences the likelihood of identifying the generated peptide.
This study was supported by ‘The Danish Medical Research Council’ (grant nos 22-03-0245 and 271-05-0488), ‘Aarhus Universitets Forskningsfond’, Biotek grant from the Faculty of Health Sciences, University of Aarhus, EU (grant NoE EPG LSHB-CT-2005-512061), and John and Birthe Meyer Foundation. The lab in Ghent acknowledges the support of the EU grant ‘Interaction Proteome’ within the 6th Framework Program.