A wide range of bacteria secrete proteases into their extracellular environment for various purposes, such as degrading extracellular proteins to facilitate nutrient transport or mediating bacterial virulence and toxicity.1 A bacterial secreted protease is normally synthesized as a precursor composed of a secretion signal peptide, a propeptide that is cleaved upon protease activation, and the mature secreted protease. The mature protease is the functional enzyme and can be isolated from the extracellular medium and characterized.
This article concerns an endoproteinase secreted by an alkaliphilic and moderately halophilic microbe belonging to the Nesterenkonia abyssinica family (originally named Nesterenkonia sp. AL20). This protease, designated NAALP (Nesterenkonia abyssinica alkaline protease), was isolated from an alkaline soda lake in the East African rift valley.2, 3 The bacterium AL20 grows well with chicken feather as a nutrient source, and NAALP has shown good activity towards casein and hemoglobin as substrates in vitro, with a sequence preference in the order Tyr > Phe > Leu at the P1 site.4 Although the activity profiles of NAALP suggested that the enzyme is a subtilisin-like protease, its activity and stability were calcium independent. NAALP is optimally active at pH 10, 1.0 M NaCl, and 70°C and shows good stability at 50°C in the presence of EDTA and detergents.5
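The reported P1 preference can be illustrated with a minimal sketch that lists candidate scissile bonds in a peptide. It assumes, as a simplification, that cleavage occurs only after Tyr, Phe, or Leu; the source reports a preference order, not strict specificity, and the function name and example peptide are hypothetical.

```python
# P1 residues in the reported preference order (Tyr > Phe > Leu).
# Treating the preference as a strict rule is an assumption made for illustration.
P1_PREFERENCE = ("Y", "F", "L")


def candidate_cleavage_sites(sequence, p1_residues=P1_PREFERENCE):
    """Return (position, residue, rank) for each preferred P1 residue.

    position is the 1-based index of the P1 residue; the scissile bond
    follows it. rank 0 is the most preferred residue. The C-terminal
    residue is skipped because no peptide bond follows it.
    """
    seq = sequence.upper()
    sites = []
    for index, residue in enumerate(seq[:-1]):
        if residue in p1_residues:
            sites.append((index + 1, residue, p1_residues.index(residue)))
    return sites


# Hypothetical peptide, not taken from the article.
sites = candidate_cleavage_sites("AYGFKL")
for position, residue, rank in sorted(sites, key=lambda s: s[2]):
    print(f"cleave after {residue}{position} (preference rank {rank})")
```

Sorting by rank lists the bonds in the order the enzyme would be expected to favor them under this simplified model.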
With hundreds of bacterial genomes available in the postgenomic era, thousands of novel proteins, annotated as open reading frames (ORFs), have been identified without biochemical characterization. Sequence searches using NAALP as the probe have revealed dozens of homologues in the sequenced bacterial genomes, the overwhelming majority of which are uncharacterized putative proteins. NAALP thus defines a novel protein family of bacterial secreted proteases, with membership defined by a sequence identity of over 30% to ensure the same structural fold.
In this report, using high-resolution crystal structure determination, we have unambiguously characterized NAALP and its sequence-related family as trypsin-like serine proteases.
MATERIALS AND METHODS
Identification of the NAALP family
The sequence of the mature enzyme of NAALP was used to search the European Bioinformatics Institute (EBI) UniProt Knowledgebase at the website: http://www.ebi.ac.uk/fasta33/.
Representative homologous sequences from the FASTA search results were selected and aligned with the program CLUSTALX.6
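As an illustration of the 30% identity criterion used to define the family, the following minimal sketch computes pairwise percent identity from two rows of an alignment and filters by a threshold. It is not the authors' procedure: the choice of denominator (alignment columns where neither sequence has a gap) is an assumption, since the source does not state how identity was calculated, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither row has a gap.

    Assumption: the denominator is the number of gap-free columns; other
    conventions (e.g. length of the shorter sequence) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def select_family(aligned_query, aligned_hits, threshold=30.0):
    """Return names of hits whose identity to the query meets the threshold."""
    return [
        name
        for name, seq in aligned_hits.items()
        if percent_identity(aligned_query, seq) >= threshold
    ]


# Toy aligned sequences, invented for illustration only.
query = "ACDE-GHIK"
hits = {"hit1": "ACDFHGHLK", "hit2": "WWWW-WWWK"}
print(select_family(query, hits))
```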
Protein structure determination and refinement
The protein preparation, crystallization, and diffraction data collection have been described.7 There are two molecules per asymmetric unit, and a two-fold noncrystallographic symmetry (NCS) was revealed by self-rotation analysis. The crystal structure was determined by the molecular replacement (MR) method using the program MolRep in the CCP4 package.8 The crystal structure of a glutamyl endopeptidase from Bacillus intermedius (PDB ID: 1P3E), which has a sequence identity of 22% and is the closest homolog of NAALP that could be found in the PDB database, was used as the search model.9 Based on the sequence alignment, several different constructs of 1P3E were prepared for MR. For each construct, poly-alanine, poly-serine, and partial mutation models were tested with MolRep using exactly the same protocol. Self-rotation results were input for the MR search in the range of 20–3 Å. The final solution was determined with the poly-serine model of residues 20–215 from chain A of 1P3E. All of the top 30 rotation peaks were used for translation searches. The solution for one molecule was obtained from the first rotation peak and was confirmed by a sharp translation peak (TF/sig value of 6.7, whereas the following peaks were around 3.7). The position of the second molecule was searched with the first one fixed, and the resulting dimer was subjected to refinement.
After rigid body refinement by MolRep, the program ARP/wARP was used for further refinement and automatic model tracing.10 Refinement of the high-resolution (1.39 Å) structure was carried out with the program Refmac5 combined with manual rebuilding of the model in the graphics program COOT.11, 12 The stereochemical quality of the model was evaluated with PROCHECK.13 The data collection and structure refinement statistics are listed in Table I. Structure factors and coordinates have been deposited in the PDB (PDB ID: 3CP7).
Table I. Refinement Statistics and Model Quality
Values in parentheses are for outer (highest) resolution shell.
Unit cell parameters (Å)
a = b = 92.26, c = 137.88, α = β = 90°, γ = 120°
No. of reflections observed
No. of Unique reflections
Resolution range (Å)
Rmerge (%) (last shell)
B factor from Wilson plot (Å2)
Reflections used in refinement
No. of molecules/asymmetric unit
No. of non-H atoms
No. of solvent molecules
657 H2O + 2 FMT
rmsd of bond lengths (Å)
rmsd of bond angles (°)
Averaged B-factors (Å2)
Monomer A: 11.9
Monomer B: 12.8
Ramachandran plot, residues in
Core region (%)
Additional allowed region (%)
Generously allowed region (%)
Disallowed region (%)
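The unit cell parameters in Table I determine the cell volume through the general triclinic formula. The short sketch below is added for illustration and is not part of the original analysis; the function name is hypothetical, and only the cell parameters are taken from the table.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (cubic angstroms) of a unit cell from edge lengths (angstroms)
    and angles (degrees), using the general triclinic formula."""
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Cell parameters from Table I: a = b = 92.26, c = 137.88, gamma = 120 degrees.
volume = unit_cell_volume(92.26, 92.26, 137.88, 90.0, 90.0, 120.0)
print(f"unit cell volume: {volume:.3e} cubic angstroms")
```

For this cell, with α = β = 90° and γ = 120°, the expression reduces to a·b·c·sin γ, which gives a volume of about 1.02 × 10^6 Å3.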
RESULTS AND DISCUSSION
NAALP has defined a novel family of bacterial secreted serine proteases
Among about 100 returned search results, more than 95% of the sequences were previously uncharacterized putative proteins or ORFs. Using the criterion of 30% sequence identity to define the NAALP-like family, about 30 sequences were selected, all of which were annotated as putative proteins without any biochemical characterization except for NAALP. Hence, the NAALP-like family is a novel family of bacterial secreted serine proteases. Thirteen sequences (including NAALP itself) were selected as representatives of this family for a structure-based multiple sequence alignment by CLUSTALX, as shown in Figure 1(A). The selected protein sequences are from the following bacterial sources:
Figure 1(A) clearly shows that all of the structural elements and functionally important residues are very well conserved in the NAALP-like family, including the active site triad (S169, H41, and D91 in NAALP numbering) labeled by filled squares, the oxyanion hole labeled by filled rings, and the two intramolecular disulfide bridges, C23-C42 and C144-C162.
Overall structure of NAALP
The final model of NAALP was refined to 1.39 Å with an R-factor and free R-factor of 17.8% and 19.9%, respectively. Refinement statistics and model quality of NAALP are listed in Table I. The structure contains two molecules in one asymmetric unit [Fig. 1(B), labeled as Mol A and Mol B]. The two molecules are very similar in three-dimensional structure, and each adopts a typical trypsin-like fold consisting of two lobes, each formed by a six-stranded β-barrel. In addition to its β-protein features, NAALP also contains two short α helices, a longer α helix (α3), and three turns at the C-terminus [Fig. 1(B,C)]. In the crystal lattice, two molecules of NAALP pack together to form a crystallographic dimer with an interacting surface of about 880 Å2, which is in the range of a weak protein dimer (the interacting surface area of the NAALP dimer was calculated at the website: http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver).14 The dimer interface is formed mainly by the α1 and α3 helices from both monomers packing against each other, predominantly through H-bonds, salt bridges, and hydrophobic interactions [Fig. 1(B)].
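For readers unfamiliar with the R-factor quoted above, the following minimal sketch shows the standard definition, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. The free R-factor uses the same expression on a subset of reflections held out of refinement. The function name and the amplitudes are hypothetical, and the sketch assumes the calculated amplitudes are already on the same scale as the observed ones; it does not reproduce how Refmac5 computes these statistics.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R-factor from observed and calculated amplitudes.

    Assumes f_calc has already been scaled to f_obs. Returns a fraction;
    multiply by 100 for the percentage usually reported.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Invented amplitudes for illustration only; not data from the article.
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
print(f"R = {100 * r_factor(f_obs, f_calc):.1f}%")
```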
In previous studies, because the full-length sequence was not available, NAALP was assigned to the subtilisin-like family of serine proteases mainly on the basis of its biochemical features.2, 4 With further bioinformatics annotation after the structure became available, NAALP has been classified as trypsin-like in the CATH Protein Structure Classification database (http://www.cathdb.info) and the Pfam Protein Families database (http://pfam.sanger.ac.uk). Furthermore, our structural results show that NAALP is very similar to trypsin in three-dimensional structure and that all of the functional elements of a trypsin-like serine protease are completely conserved. The NAALP-like family therefore has a trypsin-like structure and function.
The active sites of NAALP
The active sites in both molecules of NAALP are very similar and are readily identified as being in the active form, with an intact catalytic triad (S169, H41, and D91) and oxyanion hole, as depicted in Figure 1(D,E). A formic acid molecule (from the 2.9 M sodium formate in the crystallization buffer) has been observed in both active sites, with an oxygen atom positioned in the oxyanion hole formed by the main-chain NH groups of Gly167 and Ser169, somewhat resembling the new carboxy terminus of a cleaved substrate.