Y. Sambongi, Graduate School of Biosphere Sciences, Hiroshima University, 1-4-4 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8528, Japan. Fax/Tel.: +81 824 24 7924, E-mail: firstname.lastname@example.org
Cytochrome c is widely distributed in bacterial species, from mesophiles to thermophiles, and is one of the best-characterized redox proteins in terms of biogenesis, folding, structure, function, and evolution. Experimental molecular biology techniques (gene cloning and expression) have become applicable to cytochrome c, enabling its engineering and manipulation. Heterologous expression systems for cytochromes c in bacteria, for use in mutagenesis studies, have been established by extensive investigation of the biological process by which the functional structure is formed. Mutagenesis and structure analyses based on comparative studies using a thermophile Hydrogenobacter thermophilus cytochrome c-552 and its mesophilic counterpart have provided substantial clues to the mechanism underlying protein stability at the amino-acid level. The molecular mechanisms underlying protein maturation, folding, and stability in bacterial cytochromes c are beginning to be understood.
In mitochondria, electrons shuttle between the membrane-bound cytochrome bc1 complex and cytochrome oxidase via a water-soluble protein cytochrome c. This process is required for generation of an electrochemical proton gradient across the membrane (proton-motive force), which drives the synthesis of ATP, a biological energy currency. Bacteria also have soluble monoheme Class I cytochromes c functioning as similar electron carriers on the peripheral surface of the cytoplasmic membrane. This type of cytochrome c is widely distributed in bacteria, from mesophiles to thermophiles, and is one of the best-characterized proteins in terms of biogenesis, folding, structure, function, and evolution.
Cytochrome c is unique among heme proteins in having a heme covalently attached to the polypeptide chain via two thioether bonds, formed from the vinyl groups of the heme and the two cysteine residues in the consensus Cys-X-X-Cys-His motif [1–4]. Site-directed mutagenesis studies of cytochrome c are at a relatively early stage, because of the difficulty of expressing the holoprotein heterologously. However, during the last decade, it has been found that bacteria have specific cellular apparatus for covalent attachment of a heme to the cytochrome c polypeptide [1–4]. Knowledge obtained from studies on the cytochrome c biogenesis pathway in bacteria has been used to produce heterologous cytochromes c in large quantities, which has facilitated mutagenesis studies and structure analysis.
In this review, we will summarize how bacterial cytochromes c have been used in recent mutagenesis and structure studies to elucidate protein stability. Through such investigations, we have gained insight into the molecular mechanisms underlying cytochrome c maturation and folding as well as stability. Bacterial cytochrome c has contributed greatly to our understanding in these areas, and is one of the most successful model proteins. The basic ideas obtained with cytochromes c should be applicable to other proteins of industrial and/or medical interest.
Cytochrome c as a model for protein stability studies
Proteins isolated from thermophilic bacteria are usually stable to heat and chemical denaturants, indicating that they must have most of the determinants of protein stability. Initial clues to the relationship between protein structure and stability can be obtained by pairwise sequence comparison of homologous proteins from thermophiles and mesophiles. The cytochromes c, which play a central role in electron-transport chains in both thermophilic and mesophilic bacteria as well as in eukaryotes, are useful in investigations of the structural basis of protein stability at the amino-acid level.
Highly homologous cytochromes c from thermophiles and mesophiles
For pairwise comparison to elucidate protein structure–stability relationships using cytochromes c, we must find homologues in thermophiles and mesophiles that exhibit high sequence identity. Cytochrome c-552 from a thermophilic bacterium, Hydrogenobacter thermophilus (HT c-552), is an 80-amino-acid protein with a heme covalently attached to the polypeptide chain. The amino-acid sequence (Fig. 1) and main-chain folding (Fig. 2) of HT c-552 from this thermophile closely resemble those of the 82-amino-acid cytochrome c-551 from a mesophile, Pseudomonas aeruginosa (PA c-551). Comparisons of the two proteins indicated that the amino-acid residues are 56% identical, and that the root mean square deviation for their main-chain folding is within 1 Å. However, as expected from the difference in their optimal growth temperatures (H. thermophilus, 70 °C; P. aeruginosa, 37 °C), HT c-552 is more stable to heat and chemical denaturants than PA c-551 [7–9]. For instance, the former has a significantly higher mid-point denaturation temperature than the latter, as judged both spectrophotometrically and calorimetrically. As described below, investigation of the relationship between the 3D structures and thermodynamic properties accompanying protein denaturation of HT c-552 and PA c-551 (wild-types and mutants) has revealed the molecular mechanism underlying protein stability. The difference in stability between them can be attributed to differences in side-chain interactions in a few select regions (see below).
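As a rough illustration of how a figure such as the 56% identity quoted above is obtained, the sketch below counts identical residues over the aligned positions of two pre-aligned sequences. The function name `percent_identity`, the convention of skipping gapped columns, and the toy sequences are assumptions made for this example; they are not the real HT c-552 or PA c-551 sequences, and other identity conventions (for example, dividing by the shorter sequence length) give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded (an assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy, hypothetical fragments; NOT real cytochrome c sequences.
toy_thermophile = "ACDE-FGHIK"
toy_mesophile = "ACDQWFGH-K"
print(percent_identity(toy_thermophile, toy_mesophile))
```

In practice the alignment itself would come from a dedicated alignment program; only the counting step is shown here.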
Other homologous cytochromes c
To determine the amino-acid residues responsible for protein stability, it would be better if sequence information was available from a larger number of homologous proteins in a variety of bacteria that differ in optimal growth temperature. Many homologous cytochromes c that exhibit sequence identity with HT c-552 and PA c-551 of more than 50% are also known in other mesophiles, e.g. Pseudomonas, Azotobacter, and Nitrosomonas species (Fig. 1). On direct sequence comparison of these proteins, we can find amino-acid residues that exist in HT c-552, but not in most of the others at the corresponding positions. It is likely that some of them are the residues responsible for stability, and in fact Ala7, Met13, and Tyr43 in HT c-552 have been shown, by mutagenesis and 3D structure analyses, to be determinants of the higher stability of HT c-552 (see below). For clarity, the residue numbers used for HT c-552 are those of PA c-551.
A hyperthermophile, Aquifex aeolicus, has an 86-amino-acid cytochrome c-555s (AA c-555) in its mature form. The sequence identity between HT c-552 and AA c-555 (33%) is not as high as that between HT c-552 and PA c-551, but these three proteins still exhibit high sequence similarity (more than 50%, Fig. 1); their main-chain folding may therefore be similar. Furthermore, the optimal growth temperature differs greatly between H. thermophilus and A. aeolicus (70 °C and 95 °C, respectively). Therefore, AA c-555 should be included in the group of cytochromes c to be examined for protein stability. Thermodynamic analysis of AA c-555 will be of great interest.
The homologous cytochromes c listed in Fig. 1 are excellent models for determining the structural origin of protein stability. Experimental data on the stability and 3D structure of these proteins in addition to their amino-acid sequences will provide valuable information on the mechanism underlying protein stability.
Biogenesis of bacterial cytochromes c
To confirm experimentally the amino-acid residues responsible for the stability indicated by sequence and 3D structure comparisons, site-directed mutagenesis needs to be performed on a series of homologous cytochromes c. For this, the cytochrome c gene needs to be heterologously expressed as a holoprotein that has a covalently attached heme and is in a correctly folded form, like the authentic proteins isolated from the original bacteria. Heme attachment, which must occur regardless of whether the cytochrome c gene is endogenous or exogenous, is a unique step during cytochrome c biogenesis. To be able to attempt heterologous expression of the cytochrome c gene, it is important to determine how the heme attaches to the polypeptide in the cell and how its functional structure is formed. The pathway of cytochrome c biogenesis has been extensively studied [1–4]. In this section, we briefly summarize recent advances in our understanding of bacterial cytochrome c biogenesis.
Genes required for cytochrome c biogenesis in bacteria
Cytochrome c heme lyase has been identified as the enzyme responsible for heme attachment to the mitochondrial cytochrome c polypeptide. The apparatus for cytochrome c biogenesis in bacteria is not analogous to that in mitochondria, because no orthologue of the cytochrome c heme lyase has been found in the genomes of bacteria that can synthesize cytochrome c. Instead, genetic evidence has suggested that at least 12 genes (ccm, cytochrome c maturation; dsb, disulfide bond formation) are required for bacterial cytochrome c biogenesis [1–4,12]. It would be interesting to compare the biogenesis apparatus for cytochromes c with that for nonheme iron-sulfur proteins (Nif proteins involved in Fe/S cluster formation), as the latter appears to be common in prokaryotes and eukaryotes to some extent. Among the ccm gene products, Escherichia coli CcmE was first biochemically characterized as a factor transferring a heme to the cytochrome c apopolypeptide. A subsequent biochemical study indicated that E. coli CcmC can interact with CcmE during heme transfer. It was predicted that heme transfer occurs in the periplasm in vivo. A site-directed mutagenesis study on bacterial cytochrome c polypeptides also supported the idea that heme attachment takes place after the apoprotein has left the cytoplasm.
Effect of thiol–disulfide redox conditions
A defect in an integral membrane protein, DsbD (also known as DipZ), was first characterized as an E. coli mutation that prevented the synthesis of mature cytochrome c in the periplasm. DsbD contains a domain with potential disulfide isomerase activity facing the periplasm [18–20]. Other Dsb proteins, DsbA and DsbB, which have been determined to oxidize cysteine thiols to form the internal disulfide bonds of many proteins in the E. coli periplasm, are also required for cytochrome c biogenesis [21,22].
Cytochrome c biogenesis in dsbD mutant cells can be restored by adding low-molecular-mass thiol compounds to the growth medium, and that in dsbA and dsbB mutants by adding disulfide compounds. These complementation results are consistent with the general role of Dsb proteins in the regulation of the thiol–disulfide redox conditions during periplasmic protein folding. Although no biochemical evidence for the requirement of the Dsb system during cytochrome c biogenesis has been obtained yet, the genetic evidence suggests that thiol–disulfide redox control is also essential for cytochrome c biogenesis in the periplasm. Importantly, these results indicated that the level of cytochrome c production could easily be controlled by the thiol–disulfide redox potential, and this is indeed the case, as described below.
Expression of exogenous cytochrome c genes in bacteria
For mutagenesis studies and structure analysis, it is necessary to obtain heterologously expressed, mature holocytochromes c in large quantities. Knowledge obtained from studies on bacterial cytochrome c biogenesis needs to be extended for the efficient production of mature holocytochromes c.
Targeting to the periplasm
The functional regions of gene products (Ccm and Dsb proteins) required for cytochrome c biogenesis are located in the periplasm of bacteria. Thus, cytochrome c apopolypeptide targeting to the periplasm is a physiologically essential feature for its maturation. This is also indicated by the presence of a typical signal peptide in the precursor of bacterial soluble cytochrome c. Requirement of the signal peptide was experimentally verified by heterologous gene expression of Paracoccus denitrificans cytochrome c-550 in E. coli; the holoprotein is produced in the periplasm if the gene retains the coding region for a native signal peptide, but the cytoplasmic apoprotein is produced if this signal is removed. Even a mitochondrial soluble cytochrome c can be expressed as a holoprotein in the E. coli periplasm when the eukaryotic gene product is targeted to the periplasm by fusing the signal peptide of bacterial cytochrome c at the N-terminus.
Periplasmically expressed exogenous cytochromes c in host cells so far have spectrophotometric characteristics identical with those of the authentic proteins produced in the original organisms [8,23–26]. These findings indicate that the heme attachment mode and protein folding are correct during heterologous gene expression and protein maturation. Thus, heterologously expressed cytochromes c, which are ‘quality-controlled’ in the bacterial periplasm, can be used for further biochemical analysis as if we are dealing with the ‘native proteins’.
Control of production level
The yields of heterologously expressed cytochromes c may depend on the copy numbers of the plasmids used. In addition, coexpression with ccm genes is effective for higher levels of plasmid-borne cytochrome c gene expression. Not only the protein factors functioning in the periplasm, but also low-molecular-mass thiol/disulfide compounds, which can maintain the periplasmic redox balance, successfully control the cytochrome c production level. For instance, the yields of exogenous and endogenous cytochromes c reach about 10% of the total periplasmic protein fraction in E. coli with the addition of disulfide compounds to the medium. Furthermore, a certain E. coli strain, JCB7120, can produce exogenous holo-(PA c-551) up to 30% of the total periplasmic protein level, although the mechanism underlying this high expression in this strain is not yet known. Now, using bacterial expression systems, we can obtain large amounts of holocytochromes c, which in terms of visible absorption spectra and other properties are indistinguishable from the native proteins. This progress has facilitated mutagenesis and structure analyses of bacterial cytochromes c, including HT c-552 and PA c-551.
Mutagenesis studies for stability
In general, site-directed mutagenesis is a powerful tool for investigating the relationship between protein structure and function. Experimental techniques for mutagenesis are applicable to bacterial cytochrome c, as discussed above. We are able to dissect the molecular mechanisms of cytochrome c, in terms of biogenesis, protein folding, redox properties, electron-transfer kinetics, and stability, at the amino-acid residue level through mutagenesis studies. In this section, we describe successful mutagenesis using PA c-551 variants modeled by the homologous and more stable HT c-552.
Amino-acid residues responsible for stability
As HT c-552 and PA c-551 have almost identical main-chain folding, subtle differences in the side-chain interactions must explain the remarkable difference in their stabilities. By careful comparison of their 3D structures, we found that aromatic amino-acid interactions uniquely occur between Arg37 and Tyr34 and/or Tyr43, the latter also having hydrophobic contacts with the side chains of Tyr34, Ala40, and Leu44 in HT c-552. These interactions are not found in PA c-551. Small hydrophobic cores formed by the side chains of Ala7, Met13, and Ile78 in HT c-552 are more tightly packed than the corresponding regions formed by Phe7, Val13, and Val78 in PA c-551. All these residues are distributed in three separate regions (Fig. 3). We expected that these multiple residues spread over the separate regions of HT c-552 would contribute to overall protein stability in an additive manner, and hoped that we would be able to clearly determine the stabilizing factors by mutagenesis studies.
Engineering stable proteins
To characterize the factors that affect protein stability, we attempted to achieve maximal enhancement of the stability of PA c-551 by introducing minimal mutations into spatially separate regions. Five amino-acid residues in PA c-551, which were selected by structure comparison, were substituted with those found at the corresponding positions in HT c-552, and the stabilities of the resulting PA c-551 mutants were measured. A single mutation [Val78 to Ile (V78I)] and double mutations [Phe7 to Ala/Val13 to Met (F7A/V13M) and Phe34 to Tyr/Glu43 to Tyr (F34Y/E43Y)] in the three regions of PA c-551 each individually enhanced the overall protein stability. These studies, together with structure analysis, provided substantial clues to the mechanism of protein stability in HT c-552. Ala7/Met13 and Ile78 fill small spaces present in the corresponding regions of PA c-551, and Tyr34/Tyr43 cause a favorable electrostatic interaction. These side-chain interactions may contribute to the enhanced stability of HT c-552. It would be worth trying to mutate HT c-552 so that it has the amino-acid residues found in PA c-551, and to examine whether the resulting HT c-552 mutants have decreased stability.
Surprisingly, a PA c-551 variant simultaneously carrying the five mutations in the three separate regions (F7A/V13M/F34Y/E43Y/V78I, quintuple mutant, Fig. 3) exhibits almost the same stability as that of natural HT c-552 (Fig. 4). This demonstrates that it is possible to convert a mesophilic protein into an artificial one with stability similar to that of the natural thermophile by replacing a few amino-acid residues. Therefore, the thermophilic character of HT c-552 may depend on a few strong noncovalent interactions. We further found that the increase in the stability of the quintuple mutant is almost the same as the sum of the increases produced by the mutations in the three individual regions [9,29]. Thus, the mutation(s) in each of the three regions contribute to the overall stability in an additive manner. The multiple mutations in the separate regions of PA c-551 provide experimental evidence on the mechanism underlying the enhanced stability of HT c-552 and on the relationship between local side-chain interactions and overall protein stability.
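The additivity argument can be made concrete with a small calculation: if the three regions act independently, the stability gain of the combined mutant should be close to the sum of the gains of the regional mutants. The sketch below is only an illustration; the function name `additivity_gap` and all numerical values are hypothetical placeholders, not measurements from the studies cited here.

```python
def additivity_gap(regional_gains, combined_gain):
    """Return combined gain minus the sum of regional gains.

    A value near zero is what purely additive contributions would give.
    All gains must be in the same unit (for example kJ/mol or degrees C).
    """
    expected_if_additive = sum(regional_gains)
    return combined_gain - expected_if_additive


# HYPOTHETICAL stability gains relative to wild-type PA c-551 (arbitrary units).
hypothetical_regional = {
    "F7A/V13M": 10.0,
    "F34Y/E43Y": 8.0,
    "V78I": 5.0,
}
hypothetical_quintuple = 22.0

gap = additivity_gap(hypothetical_regional.values(), hypothetical_quintuple)
print(f"deviation from additivity: {gap:+.1f}")
```

A deviation that is small relative to the experimental error would support independent, additive contributions; a large deviation would point to coupling between the regions.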
The rational design resulting from careful structural comparison of HT c-552 and PA c-551 makes it possible to select a set of amino-acid residues that are completely responsible for the stability of a thermophilic protein. Only a small number of mutant proteins is required to experimentally confirm which amino acids are responsible for overall protein stability. This is the advantage of structural comparison of highly homologous proteins. If we had instead selected five of the 35 amino-acid residues that differ between the two cytochromes c at random, we would have had to test (35 × 34 × 33 × 32 × 31)/(5 × 4 × 3 × 2 × 1) = 324 632 variants.
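The count of variants quoted above is the binomial coefficient "35 choose 5", and it can be checked in a few lines. The helper name `variants_to_screen` is introduced only for this sketch.

```python
import math


def variants_to_screen(differing_positions: int, substitutions: int) -> int:
    """Number of distinct sets of `substitutions` positions drawn from
    `differing_positions` positions, ignoring order."""
    return math.comb(differing_positions, substitutions)


# Explicit product form used in the text: (35*34*33*32*31) / (5*4*3*2*1)
explicit = (35 * 34 * 33 * 32 * 31) // (5 * 4 * 3 * 2 * 1)

print(variants_to_screen(35, 5))  # 324632
print(explicit)                   # 324632
```

The same function shows how quickly the search space grows with the number of substitutions, which is why structure-guided selection of a handful of positions is so much more economical than an exhaustive screen.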
What we can learn from thermophilic proteins
The strategies used by thermophilic proteins to enhance stability are, for example, relatively increased polarity of the solvent-exposed surface area, increased packing density and core hydrophobicity, and generation of ion pairs or hydrogen bonds between polar residues [30–32]. However, these interactions are often related to each other in a protein molecule, and subtle changes in them can affect the overall stability. Thus, it is usually difficult to identify the exact factors that contribute to the enhanced stability of the proteins from thermophilic bacteria. This is reasonable because proteins in the native state are stabilized by 10–20 kJ·mol−1 compared with those in the denatured state; the energy is equivalent to the formation of only a few hydrogen bonds. Therefore, to understand protein stability, it is necessary to carry out precise comparative studies using homologues exhibiting high sequence identity with similar 3D structures, but differing in stability. From such comparisons, we must carefully detect subtle differences in side-chain interactions, and examine their contributions to the overall protein stability by mutagenesis studies. If the interactions are spatially separated, their contributions may be additive, in which case we can clearly identify protein-stabilizing factors. The combination of precise comparison of the structures of thermophilic and mesophilic homologous proteins and selection for multiple mutations in separate regions is a valuable approach to elucidating the relationship between structure and stability. This approach has been successfully applied to cytochrome c (as discussed above), triose phosphate isomerase, ribonuclease HI, and cold shock protein.
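To give a feel for how small a net stabilization of 10–20 kJ·mol−1 is, the sketch below converts an unfolding free energy into the fraction of molecules in the native state. It assumes a simple two-state (native/denatured) equilibrium at a fixed temperature, which is an idealization introduced for this example rather than a model taken from the studies discussed; the function name `fraction_folded` is likewise hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def fraction_folded(delta_g_unfold_kj_per_mol: float, temp_kelvin: float = 298.15) -> float:
    """Fraction of native molecules in an assumed two-state equilibrium.

    delta_g_unfold_kj_per_mol is the free energy of unfolding (positive when
    the native state is favoured), so K_fold = exp(dG_unfold / RT).
    """
    k_fold = math.exp(delta_g_unfold_kj_per_mol * 1000.0 / (GAS_CONSTANT * temp_kelvin))
    return k_fold / (1.0 + k_fold)


for dg in (5.0, 10.0, 20.0):
    print(f"dG_unfold = {dg:4.1f} kJ/mol -> fraction folded = {fraction_folded(dg):.4f}")
```

Under this assumption, a change of only a few kJ·mol−1 (the scale of a single side-chain interaction) shifts the native population noticeably, which is consistent with the point that subtle local differences can account for large differences in overall stability.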
Recent advances in genome projects have revealed the gene resources of thermophilic bacteria, providing further opportunity for systematic comparisons of homologous proteins. A similar approach to that used for cytochromes c will be applicable to other proteins. Not only sequence comparison to identify the determinants of protein stability, but also the thermophilic proteins themselves can be used to elucidate the basic mechanisms underlying protein functions because they are usually purified and handled more easily than their mesophilic counterparts. Although thermophilic proteins are isolated from diverse extreme environments, they should reveal general features of protein structure, function, and stability.
Mutagenesis for maturation studies
In addition to protein stability, mutagenesis studies with cytochrome c have contributed to the understanding of protein maturation. As discussed above, physiological attachment of heme to apo-(cytochrome c) takes place in the periplasm. An exception to periplasmic covalent heme attachment was first found for HT c-552 a decade ago through a mutagenesis study. This unique case has unexpectedly shed light on the chemical aspects of heme attachment and apoprotein folding through the follow-up experiments.
Heterologous holo-(HT c-552), which has the heme attached covalently, is found in the cytoplasm when the truncated gene coding for the mature protein without the original signal sequence is transformed into E. coli [23,36]. An apo-(HT c-552) variant carrying mutations at the heme covalent binding site (C12A/C15A) has also been expressed in the E. coli cytoplasm. This gene product was found to have a compactly folded structure, which apparently differs from that of the natural holoprotein with the heme attached covalently. The ‘folded’ apoprotein aggregates into amyloid fibrillar structures over a long time period, but, in the presence of excess heme, it retains the prosthetic group noncovalently like a b-type cytochrome. In contrast, mesophilic apocytochromes c seem to form a random coil structure, and holoprotein formation does not occur in the bacterial cytoplasm. These observations suggest that apo-(HT c-552) has a sufficiently ‘folded’ structure to incorporate a heme at moderate temperature, possibly because of its thermostable properties. After insertion of the heme, cytochrome c folding occurs. Therefore, heme is not only required for the redox properties of cytochrome c, but is also essential for correct protein folding during cytochrome c biogenesis.
The unique case of cytoplasmic heme attachment also leads to the hypothesis that covalent thioether bond formation itself can proceed spontaneously without enzymatic assistance once the heme is inserted into the apoprotein [23,40]. Recently, cytochrome c-552 from a thermophile, Thermus thermophilus (TT c-552), was produced as a holoprotein in the E. coli cytoplasm, similar to the case of HT c-552. Although the cytoplasmic holo-(TT c-552) has the same function and spectra as the authentic one, the cytoplasmic soluble protein fraction also contains a minor product, which has a heme attached covalently but differs in heme-binding mode. This heterogeneity found in the cytoplasmic products also appears to indicate that the heme attachment itself is not catalyzed enzymatically.
Conclusions and perspective
Highly homologous monoheme Class I cytochromes c are available from a wide range of bacteria, from mesophiles to thermophiles. Their small size and high sequence identity are advantageous for determining the amino-acid residues responsible for protein stability. The heterologous expression systems established for cytochrome c, together with the results from rapidly increasing X-ray crystal and NMR analyses, have stimulated mutagenesis studies, which contribute to the understanding of the mechanisms underlying protein maturation, folding, and stability.
A mutagenesis study has also been carried out on cytochrome c to investigate its redox properties. The redox function of stable HT c-552 will be promising for electrochemical applications, such as the creation of a useful molecular device. A cytochrome c folding study will also reveal the basic features of protein conformational diseases, as first demonstrated with a HT c-552 mutant. Various technologies, such as NMR relaxation analysis, temperature jump methods, high pressure NMR, and stopped flow and single-molecule analyses, have been established, and are applicable to the elucidation of the dynamic features of cytochromes c. It is of interest to characterize cytochrome c with respect to protein maturation, folding, stability, and function using a variety of combined experimental techniques. Cytochrome c molecules will become very well understood through such interdisciplinary methods.
We thank Ikuo Ueda for his support and encouragement, and Kazuaki Nishio and Yuko Iko for critical reading of the manuscript.