Concepts and tools to exploit the potential of bacterial inclusion bodies in protein science and biotechnology


S. M. Doglia, M. Lotti, Department of Biotechnology and Biosciences, State University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milano, Italy
Fax: +39 02 64483565
Tel: +39 02 64483459


Cells have evolved complex and overlapping mechanisms to protect their proteins from aggregation. However, these defences can fail for several reasons, among them mutations, stress conditions and high rates of protein synthesis, all common consequences of heterologous protein production. As a result, in the bacterial cytoplasm several recombinant proteins aggregate as insoluble inclusion bodies. The recent discovery that aggregated proteins can retain native-like conformation and biological activity has opened the way for a dramatic change in the means by which intracellular aggregation is approached and exploited. This paper summarizes recent studies towards the direct use of inclusion bodies in biotechnology and for the detection of bottlenecks in the folding pathways of specific proteins. We also review the major biophysical methods available for revealing fine structural details of aggregated proteins and the information that can be obtained through these techniques.


DAAO, D-amino acid oxidase


GFP, green fluorescent protein


IB, inclusion body


TF, trigger factor

Protein aggregation in the bacterial cytoplasm: regulation, override and effects

It is estimated that the global macromolecule concentration in the Escherichia coli cytoplasm is around 200–400 g·L−1 and that macromolecules occupy 20–30% of the total cytoplasmic volume [1,2]. Individual proteins are present at relatively low concentrations (nM to μM), but in the crowded cytoplasm the distance between any two molecules is of the same order as the dimensions of the proteins themselves [3]. Crowding increases non-specific, attractive and electrostatic interactions and modifies diffusion rates, with detrimental effects on the behaviour of all macromolecules [4]. In these conditions, folding becomes a kinetic race against aggregation: although the native state is thermodynamically favoured [5], aggregation can trap folding intermediates in non-native regions of the folding landscape that, in the absence of further control mechanisms, would irreversibly lead to the formation of aggregates (for an excellent review on protein folding in the cytoplasm see [6] and references therein).
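The crowding figures quoted above can be turned into rough numbers with a short calculation. The sketch below is only illustrative: the cell volume of 1 fL and the mean macromolecule mass of 40 kDa are assumptions introduced here, not values taken from the cited references, and all macromolecules are treated as if they were evenly spaced.

```python
AVOGADRO = 6.022e23  # molecules per mole

def molecules_per_cell(conc_molar, cell_volume_l=1e-15):
    """Copies of one species per cell (assumed cell volume: 1 fL)."""
    return conc_molar * AVOGADRO * cell_volume_l

def mean_spacing_nm(mass_conc_g_per_l, mean_mass_da=40_000):
    """Mean centre-to-centre spacing for evenly distributed molecules.

    mean_mass_da is an assumed average macromolecule mass (40 kDa).
    """
    number_density_per_l = mass_conc_g_per_l / mean_mass_da * AVOGADRO
    volume_per_molecule_nm3 = 1e24 / number_density_per_l  # 1 L = 1e24 nm^3
    return volume_per_molecule_nm3 ** (1 / 3)

for conc in (1e-9, 1e-6):
    print(f"{conc:.0e} M -> about {molecules_per_cell(conc):.1f} copies per cell")
for g_per_l in (200, 400):
    print(f"{g_per_l} g/L -> spacing of about {mean_spacing_nm(g_per_l):.1f} nm")
```

Under these assumptions the spacing comes out at a few nanometres, which is the size range of a typical globular protein, while a nanomolar species corresponds to roughly one copy per cell.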

As translation is a relatively slow process (it can take up to 75 s to synthesize a protein 300 amino acids long) and proteins larger than around 100 amino acids fold slowly [7], cells have developed a series of mechanisms to avoid the exposure of aggregation-prone proteins to the cytoplasm. As a first line of defence, around 40 amino acids of the nascent polypeptide can be accommodated inside the ribosome exit tunnel and it has been demonstrated that secondary (mainly helical) structure formation is possible inside the tunnel [8]. Outside the ribosome, de novo folding of a growing chain is facilitated by a number of chaperones: the trigger factor (TF), the DnaK–DnaJ–GrpE system and the GroEL–GroES pair. A comprehensive review of the folding process is beyond the scope of this paper and can be found in [9,10] and references therein. The folding machinery allows most proteins to efficiently reach their native state but, even in non-stress conditions, some molecules fail to do so. When the folding machinery fails, cells deal with unfolded proteins through alternative mechanisms. Holding chaperones (IbpA/B, Hsp31 and Hsp33) temporarily bind misfolded polypeptides on their surfaces and present them to DnaK/J or GroEL/ES. AAA+ proteases act on formed aggregates, triggering the degradation of misfolded proteins, while ClpB releases them from inclusion bodies (IBs) and presents unfolded polypeptides to the (re)folding machinery. Altogether, under physiological conditions, this quality control system can sense, react to, control and reduce to negligible levels the amount of partially unfolded and aggregated proteins in the E. coli cytoplasm (Fig. 1A).
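The time scales mentioned above follow from simple arithmetic. The sketch below takes the slow-end figure of 75 s for 300 residues and the tunnel capacity of about 40 residues from the text; the assumption of a uniform elongation rate is a simplification made here for illustration.

```python
def elongation_rate(residues=300, seconds=75.0):
    """Average elongation rate in residues per second."""
    return residues / seconds

def emergence_delay(rate, tunnel_residues=40):
    """Seconds between incorporation of a residue and its exit from the tunnel."""
    return tunnel_residues / rate

def time_until_exposed(last_residue, rate, tunnel_residues=40):
    """Seconds from initiation until residue number `last_residue` has left the tunnel."""
    return (last_residue + tunnel_residues) / rate

rate = elongation_rate()
print(f"rate: {rate:.1f} residues per second")
print(f"delay inside the tunnel: {emergence_delay(rate):.0f} s")
print(f"first 100 residues fully exposed after {time_until_exposed(100, rate):.0f} s")
```

At this rate a 100-residue segment is not fully outside the ribosome until tens of seconds after initiation, which is the window in which the chaperones listed above act on the growing chain.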

Figure 1.

 Protein biosynthesis and aggregation under normal and stress conditions. (A) Under normal conditions, nascent polypeptides either can fold autonomously or require the help of folding chaperones. Aberrant protein products due to translation errors and misfolding are handled by the quality control system, composed of refolding chaperones and proteases. The system is energetically demanding (most processes are ATP-dependent) but drives the equilibrium towards the native, folded state [10]. (B) Under most stress conditions equilibrium is shifted toward the formation of aberrant products (red lines). This is naturally counteracted by cellular optimization strategies already present at the source (DNA, protein sequences and regulation of expression levels) or induced upon exposure to stress conditions (upregulation of the quality control machinery). Heterologous protein overproduction, however, can further affect this delicate balance by competing for available resources (ribosomes, chaperones but also ATP).

Stress conditions, however, impair the cellular quality control system, so that misfolded proteins accumulate in the cytoplasm as insoluble aggregates, or IBs (Fig. 1B). In the case of E. coli and other bacteria used as microbial cell factories, the main stress conditions are ageing, rate of protein synthesis, mutations and aberrant protein biogenesis, environmental (usually heat or oxidative) stress and heterologous protein production.

Ageing is mostly known to induce protein aggregation-related diseases in higher eukaryotes but there is evidence for age-dependent protein aggregation also in bacterial cells [11] and mechanisms to neutralize it have been characterized in E. coli [12]. If IBs are present in a cell, as they tend to aggregate at one extremity of the bacterium, cell division will produce an IB-free cell (healthier, young and with higher growth rate) and an IB-containing one that will grow more slowly [13]. Half of the bacterial progeny will thus have better fitness: ageing is not avoided at the single-cell level but at the population level.

Rate and ‘quality’ of protein synthesis can favour misfolding over folding. Intuitively, an increase in the concentration of nascent polypeptides makes the folding process more demanding and this is indeed naturally counteracted by the increase in chaperone concentration in exponentially growing cells [2]. Rate can, however, be increased above tolerable limits by mutations and overexpression, as discussed in the next paragraphs. Also, ‘quality’ of protein synthesis is affected by environmental stress due to an increase in the rate of translational errors, amino acid misincorporation, premature chain truncation and incomplete modifications. Such aberrant molecules accumulate in the cytoplasm and increase protein aggregation [14].

In general, mutations are retained only if folding propensity remains above a critical point, independently of the advantage that they would provide to the host [15]. Mutations can, however, affect aggregation even if protein activity is not compromised and even if no amino acid replacement is introduced: the DNA sequence itself determines the rate of synthesis. There is indeed evidence for genomic-level optimization of protein folding at both DNA and protein sequence levels. The distribution of codons in mRNAs has been found to be unbalanced: as the first 30–50 codons have low translation efficiency, translation has a slow start, reducing ribosome clashing and translation stalling and eventually favouring folding [16]. At protein level, regions in the primary sequence that are intrinsically aggregation-prone correlate with those having low folding propensity (and with chaperone dependence) [17]. The localization of ‘fast’ codons around those regions [18] is believed to kinetically promote folding and the burial of aggregation-prone patches in the core of the native structure. Although DNA and protein sequences evolved to optimize translation and folding efficiency, even a single silent mutation can induce IB formation, while amino acid replacements that alter the chemical properties of the polypeptide will easily result in increased aggregation propensity. During recombinant protein production, heterologous proteins will not have their sequence optimized for expression in E. coli and therefore suffer from poor folding efficiency even if expression levels are kept low.

In microbial cell factories, overproduced proteins can represent up to 90% of the total protein content and cause the failure of the quality control system, which results first in the accumulation of misfolded proteins and eventually in the formation of IBs. This process is highly protein-dependent, driven by DNA and protein sequences, as discussed above, but can also be affected by specific folding requirements (e.g. disulfide bonds) or by demands that exceed the folding capability of E. coli. Other causes of aggregation are heat or oxidative stresses, environmental conditions that cells are likely to face in natural environments and biotechnological applications. Growth above optimal temperature eventually results in massive protein unfolding while reactive oxygen species cause fragmentation and chemical modification of side chains. Both these events raise the aggregation propensity of proteins in the cytoplasm, either by increasing hydrophobic patch exposure or by altering protein chemical properties that can result in crosslinking and misfolding.

Recombinant protein production might induce aggregation and elicit stress responses

Heterologous protein production is by itself a cause of toxicity for cells, independently of the nature of the recombinant protein. Energy depletion is the most immediate result and is due to the synthesis of both the overproduced protein and the upregulated proteins involved in stress responses. If degradation of the heterologous protein occurs, even higher energy consumption will result in little product accumulation at the expense of biomass and growth rate. Also, aminoacylated-tRNA depletion triggers the stringent response [19,20] that causes the downregulation of the protein and amino acid biosynthesis machinery. In a condition of limited resources for protein biosynthesis, competition is won by the recombinant mRNA, causing a decrease in housekeeping mechanisms (e.g. DNA and protein synthesis), rearrangements in cellular catabolic rates and slower, if any, growth rate [21] (Fig. 1B). The DNA damage-induced SOS response is also reported to be activated and, although there is no agreement about how protein overproduction triggers this response, it is likely that elevated transcription rates of plasmid-encoded genes cause DNA damage in cells [22].

While these effects occur ubiquitously, overproduced proteins have been reported to specifically trigger different cellular responses depending on their properties, particularly with regard to aggregation propensity. Reports on the upregulation of the quality control system upon the accumulation of misfolded proteins in the cytoplasm suggest that this mechanism shares similar features with the heat-shock response, which causes the upregulation of genes controlled by the transcription factor σ32. σ32 regulates the expression of genes coding for known heat-shock proteins (which include chaperones and proteases) and its own activity depends on the same chaperones that it regulates [23,24]. It is believed that, under non-stress conditions, chaperones act as anti-sigma factors, inhibiting σ32 activity through an induced conformational change [24,25]. When the number of misfolded proteins increases in the cell, chaperones are saturated and the equilibrium shifts toward the free form of σ32, leading to induction of the stress response.
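The titration logic described above can be caricatured in a few lines. This is a toy sketch only, not a model taken from the cited work: it assumes that misfolded proteins sequester chaperones one-to-one, that σ32 binds the remaining free chaperone with a single dissociation constant, and that σ32 is too scarce to deplete the chaperone pool. The function name, the parameter values and the arbitrary concentration units are all hypothetical.

```python
def free_sigma32_fraction(chaperone_total, misfolded, kd=1.0):
    """Fraction of sigma32 not bound to chaperone (toy model).

    chaperone_total, misfolded and kd share the same arbitrary units.
    Misfolded protein is assumed to sequester chaperone one-to-one.
    """
    free_chaperone = max(chaperone_total - misfolded, 0.0)
    return kd / (kd + free_chaperone)

# Illustrative sweep: the chaperone pool is fixed at 10 units.
for misfolded in (0, 2, 5, 8, 10, 15):
    fraction = free_sigma32_fraction(10.0, misfolded)
    print(f"misfolded = {misfolded:>2} -> free sigma32 fraction = {fraction:.2f}")
```

In this picture the free fraction of σ32 stays low while chaperones are in excess and rises steeply as the misfolded load approaches the size of the chaperone pool, which is the qualitative behaviour the paragraph describes.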

Nevertheless, the nature and variability of the recombinant protein stress response suggests a far more complex and adjustable ‘heat-shock-like’ mechanism [26]. The normal heat-shock response is transient, fading away shortly after cells are released from stress, but increased synthesis rates of DnaK, GroEL chaperones and Lon (the main heat-shock protease) have been found to last for the whole length of overproduction. The extent and kinetics of the heat-shock-like response vary among different production systems and are influenced by the nature of the protein synthesized: while energy metabolism, SOS response, nutrient uptake and the core of the heat-shock response undergo comparable changes, different recombinant proteins have distinct impacts on intracellular stress control and growth rates [27–29]. The small heat-shock proteins IbpA and IbpB, for example, are upregulated exclusively when proteins accumulate as IBs, inhibit IB degradation and reduce the stress response, thus favouring growth [30,31]. A membrane and a membrane-bound recombinant protein have opposite effects on growth rate but activate the same stress-response pattern both at cytoplasm and envelope level [32]. Conversely, in the cytoplasm recombinant proteins with different aggregation profiles increase the abundance of the same set of envelope proteins while membrane composition and permeability specifically react to the aggregation state of the recombinant protein. It has been suggested that the cell membrane might react with exquisite sensitivity not only to aggregation but even to the complexity of the aggregates (whether soluble aggregates or large insoluble IBs) and that membrane lipids may act as a second stress sensor responsive to the aggregation state of the recombinant protein [33,34].

The bright side of IBs: from recombinant protein reservoir to tools for basic investigation and direct application in biotechnology

Before the last decade, the properties of protein aggregates received little attention, while most studies pursued either solubility improvement or denaturation/renaturation of purified IBs. Within the first line, the most successful techniques are fusion with solubility tags, use of molecular and chemical chaperones and modulation of the expression conditions to reduce the rate of protein biosynthesis [9,35–37], whereas in the second major efforts are devoted to optimizing the refolding process so as to regain the highest biological activity (reviewed in [38]). Only during the last decade has a deeper knowledge of the structural and functional properties of IBs drawn researchers’ attention to the possibility of controlling the conformation of aggregated proteins, paving the way for the use of IBs in a series of studies and applications that were difficult to envisage only a few years ago.

Such developments require that IBs can be characterized in fine detail and that their structure and aggregation process can be monitored and controlled. Having structural information in hand would enable these methods to be applied in an informed fashion and thus allow a fine modulation of the aggregation process. In the next section we describe and illustrate with some examples the major tools available for the structural analysis of proteins within aggregates and of aggregates within cells. Complementary to this goal are computational methods allowing the identification of aggregation-prone regions within protein primary sequences (reviewed by Hamodrakas in this issue).

Structural properties of IBs: a review of the methods

We provide in the following an updated view about the principal biophysical methods available for the characterization of proteins aggregated in IBs and summarize the information generated by their application (Fig. 2).

Figure 2.

 Methods for the characterization of IBs. (A) Scheme of IB formation and structural properties. Folding intermediates form soluble aggregates that merge in one or two IBs per cell. The polypeptides embedded in IBs can retain native-like structure and activity. Moreover, IBs can acquire amyloid-like features. Possible applications related to the peculiar IB structural properties are indicated. (B) Principal methods of investigation of IB formation and characterization.

The aggregation of recombinant proteins can be monitored in vivo by fluorescence spectroscopy and microscopy if the target protein is fused to a fluorescent partner such as the green fluorescent protein (GFP) or its variants [39]. Using this approach it was determined that multiple, small and soluble aggregates form at early stages of the process while, at later times, these assemblies merge into one or two large aggregates localized at the poles of the cells [39,40]. In vivo aggregation can also be monitored in real time by labelling the target protein with the tetra-Cys sequence tag (Cys-Cys-X-X-Cys-Cys) that specifically binds a fluorescein analogue containing two arsenoxides (FlAsH). In this approach, the tetra-Cys motif is introduced by mutagenesis into the protein sequence at a specific position where its accessibility and binding to FlAsH will depend on the folding state of the protein. In this way, FlAsH fluorescence reports on protein stability and aggregation within cells [41]. Other applications of fluorescence-based analysis rely on proteins within IBs retaining native-like structure and activity. For example, it was shown that in IBs formed by a GFP-fusion protein fluorescence emission was higher in the core of the aggregates than in their external shell [42]. This observation ruled out the possibility that the biological activity retained by IBs depends on native-like proteins passively trapped in the aggregate and instead attributed this distribution to the specific mechanisms of protein deposition and removal, and further suggested that aggregated proteins can complete their folding and activation process once deposited in IBs [42]. Protein–protein interactions within IBs have also been studied using higher resolution fluorescence approaches such as Förster resonance energy transfer (FRET), in which interacting proteins are labelled by two different fluorescent probes [43]. Higher FRET efficiency was obtained when the two probes were fused to the same peptide rather than to different ones, suggesting that the process of aggregation is highly protein-specific [44]. The spatial resolution of optical microscopies, including fluorescence microscopy, is of the order of 0.1 μm (in the image X, Y plane) due to the diffraction limit of the employed light. Even in laser scanning confocal microscopy, the highest resolution obtained in the Z direction is about 0.5 μm [45].
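The reason FRET reports on contacts far below the optical resolution quoted above is its steep distance dependence, E = 1 / (1 + (r/R0)^6). The sketch below evaluates this standard relation; the Förster radius of 5 nm is an assumed, typical order of magnitude for fluorescent protein pairs and is not a value taken from the cited studies.

```python
def fret_efficiency(distance_nm, forster_radius_nm=5.0):
    """Energy transfer efficiency for a donor-acceptor pair.

    forster_radius_nm (R0) is the distance at which efficiency is 50%;
    the default of 5 nm is an illustrative assumption.
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:>4} nm -> E = {fret_efficiency(r):.3f}")
```

Efficiency falls from nearly complete transfer to almost none within a few nanometres, so a measurable FRET signal indicates that the two labelled polypeptides are packed at molecular distances inside the aggregate.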

Electron and atomic force microscopies reach a nanometric – and even subnanometric – resolution but they rely on a more invasive approach to the sample. In transmission electron microscopy, thin sections of fixed cells show IBs as spherical or ellipsoidal electron dense structures [46,47] and purified IBs appear as spherical, ellipsoidal or cylindrical particles of 0.5–1.8 μm characterized by a smooth and porous surface in both scanning and transmission electron microscopy (Fig. 3) [46,48]. The porous structure of IBs, also confirmed by sedimentation techniques [49], is of relevance in view of a direct application of active aggregates in biocatalysis: thanks to the porous and hydrated IB structure, substrates and products can diffuse in and out, making IBs useful depositories of highly purified enzymes. Electron microscopy was also applied to studying the shape and surface to volume ratio of protein aggregates used as biomaterials in applications where these features are of relevance [50]. Furthermore, both electron microscopy and, in particular, atomic force microscopy image the surface morphology of the sample at nanometric resolution [51] and allowed amyloid-like fibrils to be detected in freshly purified IBs of the human bone morphogenetic protein-2 (fragment 13–74) [52] and of the prion of the filamentous fungus Podospora anserina HET-s (fragment 218–289) [53]. Fibrillar structures became more evident after IB incubation at 37 °C for 12 h [52] or in the presence of proteinase K [44,54].

Figure 3.

 Transmission electron micrograph of IBs within E. coli cells. The picture shows IBs formed by GFP fused to an aggregation-prone domain and the immunolocalization of GFP. Courtesy of Elena García-Fruitós and Antonio Villaverde.

The structural properties of IBs at molecular level have been investigated at a resolution ranging from protein backbone conformations to single residues by several optical spectroscopies, such as FTIR, Raman, CD and fluorescence, as well as by NMR and X-ray diffraction.

FTIR spectroscopy allows the study of protein secondary structures and aggregation through the analysis of the amide I band, occurring in the 1700–1600 cm−1 absorption region, which is due to the C=O stretching vibration of the peptide bond (reviewed in [55] and references therein). Absorption of the different secondary structures of the proteins overlaps in this spectral range and can be resolved by resolution enhancement approaches, such as the second derivative analysis of the spectra. In this way, the secondary structure components appear as negative peaks in the derivative spectrum and each peak can be assigned according to its wavenumber. For instance, in water α-helices and random coils absorb between 1660 and 1648 cm−1, intramolecular β-sheets between 1640 and 1623 cm−1 and around 1686 cm−1, whereas intermolecular β-sheet absorption in protein aggregates is found between 1630 and 1620 cm−1 and around 1695 cm−1. FTIR (micro)spectroscopy allows protein secondary structures and aggregation to be studied also within complex biological systems, e.g. whole intact cells [56–58], tissues [59] and whole organisms [60]. Moreover, changes in the intensity of the aggregate spectral component around 1625 cm−1 have been used to follow the kinetics of IB formation within a growing culture of E. coli. To exemplify this approach, Fig. 4A reports the second derivative spectrum of E. coli cells during production of a recombinant lipase. Six hours after induction at 37 °C the protein is mainly deposited in aggregates, as can easily be determined based on the appearance of a shoulder at ∼1627 cm−1 that has no counterpart in the control cells and is attributed to intermolecular β-sheet structures in protein aggregates. Subtraction of the spectrum of control cells allowed the spectral component (1627 cm−1) unique to aggregates to be resolved in more detail (Fig. 4B). It also allowed the kinetics of IB formation to be monitored and compared at two temperatures, 37 and 27 °C, the latter being compatible with the partitioning of the recombinant protein between the soluble and insoluble fractions [57]. Spectra of IBs (Fig. 4C) purified from cells revealed that the intermolecular β-sheet component of protein aggregates, peaking at 1627 cm−1, was higher at the higher temperature, while proteins embedded in IBs formed at 27 °C retained more native-like α-helical content (∼1656 cm−1). These results suggest FTIR (micro)spectroscopy as a technique of choice also in the study of the influence of the physiology of expression (e.g. temperature, induction, formation of disulfide bonds) on the kinetics of aggregation and on the structure of aggregated proteins [57,61].
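The second derivative approach described above can be illustrated on a synthetic amide I band. The sketch below is not the procedure used for Fig. 4: the spectrum is simulated as two Gaussian components (band positions from the text, heights and widths chosen arbitrarily), the derivative is taken by plain finite differences, and the ±3 cm−1 window for the bands quoted as "around" a value is an assumption. Measured spectra are noisy and would normally be smoothed first, for example with a Savitzky-Golay filter.

```python
import math

# Band ranges (cm-1) quoted in the text for proteins in water. The +/-3 cm-1
# window for bands described as "around" a value is an assumption.
ASSIGNMENTS = [
    ("alpha-helix / random coil", 1648, 1660),
    ("intramolecular beta-sheet", 1623, 1640),
    ("intramolecular beta-sheet", 1683, 1689),
    ("intermolecular beta-sheet (aggregates)", 1620, 1630),
    ("intermolecular beta-sheet (aggregates)", 1692, 1698),
]

def gaussian(x, centre, height, sigma):
    """Single Gaussian band used to build the synthetic spectrum."""
    return height * math.exp(-((x - centre) ** 2) / (2 * sigma ** 2))

def second_derivative(y, step):
    """Central finite differences; the result is two points shorter than y."""
    return [(y[i - 1] - 2 * y[i] + y[i + 1]) / step ** 2
            for i in range(1, len(y) - 1)]

def negative_peaks(x, d2, rel_threshold=0.1):
    """Local minima of d2 deeper than a fraction of the deepest minimum."""
    floor = rel_threshold * min(d2)
    return [x[i] for i in range(1, len(d2) - 1)
            if d2[i] < d2[i - 1] and d2[i] < d2[i + 1] and d2[i] < floor]

def assign(wavenumber):
    """All secondary-structure labels whose range contains the wavenumber."""
    return [label for label, low, high in ASSIGNMENTS if low <= wavenumber <= high]

# Synthetic amide I band: helical component plus a weaker aggregate component.
wavenumbers = [1600 + i for i in range(101)]  # 1600 to 1700 cm-1, 1 cm-1 step
absorbance = [gaussian(w, 1656, 1.0, 10) + gaussian(w, 1627, 0.4, 10)
              for w in wavenumbers]

d2 = second_derivative(absorbance, 1.0)
peaks = negative_peaks(wavenumbers[1:-1], d2)
for p in peaks:
    print(p, "cm-1:", ", ".join(assign(p)))
```

Because the ranges quoted for intramolecular and intermolecular β-sheets overlap between 1623 and 1630 cm−1, a peak in that window receives both labels; distinguishing them relies on comparison with control spectra, as done in Fig. 4.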

Figure 4.

 FTIR analysis of the aggregation of a recombinant protein in E. coli. (A) Second derivatives of the FTIR absorption spectra of E. coli cells synthesizing a recombinant lipase from Pseudomonas fragi (PFL) at 37 °C, 6 h after induction (continuous line), and of the control cells (dashed line). (B) Second derivative of the difference spectrum between cells producing the recombinant protein and control cells reported in (A) (continuous line). In this subtracted spectrum, the band at 1627 cm−1 due to intermolecular β-sheets in aggregates is well resolved, allowing the kinetics of IB formation within intact cells to be monitored. The same analysis performed at 27 °C is shown (dotted-dashed line). (C) Second derivative absorption spectra of IBs extracted 10 h after induction at 27 °C (dotted-dashed line) and 37 °C (continuous line).

Another vibrational technique that can be employed to characterize the structural properties of IBs is Raman (micro)spectroscopy, where the inelastic scattering of laser light from the sample is detected. Pioneering work of Przybycien et al. detected in IBs formed by recombinant β-lactamase an increased level of β-sheet structures and the retention of native-like α-helix content [62]. This technique can be considered complementary to FTIR spectroscopy, since the two methods detect different vibrational modes of the sample. Raman spectroscopy is more sensitive to the amino acid side chain response [63] while – as discussed above – FTIR is more sensitive to the backbone amide I vibrations. We believe that Raman (micro)spectroscopy could offer advantages still unexplored in IB studies, since relevant information on disulfide bond formation and on solvent accessibility of specific amino acid side chains can be obtained [63].

The presence of β-sheet structures in extracted IBs can also be detected by far-UV CD [52,54], although it is not easy to discriminate between intramolecular and intermolecular β-sheets. The use of this spectroscopic technique for the study of IB aggregates is often limited by the intrinsic insolubility of the samples, which causes strong light scattering and signal loss.

The characteristic presence of β-sheet structures within extracted IBs has also been confirmed by X-ray diffraction. Diffraction patterns typically display two circular reflections around 4.7 Å and 10.2 Å, assigned to the spacing between strands within a β-sheet and between β-sheets, respectively. The circular shape of these reflections has been suggested to arise from β-sheets that are not strongly aligned within IBs [52,64].
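The two spacings quoted above map onto scattering angles through Bragg's law, λ = 2d sin θ for a first-order reflection. The sketch below assumes a Cu Kα source (λ = 1.5418 Å), a common laboratory choice that is not stated in the text, so the resulting angles are illustrative only.

```python
import math

def two_theta_deg(d_spacing_angstrom, wavelength_angstrom=1.5418):
    """First-order Bragg scattering angle 2*theta in degrees.

    The default wavelength (Cu K-alpha) is an assumption.
    """
    sin_theta = wavelength_angstrom / (2 * d_spacing_angstrom)
    return 2 * math.degrees(math.asin(sin_theta))

for label, d in (("inter-strand", 4.7), ("inter-sheet", 10.2)):
    print(f"{label} spacing {d} A -> 2*theta of about {two_theta_deg(d):.1f} degrees")
```

The smaller inter-strand spacing appears at the wider angle, which is why the 4.7 Å reflection forms the outer ring and the 10.2 Å reflection the inner ring of the pattern.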

NMR spectroscopy has been widely applied in protein science, since it provides detailed structural information from the level of specific residues up to the three-dimensional structure of the protein. In particular, solid state NMR rotational-echo double-resonance (REDOR) has been applied to IBs, both extracted and within intact cells [65]. In this approach, the backbone carbonyl and nitrogen are labelled (13CO and 15N) for each amino acid, since its 13CO chemical shift allows information to be obtained on local conformation. In this way, Curtis-Fiske et al. were able to identify native α-helices of the N-terminal 185 residues of the functional domain of the HA2 subunit of the influenza virus hemagglutinin protein and to detect conformational heterogeneity of the protein within IBs [65]. NMR spectroscopy has also been applied to localize β-sheet structures in protein aggregates, mainly by hydrogen/deuterium (H/D) exchange experiments, which detect at the residue level the backbone amides that are protected from solvent exchange because they are involved in hydrogen bonds. The assignment of solvent-protected residues to β-sheet structures can also be obtained by other techniques such as CD and X-ray diffraction [52]. It is noteworthy that NMR-based approaches, such as solid-state NMR 13C–13C proton-driven spin diffusion and liquid-state NMR H/D exchange experiments, offer the unique possibility of comparing at the residue-specific level protein aggregates of different types, such as IBs, amyloid fibrils and thermal aggregates [53,64]. The outcomes of these NMR experiments could therefore allow the residue-specific structural properties of aggregates to be correlated with their functional features, such as enzymatic activity or cellular toxicity.

Exploitation of IBs in biotechnology and in protein science

It is widely recognized that proteins can aggregate in IBs in different folding states that can eventually coexist within the same aggregates. The conformation acquired within aggregates is dependent on the nature of the protein itself [66] but can also be controlled through the genetic background of the host cells and/or manipulation of the experimental conditions. This novel and in a way revolutionary knowledge has important consequences in the rationale of handling and studying IBs. The development of methods to control and monitor the process of aggregation allows for the production of aggregated proteins endowed with residual structure and biological activity that can find direct use in biotechnology. In addition, a detailed analysis of the mode of building and of the structure of aggregates can be useful to dissect pathways and bottlenecks in the folding of specific proteins, for example those containing disulfide bonds or requiring cofactors, multidomain proteins, fusion proteins. In the following we summarize recent progress in this field, whereas the use of IBs in the study of amyloid aggregation is developed in the accompanying review paper by García-Fruitós and colleagues.

Two very relevant accomplishments towards IB exploitation in biotechnology are based on the ability to enrich aggregates in native-like structured proteins making them suitable for direct use in biocatalysis and/or as a source of relatively pure proteins that can be released through mild solubilization. Given that aggregation often cannot be fully avoided – or is even considered an advantage – the same experimental ‘tricks’ developed to improve the solubility of recombinant proteins (reviewed in [35]) can be applied to produce IBs mostly composed of native-like, although not soluble, recombinant proteins.

The list of recombinant proteins that precipitate in IBs in a conformation permissive for biological activity has progressively grown since researchers started to measure this parameter and includes, among others, β-galactosidase [67], endoglucanase [68], GFP [69], a bacterial lipase [57], oxidases [70], kinases [71], phosphorylases [72], aldolases [73], transglutaminases [74] and the colony stimulating factor [75]. This knowledge soon generated the idea of directly using IBs in biocatalysis, thus avoiding the cumbersome step of resolubilization. Since recovery of IBs from cell extracts can be quite easily achieved, this method could be of broad scope, provided aggregated proteins retain enough biological activity. Unfortunately, the specific activity of soluble and aggregated proteins has so far been compared only sporadically, although the competitiveness of IB catalysis depends on the balance between a possible reduction of specific activity and the advantages produced by avoiding solubilization steps. Available data show that, depending on the protein and the production protocol, the biological activity of aggregates can vary from 11% [69] to nearly 100% [68] of that of the soluble counterpart.

IBs embedding native-like proteins are also proposed as a source of pure recombinant proteins that can be easily released upon mild treatments that avoid chemical disruption of cells and denaturation of the aggregates. Protein–protein interactions are in fact weaker and ‘relaxed’ IBs can be dissolved in mild detergent at low concentration. Since proteins have not been denatured during solubilization, there is no need to introduce refolding steps, which is of great advantage since solubilization/refolding is often a critical step in the production of recombinant proteins. Interestingly the approach has been successfully tested with proteins not related in their structure, among them the granulocyte colony stimulating factor, GFP and a truncated form of the tumour necrosis factor [76].

An innovative evolution towards IB-based catalysis exploits the idea of forcing otherwise soluble proteins to aggregate in IBs. This method is proposed as an alternative to the better known procedures of enzyme insolubilization via immobilization on carriers or via aggregation by crosslinking (reviewed in [77]). The protein of interest is fused to an aggregation-prone moiety promoting the aggregation of the chimeric polypeptide. The cellulose-binding module, a very poorly soluble protein, was used to induce intracellular deposition of the recombinant D-amino acid oxidase (DAAO) from Trigonopsis variabilis, an enzyme used in the synthesis of 7-amino cephalosporanic acid [70]. The observation that DAAO IBs retained specific activity close to that of the soluble enzyme and were resistant under conditions that inactivate free DAAO substantiated the feasibility of this approach, which was then applied also to a maltodextrin phosphorylase [72], a polyphosphate kinase [71] and a sialic acid aldolase [73]. Clearly, fusion with the cellulose-binding module did not interfere with the correct folding of the partner protein, which aggregated in a form endowed with biological activity (in the case of DAAO this also means the ability to bind the cofactor).

Within the same conceptual framework (making soluble proteins insoluble), other authors have developed a self-assembling complex in which IBs are formed through in vivo aggregation of polyhydroxybutyrate synthase PhaC carrying a negatively charged coil at its N-terminus [78]. Aggregates of this protein expose charged regions on their surface that can bind active soluble enzymes tagged at their C-terminus with a positively charged coil.

In both cases, the available examples are still too few to be generalized into a broadly applicable experimental approach. However, the importance of IBs as direct or indirect immobilization carriers might increase if, for instance, different enzymes or proteins can be combined in the same aggregate to build a multifunctional aggregated catalyst.

Last but not least, it should be considered that the pathways of protein folding are reflected in the formation of IBs and in their structure. Studying protein aggregates can therefore provide a first glimpse of the occurrence of folding-limiting steps. The finding that aggregates of several different proteins, for example IFN-α-2b [56], a bacterial lipase [57], a mutant of the Aβ42 Alzheimer peptide [79] and GFP [69], can be endowed with substantial amounts of native structure led to the conclusion that the process of intracellular aggregation can involve proteins in a continuum of conformational states. This idea is well substantiated by the demonstration that different conformations of the same polypeptide coexist in IBs [80]. However, the structure of aggregated TEM1-β-lactamase inside IBs could not be altered by any of the usual means [81]. In this particular case it was therefore concluded that TEM aggregation is controlled only by the amino acid sequence and not by the kinetics of folding, since changing the rate of biosynthesis did not result in structural changes in the aggregates. This result was interpreted as evidence for the existence of a single specific folding step that determines whether the protein undergoes aggregation or native folding.

Analysis of the modulation of aggregation in bacteria also helped to clarify critical steps in the oxidative folding of bovine β-lactoglobulin. β-Lactoglobulin carries five cysteine residues, four of which form disulfide bridges, raising questions about the role (if any) of the free thiol during in vivo folding. Upon overproduction in E. coli cells optimized for the intracellular formation of disulfide bonds, a mutant protein lacking the unpaired Cys was observed to be more prone to aggregation than the wild type, pointing to a contribution of the free thiol to the pathway leading to the formation of native bonds [61].

The number of proteins studied so far is still too limited to generalize which structural, sequence and kinetic properties dictate the fine detail of aggregation. However, structural analysis of IBs produced under different conditions can be regarded as an easy tool for detecting the presence of critical folding intermediates, which can then be characterized with other techniques.

To conclude, we believe that the truly successful exploitation of IBs requires an advanced understanding of the cellular and protein mechanisms leading to aggregation, as well as powerful biophysical detection methods. The examples reported here highlight the potential of these approaches for creating new-generation protein depositories and biocatalysts.


S. M. D. and M. L. acknowledge support by FAR (Fondo di Ateneo per la Ricerca) of the University of Milano-Bicocca. P. G.-L. is the recipient of a Marie Curie Intra-European Fellowship. A. N. and D. A. acknowledge postdoctoral fellowships of the University of Milano-Bicocca.