V. N. Uversky, Department of Chemistry and Biochemistry, University of California, Santa Cruz, CA 95064. Fax: + 831 459 2935, Tel.: + 831 459 2915, E-mail: firstname.lastname@example.org
Natively unfolded or intrinsically unstructured proteins constitute a unique group of the protein kingdom. The evolutionary persistence of such proteins represents strong evidence in favor of their importance and raises intriguing questions about the role of protein disorder in biological processes. Additionally, natively unfolded proteins, with their lack of ordered structure, represent attractive targets for biophysical studies of the unfolded polypeptide chain under physiological conditions in vitro. The goal of this study was to summarize the structural information on natively unfolded proteins in order to evaluate their major conformational characteristics. It appears that natively unfolded proteins are characterized by low overall hydrophobicity and large net charge. They possess hydrodynamic properties typical of random coils in poor solvent, or of the premolten globule conformation. These proteins show a low level of ordered secondary structure and no tightly packed core. They are very flexible, but may adopt relatively rigid conformations in the presence of natural ligands. Finally, in comparison with globular proteins, natively unfolded polypeptides possess ‘turn out’ responses to changes in the environment, as their structural complexity increases at high temperature or at extreme pH.
Before the phenomenon of natively unfolded proteins is considered, a definition of the major players is required. The importance of this issue follows from the fact that many proteins have been shown to have nonrigid structures under physiological conditions. These proteins may be separated into two different groups. Members of the first group, despite their flexibility, are rather compact and possess a well-developed secondary structure, i.e. they show properties typical of the molten globule. Proteins from the other group behave almost as random coils. Only members of the second group will be described below. Thus, to be considered as natively unfolded (or intrinsically unstructured), a protein should be extremely flexible, essentially noncompact (extended), and have little or no ordered secondary structure under physiological conditions.
Why study intrinsically disordered proteins?
The number of proteins and protein domains that have been shown in vitro to have little or no ordered structure under physiological conditions is rapidly increasing. In fact, over the past 10 years there has been an exponential increase in the number of such studies, from one paper in 1989 to more than 30 in 2000. The current list of natively unfolded proteins includes more than 100 entries (91 of them were tabulated in our recent work). This collection comprises full-length proteins and their domains with a chain length of more than 50 amino-acid residues. Including shorter polypeptides (30–50 residues long) would probably double this number.
There are several reasons for the growing interest in this class of proteins. The first is the structure–function relationship. The existence of biologically active but extremely flexible proteins questions the assumption that a rigid, well-folded 3D structure is required for function. To resolve this problem, it has been suggested that the lack of rigid globular structure under physiological conditions might represent a considerable functional advantage for ‘natively unfolded’ proteins, as their large plasticity allows them to interact efficiently with several different targets [4,5]. Moreover, a disorder/order transition induced in ‘natively unfolded’ proteins during the binding of specific targets in vivo might represent a simple mechanism for the regulation of numerous cellular processes, including regulation of transcription and translation, and cell cycle control. Precise control over the thermodynamics of the binding process may also be achieved in this way (reviewed in [4,5]). The evolutionary persistence of intrinsically disordered proteins represents additional confirmation of their importance and raises intriguing questions about the role of protein disorder in biological processes.
Secondly, biomedical aspects are of great importance too. It has been established that deposition of some natively unfolded proteins is related to the development of several neurodegenerative disorders [6,7]. Examples include Alzheimer's disease [AD; deposition of amyloid-β, tau-protein, and the α-synuclein fragment non-amyloid component (NAC)] [8–11], Niemann-Pick disease type C, subacute sclerosing panencephalitis, argyrophilic grain disease, myotonic dystrophy, motor neuron disease with neurofibrillary tangles (accumulation of tau-protein in the form of neurofibrillary tangles), Down's syndrome (nonfilamentous amyloid-β deposits), Parkinson's disease (PD), dementia with Lewy bodies (LBs), the LB variant of AD, multiple system atrophy and Hallervorden-Spatz disease (deposition of α-synuclein in the form of LBs and Lewy neurites (LNs) [13–17]).
Finally, intrinsically unstructured proteins represent an attractive subject for the biophysical characterization of the unfolded polypeptide chain under physiological conditions.
The special term ‘natively unfolded’ was introduced in 1994 to describe the behavior of tau protein, and has been frequently used ever since. Although large amounts of experimental data have been accumulated and several disordered proteins have been rather well characterized (reviewed in [4,5]), a systematic analysis of structural data for the family of natively unfolded proteins has not yet been made. This lack of methodical inspection of the conformational behavior of intrinsically unordered proteins has already led to some confusion. For example, based on high thermostability, acidic pI, anomalous electrophoretic mobility, and the high content of turns and random coil (≈ 50%), it was concluded that manganese stabilizing protein is natively unfolded. It was also suggested that the natively unfolded structure of this protein facilitates the highly effective protein–protein interactions that are necessary for its assembly into photosystem II. However, the validity of this conclusion was recently questioned. In fact, more careful analysis of the structural properties of manganese stabilizing protein showed that it has a rather compact conformation with a well-developed secondary structure (47% β sheet), i.e. it is closer to a molten globule than to an unfolded state. Finally, it was reasonably noted that ‘the structural feature of a “natively unfolded” state is not the only possibility for conformational flexibility of a protein to achieve optimal conditions for interaction with other proteins. An alternative state with a high potential for structural adaptability is that of a molten globule’.
All this demonstrates that a systematic analysis of the structural and conformational properties of the family of natively unfolded proteins is required.
Why are intrinsically disordered proteins unfolded?
It is known that the unique three-dimensional structure of a globular protein is stabilized by various noncovalent interactions (conformational forces) of different nature, namely hydrogen bonds, hydrophobic interactions, van der Waals interactions, etc. Furthermore, all the information necessary for the correct folding of a regular protein into its rigid, biologically active conformation is included in its amino-acid sequence. The absence of regular structure in natively unfolded proteins raises a question about the specific features of their amino-acid sequences. Some of the sequence peculiarities of these proteins were recognized long ago. These include the presence of numerous uncompensated charged groups (often negative), i.e. a large net charge at neutral pH, arising from the extreme pI values of such proteins [22–24], and a low content of hydrophobic amino-acid residues [22,23].
A comparison of the overall hydrophobicity and net charge of native and natively unfolded protein sequences showed that it is possible to predict whether a given amino-acid sequence encodes a native (folded) or an intrinsically unstructured protein. In fact, this analysis established that the combination of low mean hydrophobicity and relatively high net charge represents an important prerequisite for the absence of compact structure in proteins under physiological conditions, thus leading to ‘natively unfolded’ proteins. Figure 1 presents the results of this survey and shows that the natively unfolded proteins are specifically localized within a unique region of the charge–hydrophobicity phase space. The solid line in this figure represents the border between intrinsically unstructured and native proteins. This border allows the estimation of the ‘boundary’ mean hydrophobicity value, <H>b, below which a polypeptide chain with a given mean net charge <R> will most probably be unfolded:
The validity of these predictions has been successfully shown for several proteins. This means that the degree of compaction of a given polypeptide chain is determined by the balance between charge repulsion, which drives unfolding, and hydrophobic interactions, which drive folding.
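As an illustration of the charge–hydrophobicity criterion, the following Python sketch estimates the mean hydrophobicity and mean net charge of a sequence and compares them with a linear boundary. Several ingredients are assumptions that are not stated in the text above: the use of the Kyte-Doolittle scale rescaled to the range 0–1, the counting of only Lys/Arg and Asp/Glu as charged at neutral pH, and the boundary coefficients (2.785 and 1.151), which are the values commonly quoted for this plot and should be checked against the original publication. The function names and the two test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values (an assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(seq):
    """Mean hydropathy <H>, rescaled so that Arg maps to 0 and Ile to 1."""
    seq = seq.upper()
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge <R> per residue at neutral pH.

    Simplification: Lys/Arg count as +1, Asp/Glu as -1, His and termini are ignored.
    """
    seq = seq.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return abs(positive - negative) / len(seq)


def boundary_hydrophobicity(mean_r, slope=2.785, intercept=1.151):
    """Boundary <H>b for a given <R>, assuming the line <R> = slope*<H> - intercept.

    The default coefficients are assumptions, not values taken from this text.
    """
    return (mean_r + intercept) / slope


def predicted_unfolded(seq):
    """True if <H> falls below the boundary value for the sequence's <R>."""
    return mean_hydrophobicity(seq) < boundary_hydrophobicity(mean_net_charge(seq))


# Hypothetical sequences, for illustration only.
acidic = "EEDDEEKEDD"
hydrophobic = "ILVFAILVFA"
print(predicted_unfolded(acidic), predicted_unfolded(hydrophobic))
```

Averaging over the whole chain, as done here, is the simplest choice; a windowed average could be substituted without changing the comparison step.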
In an attempt to understand the relationship between sequence and disorder, Dunker and coauthors have elaborated several neural network predictors [5,26–35]. They assumed that if a protein structure has evolved to have a functional disordered state, then a propensity for disorder might be predictable from its amino-acid sequence and composition. The results of such analysis were impressive. It has been established that disordered regions share at least some common sequence features over many proteins. These include low sequence complexity, amino-acid compositional bias and high predicted flexibility [28,29]. Furthermore, the majority of the intrinsically disordered proteins are substantially depleted in I, L, V, W, F, Y, C, and N, and enriched in E, K, R, G, Q, S, P, and A. Note that these features may account for the low overall hydrophobicity and high net charge of the polypeptide chain of natively unfolded proteins. Interestingly, more than 15 000 proteins in the SwissProt database were identified as having long regions of sequence that share these same features.
What are the general structural characteristics of natively unfolded proteins?
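The compositional bias described above can be quantified with a very simple count. The sketch below is not one of the neural network predictors mentioned in the text; it only reports the fractions of residues belonging to the two groups listed there. The function name and the example sequences are hypothetical.

```python
from collections import Counter

# Residue groups as listed in the text above.
DEPLETED_IN_DISORDER = set("ILVWFYCN")
ENRICHED_IN_DISORDER = set("EKRGQSPA")


def composition_bias(seq):
    """Return (fraction of enriched residues, fraction of depleted residues)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    enriched = sum(counts[aa] for aa in ENRICHED_IN_DISORDER) / len(seq)
    depleted = sum(counts[aa] for aa in DEPLETED_IN_DISORDER) / len(seq)
    return enriched, depleted


# Hypothetical sequences, for illustration only.
for name, seq in [("disorder-like", "EKRGQSPAEKSG"), ("order-like", "ILVWFYCNILVA")]:
    enriched, depleted = composition_bias(seq)
    print(f"{name}: enriched={enriched:.2f}, depleted={depleted:.2f}")
```

Residues that belong to neither group (D, H, M, T) are counted in the chain length only, so the two fractions need not sum to one.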
The general conformational properties of intrinsically unfolded proteins are summarized below. Here we will mostly focus on the structural characteristics, which make such proteins exceptional among others. These are low compactness, absence of globularity, low secondary structure content, and high flexibility.
The most unambiguous characteristic of the conformational state of a globular protein is its hydrodynamic dimensions. It was noted long ago that hydrodynamic techniques may help to recognize when a protein has lost all of its noncovalent structure, i.e. when it has become unfolded. This is because a substantial increase in the hydrodynamic volume is associated with the unfolding of a protein molecule. It is known that globular proteins may exist in at least four different conformations, native, molten globule, premolten globule and unfolded [1,36–39], that may easily be discriminated by the degree of compactness of the polypeptide chain. Finally, it has been established that the native and unfolded conformations of globular proteins possess very different molecular mass dependencies of their hydrodynamic radii (the Stokes radius), RS [2,40,41].
In order to clarify the physical nature of natively unfolded proteins, Fig. 2 compares log(RS) vs. log(M) curves for these proteins (see Table 1 for details) with the same dependencies for the native, molten globule, premolten globule, and urea- or GdmCl-unfolded globular proteins (data for the different conformations of globular proteins were taken from previously published work). The log(RS) vs. log(M) dependencies for different conformations of globular proteins can be described by straight lines:
where N, MG, PMG, U(urea) and U(GdmCl) denote the native, molten globule, premolten globule, urea-unfolded and GdmCl-unfolded states of globular proteins, respectively.
Table 2. Hydrodynamic characteristics of 8 M urea-unfolded proteins without cross-links.
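A straight line in log(RS) vs. log(M) coordinates corresponds to a power law, RS = 10^a · M^b, and its two parameters can be obtained by ordinary least squares. The Python sketch below shows this fitting step only. The data are synthetic and generated from arbitrary, made-up parameters (a = -0.5, b = 0.5); they are not the coefficients for any of the conformational states discussed here, and the function names are hypothetical.

```python
import math


def fit_loglog(masses, radii):
    """Least-squares fit of log10(R_S) = intercept + slope * log10(M)."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(r) for r in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def stokes_radius(mass, intercept, slope):
    """Stokes radius predicted by the fitted line for a given molecular mass."""
    return 10 ** (intercept + slope * math.log10(mass))


# Synthetic masses (Da) and radii from made-up parameters, for illustration only.
masses = [1.0e4, 2.0e4, 5.0e4, 1.0e5]
radii = [stokes_radius(m, -0.5, 0.5) for m in masses]

intercept, slope = fit_loglog(masses, radii)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

With measured values in place of the synthetic ones, a protein of known mass can be compared with each fitted line by evaluating `stokes_radius` for every conformational state.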
As for natively unfolded proteins, Fig. 2 clearly shows that with respect to their log(RS) vs. log(M) dependence they may be divided into two groups (see Table 1). One group of the intrinsically unstructured proteins behaves as random coils in poor solvent [denoted as natively unfolded (NU)(coil)]. Proteins from the other group are essentially more compact, being close with respect to their hydrodynamic characteristics to premolten globules [denoted as NU(PMG)]:
This is a very important observation, which may help in understanding the physical nature of natively unfolded proteins. In fact, it is well established that the behavior of unfolded proteins obeys the theoretical and empirical rules that apply to linear random coils. Specifically, it is known that the hydrodynamic dimensions of random coils depend essentially on the quality of the solvent [2,40,43]. A poor solvent encourages the attraction of macromolecular segments, and hence the chain contracts, whereas in a good solvent repulsive forces act primarily between the segments and the macromolecule adopts a loose fluctuating coil conformation. Water is a poor solvent, whereas solutions of urea and GdmCl are rather good solvents, with GdmCl being closer to the ideal one [2,40]. This difference in solvent quality may account for the observed divergence in the log(RS) vs. log(M) dependencies between the coil-like intrinsically unstructured proteins and chemically unfolded globular proteins. The existence of a well-defined difference between the log(RS) vs. log(M) dependencies for globular proteins unfolded by urea and by GdmCl should also be noted in this respect.
Another very important structural parameter is the degree of globularization, which reflects the presence or absence of a tightly packed core in the protein molecule. In fact, it has been shown that protein molecules in the PMG state are characterized by low (coil-like) intramolecular packing density [37,38,42,45]. This information can be extracted from the analysis of small angle X-ray scattering (SAXS) data in the form of a Kratky plot, whose shape is sensitive to the conformational state of the scattering protein molecules [45–48]. It has been shown that a scattering curve in the Kratky plot has a characteristic maximum when the globular protein is in the native state or in the molten globule state (i.e. has a globular structure). If a protein is completely unfolded or in a premolten globule conformation (has no globular structure), such a maximum will be absent from the respective scattering curve [37,38,42,45–48].
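The qualitative difference between the two kinds of Kratky plot can be reproduced with two textbook model curves. The Python sketch below is only an illustration under stated assumptions: a compact particle is approximated by the Guinier expression, which is strictly valid only at low scattering vector q, and an unfolded chain is approximated by the Debye function for a Gaussian coil. The radius of gyration (20 Å), the q range and the function names are hypothetical choices, not values from the text.

```python
import math


def guinier_intensity(q, rg):
    """Guinier approximation I(q)/I(0) = exp(-(q*Rg)^2 / 3), used for a compact particle."""
    return math.exp(-((q * rg) ** 2) / 3.0)


def debye_intensity(q, rg):
    """Debye function for a Gaussian coil, with x = (q*Rg)^2."""
    x = (q * rg) ** 2
    if x == 0.0:
        return 1.0
    return 2.0 * (math.exp(-x) + x - 1.0) / x ** 2


def kratky(intensity, qs, rg):
    """Kratky transform: q^2 * I(q) evaluated at each q."""
    return [q * q * intensity(q, rg) for q in qs]


def has_interior_maximum(values):
    """True if the largest value lies strictly inside the sampled range."""
    peak = max(range(len(values)), key=values.__getitem__)
    return 0 < peak < len(values) - 1


# Hypothetical sampling: q from 0.005 to 0.5 1/Angstrom, Rg = 20 Angstrom.
qs = [0.005 * i for i in range(1, 101)]
rg = 20.0

compact = kratky(guinier_intensity, qs, rg)
coil = kratky(debye_intensity, qs, rg)
print("compact model has a maximum:", has_interior_maximum(compact))
print("coil model has a maximum:", has_interior_maximum(coil))
```

In this model the compact curve peaks near q = sqrt(3)/Rg, whereas the coil curve rises monotonically toward a plateau, which mirrors the presence or absence of the maximum described above.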
Figure 3A compares the Kratky plots of three natively unfolded proteins (α-synuclein, prothymosin α and caldesmon 636–771 fragment) with that of the rigid globular protein SNase. One can see that intrinsically unstructured proteins give Kratky plots without the maxima typical of folded conformations of globular proteins. Similar data have also been reported for another intrinsically unordered protein, pig calpastatin domain I. Thus, these four natively unfolded proteins are characterized by the absence of globular structure, or, in other words, they do not have a tightly packed core under physiological conditions in vitro. This is a very important observation, which allows the assumption that all other natively unfolded proteins may possess the same property. In fact, the analysis of hydrodynamic data shows that two of the three considered proteins (α-synuclein and prothymosin α) behave as coils in poor solvent, whereas the RS of the caldesmon 636–771 fragment is typical of PMG (see Table 1). Consequently, representatives of both classes of intrinsically unstructured proteins (coil-like and PMG-like) have been shown to be characterized by the absence of a rigid globular core. This is in good agreement with SAXS data on the conformational characteristics of the PMG state of globular proteins [37,38,42,45].
Figure 3B presents the far-UV CD spectra of α-synuclein, prothymosin α, phosphodiesterase γ-subunit and caldesmon 636–771 fragment as typical representatives of the family of natively unfolded proteins. One can see that these proteins (as well as all other intrinsically unstructured proteins whose far-UV CD spectra were studied) possess distinctive far-UV CD spectra with characteristic deep minima in the vicinity of 200 nm, and relatively low ellipticity at 220 nm. The analysis of these spectra yields a low content of ordered secondary structure (α helices and β sheets). This is also confirmed by Fourier-transform infrared (FTIR) analysis of the secondary structure composition of natively unfolded proteins, such as tau protein, α-synuclein [24,50], β- and γ-synucleins, αs-casein, and cAMP-dependent protein kinase inhibitor. Importantly, even the caldesmon 636–771 fragment, which was shown to have hydrodynamic properties typical of the PMG (see above), possesses a far-UV CD spectrum characteristic of an essentially distorted polypeptide chain. Thus, the low overall content of ordered secondary structure could be considered a general property of intrinsically unstructured proteins.
The fact that intrinsically unfolded proteins are characterized by increased intramolecular flexibility may be easily derived from a large number of NMR studies (summarized in [4,5,53]). Moreover, recent advances in NMR technology (especially the use of heteronuclear multidimensional approaches) have even opened the way to a detailed structural and dynamic description of these proteins. The increased flexibility of natively unfolded proteins is indirectly confirmed by their extremely high sensitivity to protease degradation in vitro [4,5,54–59].
Figure 4A depicts the far-UV CD spectra of α-synuclein measured at different temperatures. At low temperatures, the protein shows a far-UV CD spectrum typical of an unfolded polypeptide chain. As the temperature is increased, the spectrum changes, consistent with temperature-induced formation of secondary structure. Figure 4B shows the temperature dependence of [θ]222 for α-synuclein, caldesmon 636–771 fragment, and phosphodiesterase γ-subunit. One can see that for these three proteins the major spectral changes occur over the range of 3 to 30–50 °C. Further heating leads to less pronounced effects. Analogous temperature dependencies indicative of heat-induced structure formation have been reported for the receptor extracellular domain of nerve growth factor and αs-casein. Interestingly, it has been shown that the structural changes induced in all these proteins by heating are completely reversible. Thus, an increase in temperature induces the partial folding of intrinsically unstructured proteins, rather than the unfolding typical of globular proteins. The effects of elevated temperature may be attributed to the increased strength of the hydrophobic interaction at higher temperatures, leading to a stronger hydrophobic driving force for folding. This observation has to be taken into account when discussing the conformational behavior of intrinsically unstructured proteins.
Effect of pH
Figure 4C shows the pH dependence of [θ]222 for α-synuclein and prothymosin α. There is little change in the far-UV CD spectra between pH ≈ 9.0 and ≈ 5.5. However, a decrease in pH from 5.5 to 3.0 results in a substantial increase in negative intensity in the vicinity of 220 nm. It has also been established that the pH-induced changes in the far-UV CD spectra of these two proteins are completely reversible and consistent with the formation of a partially folded PMG-like intermediate conformation [50,62].
Similar pH-induced structural transformations have been described for pig calpastatin domain I, histidine rich protein II, and the naturally occurring human peptide LL-37. These observations show that a decrease (or increase) in pH induces partial folding of intrinsically unordered proteins due to the minimization of their large net charge present at neutral pH, thereby decreasing charge/charge intramolecular repulsion and permitting hydrophobic-driven collapse to the partially folded intermediate.
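The reduction of net charge on moving away from neutral pH can be estimated with the Henderson-Hasselbalch relation applied to each ionizable group. The Python sketch below is a rough illustration only: the pKa values are approximate textbook numbers for isolated groups (an assumption, as real pKa values depend on the local environment), and the sequence and function name are hypothetical.

```python
# Approximate pKa values for isolated groups (assumed; not taken from the text).
ACIDIC_PKA = {"D": 3.9, "E": 4.3, "C": 8.3, "Y": 10.1}
BASIC_PKA = {"H": 6.0, "K": 10.5, "R": 12.5}
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 3.1


def net_charge(seq, ph):
    """Estimated net charge of a polypeptide at a given pH.

    Each basic group contributes +1/(1 + 10**(pH - pKa));
    each acidic group contributes -1/(1 + 10**(pKa - pH)).
    """
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for aa in seq.upper():
        if aa in BASIC_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - BASIC_PKA[aa]))
        elif aa in ACIDIC_PKA:
            charge -= 1.0 / (1.0 + 10 ** (ACIDIC_PKA[aa] - ph))
    return charge


# Hypothetical acidic sequence, for illustration only.
acidic_seq = "EEDDEEDDGSGSAK"
for ph in (7.0, 5.5, 3.0):
    print(f"pH {ph}: net charge {net_charge(acidic_seq, ph):+.2f}")
```

For an acidic chain of this kind the estimated charge is strongly negative at pH 7 and close to zero at pH 3, in line with the argument that protonation of carboxylates relieves intramolecular charge repulsion.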
Effect of counter ions
It was already noted that, at physiological pH, intrinsically unstructured proteins are unfolded mainly because of the electrostatic repulsion between noncompensated charges of the same sign. To some extent, this resembles the situation occurring for many proteins at extremely low or high pH. It has been established that these unfolded proteins could be transformed into more ordered conformations if electrostatic repulsion was reduced by the binding of oppositely charged ions [65,66]. A similar situation may be expected for natively unfolded proteins, and, in fact, metal ion-stimulated conformational changes have been described for many intrinsically unstructured proteins.
As an illustration, Fig. 4D shows the dependence of [θ]222 on [Al3+] for α-synuclein. One can see that an increase in the cation content is accompanied by a substantial increase in the intensity of the far-UV CD spectra, reflecting partial folding of the protein. It has been established that other cations (monovalent, bivalent and trivalent) also induce conformational changes in α-synuclein and transform this natively unfolded protein into a partially folded intermediate. The folding strength of cations increases with increasing ionic charge density. This reflects the effective screening of the Coulombic charge/charge repulsion. For polyvalent cations, an additional important factor could be hypothesized, namely the potential for cross-linking or bridging between two or more carboxylates.
Importantly, human antibacterial protein LL-37, a natively unfolded protein with an extremely basic net charge, was shown to be essentially folded in the presence of several anions.
What else is required for intrinsically unordered proteins to fold?
Structure-forming role of natural ligands
It has been suggested that natively unfolded proteins may be significantly folded in their normal cellular milieu due to binding to specific targets and ligands (such as a variety of small molecules, substrates, cofactors, other proteins, nucleic acids, membranes, etc.) [3–5,53,68]. The structure-forming effect of natural partners can be explained by their influence on the mean hydrophobicity and/or net charge of the natively unfolded polypeptide. In fact, any interaction of such a protein with a natural ligand that affects its mean net charge and/or mean hydrophobicity could change these parameters in such a way that they approach values typical of folded native proteins. This hypothesis has been confirmed by calculating the joint mean net charge and mean hydrophobicity of complexes of several natively unfolded proteins (osteocalcin, osteonectin, α-casein, HPV16 E7 protein, calsequestrin, manganese stabilizing protein and HIV-1 integrase) with their natural ligands, metal ions. Pronounced ligand-induced folding has indeed been established in numerous in vitro studies for many intrinsically unstructured proteins. Examples include: DNA- (or RNA-) induced structure formation in protamines [69,70], Max protein, and high mobility group proteins HMG-14 and HMG-17; cation-induced folding of osteocalcin, osteonectin, SdrD protein, chromogranins A and B, the Δ131Δ fragment of SNase, histone H1, protamine and prothymosin-α; folding of cytochrome c in the presence of heme; membrane-induced secondary structure formation in parathyroid hormone related protein; trimethylamine N-oxide induced structure formation in glucocorticoid receptor; heme-induced folding of histidine-rich protein II; and many others.
Based on the data summarized above, a typical natively unfolded protein is characterized by: (a) a specific amino-acid sequence with low overall hydrophobicity and high net charge; (b) hydrodynamic properties typical of a random coil in poor solvent, or of the PMG conformation; (c) a low level of ordered secondary structure; (d) the absence of a tightly packed core; (e) high conformational flexibility; (f) the ability to adopt a relatively rigid conformation in the presence of natural ligands; and (g) a ‘turn out’ response to environmental changes, with structural complexity increasing at high temperature or at extreme pH.
I am grateful to Dr P. Souillac for the careful reading and editing of the manuscript. This work was supported in part by fellowships from the Parkinson's Institute and the National Parkinson's Foundation.