The past decade has witnessed great advances in our understanding of protein structure-function relationships in terms of the ubiquitous existence of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). The structural disorder of IDPs/IDRs enables them to play essential functions that are complementary to those of ordered proteins. In addition, IDPs/IDRs are persistent in evolution. Therefore, they are expected to possess some advantages over ordered proteins. In this review, we summarize and survey nine possible advantages of IDPs/IDRs: economizing genome/protein resources, overcoming steric restrictions in binding, achieving high specificity with low affinity, increasing binding rate, facilitating posttranslational modifications, enabling flexible linkers, preventing aggregation, providing resistance to non-native conditions, and allowing compatibility with more available sequences. Some potential advantages of IDPs/IDRs are not well understood and require both experimental and theoretical approaches to decipher. The connection with protein design is also briefly discussed.
Being a challenge or amendment to the conventional sequence-structure-function paradigm for proteins, intrinsically disordered proteins (IDPs) have attracted ever-increasing attention during the past decade.[1-5] IDPs are intriguing because they do not have ordered structures in the free state under physiological conditions but still possess biological functions. The structural disorder of IDPs and intrinsically disordered regions (IDRs) lies in their distinct amino-acid sequences, that is, they usually have a low hydrophobicity content combined with a high net charge content.[1, 6, 7] Without folded structures, IDPs/IDRs exist in an ensemble of rapidly-changing conformations with a flat free-energy landscape[8-10] and exhibit almost unlimited structural heterogeneity.[1, 11] On the other hand, most IDPs/IDRs undergo a disorder-to-order transition upon binding to their biological partners (i.e., coupled folding and binding),[12, 13] although some remain disordered even in their bound state.[14-17]
The structural disorder of IDPs/IDRs enables them to play essential functions that are complementary to those of ordered proteins. Roughly, IDPs/IDRs can be classified into six broad functional classes,[18, 19] including effectors, scavengers, assemblers, entropic chains, display sites, and chaperones. A detailed bioinformatics analysis of the Swiss-Prot database revealed a positive correlation between IDPs/IDRs and 238 function keywords, the majority of which were related to signaling and regulation of key cellular processes. Given the essential functions of IDPs/IDRs, it is not unexpected that they are abundant in all species.[21-24] In particular, the average fraction of disordered residues predicted in eukaryotes is higher than that in prokaryotes, suggesting the importance of IDPs/IDRs in evolution. The abundance of IDPs/IDRs and their vital functions, as well as the fact that the usual techniques for characterizing conventional proteins may not be applicable to IDPs/IDRs, make IDPs/IDRs an important subfield of molecular structural biology. In fact, IDPs/IDRs are taking their due place in mainstream studies. For example, predicting disordered regions has become a part of the critical assessment of structure prediction (CASP).
As IDPs/IDRs perform functions complementary to those of ordered proteins and are persistent in evolution, they might possess some advantages over ordered proteins (and at the same time possess some disadvantages). In this article, we give a brief review of the advantages of IDPs/IDRs provided by their conformational flexibility. Nine possible advantages are summarized and surveyed (Fig. 1): (1) economizing genome and protein resources; (2) overcoming steric restrictions in binding; (3) achieving high specificity with low affinity; (4) increasing binding rate; (5) facilitating posttranslational modifications; (6) enabling flexible linkers; (7) preventing aggregation; (8) providing resistance to non-native conditions; (9) allowing compatibility with more available sequences. Some advantages are self-evident; for example, the ability to act as flexible linkers is incompatible with ordered structures. Some other advantages, however, are less clear, and knowledge of them is still far from complete. It is acknowledged that the current review addresses only a narrow aspect of IDPs' properties. Readers are directed to several excellent reviews[1, 2, 26-30] for a more comprehensive understanding of IDPs.
Advantage 1: Economizing Genome and Protein Resources
The interface area of IDPs in protein-protein complexes is similar in size to that of ordered proteins, but the sequence used to create the same interface area is much shorter for IDPs [Fig. 1(a)].[31, 32] This originates from the fact that IDPs have extended structures which are stable only in the complexes but not in monomers. As a result, IDPs possess greater interface area per residue than ordered proteins. It has been estimated that the protein size would be two to three times larger if IDPs were required to be as stable as monomers, which would unavoidably exacerbate cellular crowding and increase the sequence size. Therefore, IDPs/IDRs are advantageous in saving genome and protein resources. This feature may be more critical for species with small genome size. For example, viruses were shown to have the widest spread of IDPs/IDRs content compared with the three domains of life, and Avian carcinoma virus possesses the highest disorder content of 77.3% among 3500 proteomes.
Intrinsic disorder may also help to reduce the genome size in other mechanisms. The number of genes in the human genome is significantly smaller than that of the diverse proteins necessary in higher organisms. Alternative splicing is a way to avoid over-large genomes, in which multiple proteins can be produced from a single gene. Regions affected by alternative splicing are frequently biased to be disordered, which helps to avoid structural disruption in the spliced proteins. “Moonlighting” is another solution to control the genome size, in which a single protein is capable of carrying out more than one function. IDPs/IDRs provide an important mechanism for moonlighting since they can use the same or overlapping regions to fulfill distinct functions by adopting different conformations upon binding. For example, the intrinsically disordered domain of the sulfhydryl oxidase ALR performs dual functions: as a mitochondrial targeting signal in the cytosol and as a crucial recognition site in the disulfide relay system of intermembrane space. Another example of multifunctional IDPs is anhydrin which acts as a chaperone and an endonuclease.
Advantage 2: Overcoming Steric Restrictions in Binding
The high chain flexibility and conformational disorder of IDPs/IDRs enable them to form complementary binding interfaces with their targets more easily. Via coupled folding and binding, IDPs/IDRs can overcome steric restrictions by protruding into the concavities of partners or wrapping around partners in ways difficult for ordered proteins [Fig. 1(b)]. It was found that the partner interfaces of MoRFs (the short segments of IDPs that perform molecular recognition and usually undergo disorder-to-order transitions upon binding) are significantly less flat than those of ordered proteins. When measured by the RMSD of interface atoms to the fitting plane, the interface planarity of IDP partners is 3.76 Å while that of ordered proteins is 2.98 Å. The binding modes of IDPs/IDRs are highly diverse and create multifarious unusual complexes: wrappers, chameleons, penetrators, huggers, intertwined strings, long cylindrical containers, connectors, armature, etc. For example, the intrinsically disordered calpastatin (endogenous inhibitor of calpain) wraps around and binds to its target on three surfaces to form a flexible wrapper.
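The planarity measure quoted above (RMSD of interface atoms to a fitting plane) can be illustrated with a minimal sketch. It assumes that the fitting plane is the least-squares plane through the atom coordinates, which may differ in detail from the procedure used in the cited analysis; the function name `plane_rmsd` is hypothetical. For a least-squares plane, the mean squared distance equals the smallest eigenvalue of the coordinate covariance matrix, which is computed here in closed form.

```python
import math


def plane_rmsd(points):
    """RMSD (same units as the input) of 3D points to their least-squares plane.

    The best-fit plane passes through the centroid; the mean squared distance
    to it is the smallest eigenvalue of the 3x3 covariance matrix.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points to define a plane")
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n

    # Covariance matrix entries (symmetric, so six unique values).
    a11 = a22 = a33 = a12 = a13 = a23 = 0.0
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        a11 += dx * dx
        a22 += dy * dy
        a33 += dz * dz
        a12 += dx * dy
        a13 += dx * dz
        a23 += dy * dz
    a11, a22, a33, a12, a13, a23 = (v / n for v in (a11, a22, a33, a12, a13, a23))

    # Trigonometric closed form for eigenvalues of a symmetric 3x3 matrix.
    q = (a11 + a22 + a33) / 3.0
    p2 = ((a11 - q) ** 2 + (a22 - q) ** 2 + (a33 - q) ** 2
          + 2.0 * (a12 ** 2 + a13 ** 2 + a23 ** 2))
    if p2 == 0.0:
        smallest = q  # all three eigenvalues coincide
    else:
        p = math.sqrt(p2 / 6.0)
        b11, b22, b33 = (a11 - q) / p, (a22 - q) / p, (a33 - q) / p
        b12, b13, b23 = a12 / p, a13 / p, a23 / p
        det_b = (b11 * (b22 * b33 - b23 * b23)
                 - b12 * (b12 * b33 - b23 * b13)
                 + b13 * (b12 * b23 - b22 * b13))
        r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding
        phi = math.acos(r) / 3.0
        smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return math.sqrt(max(smallest, 0.0))


# Toy coordinates (not real interface atoms): a corrugated 10 x 10 patch.
toy_interface = [(0, 0, 1), (10, 0, -1), (0, 10, -1), (10, 10, 1)]
print(plane_rmsd(toy_interface))
```

A perfectly flat set of atoms gives a value near zero, and larger values indicate a more curved or corrugated interface, which is the sense in which the partners of IDPs are described as less flat.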
After folding coupled with binding, the interface structure of IDPs/IDRs is still more flexible than that of ordered proteins. The average crystallographic B-factor of interfaces for IDPs (51 Å²) in complex structures was much larger than that of ordered proteins (21 Å²), indicating interface atoms of IDPs remain highly dynamic. IDPs/IDRs also have a high content of the poly-L-proline type II (PPII) helix which is markedly more flexible in comparison with α-helix and β-sheet.[40, 41] In addition to the complete folding upon binding where IDPs/IDRs adopt ordered structures, some IDPs/IDRs experience incomplete folding where a significant part remains disordered in the bound state, forming dynamic or fuzzy complexes.[42-44] In the extreme case (random fuzziness), IDPs remain entirely disordered in the bound state and resemble a “binding cloud” where multiple binding sites are dynamically distributed.[1, 42] An example is the Sic1-Cdc4 complex, where a few binding motifs of Sic1 interact with Cdc4 in a dynamic equilibrium, one at a time, and the parts of Sic1 not interacting with Cdc4 remain disordered.[14, 45] For the binding with small molecule ligands, IDPs remain disordered and the ligands bind at different sites along the protein chains, so it may be described as ligand clouds around protein clouds.
Advantage 3: Achieving High Specificity With Low Affinity
Combining high specificity with low affinity is a widely mentioned advantage of IDPs/IDRs.[4, 12, 46, 47] This pair of properties is useful for reversible signal transduction because it enables rapid association with and dissociation from the partner without excessive binding strength.
The low affinity of IDPs/IDRs originates from the coupled folding and binding. The ordered (folded) state is unstable for IDPs/IDRs when they are free in solution. In a coupled folding and binding process, the folding increases the free energy while the binding of IDPs/IDRs to their targets decreases the free energy. As a result, the net free-energy change in such a coupled folding-binding process is smaller than that in a pure binding process, leading to a lower affinity. The low affinity allows IDPs to have a high dissociation rate, which can be essential for regulatory and signaling functions. Based on a dataset of protein–protein associations for 35 IDPs and 45 ordered proteins, we showed that the distribution of the equilibrium dissociation constants for IDPs is slightly different from that for ordered proteins (Fig. 2), where for IDPs and for ordered proteins. In another benchmark dataset, it was found that 62% of the complexes with lower binding affinity contain regions formed via coupled folding and binding. The relatively low affinity of IDPs/IDRs also suggested that high-affinity binding proteins can tolerate more structural disorder, consistent with the difference in the content of IDPs/IDRs of prokaryotic and eukaryotic genomes. It is noted that although the disorder-to-order transition of IDPs/IDRs decreases the conformational entropy, IDPs/IDRs usually possess more favorable interface interactions than ordered proteins.[32, 39] Furthermore, the binding free energy is affected by many other factors.[51, 52] The affinity of either IDPs/IDRs or ordered proteins is highly variable and covers a wide range. So the affinity difference between IDPs/IDRs and ordered proteins applies to average values, not individual cases. Actually, some IDPs exhibit extremely high affinities. 
For example, the binding affinity of the intrinsically disordered antitoxin CcdA to the gyrase poison CcdB is in the pico-molar range, much higher than that of most ordered proteins.[53, 54]
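The free-energy argument above can be made concrete with a small numerical sketch. It assumes the standard relation K_d = exp(ΔG/RT) with a 1 M standard state, and the free-energy values are hypothetical illustrations, not data from the dataset discussed in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temperature_k=298.15):
    """K_d in mol/L (1 M standard state) for a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def coupled_kd(dg_bind_kcal, dg_fold_kcal, temperature_k=298.15):
    """K_d when binding (dg_bind < 0) must also pay a folding cost (dg_fold > 0)."""
    return dissociation_constant(dg_bind_kcal + dg_fold_kcal, temperature_k)


# Hypothetical numbers: the same interface energy, with and without a folding penalty.
dg_bind = -12.0  # kcal/mol, pure binding of a pre-folded protein (assumed)
dg_fold = 3.0    # kcal/mol, cost of folding a disordered chain (assumed)

kd_ordered = dissociation_constant(dg_bind)
kd_idp = coupled_kd(dg_bind, dg_fold)
print(f"K_d without folding cost: {kd_ordered:.2e} M")
print(f"K_d with folding cost:    {kd_idp:.2e} M")
print(f"affinity weakened by a factor of {kd_idp / kd_ordered:.0f}")
```

In this picture every additional kcal/mol of folding cost weakens the affinity by roughly a factor of five at room temperature, so a modest folding penalty is enough to shift a complex from tight to transient binding. As noted above, this describes average trends only; favorable interface interactions can more than offset the penalty in individual cases.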
The high specificity of IDPs/IDRs is generally believed to be related to their interface characteristics.[3, 4, 12, 27] The binding specificity is mainly determined by the size and complementarity of the binding interface. Due to the chain flexibility and the coupled folding-and-binding, as mentioned above, the interaction interfaces of IDPs/IDRs to their targets are highly complementary [Fig. 1(c)]. Although IDPs/IDRs usually use short segments in molecular recognitions, the extended structures formed in the complexes enable them to achieve large binding interfaces. Therefore, IDPs/IDRs are expected to exhibit high specificity in molecular recognitions. High specificity is necessary for their critical functions in signal recognition, transduction, and regulation.
However, there is still some debate on the specificity of IDPs/IDRs.[47, 55] The above reasoning is not consistent with the conventional opinion that more flexible proteins are more promiscuous,[47, 55] that is, if structural flexibility enables IDPs/IDRs to form complementary interfaces with their cognate targets, the same property could also enable them to fit the surfaces of noncognate targets and thus IDPs/IDRs would lose their interaction specificity. Under interface-interaction perturbations (to mimic the structural difference between cognate and noncognate targets), IDPs/IDRs can either adjust their conformations to maintain the interface complementarity to stabilize the enthalpy or become conformationally more dynamic to stabilize the entropy, and as a result, IDPs/IDRs have a more complete enthalpy–entropy compensation to reduce the influence of the perturbations. Indeed, an analysis on the thermodynamic data of interface mutation verified that IDPs had higher enthalpy–entropy compensation than ordered proteins, suggesting IDPs to be more malleable.
To clarify the specificity difference between IDPs and ordered proteins, a coarse-grained molecular dynamics (MD) study was conducted to calculate the mutation free energy (ΔΔG) of IDPs and ordered proteins under the same perturbations. It was found that ΔΔG for IDPs is smaller than that for ordered proteins by an averaged difference of kcal/mol. This difference is smaller than what would be expected from the enthalpy–entropy compensation mentioned above. The reason is the same perturbation caused larger mutation enthalpy (ΔΔH) in IDPs. Recently, an analysis of protein-protein complexes from the Protein Data Bank revealed that polar interface interactions make a larger contribution to the specificity in IDPs, and the follow-up computational alanine scanning of both hydrophobic and charged residues confirmed that these mutations caused larger ΔΔH in IDPs. Therefore, IDPs are likely to possess high specificity, although maybe not as high as ordered proteins.
High specificity may be critical for the biological functions of IDPs/IDRs. On the other hand, since the specificity of IDPs/IDRs may not be as high as that of ordered proteins, IDPs/IDRs may be more promiscuous in molecular recognitions and need to be tightly controlled in cells. Indeed, the abundance of IDPs/IDRs is regulated so that they are available in appropriate amounts and at appropriate times as needed.[28, 57] The promiscuity of IDPs/IDRs also makes them a major determinant of dosage sensitivity where promiscuous interactions drive pathological changes in response to gene over-expression. In the absence of tight regulation of IDPs/IDRs, severe outcomes may result, for example, the occurrence of various diseases.[59, 60]
A closely related characteristic is multiple specificity—the capacity of interacting with many partners. The high plasticity and malleability of IDPs/IDRs enable them to bind to multiple partners more readily by changing conformations or interaction regions according to the templates provided by different targets.[62, 63] The cyclin-dependent kinase (CDK) inhibitor p21, which is a very early example of IDPs, binds to different CDK/cyclin complexes during the cell cycle and the binding diversity is mediated by the adaptability of its LH region. The induced structures of the same IDP/IDRs may be different in binding with different partners, for example, the C-terminus of p53 becomes a helix, a sheet, or a coil in different complexes. Even when the induced structures of the same IDP/IDRs in different complexes are similar, the utilized anchor residues may be totally different which are important in controlling the specificity. Multispecificity is useful for signaling and regulation. IDPs/IDRs are commonly involved in the hubs of protein–protein interaction networks,[67-70] where one IDP/IDR may bind to many partners or many IDPs/IDRs bind to one partner. The chaperone function of IDPs is also related to the multispecificity where it is vital to bind multiple aggregation-sensitive client proteins. In addition, multispecificity is also a basis of the moonlighting mentioned above.
Advantage 4: Increasing Binding Rate
The binding rate of proteins to their targets is important for diverse processes ranging from enzyme catalysis/inhibition to cellular signaling. Compared with ordered proteins, IDPs/IDRs possess a kinetic advantage, that is, their binding rate is relatively higher.
The first insight into the kinetic advantage of IDPs/IDRs was provided by Shoemaker et al. in terms of the “fly-casting mechanism.” By virtue of extended conformations, IDPs/IDRs could bind to their targets weakly at a larger distance (capture radius) from the actual binding site. Then they “reel in” the targets to complete the binding process while simultaneously folding. Later, Huang and Liu performed a critical assessment of this mechanism by investigating the inherent influence of chain disorder on the binding kinetics of IDPs/IDRs. Via Langevin dynamics simulations with a Gō-like coarse-grained model, it was found that although IDPs/IDRs possess a larger capture radius, the larger capture radius inevitably leads to a slower translational diffusion. As a result, the rate constant for forming the encounter complex actually decreased modestly as the chain flexibility increases, rather than increased as originally proposed. The real source for the kinetic advantage of IDPs/IDRs lies in the second step, where the encounter complex evolves into the final bound state faster and escapes to the unbound state slower than ordered proteins [Fig. 1(d)]. In addition, the available experimental data on the binding kinetics of IDPs and ordered proteins were compared to show the general difference in real systems. On average, IDPs bind two to three times as fast as ordered proteins under the same affinity. It was also noted that the binding rate spans a large range of magnitude so that comparison of a single datum is meaningless.
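The two-step picture described above can be sketched with a steady-state rate expression. The scheme assumed here is free proteins ⇌ encounter complex → bound state, for which the overall association rate constant is k_on = k_enc · k_evo / (k_esc + k_evo). The rate constants below are hypothetical values chosen only to illustrate the argument; they are not taken from the cited simulations or experiments.

```python
def overall_on_rate(k_encounter, k_escape, k_evolve):
    """Steady-state association rate constant for a two-step binding scheme.

    k_encounter: formation of the encounter complex (M^-1 s^-1)
    k_escape:    dissociation of the encounter complex back to free proteins (s^-1)
    k_evolve:    evolution of the encounter complex into the bound state (s^-1)
    """
    return k_encounter * k_evolve / (k_escape + k_evolve)


# Hypothetical rate constants (assumptions for illustration only).
# The disordered chain forms encounter complexes slightly more slowly,
# but escapes more slowly and evolves to the bound state faster.
ordered = overall_on_rate(k_encounter=1.0e7, k_escape=1.0e5, k_evolve=1.0e3)
disordered = overall_on_rate(k_encounter=0.8e7, k_escape=0.5e5, k_evolve=2.0e3)

print(f"ordered:    k_on = {ordered:.2e} M^-1 s^-1")
print(f"disordered: k_on = {disordered:.2e} M^-1 s^-1")
print(f"ratio = {disordered / ordered:.1f}")
```

The sketch shows how a modest decrease in the encounter rate can be outweighed by changes in the second step, which is the qualitative conclusion of the critical assessment described above.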
Protein binding rates are influenced by various factors, for example, nonspecific binding and electrostatic interactions. Due to their high chain flexibility, IDPs/IDRs may possess more nonnative interactions in the folding and binding processes than ordered proteins. It was shown that nonnative hydrophobic interactions greatly amplify the kinetic advantages of IDPs/IDRs. IDPs/IDRs usually contain a high content of charged residues, and electrostatic interactions were also found to contribute to their kinetic advantage via the electrostatic steering mechanism. As experimental evidence, the binding rate of the intrinsically disordered WASP GBD (the GTPase binding domain of the Wiskott-Aldrich syndrome protein) to its target (Cdc42) was found to be highly dependent on ionic strength. In the interactions between IDPs and DNA, it was shown that electrostatic steering and protein flexibility synergistically couple to achieve rapidity in recognition. Recently, Ganguly et al. showed that long-range electrostatic interactions not only enhance the encounter rate via the electrostatic steering but also promote the folding-competent topologies in the encounter complexes, allowing rapid formation of short-range native interactions to form the final bound state.[78, 79]
Binding kinetics is important for functions at molecular and system levels.[73, 80] Association between signaling proteins and their cellular targets should be both fast and transient. The enhanced binding rate of IDPs/IDRs, together with the low affinity, provides an advantageous solution to these needs. For the intrinsically disordered translocation domain of Colicin E9, although its binding with TolB is weaker than the competitors, it wins in the competitive recruitment of TolB by its much higher binding rate. In nonsense-mediated mRNA decay (NMD), it was suggested that a fly-casting mechanism enabled by long disordered regions in NMD complexes is exploited for effective long-range communication.
Advantage 5: Facilitating Posttranslational Modifications
Posttranslational modifications (PTMs) are an important means to regulate functions of proteins. The conformational flexibility of IDPs/IDRs greatly facilitates exposure of their modification sites and binding to the modifying enzymes [Fig. 1(e)]. The flexibility of IDPs/IDRs also makes it possible for a single enzyme to bind to and modify sites in a wide variety of proteins. In contrast, PTMs on ordered proteins may be restricted by site accessibility. Therefore, IDPs/IDRs facilitate the regulation of cellular processes by PTMs.
Phosphorylation is one of the most important and well-studied PTMs. It dominates the number of experimentally-identified PTMs by an order of magnitude. It was found that sequence attributes of regions adjacent to phosphorylation sites are very similar to those of IDPs/IDRs, and such a correlation between the structure disorder and the occurrence of phosphorylation has been employed to improve the prediction of protein phosphorylation sites. For Arabidopsis plasma membrane proteins, 30% of the phosphorylation sites are located in long intrinsically disordered regions and 28% of sites are located in shorter disordered regions. Phosphorylation of IDPs/IDRs serves as an essential control mechanism in signaling and regulation. A study on the phosphorylation variation during the cell cycle indicated that intrinsically disordered regions tend to contain sites with dynamically varying levels, while ordered regions retain more constant phosphorylation levels. Typical examples are eukaryotic cyclin-Cdk inhibitors Sic1 and p27 whose inhibitory activity, stability, and subcellular localization are regulated by phosphorylation on different sites.[87, 88]
Many other PTMs are also associated with IDPs/IDRs. A large-scale analysis of the Swiss-Prot database showed that 17 PTMs keywords (i.e., phosphorylation, amidation, ubiquitination, glycosylation, sulfation, and methylation, etc) are strongly correlated with predicted disorder. In comparison with ordered proteins, IDPs/IDRs contain more ubiquitination sites and this observation was used in developing a predictor program for such sites.[90, 91] The preference of ubiquitination in IDPs/IDRs provides a basis for their efficient regulation via rapid degradation.
Advantage 6: Enabling Flexible Linkers
Acting as entropic chains such as flexible linkers is one of the six broad functional classes of IDPs/IDRs.[18, 92] The lack of ordered structures of IDPs/IDRs enables them to act as flexible linkers which are obviously out of reach of ordered proteins.
In multidomain proteins, modular domains are often connected by flexible linkers [Fig. 1(f)]. The primary function of these linkers is to restrict the distance and to enable an orientational freedom of the attached domains. IDRs carry out such functions without undergoing a disorder-to-order transition and can be mimicked reasonably well by the behavior of low-complexity polypeptides. The role of IDRs as flexible linkers was nicely demonstrated in an experimental study on the Escherichia coli tubulin homologue FtsZ. When the linker in the E. coli FtsZ was replaced by those from other bacteria, or even from an unrelated IDP (human α-adducin), its function in cell division was found to be unaffected if the length of the new linker was similar to that of the original. In the voltage-activated potassium channels Kv, it was found that the intrinsically disordered C-terminal domain behaves as an ideal flexible chain in binding with intracellular scaffold proteins.
Linking two modular domains or segments via a flexible linker increases the local concentration of one relative to the other as has been widely adopted in biological systems to increase efficiency. This can be used to connect two different functions, for example, the functions of the DNA-binding domain and the activation domain in a DNA transcription factor. In some other cases, this provides a simple way to increase the binding affinity of two linked domains (segments) to a common target.[75, 95] Furthermore, the binding affinity can be tuned by varying the length of the linker. For example, the variation of linker length in ratiometric fluorescent sensor proteins was shown to tune their Zn(II) binding affinity from picomolar to femtomolar where the effective local concentration was well described by a worm-like chain model.
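How linker length tunes the effective local concentration can be sketched numerically. The example below is a simplification made under stated assumptions: it uses the worm-like chain expression for the mean-square end-to-end distance, but approximates the end-to-end distribution as Gaussian rather than using a full worm-like chain distribution as in the cited sensor study. The residue length (0.38 nm) and persistence length (0.4 nm) are assumed typical values for disordered polypeptides, not parameters from the source.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def wlc_mean_square_distance(contour_nm, persistence_nm):
    """Mean-square end-to-end distance (nm^2) of a worm-like chain."""
    lp, length = persistence_nm, contour_nm
    return 2.0 * lp * length * (1.0 - (lp / length) * (1.0 - math.exp(-length / lp)))


def effective_concentration(n_residues, separation_nm=0.0,
                            residue_length_nm=0.38, persistence_nm=0.4):
    """Effective concentration (mol/L) of one linker end at a given distance
    from the other end, using a Gaussian end-to-end distribution."""
    contour = n_residues * residue_length_nm
    r2 = wlc_mean_square_distance(contour, persistence_nm)
    density_per_nm3 = ((3.0 / (2.0 * math.pi * r2)) ** 1.5
                       * math.exp(-3.0 * separation_nm ** 2 / (2.0 * r2)))
    # 1 nm^-3 corresponds to 1e24 molecules per litre.
    return density_per_nm3 * 1.0e24 / AVOGADRO


for n in (10, 20, 40, 80):
    c_eff = effective_concentration(n)
    print(f"{n:3d}-residue linker: c_eff ~ {c_eff * 1e3:.1f} mM")
```

Under these assumptions, shorter linkers give higher effective concentrations when the two tethered sites need to meet at close range, which is the basis for tuning affinity through linker length. The Gaussian approximation is poorest for very short linkers and for separations approaching the contour length.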
Advantage 7: Preventing Aggregation
Uncontrolled aggregation is a constant threat for proteins, and proteins have to acquire sequence and structural adaptations to avoid undesired amyloid-like aggregation. The physicochemical properties of the sequences of IDPs/IDRs are generally negatively correlated with those of aggregation-prone sequences. For example, most aggregation-prone sequences have high hydrophobicity and low net charge,[98, 99] while IDPs/IDRs have low hydrophobicity and high net charge. IDP/IDR sequences are therefore distinct from amyloidogenic peptides and tend to resist aggregation. The advantage of IDPs/IDRs relative to ordered proteins in preventing amyloid-like aggregation is apparent when examining the packing density of sequences, that is, the expected number of neighbor contacts per residue within a given distance (similar to the concept of ligancy in chemistry). A residue with high packing density is more prone to form contacts with other residues. It was shown that amyloid proteins possess high expected packing density and IDPs/IDRs possess low values, while ordered proteins possess moderate values lying between those of amyloids and IDPs/IDRs (Fig. 3). It was also shown that ordered proteins contain almost three times as many aggregation-nucleating regions as IDPs.
Attaching disordered regions to ordered proteins or aggregation-prone sequences could protect the latter from aggregation [Fig. 1(g)]. Bioinformatics analysis demonstrated that linear interaction motifs are on average embedded in locally disordered regions, where a typical motif contains about six residues and is surrounded by approximately 20 residues that are intrinsically disordered. A simulation with a lattice model showed that small hydrophobic peptides with disordered flanks remain stable under conditions where peptides without flanks tend to aggregate. Such a principle was also adopted in artificial protein fusions to effectively prevent aggregation and achieve high soluble protein expression. Recently, de novo design was conducted to develop a stable monomeric peptide targeting tumor necrosis factor (TNF)-α by increasing the propensity of intrinsic disorder.
The capability of IDPs/IDRs to prevent aggregation is also essential for their chaperone function. For example, the E. coli protein HdeA experiences a stress-induced order-to-disorder transition to display chaperone activity and binds to other unfolded proteins to prevent their aggregation under the stress conditions.
Advantage 8: Providing Resistance to Non-Native Conditions
Ordered proteins are usually vulnerable to unfolding. Their structures and functions may be destroyed by various external stresses such as acid, base, inorganic salt, organic solvent, heating and cooling. In contrast, IDPs are more likely to remain stable under various extreme conditions due to their lack of structure. For example, prothymosin α can be boiled for a few days without losing its activity. It even displays an atypical “turn out” response where partial structure is induced by heating. A positive correlation between disorder content and the resistance to heat shock was shown in a survey of 11 proteins. A study on a freezing-induced loss-of-function model of globular-disordered functional protein pairs also confirmed that IDPs are more resistant to cold treatment than ordered proteins. Late embryogenesis abundant proteins, which play crucial roles in cellular dehydration tolerance, are mostly IDPs/IDRs. The exceptional stability of IDPs/IDRs was aptly summarized by Uversky as “you cannot break what is already broken” [Fig. 1(h)].
The structural disorder of IDPs/IDRs not only helps them to survive under extreme conditions, but is also beneficial for their resistance to environmental perturbations under physiological conditions. In a real biological system, proteins perform their functions in complicated environments, which can be affected by various factors. To maintain their activity, proteins should not be overly sensitive to perturbations in cellular conditions.[113, 114] A coarse-grained simulation revealed that the binding affinity and kinetics of IDPs were less sensitive to the perturbations of temperature and intermolecular interactions than ordered proteins. The origin of such robustness was attributed to the capacity of IDPs to adjust their bound conformations to compensate for various perturbations, which was supported by an analysis of protein complex structures. This ability was termed a “buffer effect,” which, together with the kinetic advantage discussed above, enables IDPs to conduct their functions in cellular signaling and regulation.
Advantage 9: Allowing Compatibility With More Available Sequences
Natural protein sequences span only a very small fraction of the whole available sequence space. For ordered proteins, the number of available sequences is greatly reduced by the need to form a unique folded structure while avoiding insoluble aggregates.[100, 115] For IDPs/IDRs, due to the removal of restrictions posed by forming ordered structure, the available sequence space is expected to be much larger than that of ordered proteins [Fig. 1(i)].[1, 11] The sequence difference between IDPs/IDRs and ordered proteins can be graphically depicted in the two-dimensional charge-hydropathy (CH) plot where ordered proteins reside in the region with high hydropathy and low net charge (Fig. 4). A comparison of the areas for IDPs and ordered proteins in the CH plot showed that the sequence space of extended IDPs is at least fivefold greater than that of compact soluble proteins (including both “molten globule”-like IDPs and well-folded ordered proteins). The difference in available sequence space between IDPs/IDRs and ordered proteins can also be discussed in terms of their sequence designability,[4, 116] which was defined as the number of sequences coding a protein. Without the structural restrictions, IDPs/IDRs may possess more sequence redundancy, resulting in high sequence designability.
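The coordinates of a sequence in the CH plot can be computed with a short sketch. The details here are assumptions not stated in the text: hydropathy is taken from the Kyte-Doolittle scale rescaled to the range 0 to 1, net charge counts only Lys/Arg (+1) and Asp/Glu (−1), and the boundary line ⟨R⟩ = 2.785⟨H⟩ − 1.151 is the commonly quoted CH-plot boundary from the literature rather than a value given in this review. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def charge_hydropathy(sequence):
    """Return (mean normalized hydropathy, absolute mean net charge)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    # Rescale hydropathy from [-4.5, 4.5] to [0, 1].
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)
    mean_r = abs(sum(CHARGE.get(aa, 0) for aa in seq)) / len(seq)
    return mean_h, mean_r


def on_disordered_side(sequence, slope=2.785, intercept=-1.151):
    """True if the sequence lies on the high-charge, low-hydropathy side
    of the assumed boundary line <R> = slope * <H> + intercept."""
    mean_h, mean_r = charge_hydropathy(sequence)
    return mean_r > slope * mean_h + intercept


# Toy sequences (not real proteins) at the two extremes of the plot.
for toy in ("EEEEEEEEEE", "LLLLIIIVVV"):
    h, r = charge_hydropathy(toy)
    print(toy, f"<H>={h:.2f}", f"<R>={r:.2f}", on_disordered_side(toy))
```

Such a two-parameter classification applies to extended IDPs; as noted above, “molten globule”-like IDPs fall in the same region as ordered proteins and are not separated by this plot.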
Proteins were selected by evolution in nature. The greater number of available sequences may contribute some benefit to IDPs/IDRs in the evolution process. During evolution, many mutations are neutral and do not affect the activity of proteins, while rare mutations are advantageous (positive) and will presumably be preserved because they improve or change the protein function in a desired way. An IDP with a given function may on average have more neutral mutations than an ordered protein due to its sequence redundancy, and it may also have more positive mutations to achieve an advantageous phenotype due to the sequence redundancy of the resulting new IDP. The involvement of IDPs/IDRs in organisms may be closely related to their evolution. For flaviviruses, it was shown that the rapid evolutionary dynamics of structural disorder is a potential driving force for their phenotypic divergence. In mitochondria, it was shown that the IDPs/IDRs contents are markedly different between those descending from a bacterial ancestor and those being added to the mitochondria more recently, suggesting that the frequency of IDPs/IDRs in mitochondria was due to the evolutionary origin rather than the functional difference of the protein.
The sequence redundancy of IDPs/IDRs sheds light on the design of IDPs/IDRs. At present, both IDP design and IDP-targeted drug design are in their infancy.[10, 120, 121] Structure-based design of ordered proteins has been extensively studied, resulting in some well-developed strategies. However, it is unclear whether these approaches can be applied to the design of IDPs/IDRs, and few attempts have been made in this direction. The lack of a unique folded structure also poses an extra obstacle. With an ordered protein, one knows the location and geometry of the active/binding site, which is the basis of conventional rational design. For an IDP, it may be difficult to obtain such knowledge in order to achieve a desired function. On the other hand, IDPs/IDRs possess greater available sequence space than ordered proteins, which would be favorable for the rational design of IDPs/IDRs. Recently, Shen et al. successfully designed a small IDP to inhibit TNF-α, where the coupled-folding-and-binding complex structure was adopted as a basis in the design. This result offers an encouraging hint that the design of IDPs/IDRs is possible.
The subject of this review is the possible advantages of IDPs, and some bias in emphasizing them is inevitable. In fact, no benefit is free. IDPs do have disadvantages and can even be harmful. IDPs/IDRs often engage in promiscuous interactions and are associated with various human diseases.[58, 59] It was found that 79% of cancer-associated proteins contain predicted IDRs of 30 residues or longer. In the Swiss-Prot disease category, it was shown that 11 disease-related keywords were strongly correlated with IDPs/IDRs while none correlated with ordered proteins. To suppress their harmful effects, the availability of IDPs/IDRs is tightly regulated in cells, which would incur extra costs. Another apparent disadvantage of IDPs is their susceptibility to proteolysis.[27, 92, 124] To avoid the unwanted proteolytic degradation that can occur in many biological environments, some mechanisms may exist to protect IDPs/IDRs in vivo.[27, 124] Binding with partners can provide effective protection for IDPs/IDRs by hiding the protease-sensitive sites. “Nanny” chaperones were also proposed to protect newly synthesized IDPs/IDRs from degradation by specific interactions. In addition, proteases are usually compartmentalized and sequestered in the cell, and their activity is tightly regulated.[27, 92]
Advantages of IDPs/IDRs have been widely discussed in the literature. Above, we have summarized and surveyed nine possible advantages. This is not a complete list; for example, Uversky gave a list of 21 advantages in a recent review. It should be noted that many of them lack direct proof. Therefore, they are working concepts and hypotheses rather than established facts. Some advantages seem self-evident, for example, economizing genome/protein resources, overcoming steric restrictions in binding, facilitating posttranslational modifications, and enabling flexible linkers. Some other advantages, however, are complicated and even contradictory, for example, high specificity and compatibility with more available sequences. More work is needed to illuminate these properties of IDPs/IDRs. In this respect, molecular modeling, especially with coarse-grained models, is a powerful and extremely useful tool.
The advantages discussed here are not orthogonal. Many are connected in nature. For example, the ability to overcome steric restrictions in binding would also facilitate posttranslational modifications.
The possible advantages of IDPs/IDRs over ordered proteins differ in magnitude. For example, the interface area per residue in binding can vary substantially, whereas the average binding rate of IDPs is only two to three times faster than that of ordered proteins, while binding rates themselves span a few orders of magnitude. Therefore, it may be difficult for experimental studies to clearly demonstrate some advantages of disorder. Some conflicting findings on IDPs/IDRs, for example, on their role in hubs, their evolution rate, and the role of structural disorder in p53-related diseases, may be related to this difficulty.
IDPs/IDRs are distinct from conventional ordered proteins in sequence, structure, and function. IDPs/IDRs possess various advantages over ordered proteins, enabling them to perform vital functions in cells and to persist during evolution. Some possible advantages of IDPs/IDRs are not well understood and require both experimental and theoretical approaches to decipher. Such studies would advance our understanding of IDPs/IDRs and facilitate the development of their applications.
The authors gratefully acknowledge Dr. Fan Jin and Huaiqing Cao for their insightful discussions. The authors also thank Prof. Dr. Brian Matthews and the anonymous reviewer for their helpful and constructive comments that greatly helped to improve the article.