Time, space, and disorder in the expanding proteome universe

Proteins are highly dynamic entities. Their myriad functions require specific structures, but proteins’ dynamic nature ranges all the way from the local mobility of their amino acid constituents to mobility within and well beyond single cells. A truly comprehensive view of the dynamic structural proteome includes: (i) alternative sequences, (ii) alternative conformations, (iii) alternative interactions with a range of biomolecules, (iv) cellular localizations, (v) alternative behaviors in different cell types. While these aspects have traditionally been explored one protein at a time, we highlight recently emerging global approaches that accelerate comprehensive insights into these facets of the dynamic nature of protein structure. Computational tools that integrate and expand on multiple orthogonal data types promise to enable the transition from a disjointed list of static snapshots to a structurally explicit understanding of the dynamics of cellular mechanisms.


Introduction
The human genome sequence has a smaller number of genes than expected: ~19 000 compared to 6.7 million genes in earlier estimates [1]. It has remained largely unclear how this small number of genes can be sufficient to support human complexity. In recent years, hierarchical layers of regulation have been revealed that give rise to some of the functional complexity observed in living cells despite the compact nature of the protein-coding genome. These are directly linked to spatiotemporal dynamics on all levels of protein structure, from sequence and three-dimensional structure to alternative cellular localizations and the spatial organization of specific proteins in tissues and organs. We discuss these new regulatory mechanisms, which contribute to the emergent complexity of living systems (Fig. 1), as follows: (i) About 80% of the human genome maps to non-coding yet functional genomic elements [2]. These regulatory elements include sites for DNA methylation, DNase I hypersensitive regions that function as preferential interaction sites for transcription factors, and long-range regulatory elements. Fine-tuning the control of transcription makes it possible to switch among a large variety of transcriptional states depending on intracellular and extracellular changes. (ii) Alternative splicing has been implicated in tissue differentiation and is positively correlated with organism complexity [3][4][5]. Alternative splicing is an important mechanism to generate multiple sequence variants from the same gene, for instance in different tissues or developmental stages [6]. (v) Switchable alternative protein-protein interactions lead to yet more diversity [10]. IDPs can engage in a large number of alternative interactions as a function of PTMs and overlapping short linear motifs. 
Some IDPs use overlapping linear segments for binding to multiple, distinct protein partners with low affinities [11], thus enabling rapid rewiring of large cellular interaction networks; these same capabilities enable the rapid rewiring of gene regulatory networks [6, 12, 13]. (vi) Protein turnover. The half-lives of eukaryotic proteins range from minutes to decades [14,15]. Such differential protein turnover leads to a greater range of protein abundance than, for example, transcript abundance. Transcripts vary some 2 orders of magnitude in their cellular abundance, whereas proteins cover a dynamic range of more than 6 orders of magnitude in some cell types. Low-abundance proteins are turned over more rapidly by proteasomal proteolysis, contain more IDRs, and are enriched in PEST motifs [16,17]. (vii) Multiple subcellular locations enable the same protein to exert different functions in different parts of the cell [18]. (viii) Proteins involved in transducing intrinsic and extrinsic signaling exist within spatially restricted concentration gradients. For example, Wnt signaling gradients control asymmetric cell divisions during early development and, later in the life of complex organisms, maintain tissue organization. Spatial organization enables the formation of complex tissues and organs up to the highly interconnected human brain.
The term "proteoform" has been recently proposed as an umbrella term to summarize all possible alternative protein sequences for a given protein including genetic sequence variants, PTMs, splice variants, and proteolysis variants [19]. Powerful methods to characterize proteoforms have been comprehensively covered in several excellent papers [20][21][22]. It should be added at this point that IDPs and IDRs can be viewed as "outliers" in the context of structural biology terminology: folded proteins are readily described by the well-established hierarchy of 1D structure (i.e. protein sequence) to 2D structure (i.e. local secondary structure elements) to 3D structure (i.e. atomic coordinates of atoms of a folded protein chain), but IDPs lack a fixed 2D or 3D structure and therefore elude a straightforward classification in the established terminology framework. To cope with this phenomenon, it was recently suggested to extend the concept of "proteoforms" to include manifold alternative conformations of IDPs and IDRs as "conformational (or basic or intrinsic) proteoforms" [23]. Other authors have used various descriptors for IDPs [24], including "4D proteins" [25] to indicate that their conformations and functions can change over time; alternatively, other authors have attempted to classify IDPs by physical parameters such as charge patterns, IDR length, and residual structure [26]. While an in-depth discussion of the issue of IDP classification and terminology is clearly beyond the scope of this viewpoint, it is important to acknowledge the current imperfections of our terminology and to encourage community-wide efforts to find a new consensus solution for a more effective terminology that would fully integrate IDPs and IDRs into the terminology of biological sciences.
Figure 1. Challenging questions in proteomics. The proteome is not a fixed entity but a dynamic system. Unraveling a multitude of dynamic layers of its regulation is key to comprehensive understanding.
Compared to extensive insights into multiple aspects of proteoforms, much less is known about higher-order structural proteome dynamics that enable cellular complexity (Fig. 1). We focus on recently developed methodologies designed to study dynamic protein conformations, interactions, and subcellular mobility. We also present a brief summary of what we consider to be remaining key challenges in studying the structure and function of cellular proteomes.

How can alternative structures tune functional protein interactions (and vice versa)?
Not all protein interactions fit the classical lock-and-key model of molecular recognition achieved by docking of rigid components. Fine-tuning target recognition can require 'conformability', as in the case of the bacterial Lac repressor protein, which assumes a fuzzy complex when sliding along non-specific DNA sequence but a mostly structured state in the tightly DNA-bound complex once associated with its specific target sequence [27]. Similar observations have been made for the human sequence-specific transcription factor LEF1, which is mostly disordered when free in solution but assumes a defined 3D structure in complex with its specific target DNA [28]. Even more pronounced structural transitions from unstructured to pathologically structured fibril conformations can contribute to neurodegenerative disorders, as in the case of Parkinson disease, which is associated with toxic accumulation of α-synuclein aggregates [29]. Many cell-regulatory hub proteins contain IDRs [30]. Adenomatous polyposis coli (APC), a tumor suppressor protein, is frequently mutated in cancer, and cancer-mutated forms of APC often lack most of their 2000-residue-long IDR. Axin1, an interaction partner of APC, can gain pathological functions if single point mutations disrupt the normal fold of a small folded domain that is located between its long IDRs [31][32][33]. Increasing, albeit largely anecdotal, evidence suggests that both transient and persistent structural disorder play crucial roles in biology and disease mechanisms, and that there is no unique disordered state but rather a continuum from fully structured to fully disordered [34,35].

X-ray crystallography beyond static structures
Traditionally, structural biology was rationalized by the dogma that biological function requires a rigid 3D protein structure. According to this dogma, it should be possible to understand biology by solving one minimal energy structure per protein. More than 100 000 structures of folded domains have been solved over the last decades, and first near-complete structural proteome models have been proposed based on homology modeling [36]. Of these protein structures, 90% have been solved using X-ray crystallography, which is intrinsically restricted to the solid phase of proteins. Structural protein dynamics in solution are, therefore, incompletely characterized so far. Despite its historical bias towards solving static structures, X-ray crystallography has been a chief contributor to the birth of the IDP field [37], as thousands of polypeptide segments in crystallised protein constructs do not give rise to a well-defined electron density and can therefore be classified as "disordered" [38]. More direct time-resolved methods are currently under development, building on the latest advances in high-brilliance X-ray sources. Spectacular first dynamic pictures of ultrafast light-induced femtosecond isomerization events in the photoactive yellow protein, and of alternative conformations of riboswitches dynamically reshaping upon ligand binding, highlight the possibility of capturing dynamic structural data in the future [39,40]. In addition to exciting technological developments, it will be interesting to explore improved computational possibilities for a more comprehensive analysis of existing X-ray crystallographic datasets: further improvements are possible by treating protein dynamics explicitly and enabling improved fitting of existing electron density maps to alternative conformations and locally flexible parts in proteins [41].
Even fully disordered proteins are no longer outside of the reach of X-ray crystallography. Several important c-Myc structures have been solved in complex with specifically binding partner proteins [42]. It is hoped that this will make previously undruggable IDPs specifically targetable by exploiting unique interfaces that only arise in specific protein-protein complexes of these IDPs [43]. X-rays can make numerous protein dynamics crystal clear.

Cryo-EM and NMR: a dynamic pair
In the last two decades, major technological breakthroughs in electron detection efficiency and image processing have culminated in a recent explosion of new Cryo-EM structures, whose number is growing faster than that of X-ray crystallography structures (average annual growth of 34 versus 9%). Cryo-EM, like NMR spectroscopy, is capable of revealing local structural disorder. NMR peaks of disordered protein segments cluster together more closely because their more averaged chemical environments result in lower chemical shift dispersion [44]. Anisotropic Cryo-EM resolution scales with flexibility [45], i.e. the highest resolution is achievable for rigid regions and the lowest for very flexible regions [46]. Their preferred molecular size ranges are complementary: typically below 50 kDa for NMR and above 150 kDa for Cryo-EM. While the rate of progress is accelerating, the costs of state-of-the-art Cryo-EM and NMR facilities still restrict broader community access to these technologies. Establishing optimal protein production protocols and sample conditions remains a shared bottleneck among all high-resolution structural techniques [47,48]. While the contribution of NMR to solving new structures might shrink in the future, it cannot be over-emphasised that this technique has unique capabilities in directly covering a large range of protein solution dynamics on timescales ranging from picoseconds to hours [49]. In brief, all major high-resolution structural biology technologies continue to develop dynamically and complement each other.

Biochemical approaches to study protein conformational dynamics
Many aspects of protein conformational dynamics are either impractical or impossible to study using exclusively the above-mentioned high-resolution structural methods. Biochemical methods including 1D SDS-PAGE and proteolysis have been successfully used as valuable complementary methods to characterize protein folding and conformational heterogeneity in solution [50]. Short digestion protocols as in pulse proteolysis [51,52], membrane pulse proteolysis [53], SILAC pulse proteolysis [54], and FASTpp [55] have considerably increased throughput in recent years. FASTpp uses thermal denaturation, in contrast to chemical denaturation in pulse proteolysis. FASTpp exploits the principle of rapid digestion of exposed, thermally unfolded polypeptide segments before they have a chance to aggregate. FASTpp detects ligand-induced folding and stabilisation as well as missense mutation effects on protein stability [56][57][58]. While FASTpp is technically simple and fast to implement without the need to equilibrate samples in denaturant, which can take months in the case of kinetically stable proteins [59], pulse proteolysis can be used to derive equilibrium unfolding energies (ΔΔGs). Limited proteolysis (LiP) has been used for many decades in structural biology and continues to be actively developed using a wide range of proteases and readout methods from low to high multiplexity [60,61]. A recent breakthrough study used LiP in combination with peptide sequencing by mass spectrometry to simultaneously map conformations of 1000 yeast proteins and to reveal quantitative structural changes in 300 proteins upon growth on different sugars [62,63]. Similar methods combining the best of classical biochemical methods with ultra-sensitive and large-scale protein detection have great potential for revealing structural proteome dynamics under a large range of biological [64], physical, and chemical conditions, thereby redefining our understanding of protein stability and folding in the cellular context.

Label-dependent protein folding assays
A wide range of highly specific methods to study protein conformations depends on selective chemical protein labeling. Tryptophan-free proteins can be selectively labeled by replacing a single chemically similar aromatic residue such as phenylalanine with tryptophan, which often does not perturb the biological behavior of the wild-type protein [65,66]. Another, more widely used chemical labeling method is hydrogen deuterium exchange (HDX). As all proteins contain hydrogens, their exchange with deuterons presents a very generic and minimally perturbing strategy of labeling. Hydrogens are ubiquitous in proteins, yet local hydrogen to deuterium exchange rates vary over many orders of magnitude depending on their structural interactions: rigidly folded and hydrogen-bonded segments of proteins exchange very slowly (~hours to years) while random coil regions can often exchange rapidly (~milliseconds to seconds) [67]. This effect can be used to investigate how much structure a disordered region assumes upon addition of specific ligands by investigating how the exchange rates decrease as ligand is added. A recent study demonstrated the use of reverse (i.e. deuterium to hydrogen) exchange to map peptidome-wide peptide-protein interactions. This study highlights the fundamental possibility of exploiting atomic changes to map protein interactions on a global scale [68]. Using HDX technologies on whole cells for cellular structural studies is a desirable extension of the method; however, a significant hurdle is the need to minimize back-exchange during necessary processing steps such as cell lysis and protein digestion prior to bottom-up LC-MS/MS analysis. Novel strategies in directed evolution or metagenomics selection [69] have the potential to identify novel types of acid-compatible specific proteases that can help to accelerate specific digestion under conditions that drastically slow down back-exchange. 
These highly acidic conditions would be only necessary after conformational features are "encoded" as deuterium incorporation and thus do not affect the native structural states of cells. Ultra-rapid digestion methods, mass-spectrometry compatible detergents and faster computation of complex spectra resulting from a large number of variable isotope changes may further help to pave the way toward proteome-wide in vivo HDX experiments [70,71].
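The dependence of exchange on local structure described above can be illustrated with a minimal numerical sketch. It assumes the commonly used simplification that each amide exchanges with single-exponential kinetics at a rate equal to its intrinsic (random-coil) rate divided by a protection factor, and that the apparent opening free energy is RT ln(PF). The intrinsic rate, the labeling time, and the three protection factors are hypothetical values chosen for illustration only, not data from the cited studies.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def exchanged_fraction(time_s, k_intrinsic, protection_factor):
    """Fraction of one amide hydrogen exchanged after time_s seconds.

    Assumes single-exponential kinetics with an observed rate equal to
    the intrinsic (random-coil) rate divided by the protection factor.
    """
    k_observed = k_intrinsic / protection_factor
    return 1.0 - math.exp(-k_observed * time_s)


def opening_free_energy(protection_factor, temperature_k=298.15):
    """Apparent opening free energy, RT * ln(PF), in kcal/mol."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(protection_factor)


# Hypothetical example: intrinsic rate of 10 per second, 60 s labeling pulse.
K_INTRINSIC = 10.0
for label, pf in [("disordered, no ligand", 1.0),
                  ("partially structured with ligand", 1e3),
                  ("rigidly folded", 1e6)]:
    fraction = exchanged_fraction(60.0, K_INTRINSIC, pf)
    energy = opening_free_energy(pf)
    print(f"{label}: PF={pf:g}, exchanged={fraction:.4f}, dG={energy:.2f} kcal/mol")
```

In this toy setting, a ligand that raises the protection factor of a disordered segment shows up as a drop in deuterium uptake at a fixed labeling time, which is the readout the paragraph above refers to.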

Solubility methods to probe protein conformation
Alternative methods based on physical principles increasingly complement chemical methods. One of the earliest physical methods to characterize protein unfolding is monitoring the soluble fraction at a range of temperatures. Analogous to egg white protein in boiled eggs, most proteins irreversibly precipitate above their unfolding temperature. Temperatures just slightly above the physiological growth optimum can cause dramatic reductions of proteome solubility in cells lacking the Hsp70 system, an essential component of the cellular heat shock protection system that interacts with aggregation-prone unfolded and partially folded proteins [72,73]. The cellular thermal shift assay (CETSA) exploits this effect to screen ligand-dependent changes of thermal solubility of proteins [74]. CETSA revealed drug-dependent increases of kinase stability. Initial examples of CETSA required a large number of samples to be screened by quantitative antibody-based detection methods [74]. Thermal proteome profiling (TPP) overcomes the dependence on antibodies and limited throughput by combining the CETSA principle with TMT 10-plex mass spectrometric detection in a large temperature window between 37 and 67 °C. A small number of TPP runs in human cells and cell lysates enabled quantitative tracing of drug interactions with nearly 7000 human proteins and revealed off-target interactions of a drug [75]. Interestingly, TPP can also be applied to many transmembrane proteins either before or after detergent solubilisation using a range of mild detergents [76]. A systematic comparison of both datasets suggests that cellular compartments alter the biophysical stability of membrane proteins: membrane proteins in native membranes are more stable than intracellular proteins, while detergent-solubilized membrane proteins are less stable than intracellular proteins. 
This finding suggests that membrane proteins are more stable in vivo than intracellular proteins yet significantly less stable in vitro, consistent with their reputation of being notoriously unstable during crystallization trials in detergents. As protein structural dynamics affect protein interactions and interactions in turn affect structural stability, characterizing these dynamics-function relations is of fundamental interest and has started to be biomedically transformative by establishing novel drug discovery routes.
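The thermal shift readout underlying CETSA and TPP can be sketched as follows. The example estimates an apparent melting temperature as the point where the soluble fraction crosses 0.5, using linear interpolation between neighbouring temperature points, and reports the ligand-induced shift. Published workflows typically fit sigmoidal melting curves instead; the interpolation is a deliberate simplification, and the ten temperatures and soluble fractions are invented values, not measurements from the cited work.

```python
def melting_temperature(temperatures, soluble_fractions):
    """Temperature at which the soluble fraction first drops below 0.5.

    Uses linear interpolation between the two flanking measurements and
    returns None if the curve never crosses 0.5.
    """
    points = list(zip(temperatures, soluble_fractions))
    for (t_low, f_low), (t_high, f_high) in zip(points, points[1:]):
        if f_low >= 0.5 > f_high:
            return t_low + (f_low - 0.5) / (f_low - f_high) * (t_high - t_low)
    return None


# Hypothetical ten-point temperature series (degrees Celsius), one point per TMT channel.
TEMPERATURES = [37, 41, 44, 47, 50, 53, 56, 59, 63, 67]
# Hypothetical soluble fractions relative to the lowest temperature.
VEHICLE = [1.00, 0.98, 0.93, 0.80, 0.55, 0.30, 0.15, 0.07, 0.03, 0.01]
WITH_DRUG = [1.00, 0.99, 0.97, 0.92, 0.80, 0.58, 0.33, 0.15, 0.06, 0.02]

tm_vehicle = melting_temperature(TEMPERATURES, VEHICLE)
tm_drug = melting_temperature(TEMPERATURES, WITH_DRUG)
print(f"Tm vehicle: {tm_vehicle:.2f} C, Tm with drug: {tm_drug:.2f} C, "
      f"shift: {tm_drug - tm_vehicle:.2f} C")
```

A positive shift for a given protein is the kind of signal that would flag it as a candidate drug target or off-target in such a screen.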

How can transient protein-protein interactions contribute to functional diversity?
Specific protein-protein interactions (PPI) are widely considered as key to understanding cellular functions of proteins. One might intuitively expect most PPI to be high affinity as this ensures a high fraction of specifically bound complexes. Highest affinity can be reached with rigid proteins, but transient and biophysically weak interactions are at the hub of biological interaction networks [77] and ultra-affinity is rare [78]. One of the most striking examples for two non-rigid proteins interacting specifically is the mutually synergistic folding of two independently flexible proteins in the ACTR-NCBD complex [79]. Both protein domains engage in an intimate complex that covers a large, rather hydrophobic interface to jointly regulate transcription as crucial parts of a large number of larger proteinaceous transcription-regulatory machineries [80,81].

High-throughput affinity-based methods to study protein interactions
Most large-scale methods depend on short, disordered affinity tags [48,82]. Affinity purification (AP)-MS uses a single affinity enrichment step and investigates all co-eluting proteins, while tandem (T)AP-MS uses two sequential affinity steps. More specific interactors than in sequential multiple-affinity methods can be retrieved using two or more orthogonal tag systems in parallel for the same target protein, for instance FLAG-tag and Strep-tag, in interactomes using parallel affinity capture (iPAC) [83] or quantitative SILAC-iPAC [84]. Recently, GFP was introduced as a novel affinity tag in AP-MS [85], which made it possible to build on existing large GFP-fusion libraries and to selectively enrich interactors of most human proteins [77]. How good are these methods at capturing weak yet potentially biologically important interactions? It is a priori not clear how these methods might bias against the detection of very transient binding events shorter than current affinity protocols, or bias toward complexes that only form in vitro in dilute lysis and affinity purification buffers but would never form in the crowded intracellular environment in the presence of optimal concentrations of molecular chaperones. Clearly, orthogonal methods are needed to validate interactions and to discover additional interactions that are too transient or weak for detection by affinity-enrichment methods.

Overcoming the quantitative protein-protein interaction validation bottle-neck
While affinity methods readily provide large lists of specific interactors, it is generally difficult to derive predictions about proteoform-specific dissociation constants, which would enable quantitative predictions for other protein concentrations. Direct biophysical high-throughput quantification of the binding strength of putative protein-protein interactions has remained highly challenging. Single-molecular-interaction sequencing (SMI-seq) enables high-throughput quantification of up to hundreds of protein interactions in parallel, covering a broad range of affinities, by covalently crosslinking proteins to nucleotide barcodes for multiplexed sequencing in situ [86]. SMI-seq has been successfully applied to both water-soluble and membrane proteins incorporated in phospholipid bilayer nanodiscs [86]. SMI-seq uses cell-free in vitro production of proteins and is, therefore, not fundamentally limited by the natural genetic code. Related approaches that offer high throughput and quantification of protein interactions will be valuable for coping with the validation bottleneck in protein-protein interaction research.
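Why a dissociation constant enables predictions at other protein concentrations can be made concrete with a short sketch. It assumes the simplest case of reversible 1:1 binding at equilibrium, for which the complex concentration follows from the quadratic solution of the mass-action equation. The function names and all concentrations are hypothetical and serve only as illustration.

```python
import math


def complex_concentration(total_a, total_b, kd):
    """Equilibrium concentration of the AB complex for 1:1 binding.

    Solves Kd = [A][B]/[AB] with mass conservation for both partners;
    all arguments must use the same concentration unit.
    """
    total = total_a + total_b + kd
    return (total - math.sqrt(total * total - 4.0 * total_a * total_b)) / 2.0


def fraction_of_a_bound(total_a, total_b, kd):
    """Fraction of protein A that is found in the complex."""
    return complex_concentration(total_a, total_b, kd) / total_a


# Hypothetical weak interaction with Kd = 100 (micromolar).
KD_WEAK = 100.0
for concentration in (1.0, 10.0, 100.0):
    bound = fraction_of_a_bound(concentration, concentration, KD_WEAK)
    print(f"both partners at {concentration:g} uM: {bound:.3f} of A bound")
```

Under these assumptions, an interaction that is almost unpopulated in a dilute lysate can be substantially populated at high local concentrations, which is one reason why weak interactions may escape affinity-enrichment protocols.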

Comparing in vitro and in vivo protein associations
Comparison of in vitro and in vivo protein complexes is in principle possible by fixation of protein interactions using chemical crosslinking or using fluorescence correlation spectroscopy (FCS) [87]. In vivo FCS can visualize the dynamic assembly and disassembly of protein complexes during the cell cycle [87]. Crosslinking mass spectrometry (XL-MS) recently advanced from the study of a few crosslinks of small protein complexes to large viruses thanks to improvements on all levels from MS-cleavable cross-linkers over new mass spectrometric strategies to novel data analysis workflows [88][89][90]. A wide range of cross-linkers exist that cover zero-length to several nanometers in distance between crosslinked molecules. Relatively short lengths, such as 0.5 nm for MS-cleavable DSSO, can be ideal for use in integrative biology to refine structural models of protein complexes of partly solved composition [91]. Larger cross-linkers can be beneficial to elucidate the network of transiently or weakly binding proteins in large protein complexes [92]. Future expansion of these novel crosslinking-MS strategies to in vivo analysis of intracellular protein complexes using a class of cross-linkers that combines clickable affinity purification handles for enrichment of crosslinked peptides and MS-cleavability for accelerated peptide identification is becoming possible [93,94].

How does protein-organelle partitioning affect protein interactions?
Even the simplest known living cells are compartmentalized [95]. Membrane enrichment is crucial for membrane-intrinsic transporters and helps to orchestrate a variety of metabolic pathways [96]. Eukaryotic cells have multiple membrane-enclosed organelles that enable a wide range of physicochemical conditions to coexist in a single cell. Secretory granules can have a pH of 5.0 while other compartments typically vary between pH 6.4 and pH 7.2 [97]. Some proteins are fully folded in one compartment but unfolded in another [98].

Chemical proximity-labeling strategies to discover protein co-localisation
Efficient strategies are being developed to selectively label membrane-associated protein complexes for subsequent MS detection. APEX2-MS [99] is based on an enzyme that catalyzes the conversion of biotin phenol to a biotin radical and rapid labeling of nearby proteins [99,100]. Both phenol and peroxide as co-substrates of this labeling reaction might induce cellular stress in some organisms and cell types. Selective proteomic proximity labeling using tyramide (SPPLAT) is a chemical variation on the same theme of enzymatically creating an activated biotin conjugate that has a short half-life and therefore can only react in the immediate vicinity of the activating enzyme [101,102], which is horseradish peroxidase in the case of SPPLAT in contrast to ascorbate peroxidase in APEX [103]. Biotinylation is in principle also possible using more gentle enzymatic approaches, as biotinylation is one of the most specific known PTMs [104]. This natural specificity is, however, a challenge for APEX-like applications that require promiscuous biotinylation in the proximity of the enzyme. A mutant of the bacterial BirA ligase that lacks this substrate specificity, "BioID", has been applied to discover transient interaction partners of specific BirA-mutant labelled proteins [105]; an accelerated unspecific biotin ligase called BioID2 is available [106,107]. Directed evolution might further improve the activity of BioID2 at 37 °C, as BioID2 is derived from a highly thermophilic (Aquifex aeolicus) source and displays optimal activity far above 37 °C [106]. Additional improvements of the method appear possible for many applications if biotin enrichment is performed on the peptide level instead of the protein level, as ~200-fold increased direct mass spectrometric detection was demonstrated for biotin-peptides [108].

How does lipid-less subcellular partitioning affect protein interactions?
Even within a single organelle, biomolecules are not homogenously mixed. Active sub-organellar partitioning often involves ATP-fuelled molecular machines, for instance dynein guiding cargo proteins along the cytoskeleton [109].
Other sub-organellar structures form spontaneously. IDPs have been recently identified as crucial components driving the assembly of membrane-less cellular compartments. The prion-like domain of Xvelo, an IDP, is crucial for the formation of Balbiani bodies, which are a hallmark of asymmetry in oocyte formation [110]. A variety of different flavours of protein-RNA bodies have been identified, including stress granules, nucleoli, Cajal bodies, and PML bodies in the nucleus. Intriguingly, some of their properties can be explained by sequence patterns in their specific IDPs. Specific F/R/G-rich motifs in these IDPs can efficiently drive liquid-liquid phase separations and contribute to the formation of these membrane-less bodies [111]. Thus subcellular order comes, at least in part, out of intrinsic disorder. Given their large molecular size, the ribosome and other large cellular machines including the proteasome and chaperonins constitute nanoscopic cellular compartments in their own right. Based on RNA-seq and isolation of translationally halted ribosomes, and one-by-one addition of chaperones, it is now becoming possible to selectively profile ribosomal complexes to unravel how molecular chaperones engage during the translation process. This "selective ribosome profiling" approach revealed that trigger factor (TF) engages in vivo only upon emergence of ~100 nascent residues, in contrast to earlier suggestions based on in vitro work that TF waits by default at the ribosomal exit tunnel [112]; analogous approaches have great potential to transform our understanding of the spatiotemporal organisation of proteostasis, including synthesis and folding of membrane proteins.
Exciting open questions related to suborganellar cellular structures include: how is the timing of metabolic pathways tuned by subcellular structures? Are PTMs regulating their formation? How can we monitor systems-wide perturbations of these structures by changing environments?

Organelle proteomics
Combining state-of-the-art mass spectrometry, partial separation of organelles in a density gradient, and statistical analysis of the resulting patterns has enabled the first quantitative and nearly proteome-wide maps of cellular localizations for eukaryotic cells. Such methods include protein correlation profiling (PCP) [113] and localization of organelle proteins by isotope tagging (LOPIT) [18]. LOPIT has been further refined by combination with 10-plex TMT labeling in hyperLOPIT [114]. TMT labeling of peptides is independent of subcellular protein fractionation in density gradients and solely used to achieve maximal subcellular resolution, coverage of sub-cellular niches, and reduction of false assignments to different sub-cellular niches; differential centrifugation and in-solution digests have been used as technical variations of hyperLOPIT [115]. LOPIT studies have revealed that many more proteins than expected are present in multiple locations of the cell. This observation gives rise to intriguing questions, including how multiple locations are linked to structural and functional diversity and PTMs as well as splice variants and IDRs.
Figure 2 [114,118]. Mouse stem cell hyperLOPIT data [114]. Predominant variations of the partitioning of individual proteins into fractions of the density gradient are captured by the first two components (denoted PC1 and PC2) of a principal component analysis (PCA). Wnt signaling proteins APC2, CK1, GSK3β, neurodegeneration-linked Huntingtin, and the breast cancer-linked tumor suppressor protein BRCA1 (highlighted as solid black circles) are not assigned to a single location, characteristic of proteins with mixed localization.
APC, which contains an unstructured region of some 2000 residues, for instance, can travel from the nucleus to near the membrane and engage in several condition-dependent transient functional protein and protein-RNA complexes, including the machinery for its own synthesis [116,117]. It will be a fascinating challenge to explore globally how other IDPs act differently in different parts of the cell and how dynamic cellular structures form under direct control of IDP regions. Selected examples for other Wnt pathway members are highlighted in a hyperLOPIT plot (Fig. 2) [118].
Despite their current limitations to relatively small numbers of different proteins that can be observed simultaneously, it will be interesting to explore the complementary benefits of cryo-electron tomography (cryo-ET) [119][120][121] and super-resolution (SR) fluorescence microscopy [122]. Both techniques are experiencing rapid technological advances and further improvements have the potential to provide novel insights into high-resolution spatiotemporal subcellular dynamics as well as fine details of tissue architectures [123].

How can tissue and organ partitioning affect localized interactions?
Complex tissues and organs such as the human brain clearly require a high degree of spatial organization beyond single cells. Nearly 50 years ago, Francis Crick proposed diffusive "morphogen" gradients as a minimal ingredient for the spatial organization of cells during embryogenesis [124]. Only very recently has it become possible to directly visualize morphogen gradients in vivo using elegant organoid models that reflect most architectural features of organs while adding the benefits of indefinite expansion and culturability. Surprisingly, the measured short-range cellular Wnt gradients are inconsistent with free diffusion and instead appear to require a cell-bound propagation mechanism [125]. Wnt signaling as a whole is a perfect illustration of the importance of various levels of disorder in establishing multicellular order. Many of its crucial signaling components, including the scaffolds APC, Axin and WTX, contain large IDRs of up to some 2000 residues [33], have large numbers of PTMs and alternative interactions [126], are mobile within the cell (Fig. 2), and read the gradient signal that spans several cell lengths and ultimately establishes tissue and organ shape. Curiously, the massively disordered APC protein is also needed for proper synapse formation in the brain. Specific mutations of APC correlate with autism, and a conditional knock-out impaired synapse maturation [127]. Defined disorder appears to be an architectural hallmark of some of the most intricate structures in nature, which are just becoming observable by mass spectrometry imaging [128].

Computational biology helping to fill the voids in structural proteomics
Acquiring all-atom movies of living organisms is clearly beyond experimental reach. Computational methods increasingly help to fill gaps in our understanding of structural biology. Efficient algorithms can predict secondary structure, IDRs and, increasingly, 3D structure from readily available genomic sequences [129][130][131]. Despite the astronomical number of conformations that even a short polypeptide sequence can adopt in 3D, de novo in silico prediction of the folding of ~100-residue peptides based on physical principles has been demonstrated for some examples [132]. However, the community experiment on protein structure prediction known as CASP shows that de novo prediction of even small, single-domain proteins, while improving over time, is still far from routine. It further shows that the most reliable method for protein 3D structure prediction remains the construction of protein models using the known structures of homologous proteins as templates. These template-based models suffer from template bias, i.e. the resulting structures are more similar to the templates than to the true structures. Improvements in protein dynamics methods are finally leading to approaches for reducing the degree of template bias [133]. Similarly, the most recent force-field developments now show promise toward correct prediction of conformational ensemble properties of IDPs [134,135]. Computational approaches can also amplify the insight attainable from highly complex multi-dimensional proteomics experiments through efficient dimensionality reduction. PCA plots often capture most of the variation of high-dimensional data in visually intuitive two-dimensional plots (Fig. 2) [136]. Significant efforts by the computational science community are needed to maximize the knowledge gained from rapidly accumulating and diversifying multi-omics datasets and ultimately to reveal fascinating new hidden ordered patterns in complex cellular dynamic systems [137].
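As a minimal sketch of the dimensionality reduction mentioned above, the following example projects synthetic gradient profiles onto their first two principal components. The data are simulated rather than taken from any study cited here, and the function name `pca_2d`, the three compartment templates, the ten fractions, and the noise level are assumptions made purely for illustration.

```python
import numpy as np

def pca_2d(profiles):
    """Project rows (proteins) onto the first two principal components.

    Returns the 2D scores and the fraction of variance explained by
    PC1 and PC2.
    """
    # Center each column (gradient fraction) at zero mean
    centered = profiles - profiles.mean(axis=0)
    # SVD of the centered matrix; rows of vt are the principal axes
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:2].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:2]

rng = np.random.default_rng(0)
# Simulated data: three hypothetical compartment templates across ten
# fractions, with 50 noisy protein profiles generated around each one
templates = rng.random((3, 10))
profiles = np.vstack(
    [t + 0.02 * rng.standard_normal((50, 10)) for t in templates]
)

scores, explained = pca_2d(profiles)
print("variance explained by PC1 and PC2:", explained)
```

Because the simulated proteins cluster around only three templates, two components suffice to separate them. Experimental data are usually less clean, so the fraction of variance explained is worth checking before a two-dimensional plot is interpreted.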

Outstanding challenges in proteomics
(i) Which weak or transient interactions are functionally important? (ii) How can we quantitatively understand and predict in vivo versus in vitro protein interactions? (iii) How can we quantitatively link the various "omics" from DNA to RNA and the higher-order structure of proteins, including their cellular trafficking? (iv) What are the underlying principles that determine cellular protein structural dynamics, and how can we predict them from readily accessible genomic sequences? (v) How can we improve the mutually enhancing efforts of experimentalists and theoretical scientists to tackle highly complex "multi-omics" projects? It is a formidable challenge for computational biologists and mathematicians to glean a sufficient breadth of data types from experimentalists and to discover overarching patterns in various "omics" datasets that are fundamentally connected by common cellular biology. (vi) How can we link different protein structural states to functional diversity? The decades-old C-value paradox states that genome sizes are not well correlated with organism complexity [138]. Extensive multi-purposing in eukaryotic proteomes might explain the exceptional "coding efficiency" of many eukaryotic genomes that appear too small relative to the complexity of the organism [4,13]. Quantifying the extent of multi-purposing is highly challenging because its individual dimensions, such as PTM, alternative splicing and IDR discovery, and protein function prediction and validation, are each challenging in their own right. Expanding and integrating these efforts into comprehensive high-throughput methods is highly desirable but not yet straightforward [139]. (vii) Can we use our improved understanding of spatiotemporal proteome dynamics to improve life in ageing and growing societies?

Conclusion
Bottom-up approaches have been very powerful in structural biology over many decades. DNA and RNA sequencing have become highly robust and widely accessible technologies, and rapid proteome-wide protein sequencing, now possible for several organisms, is transforming our understanding of biology. Protein de novo folding simulations have reached near-atomic precision for small folded domains and IDRs. Higher-order structures are so far less readily predictable. Complicating factors are intracellular and environmental fluctuations, which can be observed even in the simplest model systems [140,141]. Clever combinations of traditional biochemical and physical assays with increasingly rapid bottom-up mass spectrometry generate many new opportunities to characterize these higher-order structures, as outlined in this review (Fig. 3). Collectively, these new bottom-up mass spectrometric techniques make it possible to "sequence" many crucial layers of dynamic regulation of protein structures. Very recent breakthrough studies demonstrated that spatiotemporal engineering of just a few proteins in an organism can improve carbon fixation or accelerate the switch from reduced photosynthetic activity under low-light conditions to full photosynthetic productivity once more light becomes available after clouds have passed [142,143]. An improved proteome-wide understanding of the hidden order in the apparent disorder of higher-order protein structures in living organisms can pave the way to de novo spatiotemporal engineering of organisms with beneficial properties. While this might sound like a long way off at present, just 20 years ago it was well beyond the wildest imagination that we would be able to routinely sequence entire proteomes in an hour of measurement time [144]. An improved understanding of cellular dynamics will also make it increasingly possible to avoid late-stage failures in drug discovery pipelines.
Plenty of dynamics at the bottom of biology (Fig. 3).
We wish to apologize to all authors of brilliant work concerning structural proteome dynamics, which we could not directly highlight here due to space restrictions. We wish to thank Dan Nightingale, Nina Kočevar Britovšek, Chris Taylor