The logic of protein post‐translational modifications (PTMs): Chemistry, mechanisms and evolution of protein regulation through covalent attachments

Protein post‐translational modifications (PTMs) play a crucial role in all cellular functions by regulating protein activity, interactions and half‐life. Despite the enormous diversity of modifications, various PTM systems show parallels in their chemical and catalytic underpinnings. Here, focussing on modifications that involve the addition of new elements to amino‐acid sidechains, I describe historical milestones and fundamental concepts that support the current understanding of PTMs. The historical survey covers selected key research programmes, including the study of protein phosphorylation as a regulatory switch, protein ubiquitylation as a degradation signal and histone modifications as a functional code. The contribution of crucial techniques for studying PTMs is also discussed. The central part of the essay explores shared chemical principles and catalytic strategies observed across diverse PTM systems, together with mechanisms of substrate selection, the reversibility of PTMs by erasers and the recognition of PTMs by reader domains. Similarities in the basic chemical mechanism are highlighted and their implications are discussed. The final part is dedicated to the evolutionary trajectories of PTM systems, beginning with their possible emergence in the context of rivalry in the prokaryotic world. Together, the essay provides a unified perspective on the diverse world of major protein modifications.


INTRODUCTION
The term protein post-translational modification (PTM), technically speaking, refers to any covalent addition to, transformation of, or subtraction from a protein's chemical structure that occurs in living organisms post-translationally, that is, after ribosome-catalysed biosynthesis of a given protein. However, especially in recent years, the usage has often been narrowed down to covalent addition to

Protein phosphorylation as a regulatory switch
The first relevant line of research is the study of reversible protein phosphorylation as a mechanism for regulating enzyme function [6,7] (Figure 2A). This programme emerged from the study of sugar metabolism pioneered in the 1940s by Carl and Gerty Cori. The Coris, among their other contributions, identified glycogen phosphorylase as the enzyme that catalyses the rate-limiting step of glycogen breakdown and demonstrated that it exists in two interconvertible states, [8,9] a discovery that laid the foundation for the subsequent work.
[12][13] The interconversion was enzymatic, catalysed in each direction by a specific enzyme, and involved a transfer of the γ phosphoryl of ATP onto one major serine residue [14] and, in reverse, hydrolysis of the phosphorylated serine. These phosphorylation/dephosphorylation events directly influenced the catalytic activity. The underlying structural mechanism was later elucidated by means of a crystal structure of phosphorylated glycogen phosphorylase by Louise Johnson's laboratory. [15] While Krebs, Fischer and Sutherland were not the first to observe protein phosphorylation - one should mention the detection of phosphate [16] and phosphoserine [17] in the protein vitellin by Phoebus A. Levene and co-workers and, later, the study of enzymatic casein phosphorylation by Burnett and Kennedy [18] - the attractiveness of their findings stemmed from assigning to phosphorylation clear functional relevance that was corroborated through subsequent work.
The kinase that regulates glycogen phosphorylase (phosphorylase kinase) was later shown to be itself activated by another kinase (cAMP-dependent protein kinase or PKA), which provided the first example of a kinase ʻcascade'. [19] Around the same time, it was demonstrated that not only glycogen phosphorylase (which breaks down glycogen), but also glycogen synthase (which produces it), is regulated by phosphorylation, the more active form in the latter case being the dephosphorylated one. [20] Ultimately, both enzymes are regulated by the hormones insulin and glucagon, through signalling pathways that - as has been gradually revealed - rely heavily on both serine/threonine and tyrosine phosphorylation, with the insulin receptor itself being a receptor tyrosine kinase. [21] It was not until the 1970s and 1980s that protein phosphorylation began to be appreciated as a general regulatory mechanism implicated in, among other processes, the fundamental cellular synchronisation system: the cell cycle. As Paul Nurse, Tim Hunt and their co-workers showed, the transitions and checkpoints during the cell cycle depend in part on protein phosphorylation by cyclin-dependent kinases (CDKs). [22] The role of protein phosphorylation in the regulation of sugar metabolism by hormones and its extension to other signalling pathways and the cell cycle provide a powerful paradigm for thinking about PTMs as a regulatory switch. A related but somewhat separate field of research evolved around the structure, [23] allosteric regulation and small-molecule inhibition of canonical protein kinases, [24] which arguably remain the best-studied protein-modifying enzymes.

Ubiquitin as a protein degradation signal
The second relevant research programme has been the study of protein ubiquitylation (also known as ubiquitination) as a eukaryotic signal for intra-cellular protein degradation [25][26][27] (Figure 2B). [30][31] This protein was soon identified, by Keith D. Wilkinson and colleagues, [32] as ubiquitin, which by that time had already been detected covalently linked to histones through an isopeptide bond to a lysine. [33] Avram Hershko and colleagues subsequently purified a set of E1, E2 and E3 enzymes required for efficient ubiquitylation [34,35] and showed that ubiquitin can form polymeric chains on a substrate [36] - soon recognised by Alexander Varshavsky and co-workers to be the actual signal required for degradation. [37][40] The biological importance of these findings lies in showing that, contrary to what had been thought before, protein degradation, like protein synthesis, can be a highly specific process dependent on a specific signal (ʻspecificityʼ being one of the keywords of early molecular biology, as the historian of biology Michel Morange argued [41]). In later decades, other PTMs - notably Pupylation and arginine phosphorylation - have been demonstrated to serve analogous roles as degradation signals in some prokaryotes. [42,43] At the same time, ubiquitylation has been shown to have multiple other functions in eukaryotes beyond promoting proteasomal degradation, [44] and homologues of ubiquitin, including small ubiquitin-like modifier (SUMO) paralogues and other ubiquitin-like proteins, have been implicated in various functions as well, mainly related to stress and immune responses.
[47][48][49] These two modifications have also been at the forefront of PTM-related pharmaceutical research, with kinase inhibitors and small-molecule degraders (that is, molecules, including PROTACs and 'molecular glues', that enhance degradation of desired substrates by physically linking them with E3 ligases [50]) as two major modalities in current drug research.

Detecting and cataloguing protein modifications
In parallel to these two research programmes - which started from a function and identified a PTM behind the function - protein chemists and biochemists have long reported various noncanonical amino acids in proteins without necessarily knowing their functional importance.
We see the fruit of these efforts summarised in early reviews such as that by Finn Wold and Rosa Uy published in 1977, which catalogued 140 possible ʻamino-acid derivatives', gathering them (together with proteolysis) in a single category of ʻcovalent posttranslational modifications'. [51] Wold and Uy proposed to see the modification of specific amino-acid sidechains in a protein as a maturation step, on a par with proteolytic processing, that occurs after translation to yield the final protein product. This is reminiscent of the recent concept of a ʻproteoformʼ that is discussed in Section 'Techniques for studying PTMs' below.
The 2006 seminal book on PTMs [52] and a popular review article [53] published around the same time by the recently deceased Christopher T. Walsh (1944-2023) can be seen as a continuation of Wold's systematising project. Walsh, a biological chemist who had previously published a textbook on catalytic mechanisms of metabolic enzymes, [54] conceived of PTM reactions in chemical terms, as, for the most part, instances of transfer of electrophilic groups onto nucleophilic protein side-chains [55] - a realisation that I take up and develop below. Since the early reviews by Wold and co-workers, the project of cataloguing PTMs has progressed, reaching the current total count of over 650 known PTM types and steadily increasing. [5] It is worth noting that, although PTMs have been traditionally more studied in eukaryotes and especially in human cells, recent years show that bacteria are a fertile ground for the discovery of interesting PTM systems, some of which - such as Pupylation or histidine and arginine phosphorylation - are apparently unique to, or particularly abundant in, bacteria. [56]

The histone code
The last major thread that greatly contributed to the history of PTM research and the way we currently understand PTMs is the concept, put forward by Strahl and Allis in 2000, of a histone ʻlanguageʼ or ʻcodeʼ [57] (Figure 2C). The ʻcodeʼ in question refers to a set of different PTM signals (especially methylation, acetylation and phosphorylation) on specific sites within histone proteins, which could act sequentially or together to determine particular functional outcomes, primarily related to gene expression. [58] The idea that histone modifications could have regulatory roles is much older, [59] but its conceptualisation by Strahl and Allis as a ʻcode' - and the rapid adoption of this idea by other authors [60,61] - has had a crucial impact of its own.
According to this framework, the various PTM ʻletters' of the code are supposed to be ʻwritten' (i.e., produced), ʻerased' (i.e., removed), and ʻread' (i.e., recognised via a noncovalent interaction) by dedicated protein classes, which soon began to be called ʻwriters', ʻerasers' and ʻreaders', respectively, a terminology that is now broadly followed in the PTM field (Figure 2D). The effects of histone PTMs on chromatin state are thought to be either direct (the PTM itself affects histone:histone and/or histone:DNA interactions) or mediated by readers such as energy-consuming chromatin remodellers. Moreover, a coupling between reading and writing can lead to one modification promoting the installation of another.
A vindication of sorts of the writer-eraser-reader paradigm for histone modifications came with the discovery of erasers of protein methylation, [62,63] which happened after the paradigm was first proposed, methylation initially appearing exceptional in having no known erasers. Furthermore, the attractiveness of this framework lies in its gathering together many diverse chromatin-associated factors whose mutations are known to be associated with disease and which are targets in the ongoing efforts of drug development and clinical trials.
Conceptually closest to the ʻhistone code' is the so-called RNA polymerase II CTD code, which refers to the functional link between changing patterns of PTMs in the C-terminal domain (CTD) of RNA polymerase II and various (co-)transcriptional events. [64,65] Moreover, the use of the term ʻcode' - undoubtedly tracing its roots back to the ʻgenetic code' - is a recurring theme in the PTM field, evident in concepts such as the ʻtubulin code' (regulation of tubulin by its different PTMs [66]) or ʻubiquitin code' (distinct functional roles of different ubiquitin chains [67][68][69]).

Techniques for studying PTMs
The famous biologist Sydney Brenner argued that ʻprogress in science depends on new techniques, new discoveries and new ideas, probably in that order'. [70] Above, I have reviewed some key discoveries and ideas in the PTM field, but they would not have happened without appropriate technology.
The birth of the protein phosphorylation field was made possible, in part, by the use of ³²P as a radioactive tracer, which had been pioneered, alongside that of some other radioactive as well as nonradioactive but heavy isotopes, by Georg de Hevesy in the 1920s.
As the technique became more widely used on both sides of the Atlantic from the late 1930s, it played a significant role in the study of metabolism and early developments in molecular biology. [71] Likewise, the use of ³²P, and specifically ³²P-labelled ATP, was instrumental in the early days of protein phosphorylation and remains widely used in the field to this day, particularly for in vitro kinase assays, where it is typically coupled with sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE), a now standard technique introduced in the 1970s. [72] Other forms of isotope labelling (for example, ³H, ¹²⁵I or ³⁵S) have been important for tracking the fate of specific proteins in the ubiquitylation field.
Like the early studies of serine phosphorylation, the discovery of protein tyrosine phosphorylation was also indebted to the use of ³²P. [73] However, the key role in the later development of the tyrosine phosphorylation field has been played by high-quality general anti-phosphotyrosine antibodies introduced in the 1980s, [74,75] all the more important for tyrosine phosphorylation being less abundant than canonical serine/threonine phosphorylation. Around the same time, the first anti-ubiquitin antibody was also introduced. [76] Indeed, throughout the PTM field, anti-PTM antibodies that can either detect a given modification on all/many substrates or be specific for one site in a particular substrate continue to play a key role through their use in immunoprecipitation, immunoblotting and immunofluorescence. For complex modifications such as ubiquitylation, antibodies with preferences for different chain types exist, [77] and for some PTMs, non-covalent and covalent probes other than antibodies are also widely used. [78] Aside from the tools for detecting and enriching PTMs, the study of protein PTMs has also benefited from the use of small-molecule inhibitors targeting eraser enzymes, exemplified by the early serine/threonine phosphatase inhibitor, okadaic acid. [79] These inhibitors not only increase the stability and abundance of PTMs in extracts but also allow observing the effects of dysregulating specific PTMs in living cells.
Research into PTMs has received a powerful boost from the development of mass spectrometric (MS) methods, which provide an unparalleled tool for detecting protein modifications, both those obtained through enzymatic reactions reconstituted in vitro and those occurring natively within cells. [80,81] As PTMs are of relatively low abundance and vary in their enzymatic, chemical and fragmentation sensitivity, new advances in the field have been associated with tailored extraction, enrichment and fragmentation protocols, in addition to the steady general improvement in MS instrumentation and software. [84] In some cases, a PTM is transformed using chemicals or enzymes prior to enrichment and detection. Moreover, various label-based or label-free quantitative proteomics approaches allow comparing PTM levels and detecting changes in PTMs related to particular conditions or stimuli. [85] The MS-based analysis comes with several caveats. The differential sensitivity and detectability of different PTM types and different modified peptides pose a challenge, as they introduce a bias against those modifications and sites that are less amenable to the available technical pipelines. Moreover, it might sometimes be challenging to distinguish native PTMs from chemical protein modification that occurred during sample preparation. A further important caveat of MS-based identification of PTMs is that, unless we actively look for a given PTM based on the prior knowledge (or assumption) of its existence, it might become lost in unassigned spectral peaks, making it conceptually challenging to detect yet unidentified modifications.
These qualifications notwithstanding, the astonishing progress in MS methods over the years played a key role in the identification of PTM substrates, yielding high-quality profiles of major PTMs in various organisms and cell types. This includes deep-probing profiling of, for example, phosphorylation, [86,87] acetylation, [88,89] ubiquitylation, [90] SUMOylation [91,92] and ADP-ribosylation [93,94] in human or other mammalian cells. Databases grouping PTM sites from both low- and high-throughput (the latter all MS-based) studies are available, including PhosphoSitePlus, [95] iPTMnet [96] and dbPTM. [97] While the mentioned datasets were predominantly obtained with bottom-up MS approaches, in which protein samples are digested into peptides prior to the analysis, an alternative top-down strategy involves injecting intact proteins into a mass spectrometer. This allows the identification of distinct intact covalent protein products - known as ʻproteoforms' - each having a signature molecular weight [98] (Figure 2E). The weight of a proteoform is a function of its amino-acid sequence (determined by the DNA coding sequence and the way it is spliced), as well as any combination of PTMs that might be simultaneously present on the same polypeptide. While the number of human protein-coding genes is relatively limited (estimated to be around 20 000), the combined effect of alternative splicing and PTMs gives rise to a much larger number of proteoforms. The Human Proteoform Project has so far detected over 60 000 distinct species originating from around 5000 human genes, but the total number of proteoforms is likely much larger, reaching the range of hundreds of thousands or more. [99,100] It has been proposed that such diversity could explain the stunning complexity of the human organism, which is difficult to square with the limited gene number. However, it remains challenging to distinguish functionally distinct proteoforms from variation that might be the result of PTM ʻnoise' with limited functional bearing.
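The dependence of a proteoform's weight on its sequence and its PTMs can be made concrete with a short calculation. The following is a minimal sketch rather than a production tool: the function names (`proteoform_mass`, `max_proteoforms`) and the `PTM_SHIFT` table are hypothetical and introduced only for illustration, the masses are standard monoisotopic values, and each modifiable site is assumed to be simply modified or unmodified with one PTM type.

```python
# Monoisotopic masses (Da) of amino-acid residues, i.e. the amino acid
# minus one water, as it occurs inside a polypeptide chain.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565  # added once for the free N and C termini

# Mass shifts (Da) for three small PTMs; the table is illustrative,
# not a complete catalogue.
PTM_SHIFT = {
    "phospho": 79.966331,  # +HPO3
    "acetyl": 42.010565,   # +C2H2O
    "methyl": 14.015650,   # +CH2
}


def proteoform_mass(sequence, ptms=()):
    """Monoisotopic mass of a polypeptide carrying the listed PTMs."""
    backbone = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return backbone + sum(PTM_SHIFT[name] for name in ptms)


def max_proteoforms(n_sites):
    """Upper bound on PTM combinations, assuming each site is
    independently either modified or unmodified (an assumption)."""
    return 2 ** n_sites


unmodified = proteoform_mass("GS")
phosphorylated = proteoform_mass("GS", ptms=["phospho"])
print(round(unmodified, 4), round(phosphorylated, 4))
print(max_proteoforms(10))
```

Under the stated binary assumption, a single sequence with ten modifiable sites already admits 2^10 = 1024 possible combinations, which illustrates why the number of proteoforms can exceed the number of genes so markedly. How many of these combinations actually occur in cells is a separate, empirical question.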
The current understanding of PTMs would not be possible without MS, which arguably remains the most powerful technique for probing the PTM world. Ultimately, however, the research into PTMs - like in other areas - has advanced through a fruitful mixture of different techniques and approaches.

From the past to the future
The above-mentioned lines of research are of course interdependent.
For example, there is a clear link between the idea of writing and erasing the ʻhistone code' and the pioneering work on phosphorylation as a reversible process. The concept of ʻPTM reading' is also ultimately indebted to the phosphorylation field, where the first domains capable of recognising phosphorylated peptides were reported in the early 1990s, [101][102][103][104] which motivated the search for analogous specific domains for the recognition of other major protein PTMs. Similarly, the concept of a proteoform is related not only to the project of cataloguing different PTMs and to the technical developments in MS but also to the idea of a functional ʻcode' potentially produced by various PTMs simultaneously present on the same polypeptide.
The history of PTM research has furnished powerful paradigms for thinking about PTMs as a regulatory mechanism, but all paradigm-based thinking can both facilitate and distort understanding. For example, the classical case of glycogen phosphorylase phosphorylation (by a single kinase, on a single main site, with a direct effect on activity and structure) is not - as Krebs and Fischer already realised [7] - representative of all protein phosphorylation, let alone other PTMs.
Similarly, not all PTM instances are ʻread' by specific domains, or are necessarily reversible.
As a matter of fact, we are now aware that PTMs are quite widely installed on different proteins (but sometimes only on a fraction of copies of a given protein) and most detected sites might not have any clear functional relevance, at least when the site or the protein is considered in isolation from others. [105] To overcome this ambiguity, efforts are being made to distinguish sites that are more likely to be functionally important, based on evolutionary conservation and other criteria. [106,107] Explanations drawing on evolutionary and systems biology are likely needed to fully appreciate the biological role of PTMs, in addition to more traditional biochemical, structural, proteomics and cell biology analyses.

CHEMICAL AND CATALYTIC LOGIC OF PTM REACTIONS
PTMs come in different shapes and sizes, ranging from small chemical groups such as methyl, acetyl or phosphoryl, through lipids, nucleotides and sugars, to proteinaceous modifications such as ubiquitin and its homologues (Figure 3A). Some PTMs can be attached as chains, which further increases their size. Due to the inherent specificity of main writer enzymes, each PTM type is predominantly associated with particular amino-acid acceptors (in a given group of organisms): for example, in eukaryotes, serine, threonine and tyrosine phosphorylation; lysine and arginine methylation; lysine acetylation, ubiquitylation and SUMOylation; or serine ADP-ribosylation. While phosphorylation, acetylation and ubiquitylation appear to be among the most abundant modifications in eukaryotes, [108] comparisons of abundance are confounded by differential stability and detectability of, and unequal investment of the scientific community in, different PTMs.

Modification reaction
While PTMs are chemically diverse at the level of the modification itself, the modification reaction tends to, at least for some of the most common PTMs, follow a unified scheme, whereby the modification (in light red) is transferred from the donor to an acceptor residue in the substrate (in green) with the concomitant departure of a leaving group (in grey) (Figure 3B).

[Figure 3 caption, fragment: The box shows a simplified representation of a DNA synthesis reaction, in which dNMP is transferred from dNTP to the 3′ end of a growing DNA strand with concomitant release of pyrophosphate. Removal of a proton from the acceptor amino-acid residue, which is frequent in PTM reactions, is not shown. (C) Variation in nucleophilic substitution types observed among PTM reactions. In an SN2-type reaction, the bond between the substrate and the modification is being created at the same time as the bond between the modification and the leaving group is being broken. In SN1 and addition-elimination reactions, these two events happen in two separate steps, shown in a simplified manner one after the other. Bonds that are being broken or formed in a given step are indicated with dashed lines. Acc, acceptor.]
To help the readers grasp this process more easily, I will draw a parallel between a protein modification reaction and a classical example of covalent addition in biology: that of a new nucleotide to the 3′ hydroxyl group on a growing DNA chain during DNA synthesis (Figure 3B, box). In the latter reaction, whether occurring in a cell or a PCR tube, DNA polymerase employs deoxynucleotide triphosphates (dNTPs) as nucleotide donors. When a deoxynucleotide monophosphate (dNMP) becomes incorporated into DNA, the remaining two phosphates (the pyrophosphate moiety) from dNTP depart as a leaving group. Pyrophosphate is a good leaving group, which means, in simplified terms, that it is adept at accepting electrons and dissociating, thus making the group to which it is/was attached (in this case, dNMP) an easy prey for the attacking nucleophile (in this case, the 3′ hydroxyl).
A dNTP molecule can, therefore, be seen as an ʻactivated form' of a deoxynucleotide, the pyrophosphate making the dNMP more reactive by depriving it of electrons. We can observe the same logic in the main PTM reactions.
A PTM reaction that is chemically very similar to the DNA synthesis reaction is protein AMPylation (also known as adenylylation), [109] which involves the attachment of adenosine monophosphate (AMP) to protein residues. Here, the donor molecule is again a triphosphate (ATP), and the role of the pyrophosphate moiety is to ʻactivate' AMP by serving as a good leaving group. Other PTM reactions are less directly similar to DNA synthesis but still analogous. Protein phosphorylation, interestingly, uses the same donor as AMPylation, the triphosphate ATP, but a different part of the donor molecule (the γ phosphoryl) becomes attached to the protein, while ADP departs as the leaving group. For other modification reactions, donors are chemically different, but it is still possible to distinguish, in each case, a modification and a (good) leaving group, as discussed in Section 'Donor properties and enzymatic donor activation' below.
Chemically speaking, the described processes conform to a nucleophilic substitution mechanism in which an electron pair from a nucleophilic acceptor makes a bond to an electrophilic modification in the donor, displacing a leaving group. The leaving group can depart before, during or after the formation of the bond between the acceptor and the modification, which corresponds to SN1-type, SN2-type or addition-elimination substitution mechanisms, respectively (Figure 3C). The SN2 mechanism, which is thought to be dominant in biological systems, [110] might also be predominant among PTMs, but the mechanism depends on the group that is being transferred. Phosphorylation most likely proceeds via an SN2-like mechanism, [111] with a particular loose or ʻdissociative' transition state, [112] but the issue is not settled yet. In contrast, acyl-based modifications, such as acetylation, palmitoylation or ubiquitylation, generally proceed by addition-elimination. Lastly, ADP-ribosylation likely has some SN1 character, where the departure of nicotinamide slightly precedes the formation of a bond between a protein residue and ADP-ribose. [113]

PTM catalysis
The difference between a biological PTM reaction and an idealised chemical mechanism is that the first is a biochemical process that takes place in an active site of a specific enzyme. PTM acceptors and donors would be expected to have some tendency to spontaneously react - at least when brought into close proximity - as this can facilitate the emergence of an enzymatically controlled reaction (for reasons elaborated in Section ʻChemical driving forces' below). Generally speaking, however, many of the key chemical groups encountered in biological systems tend to strike a balance between being relatively inert in the absence of catalysts at physiological pH and getting easily activated in the active site of an appropriate enzyme - and this is also the case for PTM acceptors and donors.
Since chemical reactions are similar at some level for different PTMs, there are also analogies between different writer enzymes in terms of how they exert their catalytic role. In addition to the proper orientation of the acceptor and donor groups relative to each other - which is particularly important for the SN2-type substitution mechanism that requires good orbital alignment - there are also other recurrent ʻtricks' by which enzymes potentiate PTM acceptors and donors.

Acceptor properties and enzymatic acceptor activation
PTMs can be installed on various amino-acid side-chains, the most common being serine, lysine, threonine, tyrosine, cysteine and arginine. [108] Enzymes that produce (and also those that erase) PTMs tend to have a strong preference for a specific amino-acid residue or, at most, closely similar residues, as an acceptor, which allows the existence, in the same cell, of enzymatically orthogonal and potentially functionally distinct systems for the same PTM (for example, serine/threonine vs. tyrosine phosphorylation).
PTM acceptors, chemically speaking, behave as nucleophiles that use their electrons to attack - and attach - a PTM. The common feature of the most frequently used acceptors of PTMs is that they exist predominantly in their poorly nucleophilic protonated forms at neutral pH, but can become more highly nucleophilic upon the loss of a proton. Active sites of writer enzymes, therefore, tend to encourage the deprotonated form and sometimes perform direct proton abstraction from (ʻdeprotonation' of) the acceptor by a suitably positioned catalytic base. This mechanism might be particularly important for serine, threonine and arginine modification due to their extremely low intrinsic tendency to lose a proton (approximate pKa value of 13, compared to 10 for tyrosine and lysine, and 8 for cysteine).
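To give a sense of scale for these pKa values, the Henderson-Hasselbalch relation can be used to estimate what fraction of each side-chain is deprotonated in free solution. The sketch below is illustrative only: the function name `fraction_deprotonated` is hypothetical, the pKa values are the approximate ones quoted above, and a pH of 7.0 is assumed. Real side-chain pKa values shift with the local protein environment, which is precisely what enzyme active sites exploit.

```python
def fraction_deprotonated(pka, ph=7.0):
    """Fraction of an acidic group in its deprotonated form.

    Derived from the Henderson-Hasselbalch equation
    pH = pKa + log10([deprotonated] / [protonated]).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate side-chain pKa values as quoted in the text (assumed,
# solution-like values; not measured in any particular protein).
APPROX_PKA = {
    "serine": 13.0,
    "threonine": 13.0,
    "arginine": 13.0,
    "tyrosine": 10.0,
    "lysine": 10.0,
    "cysteine": 8.0,
}

for residue, pka in APPROX_PKA.items():
    print(f"{residue:10s} pKa ~{pka:4.1f}  "
          f"deprotonated fraction at pH 7: {fraction_deprotonated(pka):.1e}")
```

With these inputs, roughly one serine hydroxyl in a million is deprotonated at neutral pH, compared with about one in a thousand for tyrosine or lysine and about one in eleven for cysteine, which is consistent with the particular importance of a catalytic base for the least acidic acceptors.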
An interesting case is provided by the histone poly(ADP-ribosyl)ation factor 1 (HPF1):poly(ADP-ribose) polymerase 1 (PARP1) and HPF1:PARP2 complexes, which are responsible for the bulk of DNA damage-induced protein ADP-ribosylation in human cells (Figure 4B). [126][127][128][129] In the absence of HPF1, PARP1 or PARP2 appear unable to significantly modify serine residues and instead slowly ADP-ribosylate glutamate acceptors in proteins, which are predominantly deprotonated at a neutral pH (thus not requiring a catalytic base for deprotonation) but are not strongly nucleophilic and rarely serve as PTM acceptors. [124,128,130] In HPF1's presence, the complex efficiently targets serine acceptors as the main physiological ADP-ribosylation targets.
In the case of some protein methyltransferases, there is no explicit suggestion of a specific catalytic base residue, but the active site of the enzyme is thought to create an environment that promotes proton loss by the bound lysine, thereby effectively lowering its pKa value. In these enzymes, a channel has been postulated to provide access to bulk water, which would accept the proton. [131]

Donor properties and enzymatic donor activation
In this general framework, the donor molecules are composed of the transferable group that will become the protein modification (in light red in figures) and a part that will become the leaving group (in grey), both elements being important for the reaction. We can distinguish these two elements in major PTM donors including ATP (phosphorylation - see Figure 4A - and AMPylation/adenylylation), SAM (methylation, Figure 4C), NAD (ADP-ribosylation, Figure 4B), acetyl-CoA (acetylation, Figure 4C), lipid-CoA (various forms of lipidation) and UDP-sugars (various forms of glycosylation). All these molecules also serve other roles in cellular metabolism, [132] and cells have pathways for maintaining their appropriate (often relatively high) levels through regeneration from the leaving group or other synthetic pathways. In fact, it is the cell's ability to maintain a high ratio of these donors relative to their corresponding leaving groups (for example ATP relative to ADP) - rather than any specific ʻhigh-energy bonds' that they might contain - that is the key to these donors' status as ʻhigh-energy molecules' that can thermodynamically drive the reactions in which they are involved in the forward direction. The favourable chemical properties of these molecules, including the types of bonds involved, can account for their favourable reactivity, but it is the displacement of their concentration from equilibrium relative to the concentration of the leaving group that ʻstores' free energy and makes many PTM writing reactions effectively unidirectional. As a rule, the demodification reaction has to be catalysed using a different mechanism and a different enzyme - an eraser - as discussed in Section 'Reversibility' below.
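The point that the donor-to-leaving-group ratio, rather than a particular bond, supplies the driving force follows from the standard relation ΔG = ΔG°′ + RT ln Q, where Q is the mass-action ratio of products to reactants. The sketch below illustrates this for a generic transfer reaction (donor + protein → leaving group + modified protein). The function names and all concentrations are hypothetical and chosen only for illustration, and ΔG°′ is deliberately set to zero so that the concentration term is seen in isolation.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def mass_action_ratio(donor, leaving_group, modified, unmodified):
    """Q for: donor + unmodified protein -> leaving group + modified protein."""
    return (leaving_group * modified) / (donor * unmodified)


def reaction_free_energy(dg0_prime, q, temperature=310.15):
    """Actual free-energy change (kJ/mol): dG = dG0' + R*T*ln(Q)."""
    return dg0_prime + R * temperature * math.log(q)


# Illustrative, made-up concentrations (arbitrary but consistent units).
# dG0' is set to 0 so that only the concentration term contributes.
for donor_to_leaving in (1.0, 10.0, 1000.0):
    q = mass_action_ratio(donor=donor_to_leaving, leaving_group=1.0,
                          modified=1.0, unmodified=1.0)
    dg = reaction_free_energy(dg0_prime=0.0, q=q)
    print(f"donor/leaving group = {donor_to_leaving:7.1f}  "
          f"dG = {dg:7.2f} kJ/mol")
```

In this simplified picture, every tenfold increase in the donor-to-leaving-group ratio makes the writing reaction more favourable by RT ln 10, close to 6 kJ/mol at 37 °C, regardless of the value of ΔG°′. A real reaction adds its own ΔG°′ on top of this concentration term.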
In the case of ubiquitylation and ubiquitin-like modifications, the donor is a covalent complex of two proteins, including an enzyme (typically an E2) and the modifier, linked through a thioester bond (Figure 4D). The thioester between a cysteine in E2 and the modifier's C terminus is created in an ATP-dependent reaction catalysed by an E1 enzyme. Indeed, the need to create a dedicated donor molecule for ubiquitylation explains why this and related PTMs, unlike others, require a complex catalytic cascade composed of E1, E2 and E3 proteins instead of a single writer. Within this cascade, an E2 enzyme functions both as the modification carrier (the 'leaving group') and also, in part, the writer responsible for catalysis. In the latter task, E2 is assisted by an E3, typically a scaffolding protein that stabilises the active conformation of the modifier∼E2 molecule and recruits a protein substrate. In some cases - that is, in the homologous to the E6AP carboxyl terminus (HECT) and RING-between-RING (RBR) families and some more recently identified instances - the E3 ligase contains a reactive cysteine, or a relay of cysteines, on which it can temporarily accept ubiquitin from the E2. [133] In these cases, it is the E3 protein that serves the role analogous to an E2: both the 'leaving group' and a writer. Despite these special aspects, the ubiquitylation and related reactions can be said to broadly conform to the same basic scheme as other PTMs.
For all major PTMs, the part of the donor that will become the protein modification contains an atom that becomes the target of the nucleophilic attack by the PTM acceptor.This atom tends to have electrophilic properties, that is to say, be relatively electron-poor, for example, due to covalent bonds to electron-withdrawing groups.
For instance, in ATP, all three phosphorus atoms are relatively electrophilic due to bonds to surrounding oxygens, and we know that two of them (γ and α) can mediate protein PTMs (phosphorylation and AMPylation/adenylylation, respectively), as mentioned above. The choice between these two modifications is governed, in this case, by the catalysing enzyme, through the positioning of the appropriate phosphorus atom for the nucleophilic attack. Interestingly, examples of evolutionary plasticity have been reported in which a kinase-like protein (SELENOO) catalyses protein AMPylation [134] while an AMPylator-like protein catalyses protein phosphorylation [135] - due to a shift in ATP orientation - highlighting the chemical suitability of various phosphorus atoms in ATP as targets of nucleophilic attack and the decisive role of positioning by the writer enzyme. In the case of the methylation and ADP-ribosylation donors, SAM and NAD, the central carbon atoms of the modification parts - which are not inherently electrophilic - are rendered electrophilic through their linkage to powerfully electron-withdrawing positively charged sulphur or nitrogen atoms.
As argued above in the parallel drawn between DNA synthesis and protein modification, the part of the donor that becomes the leaving group plays an important role in the reaction by accepting the electron pair that previously made the bond between it and the modification, thus pulling electrons from the modification and ‘liberating’ it for transfer. A chemical group that is adept at these tasks can promote the overall substitution reaction. The quality of the leaving group depends on its ability to accommodate extra electrons, which can be done, for example, by using them to neutralise a positive charge or by distributing them within a resonance system. Again, SAM - in which the leaving part (SAH) turns from positively charged to neutral upon methyl transfer - is an excellent example of a donor with a potent leaving group, as is NAD, in which nicotinamide leaves as a neutral aromatic compound (Figure 4B). On the other hand, the pyrophosphate group (in inorganic pyrophosphate or ADP) provides a case of a resonance system (Figure 4A).
While canonical donor molecules are well suited for their role - explaining their recurrent use not only for PTMs but also in various other cellular reactions - they do tend to remain relatively inert in the absence of catalysis, as otherwise they would dissipate through spontaneous non-specific reactions. Therefore, the donor molecules have to combine their non-equilibrium status with high kinetic stability in the absence of catalysts. For phosphate-containing molecules such as ATP, this has been proposed to be related to electrostatic effects such as shielding of the central electrophilic phosphorus atoms from nucleophilic attacks by neighbouring negatively charged oxygens. [136,137] In active sites of enzymes, such electrostatic protection can be overcome by proximity/orientation effects combined with a neutralising electrostatic environment, and both these aspects might be fulfilled by a tight network of positively charged amino-acid side-chains and metal ions observed for canonical and non-canonical protein kinases and AMPylating enzymes. [23,109,138] More generally, donor activation might involve binding of the donor in a conformation where the modification moiety is exposed to the nucleophilic attack. [141][142][143] Analogously, NAD appears to be always bound in a similar conformation by ADP-ribosylating writer enzymes, which possibly not only exposes the anomeric carbon atom to the attack by the acceptor residue but also introduces a strain into the donor molecule, the release of which could help drive the reaction. [144]

MECHANISTIC LOGIC OF PTM SYSTEMS
While the active site catalysing a PTM reaction can be likened to an ‘engine’ of a PTM system, other elements and mechanisms are needed to turn it into effective regulatory machinery. The ‘steering wheel’ is provided by mechanisms of substrate recruitment that direct the modification to particular proteins and sites, while eraser enzymes furnish a necessary ‘brake’ (or, indeed, a ‘reverse gear’) to regulate the output. These features are discussed in turn below. Finally, the mechanisms by which the modifications exert their regulatory roles will be briefly discussed, with a particular focus on the propensity of PTMs for promoting new protein:protein interactions.

Substrate specificity
Traditionally, the question of PTM specificity has been reduced to that of the targeted amino-acid residue and the surrounding amino-acid sequence motif (Figure 5A). For writers as diverse as protein kinases, [145,146] the SUMOylating E2 enzyme UBC9 [147] or the ADP-ribosylating complex HPF1:PARP1, [148] consensus motifs ranging from a dozen or so down to two residues flanking the modified site can be detected, although the requirement for them is rarely absolute.
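As an illustration of what searching for a consensus motif means in practice, the following minimal sketch scans a protein sequence for the commonly quoted SUMOylation consensus ψ-K-x-(D/E), where ψ is a large hydrophobic residue and K is the modified lysine. The residue set chosen for ψ, the function name and the toy sequence are simplifying assumptions made for this example; they are not taken from this essay.

```python
import re

# Simplified psi-K-x-(D/E) pattern; psi approximated here as V, I, L, M or F (an assumption).
# The lookahead allows overlapping candidate sites to be reported.
SUMO_CONSENSUS = re.compile(r"(?=([VILMF]K.[DE]))")


def find_consensus_sites(sequence):
    """Return (1-based position of the candidate lysine, matched motif) for each hit."""
    return [(m.start() + 2, m.group(1))
            for m in SUMO_CONSENSUS.finditer(sequence.upper())]


toy_sequence = "MSLKDEAAKGELIKQEPP"  # hypothetical sequence, for illustration only
print(find_consensus_sites(toy_sequence))
```

A hit in such a scan marks only a candidate site. As noted above, the requirement for a consensus motif is rarely absolute, so modified sites can lack the motif and matching sites can remain unmodified.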
Recent research suggests that it is not always desirable for a substrate to have an optimal consensus modification motif; instead, by diverging from ideal sequence patterns, substrates can be modified to the extent and with kinetics that are appropriate for their function. [149] As a result, sequence motifs can determine the order of substrate modification during the cell cycle [150] and are responsible for varying sensitivity of substrates to the inhibition of kinase activity. [151] In one recently described example, artificially improving a suboptimal phosphorylation motif found in a substrate important for T-cell activation resulted in detrimental excessive reactivity to self-antigens due to enhanced signalling. [152] Similar logic of a spectrum of ‘better’ and ‘worse’ substrates can apply to PTM removal, where some proteins and sites are demodified faster than others in cells by eraser enzymes. [153]
While motifs surrounding the modification site dominated attention in the past, more recent research suggests that such motifs - where they exist - are rarely the sole or even the main determinant of which proteins become modified, although they can guide where the modification will be within a protein. In many cases, another mechanism has to first ensure co-localisation of the substrate and the writer. One way this can be achieved is by a secondary docking element - typically also a linear motif, but sometimes a structural one - on the substrate that binds to a substrate-recruiting region on the writer distinct from the active site (Figures 5B and 5C). In protein kinase substrates, these two types of motifs often synergise in promoting phosphorylation. [149] In substrates of some other writers - for example, the ADP-ribosylating enzyme tankyrase [154] or many ubiquitin E3 ligases [39,155] - the docking motif actually plays the dominant role, whereas there is little apparent preference regarding the actual modification site other than that it is the correct amino-acid type and is accessible. The secondary docking site can be found on the same protein domain of the writer as the active site (for example, in some kinases; Figure 5B) or a different domain within the writer (in tankyrase and some single-protein ubiquitin E3 ligases; Figure 5C). Alternatively, substrate recruitment is sometimes performed by a different protein that physically associates with the writer and acts as its substrate receptor (Figure 5D). A good example of the last-mentioned mechanism is provided by some multiprotein ubiquitin E3 ligase complexes, such as cullin-RING E3 ubiquitin ligases. [133,155,156] Interestingly, viruses can derail the host's ubiquitin signalling by encoding an alternative substrate receptor that directs an endogenous E3 ubiquitin ligase to a novel substrate. [157]
One particularly well-studied class of docking motifs are degrons, which target proteins for ubiquitylation by the combined action of E3 and E2 enzymes. [40] Degrons are often located on protein termini, [38,156,158,159] although they can be internal as well. The interaction of some degrons with ubiquitin E3 ligases is regulated by other PTMs such as phosphorylation or acetylation, which can either prevent or promote binding to the ligase, [158,160] providing an example of PTM interdependence. An unusual mechanism for substrate selection has been reported for the bacterial protein arginine kinase McsB, which marks proteins for degradation in some bacteria. This non-canonical kinase does not appear to select substrates via degron sequences but instead forms an oligomeric cage with a narrow entrance, which excludes bulky, folded proteins. [161]
Another interesting case of substrate targeting is represented by histone-modifying and demodifying enzymes, which are often first recruited to the nucleosome, for instance by binding to the H2A-H2B acidic patch using an arginine-containing anchoring sequence [162] (Figure 5E). Here, it is the enzyme that develops a simple motif to engage a pocket on the substrate - a reverse of the above examples where the enzyme recognised simple motifs in a substrate. Additionally, histone writers and erasers can contain reader domains for recognising PTM signals present on the nucleosome, [163] resulting in reading-writing coupling that again exemplifies a functional interplay between different PTMs.
In addition to mechanisms reliant on (direct or indirect) physical contacts between a writer and a specific substrate, some enzymes define their range of targets simply through co-localisation in the same cellular location. [166] Co-localisation to the same locale can be achieved through the binding of both the writer and its substrates to the membrane or DNA (Figure 5F). Thus, the main ADP-ribosylating writer PARP1, which physically interacts with DNA breaks, appears to modify substrates that associate with DNA or nucleosomes in the vicinity of DNA breaks.
Lastly, it should be mentioned that, at the structural level, the majority of the mechanisms mentioned above rely on structural complementarity between linear motifs embedded in flexible or intrinsically disordered regions of substrates and either the active site or dedicated pockets within writer complexes. This can be illustrated by the SUMOylation consensus sequence found within SUMOylation substrates, which binds to the active site of the SUMOylation writer UBC9 (Figure 5G), and by docking motifs bound to a substrate-recruiting pocket distinct from the active site found in the protein kinase ERK2 (Figure 5H). In (relatively rare) cases where the recognised motifs are in a folded substrate region, they might not be linear but instead be assembled in space out of non-contiguous residues. [167] While PTM writers tend to target multiple substrates by recognising relatively simple motifs, there are also more extreme examples of very specific enzymes that target a single main substrate or a small set of substrates.
As a rule, such enzymes tend to develop a larger binding surface that facilitates discrimination of their cognate substrate and specific site within them [149,168,169] (Figure 5I).

Chains and hybrids
72] In the latter case, chain formation can be divided into two distinct steps: initiation or priming (attachment of the first PTM unit to a protein) and elongation (attachment of succeeding units to preceding ones), which are sometimes controlled by distinct enzymatic activities (both in terms of reading and writing) (Figure 5J). This appears to be the case for serine ADP-ribosylation, where the initial attachment is performed by the HPF1:PARP1 complex (and reversed by the eraser ARH3), whereas elongation is catalysed by PARP1 alone (and reversed primarily by PARG). [173] Similarly, for ubiquitin, distinct E2s can be responsible for the initiation and elongation stages. [174] Erasers that process chains can employ either endo (within the chain) or exo (at chain termini) cleavage. [175]
Importantly, different linkages between repeating units in chains can lead to structurally and functionally distinct signals, as best understood for poly(ubiquityl)ation and encapsulated in the idea of the ubiquitin code [67] mentioned earlier.
As a final addition to this trend, recent research has provided examples of composite or hybrid modifications in which a modification attached to a protein substrate is itself modified by another modification, as in the case of phosphorylated or acetylated ubiquitin, [68] hybrid chains of various ubiquitin-like proteins [176] or mixed ADP-ribose-ubiquitin signals [123] (the last ones detected only in vitro so far). A recently reported combination of acetylation and methylation on the same lysine represents a further case of a combined PTM [177] (Figure 5K). Such PTM hybrids - more of which likely remain to be dis- […]

Reversibility

Enzymatic reversibility of a PTM relies on the existence of opposing eraser enzymes that counteract the action of writers. Importantly, although we often speak of erasers ‘reversing’ the PTMs, they generally do not recreate the donor molecule used in the forward reaction. It could be argued that the chemical nature of canonical PTMs explained above - where an electrophilic element is added to a nucleophilic protein side-chain - already implies the chemical possibility of detachment, most simply by hydrolysis, which is the common mechanism used by eraser enzymes (Figure 6A). The enormous concentration of water molecules thermodynamically drives the reaction. Hydrolysis can take place in a single step - as in the case of canonical protein serine/threonine phosphatases [178] and some ADP-ribosyl hydrolases [179] - with the water molecule directly attacking the protein-ligated modification. Catalysis of such hydrolytic reactions usually consists of exposing the modification to the attack and activating the water molecule with metal ions and catalytic residues. Alternatively, the modification can be transferred onto a nucleophilic residue on the eraser (for example, a cysteine on tyrosine phosphatases [178]) prior to being attacked by water (Figure 6B). Both types of mechanisms are also observed for deubiquitinating enzymes, where ubiquityl cleavage proceeds either in a single step, using a metal-activated water molecule, or through a serine/cysteine protease-like mechanism involving a covalent intermediate analogous to that for tyrosine phosphatases. [175]
Modifications that are not intrinsically electrophilic and are efficiently installed owing to a particularly good leaving group in the donor - which is the case for ADP-ribosylation, but even more so, for methylation - might be more difficult to detach, possibly explaining why ADP-ribosylation removal sometimes involves unusual hydrolytic pathways [179] and demethylation proceeds via complex, oxidation-dependent mechanisms. [180] In fact, as mentioned earlier, protein methylation was initially considered irreversible and it was not until the early 2000s that the first protein demethylases were discovered. [62,63] Furthermore, acetylation can be removed both through metal-catalysed hydrolysis and - in the case of acetylation erasers called sirtuins - through a more complex mechanism reliant on NAD as a co-substrate. [181]
Although, at the molecular level, PTM erasers counteract the reaction catalysed by writers, their role is not simply limited to extinguishing responses that were elicited by the modification. Writers and erasers are often active, at least to some extent, simultaneously, which might appear as a ‘futile cycle’ that wastes donor molecules, but can, in fact, serve positive roles, such as fine-tuning the responsiveness of the system or allowing proofreading. [182] The proofreading role, fulfilled in this case by deubiquitinases, might be particularly important for regulating ubiquitylation because of its potential to commit a protein to irreversible degradation. [183,184] Together with other mechanisms, such as sequential activation or […] of kinases and phosphatases drive dynamic transitions between subsequent stages. [185]
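The idea that simultaneously active writers and erasers can tune responsiveness can be illustrated with a deliberately simplified kinetic sketch. It assumes a single modification site with first-order writing and erasing, df/dt = k_write(1 - f) - k_erase·f, where f is the modified fraction. The model, the function names and the rate constants are illustrative assumptions and do not describe any particular PTM system.

```python
import math


def steady_state_fraction(k_write, k_erase):
    """Modified fraction at steady state for first-order writing and erasing."""
    return k_write / (k_write + k_erase)


def modified_fraction(t, k_write, k_erase, f0=0.0):
    """Modified fraction at time t, starting from fraction f0 (analytical solution)."""
    f_ss = steady_state_fraction(k_write, k_erase)
    return f_ss + (f0 - f_ss) * math.exp(-(k_write + k_erase) * t)


def response_half_time(k_write, k_erase):
    """Time needed to cover half the distance to the new steady state."""
    return math.log(2) / (k_write + k_erase)


# Two hypothetical systems with the same steady state but different turnover:
slow = (0.1, 0.1)  # low writer and eraser activity
fast = (1.0, 1.0)  # tenfold higher activity of both enzymes
for k_w, k_e in (slow, fast):
    print(steady_state_fraction(k_w, k_e), response_half_time(k_w, k_e))
```

In this toy model, both systems settle at the same modified fraction, but the one with higher writer and eraser activity consumes more donor while approaching a new steady state ten times faster. This is one simple way in which an apparently ‘futile cycle’ could buy responsiveness.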

Molecular effects of PTMs
Proteins are remarkably sensitive systems. We know from countless mutational studies that even subtle alterations to a protein's chemical make-up, such as a single well-placed amino-acid substitution, can, in some cases, tip the delicate balance between different protein conformations, catalytically proficient/deficient enzymatic environments, or the presence/absence of an interaction with a partner. This sensitivity makes proteins inherently regulatable. Indeed, similarly to what is observed with mutations, the introduction or removal of even the smallest covalent PTM can profoundly alter protein properties.
However, unlike mutations, PTMs are reversible and tightly controlled.
It is worth noting that bulkier and charged groups are predicted to have a more substantial impact on structure and function than smaller or neutral ones, which could explain the existence of many large PTMs and, among small chemical groups, the prominence of charged protein phosphorylation. [186] Disrupting interactions through steric or electrostatic repulsion is arguably ‘easier’ with a larger or charged modification, and the same goes for creating new intra- or intermolecular interactions, with larger modifications potentially creating more new contacts and burying larger surfaces upon binding. Particularly large PTMs - especially those that can make chains, such as ubiquitylation and ADP-ribosylation - can provide platforms for the simultaneous recruitment of multiple components. [48]
The basic molecular mechanism by which PTMs can exert their functions boils down to induction or disruption of either intramolecular interactions (altering the protein's structure and dynamics) or intermolecular interactions. In the latter case, the affected interactions could be homotypic (oligomerisation of the modified protein) or heterotypic (interactions of the modified protein with other proteins or non-proteinaceous molecules such as DNA, membranes or small-molecule ligands). A PTM's effect on intra- and intermolecular contacts can in turn translate into at least three different PTM-dependent functional outcomes: modulation of enzymatic activity, changes in subcellular localisation (including localisation to specific compartments and formation of molecular complexes) or altered half-life.
Two examples of PTM-dependent regulation of enzymatic activity were already discussed in Section ‘Protein phosphorylation as a regulatory switch’ in the context of sugar metabolism, where both glycogen phosphorylase (Figure 2A) and glycogen synthase change their enzymatic activity as a function of their phosphorylation status. Another example of enzyme regulation by phosphorylation concerns canonical protein kinases themselves, which are often activated through phosphorylation of the so-called activation loop or segment, which is located near the active site. Activation loop phosphorylation typically happens on a serine or threonine residue in serine/threonine kinases and on tyrosine residues in tyrosine kinases and can proceed either in cis (which is controversial), where a kinase molecule modifies its own activation loop, or in trans (which is more established). In trans, phosphorylation can be catalysed by another copy of the same kinase or by an upstream kinase in a cascade. Upon phosphorylation, the activation loop adopts a conformation that supports substrate binding and catalytic activity. A classic example of this mechanism is provided by the insulin receptor tyrosine kinase, for which crystal structures in the dephosphorylated and phosphorylated state have long been available, allowing visualisation of this functional switch [187,188] (Figure 6C).
PTM-induced changes in subcellular localisation can be both dramatic (the modified protein is found preferentially in a different compartment than the unmodified one) and subtle (the modified protein is found within the same compartment, but co-localised or not with different specific partners). A change in the global pattern of subcellular distribution is often related to a PTM's effect on the modified protein's interaction with cellular machinery that regulates passage between compartments, especially nuclear import and export pathways. Many instances of phosphorylation of residues located in the vicinity of nuclear import/export signals resulting in either increased or decreased nuclear localisation are known [189,190] (Figures 6D and 6E).
In some cases, phosphorylation does not directly affect the signal but instead triggers a conformational change that exposes an otherwise inaccessible signalling sequence. An alternative mechanism for regulating compartmental localisation is by affecting interactions that are crucial not for transport but for retaining the protein in a given compartment. A further potential mode of regulation is related to changes in solubility and condensation properties of a protein upon modification. [191]
PTMs often trigger more subtle changes in localisation, such as recruitment, or not, to particular binding partners. While disruption of intermolecular interactions by PTMs typically relies on a steric or charge conflict between the modification and the would-be binding partner, induction of new interactions by PTMs is a more specific process where the binding partner has to preferentially recognise the modified protein form. This recognition is often mediated by specific reader domains that specialise in binding to a particular PTM, such as SH2 (Src homology 2) domains for phosphotyrosine, 14-3-3, […] [194] The reader domains specific for small modifications typically recognise not only the modification itself but also the rest of the modified amino-acid side-chain, sometimes together with the surrounding sequence. Nonetheless, the modification makes a key contribution to binding, rendering the interaction PTM-dependent. This can be seen for the interaction between an SH2 domain and a phosphotyrosine peptide, where the anionic phosphoryl moiety in the modified peptide is surrounded by three positively charged arginine or lysine residues and two serine hydrogen-bond donors in the reader (Figure 6F). Methyllysine-recognising tudor domains, on the other hand, surround the methylated lysine side-chain with an aromatic cage (Figure 6G). In the case of poly(ubiquityl)ation, different reader domains or motifs can distinguish between different types of ubiquitin chains (different inter-ubiquitin linkages). [67] The specificity of readers not only for the PTM type but also for the particular modified substrate can come from the reader domains favouring particular sequence motifs around the PTM site or from secondary interactions between other elements in the reader protein and in the modified substrate. In addition to folded reader domains, some modifications can be recognised by simple reader motifs, as discussed at the end of Section ‘Evolutionary development of PTM systems’ below.
[Figure 6 caption, partial] […] catalytic activity. (D) and (E) A schematic representation of selected mechanisms by which PTMs can regulate nuclear import. The nuclear import signal can be PTM-dependent, requiring strengthening by a PTM to confer efficient translocation through the pore (D). Alternatively, the nuclear import signal can be constitutively active but be weakened or blocked upon modification (E). (F) and (G) A three-dimensional structural representation of ‘reading’ of PTMs by specific domains. The recognition of a phosphotyrosine-containing peptide by the SH2 domain of LCK and of a trimethyllysine-containing histone H3 peptide by the double tudor domain of KDM4A is shown. PDB entries 1LCJ and 2GFA. (H) and (I) A comparison of domain-mediated and peptide-mediated PTM ‘reading’. For each situation, a schematic and a corresponding illustrative structure are shown. A modification that is small in size is typically recognised by folded domains that contain a pocket which can surround the modification and its flanking sequence, increasing the interaction surface (H). In contrast, a large modification such as SUMO can itself contain a pocket for binding a linear motif on a reader (I). PDB entries 1LCJ and 6JXW. Acc, acceptor.
In the cases where a modification promotes physical interaction with proteins that have proteolytic functions - as with ubiquitin-dependent targeting to the 26S proteasome discussed in Section ‘Protein phosphorylation as a regulatory switch’ (Figure 2B) or similar mechanisms dependent on Pup (prokaryotic ubiquitin-like protein; note that this is an analogue, not a homologue, of ubiquitin) and phosphoarginine in bacteria - modification can act as a trigger of protein degradation. The self-compartmentalised proteases involved in these processes contain reader domains that recognise the appropriate PTMs, with at least three different ubiquitin receptors present within the eukaryotic 26S proteasome [195] and 12 repeats of a similar phosphoarginine-binding site in the bacterial protease ClpCP. [43] One step upstream of the degradation signal itself, eukaryotic proteolysis can be promoted or inhibited by PTMs that either increase or decrease a substrate's interaction with an E3 ubiquitin ligase. [160]

EVOLUTIONARY LOGIC OF EMERGENCE AND EXPANSION OF PTM SYSTEMS
When discussing the evolutionary ‘rationale’ for the emergence of specific PTMs and for various aspects of the PTM reactions (for instance, the use of specific donors), it is important to bear in mind that the evolution of a new process does not simply tend towards a theoretically optimal solution to a specific problem. Thus, it is not possible to claim that the existing PTMs are an all-optimal tool for intracellular signalling or, more generally, protein regulation (although it should be possible to argue that they are well-suited for this task - as I have in Section ‘Molecular effects of PTMs’ above). Evolving processes, shaped as they are by historical contingency and entrenchment, should be seen in the context of specific evolutionary histories and potentialities and limitations of resources at hand. As François Jacob remarked, evolution proceeds by ‘tinkering’ and is ‘a matter of using the same elements, of adjusting them, of altering here or there, of arranging various combinations to produce new objects’. [196] It is within such a framework that I discuss some hypotheses about PTM emergence and expansion below.

The conflict context hypothesis
L. Aravind Iyer and co-workers trace back the evolutionary origin of several key regulatory PTMs, alongside nucleic acid modifications, to a phase in the history of prokaryotes, which might overlap with a period in Earth's history known as the great oxygenation event and be a key to the origin of eukaryotes. [197,198] According to this scenario, following the development of more basic metabolic processes, competition within and between prokaryotic species as well as between prokaryotes and phages led to the rapid development of systems for attacking and defending, which relied to a large extent on enzymatic modification of small molecules, nucleic acids and proteins. Modifications could serve to meddle with the enemy's physiology (for example by blocking essential activities or interactions) as well as to inactivate the enemy's weapons (by modifying and thus inactivating an antibiotic, a protein toxin or an infectious nucleic acid). In Aravind's account, increasing evolutionary pressure characteristic of the conflict context could account for the explosive innovation in terms of new catalytic activities, many of which catalysed the transfer of chemical groups as a means of interfering with an opponent or protecting from their interference.
Such a context could also explain the emergence of PTM erasers as ‘anti-toxins’ that counteract PTMs. The rich enzymatic repertoire of writers and erasers would later be re-used and developed for regulatory, and especially epigenetic, purposes by prokaryotes themselves and especially emerging eukaryotes as an example of a ‘«peacetime» use of «wartime» inventions’. [197] It could be added that the use of PTMs for attack and defence is still widely observed in the living world. For instance, some of the notorious bacterial toxins (including cholera and diphtheria toxins) are protein ADP-ribosylating enzymes that target the essential processes of a host. In another, more recent, example, a human E3 ligase RNF213 is involved in ubiquitylating invading intracellular bacteria as part of a defensive response. [199]
The ancient conflict context could conceivably explain not only that many potential writer activities developed but also some of the features of the protein writer enzymes. For example, these enzymes, while quite specific for donors, can often target various substrates and relatively rapidly change their substrate preference on an evolutionary timescale, even switching between protein and non-protein substrates.

New use for ‘old’ donors and mechanisms
Whether the conflict scenario of PTM origin is true or not, PTMs did not develop in a vacuum but in cells that already contained certain resources (genes, chemicals), and these ‘starting conditions’ must be included in an evolutionary account of PTM origins. In particular, cells already possessed suitable donor molecules, which were present there for other reasons. Indeed, as already mentioned, donors used for many key PTMs (ATP, SAM, NAD, acetyl-coenzyme A, lipid-coenzyme A, GDP-sugars etc.) are also used in primary metabolism, whether as donors for metabolite modification or as ‘energy molecules’ (as with ATP and NADH/NAD). [132] Some of these core compounds might have originally emerged through spontaneous reactions in a primordial environment (as proposed for SAM [210]), but by the time enzymatic PTMs became prominent regulatory mechanisms, these molecules were likely maintained at relatively high levels in the cell through dedicated biosynthetic and salvage pathways. Relying on the same compounds as the ones used for energy homoeostasis has an added advantage of allowing for a direct crosstalk between the metabolic state of the cell and protein regulation by PTMs. [211]
One prominent exception to the repurposing of ancient metabolic donors for PTMs is ubiquitylation and other ubiquitin-like PTMs. But even here, a suitable donor - a C-terminally activated protein modifier - arguably first evolved for another purpose, to serve as a ‘sulphur carrier protein’ in sulphur transfer reactions. 18][219][220] A different example of repurposing an existing resource (this time, an enzymatic activity) for a PTM reaction is provided by the Pupylation pathway of Mycobacterium tuberculosis. [42] Here, the writer enzyme PafA, which ligates a small intrinsically disordered protein called Pup to substrates, is related to metabolic enzymes including glutamine synthetase, which catalyse the attachment of a glutamyl moiety to an amino group in a biosynthetic pathway. [221][224] A further case of potential repurposing is represented by the possible evolution of enzymatic activities for histone lysine methylation and acetylation from enzymes that catalyse corresponding modifications of polyamines such as spermine and spermidine, cationic compounds widely found in living cells. [225]

Chemical driving forces
Apart from the historical context and the available resources, the course of evolution is shaped by the inherent reactivities of potential PTM acceptors and donors. Although - as stressed above - chemical groups found in biological systems are relatively inert by conventional chemical standards, some are more likely than others to engage in chemical reactions. At initial stages of PTM evolution, when relevant enzymatic activities only began to emerge or were switching from a non-protein to a protein substrate, catalysis was likely weak, making the intrinsic substrate reactivities relatively more important in the process of emergence of a new PTM. Thus, acceptor residues that are more inherently nucleophilic and donor molecules that contain an electrophilic modification and/or a good leaving group are, all other things being equal, more likely to become substrates of an enzymatic reaction than less reactive acceptors and donors. Indeed, some canonical PTM reactions are observed to proceed even without enzymes, and we could envision some of the most ancient PTM events to have been quasi-spontaneous reactions, happening, for example, in active sites of metabolic enzymes that interact with reactive donors. [211] From the point of view of thermodynamics, the maintenance of high levels of the donor compounds and low levels of the corresponding free leaving groups by the cell - which likely preceded or coincided with the emergence of PTMs - could drive modification reactions as soon as they were kinetically permitted.
The final, most ‘classical’ part of the search for an evolutionary ‘logic’ of PTM emergence has to do with the physiological advantage of PTMs.
The supposed existence of such an advantage is what promoted the retention and expansion of accidental genetic changes that supported PTM emergence and development.
Here, the answer to the question of whether PTM systems could be advantageous lies in the enormous regulatory potential of even the smallest covalent addition, discussed in more detail in Section ʻMolecular effects of PTMs' above.

Evolutionary development of PTM systems
In addition to the initial emergence of PTMs, the current widespread use of protein modification by living organisms reflects the expansion and fine-tuning of PTM systems in the course of natural history. Some major PTMs, notably phosphorylation and acetylation, are universally present across bacteria, archaea and eukaryotes. [105,211] Other modifications (and some specific sub-types of those mentioned) emerged after the common ancestor of these three domains of life and exist only in some of the lineages, the best example being ubiquitylation, which is present in archaea and eukaryotes but not bacteria. [105] Lastly, instances of PTM loss in particular lineages and horizontal transfer between lineages muddle a simple narrative of PTM evolutionary history. An example of a PTM with a complex history is protein ADP-ribosylation, which is present in species from all kingdoms of life, but apparently lacking in some notable species (such as yeast), and has spread at least in part through horizontal gene transfer. [226]
Across kingdoms of life, PTMs tend to show two distinct expansion trends. On the one hand, we observe a diversity of PTMs related to the range of lifestyles and occupied habitats. This is particularly evident in bacteria, where various species differ widely in their PTM repertoires and represent a large PTM diversity when considered collectively. On the other hand, we observe an expansion of PTM systems that is seemingly correlated with increasing organismal complexity and which might be linked to the challenging development and signal processing in complex, multi-cellular organisms. The latter trend has been at play in higher eukaryotes. Both trends can be illustrated by the evolution of protein phosphorylation. On the one hand, phosphorylation exhibits large diversity in bacteria, with not only canonical protein kinases that modify serine and threonine amino acids, but also non-canonical protein tyrosine, histidine and arginine kinases, which, generally, are not found in eukaryotes. [56,227,228] On the other hand, the canonical protein kinase domain and the associated substrate, eraser and reader pools for hydroxyl amino-acid phosphorylation have expanded tremendously during the emergence and later evolution of eukaryotes. By the time the last eukaryotic common ancestor appeared, the set of different canonical protein kinases (presumably mostly targeting serine and threonine residues) seems to have already expanded to almost a hundred members, [229] and it later expanded further, reaching several hundred in humans. One particular branch of the canonical protein kinase family whose development has been much studied is protein tyrosine kinases, which are not present in bacteria or yeast but emerged in higher eukaryotes by switching their acceptor specificity from serine/threonine to tyrosine (they are different from non-canonical bacterial tyrosine kinases).
Wendell Lim and Tony Pawson have discussed how, following the individual emergence of three tyrosine-specific functions (a kinase, a phosphatase and a reader domain, SH2), this new complete signalling ʻtoolkit' dramatically expanded, becoming a crucial means of cell-to-cell communication in metazoans. [230] Interestingly, the expansion of phosphotyrosine signalling seems to have gone hand in hand with a decrease in tyrosine frequency in the proteome, possibly to avoid unintended phosphorylation. [231] Another fascinating example related to the evolution of protein phosphorylation in eukaryotes is provided by the replacement, in the course of evolution, of some glutamate or aspartate residues in key positions of certain proteins with phosphorylatable serine, threonine or tyrosine residues. [232] Phosphorylated versions of these residues can functionally mimic negatively charged glutamate/aspartate residues, allowing phosphorylation-dependent regulation of protein structure and function.
As illustrated by the development of phosphotyrosine signalling, the expansion of PTM systems in eukaryotes is closely related to the modular nature of eukaryotic proteins. Over the course of evolution, protein domains responsible for writing, erasing and reading modifications (once all three types are found together in one species, creating an advantageous functional toolkit) can be individually duplicated and re-combined in new arrangements with other domains, thus achieving specialisation for new functions and sub-cellular niches. This is powerfully illustrated by the human family of 17 PARP proteins, most of which are active as ADP-ribosylation writers. [233] Each of these proteins contains the same writer domain, related to bacterial ADP-ribosylating toxins such as cholera and (especially) diphtheria toxin, but, in PARPs, these ancient domains have diverged in their catalytic and allosteric properties and become combined with other domains and motifs. As a result, distinct PARPs catalyse ADP-ribosylation on different acceptor residues and in different functional contexts ranging from DNA repair to anti-viral immunity, and localise to various cellular compartments. One of the human PARPs, PARP14, has recently been shown to be not only a writer, but also a substrate, a reader and an eraser of ADP-ribosylation, each function being associated with a different region or domain in this large multi-domain protein. [234,235] In addition to evolution based on domain duplication, the expansion of PTM systems in eukaryotes benefits from some elements of these systems relying on short linear motifs (SLiMs), which are widespread in eukaryotic proteins. [236] Due to their simplicity and location in intrinsically disordered regions that are tolerant to mutations, such motifs can relatively rapidly emerge both ex nihilo and through duplication, and then further adjust or disappear in the course of evolution, thus facilitating the expansion and fine-tuning of the PTM systems of which they are part. In particular, this applies to PTM sites on substrates, which are enriched in loops or disordered regions, [237] but it is also relevant to secondary docking sites on substrates such as degrons.
Fast-evolving linear motifs might also, in some cases, be responsible for PTM reader or even writer functions, provided the modification is sufficiently large. This is related to the idea that, to achieve sufficient binding affinity, heterodimeric interactions typically require at least one relatively large partner, which, when physically interacting with a smaller molecule or motif, is able to increase the interaction surface by surrounding the small partner. Following this logic, PTMs that consist of small chemical groups tend to be recognised by protein domains with dedicated pockets (reader domains) that encapsulate the modification and its surrounding sequence (Figure 6H). In contrast, large modifications, particularly those involving a proteinaceous modifier, are able to form sufficiently strong interactions with short protein motifs. The best-studied example of this phenomenon is the ubiquitin-like protein SUMO, which can interact with short hydrophobic SUMO-interacting motifs or SIMs [238][239][240] (Figure 6I). SIMs can, therefore, function as very simple SUMO reader modules. The distinction between domain- and motif-based PTM reading is important from an evolutionary point of view, considering the favourable evolutionary properties of linear motifs. It also has implications for reader detection, as domains are easily found through sequence or structural homology, but functional linear motifs are more challenging to conclusively identify with bioinformatic tools alone. Hydrophobic linear motifs such as SIMs might emerge particularly rapidly during protein evolution, as hydrophobic substitutions are apparently favoured during random mutagenesis due to their codon composition. [241] Linear motifs have also been reported for ADP-ribosylation binding [194] and helical, but simple, motifs are known for ubiquitin recognition. [192] Of note, a particular SIM-linker-SIM region found in zinc-finger protein 451 (ZNF451) is sufficient to confer on this human protein a SUMO E3 ligase activity, [242] showing that short motifs can also participate in the writing of some PTMs.
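The difficulty of identifying functional linear motifs computationally can be illustrated with a minimal sketch. The pattern below is a deliberately simplified, hypothetical SIM-like pattern (a four-residue stretch with three V/I/L residues and one free position); it is an assumption made for illustration, not a consensus defined in this essay. The function name `find_sim_like` is likewise invented here.

```python
import re

# Hypothetical, simplified SIM-like pattern (an assumption for illustration):
# four residues, three of them hydrophobic (V/I/L) and one free position.
# The lookahead (?=...) lets overlapping candidates be reported.
SIM_LIKE = re.compile(r"(?=([VIL][VIL].[VIL]|[VIL].[VIL][VIL]))")


def find_sim_like(sequence):
    """Return (1-based start position, matched residues) for each candidate."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SIM_LIKE.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    for start, residues in find_sim_like("AAVIDLTAA"):
        print(start, residues)
```

Such a scan returns candidates only. Short hydrophobic stretches are common in protein sequences, so hits of this kind would still need structural context (for example, location in a disordered region) and experimental validation, in line with the point made above.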

Concluding remarks
The field of protein PTMs has evolved from small areas of research dedicated to the regulation of specific processes (initially sugar metabolism) to an enormous and diverse discipline that touches on every aspect of molecular biology across all kingdoms of life. With the number of known PTMs exceeding 650, there is apparently no end to nature's creativity in modifying protein structure with additional elements that can serve regulatory roles. The mere fact that so many PTMs exist already suggests that they might serve non-redundant functions and cannot be exhaustively described in general terms. In the end, PTMs are not generic ʻon' and ʻoff' signals but particular chemical groups, with their physicochemical properties, and, similarly, proteins that regulate them are complex entities with individual structural and mechanistic features. The purpose of the above essay was not to obscure this diversity, but to point to some analogies that can be found, despite the diversity, due to convergent evolution. The appreciation of analogies between diverse systems might help relate the knowledge obtained for one system to open questions in another.
Indeed, the history of the PTM field is a good example of how concepts

CONFLICT OF INTEREST STATEMENT
The author declares no conflicts of interest.

FIGURE 1
Selected milestones in characterising and understanding protein PTMs. The discussion of the indicated discoveries and references to publications can be found in the main text.

FIGURE 2
Key paradigms in PTM research. In all panels (and other figures in this article), modifications are shown in light red, protein substrates in green, writers in blue, erasers in yellow and readers in violet. (A) Regulation of the glycogen-degrading activity of the enzyme glycogen phosphorylase by protein phosphorylation. Phosphorylation and dephosphorylation of this enzyme are ultimately regulated by the hormones glucagon and insulin, through signalling pathways indicated schematically with dashed arrows. (B) Protein ubiquitylation as a signal for degradation by the 26S proteasome. The ubiquitylation reaction is catalysed by an enzymatic cascade composed of E1, E2 and E3 proteins and requires ATP. A degron motif on the substrate promotes ubiquitylation by physically interacting with the E3 ligase. A poly(ubiquityl)ated substrate is recognised by receptor proteins within the 26S proteasome, unfolded and degraded. (C) Regulation of chromatin structure and gene expression by the histone code. Protein modifications on histone tails are installed by writer enzymes, removed by eraser enzymes and recognised by reader proteins. (D) A general scheme of protein regulation by PTMs based on panel C. (E) Sources of variation that produce multiple proteoforms from a single protein-coding gene. A single gene can be alternatively spliced to produce several isoforms, which can be further diversified through differential PTM patterns. Additional sources of proteoform diversity omitted from the figure include, for example, single-nucleotide polymorphisms and alternative translation start sites. Ac, acetylation; Me, methylation; P, phosphorylation; Ub, ubiquitin.

when describing the ʻchemical logicʼ of PTMs. My review aims to commemorate Walsh's extraordinary contribution in the wake of his passing.

FIGURE 3
Chemical diversity and similarity among PTMs. (A) A selection of PTMs differing in chemical character and size. Three-dimensional structural representations of different modifications together with the modified amino-acid residue and the flanking backbone are shown at the same scale based on PDB entries: 1T2V (phosphoryl), 2WP1 (acetyl), 5EMW (palmitoyl), 7AKS (ADP-ribosyl), 5LN1 (ubiquityl). (B) A schematic representation of a protein modification reaction. PTM reactions are typically nucleophilic substitution reactions involving a nucleophilic acceptor amino-acid residue on the substrate and a donor molecule, from which a leaving group is released upon reaction.

FIGURE 4
Donors of PTMs and catalytic mechanisms of protein modification. In all panels, the modifications are shown in light red, while the leaving groups are indicated in grey. (A) A simplified catalytic mechanism of a PTM reaction using protein phosphorylation as an example. The enzyme recognises the donor and the modified motif in the substrate and positions and activates both elements. The acceptor amino-acid residue in the substrate can be activated through deprotonation by a catalytic base. (B) A simplified catalytic mechanism of protein ADP-ribosylation by the HPF1:PARP1 complex showing analogy to the phosphorylation mechanism shown in (A). (C) Chemical structures of donors of protein methylation and acetylation. (D) A schematic chemical structure of the donor of ubiquitylation, the ubiquitin∼E2 thioester molecule. Protein structures are not shown to scale compared to chemical bonds. A schematic below provides a simplified picture of the E1-catalysed reaction in which Ub and E2 become covalently joined to form the ubiquitin∼E2 thioester.

FIGURE 5
Mechanisms of substrate selection by PTM writers. (A)-(D) Schematic illustrations of selected mechanisms through which a writer enzyme recognises its substrates. The writer can use its active site to recognise a specific sequence motif surrounding the modified residue (A). Additionally or alternatively, it can utilise a secondary site located on the same domain as the active site (B), on another domain (C) or on a separate receptor subunit (D). The secondary substrate-docking site recognises a specific docking motif on the substrate distinct from the modified site. The modification site is indicated with a red arrow. (E) A schematic representation of recognition of nucleosome by a PTM writer through a motif that docks to the acidic patch composed of residues from histones H2A and H2B. (F) A schematic representation of co-localisation of a PTM writer and its substrate through the binding of both elements to DNA. (G) A three-dimensional structural representation of a SUMOylation consensus motif derived from a model SUMOylation substrate, RANGAP1, bound to the active site of the SUMOylation writer, the E2 protein UBC9. The figure was made using a fragment of PDB entry 2GRN. (H) A three-dimensional structural representation of a docking motif (ʻkinase-interaction motif') of MKP3 bound to a secondary docking site at the back of the protein kinase domain of ERK2. PDB entry 2FYS. (I) A three-dimensional structural representation of the interaction between the protein kinase LIMK1 and its specific substrate, cofilin-1, which results in the positioning of the main modification site, serine 3, in the kinase active site. The structure shows a post-reaction state in which serine is phosphorylated and ADP is still bound to the active site. Note an extensive interaction surface that ensures specificity for this particular substrate and site. PDB entry 5HVK. (J) A schematic representation of two steps needed for the formation of chains of modifiers such as ADP-ribosyl, ubiquityl or SUMOyl. The initiation/priming and elongation steps may be catalysed and reversed by distinct writers and erasers. The formation of linear or branched chains is shown. (K) A three-dimensional structural representation of the composite acetyl-methyl modification of a lysine residue. PDB entry 8SB6.

covered - can potentially increase the complexity of functional signals encoded by PTMs.

Reversibility
PTMs differ in how transient or stable they are in the cell. Some stable modifications (notably many canonical cases of lipidation and glycosylation) might be best described as final maturation steps needed for proper localisation and constitutive function of a protein. Other PTMs, including phosphorylation, methylation, acetylation, ADP-ribosylation, ubiquitylation, SUMOylation or a specific nuclear/cytoplasmic type of glycosylation termed O-GlcNAcylation (the attachment of O-linked N-acetylglucosamine), tend to be reversible.

inhibition of one enzyme by another, writer-reader competition can create sophisticated regulatory tools (switches, feedback and feedforward loops, clocks), as best seen during the cell cycle, where networks

FIGURE 6
Regulation and functional consequences of PTMs. (A) A schematic representation of protein demodification through hydrolysis. A hydroxyl moiety from water becomes attached to the modification instead of the acceptor protein amino-acid residue. Proton abstraction from water is not shown. Note that not all PTMs are erased through this simple mechanism. (B) A schematic representation of a two-step hydrolysis mechanism used by some erasers (protein tyrosine phosphatases and some deubiquitinases), in which the modification is first transferred onto a reactive cysteine or serine residue in the eraser before being hydrolysed. (C) A three-dimensional structural representation of a phosphorylation-dependent activity switch in protein kinases. The protein tyrosine kinase domain of insulin receptor in its dephosphorylated and phosphorylated forms is shown based on PDB entries 1IRK and 1IR3. The activation loop is indicated in dark blue. Three tyrosine residues on this loop are seen modified in the phosphorylated form, which primes the kinase for substrate binding (a substrate peptide and ATP are shown) and

developed in one context stimulate formulating hypotheses in another, and how different experimental and theoretical approaches combine in formulating a more complete picture. With the current tremendous advancements in available techniques, it can only be wished that this cross-fertilisation should continue and develop in the future.

ACKNOWLEDGEMENTS
I acknowledge the contribution of, and apologise to, all researchers in the PTM community whose research and ideas have not been covered in this text or have been unsatisfactorily discussed. I thank Philip Cohen, Tim Clausen and Ivan Ahel, who have introduced me to the world of different PTMs. I thank the anonymous reviewers for their valuable feedback, Ivo Alexander Hendriks for advice on PTM proteomics, and my colleagues Bertrand Castaing, Stéphane Goffinont, Franck Coste, Lucija Mance and Aanchal Mishra for the discussions on related topics. The responsibility for any remaining errors lies with me. The initial version of the essay was based on seminars that I have given in various places and I thank the participants for all the questions asked on these occasions. My research is financially supported by the European Union's Horizon Europe Research and Innovation Programme (ERC Starting Grant 'SUMOwriteNread', no. 101078837), La Ligue contre le Cancer and the Centre National de la Recherche Scientifique (CNRS). I am an associate fellow of Le Studium Loire Valley Institute of Advanced Studies and the ATIP-Avenir programme.