Post-translational modification of proteins is ubiquitous and mediates many cellular processes, including intracellular localization, protein–protein interactions, enzyme activity, transcriptional regulation and protein stability. While the role of phosphorylation as a key post-translational modification has been well studied, the more evolutionarily conserved post-translational modification acetylation has only recently attracted attention as a key regulator of cellular events. Protein acetylation has been largely studied in the context of its role in histone modification and gene regulation, where histones are modified by histone acetyltransferases to promote transcription. However, more recent acetylomic and biochemical studies have revealed that acetylation is mediated by a broader family of protein acetyltransferases. The recent structure determination of several protein acetyltransferases has provided a wealth of molecular information regarding structural features of protein acetyltransferases, their enzymatic mechanisms, their mode of substrate-specific recognition and their regulatory elements. In this review, we briefly describe what is known about non-histone protein substrates, but mainly focus on a few recent structures of protein acetyltransferases to compare and contrast them with histone acetyltransferases to better understand the molecular basis for protein recognition and modification by this family of protein modification enzymes.
Post-translational modification of proteins provides an essential mechanism for organisms to react to external and internal stimuli, resulting in a myriad of cellular responses. The importance of phosphorylation as a key post-translational modification cannot be overstated, as evidenced by the existence of more than 500 protein kinases that catalyze this reaction, the even larger number of known substrates, and the diverse array of biological pathways that are regulated by these enzymes. The oncogenic properties of several kinases have also made them attractive pharmacological targets for the development of specific and potent enzyme modulators.
Protein acetylation has recently emerged as a significant rival to phosphorylation in terms of its biological scope and significance, although our molecular understanding of protein acetylation and the enzymes that catalyze this reaction has lagged behind that for kinases [4, 5]. Much of our basic understanding has come from the investigation of histone acetylation and the histone acetyltransferases (HATs) that mediate this modification [6-8]. The correlation between histone acetylation and gene activation has been known for nearly half a century, but some of the enzymes responsible for these modifications have only been discovered over the past 20 years [9-12], and there may be many other protein acetyltransferase enzymes yet to be discovered. To date, many HATs have been identified from yeast to humans, resulting in at least five distinct sub-families based on sequence and structural divergence: HAT1, Gcn5/PCAF, MYST (MOZ, Ybf2, Sas2 and Tip60), p300/CBP and Rtt109 [13-22]. Structure elucidation and biochemical characterization of members of each of these HAT sub-families have revealed both structural and biochemical similarities and differences. Each of these HAT sub-families contains a structurally conserved core region comprising a three-stranded β-sheet and a helix on one side of the sheet, which participates in acetyl CoA co-factor binding and in templating the respective substrate protein for acetylation. Remarkably, this core region has little to no sequence homology between the HAT sub-families. The various HAT sub-families contain structurally divergent regions that flank the core region, and also use different chemical strategies to acetylate their substrates. These structurally divergent regions and catalytic strategies appear to participate in HAT-specific activities such as substrate-specific binding and regulation by co-factor proteins.
While the majority of studies on protein acetylation have focused on histones, many HATs, such as members of the CBP/p300 and MYST families, have been shown to acetylate non-histone proteins. In addition, acetyltransferases that exclusively acetylate non-histone proteins (referred to here as protein acetyltransferases, PATs) have been identified more recently [23, 24], such as the eukaryotic α-tubulin acetyltransferase 1 (αTAT1) [25, 26] and the Mycobacterium tuberculosis acetyl CoA synthetase acetyltransferase Rv0998. Furthermore, the majority of eukaryotic proteins are modified on their N-termini by a family of N-terminal acetyltransferases [28, 29]. Recently, the structure and mechanism of these non-histone acetyltransferases have been characterized, providing an opportunity to compare the molecular determinants that differentiate these protein acetyltransferases from their HAT relatives. In this review, we briefly summarize the broad range of non-histone substrates of acetylation, but primarily focus on the structure and mechanism of action of non-histone protein acetyltransferases.
Non-histone acetylation targets
Recent acetylomic studies in several cell lines revealed that thousands of proteins are acetylated in various cellular compartments to mediate a wide variety of biological processes [30, 31]. While these studies reveal that acetylation maps to many different types of proteins to potentially mediate many different types of acetylation-dependent protein activities, it is clear that protein acetylation at least affects transcription factor function, protein–protein interaction, protein stability and enzyme activity.
Several proteins involved in gene regulation are modified by acetyltransferases, resulting in specific gene regulatory events. A prototypical acetylated transcription factor is the tumor suppressor protein p53, which is acetylated by the HAT p300/CBP on several lysine residues to promote DNA binding and transcriptional activity [33-35]. p53 is also acetylated by the HAT Tip60 to regulate its apoptotic function [36, 37]. Acetylation of p53 is reversible, and deacetylation by histone deacetylase 1 (HDAC1) represses its transcriptional activity [38, 39]. Conversely, acetylation of the Yin Yang 1 (YY1) protein by p300/CBP at two domains results in decreased DNA binding [40, 41]. Transcription factor acetylation may also have stimulatory or inhibitory effects depending on the site of modification. For example, transcription of the interferon-β gene mediated by the HMG-A1 (High Mobility Group-A1) transcription factor is stimulated by lysine 71 acetylation by the HAT PCAF (p300/CBP-associated factor) but inhibited by lysine 65 acetylation by the HAT CBP (CREB-binding protein) [42, 43]. Other transcription factors that are known to be regulated by acetylation include c-MYC (cellular myc oncogene), NF-κB (nuclear factor kappa B), MyoD (myogenic differentiation) and E2F (E2 promoter binding factor) [44-50]. Transcription factor acetylation has been reviewed extensively elsewhere [23, 24].
Other DNA transactions have also been shown to be regulated by protein acetylation. For example, protein acetylation has been shown to play a role in DNA replication through direct acetylation of the cohesin protein complex that mediates appropriate separation of sister chromatids during mitosis. This acetylation is mediated by the establishment of cohesion 1 and 2 proteins (ECO1 and 2) [52, 53], while subsequent deacetylation of the cohesin subunit Smc3 by Hos1 (Hda one similar 1) destabilizes the complex and allows separation of sister chromatids during anaphase.
Protein acetylation also affects protein–protein interaction. For example, the acetylation of importin-α by the HAT p300 promotes its interaction with importin-β, resulting in transport of the RNA-binding protein HuR through the nuclear pore complex, and disruption of this event has been shown to prevent import of this mRNA-stabilizing protein [55, 56]. α-tubulin represents another extra-nuclear protein that has been known for almost 30 years to be highly acetylated, but the exact function of this modification remains unknown [57-59]. Acetylated α-tubulin is generally associated with highly stable, long-lasting microtubules, with acetylation of microtubules apparently being linked to vesicle trafficking along them [58, 60].
The acetylation of certain enzymes has also been shown to modulate their catalytic activities. This is highlighted by recent studies focusing on the protein acetylome in both prokaryotic and eukaryotic organisms, revealing that a large number of metabolic enzymes involved in glycolysis, gluconeogenesis, the TCA cycle, fatty acid synthesis and degradation are acetylated [61-66]. These studies also revealed that, in a number of cases, the acetylation status affected enzyme activity to control the direction of carbon flux in a pathway. For example, in human liver cells, fatty acid production led to elevated acetylation of the β-oxidation multi-functional enzyme enoyl CoA hydratase/3-hydroxyacyl CoA dehydrogenase and increased its catalytic activity. The control of metabolism via acetylation of enzymes includes inactivation of the mitochondrial acetyl CoA synthetase, a key enzyme in production of the acetyl CoA co-factor of acetyltransferases, via acetylation of an active site lysine [67-69]. The regulation of metabolism through protein acetylation is reviewed more extensively elsewhere.
The majority of eukaryotic proteins are also co-translationally acetylated on their N-termini by the N-amino terminal acetyltransferases (NATs), a family of proteins that have sequence homology to the GNAT (Gcn5-related N-acetyltransferase) proteins. N-terminal acetylation regulates a wide range of biological processes [70-73]. For example, Hwang et al. demonstrated that N-terminal protein acetylation regulates the N-end rule for protein degradation, and Scott et al. demonstrated that N-terminal methionine acetylation of the E2 enzyme Ubc12 modulates its binding to the Dcn1 E3 ligase to facilitate attachment of the ubiquitin-like protein Nedd8 to its Cul1 protein substrate. The anti-apoptotic protein Bcl-xL has also been shown to promote cell survival through down-regulation of acetyl CoA biosynthesis and N-terminal acetylation. In yet another study, N-terminal acetylation was shown to inhibit protein targeting to the endoplasmic reticulum, thus revealing that the modification also plays a role in protein localization. Taken together, the picture that is emerging is that protein acetylation has as diverse and as important a regulatory role as protein phosphorylation in biology.
Structure and mechanism of non-histone protein acetyltransferase enzymes
Over the last few years, we have begun to accumulate biochemical and structural data on the enzymes that mediate non-histone protein acetylation. This puts us in a position to compare and contrast these enzymes with HATs to better understand the activities and substrate-binding specificities of the broader family of protein acetyltransferases. In this review, we focus on four non-histone protein acetyltransferases as model systems: the human α-tubulin acetyltransferase αTAT1, the human N-amino terminal acetyltransferase Naa50p, the Mycobacterium tuberculosis ACS (acetyl CoA synthetase) acetyltransferase Rv0998, and the Sulfolobus solfataricus ALBA (acetylation lowers binding affinity) DNA-binding protein acetyltransferase SsPAT. We specifically compare their overall structures and acetyl CoA binding pockets, mode of catalysis, substrate-selective binding and mode of regulation.
Overall structure and acetyl CoA binding
The structures of the αTAT1, Naa50p, Rv0998 and SsPAT acetyltransferases are most similar to the Gcn5-related N-acetyltransferase fold (Fig. 1). This observation does not exclude the possibility that other, as yet uncharacterized, non-histone acetyltransferases may contain folds similar to the other classes of acetyltransferases. Not surprisingly, each of these proteins contains a β-sheet-helix core region that is structurally conserved among all HATs (colored blue in each panel of Fig. 1), despite the lack of sequence conservation in this region. As for the various HAT sub-families, the structurally conserved core region is flanked by more variable N- and C-terminal segments (colored green in each panel of Fig. 1). Despite the structural variability of the N- and C-terminal segments, each of these proteins contains a binding cavity or groove formed by the core region and flanked on opposite sides by helices and loops from the N- and C-terminal segments. The shapes of the respective binding grooves or cavities are different and likely contribute to substrate-specific binding, as described in more detail below.
As in the HATs, the central core region of the non-histone protein acetyltransferases participates in a large number of the acetyl CoA binding and stabilizing interactions, with the remainder of the acetyl CoA interactions being formed by two flanking α-helices found in each of the structures. Residues that participate in acetyl CoA interaction are typically not conserved. Notably, each of the acetyltransferases appears to use CoA, not only as an acetyl donor, but also as a molecule to stabilize the overall fold of the acetyltransferase domain. It is difficult to imagine folding of these domains in the absence of co-factor, and it is not surprising that very few acetyltransferase structures have been reported without a bound co-factor or co-factor analog [14, 79].
Mechanism of catalysis
A surprising finding when comparing the various HAT sub-families is that they use different chemical strategies to perform the same chemical reaction of acetylation. This probably derives from the relative chemical simplicity of transferring an acetyl group from a CoA thioester to an amine of a lysine side chain, which allows these enzymes to use chemical strategies that best suit their respective biochemical activities. This catalytic diversity within the HATs appears to extend to the non-histone protein acetyltransferases.
The Gcn5/PCAF and Gcn5-related N-acetyltransferases have been very well characterized structurally and enzymatically, and utilize a conserved glutamate residue that functions as a general base in a bi-bi ternary complex mechanism [79-81] (Fig. 2A). In this mechanism, both the acetyl CoA and the substrate protein must be bound to the enzyme before catalysis can occur, and the turnover number (kcat) for Gcn5 is high (approximately 210 min−1). Interestingly, although SsPAT contains a glutamate residue (E76) at the equivalent position in 3D space, it does not utilize this residue exclusively as a general base to facilitate deprotonation; instead, a number of residues (Y38, E42, E43, D53, H72 and E76) act as a proton wire to shuttle the proton from the active site (Fig. 2B). The turnover number for SsPAT is significantly lower than for Gcn5 (kcat = 2 min−1), but the value reported was measured at 75°C for a thermophilic organism, so the naturally occurring rate at the higher ambient temperature for this organism may be different. Studies on Naa50p also suggest that catalysis does not rely on one specific residue, with a tyrosine residue and a histidine residue (Y73 and H112) fulfilling the role of the general base (Fig. 2C). This reaction appears to occur slightly more efficiently than for SsPAT (kcat = approximately 7 min−1 for Naa50p), but still not as efficiently as for Gcn5.
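To put the turnover numbers quoted in this section on a common footing, the following minimal Python sketch converts each kcat into an average time per catalytic cycle and a fold difference relative to Gcn5. The kcat values are those cited in this review (the αTAT1 value is taken from the discussion of that enzyme later in this section); the function and variable names are illustrative only. Because the values were measured under different assay conditions and temperatures, the comparison should be read as a rough guide rather than a rigorous ranking.

```python
# Turnover numbers (kcat, per minute) as quoted in this review.
# Assay conditions differ between enzymes, so ratios are approximate.
KCAT_PER_MIN = {
    "Gcn5": 210.0,
    "Naa50p": 7.0,
    "SsPAT": 2.0,
    "alphaTAT1": 0.12,
}


def seconds_per_turnover(kcat_per_min: float) -> float:
    """Average time (s) for one catalytic cycle at saturating substrate."""
    return 60.0 / kcat_per_min


def fold_slower_than(reference: str, enzyme: str) -> float:
    """How many times slower `enzyme` turns over compared with `reference`."""
    return KCAT_PER_MIN[reference] / KCAT_PER_MIN[enzyme]


if __name__ == "__main__":
    for name, kcat in KCAT_PER_MIN.items():
        print(
            f"{name}: {seconds_per_turnover(kcat):.1f} s per turnover, "
            f"{fold_slower_than('Gcn5', name):.0f}-fold slower than Gcn5"
        )
```

On these numbers, a single αTAT1 turnover takes on the order of minutes, whereas Gcn5 completes a cycle in well under a second.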
During structural analysis of Rv0998, it was observed that E235 is positioned in a location similar to the conserved glutamate of Gcn5 (Fig. 2D), and catalysis was ablated when this residue was mutated to an alanine. However, a more thorough enzymatic analysis has yet to be completed, so it is possible that the catalytic mechanism of Rv0998 may be more complicated.
Simultaneous reports of the structure of the αTAT1/acetyl CoA complex revealed structures with nearly identical overall folds [75, 83]. However, the investigators' interpretations of the catalytic properties of αTAT1 differed. Previous work had suggested that a conserved aspartate (D157), not a glutamate, acted as a general base for catalysis (Fig. 2E). Friedmann et al. showed that mutation of this residue resulted in a catalytically defective enzyme. Furthermore, mutation of this residue to a glutamate (D157E) failed to rescue activity, suggesting that the bond distances in the catalytic pocket are crucial for effective catalysis. A cysteine residue (C120) in close proximity to the catalytic pocket was also identified in this study. Mutation of this residue to either an alanine or serine also resulted in a catalytically defective enzyme, suggesting a role for this residue in catalysis as well. To determine whether αTAT1 employs a bi-bi ternary complex mechanism like Gcn5 or a ping-pong catalytic mechanism that may use this cysteine residue to form an acetyl–enzyme intermediate, as is the case with the MYST HAT proteins, Friedmann et al. performed a bi-substrate kinetic analysis under steady-state conditions, and found that the mechanism used is a bi-bi ternary complex mechanism. Based on these data, the authors proposed that αTAT1 uses D157 and C120 as general bases for catalysis. The kinetic analysis also supported the previous claim of a very inefficient enzyme (kcat = approximately 0.12 min−1). This observation may explain why acetylated tubulin is only found in long-lived microtubules, as dynamic microtubules are not stable for long enough for αTAT1 to act upon them. In contrast to the observations of Friedmann et al., Taschner et al. showed that D157 and a nearby glutamine residue (Q58) were both mutationally sensitive for acetylation activity (Fig. 1E), suggesting that these two residues may cooperate for general base catalysis.
Although further studies of αTAT1 are required to address this discrepancy, it appears that αTAT1 uses yet another chemical strategy for substrate acetylation, highlighting the diversity of chemical strategies used by acetylation enzymes.
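The distinction between a bi-bi ternary complex mechanism and a ping-pong mechanism, which underlies the bi-substrate kinetic analysis described above, can be illustrated with the standard textbook steady-state rate equations. The Python sketch below is a generic illustration and not a reconstruction of the published analysis: A stands for acetyl CoA, B for the protein substrate, and all kinetic constants are arbitrary placeholder values chosen for the example. It shows the classical diagnostic that, in double-reciprocal plots of 1/v against 1/[A], the slope changes with [B] for a ternary complex mechanism (intersecting lines) but is independent of [B] for a ping-pong mechanism (parallel lines).

```python
def v_ternary(a, b, vmax, ka, kb, kia):
    """Steady-state rate for a sequential (ternary complex) bi-bi mechanism."""
    return vmax * a * b / (kia * kb + kb * a + ka * b + a * b)


def v_ping_pong(a, b, vmax, ka, kb):
    """Steady-state rate for a ping-pong bi-bi mechanism."""
    return vmax * a * b / (kb * a + ka * b + a * b)


def double_reciprocal_slope(rate_fn, b, a_low=1.0, a_high=10.0):
    """Slope of 1/v versus 1/[A] at fixed [B].

    Both rate laws are exactly linear in 1/[A], so two points suffice.
    `rate_fn` takes (a, b) and returns the rate.
    """
    y_low = 1.0 / rate_fn(a_low, b)
    y_high = 1.0 / rate_fn(a_high, b)
    return (y_low - y_high) / (1.0 / a_low - 1.0 / a_high)


if __name__ == "__main__":
    # Placeholder constants (arbitrary units), not measured values.
    def ternary(a, b):
        return v_ternary(a, b, vmax=1.0, ka=2.0, kb=3.0, kia=4.0)

    def ping_pong(a, b):
        return v_ping_pong(a, b, vmax=1.0, ka=2.0, kb=3.0)

    for b in (1.0, 10.0):
        print(
            f"[B] = {b}: ternary slope = {double_reciprocal_slope(ternary, b):.2f}, "
            f"ping-pong slope = {double_reciprocal_slope(ping_pong, b):.2f}"
        )
```

With these placeholder constants, the ternary complex slope falls from 14 to 3.2 as [B] rises from 1 to 10, whereas the ping-pong slope stays at 2.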
Protein substrate recognition
The structure of Gcn5 bound to an 11-residue histone H3 peptide centered around a cognate H3K14 lysine substrate provided the first atomic view of how a HAT recognizes its cognate protein substrate for acetylation. This structure and subsequent structures of Gcn5/PCAF with longer histone or non-histone peptide substrates revealed that the protein substrate binds across an enzyme groove that is formed by the core on the bottom and flanked on opposite sides by the N- and C-terminal segments (Figs 2A and 3A). Residues in the α2 and α4 helices of the N- and C-terminal segments, respectively, mediate the majority of interactions with the peptide. Interestingly, few substrate side chains participate in interactions with the enzyme, and the majority of interactions are with the substrate backbone. This may explain the ability of Gcn5 to acetylate diverse histone and non-histone lysine residues.
Although the overall structure of Naa50p superimposes well with Gcn5, with a root mean square deviation of Cα atoms of 4.1 Å, the Naa50p substrate-binding site shows significant differences that appear to correlate with its ability to acetylate an N-terminal amino group rather than a lysine side chain amino group. Specifically, a long loop in the C-terminal segment occupies one end of the corresponding peptide-binding groove that is present in Gcn5. This makes the substrate-binding region of Naa50p much more ‘closed’ than that of Gcn5, with a width of only approximately 9 Å, instead of approximately 17 Å in Gcn5. This appears to prevent protein segments from lying across Naa50p to insert a lysine side chain. Instead, the narrower width of the Naa50p binding cleft creates a ‘tunnel’ that allows the N-terminal end of the cognate substrate to extend into the enzyme active site in an orientation that is roughly perpendicular to that of the protein substrate for Gcn5 (Fig. 3B). In addition, a hydrophobic pocket created by the α1–α2 helices and the β6–β7 strands of the N- and C-terminal segments that surround the active site accommodates the Met–Val sequence that is located at the peptide N-terminus. The residues that form this pocket are conserved among Naa50p enzymes, but not among the other classes of N-terminal acetyltransferases, and explain why Naa50p has such defined substrate specificity. Structural characterization of the other N-amino-terminal acetyltransferases will probably reveal similar overall active site architectures to accommodate N-amino protein substrates, but differences in the substrate active site are likely to specify their respective N-terminal cognate sequences.
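For readers less familiar with the root mean square deviation quoted above for the Naa50p and Gcn5 comparison, the following minimal Python sketch shows how the quantity is defined for two sets of paired Cα coordinates. It assumes that the two structures have already been superimposed and that equivalent atoms have been paired, which in practice is done by structural alignment software; the coordinates in the example are hypothetical toy values, not taken from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. angstroms)
    between two equally sized lists of paired (x, y, z) coordinates.

    Assumes the structures are already superimposed and atoms are paired.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical toy coordinates: every atom displaced by 3 units along x.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    displaced = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(f"RMSD = {rmsd(reference, displaced):.1f}")
```

Because the value averages over all paired atoms, a conserved core can superimpose closely even when flanking segments diverge and raise the overall figure.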
The lack of structural data on other PATs bound to their respective cognate substrates precludes a detailed analysis of the determinants of substrate specificity. However, the available data on the unliganded enzymes make possible some predictions that are worth noting. αTAT1 displays a strict specificity for α-tubulin, which may be mediated in part by a large positively charged surface near the presumed substrate-binding cleft (Fig. 3C) [75, 83]. This positively charged surface is flanked by small apolar and acidic residues and appears to prohibit histone binding and facilitate α-tubulin K40 recognition. The αTAT1 structure revealed structural elements flanking the conserved core that affect catalysis when mutated, and a wide substrate-binding cleft of approximately 20 Å. Additionally, mutation of a novel β-hairpin structure located in the C-terminal segment was shown to have debilitating or activating effects on catalysis in vitro and on tubulin acetylation in cells (Fig. 3C), which further validates the importance of these structural elements for substrate recognition by αTAT1.
The structure of SsPAT did not include a protein substrate peptide, but nonetheless suggested important components for substrate recognition. Similar to Naa50p, SsPAT displayed a very narrow binding cleft of 8–10 Å. However, the SsPAT binding cleft is constricted because of an unusual bent helix (α2) that sits in the space analogous to that occupied by the peptide substrate when bound to Gcn5 (Fig. 3D, green). The α2 helix of SsPAT appears to play a role in substrate recognition, as mutation of residues in this helix or residues that contact it affected catalysis. It was proposed that this helix is dynamic and participates in substrate binding, although the mechanism for this is not clear. Taken together, like the HATs, the non-histone protein acetyltransferases appear to use divergent regions N- and C-terminal to the structurally conserved core region for substrate-specific recognition.
The structure of Rv0998 does not reveal the manner by which the protein substrate binds to the enzyme, but it does reveal how an acetyltransferase may respond to allosteric signals. In this particular case, the acetyltransferase domain of Rv0998 is fused to a cAMP-binding domain. The crystal structures of auto-inhibited and cAMP-activated Rv0998 reveal the mode of cAMP-mediated activation of the acetyltransferase. Specifically, in the absence of cAMP, a portion of the acetyltransferase domain referred to as the ‘lid’ lies over the top of the catalytic glutamate, with a histidine (H173) from the loop positioned to mimic the cognate lysine substrate (Fig. 3E, pink). Upon binding of cAMP to the regulatory domain more than 30 Å from the active site, a series of conformational changes occurs that results in significant movement of the ‘lid’ to a position that allows the substrate to gain access to the active site (Fig. 3E, green). Many other acetyltransferases, such as the N-amino terminal acetyltransferases, are also regulated by binding of other protein domains, typically from other polypeptide chains, and it will be interesting to determine whether they are regulated by mechanisms similar to or distinct from that of Rv0998.
Several reports have demonstrated that auto-acetylation of HATs also serves important regulatory functions. In particular, the Rtt109, p300/CBP and MYST HAT families have all been shown to require auto-acetylation for maximal acetyltransferase activity [14, 85-88]. Interestingly, each of these HAT families is regulated by auto-acetylation in a different way. A lysine-rich loop in the p300 protein is proposed to occlude the active site, but is displaced when it is auto-acetylated, which allows binding and acetylation of the cognate substrate. Several reports have shown that the MYST acetyltransferases are auto-acetylated [89, 90], and Yuan et al. showed that a strictly conserved buried acetyl lysine residue in the active site makes essential stabilizing contacts with neighboring residues and facilitates substrate binding and catalysis. Rtt109 also contains a strictly conserved buried acetyl lysine [88, 91, 92], but this residue is approximately 8 Å away from the active site and has been shown to reduce the KM for acetyl CoA binding and increase the catalytic rate, although the mechanism for this is unclear.
A study by Kalebic et al. identified four lysine residues in αTAT1 that were auto-acetylated, and this auto-acetylation was shown to increase the catalytic activity of the enzyme towards its microtubule substrate. The exact mechanism by which this operates is not clear, as two of the lysines are distant from the catalytic domain, and the other two are at the edge of the catalytic domain and are not resolved in the crystal structure. Nonetheless, it appears that the regulation of acetyltransferase activity by auto-acetylation may extend beyond the HATs to the broader family of protein acetyltransferases.
Conclusions and perspectives
The study of acetyltransferases has come a long way from the first observation of acetylated histones and the subsequent structure determination of the HAT Gcn5. To date, a handful of HATs have been identified, and these may be divided into sub-families that contain a structurally conserved core region, more variable N- and C-terminal segments and divergent catalytic strategies that participate in substrate-specific acetylation and HAT-specific biochemical properties. We also know that HAT activity is often modulated by interaction with other protein domains and/or proteins and by auto-acetylation. Perhaps not surprisingly, it appears that PATs follow the same rules. Future studies on these proteins clearly need to focus on better understanding their mode of substrate-specific activities and their regulation by other protein domains, protein subunits, auto-acetylation and possibly other post-translational modifications.
A remarkable feature of acetyltransferases is that they lack strong sequence homology. This fact, together with recent acetylomic studies showing that thousands of proteins in various cellular compartments and with diverse biological functions are acetylated, suggests that other PATs exist that are yet to be identified. Indeed, the PATs that have been characterized to date have been identified through their limited sequence homology to GNAT proteins. It is therefore likely that other PATs with homology to the MYST or p300/CBP proteins exist, and perhaps new sub-families of PATs are yet to be discovered. One thing that is becoming clear is that PATs are fascinating biochemical machines and we are only just beginning to uncover their mechanism of action. On the biological side, it may be argued that PATs play as important a role as protein kinases in signal transduction pathways. Together with the correlation of altered histone and protein acetylation in various human pathologies [95, 96], we propose that PATs may also be important drug targets. Indeed, the divergent catalytic strategies that are used by PATs should facilitate development of PAT-specific inhibitors for use in therapy.