Protein acetylation emerged early as a major posttranslational modification of histones and was recently found to be involved in a variety of biological events such as enzymatic activation and signal transduction. The physicochemical effect of protein acetylation has traditionally been attributed mainly to its ability to neutralize positively charged protein groups, while the diverse non-covalent interactions arising from acetylation have been largely ignored and have never been investigated systematically. In the current work, we perform a comprehensive examination of the geometrical profile and energetic landscape of such acetylation-related non-covalent interactions in protein context by using a combination of high-level ab initio calculations, crystal structure survey, and hybrid quantum mechanical/molecular mechanical analysis, on the basis of small model complexes and real protein systems. Taken together, the results suggest that the formation of complicated non-covalent networks around the acetylated site of a protein is fundamentally important for stabilizing local structure, improving systemic rigidity, and even conducting more sophisticated biological functions such as switching enzymatic activity.
Histone acetylation was first discovered in 1964 by Allfrey et al. (1), and later epigenetic studies solidified this seminal discovery and further proposed acetylation as a key component of the ‘histone code’ (2). It is known that several core histones such as H3 and H4 can be acetylated at specific lysine residues, which leads to the opening of chromatin structure and the activation of downstream gene expression events (3). In the past decade, protein acetylation has emerged as a key posttranslational modification (PTM) in cellular regulation, which not only occurs at histones and chromatin but is also present in a wide spectrum of nuclear and cytoplasmic components, ranging from transcription factors (4) to receptor proteins (5), functioning as a crucial tool in regulating and/or influencing processes as diverse as gene expression, signal transduction, protein stability, cell senescence, and disease development. The life cycle of a variety of functional proteins, including hepatocyte nuclear factor HNF6 (6), tumor suppressor p53 (7), and histone H2AX (8), was revealed to be controlled by the balance between their acetylation and deacetylation, and this process may be integrated into the ubiquitin–proteasome pathway (9). Protein acetylation is also considered an important target of cancer therapy (10), because it has been identified to play a central role in the formation, proliferation, and metastasis of many kinds of malignant tumors, such as breast (11) and colon cancers (12). Most recently, Zhao and coworkers found that nearly all enzymes in glycolysis, gluconeogenesis, the tricarboxylic acid cycle, the urea cycle, fatty acid metabolism, and glycogen metabolism are acetylated (13) and further demonstrated that central metabolism enzymes in Salmonella are acetylated extensively and differentially in response to different carbon sources, concomitantly with changes in cell growth and metabolic flux (14).
These findings clearly indicate that the ubiquity of protein acetylation in various cellular events likely reflects an evolutionarily conserved mechanism that coordinates different metabolic pathways in response to extracellular stimuli.
Proteins can be acetylated by both enzymatic and non-enzymatic processes. One group of acetyltransferases catalyzes the transfer of an acetyl group from acetyl-CoA to the terminal amine on the side chain of lysine residues. These enzymes are commonly called histone acetyltransferases (HATs), because their best-known substrates have been histones. However, the nomenclature is being revised to lysine acetyltransferases (KATs), reflecting their ability to acetylate lysine ‘K’ on many proteins (15); the cross-regulation between lysine acetylation and other modifications was indicated to be central in modulating chromatin-based transcription and in shaping inheritable epigenetic programs (16), and the cross talk between lysine acetylation and ubiquitination was found to be a critical regulatory mechanism controlling protein stability (17). Two categories of acetylation are observed in the protein world (18): (i) Nα-terminal acetylation is catalyzed by a variety of N-terminal acetyltransferases, which cotransfer acetyl moieties from acetyl-coenzyme A to the α-amino group of protein Nα-terminal residues. (ii) Nε-lysine acetylation specifically modifies the ε-amino group of protein lysines, resulting in a neutral marker on the side chain of target residues. According to acetylome analyses, although Nα-terminal acetylation is a much more common PTM event than Nε-lysine acetylation (for example, it is estimated that ∼85% of eukaryotic proteins are Nα-terminally modified (19)), Nε-lysine acetylation is quite conserved and may play more important roles in the regulation of cellular growth and development (20). In addition, O-acetylserine is also observed in the oxidoreductase family, whose members catalyze electron transfer between enzymatic moieties and substrate (21).
From a molecular viewpoint, the physiological effects of protein acetylation ultimately arise from changes in the physicochemical properties and conformational arrangement of the acetylated proteins, which are essentially non-covalent in nature. A prominent consequence of acetylation is the cancellation of the positive formal charge on the protein ammonium group, which, for example, destabilizes the interaction of histone with the negatively charged DNA backbone and opens chromatin (22), or leads to the dimerization of prolactin receptor (23). In fact, the product of protein acetylation is a typical polar group that can form a complicated non-bonded network with its context, directly or indirectly giving rise to diverse biological effects on the acetylated proteins, such as allosteric regulation and stabilization of the local structure. In addition, the acetyl group has also been observed to participate directly in protein–protein, protein–peptide, and protein–DNA interactions by forming strong hydrogen bonds between the interacting partners (24). Nevertheless, this non-covalent facet of protein acetylation was largely ignored (and has never been systematically investigated) in previous studies. In the present work, on the basis of high-resolution crystal structures deposited in the Protein Data Bank (PDB) (25), the diverse non-covalent interactions associated with protein acetylation were comprehensively identified, classified, and analyzed by using a variety of statistical and computational methods, including database survey, ab initio calculation, and quantum mechanical/molecular mechanical (QM/MM) dissection. Furthermore, the geometrical features and energetic landscape of these non-covalent interactions were characterized in detail with respect to their physicochemical effects on protein architecture and interaction.
Materials and Methods
Acetylated protein data set
As of July 2011, more than 60 000 proteins or their complexes with nucleic acids and small ligands had been deposited in the PDB database (25). Here, we selected only the acetylated proteins or protein complexes with long chains (>50 residues), high resolution (≤2.5 Å), and low homology (<60%). Consequently, 206 acetylated proteins were collected, from which a total of 192 Nα-terminal acetylations and 14 Nε-lysine acetylations were identified. To reliably characterize the non-covalent interactions associated with these acetylated moieties, the hydrogen atoms and protonation states of polar and charged protein groups were assigned with PROPKA 2.0 (26), which has recently been evaluated as a credible protonation tool compared to other available protocols (27). Water molecules and other cofactors such as metal ions and buffering agents were removed manually from the studied proteins, while cocrystallized ligands of biological interest were kept. Furthermore, all structures were inspected visually to exclude serious flaws such as broken main chains and missing side chains.
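The three selection criteria above can be expressed as a simple filter. The following is a minimal sketch only: the record fields, the placeholder identifiers, and the example values are hypothetical and are not taken from the actual data set or from any PDB interface.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    entry_id: str        # placeholder label, not a real PDB code
    chain_length: int    # number of residues in the acetylated chain
    resolution: float    # crystallographic resolution, in angstroms
    max_identity: float  # highest sequence identity (%) to entries already kept

def passes_filters(e: Entry) -> bool:
    """Apply the stated cutoffs: >50 residues, <=2.5 A, <60% homology."""
    return e.chain_length > 50 and e.resolution <= 2.5 and e.max_identity < 60.0

# Hypothetical records used only to exercise the filter.
entries = [
    Entry("xxx1", 120, 1.8, 35.0),
    Entry("xxx2", 45, 1.5, 20.0),    # rejected: chain too short
    Entry("xxx3", 300, 2.8, 10.0),   # rejected: resolution too low
    Entry("xxx4", 210, 2.5, 59.9),
    Entry("xxx5", 180, 2.0, 75.0),   # rejected: too homologous
]

selected = [e.entry_id for e in entries if passes_filters(e)]
print(selected)
```

In practice the homology criterion would require a sequence-clustering step beforehand; here it is assumed to have been precomputed and stored in `max_identity`.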
Ab initio calculations
Small model complexes mimicking the potential non-covalent interactions of the acetamide group of acetylated proteins were used to investigate the geometrical profile and energetic behavior of these interactions at a high electron-correlation level of theory. The acetamide group was modeled by N-methyl acetamide, and the hydrogen bond donor and acceptor groups of the protein were represented by methanol and formaldehyde, respectively. The carbonyl moiety of the latter was also used to serve as a dipole. Structure minimization and energy analysis of the complexes were carried out with second-order Møller-Plesset perturbation theory in conjunction with Dunning’s augmented correlation-consistent basis set, MP2/aug-cc-pVTZ (28,29). No geometrical or symmetry constraints were imposed during minimization. Subsequently, the supermolecular approach was applied to determine the intermolecular potential between complex members, ΔEint = Ecomplex − Emonomer1 − Emonomer2, and the associated basis set superposition error (BSSE) was eliminated by means of the standard counterpoise method of Boys and Bernardi (30). All of these calculations were performed with the Gaussian 03 suite (31). The wavefunctions generated by the electron correlation calculations were further used to perform topological analysis of the electron charge density and its Laplacian in the equilibrium complexes, based on the atoms-in-molecules (AIM) theory of Bader (32).
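The supermolecular and counterpoise expressions described above reduce to simple arithmetic once the individual energies are available. The sketch below illustrates that arithmetic only; the function names and all numerical energies are hypothetical placeholders, not values from the calculations reported here.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per hartree

def interaction_energy(e_complex: float, e_mono1: float, e_mono2: float) -> float:
    """Supermolecular interaction energy in kcal/mol (inputs in hartree)."""
    return (e_complex - e_mono1 - e_mono2) * HARTREE_TO_KCAL

# Hypothetical total energies (hartree), for illustration only.
e_ab = -363.0090             # complex AB
e_a_own = -247.9992          # monomer A in its own basis
e_b_own = -114.9995          # monomer B in its own basis
e_a_ghost = -248.0000        # monomer A in the full AB basis (ghost atoms on B)
e_b_ghost = -115.0000        # monomer B in the full AB basis (ghost atoms on A)

# Uncorrected value uses monomers in their own basis sets.
uncorrected = interaction_energy(e_ab, e_a_own, e_b_own)
# Counterpoise-corrected value uses monomers evaluated in the complex basis.
corrected = interaction_energy(e_ab, e_a_ghost, e_b_ghost)
# The BSSE estimate is the difference between the two.
bsse = corrected - uncorrected

print(f"uncorrected = {uncorrected:.3f} kcal/mol")
print(f"corrected   = {corrected:.3f} kcal/mol")
print(f"BSSE        = {bsse:.3f} kcal/mol")
```

Because each monomer is stabilized artificially by the partner's basis functions, the counterpoise-corrected interaction energy is less negative than the uncorrected one.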
To further investigate the biological significance of acetylation-related non-covalent interactions in real protein context, several acetylated proteins as well as their deacetylated counterparts were analyzed by using a two-layer ONIOM (our own n-layered integrated molecular orbital and molecular mechanics)-based QM/MM approach. The hybrid ONIOM method was developed by Morokuma et al. (33) and is implemented in Gaussian 03 (31). It enables different levels of theory to be applied to different parts of a molecular system and combined to produce a consistent energy expression. The objective is to perform a high-level calculation on just a small part of the system and to include the effects of the remainder at lower levels of theory, with the end result being of similar accuracy to a high-level calculation on the full system (34). Here, the acetylated (or deacetylated) protein site and the residues that directly interact with the site were included in the inner layer and treated with high-level density functional theory at the B3LYP/6-31G(d) level (35), while the remainder of the protein system formed the outer layer and was described by low-level AMBER molecular mechanics (36). The acetylated site and water molecules in the studied systems were parameterized using the general AMBER force field (GAFF) (37) and the TIP3P model (38), respectively.
The structures of the selected protein systems were fully optimized with the two-layer QM/MM methodology described above without any constraints. The electrostatic interactions between the two layers were treated with the mechanical embedding scheme to save computational cost. After the QM/MM optimization, the acetylated (or deacetylated) site and its interacting residues were manually stripped from the protein system and then subjected to a higher-level analysis at the MP2/aug-cc-pVTZ level of theory to accurately determine their electronic and energetic properties.
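For reference, the consistent energy expression of a two-layer ONIOM calculation is usually written as the extrapolation below. The labels are our shorthand rather than notation from the cited works: "model" denotes the inner layer (the acetylated site and its interacting residues), "real" denotes the full protein system, "high" stands for B3LYP/6-31G(d), and "low" for AMBER.

```latex
E^{\mathrm{ONIOM2}}
  = E^{\mathrm{high}}_{\mathrm{model}}
  + E^{\mathrm{low}}_{\mathrm{real}}
  - E^{\mathrm{low}}_{\mathrm{model}}
```

The subtraction of the low-level model energy removes the double counting of the inner layer, which is described once at the high level and once as part of the full system at the low level.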
Results and Discussion
Classification of acetylation-related non-covalent interactions
We first consider the chemical structure of an acetylated moiety. The oxygen atom of the carbonyl group and the hydrogen atom of the secondary amine can clearly serve as the classical acceptor (Figure 1A) and donor (Figure 1B) of canonical hydrogen bonds, respectively. In addition, several potential weak interactions may also involve this moiety, that is, (i) the non-canonical hydrogen bond formed with the methyl group as donor (Figure 1C), because σ-π hyperconjugation between the methyl and amide groups would considerably weaken the C–H bond, making the hydrogen atom a moderate Lewis acid that is able to accept a lone pair from, for example, a carbonyl oxygen; (ii) the orthogonal multipolar interaction between the amide plane and a perpendicularly approaching dipole (Figure 1D); and (iii) the antiparallel dipolar interaction between two reversely arranged carbonyl groups (Figure 1E).
To better understand the structural, electronic, and thermodynamic properties of the potential acetylation-related non-covalent interactions, high-level ab initio calculations were carried out on model complexes mimicking these interactions. The calculation details can be found in the Methods section, the resulting geometrical and energetic parameters are tabulated in Table 1, and the equilibrium structures and corresponding molecular graphs are shown in Figure 2. Complexes (a) and (b) are clearly classical hydrogen bonds, with their bond length D and angle θ falling into the normal ranges of 2.6–3.2 Å and 160–180°, respectively, observed earlier in single crystals (39). The favorable geometrical arrangements of (a) and (b) give rise to substantial bond strengths, as indicated by their interaction energies ΔEint of −5.688 and −4.329 kcal/mol, respectively. The ab initio calculations also confirmed the existence of a hydrogen bond between the two members of complex (c), although this bond is relatively weak and hence contributes only modestly to systemic stabilization (ΔEint = −1.623 kcal/mol). In addition, the proposed orthogonal multipolar interaction and antiparallel dipolar interaction were predicted to exist stably in the equilibrium states of complexes (d) and (e), respectively. Notably, the interaction energies of these two dipole-/multipole-dominated complexes are substantial: their ΔEint values of −2.842 and −3.920 kcal/mol approach those of the strong hydrogen bonds present in complexes (a) and (b), implying that these polar interactions could exert an appreciable effect on the stability and ordering of local structures around the acetylated sites of proteins. Furthermore, the electronic topological properties arising from AIM analysis revealed the nature and strength of these non-covalent interactions involving the acetamide group.
First, the ρb, ∇²ρb, and Hb values of complexes (a) and (b) were calculated to be 0.0216, 0.0635, and −0.0004 au, and 0.0198, 0.0610, and −0.0003 au, respectively, which satisfy the features of conventional hydrogen bonds defined by Parthasarathi et al. (40). Second, the character of the weak hydrogen bond in complex (c) was well captured by its small ρb as well as positive ∇²ρb and Hb (41). Third, the polar interaction profile of complexes (d) and (e) was also reflected fairly well in their electronic topological behavior, that is, a modest ρb accompanied by a positive ∇²ρb but a negative Hb, indicating a moderate strength of these unshared interactions (41).
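The sign patterns discussed above can be summarized with a small helper. This is a minimal sketch using a widely used sign-based reading of the Laplacian and the electronic energy density at the bond critical point; it is an assumption made for illustration and is not necessarily identical to the criteria of refs. (40) and (41). The function name, the labels, and the weak-bond example values are hypothetical.

```python
def classify_bcp(laplacian: float, energy_density: float) -> str:
    """Label a bond critical point from the signs of the Laplacian and Hb (au)."""
    if laplacian > 0 and energy_density > 0:
        return "positive Laplacian, positive Hb (weak, closed-shell)"
    if laplacian > 0 and energy_density < 0:
        return "positive Laplacian, negative Hb (moderate)"
    if laplacian < 0 and energy_density < 0:
        return "negative Laplacian, negative Hb (shared)"
    return "unclassified"

# Values for complexes (a) and (b) are those quoted in the text.
complex_a = classify_bcp(0.0635, -0.0004)
complex_b = classify_bcp(0.0610, -0.0003)
# Hypothetical values with the sign pattern described for complex (c).
weak_example = classify_bcp(0.0300, 0.0010)

print("(a):", complex_a)
print("(b):", complex_b)
print("weak example:", weak_example)
```

Under this sign-only scheme, complexes (a), (b), (d), and (e) fall into the same class, so the magnitude of ρb remains the quantity that separates them.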
Table 1. Geometrical, electronic, and energetic parameters of equilibrium model complexes involving acetylation-related non-covalent interactions
b Geometrical parameters d (Å), D (Å), and θ (°) are defined in Figure 1.
c ΔEint (kcal/mol), calculated interaction energy.
d ρb (au), electron density at bond critical point (BCP).
e ∇²ρb (au), Laplacian of the electron density at BCP.
f εb (au), ellipticity at BCP.
g Hb (au), electronic energy density at BCP.
Crystal structure survey of acetylation-related non-covalent interactions
It is known that the acetamide moiety of acetylated proteins has two isomers, that is, the (Z)- and (E)-isomers, and both were observed in protein crystal structures. According to high-level ab initio calculations on the model molecule N-methyl acetamide (as shown in Figure 3), the (Z)-isomer is more stable than the (E)-isomer (owing to the moderate steric hindrance between the two spatially vicinal, bulky methyl groups of (E)-N-methyl acetamide), and the corresponding stabilization energy ΔEs is 0.97 kcal/mol. The conversion between the two isomers must pass through a high-energy transition state (confirmed by vibrational frequency analysis) in which the H–N–CH3 plane is perpendicular to the O–C–CH3 plane, with a large activation energy ΔEa of 28.35 kcal/mol relative to the (E)-isomer or 29.32 kcal/mol relative to the (Z)-isomer. From the large energy barrier ΔEa but quite modest energy difference ΔEs between the two isomers, it could be speculated that the generation of a specific isomer from the unacetylated amino group can be fundamentally influenced by the associated non-covalent interactions with its protein context, but once the isomer is formed, its state is ‘frozen’ and it cannot convert spontaneously to the other isomer at room temperature, because the interaction energy ΔEint of non-covalent interactions is normally larger than ΔEs but significantly less than ΔEa. The theoretical results were supported by the crystal structure survey of acetylated proteins; for example, the observed ratio of (Z)- to (E)-isomers from crystallographic data is 1.78:1 (i.e., 132 (Z)- and 74 (E)-isomers), which is basically consistent with the expected value of 1.64:1 calculated in terms of Boltzmann distribution at 298.15 K, with regard to the energy difference ΔEs = 0.97 kcal/mol between the (Z)- and (E)-isomers.
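The quantities entering this argument can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions of ours: a simple two-state model for the isomer populations, and an Eyring-type rate expression in which ΔEa is treated as an activation free energy. The quoted value of 1.64 coincides numerically with the dimensionless exponent ΔEs/RT, whereas a strict two-state Boltzmann population ratio would be exp(ΔEs/RT), which is larger; the sketch therefore prints both alongside the observed ratio.

```python
import math

R_KCAL = 1.987204e-3        # gas constant, kcal/(mol K)
K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J s
T = 298.15                  # temperature, K

delta_es = 0.97    # (Z)/(E) energy difference from the text, kcal/mol
delta_ea = 28.35   # barrier relative to the (E)-isomer from the text, kcal/mol

# Observed (Z):(E) ratio from the crystal survey counts given in the text.
observed_ratio = 132 / 74

# Dimensionless exponent and the corresponding two-state Boltzmann factor.
exponent = delta_es / (R_KCAL * T)
boltzmann_factor = math.exp(exponent)

# Eyring-type estimate of the (E)->(Z) interconversion rate (assumption:
# delta_ea is used as if it were an activation free energy).
rate_per_s = (K_B * T / H_PLANCK) * math.exp(-delta_ea / (R_KCAL * T))
half_life_years = math.log(2) / rate_per_s / (3600 * 24 * 365.25)

print(f"observed Z:E ratio     = {observed_ratio:.2f}")
print(f"delta_Es / RT          = {exponent:.2f}")
print(f"exp(delta_Es / RT)     = {boltzmann_factor:.2f}")
print(f"estimated rate (1/s)   = {rate_per_s:.2e}")
print(f"estimated half-life    = {half_life_years:.1f} years")
```

Under these assumptions the estimated interconversion rate is far below one event per day per molecule, in line with the description of the isomer state as frozen at room temperature.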
Canonical hydrogen bond with carbonyl group as acceptor
A total of 61 hydrogen bonds with the carbonyl group of acetamide as acceptor were found in our acetylated protein data set. We systematically examined their three independent geometrical parameters, d, D, and θ, to generate distribution profiles for analyzing the geometrical characteristics of these hydrogen bonds. Most bond lengths d and D as well as bond angles θ fall into the ranges of 1.8–2.4 Å, 2.7–3.2 Å, and 130–170°, respectively, which are basically consistent with those determined by high-level ab initio calculations (see Table 1), indicating that hydrogen bonds of this category show a good structural arrangement and may commonly accompany the acetylated moieties of proteins. Indeed, visual inspection of crystal structures clearly illustrated that the Nα-terminal acetamide, which is considerably abundant in acetylated proteins, can be regarded as a pseudo residue added to the N-terminus of the peptide chain. As a result, the carbonyl group C=O of the ith ‘acetamide residue’ can readily form a relatively strong hydrogen bond with the backbone N–H moiety of the (i + 4)th residue in an α-helix context.
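The three geometrical parameters can be computed directly from atomic coordinates. The sketch below assumes, since Figure 1 is not reproduced here, that d is the H···O distance, D is the donor heavy atom···O distance, and θ is the donor–H···O angle; the coordinates are hypothetical and serve only to exercise the functions.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c (vertex at b) in degrees."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_val = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_val))))

def hbond_geometry(donor, hydrogen, acceptor):
    """Return (d, D, theta) under the assumed parameter definitions."""
    d = math.dist(hydrogen, acceptor)      # H...O distance
    big_d = math.dist(donor, acceptor)     # N...O distance
    theta = angle_deg(donor, hydrogen, acceptor)
    return d, big_d, theta

def in_survey_ranges(d, big_d, theta):
    """Check against the ranges reported for the carbonyl-acceptor bonds."""
    return 1.8 <= d <= 2.4 and 2.7 <= big_d <= 3.2 and 130.0 <= theta <= 170.0

# Hypothetical coordinates (angstroms): backbone N, its H, and the acetyl O.
n_atom = (0.0, 0.0, 0.0)
h_atom = (1.0, 0.0, 0.0)
o_atom = (2.8, 0.6, 0.0)

d, big_d, theta = hbond_geometry(n_atom, h_atom, o_atom)
print(f"d = {d:.2f} A, D = {big_d:.2f} A, theta = {theta:.1f} deg")
print("within survey ranges:", in_survey_ranges(d, big_d, theta))
```

`math.dist` and multi-argument `math.hypot` require Python 3.8 or later.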
Canonical hydrogen bond with secondary amine as donor
In total, 43 hydrogen bonds with the secondary amine of acetamide as donor were curated from our data set. The geometrical profiles of this kind of hydrogen bond are markedly different from those with the carbonyl group as acceptor discussed above; their bond lengths d and D are, respectively, noticeably elongated and shortened relative to the corresponding standard values shown in Table 1. In particular, the distribution of bond angle θ deviates significantly from the linear state of 180°, with its peak near 100°. The abnormal geometrical characteristics of this hydrogen bond are mainly attributed to its frequent formation with a sequentially neighboring residue, such as with the hydroxyl group O–H of the (i + 1)th serine, which significantly distorts the arrangement of the hydrogen bond.
Non-canonical hydrogen bond with methyl group as donor
The three hydrogen atoms bonded to the methyl carbon of acetamide behave as weak Lewis acids, because the C–H bonds are weakened considerably by the σ-π hyperconjugation effect between the methyl and amide groups. Thus, the methyl group could serve as the proton donor to form (weak) hydrogen bonds with lone pairs donated from its surroundings. Both ab initio calculation and crystal structure survey supported this point; the ideal interaction energy ΔEint of this non-canonical hydrogen bond was predicted to be −1.623 kcal/mol, and 44 such contacts were observed in the data set. The peak positions of the d, D, and θ distributions are located at 2.75 Å, 4.05 Å, and 95°, respectively, which deviate significantly from the ideal geometry predicted by ab initio calculations (Table 1), suggesting that this weak hydrogen bond is very vulnerable to steric hindrance and geometrical constraint in real protein context.
Orthogonal multipolar interaction
Orthogonal multipolar interaction is usually ignored in biomolecular systems but actually plays an important role in conferring both stability and specificity on protein architecture. This is because (i) the structural elements forming this non-covalent interaction, such as the carbonyl dipole and the amide plane, are readily available in proteins and (ii) a moderately strong interaction energy ΔEint of −2.842 kcal/mol was characterized theoretically by high-level ab initio calculations. Indeed, this kind of intermolecular interaction has been previously observed at the binding interface of a series of protein/ligand complexes, such as phosphodiesterase and thrombin in complex with their selective inhibitors (42,43), where it contributes substantially to the stabilization of binding. In our data set, 49 acetylated moieties were found to be involved in orthogonal multipolar interactions. Their geometrical distribution profiles agree well with those of the ideal condition (Table 1), that is, bond lengths d and D prefer to be in the ranges of 3.0–3.3 Å and <4.5 Å, respectively, so that the contact distance between the C or N atom of the multipole and the O atom of the carbonyl dipole is less than the sum of their van der Waals radii. The distribution of the dip angle θ of the carbonyl dipole relative to the amide plane exhibits a bimodal profile, with two peaks located at 55° and 75°, which reflects the fact that the orthogonal multipolar interaction could arise from both the C and N atoms of the amide plane; the interaction of the carbonyl dipole with the central C atom would suffer less steric hindrance than that with the fringe N atom, giving rise to a more perpendicular configuration for the former than for the latter.
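The two geometrical tests used in this analysis, the dip angle of a dipole relative to the amide plane and the comparison of a contact distance with the van der Waals radii sum, can be sketched as follows. The definition of the dip angle as the angle between the dipole vector and the plane is our assumption (Figure 1 is not reproduced here), the radii are Bondi values, and all coordinates are hypothetical.

```python
import math

# Bondi van der Waals radii in angstroms.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52}

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def dip_angle_deg(dipole_vec, p_a, p_b, p_c):
    """Angle between a vector and the plane through three points, in degrees."""
    normal = cross(sub(p_b, p_a), sub(p_c, p_a))
    sin_val = abs(dot(dipole_vec, normal)) / (norm(dipole_vec) * norm(normal))
    return math.degrees(math.asin(min(1.0, sin_val)))

def within_vdw_contact(elem1, elem2, distance):
    """True if the distance is below the sum of the two van der Waals radii."""
    return distance < VDW_RADII[elem1] + VDW_RADII[elem2]

# Hypothetical amide plane atoms (angstroms): carbonyl C, carbonyl O, amide N.
amide_c = (0.0, 0.0, 0.0)
amide_o = (1.23, 0.0, 0.0)
amide_n = (-0.6, 1.1, 0.0)

perpendicular = dip_angle_deg((0.0, 0.0, -1.2), amide_c, amide_o, amide_n)
tilted = dip_angle_deg((1.0, 0.0, 1.0), amide_c, amide_o, amide_n)

print(f"perpendicular approach: {perpendicular:.1f} deg")
print(f"tilted approach:        {tilted:.1f} deg")
print("C...O contact at 3.1 A:", within_vdw_contact("C", "O", 3.1))
print("N...O contact at 3.1 A:", within_vdw_contact("N", "O", 3.1))
```

With these radii, the C···O and N···O sums are 3.22 Å and 3.07 Å, so a contact at 3.1 Å counts as a close contact for carbon but not for nitrogen.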
Antiparallel dipolar interaction
Dipolar–dipolar interactions were first observed in single crystals some thirty years ago (44) and were later found in a variety of organic adducts such as nitriles, ketones, and aldehydes (45). There are two kinds of dipolar–dipolar interaction, that is, parallel and antiparallel ones. The carbonyl dipole of the acetamide studied here would, however, suffer a severe steric collision if it participated in parallel dipolar interactions. Hence, only antiparallel dipolar interactions were observed in the crystals of acetylated proteins. As might be anticipated, antiparallel dipolar interactions formed by two reversely arranged carbonyl dipoles, provided separately by the acetamide and a backbone amide of the protein, exhibit an asymmetrical configuration, that is, the d values are generally smaller than the D values. This phenomenon was also reproduced in the theoretically determined equilibrium structure of the small model complex (see Table 1) and could be attributed to the difference in volume of the Cα-substituents of the two interacting partners, that is, a tiny hydrogen atom bonded to the acetamide Cα atom versus a bulky side chain attached to the residue Cα atom. In addition, both the noticeable energy ΔEint of −3.920 kcal/mol revealed by theoretical calculations and the significant number of occurrences identified in real crystals imply that this kind of non-covalent interaction would play a tangible role in stabilizing the local structure and orientation of acetylated moieties within their protein context.
QM/MM analysis of real acetylated proteins
To elucidate directly how acetylation influences the local structure and arrangement of a protein, we further employed a two-layer ONIOM-based QM/MM approach to investigate two typical acetylated proteins as well as their deacetylated counterparts. For each protein system, the acetylated moiety and the residues that covalently bond to or non-covalently interact with the acetylated moiety were included in the high-level QM layer, while the remainder of the protein was placed in the low-level MM layer. Subsequently, the whole system was fully optimized with the two-layer QM/MM protocol without any constraints. The details of parameter assignment and QM/MM calculations can be found in the Methods section, and the superpositions of optimized crystal structures for these two paradigms are shown in Figure 4.
The first case considered here is an artificially designed four-helix bundle protein that shows significant redox activity on a series of substrates such as azide, acetate, and aromatic molecules (46). This protein is a non-covalently associated homodimer of two helix–loop–helix hairpin motifs, and the N-terminus of each monomer is Nα-terminally acetylated to form a cap. As shown in Figure 4A, the C=O of the 0th Ace moiety is hydrogen-bonded to the N–H of the 4th Arg residue, defining a typical i + 4 hydrogen bond that stabilizes the α-helix of the hairpin motif. Removing the acetylated cap would result in an observable change in the N-terminal structure of the helix. In particular, the distance between the Asp1 and Arg4 residues was substantially enlarged because of the absence of the i + 4 hydrogen bond in the deacetylated version, leading to the breakdown of the ordered α-helix at the protein N-terminus. In fact, this phenomenon is very common in almost all Nα-terminally acetylated proteins starting with an α-helix module. Apparently, the i + 4 hydrogen bond associated with the Ace0 moiety is essential for stabilizing the initial helix structure of these acetylated proteins.
The second case is an Nα-terminally acetylated carbonic anhydrase II, which catalyzes the reversible hydration of carbon dioxide. A previous study demonstrated that the acetylated enzyme shows higher catalytic activity than its deacetylated counterpart (47), but the molecular mechanism underlying the effect of acetylation/deacetylation on enzymatic activity remains unknown. Figure 4B reveals that the Nα-terminal acetylation gives rise to an orthogonal multipolar interaction between Ace1 and His10, and the bond length and strength of this interaction were predicted to be 2.923 Å and −2.678 kcal/mol, respectively, at the MP2/aug-cc-pVTZ level of theory. Thus, the flexible loop region defined by residues 1–10 of the protein is fixed through this effective non-covalent interaction. By contrast, deacetylation would break this interaction and undermine the rigidity that the loop has in the acetylated state, which would fundamentally influence substrate entry into and product release from the active site of the enzyme, because this loop region forms a large fraction of a ‘lid’ that can functionally open or close the gateway through which substrate and product molecules enter and leave the enzyme.
Intermolecular forces, also termed non-covalent interactions, have been the subject of intense interest from both experimental and theoretical points of view owing to their fundamental role in determining the three-dimensional structure of a large number of important biomolecules such as proteins, nucleic acids, and their complexes with small ligands (48). However, only limited work to date has addressed the weak chemical forces associated with protein posttranslational modifications, the critical events involved in space–time-specific regulation of cellular processes. In the present study, we systematically examined the geometrical profile and energetic landscape of five kinds of potential non-covalent interactions, that is, three hydrogen bonds and two polar interactions, associated with protein acetylation, which was regarded early on as a key step in initiating gene expression and was recently found to be involved in a variety of biomolecular processes such as enzymatic activation and signal transduction. High-level ab initio calculations on small model complexes provided a first view of the physicochemical nature of such acetylation-related interactions. Crystal structure surveys further identified a considerable number of attractive interactions of acetylated moieties with their surroundings in protein context. QM/MM analyses of real protein systems then gave a quantitative account of the biological functionality of these interactions. Taken together, these results suggest that, beyond the traditional notion of eliminating positive formal charges from a protein, acetylation can also define a complicated non-covalent network around the modified site to stabilize local structure, to improve systemic rigidity, and even to conduct more sophisticated functions such as switching enzymatic activity.
We would like to acknowledge the National Science Foundation of China (Grant Nos: 20890020, 21175029, and 20890022) for financial support.