Variation of free‐energy landscape of the p53 C‐terminal domain induced by acetylation: Enhanced conformational sampling

The C‐terminal domain (CTD) of tumor suppressor protein p53 is an intrinsically disordered region that binds to various partner proteins, where lysine of CTD is acetylated/nonacetylated and histidine neutralized/non‐neutralized. Because of the flexibility of the unbound CTD, a free‐energy landscape (FEL) is a useful quantity for determining its statistical properties. We conducted enhanced conformational sampling of CTD in the unbound state via virtual system coupled multicanonical molecular dynamics, in which the lysine was acetylated or nonacetylated and histidine was charged or neutralized. The fragments were expressed by an all‐atom model and were immersed in an explicit solvent. The acetylation and charge‐neutralization varied FEL greatly, which might be convenient to exert a hub property. The acetylation slightly enhanced alpha‐helix structures that are more compact than sheet/loop conformations. The charge‐neutralization produced hairpins. Additionally, circular dichroism experiments confirmed the computational results. We propose possible binding mechanisms of CTD to partners by investigating FEL. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.


Introduction
An intrinsically disordered region (IDR) of a protein is highly flexible in physiological conditions unless interacting with its partner molecule. [1] Structurally ordered proteins are well known to exert their biological functions through well-defined quaternary, tertiary, and secondary structures. In contrast, IDRs exert their functions actively using conformational flexibility. Signal transduction is a typical function of IDR, [2] where a single IDR interacts with different partner molecules to regulate the signal transduction. This multipartner interaction property is called a hub property. [3] IDRs are related to some diseases such as cancer, diabetes, and neurodegenerative disease. [4] Therefore, IDRs are an important subject related to biology, biophysics, and medical science.
Actually, IDRs can be characterized as (1) dynamic conformations (conformational ensemble), (2) post-translation modification (PTM), and (3) roles of IDR-partner interaction. (1) An IDR does not adopt a specific tertiary structure in the unbound state. Consequently, IDR is not characterized by a single tertiary structure but by a conformational ensemble. [5] The relation between the conformational ensemble and its biological functions has not been understood sufficiently. [6] Presumably, the conformational ensemble architecture affects the binding mechanism. (2) PTM is found frequently in IDR. Importantly, PTM such as methylation, phosphorylation, and acetylation alters the physicochemical properties of IDR and modulates the functionality. [2] Consequently, it is interesting to investigate the variation of the conformational ensemble by PTM. (3) Some IDRs adopt specific tertiary structures when interacting with their partners. This phenomenon is known as coupled folding and binding. [7] Two interaction mechanisms have been proposed for coupled folding and binding processes: induced folding [8,9] and conformational selection. [9,10] In induced folding, IDR binds to the partner with conformations differing from that adopted in the complex (i.e., genuine conformation found in the native complex form). Subsequently, the IDR conformation varies until reaching the most stable complex structure. In conformational selection, the genuine bound conformation is involved in advance in the conformational ensemble of the unbound state. This bound form is used to bind to the partner. Recently, a computational report has described that coupled folding and binding takes place by a combination of the induced folding and the conformational selection. [11] Another IDR-partner interaction picture is fuzzy-complex formation, by which IDR binds to its partner adopting multiple conformations. [12] Therefore, in this mechanism, conformational disorder/flexibility emerges in both bound and unbound states. Enthalpic stability for the individual bound form is less important for discussing the complex formation. A fly casting mechanism provides an alternative viewpoint to the IDR-partner interaction scheme. [13] An IDR has a greater interaction radius than that of a folded protein. It captures partners at a large distance like a fishing line.
Tumor suppressor protein p53 consists of four functional domains: the transactivation domain (TAD) [residues , the DNA binding domain (DBD) [100-300], the tetramerization domain (TET) , and the C-terminal negative regulatory domain (CTD) . Both TAD and CTD are IDRs that interact with many molecules. These domains possess a hub property. [14] To date, three TAD-partner complex structures have been solved, where TAD adopts helical structures on surfaces of the partners in all three cases. [15][16][17] Four CTD-partner complex structures have been determined, where CTD adopts various structures: a helical structure to bind to S100B (PDB ID: 1DT7), [18] a sheet to Sir2 (PDB ID: 1MA3), [19] and fuzzy/coiled structures to CBP (PDB ID: 1JSP), [20] and Cyclin A (PDB ID: 1H26). [21] It is particularly interesting that CTD uses a common sequence [380-386] to bind to these four partners. In this sense, this short common binding region has the hub property. Furthermore, the states of histidine 380 (H380) and lysine 382 (K382) in CTD vary depending on the binding partner as follows: when binding to S100B, K382 is acetylated, although H380 can take either a positively charged or neutralized state depending on the pH condition. [18,22] To Cyclin A, K382 is nonacetylated and H380 is neutralized. [21] To Sir2, K382 is acetylated and H380 is positively charged. [19] To CBP, K382 is acetylated and H380 is neutralized. [20] The correspondence between the state of CTD and the partner is presented in Table 1.
Actually, CTD and its partner have been studied using molecular simulations: Allen et al. demonstrated trends of fluctuations around CTD binding sites of partners. [23] Chen et al. assessed which population sellection or the induced folding is plausible to bind to S100B using molecular dynamics (MD) with an implicit solvent model and some simplifications of the protein model. [24] McDowell et al. showed heterogeneity of CTD and the binding mechanism of CTD to S100B. [25] Staneva et al. investigated conformational preferences of the unbound CTD using an implicit-water Monte Carlo simulation. [26] Although these studies provided beneficial knowledge for the conformational ensemble of CTD, they did not clarify the effects of acetylation on the CTD's conformational ensemble. We consider that computation of a free-energy landscape is crucially important to elucidate the effects of acetylation on the conformational ensemble. To that end, a powerful conformational sampling method is necessary.
A multicanonical simulation has been introduced to enhance the sampling of complicated systems. [27][28][29] Nakajima et al. introduced a multicanonical MD simulation (McMD) using Cartesian coordinates for dynamic variables. Adoption of Cartesian coordinates produced a multimolecular system that is tractable without special devices in a computer program. McMD generates various conformations under equilibrium conditions, which provides not only the most thermodynamically stable state but also intermediate states of the system. Importantly, a freeenergy landscape at arbitrary temperature is computable from the resultant conformational ensemble. To increase the sampling efficiency of McMD, trivial trajectory parallelization McMD (TTP-McMD) was developed, [30,31] and applied to systems consisting of an intrinsically disordered segment and its partner protein. [11,32] Recently, a virtual system coupled McMD (V-McMD) was also developed to increase the McMD sampling efficiency. [33] Terakawa et al. performed all-atom TTP-V-McMD of a p53 linker region (40 residues long) to design force-field parameters for a coarse-grained simulation model. The study reproduced an X-ray scattering profile using the force field. [34] For this study, we examined four CTD fragments in an unbound state (i.e., single-chain state) using TTP-V-McMD, where K382 was nonacetylated or acetylated and H380 was positively charged or neutralized. The system was treated with an all-atom model in an explicit solvent. The free-energy landscapes were computed from the sampled conformations. Results show that the free-energy landscape of the single-chain state varies by the K382 acetylation and the H380 neutralization. In fact, the IDRpartner binding is controlled not only by the finally formed complex structure but also by the conformational distribution of the unbound state. We suggest possible binding mechanisms of CTD to their partner molecules, S100B, Sir2, CBP, and Cyclin A with investigation of the single-chain free-energy landscapes.

Theory
In a conventional (canonical) MD simulation of a biomolecular system, the force acting on an atom i is computed as f i 52grad i E R ðrÞ, where E R ðrÞ is a potential energy of the  [c] N P N N CBP [d] N N N P [a] Complex structures were referred from another report. [18] Binding affinity of CTD to S100B was determined from an in vitro experiment. [22] [b] Complex structures were referred from another report. [21] [c] Complex structures were referred from another report. [19] [d] Complex structures were referred from another report. [20] [e] Mark "P" means that CTD can bind to the partner.
[f ] Mark "N" means that CTD cannot bind to the partner.

FULL PAPER
WWW.C-CHEM.ORG system as a function of the coordinates of the constituent atoms r5½r 1 ; r 2 ; . . . ; r Natom , and where N atom is the number of atoms (polypeptide atoms plus solvent atoms) and r i 5½x i ; y i ; z i is the position vector of atom i, that is, the x, y, and z coordinates. An energy distribution obtained from the canonical MD converges to the following canonical distribution as (1) where R gas represents the gas constant, n R ðE R Þ denotes the density of states (DOS) of the system at the potential energy E R , T stands for temperature, and Z c 5 Ð n R ðE R Þexp 2E R =R gas T Â Ã dE R represents a partition function (i.e., a normalization factor) at T.
McMD uses a modified potential energy E mc instead of E R , which is defined formally as The force acting on the atom i is represented as An energy distribution P mc ðE R Þ from the McMD simulation at T is given simply by replacing E R by E mc in eq. (1) as where Z mc 5 Ð nðE R Þexp 2E mc =R gas T Â Ã dE R is a partition function at T. This equation guarantees that the energy distribution from a longtime McMD converges to a flat function. Then, the flatness of P mc (E R ) is a measure to judge the convergence of sampling.
Virtual system coupled McMD (V-McMD) has been developed to increase the sampling efficiency of McMD, where a "virtual system" is introduced with setting its physical quantities arbitrarily. [33] From here, we explain the framework of V-McMD: Assume that a virtual system exists in addition to the real system (i.e., the biomolecular system). We define an entire system as the sum of the real system and the virtual system. Here we assume that the virtual system does not interact explicitly with the real system. In other words, no cross-term exists in the potential energy of the entire system. Then, the total potential energy is given as E tot 5E R 1E v , where E v is the virtual system's potential energy, a function of a virtual system's coordinate v. Accordingly, DOS of the entire system is represented as n tot ðE R ; E v Þ5n R ðE R Þn v ðE v Þ, where n v is DOS of the virtual system. Using the arbitrary property of the virtual system, we simply set this DOS as n v ðE v Þ51. Then, the multicanonical energy E vmc of the entire system is given as From eq. (5), the energy distribution is calculated as where Z vmc 5 Ð n tot ðE R ; E v Þexp½2E vmc =R gas TdE R dE v is a partition function of the entire system at T. As with eqs. (4) and (6) also ensures that the two-dimensional distribution P vmc ðE R ; E v Þ converges to an even function after a long V-McMD simulation.
Generally, it is more difficult to achieve convergence of the two-dimensional distribution than that of one-dimensional one. The virtual system can be set arbitrarily. Then, we discretize coordinate v to reduce the sampling space as v5v i , where i51; . . . ; n vs and n vs is the number of the virtual states allowed. We introduce the term "virtual state" to specify the state of the discretized virtual system. The ith virtual state corresponds to v i . Accordingly, E v is discretized as E vi . Therefore, the time-evolution of the virtual system is done using the Metropolis Monte Carlo scheme. Now, the system is specified by two quantities: E R and v i . We set the action of the virtual system on the biomolecular system so that E R is confined in a zone Z when its virtual state is in v i . The method for the confinement is explained in an earlier report [33] . We set the zones so that Z i overlaps with Z i21 and Z i11 , although Z i21 and Z i11 have no mutual overlap as presented in Figure 1. The actually used zones are given later. Assuming that the system is at the filled-circle position of Figure 1, for which energy E R is involved in Z i in the virtual state v i , then, this potential energy is also involved in Z i11 . Therefore, the state v i can transition to v i11 using the Monte Carlo method. The molecular configuration r is not changed (i.e., E R is not altered) in this transition. The transition from v i to v i21 Figure 1. Transitions among adjacent virtual states. Presuming that the system is at the filled-circle position, where potential energy is E R and the virtual state belongs to v i , then, the virtual state might jump to v i11 without changing atomic positions (i.e., without changing E R ). Its transition probability is given in eq. (7). However, when the system is at the open-circle position, the virtual state might transition to v i21 . might occur because E R is involved in Z i and Z i21 if the system is at the open-circle position of Figure 1. We set the transition probability between virtual states v i and v i61 as shown below.
Equation (7) establishes that the transition occurs unconditionally. However, the probability can be set arbitrarily in general. We operate the virtual-state transition once every N int steps. Then, r moves according to MD scheme from the ith to the ði1N int Þ th steps without changing the virtual state v i . Consequently, v i might transition to v i21 or v i11 at the ði1N int Þ th step without changing r. The actual value for the transition interval N int is given later. Because of eq. (7), the energy distribution for each virtual state converges to a constant.

Materials and Methods
We constructed four fragment systems of p53 CTD, all of which consist of 17 amino-acid residues (residues 372-388 in UniProt [35] ) and amino acid sequences in single-letter codes as (i) Ace-KKGQSTSR-H 1 -KKLMFKTE-NH2, (ii) Ace-KKGQSTSR-H 1 -K-aK-LMFKTE-NH2, (iii) Ace-KKGQSTSRHKKLMFKTE-NH2, and (iv) Ace-KKGQSTSRHK-aK-LMFKTE-NH2, where aK, H 1 , Ace, and NH2, respectively, represent acetylated lysine, positively charged histidine, acetyl cap, and amine cap. As described in this article, we refer, respectively, to these fragments as (i) NonAc(H 1 ), (ii) Ac(H 1 ), (iii) NonAc, and (iv) Ac. These fragments include in common the binding regions (residues 380-386) to the four partners S100B, Sir2, CBP, and Cyclin A. We designate the regions as "common binding regions" in this study, which are shown as underlined in the sequences above. Remember that Table 1 presents correspondence between the fragment and its binding partner molecule. We put each of the fragments in a periodic box (50 3 50 3 50 Å 3 ) filled with water molecules and ions, where the number of ions is set so that the net charge of the whole system is zero and the ionic concentration is set to a physiological salt one. The conformations of the CTD fragments were the helical structure taken from the S100B-CTD complex. As explained later, these helical structures were randomized quickly in a preparative canonical MD simulation at a high temperature. Table 2 presents characteristics of the simulation systems.
Before V-McMD simulations, we performed a constantpressure (NPT) simulation at 300 K and 1 atm to ascertain the periodic box size for each system. The resultant box sizes are presented in Table 2. Then, we conducted a long hightemperature (600 K) constant-volume (NVT) simulation (time step: 2.0 fs) to randomize the helical conformation of each system.
Generally, DOS (n R ðE R Þ) is required to perform multicanonical sampling [see eq. (2)]. However, DOS is unknown a priori. Consequently, to begin with, we approximated DOS by performing conventional MD runs (i.e., canonical MD runs) at different temperatures covering a wide temperature range [280 K-600 K]. Then, as reported in the literature, [33] the resultant canonical energy distributions at the various temperatures are integrated to approximate DOS, which is used to define E vmc for the first V-McMD simulations [see eq. (5)]. DOS was estimated in the range of 280 K to 600 K. Therefore, the subsequently performed V-McMD simulations aim to obtain a flat energy distribution [eq. (6)] in this range. The upper temperature limit (600 K) was set so that the fragment overcame various energy barriers. The Results and Discussion section shows that various conformations were sampled. The lower limit (280 K) was lower than a room temperature (300 K). Therefore, the obtained conformational ensemble involved conformations probable at 300 K.
The first V-McMD simulation was performed with using E vmc ð5E mc Þ obtained above, where 32 runs were executed in parallel starting from the randomized conformations sampled from the high-temperature simulation. Therefore, we used the TTP procedure to perform V-McMD, [31,32] although we do not explicitly use the term "TTP-V-McMD" in this article. After the first V-McMD simulation, we updated E mc according to the method presented in an earlier report, [33] and performed the second V-McMD simulation with using the updated E mc , and so on. The initial conformation for one of the 32 runs in the ith iteration is the last snapshot for the run in the ði21Þ th iteration. We repeated this iteration procedure until the energy distribution converges to a function flat sufficient: P vmc % const. Then, the final iteration is the production run to collect snapshots for analyses. The numbers of iterations were 8 for Non-Ac(H 1 ), 9 for Ac(H 1 ), 8 for NonAc, and 8 for Ac, where the length of the production run was 320 ns for all the systems. Table S1 of Supporting Information presents the simulation lengths, inter-virtual state transition interval N int , and virtual- [a] Periodic box size is given in Å 3 unit. Shown values are those obtained after NPT simulation at 300 K and 1 atm.

FULL PAPER
WWW.C-CHEM.ORG state zones Z i for each of iterations. One might consider that the simulation length of 320 ns for the production run is too short to obtain a statistically significant ensemble. However, the enhanced sampling method has higher efficiency than conventional sampling does. Later, we discuss statistical properties of the resultant ensembles. We used a computer program psygene-G [36] for V-McMD with the SHAKE [37] method to fix the covalent-bond length related to hydrogen atoms, the zero dipole summation method [38][39][40] to calculate long-range electrostatic interactions, the velocity scaling method [41] to control temperature, TIP3P model [42] for water molecule, and an Amber-based hybrid force field for p53 CTD. [43] The Amber-based hybrid force field is defined as E hybrid ðxÞ5ð12xÞE 94 1xE 96 , where E 94 and E 96, respectively, denote param 94 and param 96 AMBER force fields [44,45] , and where x ð0 x 1Þ is a mixture weight for E 94 and E 96 . Kamiya et al. confirmed that a proper range for x is 0:45 x 0:95. [43] Ikebe et al. showed that the larger the value of x, the smaller the helical propensity in the resultant conformational ensemble at 300 K. [46] In our previous simulation of an IDP system, [11] we set x50:75 and obtained results comparable to experimentally obtained results. However, in our preparative simulation of the present systems with setting x50:75, the resultant ensemble exhibited a considerably high helical content. Then, we set x50:80 for the current study. The MD time step was set to 2.0 fs for all the simulations.
We constructed the force-field parameters of acetylated lysine (aK) as follows: The dihedral angle parameters are the same as those of lysine in the Amber force field. The atomic partial charges of aK were from a force field (Ref: http://pc164. materials.uoi.gr/dpapageo/amberparams.php).

Conformational ensemble and free-energy landscape
The V-McMD simulation of each system produces a conformational ensemble that consists of snapshots of various energies. The statistical weight assigned to a snapshot of energy E R at temperature T is equivalent to the canonical energy distribution P c ðE R ; TÞ [eq. (1)]. We denote this ensemble as Q SYS , where the notation "SYS" specifies the computed system as SYS 5 NonAc(H 1 ), Ac(H 1 ), NonAc, or Ac. Accordingly, the statistical weight of each system is denoted as P SYS c ðE R ; TÞ. The summation of the four ensembles is denoted as Q sum : In this study, we set T5300 K to prepare Q SYS , although we do not mention explicitly that the statistical weight is set at 300 K.
To analyze the conformational ensemble, we generate a two-dimensional (2D) free-energy landscape using principal component analysis (PCA) as follows: First, we compute inter-C a atomic distances for each snapshot in Q sum , and define a vector as q5½q 1 ; q 2 ; . . . ; q Npair , where q i is a distance for a Ca atomic pair and N pair is the number of the pairs. Remember that the number of residues is 17 for all four systems. Consequently, N pair is 136 (5173ð1721Þ=2) for all systems. Then, we calculated a variance-covariance matrix A with elements ði; jÞ expressed as A ij 5hq i q j i2hq i ihq j i, where brackets are the ensemble average over conformations in Q sum . Diagonalizing this matrix, we obtained N pair eigenvectors (v 1 ; v 2 ; . . . ; v Npair ) and eigenvalues (k 1 ; k 2 ; . . . ; k Npair ), where v i and k i are paired satisfying an equation Av i 5k i v i . The eigenvectors satisfy an orthogonal and normalized relation: v i Á v j 5d ij . We presume that the eigenvalues are arranged in descending order as k 1 > k 2 > . . ..
We use v 1 and v 2 to construct the 2D space (2D PCA space) by setting the coordinate axes to v 1 and v 2 , and to generate a conformational distribution by projecting conformations in Q SYS to the 2D PCA space. The coordinate axis v i is designated as a principal component (PC) axis i. We denote the kth conformation in Q SYS as q ðkÞ (5½q Finally, the potential of mean force (PMF) is computed as F SYS ðx PC1 ; x PC2 ; TÞ 2R gas Tln½P SYS ðx PC1 ; x PC2 ; TÞ with spatial patterns that are called the freeenergy landscape (FEL) in this study. From comparison of the spatial patterns of FEL among the four systems, we can discuss the difference of the architecture of FEL among the systems.
The ratio of contribution from the PC component i to the whole standard deviation is expressed as In PCA, the larger the eigenvalue assigned to the PC components i becomes, the greater the contribution: SD 1 > SD 2 > . . .. Therefore, the 2D PCA space constructed by v 1 and v 2 is suitable to overview the conformational distribution. The contribution ratio by the PC components 1 and 2 is given simply as rc 1 1rc 2 .

Synthesis of the nonacetyl and acetyl CTD fragments
CTD fragment with nonacetyl lysine (K382) was synthesized by the 9-fluorenylmethoxycarbonyl (Fmoc) method at a 0.05 mmol scale using a peptide synthesizer (Liberty Blue; CEM Corp., NC). After the completion of the peptide chain assembly, the obtained resin was treated with TFA containing 2.5% triisopropylsilane and 2.5% distilled water for 2 h. The crude peptide was concentrated by a nitrogen stream, precipitated by ether and purified by the reversed-phase HPLC to obtain nonacetyl peptide. ESI mass, found: 1038.7, calcd. for [M 1 2H] 21 : 1038.8. The synthesis of CTD fragment with acetyl-lysine (aK382) was also performed by the synthesizer, except that Lys [11] was introduced using Fmoc-Lys(Aloc)-OH (Aloc: allyloxycarbonyl), activated by O-benzotriazol-1-yl-N,N,N 0 ,N 0 -tetramethyluronium hexafluorophosphate (HBTU)/N,N-diisopropylethylamine (DIEA). After the completion of the chain elongation, the solution of tetrakis(triphenylphosphine)palladium(0), phenylsilane, acetic anhydride in dichloromethane was added to the resin for 45 min to acetylate the side chain amino group of Lys selectively [11] . The deprotection and purification were performed in the same manner to obtain Ac peptide. ESI mass, found: 1059.7, calcd. for [M 1 2H] 21 : 1059.8.

Far-UV circular dichroism measurements
The nonacetyl and acetyl CTD fragments were dissolved in Milli-Q water at 200 lM. The peptide solutions were diluted to 40 lM with the desired concentration of trifluoroethanol (TFE) for circular dichroism (CD) measurements. To detect the helix propensity, we varied TFE concentrations of 0-50%. It is noteworthy that the TFE concentrations in this article are volume per volume percentages. The peptide solutions also contained 25 mM sodium phosphate (pH 7.0 without TFE) and 0.1 M sodium chloride. Because pH is 7.0, the histidine state is neutral. Consequently, the fragments treated in this CD measurement are also designated as NonAc and Ac.
We performed far-UV CD measurements of the peptide solutions. Far-UV CD spectra were recorded on a spectropolarimeter (JASCO J-820; Jasco Corp., Japan) at 258C using a quartz cuvette with 1-mm path length. The spectra were expressed as the mean residue ellipticity and [h] (deg cm 2 dmol 21 ). All spectra were estimated iteratively 16 times. Furthermore, this procedure was repeated three times to compute the average and standard deviation of the spectrum.

Results and Discussion
The V-McMD production run generated conformational ensemble Q SYS for the four CTD systems (SYS 5 NonAc(H 1 ), Ac(H 1 ), NonAc, and Ac) in the unbound state. Figure 2 demonstrates the flat distributions from the production runs. The flatness ensures that eq. (6) was satisfied (DOS was estimated accurately). Therefore, the sampling has been done with statistical significance. Below, we analyze effects of the acetylation of K382 and the charge neutralization of H380 on the free-energy landscape and secondary structure contents. Furthermore, we discuss possible binding mechanisms of the four CTD fragments to their partner molecules. Last, we again check the statistical significance of the resultant ensemble by demonstrating the convergence of FEL.
Free energy landscape of the full-length fragments Figure 3 shows FELs at 300 K constructed in the 2D PCA space. The contribution ratios rc 1 and rc 2 from Q sum were, respectively, 41.4% and 18.8%. Then, the contributions from the PC 1 and 2 axes (rc 1 1rc 2 ) were 60.2%. In Figure 3, we refer to the clusters as G SYS k , where superscript SYS specifies the computed system and the subscript k is a label assigned to the clusters. Figure 4 demonstrates representative tertiary structures in each cluster. In all panels, G SYS 1 is assigned to the cluster of the global minimum PMF, which corresponds to a nearly complete helix (see structures in G SYS 1 in Fig. 4) located at the same position in the 2D PCA space. Figure 3 manifests that the clusters can transition mutually at 300 K: the free-energy barriers among the clusters are surmountable at 300 K, except for the cluster G . In other words, the CTD fragments are disordered.
All FELs involved not only the complete-helix cluster (G SYS 1 ) but also partially helical ones. The tertiary structures are shown in G  . . . ; n vs ) for the virtual states are integrated into the shown distribution P mc ðE R Þ using the method presented in an earlier report. [33] FULL PAPER WWW.C-CHEM.ORG G Ac 7 in Figure 4d. The b-hairpins are also found as G  Figure 4c, and as G Ac 3 in Figure 4d. Although all the CTD fragments exhibited conformational diversity, the FEL shape is considerably different as shown in We find that extended conformations are distributed in the solid-line circled region. It is likely that the acetylation facilitates hydrophobic-core formation by making the CTD fragment compact, which results in disappearance of the extended conformations. To verify this expectation, we calculated a radius of gyration of the fragments at 300 K only using hydrophobic atoms in the CTD fragment: Cb, Cc, and Cd atoms of K372, K373, K381, K382, and K386; Cb atom of H380; Cb atom of L383; Cb atom of M384; and Cb atom of P385. The side-chain tip of the acetylated K382 in the Ac(H 1 ) and Ac systems was excluded from the computation of radius of gyration for the strict comparison because the side-chain tip does not exist in NonAc(H 1 ) and NonAc. Table 3 presents the radius of gyration values (Rg), which demonstrates that the acetylation induces a compact hydrophobic core. We discuss this point further by viewing the tertiary structures of the CTD fragments below.
We computed the secondary-structure propensity of each residue in Q SYS using the DSSP program. [47] Figure 5 depicts the secondary-structure contents along the sequence at 300 K. Comparison of Figures 5a and 5b reveals that K382 acetylation induces the helix propensity of the CTD fragment. This tendency is also apparent from comparison of Figures 5c and 5d. Furthermore, Table 4 presents the helix increment by the acetylation quantitatively. Figure 6a presents a tertiary structure taken from the cluster G AcðH1Þ 1 , where the aK382 and L383 side-chains form a hydrophobic contact in the helix. Similarly, a conformation taken from G Ac 1 shows that a hydrophobic contact is formed between aK382 and S378 (Fig. 6b). Figure 6c shows a hydrophobic contact between the side-chain stems of aK382 and K381 in the partially helical conformation taken from G AcðH1Þ 2 . Unless K382 is acetylated, a repulsive force acts between these two lysine residues. Furthermore, in this conformation, the oxygen atom in the aK382 side-chain and the nitrogen atom in the K381 side-chain interact electro-statistically. Figure 6d presents a conformation taken from G Ac 5 , where a hydrophobic core is formed by side-chain tips of aK382 and M384 and sidechain stem of R379. Consequently, the acetylation induces the hydrophobic core formation in helices. Then, the radius of gyration becomes small. If Lys382 is nonacetyl form, then repulsion interactions take place between Lys382 and the other positively charged residues in the helix because each sequence of the fragments includes six or seven positively charged residues. Consequently, acetyl-lysine stabilizes helical structures more than nonacetyl lysine does.
As described above, comparison between Figures 3a and 3c as well as between Figures 3b and 3d clarified that the charge neutralization of H380 increases the probability in the brokenline circled region, where hairpin structures are distributed. Figure 5 and Table 4 also show that the charge neutralization enhances the hairpin formation. Figure 7a  , where the side-chain of H380 and main-chain of K382 form a hydrogen bond. If H380 is positively charged, then this distorted hairpin becomes unstable because a repulsive interaction between the positively charged H380 and K382 might break the hydrogen bond. Furthermore, a repulsive interaction between the positively charged H380 and R379 might also destabilize the b-hairpin structure. Consequently, the charge neutralization of H380 is necessary for stabilizing the hairpins in Figures 7a and 7b. Figure 7c displays a hairpin from G Ac 3 , where H380 and T377 form a hydrogen bond. These tertiary structures exemplify that the charge neutralized H380 serves the hydrogen bonds to stabilize turns in the hairpins. Data shown in Figure 3 suggest visually that the Ac(H 1 ) system might have the narrowest structural varieties among the four systems. To elucidate this feature quantitatively, we computed the standard deviation r SYS of the conformational distribution for Ca atoms for each system as where the brackets are the ensemble average over conformations in each ensemble Q SYS weighted at 300 K. The resultant values are: r NonAcðH1Þ 537:6 Å , r AcðH1Þ 530:6 Å , r NonAc 536:5 Å , and r Ac 536:9 Å . These values of structural fluctuations are consistent with the radii of gyration in Table 3. Consequently, the Ac(H 1 ) system has the smallest standard deviation. This reduction of the broadening results from the acetylation of K382. One might expect the Ac system to have a narrow

FULL PAPER
WWW.C-CHEM.ORG distribution because K382 is also acetylated in this system. However, as shown above, the neutralization of H380 induces hairpins, which prevents reduction of the distribution. A temperature replica exchange MD simulation [24,26] of a 14residue p53 CTD fragment and a conventional MD simulation [24,26] of a 15-residue p53 CTD fragment were performed to obtain a conformational ensemble in the unbound state. These two fragments are fully included in our 17-residue fragment, and H380 and K382 were positively charged, respectively, and K382 was nonacetylated. Consequently, those fragments are parts of the NonAc(H 1 ) fragment, and binds to the S100B molecule with adoption of a helical conformation. It is particularly interesting that, in these studies, the ensembles involved a helical fraction, which is consistent with our result for Q NonAcðH1Þ . In contrast, recent CD measurement of two 32-residue p53 CTD fragments, which involve our NonAc and Ac segments, showed that the conformations of the CTD fragments are randomized. [48] Our computational results demonstrated that both Q NonAc and Q Ac contain a helical fraction and that the helical content of Q Ac is larger than that of Q NonAc . This apparent inconsistency between the computational results and the CDexperimental observation should be analyzed. We note, however, that the fragment length for the CD experiment is considerably longer than ours.
To link the computation and experiment, we conducted CD experiments of the NonAc and Ac fragments of 17 residues long. TFE enhances formation of secondary structures, especially of helix. [49,50] To clarify the inherent helix propensity of the two fragments, the measurement was done at various TFE concentrations. CD spectra at zero TFE concentration have suggested that the overall structural feature of both fragments is characterized by a disorder state (data not shown), which is consistent to the preceding CD measurement. [48] Therefore, the helical contents obtained from the simulations were larger than those from the CD experiments. In fact, although CD experiments are useful to discuss the secondary-structure properties of polypeptide qualitatively, the CD data might involve quantitative ambiguity in assessing the secondarystructure contents. Conversely, the simulation data might involve some errors. Therefore, we compare the simulation data with the CD data qualitatively. Figure 8 shows that the Ac fragment has a higher helix contents than the NonAc fragment at all examined TFE concentrations including the zero TFE concentration. This result was also supported by analysis of the CD spectra using "Bestsel" software [51] (data not shown), which estimates the secondary-structure contents from CD spectra. Therefore, we conclude from both computation and  experimentation that the acetylation of K382 enhances the helix formation slightly. As described earlier, NonAc(H 1 ) binds to S100B with adopting helical conformation and Ac(H 1 ) does not adopt a helical conformation to bind to a partner (Table 1). This experimentally obtained result might be inconsistent to the computational result that Q NonAcðH1Þ contains a smaller helix content than Q AcðH1Þ . We discuss possible binding mechanisms of the CTD fragments to their partner molecules in the next section.
Free energy landscape of the common binding region As described in the Introduction section, the p53 CTD has a hub property. Four CTD-partner complex structures were determined.
As described in the Materials and methods section, the residues 380-386 of CTD are the common binding region to all the four partners S100B, Sir2, CBP, and Cyclin A. We computed the 2D FEL for this common binding region (Fig. 9), where the variancecovariance matrix was computed only for the common binding region. This figure shows that various clusters are distributed in the 2D PCA space for all the ensembles. The contribution ratios are rc 1 573:5 % and rc 2 514:4 %; then rc 1 1rc 2 587:9 %.
We refer to the clusters as S SYS k , where superscript SYS specifies the computed system and subscript k is a label assigned to clusters in Figure 9. Cluster S SYS 1 is assigned to the global minimum of PMF in all panels along with FEL for the fulllength fragments. Figure 10 presents representative tertiary Apparently, the ensemble Q NonAcðH1Þ has the broadest distribution of the four ensembles. Many clusters are found in FEL (Fig.  9a). The tertiary structures taken from the clusters are diverse (Fig. 10a) . Intercluster transitions can take place readily because free-energy barriers among the clusters are low.
Structural diversity decreases when K382 is acetylated and/or H380 is neutralized. In Q AcðH1Þ , no b hairpin cluster exists, although three partial helix clusters (S ) exist (Fig.  9c). In Q Ac , two helix clusters (S Ac 1 and S Ac 2 ) exist, but no hairpin cluster exists (Fig. 9d). In fact, the free-energy barriers among clusters in Figures 9b-9d are higher than those in Figure 9a. Therefore, inter-cluster transitions in Figures 9b-9d occur by passing narrower regions than those in Figure 9a. Figure 11 displays the experimentally determined complex structures and sampled conformations that are located near the bound form in the free-energy landscape (Fig. 9). Apparently, the sampled conformations closely resemble the experimental bound form.
The bound forms of the common binding region in the experimentally determined complex structures are also assigned to Figure 9. Bound forms s S100B and s CBP are mutually close in Figure 9 although s S100B and s CBP , respectively, denote a helix and a twisted conformation (See Fig. S1 of Supporting Information). The closeness of the two conformations results from the similarity of the Ca atomic pair distances. Remember that the PCA space is generated based on the inter-Ca atomic distances. Therefore, s CBP can convert to s S100B by minor rearrangements of the Ca atomic pair distances.  Figure 9. FEL, F SYS , at 300 K for the common binding region (residues 380-386) constructed in 2D PCA space: a) F NonAcðH1Þ , b) F AcðH1Þ , c) F NonAc , and d) F Ac . The lowest PMF site, r min , is set as F SYS ðr min Þ50:0 kcal/mol. Clusters S SYS k are shown in FEL. Black filled circle, blue triangle, black filled square, and magenta colored diamond, respectively, denote positions of the bound forms s S100B , s Sir2 , s Cyclin , and s CBP . Table 1 presents the correspondence between the CTDfragment type and the binding partner. PCA1 and PCA2 axes are computed from the entire ensemble (i.e., Q sum ). Consequently, the PCA axes for all panels are common. [Color figure can be viewed at wileyonlinelibrary.com] Figure 9 proposes possible mechanisms of CTD binding to their partner molecules. We infer that the main binding mechanism of NonAc(H 1 ) to S100B is the population selection because s S100B is located at a fringe of the most stable cluster S NonAcðH1Þ 1 (Fig. 9a). In other words, Q NonAcðH1Þ prepares the bound form in advance. Furthermore, because the free-energy barriers from the other clusters to S NonAcðH1Þ 1 are low as described above, the bound form is recruited quickly when the bound form is exhausted to bind to S100B. The helical content of Q NonAcðH1Þ is smaller than that of Q AcðH1Þ . Figure 9a suggests that Q NonAcðH1Þ contains helical conformations sufficient for use for binding to S100B even if the helical content of Q NonAcðH1Þ is less than that of Q AcðH1Þ .
The population-selection mechanism might take place when the Ac fragment binds to CBP, where the conformations in the most stable cluster S Ac 1 can transition readily to the conformation s CBP (Fig. 9d). However, the cluster S Ac 1 is not connected to the other clusters by low free-energy pathways. Consequently, the recruitment of conformations to S Ac 1 from the other clusters might be slow. In other words, the rate constant for the Ac fragment binding to CBP might be smaller than that for the NonAc(H 1 ) binding to S100B if the population-selection mechanism occurs.
For binding of the Ac(H 1 ) fragment to Sir2, the conformation s Sir2 is located at a high free-energy site in Figure 9b.  Figure 11. Experimentally determined complex structures and sampled conformations. The blue-colored region is the common-binding region of the experimental structure. Red is the sampled one. The sampled conformations are selected from the vicinity of the bound form in the freeenergy landscape (Fig. 9), and are superimposed on the bound form. White-colored regions are outside the common-binding regions. Cyancolored regions are partners. [Color figure can be viewed at wileyonlinelibrary.com] Therefore, we presume that the fragment binds to Sir2 with a different conformation than s Sir2 , by which an encounter complex is formed. Then, the intermolecular interactions bring the fragment to the genuine complex structure. Therefore, we presume that the binding mechanism of this fragment belongs to the induced folding mechanism.
The NonAc fragment can bind to either S100B or Cyclin A ( Table 1). The binding mechanism to S100B might be the population selection for the same reason for the NonAc(H 1 ) fragment binding to S100B. The structure s Cyclin is located at a site with free energy of about 3 kcal/mol in Figure 9c. Therefore, a small fraction of the ensemble Q NonAc is a conformation close to s Cyclin . Consequently, the binding mechanism of this fragment to Cyclin A might belong to the population selection. However, the main fraction of Q NonAc is far from s Cyclin . Therefore, different conformations might be used to bind to Cyclin A. Consequently, the induced folding mechanism is also possible.
It is likely that the diversity of FEL modulated by the state variation of lysine and/or histidine induces the hub property of CTD. One can reasonably infer that different FELs have different interaction mechanisms to other molecules. A single protein segment can have multiple binding partners. This property is called a hub. It is noteworthy that the flexibility of the CTD segment is fundamentally important for the diversity of FEL. If CTD is a structurally well-defined portion of the protein, then FEL has no great diversity. Therefore, CTD might bind only to a single partner.
The binding mechanism proposed here is based only on the FEL of the unbound state, which means that no IDR-partner interactions are considered. To ascertain whether the proposed mechanism is correct or not, we should perform simulations of systems where CTDs and their partner molecules coexist, as in earlier studies. [11,32] However, the current study is useful to investigate the variation of FEL in the presence or absence of the partner.
Finally, we confirmed the convergence of the sampled data. As reported in the convergence of sampling-section in Supporting Information, the convergence is good for all the systems.

Conclusions
To investigate a highly flexible biomolecular system, computational approaches are fundamentally important because experimental detection of large fluctuations at an atomistic resolution is still difficult. Because the high flexibility is an inherent property of IDR, investigation of the conformational ensemble is necessary to elucidate the nature of IDR. Therefore, a powerful conformational sampling method is required. We performed the enhanced conformational sampling method, V-McMD, to obtain the conformational ensembles of four p53 CTD fragments in the unbound state at the atomic resolution in an explicit solvent. Then, we constructed free-energy landscapes from the obtained conformational ensembles.
The shape of the free-energy landscape varied depending on the K382 acetylation and/or the H380 neutralization in CTD. It is particularly interesting that acetylation enhanced the helix propensity. This computational result was confirmed using CD experiments. We also demonstrated that acetylation induces the hydrophobic-core formation. The H380 neutralization has enhanced the hairpin formation of CTD. The helix content obtained from V-McMD tends to be larger than that from the CD experiment. This fact suggests that the force field is imperfect and that there have not been accurate force fields yet. [52] Results from the CD experiment were explained by the McMD simulation with atomistic details. Therefore, we believe that our results are useful to discuss the variation of CTD's conformational ensemble. Furthermore, the current results might assist in the generation of a general model for understanding the switching mechanism conducted by PTMs.
Each of the four CTD fragments has particular binding partner(s). We proposed possible binding mechanisms from the free-energy landscape of the unbound state. To judge whether the proposed mechanisms are correct or not, sampling of systems consisting of CTD and their partners is necessary for the next stage of research. However, as discussed in the Introduction, the binding mechanism of IDR is determined not only by the finally formed complex structure but also by the conformational distribution in the single-chain state. As discussed in Results, the spreading of the free-energy landscape and the free-energy barriers might affect the IDR-partner binding mechanism. Therefore, results of the current study of the unbound state are expected to be useful to investigate the variation of the free-energy landscape in the presence of the partner molecules. The results will provide useful knowledge to ascertain the hub property and coupled folding and binding of CTD.