Water-Mediated Recognition of Simple Alkyl Chains by Heart-Type Fatty-Acid-Binding Protein

Long-chain fatty acids (FAs) with low water solubility require fatty-acid-binding proteins (FABPs) to transport them from the cytoplasm to the mitochondria for energy production. However, the precise mechanism by which these proteins recognize simple alkyl chains of various lengths with similarly high affinity remains unknown. To address this question, we employed a newly developed calorimetric method for comprehensively evaluating the affinity of FAs, sub-Angstrom X-ray crystallography to accurately determine their 3D structure, and energy calculations of the coexisting water molecules using the computer program WaterMap. Our results clearly showed that the heart-type FABP (FABP3) preferentially incorporates a U-shaped FA of C10–C18 using a lipid-compatible water cluster, and excludes longer FAs using a chain-length-limiting water cluster. These mechanisms could help us gain a general understanding of how proteins recognize diverse lipids with different chain lengths.

Liposome preparation. Each FA analogue was incorporated into large unilamellar vesicles (LUVs) at an FA/phospholipid molar ratio of 1/10 or less, with DMPC as the matrix lipid because of its physical similarity to mammalian cell membranes. FA-containing LUVs were prepared by the thin-film method [3] followed by extrusion through polycarbonate membranes with a pore size of 100 nm. [4] Details of an example procedure are given below.
Fatty acids (FAs) were purchased from Sigma-Aldrich and Wako Pure Chemicals.
1,2-Dimyristoyl-sn-glycero-3-phosphocholine (DMPC) was purchased from Avanti Polar Lipids as a dry powder. These chemicals were used without further purification. DMPC (3.7 mg) was dissolved in 2 mL of chloroform-methanol (1:1, v/v) together with an FA in its K+ salt form at a 1/11 (or 1/10) molar ratio in a glass vial. The organic solvent was evaporated with a V-10 solvent evaporation system (Biotage, Uppsala, Sweden) and then removed under vacuum for more than 6 hours. The resultant lipid film was hydrated and suspended, using a bath sonicator (ASU Cleaner, AS ONE, Osaka, Japan), in 500 µL of ITC buffer: 20 mM Tris-HCl, pH 8.0, and 100 mM NaCl (or 20 mM potassium phosphate, pH 7.0, and 100 mM NaCl). For unsaturated FAs, the vial was flushed with nitrogen gas after removal of the organic solvent. The lipid suspension was then subjected to 3 cycles of freeze-thaw treatment at -30 °C/40 °C to form multilamellar vesicles (MLVs). The resulting MLVs were extruded through a polycarbonate membrane with a 100 nm pore diameter in a LiposoFast extruder (Avestin Inc., Ottawa, Canada) to form LUVs. The FA concentration in the extruded LUV solution was determined by the 2-nitrophenylhydrazine (2-NPH) method [5] using a short- and long-chain fatty acid analysis kit (YMC, Kyoto, Japan).
Table S1b) and some of SFAs, we used the phosphate-only buffer (20 mM K-phosphate, pH 7.0, and 100 mM NaCl) at 310 K, where the number of active binding sites per protein was between 0.6 and 0.9. Since the crystals of FABP3 were prepared with the Tris buffer system, we mainly adopted the thermodynamic parameters obtained under the Tris buffer conditions. Each experiment was accompanied by a ligand dilution in which the FA-DMPC LUV solution was titrated against the ITC buffer. The ligand dilution data were subtracted from the corresponding binding data before analysis.
The ITC data obtained were fitted using the NanoAnalyze software (TA Instruments) with the independent model (single binding site) to deduce the thermodynamic parameters. All experiments were repeated at least three times. Mean and standard error values were calculated from three sets of data.
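As an illustration of how fitted single-site parameters relate to one another, the following minimal sketch converts a dissociation constant and binding enthalpy into pKd, ΔG and -TΔS at 310 K, and computes the mean and standard error over replicates. It is not the NanoAnalyze fitting procedure itself; the function names and the triplicate values are hypothetical and serve only as placeholders.

```python
import math
from statistics import mean, stdev

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_thermodynamics(kd_molar, delta_h_kcal, temperature_k=310.0):
    """Return (pKd, dG, -TdS) for a single-site model; energies in kcal/mol."""
    delta_g = R_KCAL * temperature_k * math.log(kd_molar)  # dG = RT ln(Kd)
    minus_t_delta_s = delta_g - delta_h_kcal               # from dG = dH - TdS
    return -math.log10(kd_molar), delta_g, minus_t_delta_s


def mean_and_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical triplicate: (Kd in mol/L, dH in kcal/mol). Not measured data.
replicates = [(1.0e-7, -10.0), (1.2e-7, -10.4), (0.9e-7, -9.8)]
results = [binding_thermodynamics(kd, dh) for kd, dh in replicates]

for label, column in zip(("pKd", "dG", "-TdS"), zip(*results)):
    avg, sem = mean_and_sem(column)
    print(f"{label}: {avg:.3f} +/- {sem:.3f}")
```

Averaging the derived quantities per replicate, rather than deriving them from an averaged Kd, is one of several reasonable conventions and is assumed here.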
X-ray co-crystallography. To obtain clearer electron density maps for the FA alkyl chains, delipidated FABP3 at a concentration of 2 mg/ml was incubated for 1 h at 310 K with an excess (>1.5-fold molar) of FA. The FABP3-FA complexes were concentrated to approximately 20 mg/ml for crystallization. The crystals were obtained using the sitting-drop vapour diffusion method. The co-crystals of FABP3 were each grown in 1.0 µL of protein solution and 1.0 µL of reservoir solution (100 mM Tris-HCl, 55 % PEG400, pH 8.0−8.5) equilibrated against 1.0 mL of the reservoir solution at 293 K. These crystals were directly flash-cooled in a stream of cold nitrogen gas at 100 K with the reservoir solution as a cryoprotectant.
X-ray diffraction data were collected at beamlines BL38B1 and BL44XU of the SPring-8 synchrotron radiation facility (Harima, Japan). Diffraction data were processed using the HKL2000 program. [6] The deposited FABP3 structure coordinates (PDB ID 1HMT) were used as the starting model for the structural analysis. Structural refinements were carried out using a restrained least-squares refinement method in the Refmac software [7] as implemented within the CCP4 package. Anisotropic refinements were carried out using the program SHELXL. [8] The geometry of the refined model was validated with the program PROCHECK. [9] The data collection and refinement statistics are summarized in Table S1. The final atomic coordinates and structure-factor amplitudes (PDB entries 4TJZ, 4TKB, 4TKH, 4TKJ, and 3WVM) have been deposited in the Worldwide Protein Data Bank (wwPDB; http://www.wwpdb.org) and the Protein Data Bank Japan at the Institute for Protein Research, Osaka University, Suita, Osaka, Japan (PDBj; http://www.pdbj.org/).

WaterMap calculation.
The protocols involved in the WaterMap calculations are described in previous works. [10] Input protein structures were prepared using the Protein Preparation Wizard in the Maestro (version 9.8) [11] molecular modelling suite. Amino acid residues outside of a 20 Å shell around SFAs were removed and the system was solvated in a TIP4P [12] water box extending at least 10.0 Å beyond the truncated protein in all directions. An 8.0 ns MD simulation was performed following a standard WaterMap relaxation protocol with 5.0 kcal/mol positional constraints, except for Lys58 and Asp77, to facilitate equilibration of the water molecules in the binding site. Water molecules from the frames saved at 1.5 ps intervals in the simulation were clustered into distinct hydration sites, and the excess entropy and enthalpy were calculated relative to bulk solvent according to the inhomogeneous solvation theory. [13]
Molecular dynamics simulation. All simulations of the FABP3-FA complexes were performed at 310 K with the program 'MARBLE'. [14] CHARMM36/CMAP for the protein, [15] CHARMM36 [16] for FA and TIP3P for water [17] were used as the force-field parameters. Periodic boundary conditions and the particle mesh Ewald method were applied. The Lennard-Jones potential was smoothly switched to zero over the range 8–10 Å. The symplectic integrator for rigid bodies was used, and CHx, NHx (x = 1, 2, 3), SH and OH groups were treated as rigid bodies. The time step was set to 2.0 fs.
For FA = C10:0 to C16:0, the initial coordinates of our simulations were taken from the final stages of the crystal structure refinements described in the previous section. For FA = C18:0, the crystal structure with a resolution of 1.37 Å at a later stage of the refinement by our group was used as the initial structure. Polyethylene glycol and glucosamine phosphate molecules in the crystal structures were removed, and the oxygen atoms of these molecules were replaced by water oxygen atoms except for the phosphate oxygen atoms. The first methionine residue of FABP3 was removed.
Using PROPKA3.1, [18] we estimated that all protein residues adopt the default protonation states and the FAs are in a deprotonated state. The protein, bound FA and crystal water molecules were placed in a cubic box to accommodate a minimum water shell thickness of 12 or 13.5 Å. Two Na+ ions were added in bulk solution to neutralize the system. The resulting systems contain over 31,500 atoms. Before starting the MD simulations, the systems were minimized to remove unfavourable contacts.
For equilibration, the systems were gradually heated to 310 K over 100 ps under NPT conditions at 1 atm with constraints on the positions of all non-hydrogen atoms of the protein and bound FA. The same constraints were kept for the next 100 ps and then gradually removed over a period of 150 ps. The subsequent 650 ps runs with no positional constraints were discarded as further equilibration. After the equilibration runs, 20 ns production runs were performed under NPT conditions at 1 atm and 310 K for all systems. Snapshots were saved every 1 ps.
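The stage lengths quoted above can be tallied with a short bookkeeping sketch. The durations, the 2.0 fs time step and the 1 ps saving interval are taken from the text; the variable and function names are hypothetical, and the snapshot count assumes one saved frame per picosecond without counting the starting frame.

```python
# Equilibration stages as (description, duration in ps), in the order described.
EQUILIBRATION_STAGES_PS = [
    ("heating to 310 K with positional constraints", 100),
    ("constraints kept", 100),
    ("constraints gradually removed", 150),
    ("unconstrained, discarded as further equilibration", 650),
]
PRODUCTION_NS = 20        # production run length per system
TIME_STEP_FS = 2.0        # integration time step
SAVE_INTERVAL_PS = 1.0    # snapshot interval


def protocol_summary():
    """Return (total equilibration in ps, production steps, saved snapshots)."""
    equilibration_ps = sum(duration for _, duration in EQUILIBRATION_STAGES_PS)
    production_ps = PRODUCTION_NS * 1000
    production_steps = round(production_ps * 1000 / TIME_STEP_FS)  # ps -> fs
    snapshots = round(production_ps / SAVE_INTERVAL_PS)
    return equilibration_ps, production_steps, snapshots


equilibration_ps, steps, snapshots = protocol_summary()
print(f"equilibration: {equilibration_ps} ps")
print(f"production: {steps} steps, {snapshots} snapshots")
```

Under these assumptions, each system has 1 ns of equilibration before the 20 ns production run.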
During the 20 ns trajectories, the protein backbone and FAs fluctuated around the crystal structures. The protein backbone and the FA atoms in every snapshot were superimposed as one unit on the corresponding energy-minimized structure using the method proposed by Kabsch. [19,20] The root-mean-squared fluctuations of FAs in Figure S4c were calculated using the structures after the superimposition.
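The fluctuation calculation can be sketched as follows, assuming the snapshots have already been superimposed on a common reference and that fluctuations are measured about each atom's mean position over the trajectory. The function name and the data layout (a list of frames, each a list of (x, y, z) tuples) are hypothetical, and the reference used for Figure S4c may differ.

```python
import math


def rmsf_per_atom(frames):
    """Root-mean-squared fluctuation of each atom about its mean position.

    frames: list of snapshots, each a list of (x, y, z) tuples with the same
    atom order. Snapshots are assumed to be superimposed beforehand.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Mean position of atom i over all snapshots
        mean_pos = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that mean position
        msf = sum(
            sum((frame[i][k] - mean_pos[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Toy trajectory: atom 0 oscillates along x, atom 1 stays fixed.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf_per_atom(toy_frames))
```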

Structure comparison between protein molecules in the same protein family.
We obtained the lists of PDB and chain IDs belonging to the same protein family from the Pfam database (version 27.0). [21,22] Because the database had not been updated for nearly one year as of February 2014, we added crystal structures newly deposited in the PDB to the lists by referring to the sequence neighbor lists in PDBj. [23] Because residue numbers sometimes differ between PDB files of the same protein, equivalent residues in two molecules were identified through sequence alignment with Clustal Omega. [24] One of the two protein molecules was then superimposed on the other with the method proposed by Kabsch. [19,20] The Cα atoms of all residues were used for the superimposition, except those of residues missing in at least one molecule.
Finally, the root-mean-squared difference (RMSD) for the Cα atoms between the two molecules was calculated. The pair showing the largest RMSD value among all protein molecule pairs in the protein family is tabulated in Table S5.
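The pairwise comparison can be sketched in a dependency-free form. Note the substitution: instead of Kabsch's SVD-based procedure, this sketch uses Horn's quaternion formulation, which yields the same minimal RMSD after optimal rigid superposition, and finds the leading eigenvalue by simple power iteration (adequate for illustration, slow for nearly degenerate geometries). All names are hypothetical; each molecule is assumed to be a dictionary mapping an alignment-derived residue index to one atom's coordinates, so residues missing in either molecule drop out automatically.

```python
import math
from itertools import combinations


def _centered(coords):
    """Translate coordinates so that their centroid is at the origin."""
    n = len(coords)
    c = [sum(p[k] for p in coords) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in coords]


def min_rmsd(a, b, iterations=500):
    """Minimal RMSD between two equal-length coordinate lists after superposition."""
    a, b = _centered(a), _centered(b)
    n = len(a)
    # Cross-covariance matrix S[i][j] = sum over atoms of a_i * b_j
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue gives the best overlap
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift by a Gershgorin bound so power iteration targets the largest eigenvalue
    shift = max(sum(abs(v) for v in row) for row in key)
    v = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        w = [sum(key[i][j] * v[j] for j in range(4)) + shift * v[i] for i in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate input: all centered coordinates are zero
        v = [x / norm for x in w]
    lam = sum(v[i] * key[i][j] * v[j] for i in range(4) for j in range(4))
    sq_sum = sum(x * x for p in a for x in p) + sum(x * x for p in b for x in p)
    return math.sqrt(max(0.0, (sq_sum - 2.0 * lam) / n))


def rmsd_common_atoms(mol_a, mol_b):
    """RMSD using only residues present in both molecules."""
    shared = sorted(set(mol_a) & set(mol_b))
    return min_rmsd([mol_a[r] for r in shared], [mol_b[r] for r in shared])


def largest_rmsd_pair(molecules):
    """Return (rmsd, id_1, id_2) for the most dissimilar pair in a family."""
    return max(
        (rmsd_common_atoms(molecules[x], molecules[y]), x, y)
        for x, y in combinations(sorted(molecules), 2)
    )
```

A call such as `largest_rmsd_pair({"chainA": {...}, "chainB": {...}, "chainC": {...}})`, with each inner dictionary built from aligned residue positions, would return the pair to tabulate.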

Figure S6. Examples of promiscuous recognition of lipids and hydrophobic ligands by a rigid protein (A) and flexible proteins (B and C).
Three lipid-binding proteins are highlighted here: CERT is a relatively rigid protein that binds various ceramides, [25] whereas cytochrome P450 [26,27] and peroxisome proliferator-activated receptor (PPAR) [28-30] are known to recognize their ligands in an induced-fit manner. See Table S5 for detailed data.
(A) CERT: Main chains of the apo- and C6-, C16-, and C18-ceramide-bound structures of the CERT START domain, particularly near the binding cavity, are largely superimposed. Red, apo-form; green, blue, and cyan, C6-, C16-, and C18-ceramide-bound forms, respectively. [25] The image was taken from reference [25] © The National Academy of Sciences of the United States of America.
(B) Cytochrome P450 3A4: A superposition of the ketoconazole (hydrophobic ligand, not shown) complex (2J0C, dark colours) and the ligand-free structure (1TQN, light colours). Note that α-helices F/F' and G/G' and a C-terminal loop undergo large conformational changes upon ligand binding, as depicted by orange arrows. The images were taken from reference [26]. © The National Academy of Sciences of the United States of America.
(C) PPAR: The structures of ligand-binding cavities of PPAR with 5-hydroxyeicosapentaenoic acid (5-HEPA) and 4-hydroxydocosahexaenoic acid (4-HDHA). Note the differences in the conformation and orientation of the ligands and surrounding amino acid residues. The image was taken from reference [30] with permission. © Nature Publishing Group. Figure S7. Overlay plots of FA selectivity profiles. (a) FA compositions in human fat cell tryacylglycerol (red circles) [31] and artery plasma (blue triangles) [32] in percentage (right axis) and

S11
FABP3 FA affinities in pK d (gray squares, left axis). (b) FA binding affinities to three subtypes of human peroxisome proliferator-activated receptors (PPARs  in red circle,  in orange triangles, and  in violet inverted triangles) [33] and FABP3 (gray squares). FA affinities to PPARs are shown in pIC 50 evaluated by inhibition assay against 3 H-labeled ligands. [33] (c) Overlay plot of SFA selectivity profiles for FABP3 (gray squares) and acyl-CoA dehydrogenases short-chain (SCAD, blue circles), medium-chain (MCAD, magenta triangles) and long-chain (VLCAD, dark red inverted triangles) a b c S12 involved in the first reaction in mitochondrial -oxidation. [34] *The affinity of C6:0 FA was below the detection limit. Water-soluble short SFAs of C6-C8 and the FAs of C10-C18 are also favourable substrates of the mitochondrial -oxidation cascade. [34] The overlapping chain-length preference between FABP3 and acyl-CoA dehydrogenases indicates that FABP3 selectively binds water-insoluble "fuel" FAs as the substrates for mitochondrial energy production.  S14 Table S2. X-ray data processing and structure determination statistics for FABP3 in complex with the fatty acids.  The site numbers are shown in the figure below. b Occupancy is the fraction of the MD frames in which a water molecule is found at the hydration site, in the total number of frames recorded. c H, S and G of the hydration siteswere calculated based on the free energy relative to bulk water. d #HB(WB) is the number of the hydrogen bonds between the water molecules. e #HB(PW) is the number of the hydrogen bonds between the water molecules and the protein.

Hydration site numbers are indicated in the C10-FABP3 complex. Colour gradation is based on the free energy relative to bulk water in the same way as Figure S4.

[25] 2E3R_A (Crystal structure of CERT START domain in complex with C18-ceramide (P1)) [25] 1.323 CP2B4_RABIT rabbit Cytochrome