This work is supported by: University of Massachusetts Boston.
Molecular docking is a frequently used method in structure-based rational drug design. It is used for evaluating the complex formation of small ligands with large biomolecules, predicting the strength of the bonding forces, and finding the best geometrical arrangements. The major goal of this advanced undergraduate biochemistry laboratory exercise is to illustrate the importance and application of this tool. Students carry out the computational modeling of the interaction of acetylcholinesterase and its inhibitor, tacrine, and learn about the concepts of protein structure, enzyme-inhibitor interactions, intermolecular forces, and the role of molecular design in drug development.
Computer-based methods are becoming increasingly important and complementary to wet laboratory experiments in studying the structure and function of biomolecules. Molecular docking is a frequently used tool in structure-based rational drug design. Although early efforts were hindered by limited computational resources, recent advances in high-performance computing have made virtual screening methods increasingly efficient. These methods contributed to the development of several drugs and drug candidates that advanced to clinical trials. Examples include lead compounds to prevent myocardial infarction and to treat HIV infection, Alzheimer's disease, rheumatoid arthritis, and many other diseases [1, 2]. Docking programs simulate how a target macromolecule (receptor, enzyme, or nucleic acid) interacts with small molecule ligands, such as substrates, inhibitors, or other drug candidates. To model the binding between the ligand and the target molecule, their known three-dimensional structures are superimposed and the fit between the key sites of the target molecule and the ligand is then analyzed. Using molecular mechanics, the programs usually determine the binding energy between the host's binding site and the ligand, a feature used to predict and describe the efficacy of the binding.
The undergraduate biochemistry laboratory experiment described here uses one of these programs, AutoDock Tools (ADT), to model the binding of an inhibitor ligand (tacrine) to its target enzyme, acetylcholinesterase (AChE).
ADT is a package of automated docking tools and an excellent resource to engage biochemistry students in computational studies. It is frequently used, fast, and available free of charge from the Scripps Research Institute. Aside from generating binding energies in these docking studies, the position of the ligand in the host's binding site can be visualized. The inhibition of the AChE enzyme (EC 3.1.1.7) is chosen as a model system for the introduction to molecular docking. We have selected this system for three reasons: (i) the crystal structure of AChE is readily available, (ii) the computational experiment can be easily connected to a wet laboratory experiment, if desired, using well-established protocols for the measurement of the enzyme activity of AChE, and (iii) the broad biomedical relevance of AChE. Many neurodegenerative diseases (Alzheimer's disease, Parkinson's disease, Huntington's disease, etc.) are associated with the degeneration of the cholinergic system, resulting in a decrease in the amount of neurotransmitters such as acetylcholine (ACh). Inhibition of the hydrolysis of ACh by blocking its metabolic enzyme AChE [6, 7] increases the ACh concentration and provides a symptomatic treatment option.
There are several AChE inhibitors currently approved for the treatment of Alzheimer's disease. These are donepezil, rivastigmine, galanthamine, and tacrine.
This computational experiment has recently been introduced to an advanced level undergraduate Biochemistry laboratory curriculum. It could also be used in medicinal or computational chemistry laboratory courses.
REQUIRED RESOURCES AND PRELABORATORY PREPARATIONS
Although most theoretical modeling studies require significant computer resources, the present experiment can be carried out using simple desktop or laptop computers that are available in most departmental computer laboratories. In our case, several laptop computers that serve as a mobile computational resource at the department were used. The specifications of the computers are: Dell Latitude D630, Intel® Core™ 2 Duo CPU T7250, 2.00 GHz, 1.99 GHz, 0.99 GB of RAM, Microsoft Windows XP Professional, SP2.
Cygwin is a useful free tool that provides a Linux-like environment on Windows. It is critical because ADT runs only under a Linux environment. Cygwin consists of two parts: a DLL (cygwin1.dll), which acts as a Linux API emulation layer providing substantial Linux API functionality, and a collection of tools, which provide a typical Linux interface. Cygwin was downloaded from cygwin.com for Windows XP (Professional) for 64-bit computers.
ADT is distributed as a part of MGLTools from http://autodock.scripps.edu/downloads in the form of the MGLTools-1.5.2-Setup.exe executable file. It includes not only AutoDock and AutoGrid but also Python 2.6, the programming language used to run ADT.
Molecular Structure Files
After all software installations were complete, we created a separate folder (BioChem Lab) for the necessary files. To run ADT from this folder, we copied into it the executable files autodock4.exe and autogrid4.exe and the required files cygwin1.dll, adt4, and runAdt.py. We also placed there the two files with which students would work during the experiment: the ligand file in the typical mol2 format (l.mol2) and the structure of human AChE taken from the Protein Data Bank (PDB code 1B41; r.pdb). We also made a backup folder in case of unexpected damage to the original files. That folder contained a copy of all files that should be obtained after the successful completion of the exercise (result files).
The obtained protein structure (1B41.pdb) was slightly modified: the cocrystallized peptide fasciculin II and the small ligands α-L-fucose, N-acetyl-D-glucosamine, and 2-(acetylamino)-2-deoxy-α-D-glucopyranose were removed (Fig. 1).
The structure of tacrine (Fig. 2) was drawn using ChemDraw Ultra from the ChemOffice software package. It was then copied into Chem3D Ultra (same program package), where it was subjected to a simplified energy minimization to a minimum root mean square (RMS) gradient of 0.100. The resulting local-minimum structure was saved in the convenient mol2 format.
The flexible residues and docking site of the enzyme were adapted from the literature. Flexible residues were found to be Ser 203, Glu 334, and His 447. The binding site of the protein was determined next to the Tyr 72, Tyr 124, Trp 286, Tyr 341, and Asp 74 residues, with a grid center of 121.96, 106.597, −114.897.
OVERVIEW OF THE PROCEDURE
The major aim of the present laboratory exercise is that students perform a docking study of tacrine, a known inhibitor of the AChE enzyme, using ADT, a freely available docking software package. They calculate the position of the docked ligand and of the flexible residues that move during the interaction. To make the docking computationally manageable, putative binding conformations were limited to the known binding site, and the protein was treated as a rigid body except for three amino acid sidechains, the “flexible residues.” Tacrine was used in the form of its 10 conformers, which were individually treated as rigid structures during calculations. After the calculations, the major task is to compare the energies of the interaction in different conformations and determine the best fit between the enzyme and ligand. The students work with ADT in three major steps:
Preparation of Target Protein and Ligand Files
Preparation of the Protein
The protein structure is obtained from the PDB web site (http://www.pdb.org) in .pdb format (see Prelaboratory Preparations). This is a special format for protein structures that are obtained by X-ray crystallography or NMR studies. All structures are described as text files that contain the necessary information on the molecule, such as number of atoms, name of atoms, bond distances, angles, dihedral angles, residue numbers, and so forth. Sometimes this structure file contains too much information or not enough for a particular purpose; that is why it must be edited. In some applications, it is assumed that there is no necessity to have hydrogen atoms in the file (e.g. because the large number of hydrogens in proteins would significantly increase the size of the file and the addition of hydrogens later is a simple task). In this step, we need to place back all hydrogens for ADT calculations. The other required action is to remove water molecules from the surface of the protein. This is necessary because the extra water molecules will mask the protein surface from the ligand. (Please note that most proteins are reported in PDB cocrystallized with different small molecules that are removed from the protein structure during the prelaboratory preparations described earlier.)
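The clean-up described above can be pictured with a short Python sketch. It is only an illustration of the idea and not the procedure performed inside ADT: the helper names `make_record` and `strip_hetero` are hypothetical, the example records use dummy coordinates, and the sketch relies only on the fixed-column PDB convention that the residue name occupies columns 18-20 of ATOM and HETATM records.

```python
def make_record(record, serial, atom, resname, chain, resseq):
    """Build a fixed-column PDB coordinate line with dummy (0, 0, 0) coordinates."""
    return "%-6s%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        record, serial, atom, resname, chain, resseq, 0.0, 0.0, 0.0)


def strip_hetero(pdb_lines, keep=()):
    """Drop HETATM records (waters, sugars, other small molecules).

    Residue names listed in `keep` are retained. All other record types
    (ATOM, TER, END, ...) pass through unchanged.
    """
    kept = []
    for line in pdb_lines:
        # Columns 18-20 (slice 17:20) hold the residue name, e.g. HOH for water.
        if line.startswith("HETATM") and line[17:20].strip() not in keep:
            continue
        kept.append(line)
    return kept


# Toy input: one protein atom, one sugar atom, one water oxygen.
example = [
    make_record("ATOM", 1, "CA", "SER", "A", 203),
    make_record("HETATM", 2, "C1", "NAG", "A", 601),
    make_record("HETATM", 3, "O", "HOH", "A", 701),
]
cleaned = strip_hetero(example)
print(len(cleaned), "record(s) kept")
```

A cocrystallized peptide is stored as ATOM records in its own chain, so removing it would need an additional filter on the chain identifier, which this sketch does not attempt. Hydrogens are then added within ADT as described above.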
Preparation of the Ligand
The ligand is originally drawn with a widely used chemical structure drawing software, such as Chem3D Ultra, to obtain standard 3D structures (.mol2 format) (see Prelaboratory Preparations). It is recommended to confirm that all hydrogen atoms are in the file before working with ADT. After the ligand is opened, it can be visualized, and ADT automatically computes Gasteiger charges (empirical atomic partial charges) and distinguishes between hybridization states and atom types. As a part of the preparation, the program determines the rotatable bonds of the ligand in order to generate different conformers for the docking.
Preparation of the Flexible Residue File
A unique property of the program is its ability to take into account the flexibility of not only the ligand but also the enzyme during the docking process. This means that ADT is able to model not only how the ligand docks to the protein but also the position of flexible residues. To use this advantage, the flexible residues must be chosen and the rotatable bonds must be found. Flexible residues are amino acids in the binding site region of the protein that are able to alter their position via conformational change upon ligand binding. They are found by comparison of different crystallized structures or by molecular dynamics simulations. According to the literature, the flexible residues are Ser 203, Glu 334, and His 447 in our system. Rotatable bonds are used by the program to generate rotational isomers of amino acids and to present enzyme structures with those conformers. The structure of ligand-free AChE highlighting the position of the binding site is shown in Fig. 1.
Calculation of Affinity Maps by Using a 3D-grid Around the Protein and Ligand
AutoGrid, a part of ADT, calculates the energy of the noncovalent interactions between the protein and probe atoms that are located at the different grid points of a lattice that defines the area of interest (i.e. the area of the macromolecule where the possibility of ligand binding is studied). AutoGrid builds as many files as the number of probe atoms used. There are 30 different types of grid maps. Each one shows the interaction energies for a particular atom type, such as aliphatic carbons, aromatic carbons, hydrogen bonding oxygens, and so forth. The grid itself is a box with defined dimensions that is located at the site on the surface of the protein where we expect the interaction with the ligand. In other words, the grid is our field of study (Figs. 3 and 4). The center of such a grid was precalculated earlier (see Prelaboratory Preparations), and a grid center of 121.96, 106.597, −114.897 is used in the following operations.
In this part of the modeling, we need to determine the area on the enzyme surface where the ligand interacts with the enzyme, the size of that area, and the types of ligand and enzyme atoms that participate in the interaction. The first two parameters are determined by the size and position of the grid box; the third is given by the map types. Once those parameters are set in one file (the grid parameter file), AutoGrid calculates a grid map file for each atom type within the given area.
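The relationship between the grid center, the number of grid intervals, and the spacing can be made concrete with a small Python sketch. The grid center is the one given in the text; the 40 x 40 x 40 intervals and the 0.375 Å spacing are assumptions chosen for illustration (they are common ADT defaults, not values reported for this experiment), and the function names are hypothetical.

```python
def grid_box_bounds(center, npts, spacing=0.375):
    """Return the lower and upper corners of a grid box.

    `npts` is the number of grid intervals along x, y, z, so each box edge
    is npts * spacing long and holds npts + 1 grid points.
    """
    half = [n * spacing / 2.0 for n in npts]
    lower = tuple(c - h for c, h in zip(center, half))
    upper = tuple(c + h for c, h in zip(center, half))
    return lower, upper


def inside_box(point, lower, upper):
    """True if a coordinate lies inside (or on the faces of) the box."""
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))


grid_center = (121.96, 106.597, -114.897)   # from the text
assumed_npts = (40, 40, 40)                 # assumption, not from the text
lower, upper = grid_box_bounds(grid_center, assumed_npts)

print("box edge (A):", assumed_npts[0] * 0.375)
print("center inside box:", inside_box(grid_center, lower, upper))
```

With these assumed settings each edge of the box is 15 Å long, and only ligand positions whose atoms fall inside the box can be scored against the precalculated maps.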
Defining the Docking Parameters and Running the Docking Simulation
When the preparation of the input files (ligand and protein) and the calculation of the affinity maps are properly performed, AutoDock will carry out the docking automatically using the newest docking algorithm (Lamarckian Genetic Algorithm).
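For readers unfamiliar with the term, the idea behind a Lamarckian genetic algorithm can be sketched in a few lines of Python. This is a toy illustration and not AutoDock's implementation: the "energy" is a simple quadratic function of three numbers rather than a docking score, the local search is a plain random-perturbation search, and all names and parameter values are assumptions. The defining feature is that the result of each individual's local optimization is written back into its genotype before selection.

```python
import random


def score(x):
    """Toy 'energy': a quadratic bowl with its minimum at (1, -2, 0.5)."""
    target = (1.0, -2.0, 0.5)
    return sum((a - b) ** 2 for a, b in zip(x, target))


def local_search(x, rng, steps=20, step_size=0.1):
    """Random-perturbation local search that keeps only improving moves."""
    best, best_score = list(x), score(x)
    for _ in range(steps):
        trial = [v + rng.uniform(-step_size, step_size) for v in best]
        trial_score = score(trial)
        if trial_score < best_score:
            best, best_score = trial, trial_score
    return best


def lamarckian_ga(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(pop_size)]
    history = []  # best score after each generation
    for _ in range(generations):
        # Lamarckian step: locally optimized values replace the genotype.
        population = [local_search(ind, rng) for ind in population]
        population.sort(key=score)
        history.append(score(population[0]))
        survivors = population[: pop_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(mother, father)]  # crossover
            child = [g + rng.gauss(0, 0.3) for g in child]              # mutation
            children.append(child)
        population = survivors + children
    population.sort(key=score)
    return population[0], history


best, history = lamarckian_ga()
print("best toy energy:", score(best))
```

In a docking run the genotype would instead encode the position, orientation, and torsion angles of the ligand (and of the flexible residues), and the score would come from the precalculated affinity maps.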
Preparation of the Docking Parameter File (.dpf)
Once all files are ready, we need to specify for the program which ligand, protein, flexible residue file, and maps we want to work with, as well as which algorithm we want to use, how many iterations are required, and so on. That information is usually kept in the docking parameter file.
Running AutoDock 4
Viewing the Docking Results
As a result of the AutoDock calculations, we obtain an output file that contains, in our case, 10 conformers of the protein–ligand complex with the flexible residues and the ligand located within the binding pocket. Each structure is scored and ranked by the program according to its calculated interaction energy. Using the convenient tools of the ADT package, the generated structures can be easily visualized together with related information, including the intermolecular energy and the rank of the ligand. Later, students can open this file with other software (e.g. Paint from the Windows auxiliary programs) and transfer the structures into their laboratory report.
RESULTS AND DISCUSSION
Students participating in this laboratory were mostly senior undergraduate Biochemistry/Chemistry/Biology majors (class enrollment 25). This experiment was presented as the last one in the second semester of our two-semester biochemistry laboratory sequence. Participating students had already taken exams in protein structure and function; thus, we considered them prepared to understand the major goals and steps of this exercise. Furthermore, shortly before this exercise, at the end of the second semester Biochemistry course, lectures were presented on drug development, and computational methods were explained as one of its tools.
Our major goal with this exercise was to familiarize students with these computational virtual screening methods and to illustrate them with an example that involves a medicinally relevant enzyme (AChE) and a commercially available approved drug molecule (tacrine). Considering that in a real-life application about 500–25,000 conformations of a drug molecule are tested in the binding pocket, this exercise with 10 conformations only illustrates the process.
After the necessary preparations had been completed, the class was given a short introduction on how computational methods, such as docking, contribute to the overall success of rational drug design. This analysis was built on the basics of medicinal chemistry that were introduced during the lecture class. The students were made aware of both the usefulness and the limitations of such virtual screening methods. We particularly highlighted that these methods are tools that assist researchers in either developing new drug candidates or explaining the mode of action of existing compounds, and not ones that guarantee ultimate success. Relevant positive examples were given based on two recent reviews on this topic [1, 2].
Regarding the current experiment, the following major points are worth mentioning during the introductory part of the class. There are over 200 AChE crystal structures in the PDB. 1B41 was chosen, despite its modest resolution of 2.8 Å, because it is the only nonmutant human protein. Similarly, tacrine was chosen because it is one of the four currently approved drugs for AChE inhibition, and has a relatively simple structure that makes the calculations faster.
Among the available AChE PDB entries one can find the empirical crystallographic structure of tacrine in complex with Torpedo californica AChE (1acj). The human and Torpedo sequences are 59% identical, and the global structural alignment of alpha carbons (1acj vs. 1b41) has an RMS of 1.21 Å (526 alpha carbons, Swiss PDBViewer Magic Fit). Similarly, the three flexible residues have their side chain atoms within a single atomic diameter of each other in the global alignment of alpha carbons (1b41: Ser203, Glu334, His447; 1acj: Ser200, Glu327, His440). This structure may be obtained for comparison with the docking results. Students, however, should be made aware of the limitations, namely that the current docking experiment included only 10 conformations, whereas thousands are investigated in research level docking. Thus, we could only identify the best of the 10 conformations. Research level docking studies would probably result in a near full reproduction of tacrine binding similar to the X-ray structure in 1acj.
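The RMS value quoted for the structural alignment, and any comparison of a docked pose with a crystallographic pose, rests on the root mean square deviation between paired atoms. A minimal Python sketch follows; it assumes the two coordinate sets are already superimposed and listed in the same atom order (it does not perform the fitting step that a tool such as Magic Fit carries out), and the coordinates are toy values, not data from 1acj or 1b41.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates: the second set is the first shifted by 1 A along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x, y, z + 1.0) for x, y, z in reference]
print(rmsd(reference, shifted))
```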
During the experiment, the calculations were carried out with 10 different conformations (Table I). The docked protein–ligand structures were visualized and the binding energies were determined.
Table I. Docking results for the tacrine-AChE system (column: intermolecular energy, kcal/mol)
In their laboratory reports, the students were asked to include the visualization of the docking results for the least and the most favorable conformations of the tacrine-AChE system, in the format shown in Figs. 3 and 4. They also had to give the final intermolecular energies (in kcal/mol) for these conformations and answer the following study questions:
(a) What is the main purpose of docking studies?
(b) Why is docking important for biochemists?
(c) In your opinion, what are the possible limitations of the method?
The visual representations and particularly the intermolecular binding energy data illustrate that different conformations have a definite impact on the interaction.
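As an aid for the laboratory report, the comparison of conformations can be sketched in Python. The energies below are placeholder numbers chosen for illustration and are not results from this experiment. The second function uses the standard thermodynamic relation ΔG = RT ln(Ki) to turn an estimated free energy of binding into an inhibition constant; it should be applied only to a free energy estimate, of which the intermolecular energy is one contribution. The function names are hypothetical.

```python
import math

GAS_CONSTANT = 1.9872e-3  # kcal/(mol*K)


def rank_poses(energies):
    """Return (pose_number, energy) pairs, most favorable (lowest energy) first."""
    return sorted(enumerate(energies, start=1), key=lambda pair: pair[1])


def estimated_ki(delta_g, temperature=298.15):
    """Inhibition constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


placeholder_energies = [-1.0, -3.0, -2.0]  # illustrative values only
ranking = rank_poses(placeholder_energies)
print("most favorable pose:", ranking[0])
print("least favorable pose:", ranking[-1])
print("Ki for -8.0 kcal/mol (mol/L):", estimated_ki(-8.0))
```

Because Ki depends exponentially on the free energy, a difference of about 1.4 kcal/mol at room temperature corresponds to roughly a tenfold change in the estimated inhibition constant.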
In similar studies, several conformations of the same molecule are usually docked to find the most advantageous structural form. Following the above approach, other molecules with completely different structure can also be docked to the same or different enzyme by ADT and the binding energies would reveal important information regarding the protein–ligand interactions.
LABORATORY TIME MANAGEMENT
After proper software installation and prelaboratory preparation of the basic protein and inhibitor structure files, the experiment fits well into the preset 4 h laboratory period. Following an introduction of about 30 min to the basic concepts, the students use step-by-step instructions to carry out the experiment. After the preparation of the input files (ligand and protein) and the calculation of the affinity maps, the computer time necessary to complete the calculation is about 40 min. This time can be used for further discussion of relevant basic topics, such as primary, secondary, tertiary, and quaternary protein structures, structure optimization, enzyme kinetics, types of interaction in enzyme-inhibitor complex formation, use of enzyme inhibitors as drugs, and the role of structure-based design in drug development.
The generation and visualization of the enzyme-inhibitor binding data in the described experiment provide an opportunity to explain structural features of enzyme inhibition and to point out the major benefits and weaknesses of molecular docking. The advantages of using the PDB and computational approaches in biochemical studies have also been illustrated. Student response to the exercise was positive. They particularly enjoyed the close-up visual representation of the binding site of the enzyme and of the enzyme-inhibitor complex. They understood the major steps, and the laboratory reports were of high quality compared to those for other experiments.
In order to help the implementation of this laboratory exercise, our syllabus with step-by-step instructions is provided as supplemental material. The detailed user guide and further notes on AutoDock can be found at websites listed as . In addition, the input files necessary to run the experiment can be downloaded from the following website: http://alpha.chem.umb.edu/chemistry/biochm386/
The authors would like to acknowledge the helpful feedback from Professor Mridula Satyamurti, the joint course instructor, and our students enrolled in the BioChem 386 class during Spring 2009.