Molecular Determinants of Carbocation Cyclisation in Bacterial Monoterpene Synthases

Abstract
Monoterpene synthases are often promiscuous enzymes, yielding product mixtures rather than pure compounds due to the nature of the branched reaction mechanism involving reactive carbocations. Two previously identified bacterial monoterpene synthases, a linalool synthase (bLinS) and a cineole synthase (bCinS), produce nearly pure linalool and cineole from geranyl diphosphate, respectively. We used a combined experimental and computational approach to identify critical residues involved in bacterial monoterpenoid synthesis. Phe77 is essential for bCinS activity, guiding the linear carbocation intermediate towards the formation of the cyclic α‐terpinyl intermediate; removal of the aromatic ring results in variants that produce acyclic products only. Computational chemistry confirmed the importance of Phe77 in carbocation stabilisation. Phe74, Phe78 and Phe179 are involved in maintaining the active site shape in bCinS without a specific role for the aromatic ring. Phe295 in bLinS, and the equivalent Ala301 in bCinS, are essential for linalool and cineole formation, respectively. Where Phe295 places steric constraints on the carbocation intermediates, Ala301 is essential for bCinS initial cyclisation and activity. Our multidisciplinary approach gives unique insights into how carefully placed amino acid residues in the active site can direct carbocations down specific paths, by placing steric constraints or offering stabilisation via cation‐π interactions.


Calculation of the interaction energies between terpinyl cation and enzyme residues
The crystal structure of bCinS in complex with Mg²⁺ ions and the GPP analogue (2Z)-2-fluoro-3,7-dimethylocta-2,6-dien-1-yl trihydrogen diphosphate (PDB ID 5NX7, chain A) [2] was used as the starting structure to generate a structure of the bCinS S-α-terpinyl cation complex. The protonation states of titratable residues were estimated using PROPKA 3.1. [4] The aliphatic chain of the GPP analogue was removed while the diphosphate moiety was retained as PPi, and the terpinyl cation was then manually docked into the enzyme active site. To orient the terpinyl cation, we first considered the orientation and conformation of the GPP analogue in the crystal structure (a reasonable reference, given the minimal structural differences between the analogue and GPP itself). [2] We further verified that, after optimization, the structure of the bCinS terpinyl cation complex corresponded to a productive configuration (i.e. one that connects with the expected reactant and product complexes).
The bCinS terpinyl cation complex was optimized using quantum mechanics/molecular mechanics (QM/MM) energy minimisation, [5] where the QM region (terpinyl cation, catalytic Mg²⁺ ions, and PPi) was treated at the M06-2X/6-31G(d) level [6] and the MM region (protein and crystal waters) was described with the CHARMM36 force field [7] and the modified TIP3P water model. The QM/MM optimizations were performed with the ChemShell package.
All optimizations were carried out using the DL-FIND optimizer module of ChemShell and hybrid delocalized internal coordinates (HDLC). [11] The charge of the QM region was +3. All atoms within 7 Å of the QM region were unconstrained during optimizations, whereas more distant atoms were kept fixed. Non-bonded interactions were calculated without a cutoff. An electrostatic embedding scheme with charge shift correction was used to compute the electrostatic interaction between the QM region and the surrounding partial charges of the MM region. [12] From the M06-2X/6-31G(d)//CHARMM36 optimized structure (Figure 4 of the main manuscript), the coordinates of the side chains (including the alpha carbon atom) of the relevant bCinS Phe residues (74, 77, 78, and 179) and of the terpinyl cation were extracted. The valence of each alpha carbon atom was then completed by adding hydrogen atoms, and the interaction energy of each Phe residue with the cation was calculated at the DFT (M06-2X/TZVP) level with counterpoise correction. [6,13] For comparison, each Phe residue was mutated to Ala and to Leu and the interaction energies were recalculated. Additionally, F77 was mutated to Tyr.
These calculations were performed with the Gaussian 09 program. [9] Mutations were introduced using the mutagenesis wizard of the PyMOL package, and the mutant residues were positioned in an orientation similar to that of the wild-type residues.
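The counterpoise-corrected interaction energy described above can be written as E_int = E_AB(AB) - E_A(AB) - E_B(AB), where each fragment energy is evaluated in the full dimer basis. The following is a minimal sketch of that arithmetic only, not the workflow used in this study; the function name and the example energies are hypothetical placeholders, and in practice the three energies would be taken from the output of the quantum chemistry program.

```python
# Conversion factor from hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def counterpoise_interaction_energy(e_complex, e_cation_in_complex_basis,
                                    e_residue_in_complex_basis):
    """Return the counterpoise-corrected interaction energy in kcal/mol.

    All inputs are total energies in hartree. The two fragment energies
    must have been computed in the full complex basis set (i.e. with
    ghost functions on the partner fragment).
    """
    e_int_hartree = (e_complex
                     - e_cation_in_complex_basis
                     - e_residue_in_complex_basis)
    return e_int_hartree * HARTREE_TO_KCAL_PER_MOL


if __name__ == "__main__":
    # Hypothetical placeholder energies (hartree), NOT values from this study.
    e_int = counterpoise_interaction_energy(-100.020, -60.000, -40.000)
    print(f"Interaction energy: {e_int:.2f} kcal/mol")
```

A negative value indicates a stabilising interaction between the residue side chain and the cation; comparing the wild-type Phe with its Ala, Leu, or Tyr replacement at the same position then amounts to comparing two such numbers.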

Molecular dynamics (MD) simulations of the bCinS GPP complex
We set up MD simulations of the wt-bCinS GPP complex to gain molecular insight into the effect of the A301V/L/F mutations on bCinS catalysis. The intention was to also perform simulations of the mutant bCinS GPP complexes, to assess whether differences in the dynamics of GPP in the complex could explain the differences in product specificity observed experimentally (similar to previous works). [14] However, as mentioned in the main manuscript and shown below (Figure S2), replacing A301 in the wt-bCinS GPP complex with residues bearing bulkier side chains caused serious steric clashes with surrounding residues, indicating that the enzyme active-site contour that promotes cyclization of GPP cannot be maintained. Because of the large uncertainty in the initial structures of these mutant bCinS GPP complexes, it was not possible to perform reliable simulations for them.
For the MD simulations of GPP in wt-bCinS, we modelled the bCinS GPP complex again using chain A of the 5NX7 crystal structure as the starting structure. [2] The fluorinated GPP analogue was minimally modified to match the structure of GPP (the fluorine atom was removed and hydrogen atoms were added to the aliphatic chain in accordance with GPP). The resulting bCinS GPP structure was then solvated in a rectangular box of TIP3P water molecules with a minimum buffer of 13 Å around the protein, using the solvate plugin of the VMD package. [15] The water box contained 17646 water molecules. Additionally, five Na⁺ ions were added with the autoionize plugin of VMD to neutralize the system. [15] Thereafter, the complex was transferred to the Amber16 program (which faithfully represents CHARMM force fields) to perform MD simulations on GPUs using the PMEMD code. [16] The CHARMM36 protein force field was employed.
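The geometry behind the solvation and neutralisation steps above can be illustrated with a short sketch. It only shows how a rectangular box with a fixed buffer follows from the solute coordinates and how a counterion count follows from a net charge; it is not the implementation of the VMD plugins, and the function names and example coordinates are hypothetical. The net charge of -5 used in the example is inferred from the five Na⁺ ions reported above.

```python
def solvent_box_bounds(coords, buffer=13.0):
    """Return (lower, upper) corners of a rectangular box that leaves at
    least `buffer` Å between every solute atom and each box face.

    coords: iterable of (x, y, z) tuples in Å.
    """
    xs, ys, zs = zip(*coords)
    lower = (min(xs) - buffer, min(ys) - buffer, min(zs) - buffer)
    upper = (max(xs) + buffer, max(ys) + buffer, max(zs) + buffer)
    return lower, upper


def counterions_needed(net_charge):
    """Return (count, ion_charge) of monovalent ions that neutralise the system."""
    if net_charge == 0:
        return 0, 0
    ion_charge = 1 if net_charge < 0 else -1
    return abs(net_charge), ion_charge


if __name__ == "__main__":
    # Hypothetical two-atom "solute" spanning 10 x 20 x 30 Å.
    lower, upper = solvent_box_bounds([(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)])
    print("Box corners:", lower, upper)
    # A net charge of -5 (inferred from the text) requires five +1 ions.
    print("Counterions:", counterions_needed(-5))
```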
Our MD protocol is analogous to that of previous studies [2,14,17,19] and consisted of the following steps: i) minimization of the positions of hydrogen atoms (all heavy atoms fixed); ii) minimization of the solvent (all other parts of the system fixed); iii) minimization of the entire system with positional restraints of 5 kcal mol⁻¹ Å⁻² applied to the Cα atoms and the Mg²⁺ B ion, and with one-sided harmonic distance restraints between the diphosphate oxygens and the Mg²⁺ A and C ions (applied at distances > 2.2 Å, with a force constant of 50 kcal mol⁻¹ Å⁻², to maintain the coordination observed in X-ray crystal structures); iv) 60 ps of thermalisation to 300 K in the canonical (NVT) ensemble (with the positional and distance restraints still in place); v) 150 ps in the isothermal-isobaric (NPT) ensemble at 300 K and 1 bar, keeping the distance restraints and gradually decreasing the positional restraints (from 5 to 0.5 kcal mol⁻¹ Å⁻²) on the Cα atoms and Mg²⁺ B; vi) 30 ns of NPT simulation at 300 K and 1 bar with the diphosphate-Mg²⁺ A,C distance restraints.
All MD simulations were performed using periodic boundary conditions and a time step of 2 fs. A direct-space cut-off of 8 Å was used for nonbonded interactions, with particle mesh Ewald (PME) for long-range electrostatics. All bonds involving hydrogen atoms were constrained with SHAKE. [20] Langevin dynamics was used for temperature control (collision frequency of 5 ps⁻¹ for steps iv-v and 2 ps⁻¹ for step vi), and pressure was controlled by coupling to an external bath (Amber16 default settings) under NPT conditions. A total of four independent 30 ns MD simulations were performed using different initial velocity distributions. The simulations were analysed over the final 20 ns of each trajectory with VMD [15] and the CPPTRAJ utility of AmberTools16. [16a] The trajectories indicated that GPP was maintained in a conformation consistent with cyclisation, with a reasonably stable protein structure (Figure S1).
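To make the one-sided distance restraint and the length of each stage concrete, the sketch below evaluates a flat-bottomed (upper-wall only) harmonic penalty and converts stage durations into numbers of integration steps. It is an illustration under stated assumptions rather than the input used in this study: the function names are hypothetical, and the energy is assumed to take the form k(d - d0)² without a factor of 1/2, a convention that differs between simulation programs.

```python
def one_sided_restraint_energy(distance, threshold=2.2, force_constant=50.0):
    """Penalty (kcal/mol) for a distance (Å) exceeding `threshold`.

    Zero at or below the threshold; k * (d - d0)**2 above it.
    Assumes the k*(d - d0)**2 convention (no factor of 1/2).
    """
    excess = distance - threshold
    if excess <= 0.0:
        return 0.0
    return force_constant * excess ** 2


def number_of_steps(duration_ps, timestep_fs=2.0):
    """Number of integration steps for a stage of `duration_ps` picoseconds."""
    return round(duration_ps * 1000.0 / timestep_fs)


if __name__ == "__main__":
    # The restraint is inactive while the Mg-O distance stays within 2.2 Å.
    for d in (2.0, 2.2, 2.5):
        print(f"d = {d:.1f} Å -> {one_sided_restraint_energy(d):.2f} kcal/mol")

    # Stage durations taken from the protocol: 60 ps NVT, 150 ps NPT, 30 ns NPT.
    for label, ps in (("iv", 60), ("v", 150), ("vi", 30000)):
        print(f"step {label}: {number_of_steps(ps)} steps")
```

Because the penalty vanishes below the threshold, the restraint keeps the diphosphate oxygens coordinated to the metal ions without forcing a particular distance when coordination is already intact.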