Engineering of Thermostable β‐Hydroxyacid Dehydrogenase for the Asymmetric Reduction of Imines

Abstract The β-hydroxyacid dehydrogenase from Thermocrinis albus (Ta-βHAD), which catalyzes the NADP+-dependent oxidation of β-hydroxyacids, was engineered to accept imines as substrates. The catalytic activity of the proton-donor variant K189D was further increased by the introduction of two nonpolar flanking residues (N192L, N193L). Engineering the putative alternative proton donor (D258S) and the gate-keeping residue (F250A) led to a switched substrate specificity compared to the single and triple variants. The two most active Ta-βHAD variants were applied to biocatalytic asymmetric reductions of imines at elevated temperatures and enabled enhanced product formation at a reaction temperature of 50 °C.


Sequence selection and plasmids
A set of homologous β-hydroxyacid dehydrogenase (βHAD) sequences from Fademrecht et al. [1] was searched for sequences originating from hyperthermophilic archaea or bacteria by comparison with the BacDive database (release 09/2017). [2] The working hypothesis was that sequences from hyperthermophiles would likely encode thermostable proteins. Archaeal sequences were only considered if corresponding crystal structures were available in the Protein Data Bank (PDB), to increase the chances of protein expression in E. coli.
Based on the previously established standard numbering scheme for imine reductases, [1] equivalent position numbers were assigned using the imine reductase from Streptomyces kanamyceticus (PDB accession 3ZHB) as the reference sequence. Only sequences containing a lysine residue at standard position 187 and a glutamine or asparagine residue at standard position 191 were considered for further analysis (Table S1). Both positions were described as relevant for substrate interaction in βHADs in previous works. [3][4][5]
Table S1: Plasmids used for heterologous expression of the βHAD variants.
Four candidates were selected for experimental validation of thermostability: two candidates from Thermus thermophilus (Tt-βHAD-1 and Tt-βHAD-2), one candidate from Thermocrinis albus (Ta-βHAD), and an archaeal candidate with a known crystal structure from Pyrobaculum calidifontis (Pc-βHAD, PDB accession 3WS7). The four selected βHAD candidates share modest pairwise sequence identities between 26.5 and 35.7 % (Table S2). The wild-type sequences of the two candidates Pc-βHAD and Tt-βHAD-1 also contain a phenylalanine residue at standard position 250, which was described as a 'gate-keeper' for the access of bulkier substrates in previous works (Table S1).
Table S2: Pairwise sequence identities (%) between the four selected βHAD candidates. Pairwise Needleman-Wunsch alignments were performed using the implementation of the software suite EMBOSS (version 6.6.0) [6] with gap opening and extension penalties of 10 and 0.5, respectively, and the BLOSUM62 substitution matrix.
             Ta-βHAD    Tt-βHAD-1    Tt-βHAD-2
Pc-βHAD      35.7       26.5         32.6
Ta-βHAD                 26.9         27.9
Tt-βHAD-1                            33.1
The genes of the selected enzymes were synthesized by BioCat GmbH (Germany) with the catalytic lysine replaced by an aspartic acid (highlighted in green in the following sequence data). Codon-optimized synthetic genes were cloned into a pET-28 vector system.
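The identities in Table S2 are derived from pairwise global alignments. As a minimal sketch of how such a value can be computed once two sequences are aligned, the function below counts identical columns over the full alignment length, gap columns included, which is assumed here to match how EMBOSS needle reports identity. The function name and the toy alignment are hypothetical and are not taken from the βHAD sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the full alignment length (gap columns included)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment for illustration only.
toy_a = "MKV-LGAGRMG"
toy_b = "MRVALGLGNMG"
print(round(percent_identity(toy_a, toy_b), 1))  # 7 identical columns of 11 -> 63.6
```

Other tools normalize by the shorter sequence or by the number of aligned (non-gap) columns, so identities from different programs are not directly comparable.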

Generation of Ta-βHAD variants
pET28a_Ta-βHAD_K189D served as the starting template for the creation of several active-site variants. Site-directed mutagenesis was performed using the QuikChange Site-Directed Mutagenesis method. [7] The reaction mix and the thermocycler program (Mastercycler epgradient, Eppendorf) are listed in Tables S3 and S4, respectively. After the PCR, 2 µL (2 U) of DpnI was added directly to the reaction mix, and the methylated parental template DNA was digested for 1 h at 37 °C. Chemically competent E. coli JW5510 cells were transformed by heat shock. The entire volume of recovered cells was plated on LBCmp agar plates and incubated at 37 °C overnight. Overnight cultures of single clones were used for plasmid isolation with the Zyppy™ Plasmid Miniprep Kit (Zymo Research) according to the manufacturer's protocol. The introduced mutations were verified by Sanger sequencing at Eurofins (Ebersberg, Germany) with the standard primer(s) pTRCHis-RP and pTRCHis-RP.

Protein expression and purification
Pre-cultures in 5 mL LB (34 µg/mL Cmp) were inoculated with 100 µL of overnight cultures of sequenced single clones and incubated at 37 °C and 180 rpm for 20 h. The main culture, 800 mL TB (34 µg/mL Cmp) in 2 L baffled flasks (VWR), was inoculated with 1 mL of pre-culture and grown at 37 °C and 150 rpm to an OD of 0.7-0.8. Overexpression was induced by the addition of L-(+)-arabinose to a final concentration of 0.03 % (w/v), and the culture was incubated for 18 h at 25 °C and 150 rpm. After 18 h, the cells were placed on ice, and the cell suspension was centrifuged for 30 min at 10000 g (Avanti J-26S XP centrifuge, Beckman Coulter). The supernatant was discarded, and the cells were washed with 25 mL KPi buffer (50 mM, pH 7.5). The washed cells were then transferred to 50 mL plastic tubes (Sarstedt) and centrifuged for 30 min at 3220 g (Centrifuge 5810 R, Eppendorf). The supernatant was discarded. The cell pellets of the Ta-βHAD variants were resuspended in 2 mL of TRIS-HCl buffer (50 mM, pH 8) per gram of cell pellet. The resuspended cells were disrupted with a high-pressure homogenizer (EmulsiFlex C5, Avestin) at 4 °C for 3 cycles with 750-1000 bar counter-pressure. The disrupted cells were centrifuged for 30 min at 4 °C and 8000 g (Centrifuge 5810 R, Eppendorf). The obtained lysates were further processed or stored at -20 °C.
Since the Ta-βHAD variants are thermostable, they were purified using an optimized heat-treatment protocol, the final version of which is described here. The lysates of the Ta-βHAD variants were incubated at 57 °C for 60 min to precipitate E. coli proteins. The sample was then centrifuged for 30 min at 4 °C and 8000 g (Centrifuge 5810 R, Eppendorf) to remove the precipitated proteins. The supernatant was transferred into Vivaspin 6 centrifugal concentrators (10,000 MWCO PES, Sartorius) and centrifuged at 6000 g (Centrifuge 5810 R, Eppendorf) until approximately 1 mL remained. Then 5 mL of TRIS-HCl buffer (50 mM, pH 8.0, 10 % (w/v) L-sorbitol) was added and the sample was centrifuged again until approximately 1 mL remained; this step was performed twice to obtain the concentrated, re-buffered enzyme. The protein concentration of the βHAD variants was determined with the Pierce™ BCA Protein Assay Kit (Thermo Scientific) according to the manufacturer's protocol, using BSA at known concentrations as the calibration standard. The soluble expression of the Ta-βHAD variants (34 kDa) was validated via SDS-PAGE analysis.

Thermofluor assay
The thermal shift (ThermoFluor™) assay offers a quick and simple technique for assessing the thermal stability of proteins. SYPRO™ Orange was used as the fluorescent dye to track protein unfolding as a function of temperature. For this purpose, 20 µL of purified enzyme variant and 5 µL of SYPRO™ Orange dye were mixed in 96-well plates for real-time PCR (twin.tec PCR Plate 96, Eppendorf AG, Hamburg, Germany) to a final assay concentration of about 2 µM protein and 5x dye, respectively. The plates were immediately covered with optically clear adhesive sheets (Optical Adhesive Covers, Applied Biosystems, Foster City, US) and centrifuged at room temperature and 900 g for 2 min to collect the solutions at the bottom of the wells and to remove bubbles. The assay plate was analyzed using the Mastercycler ep gradient real-time PCR instrument (Eppendorf AG, Hamburg) with the following parameters: plate layout: well; filter 520 nm: SYBR; sample volume: 25 µL; PCR program: hold at 20 °C for 1 min, then heat to 99 °C at a rate of 2.4 °C per minute; total program time: 33 min. The instrument software realplex records the increase in fluorescence with increasing temperature (melting curve); the inflection point of this curve corresponds to the melting temperature (TM) of the protein.
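The inflection point of a melting curve is the temperature at which the fluorescence rises most steeply, that is, where the first derivative dF/dT is at its maximum. The sketch below illustrates this with central finite differences on a synthetic sigmoidal curve. It is not the algorithm used by the realplex software, and the function name and the data are hypothetical.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature with the steepest fluorescence increase.

    The slope is estimated with central differences, so only interior
    points of the curve are considered.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equally long series with at least 3 points")
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic two-state unfolding curve with its midpoint at 72 °C (illustration only).
temps = list(range(20, 100))
curve = [1.0 / (1.0 + math.exp((72 - t) / 2.0)) for t in temps]
print(melting_temperature(temps, curve))  # 72
```

Real melting curves are noisy and often show a fluorescence decay after unfolding, so smoothing the data or fitting a Boltzmann sigmoid before taking the derivative is usually advisable.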

Size-exclusion chromatography
Size exclusion chromatography for the analysis of the oligomerization state was performed on an Agilent 1260 Infinity II LC System using a Yarra 3µ SEC-200 column (300 x 4.6 mm, Phenomenex). The gel filtration standard #1511901 from Bio-Rad was used for calibration. The method is described in Table S6.
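To estimate the oligomerization state from such a calibration, the logarithm of the molecular mass of the standard proteins is commonly fitted as a linear function of retention time. The sketch below assumes the nominal component masses of the Bio-Rad gel filtration standard (670, 158, 44, 17, and 1.35 kDa) and uses hypothetical retention times; the function names are illustrative, and the actual method is the one given in Table S6.

```python
import math

# Nominal masses (kDa) of the gel filtration standard components (assumed values).
STANDARD_MASSES_KDA = [670.0, 158.0, 44.0, 17.0, 1.35]


def fit_calibration(retention_min, masses_kda):
    """Least-squares fit of log10(mass) = slope * retention + intercept."""
    logs = [math.log10(m) for m in masses_kda]
    n = len(retention_min)
    mean_x = sum(retention_min) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in retention_min)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(retention_min, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(retention_min, slope, intercept):
    """Apparent molecular mass (kDa) for a given retention time."""
    return 10 ** (slope * retention_min + intercept)


# Hypothetical retention times (min) of the five standards, for illustration only.
retention = [5.9, 6.9, 7.8, 8.5, 10.3]
slope, intercept = fit_calibration(retention, STANDARD_MASSES_KDA)
mass = estimate_mass_kda(8.0, slope, intercept)
# 34 kDa is the monomer mass of the Ta-βHAD variants.
print(f"apparent mass: {mass:.0f} kDa, about {mass / 34.0:.1f} monomers")
```

The ratio of apparent mass to monomer mass is only a rough indicator, because the retention time also depends on the shape of the protein.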

Thermofluor assay
The Thermofluor assay for the analysis of thermostability was performed using the realplex2 Mastercycler (Eppendorf) and SYPRO Orange (Thermo Fisher Scientific). The purified enzyme was mixed with 2.5x SYPRO Orange, and the fluorescence was measured at 520 nm. The temperature program is described in Table S7.

Biotransformations in co-solvents
Biotransformations in co-solvents were performed in 50 mM TRIS-HCl pH 8.0 containing the respective co-solvent (5 %, 10 %, 25 %, or 50 % v/v) with purified enzyme (2 mg/mL) at 25 °C for 24 h. For the biotransformations in methanol, samples were taken and analyzed after 5 h. The reaction mixture consisted of 10 mM substrate, 2.5 mM NADPH, 25 mM glucose-6-phosphate, and 5.0 U/mL glucose-6-phosphate dehydrogenase. Negative controls contained no βHAD, no substrate, or no NADPH. After 24 h, the reactions were stopped and analyzed as described elsewhere. Product formation was normalized to the product formation without any co-solvent.
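The normalization to the co-solvent-free reaction amounts to expressing each product concentration as a percentage of the reference value. A minimal sketch follows; the function name and all concentrations are hypothetical and serve only to illustrate the calculation.

```python
def relative_product_formation(product_mM, reference_mM):
    """Express product concentrations as a percentage of the co-solvent-free reference.

    product_mM maps the co-solvent fraction (% v/v) to the measured product concentration.
    """
    if reference_mM <= 0:
        raise ValueError("reference product formation must be positive")
    return {fraction: 100.0 * value / reference_mM for fraction, value in product_mM.items()}


# Hypothetical product concentrations (mM) after 24 h, for illustration only.
without_cosolvent = 4.0
with_cosolvent = {5: 3.6, 10: 3.0, 25: 1.0, 50: 0.2}
for fraction, percent in relative_product_formation(with_cosolvent, without_cosolvent).items():
    print(f"{fraction} % (v/v): {percent:.0f} % of reference")
```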

Biotransformations
Biotransformations were performed in 1.5 mL reaction tubes (Sarstedt) with heat-purified enzyme and a glucose-6-phosphate dehydrogenase cofactor regeneration system at 25 °C and 600 rpm (Thermomixer comfort, Eppendorf). For all generated Ta-βHAD variants, the reactions were run in triplicate, stopped at time points 0 h, 4 h, and 24 h, and analyzed as described elsewhere. The reaction mixture can be found in Table . Buffer or empty vector served as negative controls.
Sample preparation and extraction
Biotransformations with substrate 3 were analyzed with GC-FID. The biotransformations were stopped (at the desired time point) by adding 50 µL of 5 M NaOH to each 150 µL biotransformation reaction. Then 400 µL of methyl-tert-butyl ether (MTBE) was added as the extracting agent, with 1 mM 3-methylpiperidine as an internal standard. The samples were immediately vortexed for 2 min and centrifuged for 5 min at 16000 g (Centrifuge 5415 R, Eppendorf) for phase separation. Then 200 µL of the upper organic phase was transferred to GC vials with an inlet (WICOM).
For the analysis of the biotransformations via GC, the GC-2010 Plus gas chromatograph (Shimadzu) with a flame ionization detector (FID) was used. The applied conditions for substrates and corresponding products are shown in Table S7.
Biotransformations with substrate 2a were analyzed with normal-phase (NP) HPLC. The biotransformations were stopped (at the desired time point) by adding 50 µL of 5 M NaOH to each 150 µL biotransformation reaction. Then 400 µL of a cyclohexane-isopropanol mixture (ratio 70:30) was added as the extracting agent, with 0.2 mM or 1 mM 1-acetonaphthone as an internal standard. The samples were immediately vortexed for 2 min and centrifuged for 5 min at 16000 g (Centrifuge 5415 R, Eppendorf) for phase separation. Then 200 µL of the upper organic phase was transferred to HPLC vials with an inlet (WICOM). For the analysis of the biotransformations via NP-HPLC, an Agilent 1200 series HPLC system composed of a degasser (G1379B, Agilent 1260 Infinity), quaternary pump (G1311A, Agilent 1200 series), autosampler (G1329A, Agilent 1200 series), thermostated column compartment (G1316A, Agilent 1200 series), and diode array detector (G1315D, Agilent 1200 series) was used. The applied conditions for substrates and the corresponding products are shown in Table SX.
Product formation was quantified by dividing the peak area of the product by the peak area of the internal standard (IS). The resulting ratio was then converted to a product concentration using a product standard curve.
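The quantification can be sketched as follows. The example assumes a linear standard curve through the origin (area ratio = response factor × concentration), which is a simplification; the shape of the curve actually used is not stated here. Function names, concentrations, and peak areas are hypothetical.

```python
def response_factor(standard_conc_mM, standard_ratios):
    """Slope of a through-origin least-squares fit: ratio = factor * concentration."""
    numerator = sum(c * r for c, r in zip(standard_conc_mM, standard_ratios))
    denominator = sum(c * c for c in standard_conc_mM)
    if denominator == 0:
        raise ValueError("standard concentrations must not all be zero")
    return numerator / denominator


def product_concentration(product_area, is_area, factor):
    """Convert a product/IS peak-area ratio into a concentration (mM)."""
    if is_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return (product_area / is_area) / factor


# Hypothetical standard curve: product standards (mM) and their product/IS area ratios.
standards_mM = [0.5, 1.0, 2.5, 5.0, 10.0]
standard_ratios = [0.11, 0.20, 0.52, 1.01, 2.00]
factor = response_factor(standards_mM, standard_ratios)

# Hypothetical sample peak areas (arbitrary units).
concentration = product_concentration(product_area=84000.0, is_area=120000.0, factor=factor)
print(f"product concentration: {concentration:.2f} mM")
```

Normalizing to the internal standard compensates for variation in extraction efficiency and injection volume between samples.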

Molecular modeling
To visualize the residues selected for mutagenesis, a homology model of Ta-βHAD was generated. For this purpose, the Modeller [8,9] plugin PyMod 2.0 [10] was used with PyMOL 1.8. [11] The protein sequence (WP_012992122) of Ta-βHAD without His-tag and HRV 3C protease cleavage site was used as input for a BLAST search [12] of the PDB. [13] The crystal structures of the sequence homologs identified in this way, 3DOJ (30.0% sequence identity), 3PEF (33.1% sequence identity), and 3WS7 (35.7% sequence identity), were fetched, and the sequences of their chains were imported and aligned with the Ta-βHAD sequence via MUSCLE. [14] Finally, a monomeric homology model was generated using this cluster as a template (Figure S13), and its dimeric form was obtained by superposition of two monomeric copies onto 3PEF_B and 3PEF_D, respectively (Figure S13). The NADP+ cofactor was taken from 3WS7 and adapted via the PyMOL builder to provide a functional NADPH cofactor. The atom names were adapted manually. As all template sequences displayed a low sequence identity, energy minimization was performed to optimize the dimeric model. Parameters for NADPH were calculated with antechamber. [15] Parameter XML files were generated from the parameters listed in the MOL2 and FRCMOD files. The pKa values of the side chains were calculated using PROPKA [16,17] as provided by the PDB2PQR server (version 2.0.0). [18] A pH of 8 and the PARSE force field were used. According to these results, all residues differing from the standard protonation state (D89, D114, K139, E179, K215, K258, E272) were protonated or deprotonated manually. The simulations were performed using OpenMM 7.4.1 [19] on the NVIDIA CUDA GPU platform. [20] The General Amber force field (GAFF) and the Amber14 force field were used. [21,22] The model was solvated with water (TIP4P-Ew water model) [23] in a cubic box with a padding of 1.5 nm, the protein charge was neutralized, and an ionic strength of 0.1 M NaCl was applied.
Energy minimization was performed to an energy tolerance of 10 kJ/mol. A reference temperature of 300 K, a pH of 8, the Langevin integrator with a friction coefficient of 1/ps, and a step size of 2 fs were used. [24] In the resulting model, the mutations of the single variant TA1 (K189D), triple variant TA6 (K189D/N192L/N193L), quadruple variant TA16 (K189D/F250A/N192L/D258A), and quintuple variant TA20 (K189D/F250A/N193L/N192L/D258S) were introduced using the PyMOL mutagenesis tool; all variant numbering includes the His-tag and the HRV 3C cleavage site.
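The assignment of non-standard protonation states from predicted pKa values can be illustrated with the Henderson-Hasselbalch relation: a titratable group is predominantly protonated when its pKa lies above the pH. The sketch below flags acidic residues predicted to be protonated and lysine or arginine residues predicted to be deprotonated at pH 8. The pKa values are hypothetical and are not the PROPKA results for Ta-βHAD; histidine and the termini are omitted for brevity.

```python
ACIDIC = {"ASP", "GLU"}  # standard state at pH 8: deprotonated
BASIC = {"LYS", "ARG"}   # standard state at pH 8: protonated


def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch: fraction of the group carrying the titratable proton."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


def nonstandard_residues(pkas, ph=8.0):
    """Return (residue name, number, state) for residues deviating from the standard state."""
    flagged = []
    for (name, number), pka in pkas.items():
        mostly_protonated = protonated_fraction(pka, ph) > 0.5
        if name in ACIDIC and mostly_protonated:
            flagged.append((name, number, "protonated"))
        elif name in BASIC and not mostly_protonated:
            flagged.append((name, number, "deprotonated"))
    return flagged


# Hypothetical predicted pKa values, for illustration only.
predicted = {
    ("ASP", 89): 8.9,
    ("GLU", 100): 4.3,
    ("LYS", 139): 7.2,
    ("LYS", 50): 10.4,
}
print(nonstandard_residues(predicted))
# [('ASP', 89, 'protonated'), ('LYS', 139, 'deprotonated')]
```

Residues whose pKa lies close to the pH populate both states to a similar extent, so a fixed assignment for such residues is an approximation.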