Enzymology of Pyran Ring A Formation in Salinomycin Biosynthesis

Tetrahydropyran rings are a common feature of complex polyketide natural products, but much remains to be learned about the enzymology of their formation. The enzyme SalBIII from the salinomycin biosynthetic pathway resembles other polyether epoxide hydrolases/cyclases of the MonB family, but SalBIII plays no role in the conventional cascade of ring opening/closing. Mutation in the salBIII gene gave a metabolite in which ring A is not formed. Using this metabolite in vitro as a substrate analogue, SalBIII has been shown to form pyran ring A. We have determined the X-ray crystal structure of SalBIII, and structure-guided mutagenesis of putative active-site residues has identified Asp38 and Asp104 as an essential catalytic dyad. The demonstrated pyran synthase activity of SalBIII further extends the impressive catalytic versatility of α+β barrel fold proteins.


General analytical procedures
NMR data were collected in CD3CN on Bruker Avance spectrometers using either a 500 DCH cryoprobe operating at 500.05 MHz for ¹H and 125.7 MHz for ¹³C (¹³C, TOCSY) or a 500 TCI cryoprobe operating at 500.13 MHz for ¹H and 125.8 MHz for ¹³C. Chemical shifts were recorded using an internal deuterium lock for ¹³C and residual ¹H in CD3CN (δH 1.94, δC 118.26) and are given in ppm on a scale relative to δTMS = 0. NMR spectra were processed using Bruker Topspin (v. 3.2). DQF-COSY spectra were acquired with 4k data points in F2 and 512 increments with 2 scans per increment. TOCSY spectra were acquired using DIPSI2 modulation and a mixing time of 120 ms, with 8k data points in F2 and 360 increments with 4 scans per increment. The edited HSQC spectra were optimized for 145 Hz with 2k data points in F2, 512 increments and 4 scans per increment. HMBC spectra were optimized for 10 Hz with a three-fold low-pass J filter to suppress one-bond couplings; 4k data points were acquired in F2 with 768 increments and 8 scans per increment. Edited-HSQC-TOCSY spectra were optimized for 145 Hz using DIPSI2 modulation and a mixing time of 120 ms; 8k data points were acquired in F2, with 360 increments and 16 scans per increment. NOESY spectra were recorded using a mixing time of 0.6 s, 8k data points in F2 with 768 increments and 4 scans per increment. The HSQC-HECADE spectrum was recorded with 8k data points in F2, 512 increments and 36 scans per increment. The J (scale) factor was 1, the TOCSY mixing time was 120 ms and the spectrum was optimised for ¹J(C-H) = 145 Hz. The G-BIRD-HSQMBC experiment was recorded with 8k data points in F2, 480 increments with 40 scans per increment and a delay time τ/2 of 1/8J, and was optimised for ¹J(C-H) = 145 Hz. All data were zero filled to 1k or 2k in F1 for processing.
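The relative acquisition cost of the 2D experiments listed above can be compared by multiplying the number of F1 increments by the number of scans per increment. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative, and only the increment and scan counts are taken from the text. Recycle delays are not reported, so no absolute experiment times are estimated.

```python
# (F1 increments, scans per increment) for each 2D experiment, as listed above.
ACQUISITION = {
    "DQF-COSY": (512, 2),
    "TOCSY": (360, 4),
    "edited HSQC": (512, 4),
    "HMBC": (768, 8),
    "edited HSQC-TOCSY": (360, 16),
    "NOESY": (768, 4),
    "HSQC-HECADE": (512, 36),
    "G-BIRD-HSQMBC": (480, 40),
}


def total_transients(increments: int, scans: int) -> int:
    """Total number of FIDs acquired for one 2D experiment."""
    return increments * scans


# Hypothetical summary table: experiment name -> total transients.
TRANSIENTS = {name: total_transients(*params) for name, params in ACQUISITION.items()}

if __name__ == "__main__":
    # Print the experiments from cheapest to most expensive.
    for name, n in sorted(TRANSIENTS.items(), key=lambda item: item[1]):
        print(f"{name:18s} {n:6d}")
```

On this measure the two experiments used to extract heteronuclear coupling constants (HSQC-HECADE and G-BIRD-HSQMBC) required the most transients.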
HPLC-MS analysis was performed using an HPLC (Hewlett Packard, Agilent Technologies 1100 series) coupled to a Finnigan MAT LTQ mass spectrometer fitted with an electrospray ionization (ESI) source. HPLC-MS data were processed and deconvoluted using Xcalibur (v. 1.1) (Thermo Finnigan).
For analysis of small molecules the HPLC was fitted with a Prodigy 5µ C18 column (250 mm × 4.6 mm, Phenomenex). A solvent system of methanol and water, both containing 0.1% formic acid (v/v), was used. Samples were eluted with a linear gradient of 85 to 100% methanol over 20 min, then 100% methanol over 10 min at a flow rate of 0.7 mL min⁻¹ (method A). Alternatively, a linear gradient of 85 to 100% methanol over 15 min, then 100% methanol over 8 min at a flow rate of 1 mL min⁻¹ was used (method B). The mass spectrometer was run in positive ionization mode, scanning from m/z 150 to 1800, and the collision energy was set to 15%. ESI high-resolution MS (ESI-HRMS) was carried out on a Thermo Fisher Orbitrap at a resolution of 60,000 and a normalized collision energy of 15%.
For protein mass determination the HPLC was fitted with a Jupiter C4 column (250 mm × 2 mm, 5 µm, Phenomenex). A solvent system of acetonitrile and water, both containing 0.1% (v/v) trifluoroacetic acid, was used. Samples were eluted with a linear gradient of 5 to 35% acetonitrile over 10 min, then from 35 to 95% acetonitrile over 15 min at a flow rate of 0.3 mL min⁻¹.
Preparative HPLC purification was performed using a Luna C18(2) column (250 mm × 21.2 mm, 10 µm, Phenomenex) connected to an Agilent 1200 series HPLC. Samples were eluted at a flow rate of 15 mL min⁻¹ using the following method: water and acetonitrile as solvents with a gradient of 90 to 95% acetonitrile over 10 min, then 95 to 100% acetonitrile over 20 min, followed by a 100% acetonitrile wash for 15 min. Fractions were collected at 0.9 min intervals.
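The analytical gradients described above (methods A and B) are piecewise-linear programs, so the methanol content at any time point follows from linear interpolation between the stated breakpoints. The sketch below illustrates this; the names `METHOD_A`, `METHOD_B` and `percent_organic` are illustrative, and any instrument dwell volume or re-equilibration step (not given in the text) is ignored.

```python
from bisect import bisect_right

# Breakpoints as (time in min, % methanol), taken from the method descriptions.
METHOD_A = [(0.0, 85.0), (20.0, 100.0), (30.0, 100.0)]  # 0.7 mL/min
METHOD_B = [(0.0, 85.0), (15.0, 100.0), (23.0, 100.0)]  # 1 mL/min


def percent_organic(program, t_min):
    """Linearly interpolate the % organic solvent at time t_min."""
    times = [t for t, _ in program]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time lies outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(program) - 1:
        # Exactly at the final breakpoint.
        return program[-1][1]
    (t0, p0), (t1, p1) = program[i], program[i + 1]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


def solvent_volume_ml(program, flow_ml_per_min):
    """Total mobile phase consumed over the whole program."""
    return (program[-1][0] - program[0][0]) * flow_ml_per_min


if __name__ == "__main__":
    print(percent_organic(METHOD_A, 10.0))   # halfway through the method A ramp
    print(solvent_volume_ml(METHOD_A, 0.7))  # mobile phase used per run, in mL
```

For example, a compound eluting at 10 min under method A leaves the column at 92.5% methanol, whereas the same composition is reached at 7.5 min under method B.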

Materials, DNA isolation and manipulation
Plasmids, strains and oligonucleotides (Invitrogen) used in this work are summarized in Tables S2, S3 and S4, respectively.
Restriction endonucleases, calf intestinal alkaline phosphatase, Phusion® high-fidelity PCR master mix with GC buffer (for cloning), and Gibson assembly® master mix were purchased from New England Biolabs (NEB). Biomix™ Red PCR master mix (for screening purposes) was purchased from Bioline. T4 DNA ligase and fast digest restriction endonucleases were supplied by Fermentas. Lysozyme powder was purchased from Amresco®. Proteinase K powder was manufactured by Melford Laboratories Ltd. All chemicals were from Sigma-Aldrich. All organic solvents used were HPLC grade.
Plasmid DNA was isolated from an overnight culture using the E.Z.N.A.® Plasmid Mini Kit I (Omega Bio-Tek) according to the manufacturer's protocol. High molecular weight genomic DNA from Streptomyces strains was isolated using the salting out procedure. [1] Purification of DNA fragments from agarose gels was performed using the KeyPrep Gel DNA Clean Up Kit (Anachem) according to the manufacturer's instructions. Site-directed mutagenesis was performed according to modified QuikChange® site-directed mutagenesis kit (Stratagene) guidelines. DNA sequencing was carried out by the DNA Sequencing Facility in the Department of Biochemistry, University of Cambridge.

Bacterial strains and culture conditions
Streptomyces albus strains were grown in TSBY liquid medium (3% tryptone soy broth, 10.3% sucrose, 0.5% yeast extract) for isolation of genomic DNA, and on SFM solid medium (2% mannitol, 2% soya flour, 2% agar) for conjugation and strain maintenance. For liquid cultures, the strains were grown at 30°C with shaking at 220 rpm in a rotary incubator for 36-44 h. For solid culture, the strains were grown at 30°C for 10-12 days.

General strategy for vector construction for in-frame gene deletion
Recombinant plasmids for in-frame gene deletions were constructed by ligation of two PCR-amplified DNA fragments (about 2 kb) from the upstream and downstream flanks of the target gene into the pYH7 [2] vector digested with NdeI (Figure S13). The Gibson assembly method was used to perform a three-piece ligation. [3] Recombinant plasmids were introduced by conjugation into S. albus DSM 41398. The donor strain was E. coli ET12567/pUZ8002, and conjugation was carried out as described by Luhavaya et al. [4] Potential mutants were checked by PCR analysis (Figure S14).
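The logic of the PCR check can be illustrated numerically. The sketch below uses the band sizes quoted in the Figure S14 caption for the salC control PCR (532 bp for the deletion, 1,825 bp for the wild type) and assumes that the same primer pair flanks the deletion in both templates and that no scar sequence was introduced, so that the difference in band size equals the length of the deleted segment. The function names are illustrative.

```python
def deleted_length_bp(wild_type_band_bp: int, mutant_band_bp: int) -> int:
    """Length of the deleted segment inferred from two PCR band sizes.

    Assumes the same primer pair was used on both templates and that
    nothing was inserted at the deletion site.
    """
    if mutant_band_bp >= wild_type_band_bp:
        raise ValueError("mutant band should be shorter than the wild-type band")
    return wild_type_band_bp - mutant_band_bp


def is_in_frame(length_bp: int) -> bool:
    """A deletion preserves the reading frame if it removes whole codons."""
    return length_bp % 3 == 0


if __name__ == "__main__":
    removed = deleted_length_bp(1825, 532)
    print(removed, is_in_frame(removed), removed // 3)
```

Under these assumptions the salC deletion removes 1,293 bp, a whole number of codons (431), as expected for an in-frame deletion.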

Expression and purification of recombinant proteins
The expression plasmid pET-SalBIII_C (Table S2)  The protein yield of SalBIII was 45 mg L⁻¹. The purified proteins were examined on pre-cast NuPAGE® Novex® 4-12% Bis-Tris gels with NuPAGE® MES SDS running buffer (Thermo Scientific™) and by HPLC-MS (Figure S30).

Enzyme activity assay
Activity assays for SalBIII were carried out in a total reaction volume of 100 µL containing 2 µL of compound 2 (dissolved in DMSO, concentration unknown) and SalBIII at a final concentration of 50 µM. Reaction mixtures were incubated at 37°C for 20 h in storage buffer (pH 6.8). The assay mixture was extracted with ethyl acetate (3 × 400 µL). After evaporation of the solvent, the samples were resuspended in 50 µL of methanol and analyzed by HPLC-MS. A reaction mixture without added enzyme served as a control.
The substrate for the assay (compound 2) was purified using an analytical C18 column because of the low production yield of the metabolite (Section 1.1, HPLC method A). After 24 injections (dried ethyl acetate extract from 0.5 L of culture redissolved in 1.3 mL of methanol) with manual fraction collection, the amount of compound 2 was still below the quantification limit but sufficient for MS detection.
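The assay composition above can be checked with simple dilution arithmetic (C1V1 = C2V2). The following is a minimal sketch; the enzyme stock concentration of 500 µM is a hypothetical value chosen for illustration and is not reported in the text, and the function names are likewise illustrative.

```python
def stock_volume_ul(final_conc_um: float, total_volume_ul: float,
                    stock_conc_um: float) -> float:
    """Volume of stock needed to reach a final concentration (C1V1 = C2V2)."""
    if stock_conc_um < final_conc_um:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc_um * total_volume_ul / stock_conc_um


def percent_v_v(part_ul: float, total_ul: float) -> float:
    """Volume fraction of one component, as a percentage."""
    return 100.0 * part_ul / total_ul


if __name__ == "__main__":
    # Hypothetical 500 µM SalBIII stock; 50 µM final in 100 µL (from the text).
    print(stock_volume_ul(50.0, 100.0, 500.0))
    # 2 µL of DMSO solution in a 100 µL reaction (from the text).
    print(percent_v_v(2.0, 100.0))
```

With the stated volumes, the DMSO content of the reaction is 2% (v/v), and resuspending the extract in 50 µL of methanol concentrates the sample two-fold relative to the 100 µL reaction.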

Culture extraction for HPLC-MS analysis of metabolites
Samples (500 µL) of culture broth of S. albus strains were extracted with 500 µL of ethyl acetate. The solvent was evaporated, the residue was redissolved in 500 µL of methanol and the mixture was centrifuged before being subjected to HPLC-MS analysis. 5 µL of the solution was injected.

Purification of S. albus ∆salC metabolite 3
A 7-day-old 1.7 L culture broth of the S. albus DSM 41398 ∆salC mutant [6] strain was extracted twice with 2 L of ethyl acetate. The combined organic layers were dried over MgSO4 and the solvent was evaporated, yielding 6.7 g of an oily residue. The latter was redissolved in 0.5 L of hexane and extracted three times with 0.5 L of a methanol/water (4:1, v/v) mixture. The methanol/water fractions were combined, and methanol was removed by evaporation at reduced pressure. The remaining water layer was extracted three times with 500 mL of ethyl acetate. The combined extracts were dried over anhydrous MgSO4 and the solvent was removed in vacuo to give 5.5 g of oily residue. To remove the remaining oil, the sample was further purified by flash chromatography on a column of silica gel (60 µm particle size, 20 cm × 3 cm). The column was washed with 300 mL of a 1:1 (v/v) mixture of hexane/ethyl acetate, and compounds were eluted with 2 L of ethyl acetate/methanol (19:1, v/v). The combined fractions contained 1.6 g of a mixture containing compound 3. Further purification was achieved by repeated rounds of preparative HPLC. Final fractions were combined and desalted using a Chromabond C18 EC column (Macherey-Nagel), yielding around 5.2 mg of 3 (Figure S29). All stages of purification were monitored by direct injection into the Finnigan MAT LTQ mass spectrometer or by HPLC-MS analysis.
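The masses reported at each stage of the purification can be summarised as the fraction of material carried forward and as an isolated titre per litre of culture. The sketch below uses only the values stated above; the names are illustrative. Note that the percentages describe total mass retained (mostly removal of oil and other metabolites), not the recovery of compound 3 itself, which was not quantified at the intermediate stages.

```python
CULTURE_VOLUME_L = 1.7

# (stage, mass in grams), as reported in the text.
STAGES = [
    ("crude ethyl acetate extract", 6.7),
    ("after hexane / methanol-water partition", 5.5),
    ("after silica gel flash chromatography", 1.6),
    ("after preparative HPLC and desalting", 0.0052),
]


def mass_retained_percent(stages):
    """Percent of total mass carried over from each stage to the next."""
    return [
        (later_name, 100.0 * later_mass / earlier_mass)
        for (_, earlier_mass), (later_name, later_mass) in zip(stages, stages[1:])
    ]


def isolated_titre_mg_per_l(final_mass_g: float, culture_volume_l: float) -> float:
    """Isolated mass of product per litre of culture broth."""
    return 1000.0 * final_mass_g / culture_volume_l


if __name__ == "__main__":
    for name, pct in mass_retained_percent(STAGES):
        print(f"{name}: {pct:.2f}% of previous stage")
    print(f"{isolated_titre_mg_per_l(STAGES[-1][1], CULTURE_VOLUME_L):.2f} mg per L")
```

The isolated amount corresponds to roughly 3 mg of compound 3 per litre of culture.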

H/D-exchange in small organic molecules
Residual solvent was removed from the sample under a flow of nitrogen. The sample was redissolved in deuterated methanol and incubated at room temperature for 24 h. The sample was then used for direct injection into the Finnigan MAT LTQ mass spectrometer (Figures S20, S26).

Crystallization and data collection
The best rod-shaped crystals could be reproduced using the hanging-drop method in Linbro plates; they appeared after 2-4 days in a crystallization condition composed of 25% PEG400, 20% PEG3350, 0.1 mM MgCl2 and 0.1 mM Tris pH 8.0. Prior to data collection, SalBIII crystals were cryo-cooled using a cryo-mixture composed of 20% PEG400 and 80% of the well crystallization condition. X-ray diffraction data were collected at the P13 station on PETRA III, Hamburg, Germany, using a wavelength of 0.977 Å and a PILATUS 6M pixel detector. The images were integrated using XDS [7] in the space group I2₁2₁2₁ and at a resolution of 1.8 Å. The statistics of data collection and refinement are detailed in Table S10.
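For the diffraction data described above, the scattering angle at the resolution limit follows from Bragg's law, λ = 2d sin θ. The following is a minimal sketch using the stated wavelength (0.977 Å) and resolution (1.8 Å); the function name is illustrative, and detector distance and geometry, which are not given in the text, are not considered.

```python
import math


def bragg_two_theta_deg(wavelength_a: float, d_spacing_a: float) -> float:
    """Scattering angle 2θ (degrees) for a given d-spacing, from λ = 2d·sin θ."""
    ratio = wavelength_a / (2.0 * d_spacing_a)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Wavelength and resolution limit as stated in the text.
    print(f"{bragg_two_theta_deg(0.977, 1.8):.2f} degrees")
```

Reflections at the 1.8 Å limit are therefore recorded at a scattering angle 2θ of about 31.5°.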

Structure determination and refinement
The structure of SalBIII was solved by molecular replacement using the methods implemented in the program Phaser [8] from the CCP4 suite. [9] The structure of the C-terminal domain of Lsd19 (PDB code: 3RGA) was used as a search model. [10] The refinement was carried out using the program REFMAC5 [11] or Phenix.refine [12] from the Phenix suite [13] (version 1.8.4-1496). The structure was manually rebuilt and visualized using the program COOT 0.7.2. [14] The structure was validated using the program MolProbity. [15] Visual analysis was also performed using the program COOT 0.7.2 [14] and the figures were prepared using the program PyMol (Schrödinger, LLC). Data collection, refinement and validation statistics are presented in Table S10. The SalBIII atomic coordinates and structure factors have been deposited in the Protein Data Bank, PDB code: 5CXO.
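The refinement statistics in Table S10 are based on the crystallographic R-factor, the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes, with Rfree computed in the same way on a random 5% of reflections excluded from refinement. The sketch below illustrates the definition only; it is not the implementation used by REFMAC5 or Phenix, it omits the scale factor that refinement programs apply between observed and calculated amplitudes, and the function names and toy amplitudes are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly assign reflection indices to a working set and a free set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = max(1, round(free_fraction * n_reflections))
    return sorted(indices[n_free:]), sorted(indices[:n_free])


if __name__ == "__main__":
    # Toy amplitudes for illustration only (not data from this study).
    f_obs = [100.0, 50.0, 25.0]
    f_calc = [90.0, 55.0, 25.0]
    print(round(r_factor(f_obs, f_calc), 4))
    work, free = split_free_set(200)
    print(len(work), len(free))
```

Because the free set is never used to adjust the model, a large gap between Rfree and the working R-factor is a common indicator of over-fitting.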

NMR Analysis of Compound 3
Inspection of the ¹³C, DEPT and ¹H spectra indicated the presence of two ketones, one carboxylic acid derivative, two trisubstituted double bonds and an acetal. TOCSY and HSQC-TOCSY spectra allowed the identification of seven spin systems (shown below in bold); their connectivity was then elucidated using COSY data, and the location of the two rings was established from HMBC correlations.
While no TOCSY correlation could be established between H8 and H9, a weak COSY coupling and several HMBC correlations allowed spin systems 1 and 2 to be connected. HMBC correlations then connected the remaining spin systems along with the six unprotonated carbons to give the flat structure (Figure S1). This contains the C1-C16 portion exactly as found in salinomycin and in the salinomycin analogues [4] (Figure S15), a hemiacetal at C17, an alcohol at C19, a ketone at C21 and two trisubstituted double bonds at C24/C25 and C28/C29. Both double bonds were determined to be trans using NOESY data as shown in Figure S2 (note: owing to overlap of the H26 and H31 signals, the NOE correlations from this signal to H29 and H30 cannot be unambiguously assigned as being due to H26 and H31, respectively. However, an NOE correlation between H27/H29 along with the observed ¹H-¹³C couplings (H29/C27 ³J = 5 Hz and H29/C31 ³J = 8.2 Hz) all agree with a trans configuration. It therefore seems reasonable to assume that the ambiguous correlations are indeed those shown in the figure).
The structure of the A ring was determined using NOE and ¹H-¹H coupling constant data. The key NOE correlations were those between H2, H5a and H7, which placed all of these on one face, and between C4a and Me40, which placed these on the other. The configuration of the C2 stereocentre was harder to confirm: the large (10.8 Hz) coupling between H2 and H3 placed these in an antiperiplanar arrangement, precluding the use of J-based configurational analysis, which cannot distinguish between the two possible diastereomers in this case. No useful NOEs were observed between the A ring and H41 either. However, comparison of the ¹H-¹H coupling constant and NOE correlation data in this region with those for natural salinomycin (1) showed a very good match, suggesting that the C2 stereocentre be assigned as R configured (Figure S3).
The stereochemistry in the C7-C10 region was assigned using Murata's J-based configurational analysis method. [16] The required ¹H-¹³C coupling constants were obtained from HSQC-HECADE [17] and G-BIRD-HSQMBC [18] experiments, the latter being necessary to overcome the small scalar coupling between H8 and H9.
Expected value for undetermined (nd) coupling constant is shown in italics.
these protons; however, J-based configurational analysis was unable to determine fully the configuration at C12, as the H13-C11 and H13-C36 coupling constants could not be determined. An NOE between H10 and H13, however, served to put H13 and C11 gauche and confirm the R configuration at C12, in agreement with natural salinomycin (Figures S5, S6).

Figure S5. Conformation of the B ring (NOE correlations in blue). Expected value for undetermined (nd) coupling constant is shown in italics.
Figure S6. Conformation about the C12-C13 bond (NOE correlations in blue). Data fit an A2 or B2 conformer; NOE correlations favour B2.
The remote stereocentre at C19 was linked to that at C17 via the diastereotopic protons at C18. First, the conformation about the C18/C19 bond was established by J-based configurational analysis. A large (10.6 Hz) coupling placed H18b and H19 in an anti orientation, and a series of small ³J coupling constants between H18b/C20, H18a/C20 and H18a/H19 indicated that all these pairs were gauche (Figure S7). The small ²J coupling between H18a/C19 and the large ²J coupling between H18b/C19 confirmed that H18a lay anti to the C19 hydroxyl group and thus established the relative configuration of the C18 protons with respect to the C19 stereocentre. The expected NOE correlations for this arrangement were observed (Figure S8). The conformation of the C17/C18 bond was then established. The G-BIRD-HSQMBC spectrum allowed the C16/H18a and C16/H18b coupling constants to be measured, and the small values for these indicated gauche relationships in each case (Figure S8)  C18/C19 bond, as shown in Figure S8, suggested that H19 should lie close to OH17, and an NOE correlation between these two protons was duly observed. The proposed conformation for this region is shown in Figure S8 and shows that an S configuration at C17 can be connected to an S configuration at C19, in agreement with the assignment of this centre in the E-15 and E-16 compounds from the S. albus ∆salE mutant (Figure S15) and with the prediction from bioinformatics analysis. [4]
Figure S7. Conformation about the C18-C19 bond; data fit a D1 conformation.
Figure S8. Conformation of the C16-C19 region. a) NOE correlations (note: flexibility about the C19-C20 bond means that H18a, H18b and H19 each show correlations to both C20 protons). b) ¹H-¹³C coupling constants between H18 and C16.
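The first step of the J-based analysis used above, sorting vicinal ¹H-¹H couplings into "large" (anti) and "small" (gauche) classes, can be sketched as a simple threshold test. The cut-off values below (at least 7 Hz for anti, at most 4 Hz for gauche) are illustrative assumptions and are not taken from this work; heteronuclear ²J and ³J couplings require their own ranges and are not covered here.

```python
# Illustrative cut-offs in Hz; assumed for this sketch, not reported in the text.
HH_ANTI_MIN_HZ = 7.0
HH_GAUCHE_MAX_HZ = 4.0


def classify_3j_hh(j_hz: float) -> str:
    """Classify a vicinal H-H coupling as 'anti', 'gauche' or 'ambiguous'."""
    magnitude = abs(j_hz)
    if magnitude >= HH_ANTI_MIN_HZ:
        return "anti"
    if magnitude <= HH_GAUCHE_MAX_HZ:
        return "gauche"
    # Intermediate values are often a sign of conformational averaging.
    return "ambiguous"


if __name__ == "__main__":
    # Couplings quoted in the text: H18b/H19 (10.6 Hz) and H2/H3 (10.8 Hz).
    for label, j in [("H18b/H19", 10.6), ("H2/H3", 10.8)]:
        print(label, classify_3j_hh(j))
```

Both couplings quoted in the text fall well inside the anti class under these assumed cut-offs, consistent with the assignments given above.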
The final proposed 3D structure of 3 is shown in Figure S9.

Tables
Table S1. NMR data for 3. All spectra were recorded in CD3CN.
$R_{\mathrm{factor}} = \sum \bigl| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \bigr| / \sum |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are the observed and calculated structure factor amplitudes, respectively.
c $R_{\mathrm{free}} = R_{\mathrm{factor}}$, but calculated using a random subset of the data (5%) that is excluded from the refinement.
Figure S10. Multiple sequence alignment of the polyether epoxide hydrolases, limonene epoxide hydrolase (LEH) and SalBIII. The catalytic amino acid dyad is marked by a red star and highlighted in yellow. The following proteins were used for alignment: lasalocid LasB [20] (contains two domains, LasB-N and LasB-C), nanchangmycin NanI [21] (contains two domains, NanI-N and NanI-C), monensin MonBI and MonBII, [22] nigericin NigBI and NigBII, [23] tetronomycin TmnB, [24] tetronasin TsnB (Dr. F. Huang, personal communication), LEH, [25] and salinomycin SalBI and SalBII. [6] The alignment was compiled using the Multalin program. [26]
Figure S11. Phylogenetic analysis of polyether epoxide hydrolases. Lasalocid LasB, [20] nanchangmycin NanI, [21] monensin MonBI and MonBII, [22] nigericin NigBI and NigBII, [23] tetronomycin TmnB, [24] tetronasin TsnB (Dr. F. Huang, personal communication), and salinomycin SalBI, SalBII and SalBIII [6] epoxide hydrolases are shown. The PhyML 3.1/3.0 program was used for the phylogenetic tree construction. [27]
Figure S12. Sequence alignment of SalBIII and the Cyc11 domain from indanomycin [28] biosynthesis. Amino acid residues proposed to constitute the active site cavity are shown in boxes. Asp38/28 and Asp104/94 are proposed to be a catalytic dyad (marked by a red star). The alignment was compiled using the Multalin program. [26]
Figure S13. Schematic illustration of the in-frame deletion of the salBIII gene in the salinomycin biosynthetic gene cluster. SalBIII and its truncated sequence are shown in blue.
control. Lanes 9, 10, 11, 12, 13 and 14 correspond to the PCRs with gDNA of the same potential double mutant colonies, but the salC_f/r [6] primer pair was used to confirm that the salC gene was deleted; lane 7 - negative control; lane 8 - positive control (plasmid pMY∆salC [6] was used as a DNA template in the PCR). If the salC gene has been deleted, a band of 532 bp is expected, whereas for the WT strain, a band of 1,825 bp is expected. 1kb Plus DNA Ladder (Fermentas) was used as a molecular DNA size marker.
Figure S15. Chemical structures of the metabolites from the S. albus ∆salE mutant. [4,29]
Figure S16. Proposed general structure for the metabolites observed in the S. albus ∆salBIII mutant. The structure is drawn in fully uncyclized form, as no conclusion about the cyclization pattern or the differences between the two isomers (B-8, B-11) can be drawn from MSⁿ analysis. Ring A is not present.