Programmed Iteration Controls the Assembly of the Nonanoic Acid Side Chain of the Antibiotic Mupirocin

Abstract
Mupirocin is a clinically important antibiotic produced by Pseudomonas fluorescens NCIMB 10586 that is assembled by a complex trans‐AT polyketide synthase. The polyketide fragment, monic acid, is esterified by a 9‐hydroxynonanoic acid (9HN) side chain which is essential for biological activity. The ester side chain assembly is initialised from a 3‐hydroxypropionate (3HP) starter unit attached to the acyl carrier protein (ACP) MacpD, but the fate of this species is unknown. Herein we report the application of NMR spectroscopy, mass spectrometry, chemical probes and in vitro assays to establish the remaining steps of 9HN biosynthesis. These investigations reveal a complex interplay between a novel iterative or “stuttering” KS‐AT didomain (MmpF), the multidomain module MmpB and multiple ACPs. This work has important implications for understanding the late‐stage biosynthetic steps of mupirocin and will be important for future engineering of related trans‐AT biosynthetic pathways (e.g. thiomarinol).

Fermentation cultures (3 × 100 ml in 500 ml baffled flasks) were inoculated with 1% seed culture in modified LB medium (1% Bacto tryptone, 0.5% yeast extract, 0.5% sodium chloride) supplemented with 4% w/v glucose and 0.5 g/ml IPTG. The cultures were incubated at 22 °C and 200 rpm for 50 h, then centrifuged at 8000 rpm for 15 min. The supernatant was extracted three times with EtOAc and the combined extracts were evaporated in vacuo to give a crude extract, which was resuspended in MeOH for LC-MS analysis.

General Experimental Procedures for chemical synthesis
All reagents and solvents were obtained from commercial sources and were used as purchased. Petroleum ether refers to the fraction boiling at 40–60 °C. Where anhydrous conditions were necessary, reactions were carried out in flame-dried glassware under a positive pressure of nitrogen using standard Schlenk syringe-septa techniques. Anhydrous solvents were obtained from an Anhydrous Engineering Ltd. modified Grubbs system of double alumina and alumina-copper catalysed drying columns. [2] Routine monitoring of reactions was conducted by analytical TLC using aluminium sheets precoated with silica (Merck Kieselgel 60 F254) with a suitable solvent system and visualised using 254 nm UV light and/or developed with potassium permanganate and heat. Flash column chromatography was performed according to the procedure of Still et al. [3] Infrared spectra were recorded on a Perkin Elmer Spectrum 100 FTIR spectrometer with an ATR diamond cell and frequencies are reported in wavenumbers (cm⁻¹). Only strong and selected absorbances are reported. Mass spectrometry was performed by the University of Bristol mass spectrometry service by electrospray ionisation (ESI) or atmospheric pressure chemical ionisation (APCI) using a Bruker MicrOTOF II or Thermo Scientific Orbitrap Elite spectrometer.
The residue (8.23 g, 50.8 mmol) was suspended in DMF (100 mL) and cooled to 0 °C. TBSCl (15.3 g, 102 mmol) and imidazole (11.4 g, 168 mmol) were added sequentially, and the reaction was stirred overnight at rt. The reaction was quenched with H2O (100 mL) and hexane (150 mL) was added. The layers were separated, and the aqueous layer was extracted with hexane (2 × 150 mL). The combined organics were dried over MgSO4 and concentrated in vacuo to give the diprotected product (10.19 g, 58%) as a colourless oil.
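The reagent quantities quoted above can be cross-checked with a short calculation. The following is a minimal sketch, not part of the original procedure: the molar masses are standard values for TBSCl and imidazole that are not given in the text, and the function names are illustrative only.

```python
# Molar masses in g/mol. These are standard values assumed here;
# they are not stated in the procedure itself.
MOLAR_MASS = {"TBSCl": 150.72, "imidazole": 68.08}


def mmol_from_mass(mass_g, molar_mass):
    """Convert a mass in grams to an amount in millimoles."""
    return 1000.0 * mass_g / molar_mass


def equivalents(reagent_mmol, substrate_mmol):
    """Molar equivalents of a reagent relative to the substrate."""
    return reagent_mmol / substrate_mmol


# Quantities quoted in the procedure.
substrate_mmol = 50.8
tbscl_mmol = mmol_from_mass(15.3, MOLAR_MASS["TBSCl"])
imidazole_mmol = mmol_from_mass(11.4, MOLAR_MASS["imidazole"])

print(f"TBSCl: {tbscl_mmol:.1f} mmol, "
      f"{equivalents(tbscl_mmol, substrate_mmol):.1f} equiv")
print(f"Imidazole: {imidazole_mmol:.1f} mmol, "
      f"{equivalents(imidazole_mmol, substrate_mmol):.1f} equiv")
```

Under these assumptions the stated amounts correspond to roughly 2 equivalents of TBSCl and roughly 3.3 equivalents of imidazole relative to the 50.8 mmol of residue.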

Acetonide of Crotonoyl pantetheine (S20)
According to GP03, reaction of crotonic acid S19 ( All data are in accordance with the literature. [11]

Materials and methods for molecular biology techniques
Reagents were purchased from Sigma-Aldrich, Thermo Fisher or Merck Millipore. E. coli competent cells were purchased from New England Biolabs (T7 Express and 5-alpha), Merck Millipore (Novagen BL21 (DE3)) or Agilent (ArcticExpress (DE3) RIL). All enzymes used were purchased from Thermo Fisher Scientific.

Plasmid generation, protein expression and purification.
Specific genes were amplified from P. fluorescens NCIMB 10586 genomic DNA using the primers listed in Table S1 and ref. [12].
Harvested cells were sonicated and the soluble fraction was purified by immobilized metal affinity chromatography (IMAC) using a HiTrap 5 ml HP Ni column (GE Healthcare). Protein was eluted using a linear gradient of 6–100% Buffer B (50 mM Tris-HCl, 500 mM NaCl, 10% (v/v) glycerol, 800 mM imidazole, pH 8.0). Eluted protein was further purified by size exclusion chromatography (SEC) using either a HiPrep 26/60 Sephacryl S100 or S200 column (GE Healthcare) in Buffer C (25 mM Tris-HCl, 150 mM NaCl, pH 7.5, 1 mM DTT) before protein concentration. MmpF, MmpF_C183 and MmpB_KS were purified by IMAC in Buffers A and B supplemented with 1 mM TCEP, then by SEC in Buffer A supplemented with 1 mM DTT.
MacpA was cleaved overnight using in-house TEV protease prior to SEC. Purified protein (50 µM) was analysed by analytical size exclusion chromatography using either a Superdex 75 10/300 or Superdex 200 Increase 10/300 GL column (GE Healthcare) calibrated with molecular weight standards (GE Healthcare) [13].
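Calibration of an analytical SEC column with molecular weight standards is commonly done by fitting log10(molecular weight) against elution volume and interpolating the sample. The sketch below illustrates that general approach only; it is not taken from this study. The standards listed are hypothetical placeholder values, and many workflows fit against the partition coefficient Kav instead of the raw elution volume.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards):
    """Fit log10(MW) against elution volume for (volume_ml, mw_da) pairs."""
    volumes = [volume for volume, _ in standards]
    log_mw = [math.log10(mw) for _, mw in standards]
    return fit_line(volumes, log_mw)


def estimate_mw(volume_ml, slope, intercept):
    """Apparent molecular weight (Da) for a given elution volume."""
    return 10 ** (slope * volume_ml + intercept)


# Hypothetical standards (elution volume in ml, molecular weight in Da).
# These are placeholders, not values measured in this work.
example_standards = [
    (9.5, 75000.0),
    (10.6, 44000.0),
    (11.9, 29000.0),
    (13.4, 13700.0),
    (14.9, 6500.0),
]

slope, intercept = calibrate(example_standards)
print(f"Apparent MW at 12.5 ml: {estimate_mw(12.5, slope, intercept):.0f} Da")
```

The apparent mass obtained this way depends on molecular shape as well as size, so elongated proteins such as ACPs can elute earlier than a globular standard of the same mass.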
For NMR studies, 15N-labelled protein was produced from cells grown to OD600 = 2.0 in LB media supplemented with carbenicillin (100 µg/ml) at 37 °C. Cells were then pelleted by centrifugation (6000 rpm, 10 min), washed twice with sterile M9 media and then exchanged into M9 minimal media at a 4:1 volumetric ratio. Cells were supplemented with 1 g L⁻¹ 15NH4Cl, 0.5% (v/v) glycerol and 0.05% (w/v) glucose and induced with 0.5 mM IPTG, then harvested after 16 h at 16 °C. Cells were resuspended in Buffer A and purified as described above for the unlabelled protein.

ESI-MS
Samples were desalted for ESI-MS analysis using a C4 ZipTip (Merck) per the manufacturer's instructions.
The source was set to positive mode and spectra were acquired over m/z 500–3000 and analysed using MassLynx 4.1 software. For Ppant ejection assays, an appropriate charge state was isolated using the MS/MS functionality. The transfer collision energy was increased until fragmentation was observed (typically 5 V to 20 V) and spectra were collected over m/z 200–1000.
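Choosing a charge state to isolate relies on the standard relationship between a protein's neutral mass and the m/z of its protonated ions, m/z = (M + z × 1.007276)/z. The following is a minimal sketch of that relationship, assuming protonation is the only charge carrier. The 10 000 Da mass is a hypothetical example, not a measured value for any protein in this study.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a given neutral mass (Da)."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(observed_mz, charge):
    """Invert mz(): recover the neutral mass from an observed peak."""
    return charge * (observed_mz - PROTON_MASS)


def charge_states_in_window(neutral_mass, mz_min=500.0, mz_max=3000.0,
                            max_charge=100):
    """List (charge, m/z) pairs that fall inside the acquisition window."""
    states = []
    for charge in range(1, max_charge + 1):
        value = mz(neutral_mass, charge)
        if mz_min <= value <= mz_max:
            states.append((charge, value))
    return states


# Hypothetical 10 kDa protein, roughly the size of a small carrier protein.
example_states = charge_states_in_window(10000.0)
for charge, value in example_states:
    print(f"z = {charge:2d}  m/z = {value:8.3f}")
```

The default window matches the m/z 500–3000 acquisition range described above. Which charge states are actually populated depends on the protein and the spray conditions.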

Phylogenetic tree analysis
Amino acid sequences of 663 KSs were extracted from 58 trans-AT PKS clusters, with EryAII_KS3 and EryAIII_KS5 from the erythromycin cis-AT PKS chosen as outgroup KS domains. The sequences were aligned using the MUSCLE algorithm [15] with default settings and a maximum-likelihood phylogenetic tree was computed using Mega-X with 100 bootstrap iterations using the LG+F substitution model, with Gamma distribution and gaps set to use all sites. [16] The tree was visualised using FigTree with clades manually assigned.

Table S1: A list of primers used in this study.