The Biosynthesis of the Benzoxazole in Nataxazole Proceeds via an Unstable Ester and has Synthetic Utility

Abstract
Heterocycles, a class of molecules that includes oxazoles, constitute one of the most common building blocks in current pharmaceuticals and are common in medicinally important natural products. The antitumor natural product nataxazole is a model for a large class of benzoxazole‐containing molecules that are made by a pathway that is not characterized. We report structural, biochemical, and chemical evidence that benzoxazole biosynthesis proceeds through an ester generated by an ATP‐dependent adenylating enzyme. The ester rearranges via a tetrahedral hemiorthoamide to yield an amide, which is a shunt product and not, as previously thought, an intermediate in the pathway. A second zinc‐dependent enzyme catalyzes the formation of hemiorthoamide from the ester but, by shuttling protons, the enzyme eliminates water, a reverse hydrolysis reaction, to yield the benzoxazole and avoids the amide. These insights have allowed us to harness the pathway to synthesize a series of novel halogenated benzoxazoles.

For detection of compound 1, 2 μM NatL2 was incubated with 1 mM 3-HAA, 1 mM ATP and 10 mM MgCl2 in 50 mM Tris-HCl (pH 8.0) in a 50 µL reaction for 15 min at 30 °C. The protein was removed by centrifugal filtration through a 10 kDa membrane at 4 °C. The filtrate was then subjected to LC-MS analysis as above. To study the conversion of 1 to 2, the filtrate was further incubated at room temperature for 1 h or 2 h before being loaded onto the LC-MS. To study the conversion of 1 to 3, the filtrate was incubated with 2 μM NatAM at room temperature for 1 h before being loaded onto the LC-MS.
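The reaction composition above can be turned into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch only: the final concentrations and the 50 µL volume come from the text, whereas every stock concentration (and the helper name `pipetting_volumes`) is a hypothetical value chosen for illustration and is not taken from this study.

```python
# Final concentrations from the text, in mM (2 µM NatL2 = 0.002 mM).
FINAL_MM = {
    "NatL2": 0.002,
    "3-HAA": 1.0,
    "ATP": 1.0,
    "MgCl2": 10.0,
    "Tris-HCl pH 8.0": 50.0,
}

# HYPOTHETICAL stock concentrations in mM; the source does not state them.
STOCK_MM = {
    "NatL2": 0.1,
    "3-HAA": 10.0,
    "ATP": 10.0,
    "MgCl2": 100.0,
    "Tris-HCl pH 8.0": 500.0,
}


def pipetting_volumes(final_volume_ul, final_mm, stock_mm):
    """Return the volume (µL) of each stock, plus water, via C1*V1 = C2*V2."""
    volumes = {}
    for name, final in final_mm.items():
        stock = stock_mm[name]
        if final > stock:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = final_volume_ul * final / stock
    used = sum(volumes.values())
    if used > final_volume_ul:
        raise ValueError("stocks are too dilute to fit the final volume")
    volumes["water"] = final_volume_ul - used
    return volumes


volumes = pipetting_volumes(50.0, FINAL_MM, STOCK_MM)
for name, vol in volumes.items():
    print(f"{name:>16}: {vol:5.1f} µL")
```

With different stocks only the `STOCK_MM` values need to change; the check on the summed volume flags stock sets that cannot reach the stated final concentrations in 50 µL.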

Assays of NatL2
NatL2 and NatAM were incubated with 5 mM EDTA at 4 °C overnight. Enzyme activity was then tested in a 100 µL reaction containing 2 μM EDTA-treated proteins, 1 mM ATP and 1 mM 3-HAA in 50 mM Tris-HCl (pH 8.0), but no activity was detected. When 10 mM MgCl2 and 10 mM ZnCl2 were then added to the reaction, product was observed.

Structural characterization of compounds 2, 3 and 4
For structural characterization of compounds 2, 3 and 4 generated in the NatL2 and NatAM reactions, a 100 mL aqueous solution containing 50 mM Tris-HCl (pH 8.0), 2 μM purified enzyme(s), 10 mM MgCl2, 1 mM ATP and 1 mM substrate was prepared. The solution was mixed with a pipette, divided into 1 mL aliquots per tube, incubated at 30 °C for 3 h and finally quenched by adding an equal volume of methanol. Proteins were removed by centrifugation. The target compounds were enriched and isolated with macroporous adsorptive resins and further purified by semi-preparative HPLC using the parameters described above. The purified compounds were concentrated, redissolved in an appropriate solvent and characterized by NMR. The NMR data for compounds 2 and 3 were recorded on a Bruker Avance 500 MHz NMR spectrometer and the NMR data for compound 4 were recorded on a Bruker Avance 400 MHz NMR spectrometer.
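The scale of this preparative reaction follows directly from the stated concentrations and volume. The sketch below is illustrative only: it uses just the numbers given in the text (100 mL batch, 1 mL aliquots, equal-volume methanol quench), and the function and variable names are hypothetical. The turnover figure is an upper bound that assumes complete conversion of the substrate, which the text does not claim.

```python
def micromoles(conc_mm, volume_ml):
    """Amount in µmol, since mM x mL = (mmol/L) x (1e-3 L) = µmol."""
    return conc_mm * volume_ml


BATCH_ML = 100.0    # total reaction volume from the text
ALIQUOT_ML = 1.0    # volume per tube from the text

# Concentrations in mM as given in the text (2 µM enzyme = 0.002 mM).
CONC_MM = {"enzyme": 0.002, "ATP": 1.0, "substrate": 1.0, "MgCl2": 10.0}

amounts_umol = {name: micromoles(c, BATCH_ML) for name, c in CONC_MM.items()}

# Turnovers each enzyme molecule would need for full substrate conversion.
max_turnovers = CONC_MM["substrate"] / CONC_MM["enzyme"]

n_tubes = round(BATCH_ML / ALIQUOT_ML)

# Quenching with an equal volume of methanol doubles the tube volume.
quenched_volume_ml = 2 * ALIQUOT_ML
methanol_fraction = ALIQUOT_ML / quenched_volume_ml

for name, amount in amounts_umol.items():
    print(f"{name:>9}: {amount:8.1f} µmol")
print(f"tubes: {n_tubes}, max turnovers per enzyme: {max_turnovers:.0f}")
print(f"methanol after quench: {methanol_fraction:.0%} v/v")
```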

Site-directed mutagenesis of NatL2 and NatAM
The whole plasmid pWDY1211 or pWDY1232 was amplified by high-fidelity PCR using each primer pair. After the PCR products had been checked by agarose gel electrophoresis, 2 μL of the PCR product was treated with the restriction enzyme DpnI in a total reaction volume of 20 μL, which was incubated at 37 °C for 1.5 h. A 2.5 μL aliquot of the resulting solution was used to transform E. coli DH10B competent cells. Positive clones were picked and verified by DNA sequencing.

Substrate scope of NatL2 and NatAM
Various halogenated derivatives of 3-HBA were incubated with NatL2 and NatAM in the presence of 3-HAA. Typically, the reactions were carried out on a 100 µL scale with 2 μM purified NatL2/NatAM, 1 mM ATP, 1 mM 3-HAA, 1 mM halogenated derivative and 10 mM MgCl2 in 50 mM Tris-HCl (pH 8.0). The mixture was incubated at 30 °C for 180 min. Benzoxazoles and amide shunt products were analysed by LC-MS and confirmed by high resolution MS.
Crystallization, data collection and structure determination
Freshly prepared NatL2 was initially screened in the apo form and crystals were obtained at 4 °C. These crystals diffracted poorly at the Diamond synchrotron and the structure was not solved. We therefore incubated the protein with 5 mM ATP and 5 mM MgCl2 for 1 h before setting up crystallization. Plate-shaped crystals of NatL2 were obtained within 2 weeks by the hanging-drop vapour diffusion method, mixing 1 μl of protein (40 mg/ml protein, 5 mM ATP and 5 mM magnesium chloride) with 1 μl of reservoir solution (0.2 M KSCN, 0.1 M sodium citrate pH 6.0, 30% PEG MME 2000) at room temperature. Crystals were fished, transferred to cryo-protectant (0.2 M KSCN, 0.1 M sodium citrate pH 6.0, 40% PEG MME 2000) and then flash-frozen by plunging into liquid nitrogen.
Crystals grown in the presence of ATP were soaked overnight in mother liquor supplemented with 20 mM AMPPNP, 20 mM ATP, a saturating concentration of 4, a saturating concentration of 3-HAA, or a combination of saturating 3-HAA and 20 mM AMPPNP. Cocrystallization of NatL2 with 5 mM 3-HAA, 5 mM AMP and 5 mM MgCl2 was also performed but gave no new information. To obtain the NatL2:SA complex structure, co-crystallized NatL2:3-HAA crystals were soaked in 300 mM salicylic acid overnight. Crystals were then transferred into a similar cryo-protectant supplemented with the respective ligands before being flash-frozen in liquid nitrogen. An X-ray fluorescence scan spectrum (MCA) suggested the presence of zinc ions in the crystal, and phase information was obtained from the zinc anomalous signal using MAD methods at the peak (1.2828 Å), inflection (1.2832 Å) and high-energy remote (1.2697 Å) wavelengths (Tables S3 and S4).
NatAM at a concentration of 40-50 mg/ml was screened for crystallisation. Crystals were obtained in two conditions after two weeks at 4 °C. Crystals belonging to space group C121 (C2) were grown in a condition containing 0.05 M MgCl2, 0.1 M HEPES pH 7.5 and 30% v/v polyethylene glycol monomethyl ether 550, while crystals belonging to P212121 or P21212 were grown in a condition containing 0.2 M MgCl2, 0.1 M Bis-Tris pH 6.5 and 25% w/v polyethylene glycol 3,350. Full-size crystals were later transferred into mother liquor supplemented with 40-45% polyethylene glycol 3,350. NatAM crystals were soaked in their mother liquor containing 20 mM zinc chloride for several minutes and then flash-frozen in liquid nitrogen with 40% PEG 3350 as cryo-protectant.
To obtain complex structures of NatAM, apo NatAM crystals were soaked overnight in mother liquor supplemented with 20 mM AMP, ATP, AMPPNP, 3-HAA, 3-HBA, 2 and 3 at saturating concentration before being flash-frozen in a similar cryo-protectant. NatAM crystals were also soaked with 4 at saturating concentration for between 2 hours and one week. Higher occupancy of 4 was observed with longer soaking times.
All the X-ray diffraction data were recorded at Diamond Light Source beamlines (I03, I04, I04-1, and I24), which we gratefully acknowledge. The diffraction images were reduced, integrated, and scaled using xia2 [2], DIALS [3] or autoPROC [4]. To obtain phase information, the NatL2:AMP crystals were also grown in the presence of 5 mM manganese(II) chloride; however, the X-ray fluorescence scan spectrum suggested a much higher abundance of zinc than of manganese. A fluorescence scan was then used to locate the Zn K edge at 1.2828 Å on I04. Multiple-wavelength data were then collected at wavelengths of 1.2828 Å (peak), 1.2832 Å (inflection) and 1.2697 Å (high-energy remote). For NatAM crystals, the X-ray fluorescence scan spectrum indicated a strong zinc signal and MAD data were collected at wavelengths of 1.2821 Å (peak), 1.2826 Å (inflection) and 1.2626 Å (high-energy remote). AutoSHARP [5] was used to locate the two zinc atom sites in the asymmetric unit and an initial model was obtained using the autobuilding programme ARP/wARP [6] or the Phenix AutoBuild wizard [7]. The initial model was further refined using Refmac [8] and Phenix [9], manually built with Coot [10], and improved using PDB-REDO [11]. All the other structures were solved by molecular replacement using PHASER [12], with the NatL2-AMP complex structure or the apo NatAM structure as the initial model, and refined as above.
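The MAD wavelengths quoted above can be expressed as photon energies with E = hc/λ, which makes it easier to compare them with the Zn K edge (tabulated near 9.66 keV). The following is a minimal sketch: the wavelengths are those given in the text, the constant hc ≈ 12398.42 eV·Å is the standard conversion factor, and the function and dictionary names are hypothetical.

```python
# hc expressed in eV·Å, so that E (eV) = HC_EV_ANGSTROM / wavelength (Å).
HC_EV_ANGSTROM = 12398.4198


def energy_ev(wavelength_angstrom):
    """Convert an X-ray wavelength in Å to a photon energy in eV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_ANGSTROM / wavelength_angstrom


# MAD wavelengths (Å) as reported in the text.
MAD_WAVELENGTHS = {
    "NatL2": {"peak": 1.2828, "inflection": 1.2832, "remote": 1.2697},
    "NatAM": {"peak": 1.2821, "inflection": 1.2826, "remote": 1.2626},
}

# Same nesting as MAD_WAVELENGTHS, with energies in eV.
mad_energies = {
    protein: {label: energy_ev(wl) for label, wl in sets.items()}
    for protein, sets in MAD_WAVELENGTHS.items()
}

for protein, energies in mad_energies.items():
    for label, e in energies.items():
        wl = MAD_WAVELENGTHS[protein][label]
        print(f"{protein} {label:>10}: {wl:.4f} Å = {e:8.1f} eV")
```

In both data sets the inflection wavelength is the longest, so it has the lowest energy, the peak lies a few eV above it, and the remote wavelength lies roughly 100 eV or more above the peak.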

Table S1. Plasmids and strains used in this study.
Columns: Plasmids/strains; Relevant genotype/comments.

Alignments were prepared with MUSCLE and rendered with ESPript 3.0, with the secondary structure of NatL2 or NatAM depicted above the sequence alignment. In both (a) and (b), a black box with white text denotes strict identity, a black bold character shows sequence similarity and a black box highlights conserved regions.