Genome Mining Enabled by Biosynthetic Characterization Uncovers a Class of Benzoxazolinate‐Containing Natural Products in Diverse Bacteria

Abstract Benzoxazolinate is a rare bis‐heterocyclic moiety that interacts with proteins and DNA and confers extraordinary bioactivities on natural products, such as C‐1027. However, the biosynthetic gene responsible for the key cyclization step of benzoxazolinate remains unclear. Herein, we show a putative acyl AMP‐ligase responsible for the last cyclization step. We used the enzyme as a probe for genome mining and discovered that the orphan benzobactin gene cluster in entomopathogenic bacteria prevails across Proteobacteria and Firmicutes. It turns out that Pseudomonas chlororaphis produces various benzobactins, whose biosynthesis is highlighted by a synergistic effect of two unclustered genes encoding enzymes on boosting benzobactin production; the formation of non‐proteinogenic 2‐hydroxymethylserine by a serine hydroxymethyltransferase; and the types I and II NRPS architecture for structural diversity. Our findings reveal the biosynthetic potential of a widespread benzobactin gene cluster.


General experimental procedures
All chemicals were purchased from Sigma-Aldrich, Acros Organics, or Iris BIOTECH. Isotope-labeled chemicals were purchased from Cambridge Isotope Laboratories, Inc. Genomic DNA of selected Xenorhabdus and Pseudomonas strains was isolated using the Qiagen Gentra Puregene Yeast/Bact Kit. DNA polymerases (Taq, Phusion, and Q5) and restriction enzymes were purchased from New England Biolabs or Thermo Fisher Scientific. DNA primers were purchased from Eurofins MWG Operon. DNA fragments were purchased from Twist Bioscience. PCR amplifications were carried out on thermocyclers (SensoQuest). Polymerases were used according to the manufacturers' instructions. DNA was purified from 1% TAE agarose gels using the Invisorb® Spin DNA Extraction Kit (STRATEC Biomedical AG). Plasmids in E. coli were isolated by alkaline lysis. HPLC-UV-MS analysis was conducted on an UltiMate 3000 system (Thermo Fisher) coupled to an AmaZonX mass spectrometer (Bruker) with an ACQUITY UPLC BEH C18 column (130 Å, 2.1 mm × 100 mm, 1.7 μm particle size, Waters) at a flow rate of 0.6 mL/min (5-95% acetonitrile/water with 0.1% formic acid, v/v, 16 min, UV detection wavelength 190-800 nm).

Sequencing
Long and short DNA reads were generated by Nanopore and Illumina sequencing, respectively.
For library preparation, a TruSeq DNA PCR-free high-throughput library prep kit (Illumina) and the SQK-LSK109 ligation sequencing kit (Oxford Nanopore Technologies, ONT) were used without prior shearing of the DNA. To generate the short reads, a 2 × 300-nucleotide run (MiSeq reagent kit v3, 600 cycles) was executed. The long reads were generated on a GridION platform using an R9.4.1 flow cell. Base-calling and demultiplexing were performed using Guppy v4.0.11 (ref. [1]).
Both data sets were assembled using Unicycler v0.4.6. The region of interest was identified using antiSMASH (ref. [2]) and is located on a plasmid of 182,126 bp in Xenorhabdus vietnamensis DSM 22392.
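As an illustration of the hybrid assembly step, the following minimal sketch builds a Unicycler command line that combines paired-end Illumina reads with Nanopore reads. The file names, output directory, and thread count are hypothetical placeholders; the exact options used in this study are not stated, so only the standard hybrid-assembly flags are shown, and the command is printed rather than executed.

```python
import shlex


def unicycler_hybrid_cmd(short_r1, short_r2, long_reads, out_dir, threads=8):
    """Return the argument list for a Unicycler hybrid assembly.

    All paths are placeholders supplied by the caller; nothing is run here.
    """
    return [
        "unicycler",
        "-1", short_r1,      # Illumina forward reads
        "-2", short_r2,      # Illumina reverse reads
        "-l", long_reads,    # Nanopore long reads
        "-o", out_dir,       # output directory
        "-t", str(threads),  # number of threads
    ]


# Hypothetical file names, for illustration only.
cmd = unicycler_hybrid_cmd(
    "illumina_R1.fastq.gz",
    "illumina_R2.fastq.gz",
    "nanopore.fastq.gz",
    "assembly_out",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a system where Unicycler is installed.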

Strain and culture conditions
Wild-type strains, their mutants, and E. coli (Table S1) were cultivated on lysogeny broth (LB) agar plates at 30 °C overnight and were subsequently inoculated into liquid LB culture at 30 °C with shaking at 200 rpm. For compound production, the overnight LB culture of a mutant was transferred into 5 mL XPP medium [3] (1:100, v/v) with 2% (v/v) Amberlite™ XAD-16 resin, 0.1% L-arabinose as an inducer, and selective antibiotics such as ampicillin (Am, 100 µg/mL), kanamycin (Km, 50 µg/mL), or chloramphenicol (Cm, 34 µg/mL) and incubated at 30 °C with shaking at 200 rpm.
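The volumes implied by these culture conditions follow from simple dilution arithmetic, sketched below for a 5 mL production culture. The stock concentrations (kanamycin at 50 mg/mL, L-arabinose at 20%) are assumptions chosen for illustration and are not given in the text; the small volume contributed by the additions themselves is ignored.

```python
def stock_volume_ul(final_conc, stock_conc, culture_ml):
    """Volume of stock (in µL) needed to reach final_conc in culture_ml.

    final_conc and stock_conc must be given in the same unit.
    """
    return final_conc / stock_conc * culture_ml * 1000.0


culture_ml = 5.0

# 1:100 (v/v) inoculum from the overnight culture
inoculum_ul = culture_ml * 1000.0 / 100.0

# 2% (v/v) XAD-16 resin
xad_ml = culture_ml * 0.02

# Kanamycin: 50 µg/mL final from an assumed 50 mg/mL (50,000 µg/mL) stock
kan_ul = stock_volume_ul(50.0, 50_000.0, culture_ml)

# L-arabinose: 0.1% final from an assumed 20% stock
ara_ul = stock_volume_ul(0.1, 20.0, culture_ml)

print(f"inoculum: {inoculum_ul:.0f} µL, XAD-16: {xad_ml:.2f} mL")
print(f"kanamycin stock: {kan_ul:.1f} µL, L-arabinose stock: {ara_ul:.1f} µL")
```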

Culture extraction and HPLC-UV-MS analysis
The XAD-16 resins were collected after 72 h and extracted with 5 mL methanol. The solvent was removed on a rotary evaporator, and the dried extract was resuspended in 500 μL methanol, of

Construction of insertion mutants
A 500-800-bp upstream fragment of the target gene (xsbA and pbzA) was amplified with the corresponding primer pair listed in Table S3. The resulting fragments were cloned using Hot Fusion [4] into the pCEP_kan or pCEP_cm backbone, which was amplified with pCEP_Fw and pCEP_Rv. After transformation of the constructed plasmid into E. coli S17-1 λ pir, clones were verified by PCR with primers pCEP-Ve-Fw and pDS132-Ve-Rv. The wild-type strain (recipient) was mated with E. coli S17-1 λ pir (donor) carrying the constructed plasmids. Both strains were grown in LB medium to an OD600 of 0.6 to 0.7, and the cells were washed once with fresh LB medium. Subsequently, the donor and recipient strains were mixed on an LB agar plate in ratios of 1:3 and 3:1 and incubated at 37 °C for 3 h, followed by incubation at 30 °C for 21 h. The bacterial cell layer was then harvested with an inoculating loop and resuspended in 2 mL fresh LB medium. An aliquot of 200 μL of the resuspended culture was spread on an LB agar plate with ampicillin/kanamycin (or ampicillin/chloramphenicol) and incubated at 30 °C for 2 days. Individual insertion clones were cultivated and analyzed by HPLC-UV-HRMS, and the genotype of all mutants was verified with plasmid- and genome-specific primers.

Construction of deletion mutants
A ~1000-bp upstream fragment and a ~1000-bp downstream fragment (mutations were introduced by primers) of a target gene (xsbB, xsbC, xsbD, pbzA, pbzB, pbzD, pbzF, pbzG, pbzI, and phzE) were amplified using primer pairs listed in Table S3. The amplified fragments were fused using the complementary overhangs introduced by the primers and cloned into the pCKcipB or pEB17_KM vector (linearized with PstI and BglII) by Hot Fusion. [4] Transformation of E. coli S17-1 λ pir with the resulting plasmid, conjugation with a wild-type strain or mutant, and generation of double-crossover mutants via counterselection on LB plates containing 6% sucrose were then carried out. Deletion mutants were verified via PCR using primer pairs listed in Table S3, which yielded a ~2000-bp fragment for mutants genetically equal to the WT strain and a ~1000-bp fragment for the desired deletion mutant.

Heterologous expression of xsb BGC
All plasmids carrying target genes for heterologous expression were constructed via Hot Fusion. [4] The biosynthetic gene cluster, xsbABCDE, was cloned into pCOLADuet-1. The ADIC synthase-encoding gene in the xpz BGC (xpzC) was cloned into pACYCDuet-1. xsbA and xpzC were cloned separately into the two multiple cloning sites of pACYCDuet-1. xsbC was cloned into pCOLADuet-1. E. coli BL21(DE3) was transformed with the plasmids for (co-)expression.

Isotope labeling experiments
The cultivation of strains for labeling experiments was carried out as described above. [13] The cell pellets of 100 μL overnight culture were washed once with 100 μL ISOGRO® ¹⁵N medium before being transferred into 5 mL ISOGRO® ¹⁵N medium. For inverse feeding experiments, additional unlabeled L-serine or glycine was added to the ¹⁵N medium culture at a final concentration of 1 mM.
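Labeling results of this kind are typically read out as mass shifts: each nitrogen atom that becomes ¹⁵N raises the monoisotopic mass by about 0.997 Da, and in an inverse feeding experiment a nitrogen derived from the unlabeled precursor is expected to stay ¹⁴N, reducing the shift by one unit. The sketch below, which is an illustration and not the authors' analysis procedure, converts between the number of labeled nitrogens and the observed m/z shift. The m/z values in the usage example are hypothetical.

```python
# Monoisotopic mass difference between 15N and 14N (Da)
DELTA_15N = 15.0001089 - 14.0030740


def mz_shift(n_nitrogen, charge=1):
    """Expected m/z shift when n_nitrogen atoms are fully 15N-labeled."""
    return n_nitrogen * DELTA_15N / abs(charge)


def count_labeled_nitrogens(mz_unlabeled, mz_labeled, charge=1, tol=0.02):
    """Estimate the number of 15N atoms from two m/z values.

    Returns None when the observed shift is not within tol (Da) of an
    integer multiple of the 15N/14N mass difference.
    """
    shift = (mz_labeled - mz_unlabeled) * abs(charge)
    n = round(shift / DELTA_15N)
    if abs(shift - n * DELTA_15N) > tol:
        return None
    return n


# Hypothetical [M+H]+ values, for illustration only.
mz_14n = 500.0000
mz_15n = mz_14n + mz_shift(3)
print(count_labeled_nitrogens(mz_14n, mz_15n))
```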

Isolation and purification
XAD-16 resins (2%) from a 3 L LB culture of the X. szentirmaii PBAD xsbA mutant, induced with L-arabinose, were harvested after 72 h of incubation at 30 °C with shaking at 120 rpm, washed with water, and extracted with methanol (3 × 500 mL) to yield 3.5 g of crude extract. The extract was subjected to a Sephadex LH-20 column eluted with MeOH, affording seven fractions.

NMR spectroscopy
Chemical shifts (δ) were reported in parts per million (ppm) and referenced to the solvent signals.

In vitro enzymatic assays of PbzB
pbzB with an N-terminal His-SUMO-tag was cloned into a pET11a vector. E. coli BL21(DE3) was transformed with the resulting plasmid. An overnight culture (10 mL) carrying the plasmid was transferred to 500 mL LB medium with ampicillin (Am, 100 µg/mL). The strain was grown to an OD600 of 0.8 at 37 °C, expression was induced with 0.5 mM IPTG, and the culture was incubated at 22 °C for 24 h. The cells were collected by centrifugation (10,000 r.p.m., 15 min, 4 °C).
Cell lysis was performed by resuspending the pellet in 100 mL BugBuster® (primary amine-free) Extraction Reagent with 1 µL of Benzonase® Nuclease, 14 mg of cOmplete™ EDTA-free protease inhibitor, and lysozyme (200 μg/mL), followed by incubation at 4 °C for 45 min. Cell debris was removed by centrifugation at 20,000 × g for 30 min, and the protein was purified by Ni²⁺ affinity chromatography.
The reaction mixture (100 µL) contained 3.2 µM PbzB, 1 mM substrate (glycine, L-serine, and D-serine), 500 µM mTHF, 5 µM PLP, and 50 mM potassium phosphate buffer, pH 7.5. After incubation at 30 °C for 1 h, the reaction was quenched by adding 100 µL of acetonitrile. Products were analyzed using HPLC-UV-HRMS conducted on an UltiMate 3000 system (Thermo Fisher) coupled to an Impact II qTof mass spectrometer (Bruker) with an ACQUITY UPLC BEH Amide column (130 Å, 2.1 mm × 50 mm, 1.7 μm particle size, Waters) at a flow rate of 0.4 mL/min (5-50% water/acetonitrile with 0.1% formic acid, v/v, 5 min; 90% water/acetonitrile with 0.1% formic acid, v/v, 2.1 min; UV detection wavelength 190-800 nm). 2-Hydroxymethylserine was prepared as a standard compound at different concentrations, and these samples were measured by HPLC-UV-HRMS with the above-mentioned method to obtain a standard curve (Figures S27 and S28).

6000. Prior to data collection, crystals were flash-frozen in liquid nitrogen using a cryo-solution that consisted of mother liquor supplemented with 20% (v/v) glycerol. Data were collected under cryogenic conditions at the European Synchrotron Radiation Facility (Grenoble, France) [14]. Data were processed with XDS and scaled with XSCALE [15]. All structures were determined by molecular replacement with PHASER [16], manually built in COOT [17], and refined with PHENIX [18]. The search model for the PbzB structures was the glycine hydroxymethyltransferase from Acinetobacter baumannii (PDB 5VMB). Figures were prepared with PyMOL (www.pymol.org) [19,20].
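The standard curve used in the PbzB assay to quantify 2-hydroxymethylserine can be illustrated with an ordinary least-squares line fit of peak area against concentration. The sketch below is a generic example and not the authors' processing script; the concentrations and peak areas are synthetic placeholders generated from a known line, not measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the standard curve to obtain a concentration from a peak area."""
    return (area - intercept) / slope


# Synthetic placeholder standards (µM) and peak areas (arbitrary units),
# generated as area = 2.0 * concentration + 5.0 for illustration only.
standards_um = [10.0, 25.0, 50.0, 100.0, 200.0]
peak_areas = [2.0 * c + 5.0 for c in standards_um]

slope, intercept = fit_line(standards_um, peak_areas)
sample_conc = concentration_from_area(105.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_conc:.1f} µM")
```

Only peak areas that fall within the range covered by the standards should be converted in this way.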
Figure S4. Multiple-sequence alignment of XsbC with biochemically characterized acyl-AMP ligases (NatL2, BomJ, and PtmA1), acyl-CoA ligase (PtmA2), and A domains (DltA and PheA) by Clustal Omega. Adenylation domain core motifs A1-A10 (turquoise) [25], the catalytic lysine residue (yellow), the Michaelis complex-forming amino acids of the first half (adenylation) reaction (grey), the zinc-binding motif (green), and the characteristic C-terminal extensions (red) are highlighted.

[29] and 4-chlorobenzoate CoA ligase (4CBL) [30].

(b) Proposed formation of benzobactins-628 and 1021. 2-Hydroxymethylserine is symmetrical because of its two identical hydroxymethyl groups; therefore, incorporation of this building block, as well as dimerization and tetramerization, would not give rise to diastereoisomers of benzobactin compounds, as exemplified by R-3. This is consistent with the observation of no (diastereo)isomers for compounds 2, 3, 4, and benzobactins-628. Therefore, benzobactins-1012 a and b are highly unlikely to be a pair of stereoisomers. Instead, they are assumed to be structural isomers that differ in the linkages formed between 2-hydroxymethylserines via ester bond(s) or amide bond(s).

(b) Multiple-sequence alignment of PbzB with its homologs AsmD, FmoH, and XvbB, as well as other structurally characterized glycine/serine hydroxymethyltransferases. The serine hydroxymethyltransferases used were hits retrieved from a Dali search with the PbzB apo-structure from this study as the search query. The consensus threshold was set to >85%. The green arrows mark the residues involved in the coordination of the substrate (glycine or serine) bound to the cofactor PLP, thereby forming the external aldimine PLG or PLS. The lysine residue within conserved loop 6 (green arrow) forms a Schiff base with PLP. The black arrows highlight the residues involved in the binding of mTHF. The right panel shows a zoom into the alignment, pointing to the difference between the specialized PbzB-type serine hydroxymethyltransferases and other GlyA-type glycine/serine hydroxymethyltransferases involved in the central metabolism of amino acid biosynthesis.

Figure S12. Phylogenetic analysis of PbzD-A2 with other biochemically characterized A domains from Xenorhabdus and Photorhabdus strains by FastTree 2.1.11 (refs. [27,28]). PbzD-A2 (asterisk) falls into the clade of A domains with cysteine specificity (light blue) and is separate from those with glycine (orange) or serine (yellow) specificity. The tree is based on protein sequences from core motifs A4 (234) to A5 (331), which are used to determine substrate specificity. [31]

Figure S14. Phylogenetic analysis and multiple-sequence alignment of PbzD-C1 and PbzI with other condensation domains/enzymes from Xenorhabdus and Photorhabdus strains by FastTree 2.1.11 (refs. [27,28]). (a) PbzD-C1 (asterisk) falls into the clade of heterocyclization domains (yellow), while PbzI (asterisk) and its homologs from benzobactin-related BGCs are separate from all other condensation domains/enzymes (red). Heterocyclization domains (yellow) catalyze both peptide bond formation between two amino acids and subsequent intramolecular heterocyclization of cysteine, serine, or threonine. Starter condensation domains (green) acylate the first amino acid with a fatty acid or polyketide moiety. LCL condensation domains (dark blue) catalyze peptide bond formation between two L-amino acids. Terminal condensation domains (purple) catalyze the release of the T-domain-tethered peptidyl chain. Dual condensation domains (orange) catalyze both epimerization and condensation. The tree is based on protein sequences of full-length condensation domains/enzymes. (b) A multiple-sequence alignment shows that PbzI and its homologs lack the conserved histidine or aspartic acid in the first and second positions of core motif C3. Conserved amino acids in the core motif C3 are indicated with shades of gray.