Reconstitution of Iterative Thioamidation in Closthioamide Biosynthesis Reveals Tailoring Strategy for Nonribosomal Peptide Backbones

Abstract
Thioamide‐containing nonribosomal peptides (NRPs) are exceedingly rare. Recently, the biosynthetic gene cluster for the thioamidated NRP antibiotic closthioamide (CTA) was reported; however, the enzyme responsible for thioamide formation and the timing of this step remained enigmatic. Here, genome editing, biochemical assays, and mutational studies are used to demonstrate that an Fe‐S cluster‐containing member of the adenine nucleotide α‐hydrolase protein superfamily (CtaC) is responsible for sulfur incorporation during CTA biosynthesis. However, unlike all previously characterized members, CtaC functions in a thiotemplated manner. In addition to prompting a revision of the CTA biosynthetic pathway, the reconstitution of CtaC provides the first example of an NRP thioamide synthetase. Finally, CtaC is used as a bioinformatic handle to demonstrate that thioamidated NRP biosynthetic gene clusters are more widespread than previously appreciated.


Experimental Procedures

General methods
Sequencing and oligonucleotide primer synthesis were performed by Eurofins Genomics. Media components were purchased from Sigma, Roth and Difco. All chemicals were purchased from commercial suppliers (Sigma, Roth, etc.) and used without further purification. Restriction endonucleases were purchased from New England Biolabs. Lists of all strains, plasmids and oligonucleotide primers used can be found in Table S1, Table S2, and Table S3, respectively.

Bacterial strains and culturing conditions
Escherichia coli strains were grown in lysogeny broth (LB) shaken at 160 rpm or on LB agar plates at 37 °C with appropriate antibiotic selection (chloramphenicol, 25 µg mL⁻¹; kanamycin, 50 µg mL⁻¹; ampicillin, 100 µg mL⁻¹). All plasmid construction and storage were performed with E. coli TOP10, while E. coli Rosetta (DE3) and E. coli HM0079 were used for protein overexpression and production of CTA intermediates, respectively.
Ruminiclostridium cellulolyticum DSM 5812 was cultivated under an anaerobic atmosphere (N₂:H₂:CO₂, 85:5:10 vol:vol:vol) in a Whitley A35 anaerobic work station (Don Whitley Scientific) operating at 37 °C. Routine cultivation was performed in modified CM3 medium with cellobiose (6 g L⁻¹) as previously described. [1] For the production of CTA, strains were grown in DSMZ medium 165 as previously described. [2]

Bioinformatic analyses
For phylogenetic studies, the sequences were aligned with MAFFT 7 [3] and the phylogenetic trees were reconstructed using the neighbor-joining method with 1000 bootstraps using MEGA6 [4] (see Table S4 for a list of sequences used). The PCP and AANH protein multiple sequence alignments were performed with Clustal Omega using the default parameters. [5]
Protein sequences for the sequence similarity network (SSN) were retrieved using BLASTp against the reference genome database with CtaC (Ccel_3258) as the query. The top 100 homologs of CtaC identified from this search (see Table S5) were used to generate the SSN using the Enzyme Function Initiative Enzyme Similarity Tool (EFI-EST; https://efi.igb.illinois.edu/efi-est/). [6] The network was constructed at an expectation-value (e-value) of 10⁻⁸⁰ and visualized using Cytoscape (v. 3.2.1) with the organic layout. [7] Sequences with 100% identity were visualized as a single node in the network. The local genomic region (~10 kbp upstream and downstream) surrounding the gene encoding the CtaC homolog was checked for the presence of secondary metabolite biosynthetic genes. Nodes corresponding to proteins not found in a bioinformatically identifiable natural product biosynthetic gene cluster were removed from the final network; however, the accession numbers for these sequences can be found in Table S5.
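The two filtering ideas used for the SSN (collapsing identical sequences into one node and keeping only edges at or below an e-value cutoff) can be illustrated with a minimal Python sketch. This is not the EFI-EST implementation; the function names, the toy accessions and the use of exact string identity as a stand-in for 100% identity are assumptions made for illustration only.

```python
def collapse_identical(sequences):
    """Group accessions that share an identical sequence into one node.

    `sequences` maps a (hypothetical) accession to its protein sequence.
    Exact string identity is used here as a simple proxy for 100% identity.
    """
    nodes = {}
    for accession, seq in sequences.items():
        nodes.setdefault(seq, []).append(accession)
    return [sorted(members) for members in nodes.values()]


def filter_edges(pairwise_evalues, cutoff=1e-80):
    """Keep only sequence pairs whose e-value is at or below the cutoff."""
    return [pair for pair, evalue in pairwise_evalues.items() if evalue <= cutoff]


# Toy input, not real data.
toy_sequences = {"acc1": "MKVLA", "acc2": "MKVLA", "acc3": "MKVLG"}
toy_evalues = {("acc1", "acc3"): 1e-90, ("acc1", "acc4"): 1e-50}

print(collapse_identical(toy_sequences))
print(filter_edges(toy_evalues))
```

A stricter (smaller) cutoff removes more edges and splits the network into tighter clusters, which is why the chosen threshold is reported alongside the network.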

LC-HR-MS
HPLC-HR-MS and HPLC-HR-MS/MS measurements were performed with a Thermo Accela HPLC-system coupled to either a QExactive Hybrid-Quadrupole-Orbitrap (Thermo Fisher Scientific) or Exactive Hybrid-Quadrupole-Orbitrap (Thermo Fisher Scientific) mass spectrometer equipped with an electrospray ion source. For QExactive measurements, separation was performed with an Accucore C18 column (

Synthesis of compound 1 [(3-(3-(4-hydroxybenzamido)propanamido)propanoic acid)]
Compound 8 (compound 22 in Ref. 9) was deprotected using aqueous trifluoroacetic acid (95%) and concentrated under reduced pressure to obtain the title compound as a white powder (quantitative). See Figures S21 and S22 for NMR spectra. The HPLC was operated at a flow rate of 20 mL min⁻¹. The pure product was deprotected using TFA (95%) to obtain the title compound in a yield of 23% over two steps. See Figures S27 and S28 for NMR spectra.

Plasmid construction for CRISPR/Cas knockout vector
The knockout plasmid was generated as previously described with minor changes. [2] A suitable target site containing the necessary PAM (NGG) sequence for generation of the ctaC gene knockout was identified using the webtool CRISPy-web. [10] The sgRNA cassette (P4 promoter, sgRNA, spy terminator) containing the selected N20 sequence and the homology arms (250 bp upstream and downstream from the editing site with a mutated N20 sequence) for DNA repair following nicking by the Cas9-nickase were synthesized by GenScript. The resultant synthetic vector (pUC57-ctaC-target) was used as a template to amplify a DNA fragment with primers TargetCT-F/TargetCT-R (see Table S3) and Phusion High-Fidelity DNA polymerase (New England Biolabs) to create an amplicon including the N20-gRNA, the P4 synthetic promoter and the homologous regions with the mutated N20 sequence. Following purification using a Monarch PCR and DNA Cleanup Kit (New England Biolabs), the amplicon was inserted into BsaI-digested pCasC [2] (New England Biolabs) using NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs) to afford pCasC-ctaC. E. coli TOP10 competent cells were transformed with the assembly reaction and transformants were selected on LB plates supplemented with 10 μg mL⁻¹ gentamicin. Correct clones were identified by colony PCR using OneTaq Quick-load 2x Master Mix (New England Biolabs) with primers Cas-3258-check F/Cas-3258-check R, and verified by sequencing with Cas-3258-seq (see Table S3).

Generation of Ruminiclostridium cellulolyticum ΔctaC
R. cellulolyticum ΔctaC, which contains an in-frame nonsense mutation in ctaC, was generated using pCasC-ctaC as previously described. [2] The design of pCasC-ctaC was such that successful editing of ctaC would introduce the desired mutation (TAA), as well as an EcoRV restriction endonuclease site (GATATC) to facilitate screening of transformants. After electroporation, individual potential R. cellulolyticum ΔctaC colonies were randomly picked and subjected to colony PCR using OneTaq DNA Polymerase (New England Biolabs) with primers Cas-3258-check F and Cas-3258-check R (see Table S3). The presence or absence of the restriction site (corresponding to edited or unedited ctaC) was ascertained by digestion of the generated PCR products (1,276 bp) with restriction endonuclease EcoRV (New England Biolabs), followed by agarose gel electrophoresis (1.5% agarose gel) of the fragments. Pure mutant colonies were identified by the presence of the expected fragments (851 bp and 425 bp) and the lack of full-length PCR product (1,276 bp), while the PCR product remained undigested in the case of the wild-type control. Subsequently, genomic DNA was isolated from a putative mutant using a MasterPure Gram Positive DNA Purification Kit (Epicentre). In order to ensure the chosen mutant was free of wild-type contamination, the target gene was amplified and subjected to restriction analysis as described above. Purification of the undigested ΔctaC PCR product was performed using a Monarch PCR and DNA Cleanup Kit (New England Biolabs) and DNA sequencing (Eurofins Genomics Germany GmbH) using primer Cas-3258-seq (Table S3) was used to confirm the presence of the desired mutation (Figure S2).
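The logic of the restriction screen can be sketched in a few lines of Python. EcoRV recognizes GATATC and cuts in the middle of the site (GAT^ATC) to give blunt ends. The cut coordinate of 851 used below is an assumption chosen so that the fragments match the reported sizes; the site could equally lie 425 bp from the amplicon start, and the function names are illustrative.

```python
def ecorv_cut_positions(seq):
    """Return cut coordinates (bases to the left of each cut) for EcoRV.

    EcoRV recognizes GATATC and cuts bluntly between GAT and ATC.
    """
    site = "GATATC"
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 3)  # cut falls after the third base of the site
        start = seq.find(site, start + 1)
    return positions


def digest_fragments(length_bp, cut_positions):
    """Return fragment lengths for a linear DNA cut at the given coordinates."""
    edges = [0] + sorted(cut_positions) + [length_bp]
    return [right - left for left, right in zip(edges, edges[1:])]


amplicon_bp = 1276
print(digest_fragments(amplicon_bp, [851]))  # edited allele (assumed cut coordinate)
print(digest_fragments(amplicon_bp, []))     # unedited allele, no EcoRV site
```

A colony that still contains wild-type chromosomes would show all three bands (1,276, 851 and 425 bp), which is why the absence of the full-length product is used as the purity criterion.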

Screen for production of CTA intermediates by Ruminiclostridium cellulolyticum ΔctaC
Strains were cultivated and extracted with ethyl acetate as previously described. [2] Organic extracts were analyzed by LC-HR-MS (Exactive) for the presence of amide congeners of previously described intermediates.

Generation of pCTA1-ΔctaC
For the generation of pCTA1-ΔctaC, pCTA1 (see reference [2] for a description of the vector) was used as a template in a PCR amplification using the mutagenesis primers listed in  Table S3 and by restriction digestion with EcoRI.

Screen for production of CTA intermediates by Escherichia coli pCTA-ΔctaC
E. coli HM0079 pCTA2 competent cells were transformed with pCTA1-ΔctaC and transformants were selected on LB plates supplemented with 100 μg mL⁻¹ ampicillin and 25 μg mL⁻¹ chloramphenicol. A single colony was used to inoculate 5 mL of LB supplemented with ampicillin/chloramphenicol and the culture was grown for 16-20 h at 37 °C and 160 rpm. The production of intermediates was performed in autoinduction medium according to the previously reported method [2] and in LB. For the LB experiments, 500 μL of this culture was used to inoculate 50 mL of antibiotic-supplemented LB in a sealed conical tube and cultures were grown at 30 °C and 130 rpm. When the cultures reached an optical density at 600 nm (OD₆₀₀) of approximately 0.6-0.8, they were supplemented with 50 μM ZnCl₂ and 1 mM Na₂S, and protein expression was induced by the addition of 0.4 mM isopropyl-β-D-thiogalactopyranoside (IPTG). The induced cultures were grown at 30 °C and 130 rpm in sealed conical tubes for 3 hours. For both the autoinduction and LB cultures, the cells were separated from the spent medium by centrifugation at 4000 × g for 10 min.
Extraction of the cell-free spent medium was conducted as previously described [2] and extracts were checked for the presence of thioamide-free CTA precursors by LC-HR-MS (Exactive). Cell pellets were resuspended in 2.5 mL of 10 mM Tris pH 7.5 and lysed by sonication at 30% power for 2 min (SONOPLUS ultrasonic homogenizer with a MS73 microtip, Bandelin). In order to release any intermediates that might be bound to a PCP, KOH was added to the lysate to a final concentration of 10 mM and samples were incubated for 3 h at 37 °C. The base-treated lysate was extracted with 2 volumes of ethyl acetate, the extract was dried over Na₂SO₄ and the ethyl acetate was removed using a rotary evaporator. The resultant solid was dissolved in 100 μL of methanol and analyzed by LC-HR-MS (Exactive). See Figure S4 for representative data.

Plasmid construction for E. coli expression vectors
The target genes (ctaC, ctaE, and ctaH) were amplified by PCR from the genomic DNA of R. cellulolyticum DSM 5812 using the oligonucleotide primers listed in Table S3. PCRs were performed with Phusion High-Fidelity DNA Polymerase (New England Biolabs) and amplicons were purified using an innuPREP Gel Extraction Kit (Analytik Jena). Following purification, the DNA fragments were either digested with NheI and BamHI (for pET28-ctaE and pET28-ctaH) or KpnI and BamHI (for pMAL-ctaC). The digested inserts were purified using an innuPREP PCRpure Kit (Analytik Jena) and ligated with an appropriately digested vector using T4 DNA ligase (New England Biolabs).
E. coli TOP10 cells were transformed with the ligation reactions and transformants were selected on LB agar supplemented with the appropriate antibiotic (pET28a, 50 µg mL⁻¹ kanamycin; pMAL, 100 µg mL⁻¹ ampicillin). Plasmids were isolated from the transformants and the sequence of the construct was confirmed by sequencing using the oligonucleotide primers listed in Table S3.
Oligonucleotide primers for the construction of pET28-ctaC were designed using the NEBuilder assembly tool (http://nebuilder.neb.com/) and are listed in Table S3. The gene was amplified from the genomic DNA of R. cellulolyticum DSM 5812 by PCR using Phusion High-Fidelity DNA Polymerase (New England Biolabs) and amplicons were purified using an innuPREP Gel Extraction Kit (Analytik Jena). The linear vector for the assembly reaction was generated by digestion of pET28a with NdeI and HindIII. The assembly reaction was performed using NEBuilder HiFi DNA Assembly Master Mix (New England Biolabs) according to the manufacturer's instructions. E. coli TOP10 cells were transformed with the assembly reaction and transformants were selected on LB agar supplemented with 50 µg mL⁻¹ kanamycin. Plasmids were isolated from the transformants and the sequence of the construct was confirmed by sequencing using the oligonucleotide primers listed in Table S3.

Protein overexpression
E. coli Rosetta (DE3) cells were transformed with pET28a- and pMAL-derived expression vectors and transformants were selected on LB agar plates supplemented with 50 μg mL⁻¹ kanamycin/25 μg mL⁻¹ chloramphenicol and 100 μg mL⁻¹ ampicillin/25 μg mL⁻¹ chloramphenicol, respectively. A single colony was used to inoculate 4 mL of LB supplemented with the appropriate antibiotic and cultures were grown for 18-20 h at 37 °C and 160 rpm. The entire overnight culture was used to inoculate 400 mL of fresh LB medium supplemented with the appropriate antibiotic.
Cultures were grown at 37 °C and 160 rpm until an optical density at 600 nm (OD₆₀₀) of approximately 0.6 was reached. Cultures were then iced for 10 min before protein expression was induced with the addition of IPTG to a final concentration of 0.4 mM. As CtaC is predicted to coordinate a structural zinc ion, CtaC overexpression cultures were supplemented with ZnCl₂ to a final concentration of 50 μM. Following induction, cultures were grown for an additional 16-18 h at 18 °C before the cells were harvested by centrifugation at 4000 × g for 15 min. Cell pellets were washed with Tris buffered saline (10 mM Tris pH 7.5, 150 mM NaCl) and stored at -20 °C for up to one month before use.
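As a worked example of the final concentrations quoted above, the volume of stock solution to add follows from C1·V1 = C2·V2. The stock concentrations below (1 M IPTG, 100 mM ZnCl₂) are assumptions for illustration and are not stated in the protocol; the small volume added to the culture is ignored.

```python
def stock_volume_ul(final_conc_mM, culture_volume_mL, stock_conc_mM):
    """Volume of stock (in µL) needed to reach a final concentration.

    Uses C1*V1 = C2*V2 and neglects the volume contributed by the stock itself.
    """
    volume_mL = final_conc_mM * culture_volume_mL / stock_conc_mM
    return volume_mL * 1000.0


culture_mL = 400.0  # culture volume from the protocol

# Hypothetical stock concentrations (not given in the protocol).
iptg_ul = stock_volume_ul(final_conc_mM=0.4, culture_volume_mL=culture_mL, stock_conc_mM=1000.0)
zncl2_ul = stock_volume_ul(final_conc_mM=0.05, culture_volume_mL=culture_mL, stock_conc_mM=100.0)

print(f"IPTG stock to add:  {iptg_ul:.0f} µL")
print(f"ZnCl2 stock to add: {zncl2_ul:.0f} µL")
```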

Purification of N-terminal His₆ fusion proteins
Cell pellets were resuspended in 30 mL of lysis buffer [50 mM Tris pH 8.0, 300 mM NaCl, 25 mM imidazole, 5% glycerol (v/v)] supplemented with 1 mg mL⁻¹ lysozyme (Roth). After incubation on ice for 30 min, cells were disrupted by sonication at 4 °C using a SONOPLUS ultrasonic homogenizer with a MS73 microtip (Bandelin) and the following parameters: 30% power, three 45 s cycles with 5 min breaks between cycles. The insoluble debris was removed from the lysate by centrifugation at 17000 × g for 30 min and the cleared lysate was loaded onto 2 mL of TALON Superflow resin (GE Healthcare) equilibrated with lysis buffer. The resin was washed with 100 mL of lysis buffer supplemented with 1 mM dithiothreitol (DTT) and His₆-tagged proteins were eluted using 15 mL of elution buffer [50 mM Tris pH 8.0, 300 mM NaCl, 250 mM imidazole, 1 mM DTT, 5% glycerol (v/v)]. The eluate was concentrated in an Amicon Ultra 15 mL centrifugal filter (Merck Millipore) with an appropriate molecular weight cutoff and two 10-fold buffer exchanges with storage buffer [50 mM HEPES pH 7.5, 300 mM NaCl, 1 mM DTT, 20% glycerol (v/v)] were performed in the filtration device prior to a final concentration step and storage at -80 °C. Protein concentration was determined by absorbance at 280 nm and purity was assessed by SDS-PAGE (Figure S5).

Purification of N-terminal MBP fusion proteins
Cell pellets were resuspended in 30 mL of lysis buffer [50 mM Tris pH 7.5, 500 mM NaCl, 5% glycerol (v/v)] supplemented with 1 mg mL⁻¹ lysozyme (Roth) and lysed as described above. Following the removal of insoluble debris by centrifugation, the cleared lysate was loaded onto 2 mL of amylose resin (New England Biolabs) equilibrated with lysis buffer. The resin was washed with 100 mL of lysis buffer supplemented with 1 mM dithiothreitol (DTT) and MBP-tagged proteins were eluted using 15 mL of elution buffer [50 mM Tris pH 7.5, 300 mM NaCl, 1 mM maltose, 1 mM DTT, 5% glycerol (v/v)]. The eluate was concentrated in an Amicon Ultra 15 mL centrifugal filter (Merck Millipore) with an appropriate molecular weight cutoff and two 10-fold buffer exchanges with storage buffer [50 mM HEPES pH 7.5, 300 mM NaCl, 1 mM DTT, 20% glycerol (v/v)] were performed in the filtration device prior to a final concentration step and storage at -80 °C. Protein concentration was determined by absorbance at 280 nm and purity was assessed by SDS-PAGE (Figure S5).
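Both purification protocols estimate protein concentration from the absorbance at 280 nm, which rests on the Beer-Lambert law (A = ε·c·l). The short sketch below shows the arithmetic only. The extinction coefficient, absorbance reading and dilution factor are placeholder values, not values from this study; in practice ε would be calculated from the sequence of each tagged construct.

```python
def protein_conc_uM(a280, ext_coeff_M_cm, path_cm=1.0, dilution_factor=1.0):
    """Protein concentration in µM from A280 via the Beer-Lambert law.

    c = A / (epsilon * l), multiplied by any dilution applied before reading.
    """
    conc_M = a280 / (ext_coeff_M_cm * path_cm)
    return conc_M * dilution_factor * 1e6


# Placeholder inputs for illustration only.
example = protein_conc_uM(a280=0.5, ext_coeff_M_cm=10000.0, path_cm=1.0)
print(f"{example:.1f} µM")
```

Because aromatic residues dominate absorbance at 280 nm, fusion tags such as MBP contribute to ε and need to be included when the coefficient is calculated.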

In vitro phosphopantetheinylation of the PCPs (CtaE/CtaH)
Prior to phosphopantetheine loading, the His₆-tags were removed from the apo-PCPs by the addition of thrombin. In order to generate holo-PCPs loaded with potential CtaC substrates, thrombin cleavage and Sfp PPTase loading reactions were performed under strictly anaerobic conditions [atmosphere composed of N₂:H₂:CO₂ (85:5:10 vol:vol:vol)] in a Coy Lab anaerobic chamber at 25 °C. Following PCP loading, the holo-PCPs were purified using a PD-10 desalting column (GE Healthcare) equilibrated with anaerobic reaction buffer. PCP-containing fractions were pooled and concentrated using a 3 kDa Amicon Ultra 0.5 mL centrifugal filter (Merck Millipore) in the anaerobic chamber. Following concentration, glycerol was added to a final concentration of 20%, the protein solution was aliquoted into air-tight microfuge tubes and stored at -80 °C until use. The concentration of the loaded PCP was determined by absorbance at 280 nm and the BCA method (Pierce BCA kit, Thermo Fisher Scientific).
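When PCP loading is checked by mass spectrometry, the expected mass shift can be estimated from elemental composition. The sketch below assumes the standard net addition for Sfp-catalyzed transfer of 4'-phosphopantetheine to a serine side chain (C11H21N2O6PS) and uses rounded average atomic masses. The acetyl example is a generic illustration and is not one of the substrates used in this study; the function names are hypothetical.

```python
# Rounded average atomic masses (Da).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974, "S": 32.06}

# Net atoms added to the carrier protein on conversion of apo to holo form.
PPANT_ADDED = {"C": 11, "H": 21, "N": 2, "O": 6, "P": 1, "S": 1}


def average_mass(formula):
    """Average mass (Da) of an elemental composition given as {element: count}."""
    return sum(AVERAGE_MASS[element] * count for element, count in formula.items())


def loaded_pcp_shift(acyl_group):
    """Mass shift (Da) relative to apo-PCP for an acyl-phosphopantetheine arm.

    The acyl group (R-C(=O)-) replaces the thiol hydrogen of the arm.
    """
    return average_mass(PPANT_ADDED) + average_mass(acyl_group) - AVERAGE_MASS["H"]


holo_shift = average_mass(PPANT_ADDED)
acetyl_shift = loaded_pcp_shift({"C": 2, "H": 3, "O": 1})  # generic acetyl example

print(f"apo to holo:        +{holo_shift:.2f} Da")
print(f"apo to acetyl-holo: +{acetyl_shift:.2f} Da")
```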

CtaC Fe-S cluster reconstitution
The reconstitution of the iron-sulfur cluster in MBP-tagged CtaC was performed under anaerobic conditions in a Coy anaerobic chamber. First, 25 μM MBP-CtaC in reconstitution buffer [50 mM HEPES pH 7.5, 150 mM NaCl, 2 mM MgCl₂, 1 mM TCEP] was mixed with 2.5 mM sodium dithionite and incubated at 8 °C for 1 h. To this mixture, ammonium iron citrate was added slowly to a final concentration of 250 μM with careful mixing. After incubation at 8 °C for 5 min, Li₂S was added to a final concentration of 250 μM and the mixture was incubated at 8 °C for 18 h. Unbound iron and sulfide were removed using a PD-10 desalting column (GE Healthcare) equilibrated with anaerobic reconstitution buffer. CtaC-containing fractions were pooled and concentrated using a 10 kDa Amicon Ultra 0.5 mL centrifugal filter (Merck Millipore) in the anaerobic chamber. Following concentration, glycerol was added to a final concentration of 20%, the protein solution was aliquoted into air-tight microfuge tubes and stored at -80 °C until use.
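The concentrations given for the reconstitution translate directly into molar equivalents relative to the protein, which is often the more convenient way to scale the procedure. The following sketch simply divides each final concentration by the protein concentration; the variable names are illustrative.

```python
protein_uM = 25.0  # MBP-CtaC concentration during reconstitution

# Final concentrations stated in the procedure, converted to µM.
additions_uM = {
    "sodium dithionite": 2500.0,
    "ammonium iron citrate": 250.0,
    "Li2S": 250.0,
}

# Molar equivalents of each reagent per protein monomer.
equivalents = {name: conc / protein_uM for name, conc in additions_uM.items()}

for name, eq in equivalents.items():
    print(f"{name}: {eq:.0f} equivalents")
```

Iron and sulfide are therefore each supplied in 10-fold molar excess over the protein, with the reductant in 100-fold excess, and the unbound remainder is removed on the desalting column.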

Chemical quantification of CtaC iron and sulfur content
The iron content of CtaC was quantified using a modified version of the previously established Fe²⁺-ferene complex assay. [11] First, 10 μL of MBP-CtaC (25 μM) was diluted to 20 μL with distilled water. Iron was released from the protein by the addition of 20 μL of 1% HCl and heating to 80 °C for 10 min. After cooling the sample to room temperature, 100 μL of ammonium acetate [7.5% (w/v)], 20 μL of ascorbic acid [4% (w/v)], 20 μL of sodium dodecyl sulfate [2.5% (w/v)] and 20 μL of ferene [1.5% (w/v)] were added. Samples were mixed by vortexing, 100 μL was transferred to a 384-well microtiter plate and the absorbance at 593 nm was measured using a Varioskan Lux microplate reader (Thermo Fisher Scientific). The iron content was determined by comparison to a standard curve generated with 25-400 μM FeSO₄. All measurements were performed in triplicate.
The sulfide content of CtaC was quantified using a modified version of the previously established methylene blue assay. [11][12]  . The pellet was resuspended by vortexing briefly, the color was allowed to develop for 20 min at 22 °C, 100 μL was transferred to a 384-well microtiter plate and the absorbance at 670 nm was measured using a Varioskan Lux microplate reader (Thermo Fisher Scientific). The sulfide content was determined by comparison to a standard curve generated with freshly made 12.5-400 μM Li₂S. All measurements were performed in triplicate.
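Both colorimetric assays convert absorbance to concentration through a linear standard curve. A minimal sketch of that calculation is given below using an ordinary least-squares fit. All numbers are synthetic placeholders generated from an arbitrary line, not measurements from this study, and the sketch assumes that standards and samples were carried through identical dilution steps so that interpolated values refer to the undiluted sample.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def conc_from_absorbance(absorbance, slope, intercept):
    """Interpolate a concentration from the fitted standard curve."""
    return (absorbance - intercept) / slope


# Synthetic standards (µM) and absorbances from an arbitrary line; placeholders only.
standards_uM = [25.0, 50.0, 100.0, 200.0, 400.0]
standard_abs = [0.002 * c + 0.05 for c in standards_uM]

slope, intercept = fit_line(standards_uM, standard_abs)

# Placeholder triplicate readings for one sample.
triplicate_abs = [0.249, 0.251, 0.250]
mean_abs = sum(triplicate_abs) / len(triplicate_abs)
analyte_uM = conc_from_absorbance(mean_abs, slope, intercept)

protein_uM = 25.0  # protein concentration used in the assay
print(f"analyte: {analyte_uM:.1f} µM, ratio per protein: {analyte_uM / protein_uM:.2f}")
```

Readings that fall outside the range spanned by the standards should be re-measured after dilution rather than extrapolated.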

UV-visible Spectroscopy
Spectra were measured using a Varioskan Lux microplate reader (Thermo Fisher Scientific) and a 384-well microtiter plate with a 50 μL sample volume. All protein solutions were analyzed at a concentration of 25 μM. Glycerol-containing reconstitution buffer was used as a blank.

CtaC activity assays
Assays  1, 4, and 5). Additionally, reactions with 5 were performed with 5 μM MBP-CtaC. All reactions were performed under strictly anaerobic conditions and were allowed to proceed for 22 h at 25 °C before product formation was monitored using LC-HR-MS (QExactive).

LC-HR-MS detection of PCP-bound thioamidated products from CtaC reactions
Potassium hydroxide was added to reaction mixtures to a final concentration of 10 mM and samples were incubated at 30 °C for 1 h. Next, one volume of methanol was added to the sample, precipitate was removed by passing the sample through a 0.45 μm syringe filter and substrate processing was determined by LC-HR-MS (QExactive).
Products were compared to authentic thioamidated CTA congeners in the organic extract of R. cellulolyticum cultures.

Detection of ATP-hydrolysis products
Reactions were performed with 0.5 μM CtaC, 30 μM 1-holo-CtaE and 1 mM ATP in buffer [50 mM HEPES (pH 7.5), 150 mM NaCl, 20 mM MgCl₂, 1 mM TCEP] for 18 h at 25 °C in an anaerobic chamber. The reactions were stopped by the addition of 1 volume of methanol and analyzed by HPLC according to an established method. [13] The HPLC profiles were compared to those of reference compounds (Jena Bioscience).

Figure S1. Detailed phylogenetic tree of AANH enzymes. Neighbor-joining phylogenetic tree of diverse sulfur-inserting AANH superfamily members with 1000 bootstrap replicates. The structures of the products synthesized by each AANH subclass are displayed with the installed moiety colored red. See Table S4 for a list of the sequences used to generate this tree. GMP synthetase is used as an outgroup to root the cladogram.

Figure S6. Multiple sequence alignment of CtaC and TtuA/TtcA-type AANH enzymes. Domain structures of TtcA, TtuA and CtaC along with a multiple sequence alignment of members of these AANH subfamilies. The residues responsible for Zn, ATP and Fe-S cluster binding are colored according to the legend and residues selected for mutation in CtaD are colored red.

Figure S7. Phosphopantetheinylation of apo-CtaE and apo-CtaH in vitro by Sfp. MALDI-TOF-MS spectral overlays of (A) CtaE and (B) CtaH phosphopantetheinylation assays with multiple coenzyme A (CoA) derivatives. The expected mass shift relative to apo-PCP for each CoA derivative is listed in parentheses. (C) A multiple sequence alignment of CtaE/H and diverse acyl and peptidyl carrier proteins is displayed. The conserved phosphopantetheinyl transferase recognition site is indicated by the box. The serine residue expected to be the site of phosphopantetheine (PP) attachment on CtaE/H is colored red. In order to localize the site of PP attachment on the PCPs, the conserved serine residue was mutated to alanine for each PCP and the corresponding mutants (CtaE S36A and CtaH S40A) were isolated. (D) MALDI-TOF-MS spectral overlays of phosphopantetheinylation assays performed with mutant versions of CtaE (CtaE S36A) and CtaH (CtaH S40A) that lack the conserved serine residue. Consistent with the prediction, these mutant proteins were not processed by Sfp.

[15]
E. coli HM0079 pCTA1 pCTA2: Strain contains pSU18-CTA1 and pTrc99a-CTA2 for expression of whole CTA gene cluster [2]
E. coli HM0079 pCTA1-ΔctaC pCTA2: Strain contains mutant version of pSU18-CTA1 lacking ctaC (This study)

Table S2. Plasmids used in this study.