Cotranslational folding of alkaline phosphatase in the periplasm of Escherichia coli

Abstract Cotranslational protein folding studies using Force Profile Analysis, a method where the SecM translational arrest peptide is used to detect folding‐induced forces acting on the nascent polypeptide, have so far been limited mainly to small domains of cytosolic proteins that fold in close proximity to the translating ribosome. In this study, we investigate the cotranslational folding of the periplasmic, disulfide bond‐containing Escherichia coli protein alkaline phosphatase (PhoA) in a wild‐type strain background and a strain background devoid of the periplasmic thiol: disulfide interchange protein DsbA. We find that folding‐induced forces can be transmitted via the nascent chain from the periplasm to the polypeptide transferase center in the ribosome, a distance of ~160 Å, and that PhoA appears to fold cotranslationally via at least two disulfide‐stabilized folding intermediates. Thus, Force Profile Analysis can be used to study cotranslational folding of proteins in an extra‐cytosolic compartment, like the periplasm.


| INTRODUCTION
Protein secretion across the inner membrane in the Gram-negative bacterium Escherichia coli has been classically thought to proceed post-translationally, orchestrated by an interplay between the signal peptide of the protein, cytoplasmic chaperones such as Trigger Factor and SecB, the ATPase SecA, and the SecYEG translocon. [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17] Post-translational export commences when SecA recognizes a secretory protein via its signal sequence, and targets it to the inner membrane. [18][19][20][21][22] Folding en route to the membrane is prevented by sequence-specific motifs within the protein itself, 23 and by interactions with cytoplasmic chaperones such as SecB. 12,24,25 The unfolded state enables the protein to be exported via the SecYEG translocon, 26 driven by the SecA ATPase activity and by the proton motive force. [27][28][29][30][31] The signal sequence of the protein is then cleaved off by signal peptidase I (LepB), after most of the protein is completely translocated across the inner membrane into the periplasm. 32,33 Once in the oxidizing environment of the periplasm, the protein commences folding, often involving the formation of disulfide bridges catalyzed by the periplasmic Disulfide bond (Dsb) system. [34][35][36][37] The existing view that secretion of proteins across the inner membrane occurs post-translationally has been recently challenged by experiments demonstrating that SecA can associate with ribosomes and ribosome-bound nascent chains during translation. [38][39][40][41][42] These studies suggest that proteins may also be targeted and secreted cotranslationally, and are consistent with earlier findings that signal peptides of certain periplasmic proteins are cleaved before the proteins are fully exported. 32 Moreover, it has been found that translation intermediates of alkaline phosphatase (PhoA) can form complexes with the periplasmic thiol: disulfide interchange protein DsbA, indicating that a population of PhoA can fold cotranslationally, as assayed by disulfide-bond formation. 43 Force-profile analysis (FPA) is a recently developed method that has been used to study cotranslational protein folding of cytoplasmic proteins, both in vitro and in vivo. [44][45][46][47][48][49][50][51][52][53] FPA takes advantage of the sensitivity of the SecM-family of translational arrest peptides (APs) to pulling forces acting on the nascent chain: the higher the pulling force, the less efficient is the translational stall induced by the AP. 54,55 Many cotranslational processes, including protein folding, can generate force on the nascent chain, and are hence amenable to FPA. Here, we ask whether FPA can be used to study cotranslational folding in the periplasm of E. coli, that is, when proteins fold ≳160 Å away from the peptidyl transferase center (PTC) in the 50S subunit of the ribosome, where the force exerted on the nascent chain is sensed by the AP. Specifically, we have analyzed the cotranslational folding of PhoA. Our results suggest the existence of at least two disulfide-stabilized, cotranslational folding intermediates that form when 330 and 360 residues of mature PhoA have emerged into the periplasm.

| RESULTS AND DISCUSSION
2.1 | FPA as a tool to study protein folding in the periplasm FPA is based on the principle that cotranslational folding of a protein domain fused to an AP can generate a pulling force on the AP that reduces its stalling efficiency. [44][45][46][47][48][49][50][51][52][53] To study the cotranslational folding of PhoA by FPA, different lengths of PhoA were fused to the 17 amino-acids long E. coli SecM AP via a four amino acid long glycine-serine linker and a hemagglutinin tag (HA; Figure 1a,b); the length N of each construct is calculated from the N-terminal end of PhoA (including the signal peptide) to the last amino acid in the AP. The 30 amino-acids long linker sequence (serine-glycine linker + HA tag + AP) was kept constant for all constructs, to reduce background noise in the force profile (FP) stemming from interactions of the elongating chain with the ribosome exit tunnel. A 77-residue long C-terminal tail derived from the E. coli protein LacZ was engineered at the C terminus of the AP to ensure good separation of the arrested (A) and fulllength (FL) forms of the protein by SDS-PAGE (Figure 1c). Since PhoA has a cleavable signal peptide, the possible appearance of both cleaved and uncleaved forms could complicate interpretation of the FPA data due to multiple protein species on SDS-PAGE. To avoid this, we used a variant of PhoA with a point mutation (A 21 P) in the signal peptide that renders it uncleavable, yet does not interfere with export of PhoA into the periplasm. 32,56 All constructs were expressed in E. coli MC1061 or in MC1061ΔdsbA. Proteins were labeled for 2 min with [ 35 S] methionine, followed by immune-precipitation against the HA-tag and separation by SDS-PAGE. As a proxy for the pulling force acting on the nascent chain, the fraction full-length protein (f FL ) was calculated for each N as the ratio between the intensity of the FL protein band and the sum of the intensities of the A and FL protein bands (Figure 1c). f FL values were plotted against their corresponding N values to obtain a FP for PhoA. Both the A and FL PhoA protein products for lengths N = 403 and N = 431 showed a difference in their migration on SDS-PAGE in the presence or absence of the reductant 1,4-dithiothreitol (Figure 1d), indicating disulfide bond formation and therefore translocation into the periplasm, as depicted in Figure 1b.

| The PhoA FP reveals two disulfide bond-stabilized folding intermediates
The PhoA FP obtained in wild-type E. coli (Figure 2a), reveals multiple peaks with maxima at N = 311, 403-407, and 431 residues. In addition, the f FL value for the N = 555 construct, in which the entire PhoA domain has been translocated into the periplasm and is connected via an 84-residue long linker to the PTC, is also relatively high.
The tightly spaced Peaks II and III at N ≈ 400-435 residues are particularly interesting, since they correspond well to the respective lengths of PhoA nascent chains (39 and 44 kDa, not including the signal peptide) when the cotranslational formation of the C190-C200 and C308-C358 disulfide bonds (catalyzed by DsbA) can first be detected. 43 Given that an extended polypeptide of 50 residues length should be able to span the 160 Å distance between the PTC and the periplasmic exit from the SecYEG translocation channel, 57 C308 should emerge into the periplasm at N ≈ 360 residues, and C358 at N ≈ 410 residues. This suggested to us that Peaks II and III may reflect the formation of disulfidestabilized folding intermediates in the periplasm. This is also supported by Figure 1d, and by in vivo studies that F I G U R E 1 (a) Diagrammatic representation of the PhoA constructs. Different lengths of PhoA (black) with a NC signal peptide (magenta) were fused to the E. coli SecM arrest peptide (EcSecM; purple) via a GS linker (gray) and an HA tag (pink) (a comprehensive list of sequences for all constructs is included as Supporting Information). The 30 amino-acid long linker sequence (serine-glycine linker + HA tag + AP) was kept constant for all constructs. A 77 amino acid long C-terminal tail was added at the C-terminal end of the AP to allow for resolution of arrested A and full-length FL protein products by SDS-PAGE. The different lengths of PhoA were generated by truncating from the C-terminal end of the PhoA coding region. (b) A schematic representation of cotranslational translocation of a stalled ribosome-nascent chain of PhoA with the NC signal sequence (magenta) docked in the inner membrane (IM). and the mature domain of PhoA translocated into the periplasm where it can form disulfide bonds upon interacting with DsbA. Translation was arrested at the ribosome by SecM (purple), and the 30 amino acid long sequence spanning the ribosome tunnel (SecM + HA tag + GS linker) was kept constant for all constructs. (c) Autoradiographs of three different constructs of SecM-stalled PhoA obtained after radioactive pulse-labeling in vivo and SDS-PAGE. The relative amounts of A and FL were estimated by quantification of the protein bands in the autoradiographs, and the fraction full-length was calculated as f FL = I FL /(I A + I FL ). f FL serves as a proxy for the force generated by cotranslational folding of the mature domain of PhoA in the periplasm. (d) Autoradiographs of constructs N = 403 and N = 431 demonstrate a difference in protein migration on SDS-PAGE ±DTT. DTT was either excluded or included in the Laemmli buffer that was used to elute the immune-precipitated proteins after pulse-labeling. The differences in migration reflect the formation of disulfide bonds in the absence of DTT, which are reduced in its presence. The schematics in panels A and B were created using BioRender.com have shown that both the C190-C200 and the C308-C358 disulfide bonds are needed for mature PhoA to reach full stability. 58 To test this hypothesis, we first recorded f FL values in and around Peaks I, II, and III in the MC1061ΔdsbA strain (Figure 2b). Significant reductions in f FL (p < .01, two-sided t-test) were seen for constructs N = 403 and N = 431. We further mutated the four Cys residues involved in disulfide-bond formation (C190, C200, C308, and C358) to Ala, either individually or in combination, in constructs N = 311 (Peak I), 403 and 407 (Peak II), and 431 (Peak III; Figure 3a sensitive to mutations in the first disulfide (C190, C200) but not in the second (C308, C358). Likewise, the data for Peak III (N = 431) suggest the existence of a second intermediate encompassing Residues 1 to 360 (Figure 4b), in which all four Cys residues have emerged into the periplasm. In this case, the single Cys ! Ala mutations destabilize the folded state both directly and also indirectly through the formation of mis-paired disulfides between the three remaining Cys residues. 43,58 The double mutations C190 + C200 and C308 + C358 are less detrimental as they leave one native disulfide intact and do not lead to mis-pairing, while the triple mutations prevent both native disulfides from forming. Whether Peak I also reflects the formation of a folding intermediate . C190A and C200A form the first pair of disulfide bonds in PhoA and are exposed to the periplasm at this length. (b) For length N = 403, no significant effects are seen for the individual mutants C190A and C200A (orange bars) or C308A and C358A (that form the second disulfide bond; magenta bars) compared to the wild-type PhoA truncate. C190A, C200A, and C308A should be in the periplasm at this length. (c) At N = 407, a significant reduction (2-sided t-test, p < .01, indicated as **) was observed for C190A and C200A (orange bars) compared to the wild-type PhoA truncate (purple bar). No significant reduction was seen for C308A and C358A (magenta bars). (d) At N = 431, all four Cys residues are exposed to the periplasm, and the consequences of mutating them to Ala result in significant reductions (p < .01, indicated as **) for C200A (orange solid bar), C308A and C358A (magenta solid bars). No significant reduction was seen for the double mutations C190A-C200A (orange striped bar) or C308A-C358A (magenta striped bar). A significant reduction (p < .01, indicated as **) was observed for the triple mutant C190A-C308A-C358A, (blue striped bar). Error bars indicate SEM values calculated from independent triplicates of all constructs remains unclear, but in any case disulfide bond formation appears not to be involved (Figures 2b and 3a).
We conclude that FPA can detect long-range forces on the nascent chain generated by cotranslational protein folding in the periplasm, and that PhoA folds cotranslationally via at least two disulfide-stabilized intermediate states. It will be interesting to further investigate the folding of periplasmic proteins with nonconsecutive disulfide bonds, and to dissect the role of periplasmic chaperones in cotranslational folding.

| Reagents, chemicals, and primers
All chemicals and reagents were purchased from Merck Sigma Aldrich; primers for PCR and DNA sequencing were purchased from Eurofins Genomics; gene fragments, Bis-Tris gels, Phusion DNA polymerase, GeneJet plasmid isolation kits, GeneJet PCR purification kits, and GeneJet Gel extraction kits were purchased from Thermo Fisher Scientific; [ 35 S] Methionine was purchased from Perkin Elmer; Protein-G agarose beads were manufactured by Roche and purchased from Sigma Aldrich; the mouse monoclonal antibody against the HA antigen was purchased from Covance.

| Design and engineering of fulllength PhoA fused to the SecM AP
The gene sequence encoding full-length PhoA-GSGS-HA-SecM was designed in silico and ordered from GeneArt, Thermo Fisher Scientific.
The construct was designed as follows: the sequence of the phoA gene from E. coli K12 MG1655 was obtained Cys residues are indicated in spacefill and numbered according to UniProt ID P00634, that is, counting from the N-terminus of the signal peptide (which is 22 residues away from the N-terminus of the mature protein). Prepared using the PyMol molecular graphics software (https://pymol.org) from the EcoCyc database, 59 and was engineered upstream of sequences encoding a four amino-acid long GSGS linker, an HA tag, the 17 amino acid long E. coli SecM AP, and a C-terminal tail consisting of 77 residues derived from the sequence of LacZ.
The obtained gene fragment was engineered by Gibson Assembly 60 into a previously described plasmid that harbors an arabinose-inducible promoter and an ampicillin resistance gene for selection, 49 and sequenced for verification.

| Engineering a noncleavable PhoA signal and generation of truncates
The noncleavable (NC) PhoA signal peptide (MKQST IALALLPLLFTPVTKPR) was designed based on predictions using SignalP 5.0 61 and PhoA mutants with NC signal peptides. 56 The sequence used consists of a single amino acid substitution (A-P) at Residue 21 in PhoA (underlined above). This single substitution was introduced into the gene sequence encoding PhoA-GSGS-HA-SecM using partially overlapping PCR and the following pair of primers: forward 5 0 -GTGACAAAACCACGGACACCAGAAA TGCC-3 0 and, reverse 5 0 -TTCTGGTGTCCGTGGTTTTG TCACAGGGGTAAA-3 0 .
The resulting parental construct-NCPhoA-GSGS-HA-SecM-was used to engineer constructs that were truncated every 10 amino acid residues from the C-terminus of PhoA (primer list and protein sequences provided as Supporting Information).
The resulting PCR products were subjected to Dpn1 digestion and transformed into DH5α cells, thereafter the plasmids were isolated and sequences verified (Eurofins Genomics).
Plasmids bearing the correct sequences were transformed into E. coli MC1061 competent cells for in vivo expression and pulse labeling.

MC1061ΔdsbA strain
The MC1061ΔdsbA strain was engineered using the λ-Red recombineering-based approach developed by Datsenko and Wanner. 62 In short, a kanamycin cassette with flanking regions homologous to the up-and downstream regions of the dsbA gene was generated by PCR using the pKD13 plasmid as a template and the dsbA-FRT-fw and dsbA-FRT-rv primers. The PCR-reaction was treated with DpnI (NEB CutSmart) to get rid of the template DNA. The PCR reaction was loaded on an agarose gel to verify the molecular weight of the generated PCR product and it was subsequently purified using the Gen-eJET PCR Purification kit (Thermo Scientific). The purified PCR product was electroporated into E. coli MC1061 cells harboring pKD46 that were cultured in standard Lysogeny Broth (LB) in the presence of 0.2% arabinose at 30 C. After verifying the correct insertion of the kanamycin cassette into the dsbA gene using colony PCR and the primers dsbA-up and dsbA-down, dsbA::Km R was transferred from MC1061ΔdsbA::Km R /pKD46 to wild-type MC1061 by means of P1-transduction 63  After 14 hr, the cultures were back-diluted into 2 ml LB (supplemented with 100 μg/ml Ampicillin) to an A 600 of 0.1, and the fresh suspensions were cultured at 37 C with shaking at 200 rpm till they reached an A600 of 0.4 (measured using a Shimadzu UV-1300 spectrophotometer). The cultures were chilled on ice for 5 min, transferred into 2 ml reaction tubes (Sarstedt), and cells were isolated by centrifugation at 5000g for 8 min in a cooled table-top centrifuge. The supernatant was discarded rapidly, and the cell pellets were washed in M9 minimal medium (supplemented with 1 μg/ml of 19 amino acids excluding methionine, 100 μg/ml Thiamine, 0.1 mM CaCl 2 , 2 mM MgSO 4 , 0.4% w/w fructose, and 100 μg/ml Ampicillin), by gentle resuspension.
The suspensions were centrifuged once again to obtain cell pellets, which were subsequently resuspended in M9 medium at 37 C. These were cultured for 1 hr at 37 C and shaking at 200 rpm.
The cell cultures were then transferred into 2 ml reaction tubes and placed in slots in a table-top thermomixer (Eppendorf) set to 37 C and shaking at 700 rpm for 5 min prior to pulse-labeling. Expression of the gene of interest was induced by 0.2% Arabinose for 5 min, and cells were pulse-labeled with 4 cu [ 35 S] Methionine for 2 min. The reaction was aborted by treating the cells with ice-cold Trichloroacetic acid (TCA) at a final conc. of 10%, and incubating at −20 C for 20 min. The precipitated samples were centrifuged for 10 min at 20,000g at 4 C in a table-top centrifuge.
The resulting pellets were washed with ice-cold acetone (to neutralize the TCA and solvate lipids) and centrifuged at 20,000g for 10 min in a table-top microcentrifuge (Eppendorf). The obtained pellets were airdried to rid them of residual acetone and solubilized with 120 μl Tris-SDS (10 mM Tris-HCl pH 7.5 and 2% SDS) at 56 C for 10 min and shaking at 1,000 rpm in a table-top thermomixer (Eppendorf). The resulting suspension was centrifuged 20,000g for 10 min at room temperature in a table-top micro-centrifuge (Eppendorf), and 100 μl of the supernatant was subjected to immune precipitation using the anti-HA antiserum.
The obtained samples were separated by SDS-PAGE and visualized by autoradiography on a FujiFilm Scanner FLA-9000.

| Quantification of radioactively labeled proteins
The FL and A protein bands on the gel visualized after autoradiography were quantified using MultiGauge (Fujifilm) from which one-dimensional intensity profile of each gel lane was extracted. The band intensities were fit to Gaussian distributions using EasyQuant. 54 The sum of the arrested and full-length band intensities was calculated, and this was used to estimate the fraction of fulllength protein for each construct. Independent triplicates were run for all constructs, and averages and SEM values were calculated.