A novel synthetic sRNA promoting protein overexpression in cell‐free systems

Bacterial small RNAs (sRNAs) that regulate gene expression have been engineered for use in synthetic biology and metabolic engineering. Here, we designed a novel non‐Hfq‐dependent sRNA scaffold that uses a modifiable 20 nucleotide antisense binding region to target mRNAs selectively and influence protein expression. The system was developed to regulate a fluorescent reporter in vivo in Escherichia coli, but it proved more responsive, and produced statistically significant results, when applied to protein synthesis in in vitro cell‐free systems (CFS). Antisense binding sequences were designed to target not only translation initiation regions but also various secondary structures in the reporter mRNA. Targeting a high‐energy stem loop structure and the 3′ end of mRNA yielded protein expression knock‐downs that approached 70%. Notably, targeting a low‐energy stem structure near a potential RNase E binding site led to a statistically significant 65% increase in protein expression (p < 0.05). These results were not obtainable in vivo, and the underlying mechanism was translated from the reporter system to achieve a better than 75% increase in recombinant diaphorase expression in a CFS. The designs developed here may be applicable to improving or regulating expression of other proteins in a CFS.


| INTRODUCTION
Non-coding bacterial RNAs such as antisense RNAs (asRNAs) and small RNAs (sRNAs) have become important tools in synthetic biology [1][2][3][4][5][6][7][8] and metabolic engineering. 9,10 They have emerged as important regulators in both prokaryotes and eukaryotes 11,12 and have been found to play a central role in regulating gene expression 3 at post-transcriptional 3,13,14 and translational levels. 15,16 Most act in trans 17 by annealing to target mRNAs, typically at or near the ribosome-binding site (RBS) sequence. 18,19 sRNAs can have antisense sequences (a short asRNA portion) to bind mRNA targets and/or act with catalytic function. The asRNA-mRNA duplex formation has been documented to decrease translation [19][20][21][22][23][24][25][26][27] and/or increase the degradation of target mRNA. 2,19,[28][29][30] This resulting decrease in target gene expression has been found to be tunable through calculation and design of asRNA-mRNA binding free energy. 31 Synthetic regulatory asRNAs have been shown to be suitable for conditional gene silencing 32,33 and can serve as a convenient tool to engineer metabolism due to their ability to regulate gene expression without intensive genetic manipulations such as gene knockouts or disruptions 18,26 ; they can also be expressed from inducible promoters, providing additional metabolic controls. Synthetic sRNAs and asRNAs have already been used in a number of applications ranging from detecting metabolic state, 34 balancing metabolic pathway expression, 30 and tightly regulating toxin genes. 1 Promising applications also include multiplexing and combinatorial applications to screen for enhanced target chemical production 31 and optimization of target pathway gene expression levels and down-regulation of competing pathways. 
5,7,8,31 Finally, asRNA technology has been used in industrial processes to improve yields and reduce byproducts 4,6,8,21 and to optimize protein production, 25 and it has been combined with CRISPR/Cas9 systems for derepression. 9,10,14 Most of the asRNA applications mentioned above, and synthetic biology strategies in general, make use of living cells.
However, the complexities of living cells, including selective induction, natural variability, and difficulties in standardization, can make these systems difficult to work with and sometimes irreproducible. 35,36 To address these challenges, cell-free systems (CFS) were developed. [35][36][37][38] Cell-free biology systems harness cellular machinery in vitro 36,39,40 and can be categorized into three main classes. The first is a simplified CFS composed of crude extract prepared simply by lysing bacterial, plant, or animal cells. 35,36,41 The second contains purified Escherichia coli translational components and required buffers, amino acids, and energy/cofactor components, 38,[42][43][44][45] and the third class of CFS contains these with purified enzymes and complexes from different sources. 39,[46][47][48][49][50] The ability of CFSs to bypass the complexities of living cells in an in vitro environment offers engineering flexibility and other advantages such as controllable transcription/translation modifications, high synthesis rates and product yields, high tolerance to toxic substrates/products/intermediates, and easy product purification. 35,37,40,41,50,51
Here, we report the development of a novel synthetic sRNA design for modulating gene expression that expands the existing toolset by enabling gene overexpression in addition to expression knockdown. Our synthetic sRNA construct is composed of three parts: (i) a stem-loop stabilizer sequence, (ii) an antisense target binding sequence, and (iii) a terminator. It has been reported that stem-loop structures facilitate sRNA-mRNA interactions 17,52 and that a stem structure close to the 5′ end helps sRNA stability. 23,53 Including the stabilizer sequence, which forms a hairpin structure, protects the sRNA against exonucleolytic degradation. 54 The sRNA operates independently of Hfq. Hfq binds preferentially to the AU-rich region upstream of the terminator and to the poly(U) tail of the Rho-independent terminator. 
19,[55][56][57][58][59] These elements were omitted from our design. The small size of the sRNA (48 nt total) allows multiple sRNAs to be contained on a single plasmid or genomic insert, and the antisense sequence is easily swappable by PCR. To demonstrate its function, the antisense portion of the sRNA was designed to target several predicted secondary structures in a fluorescent reporter, in addition to the RBS and 3′ end.
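The three-part layout described above (stabilizer, 20 nt antisense region, terminator; 48 nt in total) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the stabilizer and terminator strings are hypothetical 14 nt placeholders, not the published scaffold sequences; the function names are ours; and target positions are assumed to be 1-based on the mRNA.

```python
# Hypothetical 14 nt placeholders, NOT the published scaffold sequences.
STABILIZER = "GCGCGAAAGCGCAA"
TERMINATOR = "CCGGCUUCGGCCGG"

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense(mrna: str, start: int, length: int = 20) -> str:
    """Return the 5'->3' reverse complement of a target window on the mRNA.

    `start` is assumed to be the 1-based position of the first targeted nt.
    """
    rna = mrna.upper().replace("T", "U")
    target = rna[start - 1 : start - 1 + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the mRNA")
    return target.translate(COMPLEMENT)[::-1]


def build_srna(mrna: str, start: int) -> str:
    """Assemble stabilizer + 20 nt antisense region + terminator."""
    return STABILIZER + antisense(mrna, start) + TERMINATOR


# Toy 30 nt "mRNA" used purely for illustration.
toy_mrna = "A" * 10 + "G" * 10 + "C" * 10
srna = build_srna(toy_mrna, 1)
print(srna, len(srna))
```

Under the additional assumption that a construct name encodes the first targeted nucleotide, a call such as `build_srna(reporter_mrna, 606)` would correspond to an asB606-style design, with `reporter_mrna` standing in for the actual reporter transcript.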
The system was initially built and evaluated in E. coli, but here, we also demonstrate that due to cell induction inconsistency, a simplified CFS provided more statistically significant results when testing the system. Most importantly, we were able to increase the expression level of the fluorescent protein reporter used in this study significantly by occluding the target region of RNase E. Mechanisms were learned and applied to diaphorase (DI) expression in vitro to show how the system can be modified to target gene expression in a CFS or cell-free protein synthesis (CFPS) reaction.

| Bacterial strains and culture conditions
E. coli 10-β (New England Biolabs; Ipswich, MA) were used as the host expression system in all experiments and were cultured in LB growth medium (10 g/L tryptone, 5 g/L yeast extract, 10 g/L sodium chloride) supplemented with ampicillin to a final concentration of 100 μg/mL. Appropriate concentrations of arabinose (Ara) and anhydrotetracycline (ATc) were used to induce protein and asRNA expression. E. coli BL21 Star (DE3) (New England Biolabs) were used for diaphorase expression in CFS experiments. Descriptions of all strains and plasmids used in this study are given in Table S1, and all PCR primer sequences are given in Table S2 of Supplementary Appendix 1.
All data used in this study were obtained from at least biological replicates.

| Plasmid construction and sRNA design
Three different plasmids were constructed for this study and are shown in Figure S1 (and described in Table S1) in Supplementary Appendix 1. 60 The first plasmid, pART15-C1, was used as a reporter and contains the Anemonia majano cyan fluorescent protein gene (AmCyan) under the control of the Ara-inducible P_BAD promoter. pART15-C2 was also used as a reporter to determine the expression levels of AmCyan under the control of the ATc-inducible P_LtetO-1 promoter. This was used to determine relative expression levels of sRNAs. pART15-C1-LacZ was
The synthetic sRNA scaffold is shown in Figure 2 and was designed to have three parts, described from 5′ to 3′: (i) a 14 nt synthetic stem-loop stabilizer, (ii) a 20 nt antisense sequence complementary to the target mRNA, and (iii) a 14 nt stem-loop transcriptional terminator. 60 sRNA antisense sequences (Table S3) were designed to target the following regions of the AmCyan mRNA: (i) the ribosome binding site (RBS) (antisense sequence referred to as "asRBS"), (ii) the first 20 nt of AmCyan from the 5′ end (asB1), (iii) a potential high-energy loop (asB30), (iv, v) potential low-energy loops (asB57, asB180), (vi-viii) potential high-energy stem structures (asB407, asB430, asB539), (ix, x) potential low-energy stem structures (asB606, asB615), (xi, xii) regions near the 3′ end (asB693, asB701), and (xiii) a control without a binding region, containing only the stabilizing region and terminator (asCTRL). Predictions of RNA secondary structures and energy regions were computed using NUPACK software (http://www.nupack.org). 63 All of the target binding sequences were searched with BLAST against the E. coli genome and the pART15 plasmid to ensure no other interactions were anticipated.
FIGURE 2 Synthetic sRNA scaffold structure containing a 14 nt synthetic stem-loop as stabilizing sequence, a 20 nt binding sequence complementary to the target region, and a 14 nt stem-loop transcriptional terminator. The structure shown was exported from NUPACK. 63
were measured at ex. 458 nm, em. 489 nm. Normalized fluorescence (F_N) was calculated by dividing the fluorescence measurement by the OD600 reading. Fluorescence change (F_Δ) was calculated by comparing the fluorescence of cells grown supplemented with both ATc and Ara with the positive control fluorescence (pART15-C1), using Equation 1:
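A form consistent with these definitions is sketched here, assuming the change is expressed as a percentage of the pART15-C1 control, as the table notes describe. The superscript labels "sRNA" (cells induced with both ATc and Ara) and "ctrl" (the pART15-C1 positive control) are our notation, not necessarily the authors':

```latex
F_N = \frac{F}{\mathrm{OD}_{600}},
\qquad
F_\Delta = \frac{F_N^{\mathrm{sRNA}} - F_N^{\mathrm{ctrl}}}{F_N^{\mathrm{ctrl}}} \times 100\%
```

Under this form, a negative F_Δ corresponds to a knockdown and a positive F_Δ to an increase in reporter expression relative to the control.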

| Diaphorase over-expression
Four plasmids expressing sRNAs targeting potential binding sites of RNase E in diaphorase (DI, GenBank accession number JQ040550) were constructed. pBAD-LIC was used as the backbone plasmid to clone the four antisense fragments (Table S3 in  was attributed to diaphorase production. Controls (no diaphorase production) were used to verify this measurement.

| Statistical analysis
All data were obtained from biological replicates as indicated in the figures and tables. Statistical analysis was performed using ANOVA tests, with pairwise comparisons made using Tukey's honest significant difference (HSD) test.
showed that induction with 0.05 mM Ara was sufficient to induce the entire population (Figure S3). Similarly, cells harboring pART15-C2 were induced with different levels of ATc, and a concentration of 100 ng/mL yielded full and uniform induction of all cells (Figure S4). In addition, crosstalk between the ATc- and Ara-inducible systems was assessed. Flow cytometry results showed that the presence of ATc did not influence expression from the Ara-inducible promoter. Interaction between ATc and Ara appeared to occur only at very low inducer concentrations (Figure S5), which were outside the range used in our experiments.
a The average response is the percent increase or decrease in fluorescence observed relative to pART15-C1. b The asCTRL contains the stabilizer stem loop and terminator but no antisense sequence.
designed to bind nt 430-450 of the reporter mRNA. This is far from the translation initiation region. Targeting the high-energy stem further (asB407 and asB539) produced decreases in fluorescence, with some results being statistically significant (e.g., −43% for asB407 at 18 h). Notably, we observed a statistically significant increase in fluorescence of more than 35% from asB606 at 18 h, but a statistically significant increase was not observed at 29 h.
Pairwise comparison testing was applied to the fluorescence results of all sRNAs at every time point to identify statistically significant differences. Of all comparisons made, only 7.7% were statistically significant. Thus, despite the significant findings described above, most of the sRNA constructs varied (some widely) in fluorescence readings; it has been suggested in the literature that the extreme complexity of living cells may play a role. 36 In addition, the objective of microbial cells is growth, not the production of target products, which usually runs counter to synthetic biology and metabolic engineering goals. 67 These results were investigated further by flow cytometry, which revealed nonuniform induction of the cultures when both the Ara and ATc inducers were used. This is shown in Figure S8 and demonstrates induction heterogeneity that differs between sRNAs and even between replicates. The mechanisms leading to this behavior remain unknown; however, a CFS was used to repeat these experiments, since cell-to-cell induction variations would be negligible.
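The two quantities discussed here, the ANOVA test statistic and the share of pairwise comparisons that reach significance, can be illustrated with a minimal pure-Python sketch. It is not the authors' analysis pipeline: the function names and all numbers are hypothetical, and p-values for the F test and Tukey's HSD would in practice come from a statistics package rather than from this code.

```python
from statistics import mean


def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of replicate groups."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def fraction_significant(p_values, alpha=0.05):
    """Share of pairwise comparisons with p < alpha."""
    return sum(p < alpha for p in p_values.values()) / len(p_values)


# Hypothetical normalized fluorescence replicates for three constructs.
replicates = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(replicates))

# Hypothetical Tukey HSD p-values keyed by construct pair.
pairwise_p = {
    ("asRBS", "asB1"): 0.01,
    ("asRBS", "asB606"): 0.20,
    ("asB1", "asB606"): 0.03,
    ("asCTRL", "asB606"): 0.50,
}
print(fraction_significant(pairwise_p))
```

Reporting the fraction of significant pairs, as done in the text, gives a single number for comparing how well two experimental systems resolve differences between constructs.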

| Implementation of a CFS (in vitro system)
To facilitate the analysis of the synthetic asRNAs and to overcome the induction heterogeneity of implementation in vivo, we con- With this simplified CFS, the system required 24 h for induction (the point where fluorescence appeared), and the maximum was achieved by 96 h. All fluorescence readings were averaged and normalized by the OD600. Compared with the in vivo system, fluorescence data with pART15-AS constructs were more consistent, and these data are shown in Figure S16 and Supplementary Appendix 2. Three time points were chosen to analyze the data (48 h, 72 h, and 96 h). The CFS produced many more statistically significant results than did the in vivo system. In fact, 44.6% of all pairwise comparisons were found to be statistically significant (p < 0.05). This was a substantial increase from the 7.7% of the in vivo system, and fluorescence results were far more consistent. ANOVA test results are given in Table 2. Targeting the RBS did not produce a statistically significant decrease in fluorescence in the CFS, but a significant decrease was observed when targeting the first 20 nt, and repression levels were higher than in the in vivo system. However, targeting other secondary structures produced far superior repression. For example, targeting the high-energy stem produced a greater than 65% (statistically significant) fluorescence reduction, and targeting the 3′ end of the mRNA (asB701) led to more than 70% fluorescence reduction, compared to about 19% reduction in the in vivo system (both statistically significant).
However, we found that selectively targeting the low-energy stem (asB606) led to a statistically significant increase in fluorescence of more than 66% at 48 and 96 h (it was more than 50% at 48 h). This OmpC. 71 MicC requires the Hfq RNA chaperone for its function, 71,72 and studies have shown its superior repression capabilities when compared to other scaffolds. 31 In fact, there is a high frequency of RNase E mediated cleavages in Hfq-dependent asRNAs. [73][74][75][76] Using the MicC scaffold, we added the same antisense binding regions used in asRBS (targets the mRNA RBS), asB1 (targets the first 20 nt of the mRNA) and asB606 (targets the low-energy stem loop of the reporter mRNA starting at nt 606). Results are also given in Table 2.
Targeting the RBS with our MicC-asRBS sRNA led to an increase in fluorescence expression, which is contrary to results published for other in vivo systems. A possible mechanism for this is discussed later. However, targeting the first 20 nt with the MicC-asB1 led to a statistically significant reduction in fluorescence of better than 40%.
This was greater repression than the 20% (and not statistically signif- whereas, our synthetic scaffold produced an increase in fluorescence of over 65%. Thus, a mechanism was sought to explain the overexpression observation by our asB606 sRNA.

| Probing the mechanism of asB606 with NUPACK and catRAPID
To determine the mechanism behind asB606 sRNA-mediated overexpression, analysis with the NUPACK thermodynamic software 63   The average response is the percent increase or decrease in fluorescence observed relative to pART15-C1. b The asCTRL contains the stabilizer stem loop and terminator but no antisense sequence.

| Implementation for diaphorase production in vitro
Diaphorase (DI) is a soluble NAD(P)H dehydrogenase (EC 1.6.99.1 or EC 1.6.99.3). It is involved in balancing NADH generation and NADPH consumption in anaerobic conditions. 64 It is applied in synthetic enzymatic pathways for the production of high-value chemicals. 64,85,86 Therefore, overexpression of this enzyme as part of a synthetic pathway is of interest. Previous results with the sRNA asB606 showed that occluding the RNase E binding site increased the expression of a fluorescent reporter by 50-80% in a CFS. Here, this design was adapted and implemented for diaphorase expression in a CFS.
Potential RNase E binding sites were predicted with catRAPID, [82][83][84] and four target regions of interaction between the diaphorase mRNA and RNase E were identified (Figure S20). Plasmids (pBAD-AS) harboring asRNAs targeting those regions were constructed (pBAD-as410, pBAD-as490, pBAD-as510, and pBAD-as528). Descriptions, primers, and antisense sequences are given in Tables S1-S3. Next, an in vitro diaphorase-CFS was developed by combining cells expressing diaphorase and cells transformed with one of the pBAD-AS plasmids in the same sonication reaction. Diaphorase expression was observed in the presence of multiple sRNAs, as shown in Figure 4. Asterisks mark expression levels that are statistically significant (compared to non-induced cells for the same sRNA).
The induction of as410, as490, and as528 improved the expression level of diaphorase by about 50%, 66%, and 75%, respectively. However, only a slight increase was observed for as510 (3.5%), and no effect was observed for other replicates of as490 (data not shown).
An understanding of the asB606 mechanism showed that this particular asRNA likely promoted overexpression by occluding the RNase E binding site, thereby inhibiting mRNA degradation. This effectively extends the mRNA half-life, enabling increased translation. Importantly, this mechanism is likely predictable using NUPACK and catRAPID software. We tested this with the expression of diaphorase in vitro and found increased expression relative to the wild type. We anticipate this mechanism can be used in CFPS reactions to further increase yields. There is strong interest in utilizing these systems to produce enzymes for catalysis and in vitro metabolic engineering. 50,64,85,86 The translation of our sRNAs from improving fluorescent reporter expression to diaphorase shows promise for this system to be applied universally in CFSs. Further improvements are required for reliable function of our synthetic sRNAs in vivo, yet it remains unknown why the system leads to heterogeneous, and somewhat randomized, induction. It is recognized that this phenomenon can be inherent in analyzing some synthetic circuits in vivo, making the simplified CFS used here a reliable alternative. An improved understanding of the differences and heterogeneities observed between in vivo and CFS expression of our sRNA is needed so it can be engineered to enhance protein expression in vivo and be used as a reliable tool for metabolic engineering.
AUTHOR CONTRIBUTIONS
Imen Tanniche: Conceptualization (equal); data curation (lead); formal analysis (lead); investigation (lead); methodology (lead); writing - original draft (lead). Hadi Nazem-Bokaee: Conceptualization (equal); formal analysis (equal); investigation (equal); methodology (equal); writing