Flagellar region 3b supports strong expression of integrated DNA and the highest chromosomal integration efficiency of the Escherichia coli flagellar regions

The Gram-negative bacterium Escherichia coli is routinely used as the chassis for a variety of biotechnology and synthetic biology applications. Identification and analysis of reliable chromosomal integration and expression target loci is crucial for E. coli engineering. Chromosomal loci differ significantly in their ability to support integration and expression of the integrated genetic circuits. In this study, we investigate E. coli K12 MG1655 flagellar regions 2 and 3b. Integration of the genetic circuit into seven and nine highly conserved genes of the flagellar regions 2 (motA, motB, flhD, flhE, cheW, cheY and cheZ) and 3b (fliE, F, G, J, K, L, M, P, R), respectively, showed significant variation in their ability to support chromosomal integration and expression of the integrated genetic circuit. While not reducing the growth of the engineered strains, the integrations into all 16 target sites led to the loss of motility. In addition to high expression, the flagellar region 3b supports the highest efficiency of integration of all E. coli K12 MG1655 flagellar regions and is therefore potentially the most suitable for the integration of synthetic genetic circuits.


Introduction
The Gram-negative model bacterium Escherichia coli is capable of thriving in a wide variety of environments (Juhas et al., 2014a). Easily amenable to genetic manipulations, E. coli strain K-12 is among the most frequently used hosts for cloning and the intermediate and the final destination chassis for engineering large DNA fragments.
Introduction of synthetic DNA fragments into the E. coli genome by chromosomal integration has many advantages over plasmid-borne transformation (Cunningham et al., 2009;Marcellin et al., 2010). Furthermore, integration into the chromosome could be exploited for heterologous protein expression, particularly for expression of toxic proteins in E. coli. Work on plasmids has shown that regulation of expression is tighter when the copy number is low (Anthony et al., 2004;Guan et al., 2013). Frequently used methods of E. coli chromosomal integration include the integrase-mediated recombination between the phage attachment sites (att) (St-Pierre et al., 2013) and the λ bacteriophage Red recombinase-mediated recombination employing knock-in/knock-out (KIKO) vectors (Sabri et al., 2013), plasmid pSB1K3(FRTK) (Juhas et al., 2014b) and the yeast mitochondrial homing endonuclease I-SceI (Ublinskaya et al., 2012). Chromosomal integration target sites differ significantly in their ability to support integration and expression of the integrated genetic circuits (Juhas et al., 2014b). As the traditionally used att sites are missing in a number of industrially important E. coli strains, identification and validation of reliable chromosomal integration target sites is crucial for E. coli engineering. Ideally, integration target sites should be well-characterized, non-essential, conserved and highly expressed (Fraser et al., 1999;Baba et al., 2006;Vora et al., 2009;Kahramanoglou et al., 2011;Juhas et al., 2014b). Genes encoding flagellar functions meet all these prerequisites (Juhas et al., 2014b). Previous analyses of the E. coli K12 MG1655 flagellar regions 3a and 1 led to the identification of only three potential integration target sites (Juhas et al., 2014b;Juhas and Ajioka, 2015). The identification and validation of alternative integration sites is crucial for the development of a robust synthetic biology toolkit (Juhas and Ajioka, 2015). This is particularly critical for applications that require integration of multiple genetic circuits into the chromosome. Here, we investigate the E. coli K12 MG1655 flagellar regions 2 and 3b. Analysis of the seven and nine highly conserved genes of the flagellar regions 2 and 3b, respectively, revealed significant variability in their suitability for integration and expression of genetic circuits. Furthermore, we show that in addition to high expression, the E. coli K12 MG1655 flagellar region 3b supports the highest efficiency of chromosomal integration of all E. coli flagellar regions.

Results and discussion
Integration target loci in the E. coli flagellar regions 2 and 3b
Identification of reliable chromosomal integration target loci is crucial for engineering E. coli cells (Sabri et al., 2013;Juhas et al., 2014b). Chromosomal integration target sites should be well-characterized, conserved, non-essential and highly expressed (Fraser et al., 1999;Baba et al., 2006;Vora et al., 2009;Kahramanoglou et al., 2011;Juhas et al., 2014b;Juhas, 2015). Genes encoding flagellar functions are considered to be among the best targets for integration of genetic circuits into the E. coli chromosome (Juhas et al., 2014b). Previous studies investigating E. coli K12 MG1655 flagellar regions 3a (Juhas et al., 2014b) and 1 (Juhas and Ajioka, 2015) led to the identification of three putative chromosomal integration target sites. Identification and validation of alternative loci is particularly important for those biotechnology and synthetic biology applications that require integration of multiple genetic circuits into the E. coli chromosome.

High efficiency integration into E. coli flagellar region 3b
As E. coli chromosomal loci differ in their ability to support integration of genetic circuits (Juhas et al., 2014b), we investigated the integration efficiency for each of the 16 target loci. The genetic circuit was integrated into the investigated target sites [motA (motAi), motB (motBi), flhD (flhDi) and the remaining 13 loci] (Table 3). The investigated target loci differed significantly in their suitability to support integration of the genetic circuit. Of the analysed genes of the E. coli K12 MG1655 flagellar region 2, motA (motAi) supported the highest integration efficiency (Fig. 5). Of the analysed genes of the E. coli K12 MG1655 flagellar region 3b, fliK (fliKi) supported the highest integration efficiency (Fig. 5). Notably, integrations into one and four loci of the flagellar regions 2 (motAi) and 3b (fliEi, fliJi, fliKi, fliRi), respectively, occurred with higher efficiency than integrations into the previously examined flagellar regions 3a (Juhas et al., 2014b) and 1 (Juhas and Ajioka, 2015). Furthermore, integration efficiency into fliK (fliKi) was significantly higher than that into motA (motAi) (Fig. 5). Hence, the E. coli K12 MG1655 flagellar region 3b supports the highest efficiency of integration of all E. coli flagellar regions.

Integrations into flagellar regions 2 and 3b abolish motility
The flagellum is crucial for the motility of E. coli cells. Therefore, disruptions of the genes encoding flagellar functions usually have a negative impact on motility (Juhas et al., 2014b). Integrations into two genes of the previously analysed flagellar region 3a only reduced motility of the engineered strains when compared with the wild type (Juhas et al., 2014b). We investigated the effect of the chromosomal integrations into the flagellar regions 2 and 3b by spotting 2 μl of the normalized overnight cultures of the engineered E. coli strains and E. coli K12 MG1655 wild type in the middle of the motility agar plates (Fig. 6). The motility of all strains harbouring integrations in the investigated genes of the flagellar regions 2 (Fig. 6A) and 3b (Fig. 6B) was completely abolished.

Integrations into flagellar regions do not have negative impact on the growth
As integrations of the synthetic genetic circuits into the E. coli chromosome should not negatively impact cell growth, target loci cannot be located within essential genes (Juhas et al., 2011;2012a,b;2014a). To assess the effect of chromosomal integrations into the seven and nine investigated genes of the flagellar regions 2 (motA, motB, flhD, flhE, cheW, cheY and cheZ) and 3b (fliE, fliF, fliG, fliJ, fliK, fliL, fliM, fliP and fliR), respectively, on the growth rate, the absorbance of the engineered strains and K12 MG1655 wild type was measured with the microplate reader (FLUOstar Omega).
Integrations into all investigated genes of the flagellar regions 2 (Fig. S2) and 3b (Fig. S3) did not diminish growth rate when compared with the wild type at both 30°C and 37°C. This is consistent with previous results from flagellar regions 3a (Juhas et al., 2014b) and 1 (Juhas and Ajioka, 2015).
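A growth-rate comparison of this kind can be illustrated with a minimal Python sketch. It assumes that the specific growth rate is estimated as the least-squares slope of ln(absorbance) against time over the exponential phase; the function names and all readings are hypothetical and are not data from this study.

```python
import math

def growth_rate(times_h, od_values):
    """Least-squares slope of ln(OD) versus time, in units of 1/h."""
    logs = [math.log(od) for od in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx

def doubling_time(rate_per_h):
    """Doubling time in hours for a given specific growth rate."""
    return math.log(2) / rate_per_h

# Hypothetical exponential-phase readings (illustration only).
times = [0.0, 0.5, 1.0, 1.5, 2.0]
wild_type = [0.05 * math.exp(0.90 * t) for t in times]
engineered = [0.05 * math.exp(0.88 * t) for t in times]

for name, series in (("wild type", wild_type), ("engineered", engineered)):
    mu = growth_rate(times, series)
    print(f"{name}: rate = {mu:.2f} per h, doubling time = {doubling_time(mu):.2f} h")
```

Restricting the fit to the exponential phase matters, because lag and stationary phase readings would bias the slope downwards.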

Transcription of the flagellar regions 2 and 3b
The relative transcription of the investigated genes of the flagellar regions 2 and 3b was measured by real-time polymerase chain reaction (RT-PCR) using arcA and rpoD as the reference housekeeping genes (Jandu et al., 2009;Minty et al., 2011). RT-PCR showed that the relative expression of four genes from each of the analysed flagellar regions 2 (motA, motB, flhD, cheY) and 3b (fliJ, fliK, fliL, fliM) was higher (twofold to fivefold) than the average expression of the housekeeping genes (Fig. 7A). The relative transcription of fliG was not significantly different, whereas the transcription of the remaining genes was lower than the mean expression of the housekeeping genes.
From the flagellar region 2, the most highly expressed (8- to 11-fold higher than the housekeeping genes) was the genetic circuit integrated into motA (motAi), motB (motBi) and flhD (flhDi) (Fig. 7B). The expression at flhE (flhEi), cheW (cheWi) and cheY (cheYi) was fourfold to sixfold higher than the mean expression of the housekeeping genes (Fig. 7B). From the flagellar region 3b, the most highly expressed (8- to 13-fold higher than the housekeeping genes) was the genetic circuit integrated into fliJ (fliJi), fliL (fliLi) and fliR (fliRi) (Fig. 7B). The expression at the remaining loci of the flagellar region 3b was sixfold to eightfold higher than the mean expression of the housekeeping genes (Fig. 7B). Such strong expression of the genetic circuit integrated into this flagellar region is interesting, particularly when considering that the flagellar region 3b shows the lowest probability of being occupied by RNA polymerase (Fig. 1B). This suggests that other factors might also be important for the expression of the integrated synthetic DNA and shows that empirical characterization is necessary when engineering integration sites.
Expression of the integrated genetic circuit was determined by quantitative measurement of green fluorescent protein (GFP) and red fluorescent protein (mCherry) fluorescence over time with the microplate reader (FLUOstar Omega). For this, we used plasmids pSB1A1(GFP) and pSB1A1(mCh) harbouring GFP and mCherry, respectively, regulated by the pR promoter. Neither GFP nor mCherry was expressed under conditions permissive for the repressor (30°C), whereas the temperature shift to 42°C induced GFP and mCherry expression (Figs S4-S7).
The E. coli K12 MG1655 flagellar region 3b supports the highest efficiency of integration of all E. coli flagellar regions. Notably, the genetic circuit integrated into flagellar region 3b was also highly expressed although the probability of RNA polymerase binding to this region is significantly lower than for other flagellar regions. This suggests that other factors might also play a role in the expression of the integrated synthetic DNA. There appears to be a weak inverse correlation between the probability of RNA polymerase binding to the target loci and their ability to support integration of the genetic circuit; however, this will require further investigation. Furthermore, as flagellar genes are closer to the terminal (TER) region of the E. coli chromosome than to oriC, their copy number during exponential growth is approximately sixfold lower than that of genes close to oriC. Therefore, genes nearer to oriC are also potentially interesting target loci for integration and expression of genetic circuits. Besides the modified lambda Red recombinase method used in our analysis, clustered regularly interspaced short palindromic repeats (CRISPR) and integrases could be exploited for E. coli chromosomal integration.
Overall, the E. coli K12 MG1655 flagellar region 3b is the most suitable of all E. coli flagellar regions for integration and expression of genetic circuits. However, there is significant variation between individual target loci. For instance, motA of the E. coli K12 MG1655 flagellar region 2 supports the second highest integration and expression efficiency of all investigated target sites in this study (Figs 5 and 7). Therefore, when considered individually, fliJ and motA appear to be the most suitable integration target loci of the analysed flagellar regions 2 and 3b.
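The dependence of gene copy number on chromosomal position mentioned above can be illustrated with a short Python sketch. It assumes the Cooper-Helmstetter model of chromosome replication, in which the ratio of oriC-proximal copies to copies of a gene located a fraction m of the way from oriC to the terminus is 2^(C·m/τ), where C is the replication period and τ the doubling time. The function name and the parameter values are illustrative assumptions and are not measurements from this study.

```python
def dosage_ratio(c_period_min, doubling_time_min, rel_distance):
    """Ratio of oriC-proximal gene copies to copies of a gene at rel_distance.

    rel_distance is the fractional position along the replichore:
    0.0 at oriC and 1.0 at the terminus (assumed Cooper-Helmstetter model).
    """
    return 2 ** (c_period_min * rel_distance / doubling_time_min)

# Illustrative values only: a 40 min C period and a range of doubling times.
for tau in (20, 30, 60):
    ratio = dosage_ratio(40, tau, 1.0)
    print(f"doubling time {tau} min: oriC/terminus copy ratio = {ratio:.2f}")
```

Under this model the dosage gap between oriC-proximal and terminus-proximal loci widens as cells grow faster, so the benefit of an oriC-proximal integration site depends on the growth conditions.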

Bacterial strains, plasmids and growth conditions
All strains and plasmids used in this study are recorded in Table 4. Escherichia coli was routinely grown in Luria-Bertani (LB) medium supplemented with ampicillin (100 μg ml⁻¹) or kanamycin (50 μg ml⁻¹) when required. Liquid E. coli cultures were cultivated on a rotary shaker at 200 r.p.m. at 30°C, 37°C or 42°C. Plate cultures were supplemented with 1% agar (w/v) and grown for about 24 h at 30°C, 37°C or 42°C.

DNA amplification and modification
DNA was amplified by PCR in 50 μl reaction volumes employing Phusion DNA polymerase (Thermo Scientific) or Dream Taq master mix kit (Thermo Scientific) according to the supplier's instructions. Oligonucleotide primers for PCR amplifications were synthesized by Integrated DNA Technologies (IDT) and Sigma-Aldrich. DNA fragments were purified by gel electrophoresis, followed by gel extraction employing Qiaquick Gel Extraction kit (Qiagen), according to the manufacturer's instructions. Plasmid DNA isolation was performed with the Qiaprep Spin Miniprep kit (Qiagen), according to the supplier's recommendations. Sequencing was performed by Source Bioscience (Cambridge, UK). The Gibson Isothermal Assembly method (Gibson et al., 2009;Merryman and Gibson, 2012) was employed to assemble DNA fragments. The original Gibson Isothermal Assembly protocol was modified as described previously (Juhas et al., 2014b).

Integration of the genetic circuit into the chromosome
Modified Hanahan (Hanahan et al., 1991) and Miller and Nickoloff (1995) protocols were used to prepare the chemically competent and electro-competent E. coli cells, respectively. Integrations of the genetic circuit into target open reading frames of the analysed E. coli flagellar regions were carried out using the method described previously (Juhas et al., 2014b). Briefly, plasmid pKM208 was transformed into the wild-type E. coli K12 MG1655 and selected on plates with ampicillin at 30°C. Escherichia coli K12 MG1655 harbouring pKM208 was inoculated into LB with ampicillin and grown at 30°C. After reaching OD600 of 0.2, 1 mM IPTG was added and the bacterial culture was cultivated to the final OD600 of 0.4-0.6. Bacteria were subsequently washed and resuspended in 10% glycerol and transformed with the genetic circuit harbouring the flanking sequences of the target genes. Bacteria with chromosomal integrations were selected on plates with kanamycin at 37°C and subsequently grown at 42°C to cure the temperature-sensitive plasmid pKM208. Chromosomal integrations were verified by PCR with flanking primers and sequencing.

RNA isolation and purification
Total RNA was isolated from 10⁹ E. coli cells at mid-exponential phase with Isolate II RNA Mini Kit (Bioline) according to the manufacturer's instructions. To elute RNA from the Isolate II RNA columns, 60 μl of RNase-free H2O was used. To avoid contamination with genomic DNA, the isolated RNA was purified with TURBO DNA-free Kit (Applied Biosystems) according to the supplier's instructions.

RT-PCR
Isolated and purified RNA (1 μg) was used to synthesize cDNA using SuperScript III Reverse Transcriptase (Invitrogen) according to the supplier's instructions. Primers for RT-PCR were designed with PRIMER3 Software to generate amplicons 100-150 bp long. Expression levels were measured using QuantiTect SYBR Green PCR Kit (Qiagen). MicroAmp Fast Optical 96-Well Reaction Plates (Applied Biosystems) with RT-PCR reactions were incubated in the 7500 Fast Real-Time PCR System (Applied Biosystems) according to the manufacturer's instructions. The relative expression was computed with REST9 Software (Qiagen) employing the Pfaffl method (Pfaffl et al., 2002). The RT-PCR was performed in triplicate, and the means and standard errors were calculated.
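The efficiency-corrected ratio underlying the Pfaffl method can be sketched in a few lines of Python. The sketch assumes the standard form, ratio = E_target^ΔCt_target / E_ref^ΔCt_ref with ΔCt = Ct(control) − Ct(sample), and combines the two reference genes by a geometric mean, which is an assumption here rather than a statement about the REST9 implementation. The function name, efficiencies and ΔCt values are hypothetical.

```python
import math

def pfaffl_ratio(e_target, dct_target, ref_terms):
    """Efficiency-corrected relative expression ratio.

    e_target   -- amplification efficiency of the target gene (2.0 = perfect doubling)
    dct_target -- Ct(control) - Ct(sample) for the target gene
    ref_terms  -- list of (efficiency, delta_ct) pairs, one per reference gene
    """
    target_term = e_target ** dct_target
    ref_values = [eff ** dct for eff, dct in ref_terms]
    # Geometric mean over reference genes (assumed way of combining them).
    ref_geomean = math.exp(sum(math.log(v) for v in ref_values) / len(ref_values))
    return target_term / ref_geomean

# Hypothetical values: a target gene normalized to two reference genes.
references = [(1.95, 0.2), (2.0, -0.1)]  # e.g. arcA and rpoD, made-up numbers
ratio = pfaffl_ratio(1.9, 3.0, references)
print(f"relative expression ratio = {ratio:.2f}")
```

When all efficiencies equal 2.0, the expression reduces to the familiar 2^(ΔΔCt) calculation, which is why measured efficiencies are only needed when amplification deviates from perfect doubling.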

Evaluation of motility
Motility agar plates for the motility assay were made by transferring 100 ml of motility agar [composed of 0.25% Bacto-Agar (Difco), 5 g NaCl and 10 g tryptone] into 13 cm plates and leaving it to set overnight. Plates were then pre-warmed to 37°C and inoculated with 2 μl of the overnight bacterial cultures normalized to OD600 of 1.0. Pictures were taken after incubation for 4-6 h at 37°C.

Sequence analyses
The annotated E. coli K-12 MG1655 genome from the E. coli K-12 project website (http://www.xbase.ac.uk/genome/escherichia-coli-str-k-12-substr-mg1655) was used to retrieve DNA sequences of the target loci. DNA sequencing was carried out by Source Bioscience (Cambridge, UK). The BLASTN (Altschul et al., 1990) and TBLASTX algorithms from the National Center for Biotechnology Information (NCBI) website (http://ncbi.nlm.nih.gov) and the position-specific iterated BLAST (PSI-BLAST) (Altschul et al., 1997) were used to compare DNA sequences.

Supporting information
Additional Supporting Information may be found in the online version of this article at the publisher's web-site: