Correspondence: Víctor de Lorenzo, Centro Nacional de Biotecnología-CSIC, Systems Biology Program, Campus de Cantoblanco, Madrid 28049, Spain. Tel.: +34 91 585 45 36; fax: +34 91 585 45 06; e-mail: email@example.com
A large number of prokaryotic regulatory elements have been interfaced artificially with biological circuits that execute specific expression programs. Engineering such circuits involves the association of input/output components that perform discrete signal-transfer steps in an autonomous fashion while connected to the rest of the network with a defined topology. Each of these nodes includes a signal-recognition component for the detection of the relevant physicochemical or biological stimulus, a molecular device able to translate the signal-sensing event into a defined output and a genetic module capable of understanding such an output as an input for the next component of the circuit. The final outcome of the process can be recorded by means of a reporter product. This review addresses three such aspects of forward engineering of signal-responding genetic parts. We first recap natural and non-natural regulatory assets for designing gene expression in response to predetermined signals – chemical or otherwise. These include transcriptional regulators developed by in vitro evolution (or designed from scratch), and synthetic riboswitches derived from in vitro selection of aptamers. Then we examine recent progress on reporter genes, whose expression allows the quantification and parametrization of signal-responding circuits in their entirety. Finally, we critically examine recent work on other reporters that confer bacteria with gross organoleptic properties (e.g. distinct odour) and the interfacing of signal-sensing devices with determinants of community behaviour.
The essence of any regulatory network is the conversion of one or more defined endogenous or environmental inputs into equally defined outputs (Silva-Rocha & de Lorenzo, 2008, 2010). It is often the case that a given regulatory device can process more than two inputs and/or give rise to more than two outputs. For instance, a transcription factor (TF) that responds to a ligand and can also be phosphorylated in two alternative residues could then activate various promoters, often in combination with other TFs. However, even in such cases, complex regulatory devices can be deconstructed into clusters of simple binary actions (Silva-Rocha & de Lorenzo, 2008). In this respect, regulatory networks share many of the qualities embodied in the electronic components of processors, as they execute deterministic computations of signals into responses. There is a key difference, though. While in electronic circuits the signal-carrier ingredient always has the same nature (electrons), in biological circuits, the input is generally different from the output. This means that in every biological signal-propagation cascade, any given upstream input to one of the nodes of the network has to result in an output that can be understood by the next node of the progression chain. This obviously restricts, but does not impede, the possibility of engineering biological circuits with the same ease as electronic counterparts. As discussed below, the challenge in this case is precisely to design the right interface for connecting the regulatory modules at stake and making sure that a specified incoming signal is eventually translated into a precise outbound result.
Biological systems possess two types of regulatory networks, which often share similar or identical topologies, but possess very different mechanisms and dynamic properties (Kiel et al., 2010). The most kinetically efficient are undoubtedly the signal transduction pathways that are so prevalent in eukaryotic cells. These pathways generally involve a chain of post-translational changes (e.g. phosphorylation, effector-dependent allosterism) in which the signal-carrier agent(s) move(s) quickly from one component of the network to the other. Most signal transduction pathways of this sort involve a succession of biochemical events carried out by the interplay between enzymes and signal-carrier agents. These can either be nondiffusible (e.g. high-energy phosphate moving through a cascade of kinases) or diffusible (calcium, cAMP, etc.). The processes can take from milliseconds to a few minutes. Typically, the number of components of standard signal transduction processes increases from the earlier signal-sensing step, resulting in a growing directional amplification that can elicit a considerable response. Although phosphorelays do exist in prokaryotes and some biological functions are elicited by signal transduction mechanisms, bacteria preferentially process endogenous and exogenous signals through transcriptional regulatory networks (Kiel et al., 2010). In this case, the signal acts on more or less specific TFs to induce the activation or the repression of a distinct subset of genes. Depending on the TFs, the responses can involve many genes, a few or just one, and therefore, amplification cannot be taken for granted. On the other hand, the response time is necessarily longer than in signal transduction, because the events that go from the input to the output (transcription, translation) generally take a few minutes. These two types of regulatory circuits (signal transduction and transcriptional regulation) often appear together.
For instance, mitogen-activated protein kinases that trigger the responses of eukaryotic cells to a large collection of extracellular stimuli end up, in some cases, activating nuclear TFs for the expression of specific genes (de Nadal & Posas, 2010). By the same token, the ordered expression of the components for the flagellar machinery in Escherichia coli is subject to an intricate transcriptional regulation control (Kalir & Alon, 2004). However, once the protein products are in place, the functioning of the motility apparatus is regulated by a signal transduction pathway in which methylation and phosphorylation become the key signal-carrier events.
Adjoining transcriptional control and signal transduction, a distinct mechanism for implementing simple and fast regulatory actions is allosteric control of protein activity. The classical notion, pioneered by Monod and Koshland in the 1960s, is that small molecules, often substrates or products of the enzymes at stake, can control (positively or negatively) the activity of the cognate proteins by binding to sites different from the active centre (reviewed by Perutz, 1989). Regardless of the specific models, the idea is that binding of a small molecule to a distinct site in a protein causes a conformational change that is propagated through the enzyme structure to ultimately affect the performance of the active centre. Allosteric regulation thus provides a natural mechanism for implementing control loops, such as feedback inhibition from downstream products or feed-forward activation from upstream substrates. In addition, crevices in proteins can easily mutate to accommodate metabolites and other small molecules not necessarily connected to the primary activity yielded by the proteins. This allows a degree of connectivity and regulatory cross-talk between metabolic pathways independent of any transcriptional control or signal transduction pathway.
Since the early times of Molecular Biology, geneticists have been intrigued by the possibility of rewiring the molecular components of regulatory networks by fusing the upstream signal-sensor device with a different downstream output. In fact, much of modern microbial biotechnology is based on the fact that we can express genes of interest under the control of promoters that are not native to their natural context. Similarly, a large variety of whole-cell biosensors have been created, both for basic research and for biotechnological applications, in which an external physicochemical signal (nutrient, stressor, oxygen tension, temperature, solvents, etc.) or an endogenous molecule drives the production or the activity of a measurable reporter product (van der Meer et al., 2004). It should be noted that refactoring gene expression (transcriptional) networks in bacteria is far simpler than reprogramming signal transduction pathways in eukaryotic cells (Kiel et al., 2010). It therefore does not come as a surprise that the majority of efforts made thus far along this line come from the prokaryotic realm (and only occasionally from higher cells). This review focuses exclusively on the design of input/output nodes in prokaryotes for interfacing given inputs (in most cases, chemical species) into desired outputs, the latter including (but by no means limited to) reporter genes. Interested readers are directed to Kiel et al. (2010) for an update on the same issues in the eukaryotic domain.
Input/output itinerary in extant gene expression circuits
One aspect of engineering biological signal-sensing circuits deals with the nature of the inputs to the genetic networks mentioned above. Typically, such inputs include small molecules, nutrient-regulatory proteins, enzymes, DNA, RNA, temperature, redox status, desiccation, etc., while outputs can also include metabolic intermediates and multienzymatic steps. Obviously, a collection of regulatory knots that each use device-specific molecules as signal carriers cannot be used in combination to assemble many engineered systems. One much-debated approach to unify the nature of the signal carrier in biological circuits is to replot the connections between parts that make up each of the devices so that the inputs and outputs are borne only by RNA polymerase (RNAP; Endy, 2005; Canton et al., 2008; Kelly et al., 2009). In this context, the counts of polymerase per second (PoPS, the flow of RNAP molecules along DNA; Endy et al., 2005) could be taken as a biological counterpart of electric current in gene expression circuits. The PoPS intensity is set by the amount of RNAP molecules that pass a specific nonreturn position of promoter DNA each second. Although RNAP molecules can diffuse intracellularly, PoPS refers exclusively to the amount of transcription initiation and nonreturn elongation events that occur at a given, distinct promoter. In this way, both the input and the output signals that go through a regulatory node or module can be accurately described. The downside of the concept is that transcription initiation and elongation are highly stochastic processes at the level of single cells (Golding et al., 2005) and therefore one has to deal with the average population behaviour for generating PoPS values. 
By the same token, one can define RiPS (ribosome per second; http://syntheticbiology.org/Abstraction_hierarchy.html) as a way to describe the expression of the final protein product after the corresponding transcript resulting from PoPS going through a gene sequence has been formed. Both PoPS and RiPS are interesting and promising concepts that could become accurate tools for reliably describing input/output gene expression functions (Fig. 1a). Yet standard, reliable methods for the measurement of such parameters in vivo still need to be developed (Canton et al., 2008; Kelly et al., 2009; Fernandez-Lopez et al., 2010).
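The need to average over a population when reporting PoPS can be illustrated with a minimal simulation. The sketch below is not a measurement method: it assumes that RNAP passage events at a promoter follow a simple Poisson process with a fixed rate (a simplification of the stochastic behaviour mentioned above), and all function names and parameter values are hypothetical.

```python
import random


def simulate_cell_events(rate_per_s, duration_s, rng):
    """Count RNAP passage events at one promoter in a single cell.

    Assumption: events form a Poisson process, so waiting times
    between events are exponentially distributed.
    """
    elapsed, events = 0.0, 0
    while True:
        elapsed += rng.expovariate(rate_per_s)
        if elapsed > duration_s:
            return events
        events += 1


def population_pops(rate_per_s, duration_s, n_cells, seed=0):
    """Return the population-averaged PoPS and the per-cell event counts."""
    rng = random.Random(seed)
    counts = [simulate_cell_events(rate_per_s, duration_s, rng)
              for _ in range(n_cells)]
    mean_pops = sum(counts) / (n_cells * duration_s)
    return mean_pops, counts


# Hypothetical promoter firing on average once every 2 s, observed for 60 s.
mean_pops, counts = population_pops(rate_per_s=0.5, duration_s=60.0, n_cells=2000)
```

Individual entries of `counts` differ from cell to cell, whereas `mean_pops` converges on the underlying rate as the number of cells grows, which is why a PoPS value is meaningful only as a population average.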
The four basic scenarios of signal transfer through any node of a regulatory network are illustrated in Fig. 2. The simplest situation is the one in which a given input is translated into a single output. Although virtually all promoters depend in vivo on more than one signal, it is often possible to establish conditions that set the output of a regulatory knot in a manner depending exclusively on a single input (Fig. 2a). Examples of this include phage transcriptional repressors (e.g. the Cro and CI proteins of lambda phage; Oppenheim et al., 2005) and activators (the CII protein). Note that in these cases, there is an absolute specificity and fidelity between one input (the repressor or the activator protein) and one output (transcription). However, this is by no means the most common regulatory case. The majority of control events involve two or more inputs that become converted into a single output (Fig. 2b). This is the computation par excellence that regulatory knots perform, as very diverse environmental signals often have to be distilled into just one transcriptional decision. The (simplified) lac system is a straightforward example of this: lacZ is expressed or not depending on two primary inputs (lactose and the LacI protein; Wilson et al., 2007). In reality, the lac system is connected in vivo not only to a third input (the CRP/cAMP system) but also subjected to a global control brought about by sigma factor competition (Nyström, 2004). In an extreme case of the coexistence of multiple inputs, activation of the Pu promoter of Pseudomonas putida mt-2 transcribes (or not) the xyl genes for biodegradation of m-xylene depending not only on the presence of this substrate in the medium but also on the occurrence of other C-sources and/or various stressors (Velazquez et al., 2006). 
In this case, the computation of numerous inputs (m-xylene, alternative nutrients, UV light, desiccation, oxidative environment, growth phase and others) results in a single output (expression of the xyl genes). The opposite case (one input, multiple outputs, Fig. 2c) is also very frequent and it is typically caused by transcriptional factors that bind several promoters (what has been traditionally called a regulon; Mendoza-Vargas et al., 2009). When a single stimulus triggers the activity of various TFs, the whole set of genes that become affected is called a stimulon (Hatfield & Benham, 2002; Cases et al., 2003). Because the number of genes that belong to given regulons/stimulons (for instance, C starvation, heat shock, oxygen tension, etc.) is typically high and the bacterial genomes encode a limited number of ORFs, it is somewhat unavoidable that most genes are directly or indirectly subject to more than one regulatory control.
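The simplified lac example (several inputs, one output) can be written as a Boolean function. The sketch below is illustrative only: treating lactose, LacI and CRP/cAMP as on/off variables is an assumption made for clarity, and the function names are hypothetical.

```python
from itertools import product


def lacz_expressed(lactose: bool, laci: bool) -> bool:
    """Two-input view: the repressor blocks expression unless lactose is present."""
    return (not laci) or lactose


def lacz_expressed_with_crp(lactose: bool, laci: bool, crp_camp: bool) -> bool:
    """Three-input view: the CRP/cAMP system is additionally required."""
    return crp_camp and lacz_expressed(lactose, laci)


# Truth table for the three-input case: many inputs, a single output.
truth_table = {
    inputs: lacz_expressed_with_crp(*inputs)
    for inputs in product([False, True], repeat=3)
}
```

Each key of `truth_table` is a `(lactose, laci, crp_camp)` combination. Adding further inputs, as in the Pu promoter case, enlarges the table but still yields a single transcriptional decision per row.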
Interdependence and orthogonality
The considerable intertwining of different genetic circuits in the same cell means that the working status of one domain of the transcriptome necessarily echoes into the others and therefore that regulatory circuits hardly work in isolation. This is an important feature of genetic networks that is often overlooked: control of the expression of individual genes is generally subject to various layers of regulation. One can experimentally decrease the number of variables to just one input and one output, but in the natural world, the architecture of given networks can be understood only on the basis of their connections to other networks. This is doubtless a challenge for the engineering of artificial gene expression devices (e.g. for biotechnology or synthetic biology purposes; de Lorenzo & Danchin, 2008), which ideally should be orthogonal (two components of a system are orthogonal to each other if they do not influence each other). Orthogonality ensures that the effects produced by module A in a dynamic structure neither create nor propagate side effects to other domains B, C, etc. of the same system with respect to the host. In the extant biological world, the only functions with a significant degree of orthogonality are borne by bacteriophage genomes, which are naturally selected to develop their own genetic program in a manner independent of the regulatory networks of the host. One archetype of such functions is the single-unit RNAP of the T7 phage (and other viral polymerases), which does not depend on any host factor – other than small molecules and precursors – to carry out its biological role (Chan et al., 2005). However, even bacteriophages depend on the translation machinery of the cells, making complete orthogonality virtually impossible. 
The creation of so-called orthogonal ribosomes, able to recognize alternative genetic codes, is an interesting step in the direction of implanting alien gene expression systems into an otherwise natural host (Rackham & Chin, 2005), but the corresponding genetic toolbox is still difficult to implement (Neumann et al., 2010a, b).
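The definition of orthogonality given above (two components are orthogonal if they do not influence each other) can be expressed as a check on an influence map. The sketch below is a toy illustration: the module names and the influence entries are hypothetical placeholders chosen to mirror the phage example, not measured interactions.

```python
def is_orthogonal(influence, a, b):
    """True if neither module influences the other."""
    a_on_b = influence.get(a, {}).get(b, 0)
    b_on_a = influence.get(b, {}).get(a, 0)
    return a_on_b == 0 and b_on_a == 0


def non_orthogonal_pairs(influence):
    """List every pair of modules that is not mutually independent."""
    names = sorted(influence)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if not is_orthogonal(influence, a, b)]


# influence[x][y] = 1 means "module x affects module y" (illustrative values).
influence = {
    "host_transcription": {"host_translation": 1},
    "host_translation": {"t7_module": 1},  # phage functions still need host ribosomes
    "t7_module": {},
}

coupled = non_orthogonal_pairs(influence)
```

In this toy map the phage module is orthogonal to host transcription but not to host translation, which is the limitation noted in the text.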
Finally, conversion of input into output in genetic circuits still has one more checkpoint. Although transcriptional regulation is the most noticeable level of gene expression control, it is increasingly becoming clear that post-transcriptional controls (including the mechanisms that cope with termination; Uptain et al., 1997) are also key determinants of cell homeostasis. While the earlier emphasis in this field was on mRNA stability and the control of the ribosomal machinery by translation factors, the last few years have witnessed an explosion of studies on how RNAs check translation by diverse means (riboswitches, small noncoding RNAs, ribozymes, translational attenuation, etc.; Nahvi et al., 2002; Mendoza-Vargas et al., 2009). While not applicable to all genes, post-transcriptional regulation controls key processes in bacterial physiology. The detection of small RNAs in bacteria is ongoing at the time of writing this review, although the function of such a massive expression of antisense transcripts (apart from a few well-known examples) is still uncertain (Guell et al., 2009). The dramatic on/off transitions between alternative structures in riboswitches in the absence of any cellular factor make them attractive building blocks for the construction of biological computing circuits as well as orthogonal gene regulation devices for synthetic biology (Beisel & Smolke, 2009; Win et al., 2009). Whether based on RNA regulatory checks or not, proteins are eventually produced and their activity can be controlled later by post-translational modifications (typically, phosphorylation and methylation) and/or by their degradation rate.
The corollary of all of the above is that the process that goes from sensing a nutritional, chemical or physical input signal all the way to produce an output response relies on multiple transfer functions between nodes of more or less complex regulatory networks with a fixed logic. Such functions comprise different kinetic properties that depend on a large diversity of mechanisms compatible with the type of biological task to be performed. Perhaps transcriptional initiation control produces a coarse regulatory effect, while riboswitches and sRNAs modulate the expression of metabolic functions that require a fine-tuning of their activity. Moreover, small regulatory RNAs allow a very fast response to a stimulus and a quick reversion to the initial state once the stimulus is gone, as compared with the much slower protein-based transcriptional regulation.
Mining regulatory devices from transcriptional networks
The search for genetic elements responsive to given signals has traditionally relied on the use of a reporter fusion of reference (e.g. lacZ or other) attached to the gene under examination, followed by the generation of random transposon insertions in the target genome and identification of those that alter the expression of such a reporter. This approach has been later extended to the mining of regulatory elements in metagenomes (Uchiyama et al., 2005), thereby allowing access to regulatory functions of nonculturable bacteria. However, in more recent times, the complete transcriptomes of individual bacteria have become the primary source of signal-responding genes. Unlike the reporter/insertion approach, which basically monitors the expression of one gene at a time, transcriptomes offer the complete gene expression landscape under given conditions in vivo. This information often yields complex transcriptional networks in which the signal specificities of individual regulators might be difficult to pinpoint.
Although it is possible to identify regulatory nodes in an expression network as sites where single inputs are translated into multiple outputs, or multiple inputs converted into single outputs, the most common situation is the one in which multiple, simultaneous inputs produce equally manifold and simultaneous outputs (Fig. 2d). In these cases, it is useful to break down the regulatory nodes into discrete logic gates (Silva-Rocha & de Lorenzo, 2008). Such gates represent Boolean operations in which one or two inputs produce a single logic output each time. Because the output is also a logic-level value, an output of one logic gate can connect to the input of one or more other logic gates. Although binary logic circuits are based on functions with just two possible states (0 or 1), existing biological systems typically display distinct values for the input/output functions. The representation of regulatory circuits as sets of Boolean gates exposes the structure of any network even in the absence of kinetic or biochemical details. An approach that is capable of handling many of the above problems is based on a class of piecewise-linear (PL) differential equation models originally proposed by Glass & Kauffman (1973). The state variables in the PL models correspond to the functional levels of proteins encoded by the genes in the network, while the differential equations represent the interactions arising from the regulatory influence of some proteins on the synthesis and degradation of others. The regulatory interactions are modelled by means of step functions, giving rise to the PL structure of the differential equations. PL models have been used successfully for the study of several prokaryotic and eukaryotic regulatory networks and hold considerable promise for examining cases where not all the elements of a given system are known (de Jong et al., 2003). Boolean formalisms allow a description of situations involving multiple inputs/multiple outputs (Fig. 2d) as a defined collection of binary actions. While this may not represent reality accurately (often promoters are inherently subject to more than two inputs), it is instrumental to describe complex regulatory circuitry and to carry out simplified depictions of otherwise difficult control knots.
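To make the PL formalism concrete, the sketch below simulates a hypothetical two-gene network in which each protein represses the synthesis of the other. It assumes the common PL form dx/dt = κ·s⁻(x_other, θ) − γ·x, where s⁻ is a step function equal to 1 below the threshold θ and 0 above it. The network, the parameter values and the simple Euler integration are illustrative choices, not taken from the works cited.

```python
def step_minus(x, theta):
    """Step function s-(x, theta): 1 below the threshold, 0 at or above it."""
    return 1.0 if x < theta else 0.0


def simulate_mutual_repression(xa, xb, kappa=1.0, gamma=0.5, theta=1.0,
                               dt=0.01, steps=5000):
    """Euler integration of a two-gene piecewise-linear model.

    xa, xb : initial protein levels (state variables)
    kappa  : synthesis rate when the gene is not repressed
    gamma  : first-order degradation rate
    theta  : repression threshold
    """
    for _ in range(steps):
        dxa = kappa * step_minus(xb, theta) - gamma * xa
        dxb = kappa * step_minus(xa, theta) - gamma * xb
        xa += dt * dxa
        xb += dt * dxb
    return xa, xb


# Two different initial conditions settle into two different steady states.
a_high = simulate_mutual_repression(xa=1.5, xb=0.2)
b_high = simulate_mutual_repression(xa=0.2, xb=1.5)
```

Within each region delimited by the thresholds the equations are linear, so each protein relaxes towards either κ/γ or 0. The switch-like outcome (one protein high, the other low) corresponds to the Boolean reading of the same circuit.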
Either taken one by one or as a whole, natural prokaryotic transcriptional networks provide a wealth of regulatory modules that can be exploited for designing signal-sensing circuits. Yet, the number of usable regulatory set-ups available so far is limited (Mijakovic et al., 2005; Voigt, 2006) and perhaps most of them remain to be discovered. One way of tackling this important caveat is to survey the pool of existing, natural regulatory modules for desired input/output performances, rather than forward-designing the same behaviours. One major source of such assets is the whole of the genetic circuits that control the expression of catabolic pathways for the biodegradation of environmental pollutants in soil bacteria (Diaz & Prieto, 2000; Tropel & Van Der Meer, 2004). These are often organized in subnetworks with many regulators that respond to distinct chemical species and cognate promoters provided with the most diverse connectivities. These systems are excellent reservoirs for mining and redesigning novel regulatory modules. One procedure to extract them involves the use of wet activity mining methods for highly diverse environmental metagenomes (Galvao et al., 2005). Some of them allow the assembly of genetic traps in which a pool of genomes is experimentally interrogated – and eventually the right clones selected – for promoters and regulators with given regulatory specificities (Uchiyama et al., 2005). These can surely be further developed in the quest for more biological equivalents to the Boolean gates discussed above.
As mentioned above, the basic component of any regulatory circuit is a node/device able to convert an input into an output. Various devices can then be assembled to form a module and various modules to generate a complete system (Fig. 3; Endy, 2005; Andrianantoandro et al., 2006; Purnick & Weiss, 2009). In the sections below, we discuss several of the natural or the engineered prokaryotic regulatory elements that have been used thus far for designing signal-sensing/output-producing set-ups. In this article, we limit the notion of biological sensor to any minimal component of the prokaryotic gene expression machinery that, following interaction with a given chemical species or exposure to a distinct physicochemical condition (temperature, UV light, oxygen tension, etc.), generates a discrete output signal (Chambers et al., 2008). In most cases, the output is a transcription initiation stroke that can be coupled to the expression of a downstream gene (PoPS, Fig. 1a), but it may also involve a riboswitch that allows the translation of an otherwise silent mRNA (RiPS, Fig. 1b). Ideally, such an upstream sensor acts as a master switch, which triggers the production of the output only in the presence of one specific stimulus.
The first engineered sensors were designed to simply place the expression of a reporter gene under the control of transcriptional regulators known to be responsive to a desired signal (van der Meer et al., 2004). In more recent times, this approach has been enriched not only with the rational or the combinatorial amalgamation of various sensor modules for signal amplification, pulse production and oscillatory outputs (Elowitz & Leibler, 2000) but also with the introduction of RNA sequences that make control of expression possible without using transcriptional factors (Nahvi et al., 2002). Furthermore, both regulatory proteins and RNA sensors can be evolved, engineered or altogether synthesized to respond to non-natural signals and molecules. This expands the range of input signals that can be engaged much beyond the natural ones and thus yields artificial sensor systems that respond to uncommon stimuli. These two types of master sensors (derived from prokaryotic TFs or based on RNA) and their mutated or artificial derivatives are examined below separately.
Sensing signals with transcriptional regulators of environmental bacteria
As soon as studies on gene regulation extended to non-E. coli bacteria, it became evident that microorganisms that inhabited sites with a history of pollution by chemical waste possessed a wealth of transcriptional factors responsive to the most diverse chemical structures and environmental conditions (Diaz & Prieto, 2000; Tropel & Van Der Meer, 2004). Not surprisingly, the first signal-sensing devices engineered in live cells involved the salicylate-responsive TF NahR of soil pseudomonads, which were designed to detect the presence of naphthalene or its metabolic intermediates in the medium (Burlage et al., 1990; King et al., 1990; Werlen et al., 2004; Mitchell & Gu, 2005). This earlier work on NahR set the scenario for the construction over the years of a large number of whole-cell biosensors with a basically identical layout: (a) one TF from an environmental bacterium, which responded to a small chemical molecule, to a heavy metal/metalloid or to a physical condition (for example, temperature), (b) one target promoter and (c) one reporter product expressed through such a promoter. This approach has been recurrently reviewed (Daunert et al., 2000; Köhler et al., 2000; D'Souza, 2001; Yagi, 2006; Girotti et al., 2008) and will not be further examined here. Instead, we inspect below the cases where an extant TF has been artificially evolved or somehow reshaped to respond to new signals. Many of these developments stem from the growing application of targeted mutagenesis techniques (Yuan et al., 2005) that allow a fast in vitro evolution of sensor proteins for the sake of expanding the range of inputs that can be detected. The most successful cases have been those in which these techniques have resulted in effector-specificity mutants of several master regulators of biodegradative operons for xenobiotic and recalcitrant compounds, some of which are addressed now in more detail.
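The three-part layout of such whole-cell biosensors (regulator, target promoter, reporter) can be captured in a small data structure with an input/output function. The sketch below is a toy model: the Hill-type dose-response curve and every parameter value are assumptions introduced for illustration and do not describe any of the sensors cited. The promoter and reporter labels are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WholeCellBiosensor:
    """Minimal description of a TF / promoter / reporter sensor."""
    regulator: str
    promoter: str
    reporter: str
    basal: float = 1.0          # reporter signal without effector (arbitrary units)
    induced_max: float = 100.0  # reporter signal at saturating effector
    k_half: float = 10.0        # effector concentration giving half-maximal induction
    hill_n: float = 2.0         # assumed cooperativity

    def output(self, effector_conc: float) -> float:
        """Reporter signal for a given effector concentration (Hill-type curve)."""
        if effector_conc < 0:
            raise ValueError("concentration must be non-negative")
        occupied = effector_conc ** self.hill_n
        fraction = occupied / (self.k_half ** self.hill_n + occupied)
        return self.basal + (self.induced_max - self.basal) * fraction


# Placeholder labels; only the regulator name comes from the text above.
sensor = WholeCellBiosensor(regulator="NahR",
                            promoter="NahR-dependent promoter",
                            reporter="reporter gene")
dose_response = [sensor.output(c) for c in (0.0, 1.0, 10.0, 100.0)]
```

In such a description, an effector-specificity variant of the regulator would amount to a different `k_half` (or a different effector altogether) with the promoter and reporter left unchanged.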
XylR is the main regulator of the so-called TOL pathway of P. putida mt-2 (Ramos & Marques, 1997). This transcriptional regulator is an enhancer-binding protein (EBP) of the NtrC family (Morett & Segovia, 1993), which, in the presence of toluene, m-xylene or p-xylene, activates its cognate σ54-dependent promoter Pu, thereby triggering the expression of the upper operon of the pathway (Harayama et al., 1989). The first efforts to modify the effector specificity of XylR were carried out in the 1990s (Delgado & Ramos, 1994; Delgado et al., 1995). By that time, XylR mutants were generated with chemical nitrosoguanidine mutagenesis that were responsive to m-nitrotoluene, a compound that is not an effector for the wild-type regulator. Subsequent attempts to create XylR mutants with new specificities were based on its modular structure. XylR has four domains, the N terminal domain (A domain) being the one responsible for effector binding (Ramos & Marques, 1997; Devos et al., 2002). This fact makes the generation of new effector-specificity variants of XylR possible by introducing changes only in this domain (a characteristic shared by other members of the NtrC family; Wise & Kuske, 2000; Beggah et al., 2008). Several authors have followed this strategy by creating a battery of XylR specimens with novel specificities by means of different methods that allow the generation of diversity. In one case (Garmendia et al., 2001), a combinatorial library of XylR mutants was produced by shuffling the N-terminal A domain of this regulator with the same moiety of homologous regulators DmpR (Shingler & Moore, 1994) and TbuT (Byrne & Olsen, 1996). This library was then subject to both positive and negative selection in the presence of 2-nitrotoluene, 3-nitrotoluene, 4-nitrotoluene and biphenyl in a test strain that coupled growth to the response (or the lack of it) of the mutant proteins to the non-native effectors. 
The shuffling yielded some XylR variants that were induced by these compounds (Garmendia et al., 2001). In a separate work (Galvao et al., 2007), XylR variants able to respond to the synthetic chemical 2,4-dinitrotoluene (2,4-DNT) were generated. To this end, a mutant library of xylR was created bearing sequences of a pool of A domains spawned with an error-prone DNA polymerase. The breeding of new variants was carried out in this case using a selection/counterselection system analogous to the yeast URA3 marker (Galvao & de Lorenzo, 2005) in the presence of 2,4-DNT.
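The logic of combining selection and counterselection on a mutant library can be sketched as a filter. This is one plausible reading of such schemes, given here as an assumption: a variant is kept if it drives the growth-coupled output in the presence of the target effector and stays silent without any effector. Variant names, activity values and the threshold are invented for illustration.

```python
def passes_dual_selection(response, target, threshold=0.5):
    """Decide whether a regulator variant survives both selection rounds.

    response  : dict mapping a condition to relative promoter activity (0 to 1);
                the key "none" stands for growth without any effector
    target    : effector used in the positive selection round
    threshold : activity needed to drive the growth-coupled marker
    """
    active_with_target = response.get(target, 0.0) >= threshold  # positive selection
    silent_without = response.get("none", 0.0) < threshold       # counterselection
    return active_with_target and silent_without


# Hypothetical library: activity values are made up for the example.
library = {
    "wild_type": {"none": 0.05, "m-xylene": 0.90, "2,4-DNT": 0.05},
    "variant_A": {"none": 0.05, "m-xylene": 0.80, "2,4-DNT": 0.70},
    "variant_B": {"none": 0.80, "m-xylene": 0.90, "2,4-DNT": 0.90},
}

survivors = [name for name, response in library.items()
             if passes_dual_selection(response, target="2,4-DNT")]
```

Here only `variant_A` is retained: the wild-type entry fails the positive round and the constitutively active `variant_B` fails the counterselection.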
Regardless of the procedure used for their generation, some of the resulting effector-specificity variants of XylR have been integrated into genetic circuits for various applications. For instance, Mohn et al. (2006) established a system to expose the activity of dehydrochlorinases acting on γ-hexachlorocyclohexane (γ-HCH, lindane). For this, they constructed an E. coli strain carrying a Pu-lacZ reporter gene (Pu is the target promoter of XylR) along with a regulator variant (xylR5), which is able to respond to 1,2,4-trichlorobenzene (TCB). This engineered host activated the expression of the reporter gene in the presence of the TCB generated by dehydrochlorination and the subsequent condensation of the aliphatic ring of γ-HCH into an aromatic compound. This approach (and other similar approaches; van Sint Fiet et al., 2006) has been instrumental to set up genetic traps for exploring inconspicuous reactions encoded in metagenomic libraries of environmental DNA. Other XylR mutants have also found interesting applications in the construction of whole-cell biosensors to trace residues of explosives in soil (Garmendia et al., 2008). This is based on the ability of some XylR mutants to respond to 2,4-DNT and 2,4,6-trinitrotoluene (TNT). In this case, P. putida strains able to produce fluorescence or light emission in the presence of trace amounts of these nitroaromatic compounds were engineered by fusing the Pu promoter to reporters with an optical output [green fluorescent protein (GFP), lux]. These were then combined with the mutant regulator responsive to nitrotoluenes. Because TNT is used as an explosive and 2,4-DNT is always present as an impurity in a large number of antipersonnel landmines (Sylvia et al., 2000), this work paves the way for developing bacterial biosensors to sense explosives in situ. Further progress in this direction has been achieved by introducing one of the 2,4-DNT-responsive XylR mutants into a formatted P. putida strain suitable for environmental release (de Las Heras et al., 2008).
XylS is a second transcriptional factor of the TOL pathway of P. putida mt-2. This protein activates the expression of the meta-pathway genes of the TOL system upon binding benzoate or m-toluate, which are metabolic intermediates of toluene and m-xylene degradation (Ramos & Marques, 1997). A large number of effector-specificity XylS variants have been developed over the years by introducing mutations into the N-terminal region. Several of these confer increased responsiveness to benzoate and/or sensitivity to non-natural derivatives (Ramos et al., 1986; Michan et al., 1992). More recently, XylS versions that multiply by 10-fold the induction of the cognate Pm promoter with respect to the wild type when cells encounter benzoates were obtained by directed evolution (Vee Aune et al., 2009). This TF possesses a large number of properties of interest, in particular its ability to activate Pm both by overexpression of a large amount of the protein in the absence of benzoates and by a smaller amount of the TF bound to its aromatic effectors. This property has been thoroughly exploited for the construction of a signal amplification cascade (Cebolla et al., 2001).
DmpR, the main transcriptional regulator of the (methyl)phenol degradation pathway of Pseudomonas strain CF600 (Shingler et al., 1992), is another transcriptional regulator of the NtrC family that has been developed in the laboratory in order to change its effector specificity (Pavel et al., 1994). The set-up for screening such mutants was based on placing the kanamycin resistance gene under the control of the promoter activated by DmpR, so that only the mutants responsive to the desired effector could survive in a medium supplemented with kanamycin (Km) and the target inducer. In one instance, a search was carried out for mutants that responded to 4-ethylphenol, a noneffector of wild-type DmpR. A plasmid carrying DmpR was introduced into the selection Po-km strain (Po is the promoter activated by DmpR). Cells were then exposed to the mutagen ethylmethanesulphonate and plated on a medium with kanamycin and 4-ethylphenol. This produced one mutant bearing a point mutation in the A domain of the protein, which displayed not only a novel response to 4-ethylphenol but also an increased response to 4-methylphenol and 3,4-dimethylphenol. This DmpR mutant was later used to construct a Pseudomonas fluorescens biosensor strain for detecting bioavailable phenols in groundwater and leachate samples (Leedjarv et al., 2006). Besides chemical mutagenesis, error-prone DNA polymerase has also been used to introduce random mutations into the A domain of this regulator (Wise & Kuske, 2000). In this case, the resulting mutant library of DmpR variants was screened successfully for responsiveness to 2-nitrophenol, 4-chloro-3-methylphenol, 2,4-dimethylphenol, 2-chlorophenol and 4-nitrophenol using a Po-lacZ reporter strain. Some mutants resulting from the same experiment also displayed an enhanced response to phenol.
The exploitation of DmpR and its derivatives as the main component of a whole-cell biosensor device has been extended to the detection of chlorinated phenols in kraft pulp bleaching effluents (Campos et al., 2004). However, DmpR variants with effector-specificity changes do not originate only in the laboratory. The introduction of Pseudomonas sp. CF600 (the original host of the dmp catabolic operon) into soil amended with 4-methylphenol and 3,4-dimethylphenol created enough pressure for in situ selection of outgrowers able to metabolize these compounds. Further analysis demonstrated that these strains had point mutations within the A domain of DmpR, which produced an enhanced response to 4-methylphenol (Sarand et al., 2001).
HbpR is yet another member of the NtrC family of prokaryotic EBPs from Pseudomonas azelaica (Jaspers et al., 2000). In this case, the regulator recognizes biaromatic structures such as 2-hydroxybiphenyl (2-HBP), thereby activating the genes responsible for 2-HBP biodegradation (Jaspers et al., 2001b). HbpR has been modified recently in a manner similar to XylR and DmpR in order to obtain variants that respond to 2-chlorobiphenyl (2-CBP; Beggah et al., 2008). To this end, HbpR mutants carrying mutations throughout the A domain were enriched in a reporter strain with a GFP fusion to a cognate promoter by means of fluorescence-activated cell sorting (FACS). This procedure differed from the previous ones in that the selection system did not rely on the growth of the bacterial hosts, thereby widening the range of mutations that led to the required phenotype. The sorting process yielded a collection of phenotypically diverse mutants, which included at least one that had acquired the ability to respond to 2-CBP.
TF mutants with novel effector specificities are not limited to members of the σ54-dependent family of EBPs. Members of the LysR family have also been subjected to this type of directed evolution. This is the case for NahR, a salicylate-responsive LysR-type transcriptional regulator (LTTR) that controls the expression of the nah and sal naphthalene degradation operons in several pseudomonads (Werlen et al., 2004; Mitchell & Gu, 2005; Park et al., 2005). Early attempts in this direction involved the use of error-prone PCR for the production of mutant NahR pools, followed by their capture in a Psal-lacZ reporter strain (Psal is the target of NahR). By adding benzoate to the test plate, two mutants responsive to this otherwise noninducer were identified by means of blue/white colony screening. The corresponding NahR variants showed changes in the 169 and 269 residues (Cebolla et al., 1997). One of the mutants also showed a much higher affinity for 3-chlorobenzoate and 3-methyl salicylate than wild-type NahR. In a further development, site-directed mutagenesis of the residues 169 and 269 of this protein created mutants with different induction profiles in the presence of salicylate and benzoate (Park et al., 2005). Because both XylS and NahR can be made to respond to the same compounds (benzoate or salicylate; Ramos et al., 1986; Cebolla et al., 1997), the two regulatory proteins have been combined in an artificial feed-forward cascade able to amplify by various orders of magnitude the response of the output promoter once the system has been induced with salicylate (Fig. 4; Cebolla et al., 2001).
DntR, the main regulator of the 2,4-DNT degradation pathway of Burkholderia sp. strain DNT, is another LTTR that has been modified for altering its effector-response properties (Suen et al., 1996). Although the proximity of the dntR gene to the divergent dnt operon for DNT catabolism could suggest that DntR is the activator of the corresponding genes in response to the pathway substrate, it turns out that the only recognizable effector of this TF is salicylate (Smirnova et al., 2004; de Las Heras et al., 2008). This was an unwelcome result, because a 2,4-DNT-responsive regulator could have interesting applications for the engineering of biological sensors for explosives. To overcome this problem, the crystal structure of DntR was resolved and used to attempt a rational design of the effector-binding site to facilitate accommodation of 2,4-DNT instead of, or in addition to, salicylate (Smirnova et al., 2004). On the basis of such a three-dimensional structure, a number of DntR variants were generated with amino acid changes in residues 110, 111 and 169 (Lonneborg et al., 2007). All of these mutants responded to benzoate, one of them also displaying a small, but still significant ability to respond to 2,4-DNT.
The choice of signal-sensing modules includes some prokaryotic repressors responsive to small molecules, and TFs of this type have been used extensively (in their natural forms or as mutant variants) in many engineered circuits. Besides the conspicuous LacI repressor (Wilson et al., 2007), the tetracycline-responsive TetR protein has received considerable attention because of the ease of generating variants with the most diverse phenotypes (Scholz et al., 2003, 2004; Henssler et al., 2004, 2005; Ramos et al., 2005). These include the recognition of new effectors and the reversal of inducers into inhibitors and vice versa. The TetR protein has been the subject of recent reviews and will not be further discussed in detail here. In contrast, other repressors are worth mentioning: those that regulate the expression of metal resistance genes in environmental bacteria. Specifically, the ArsR protein, which represses the ars genes for As(V) reduction and As(III) extrusion in many bacteria, has been the subject of many As-sensing circuits aimed at the construction of cheap and reliable biosensors for such a metalloid (Stocker et al., 2003; Diesel et al., 2009). Similarly, the various metal ion-responsive regulators found in strain Cupriavidus metallidurans CH34 (Diels et al., 2009) were recurrently exploited through the 1990s for the construction of sensors for a variety of heavy metals (Collard et al., 1994; van der Lelie et al., 1994; Diels et al., 2009), although this endeavour has not received much attention in more recent times. In contrast, the utilization of the mercury-responsive repressor/activator protein MerR, which was proposed some time back as the basis for a mercury biosensor (Bontidean et al., 1998), continues to this day (Harkins et al., 2004; Wegner et al., 2007).
Finally, signal-sensing modules based on prokaryotic transcriptional factors have been used to detect environmental stimuli other than small molecules or metals. For instance, the combination of elements of the SOS system of response to DNA damage of E. coli with a lux reporter is the basis of a test (the so-called VITOTOX) for the recognition of genotoxicity (van der Lelie et al., 1997). Much of this sensing circuit is based on the strong induction endured by the recN gene when the host DNA undergoes environmental insults. Interestingly, this test is more sensitive than the classical Ames assays when applied to compounds such as the polyaromatic hydrocarbons. The VITOTOX has found a fertile field of application for the rapid assessment of DNA-related toxicity in the early stages of drug development (Westerink et al., 2009). Yet, the ultimate bacterial biosensor would be the one able to respond to any kind of environmental stress. The question in this case is whether there is a single TF able to sense stress of any origin and translate it into a defined transcriptional response. The discovery in the early 1990s of what appeared to be a protein in E. coli that was induced under virtually all types of environmental insults (UspA; Nyström & Neidhardt, 1992) led to considerable interest in the possibility of developing generic stress biosensors based on it (Van Dyk et al., 1995). Later work, however, has shown without a doubt that stress sensing is the job of an intricate regulatory network rather than the action of a singular TF (Hengge, 2009).
The above list of TFs used as signal-sensing modules (or with the potential of being used as such) is not at all exhaustive and interested readers are directed to available databases (for instance, the BIONEMO resource; Carbajosa et al., 2008) to explore the landscape of available regulatory elements for circuit engineering. However, it may well be that the chemical desired to be an inducer of the circuit is structurally distant from any of the known native effectors of naturally occurring TFs. The challenge here is not so much to evolve existing effector-binding pockets in TFs, but to engineer from scratch such crevices for new small molecules. In a remarkable contribution, Looger et al. (2003) advocated a smart strategy to address this issue. It involved the use of a periplasmic sugar-binding protein of E. coli as a scaffold to precalculate computationally the optimal constellation of amino acids in a protein pocket for accommodating the chemicals of choice. The best binding combinations predicted in silico could then be entered into the corresponding gene and integrated into a synthetic bacterial signal transduction pathway fused to the output promoter. This method was used for constructing regulatory circuits able to respond inter alia to TNT and for generating biosensor strains for other molecules of interest. Alas, these results have not been easy to replicate in other laboratories and it is likely that in vivo evolution, combined with efficient screening procedures, will remain for the time being the most useful approach for the same purpose. However, computer-directed rational design of novel binding pockets, at least in enzymes, is a thriving area of research (Schueler-Furman et al., 2005), and some successes with re-engineering TFs to the same end might be feasible in the not so distant future.
Note that once the signal-sensing transcriptional factor(s) are in place, the intensity of the response (PoPS) depends to a large extent on the sequence of the target promoter. This intrinsic quality of promoters has been exploited to produce regulatory nodes in which the output range is set at the user's will using different variants of the corresponding DNA sequence (van der Lelie et al., 1997; Stocker et al., 2003; Yagur-Kroll et al., 2010). In other cases, the sequence of a given promoter can be engineered to have activation/repression sites for two or more TFs, each responding to its own signal (Molina-Lopez & Santero, 1999; Tropel et al., 2004). In this way, a single promoter can compute different signals into a unique output (Silva-Rocha & de Lorenzo, 2008). Multiple-input/single-output nodes are of the essence for the design of complex regulatory circuits (Lou et al., 2010).
Interesting and useful as they are, signal-sensing modules based on TFs are not devoid of problems for engineering robust genetic circuits. The principal one is that the conversion of input into output requires various steps, all of which depend on the host gene expression machinery. The TF has to be expressed through its own promoter (which can be regulated by other signals) and then it has to activate or repress the target promoter with the concourse of the host RNAP, additional factors, etc. One alternative to overcome these caveats is to exploit the intrinsic regulatory abilities of RNA, which are increasingly providing new possibilities of network engineering based on different principles and dissimilar strategies. How can this be done? A mere glance at natural riboswitches (Nahvi et al., 2002) and other RNA-based sensors (Winkler et al., 2004) immediately suggests the possibility of using RNA-based elements instead of proteins as the signal-sensing parts of designed circuits. That many existing RNAs modulate gene expression upon interacting with a specific metabolite without involving the action of proteins paves the way towards engineering or evolving new ones with other specificities. The predominant mechanism by which riboswitches modulate the expression of given genes is the control of translation initiation. Less frequently, riboswitches control transcription termination as well.
In general, RNA regulatory elements are located in cis within the untranslated regions (between the promoter and the start codon of the gene) of the mRNAs. Riboswitches have a modular structure comprising two elements: (a) the RNA sequence responsible for specific binding to small molecules (the aptamer part) and (b) the expression context (often called the expression platform) responsible for inhibiting or allowing the translation of the corresponding gene (Fig. 1b). Depending on the presence or absence of the metabolite that binds the aptamer part, riboswitches undergo a structural reorganization, leading to a change in the context RNA sequence that checks the translation of the gene. Depending on the secondary and tertiary structure adopted by the regulatory RNA at stake, the binding of small molecules may exert different effects on the expression of adjacent genes. These include (1) relieving or promoting the formation of a rho-independent transcription terminator, (2) sequestering the ribosome-binding site (RBS) or (3) controlling mRNA degradation (Suess & Weigand, 2008). Naturally occurring RNA regulatory elements have been reviewed recently (Winkler, 2005; Winkler & Breaker, 2005; Gilbert & Batey, 2006; Coppins et al., 2007; Serganov & Patel, 2007) and will not be discussed further here. Instead, we focus henceforth on synthetic riboswitches that have been integrated into bacterial genetic networks in order to control the expression of specific genes.
The idea of integrating synthetic RNA regions that bind chemicals and small molecules (i.e. aptamers; Hermann & Patel, 2000) or respond to temperature (Waldminghaus et al., 2008) in regulatory circuits was raised shortly before the discovery of the first natural riboswitch (Werstuck & Green, 1998). In this case, an aptamer that bound the so-called Hoechst 33342 dye (a fluorescent bisbenzimide) was engineered into the 5′ untranslated region of the lacZ gene. After introducing the construct into eukaryotic cells, a decrease in β-galactosidase activity was observed upon addition of the dye. Since then, many more examples have followed. Synthetic riboswitches emulate the modular structure of naturally existing ones, as they are generally organized in a sequence containing a ligand-binding region and an expression platform (Edwards et al., 2007). Several groups have succeeded in placing the expression of one given gene under the control of aptamers obtained by in vitro selection, the entire set-up implanted into a prokaryotic genetic background (Wilson & Szostak, 1999; Stoltenburg et al., 2007). One of the most remarkable breakthroughs along the line was the breeding of a synthetic riboswitch in the laboratory that was responsive to theophylline. To this end, a high-affinity theophylline-binding aptamer that had been obtained by in vitro selection (Jenison et al., 1994) was fused to the communication module of a functional ribozyme (Soukup & Breaker, 1999). This hybrid sequence was then introduced behind a constitutive Bacillus subtilis promoter adjacent to the xylose-responsive repressor XylR (not to be confounded with the eponymous protein of P. putida mentioned above). The reporter system consisted of an XylR-repressible promoter driving the expression of a downstream fusion of xylA to lacZ. Further analysis of this construct revealed that the expression (i.e. translation) of XylR was indeed controlled by the engineered aptamer. 
This interfered with the translation of the repressor in the absence of theophylline, but allowed its correct expression upon binding this molecule (Suess et al., 2004).
Engineering riboswitches based on aptamers isolated in vitro or in vivo is a fruitful field of research to this day. The pioneering work with the theophylline riboswitch just mentioned paved the way to other strategies to produce expression systems induced by small molecules (or other environmental stimuli) à la carte. One useful selection stratagem is to subject the expression of an antibiotic resistance gene to the structural switch induced by a small molecule on a folded RNA structure. One can, for instance, place in E. coli the same theophylline riboswitch described above, but upstream of the RBS of a lacZ/chloramphenicol resistance gene construct. In this case (Desai & Gallivan, 2004), cells became lacZ+ and resistant to chloramphenicol (Cm) in a manner dependent on the presence of theophylline in the medium. This riboswitch was later evolved to create a library of variants in which the distance between the aptamer and the RBS of the lacZ gene was diversified between four and eight bases using completely randomized nucleotides (Lynch et al., 2007). Following a high-throughput selection assay based on the expression of β-galactosidase, a new version of the theophylline riboswitch was obtained with a low background level of gene expression in the absence of the ligand. This work demonstrated that it is possible to engineer riboswitches with a tight control of their performance by combining synthetic aptamers with neat in vivo selection methods.
In a further advancement on the same system, Gallivan's group has exploited the theophylline riboswitch for reprogramming E. coli cells to detect and navigate towards this compound in the medium. This could be brought about by placing the expression of the CheZ phosphatase (which is involved in the forward motion of the flagellum; Bren & Eisenbach, 2000) under the control of such a riboswitch (Topp & Gallivan, 2007a, b). These results provide a proof of concept that synthetic aptamers and riboswitches can be used for designing complex regulatory circuits, including gross macroscopic behaviours. Therefore, it does not come as a surprise that new methods for functional screening of synthetic riboswitches in vivo are actively being pursued. One successful development in this respect involves the use of FACS, rather than straight genetic selection, for enrichment of the desired properties. In one noteworthy example, the DsRedExpress gene was placed downstream of a sequence led by the theophylline aptamer and a randomized nucleotide sequence. By passing the population of host cells through a FACS system, an optimal theophylline-responsive RNA was isolated with a signal/response ratio of nearly 100-fold. Further analyses showed that in the absence of theophylline, the secondary structure of the RNA sequesters the RBS of the DsRed gene, which is thus expressed at a very low level. In contrast, when the ligand is present, the RBS is exposed and a strong induction occurs (Lynch & Gallivan, 2008). A complementary approach to the same end involved the connection of the theophylline aptamer to a library of linker sequences, followed by an intrinsic transcriptional terminator, and placing this construct upstream of the gfp gene. FACS screening of the bacterial population bearing the library was then carried out for connector regions capable of disrupting the structure of the terminator in response to theophylline binding (Fowler et al., 2008).
It should be noted that riboswitches can affect gene expression in either of two ways: by allowing translation of the product at stake upon binding of a small molecule (or other signal) or, just the contrary, by hindering translation when an effector is present and allowing it when absent. Virtually all naturally existing riboswitches turn off the expression of their downstream genes upon binding the ligand (Winkler & Breaker, 2003; Tucker & Breaker, 2005), while synthetic counterparts, such as those described above, are often selected for the opposite behaviour. This has led to interesting opportunities to convert specified effector-responsive riboswitches into counterparts with the opposite performance. Once again, the theophylline riboswitch led the way (Topp & Gallivan, 2008). While the earlier versions of the system had the small molecule allowing the expression of the downstream gene, subsequent efforts were directed towards the generation of variants that instead repressed the corresponding reporter product upon addition of the effector. To this end, a randomized eight-nucleotide library was placed upstream of a modified theophylline riboswitch. This allowed the selection of a sequence that strongly sequestered the RBS of the downstream gene. Further analysis of the resulting construct revealed that the theophylline-responsive repressor behaved as such because it involved a region present in the very sequence of the reporter gene. Similar attempts have been made to exchange the sign of the natural thiamine pyrophosphate (TPP) riboswitch (Nomura & Yokobayashi, 2007). In this case, the stratagem for the selection of riboswitch variants that stimulated the expression of the downstream gene (instead of repressing it) involved the construction of a random library of the expression platform segment of the TPP riboswitch and its combination with a downstream tetracycline resistance gene.
Subsequent genetic selection of the library in the presence/absence of Tc gave rise to sequence variants that activated the expression of the downstream gene in response to TPP. The data just discussed on the theophylline-responsive and TPP-responsive riboswitches indicate the remarkable flexibility of the RNA in implementing different modes of gene regulation.
Because they are easy to synthesize, express and select for specific qualities, riboswitches are becoming increasingly popular for engineering complex gene expression circuits (Beisel & Smolke, 2009; Win et al., 2009). Furthermore, their properties are only minimally dependent on the host, thereby increasing their orthogonality. In this context, a number of genetic tools for creating artificial riboswitches have been reported that allow the combination of in vitro-selected aptamers with adjacent expression platforms. Once a desired behaviour is achieved, various riboswitches can be combined to generate complex networks. In fact, natural riboswitches often exhibit tandem aptamer regions that bind the same (Welz & Breaker, 2007) or different molecules (Sudarsan et al., 2006), thereby displaying gene control functions similar to Boolean, digital logic gates. The first artificial circuit of this sort involved, unsurprisingly, the theophylline aptamer and the two versions of the TPP riboswitch, i.e. the wild type that represses gene expression upon binding TPP and the engineered, activator counterpart (Nomura & Yokobayashi, 2007). By randomizing the sequence between the two and passing the corresponding libraries through a dual genetic selection, synthetic riboswitches were produced that displayed responses typical of digital AND and NAND logic gates in response to theophylline and TPP as inputs (Sharma et al., 2008).
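The AND and NAND behaviours mentioned above can be made concrete with a small truth-table sketch. This is only an idealized Boolean abstraction: real riboswitches give graded rather than all-or-none responses, and the function names used here (and_gate, nand_gate, truth_table) are hypothetical labels introduced for illustration, not anything taken from the cited work.

```python
from itertools import product


def and_gate(theophylline: bool, tpp: bool) -> bool:
    """Idealized AND gate: the reporter is ON only if both ligands are present."""
    return theophylline and tpp


def nand_gate(theophylline: bool, tpp: bool) -> bool:
    """Idealized NAND gate: the reporter is OFF only if both ligands are present."""
    return not (theophylline and tpp)


def truth_table(gate):
    """Return (theophylline, TPP, reporter) for the four input combinations."""
    return [(t, p, gate(t, p)) for t, p in product((False, True), repeat=2)]


if __name__ == "__main__":
    for name, gate in (("AND", and_gate), ("NAND", nand_gate)):
        print(name)
        for theo, tpp, out in truth_table(gate):
            print(f"  theophylline={int(theo)} TPP={int(tpp)} -> reporter={int(out)}")
```

In this abstraction, the two gates differ only in the sign of the output, which mirrors the way an activating and a repressing version of the same riboswitch can be obtained from one scaffold.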
Many other applications of riboswitches, which, for the most part, are beyond the scope of this article, are thriving at the time of writing this review (Beisel & Smolke, 2009; Win et al., 2009). One intriguing development is the combination of riboswitches with ribozymes for the production of what have been called aptazymes (Soukup & Breaker, 1999; Soukup et al., 2000). In this case, the concept is to bring about a catalytic activity (for instance, self-cleavage of RNA) in a manner dependent on the presence or absence of an effector. In one remarkable example (Ogawa & Maeda, 2008), an RNA sequence derived from the theophylline riboswitch was generated that was able to self-cleave upon exposure to this molecule. The system was engineered such that self-cleavage released the portion of the RNA molecule that sequesters the RBS of the gene placed downstream, thereby allowing its translation. This regulatory element was then used to assemble a regulatory feed-forward loop (FFL; Mangan & Alon, 2003) to further increase the ON/OFF efficiency. The cascade involved placing the expression of the SP6 RNAP gene under the control of this regulatory RNA and then having the construct coexist with a reporter gene that is transcribed by the SP6 RNAP (itself translated from the first construct) and expressed through the same regulatory RNA. On this basis, the expression of each protein is controlled by the same aptazyme and transcription of the output reporter gene is ultimately regulated by SP6 RNAP and theophylline in a typical configuration of the FFL network motif (Ogawa & Maeda, 2008).
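Why routing the same theophylline signal through two layers should sharpen the ON/OFF ratio can be illustrated with a toy steady-state calculation. The sketch below rests on assumptions that are not taken from the cited study: aptazyme activity is 1.0 with theophylline and a hypothetical basal leak of 0.25 without it, and reporter output is simply the product of the SP6 RNAP level and the translational activity of the reporter mRNA.

```python
def aptazyme_activity(theophylline_present: bool, leak: float = 0.25) -> float:
    """Relative translation allowed by the aptazyme (assumed: 1.0 when induced, `leak` otherwise)."""
    return 1.0 if theophylline_present else leak


def single_layer_output(theophylline_present: bool, leak: float = 0.25) -> float:
    """Reporter controlled by the aptazyme alone."""
    return aptazyme_activity(theophylline_present, leak)


def feed_forward_output(theophylline_present: bool, leak: float = 0.25) -> float:
    """Reporter controlled twice by the same signal (assumed multiplicative):
    once through SP6 RNAP production, once through translation of its own mRNA."""
    rnap_level = aptazyme_activity(theophylline_present, leak)
    reporter_translation = aptazyme_activity(theophylline_present, leak)
    return rnap_level * reporter_translation


def on_off_ratio(model, leak: float = 0.25) -> float:
    """Ratio of induced to uninduced output for a given model."""
    return model(True, leak) / model(False, leak)


if __name__ == "__main__":
    print("single layer ON/OFF:", on_off_ratio(single_layer_output))
    print("feed-forward ON/OFF:", on_off_ratio(feed_forward_output))
```

Under these assumptions, the two-layer arrangement squares the ON/OFF ratio of the single switch, because the leak is applied at both the transcriptional and the translational step. Actual gains will depend on kinetics and saturation effects that this sketch ignores.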
Although the multifaceted work on the theophylline aptamer and the natural/synthetic TPP counterparts has by now yielded the impressive dividends discussed above, research on synthetic regulatory RNA elements still has a long way to go. The ultimate objective of these efforts is the integration of any aptamer obtained in vitro within an in vivo functional riboswitch or ribozyme. Some interesting developments in this respect are in sight. One important breakthrough would be the design of robust platforms for the direct in vivo selection of new effector-binding aptamers à la carte instead of the current and somewhat tedious two-step procedure of in vitro binding and in vivo regulation. Last but not least, synthetic riboswitches have been shown to work in a variety of organisms other than those used for selection (Wieland & Hartig, 2008), thereby releasing the corresponding regulatory circuit from the constraints that are inherent to transcriptional regulators (Marqués et al., 2006).
Engineering the output
We have examined above some of the major choices that are at hand for designing signal-sensing elements in prokaryotic transcriptional networks, namely transcriptional regulators and riboswitches. For the rest of the article, the second part of the problem, the outputs available for engineering regulatory systems, will be discussed. The key notion is that of the reporter gene, the component that provides the readout of the activity and state of any regulatory device by encoding an easily recordable and/or selectable phenotype (Sørensen et al., 2006). Such reporters are typically expressed under the regulation of the biological devices described previously (i.e. transcriptional regulators/cognate promoters and aptamers/riboswitches). A large number of genes serve the function of reporting the functioning of the corresponding signal-sensing modules. Reporters range from antibiotic resistance genes and chromogenic activities (Vollmer & Van Dyk, 2004) to enzymes that allow positive or negative selection (Galvao & de Lorenzo, 2005; Rackham & Chin, 2005). The first and most widely used reporter gene in prokaryotic molecular biology has been thus far lacZ encoding the enzyme β-galactosidase of E. coli. The many uses of this somewhat classical reporter gene have been examined (Trun & Trempy, 2004) and will not be addressed in detail here. Furthermore, lacZ is quickly losing ground in favour of GFP (Fig. 5). Instead, we review below those genes that provide optical signals (luminescence, fluorescence) as well as other reporter set-ups that translate inputs into complex community behaviour, electrochemical signals or organoleptic indicators.
Reporters with optical readouts
Luminescence is one of the easiest biological outputs to observe and to record. This is not only because it can be detected by the naked eye, but also because light emission can be easily measured by luminometry or recorded on photographic film, and its source can be mapped and quantified on given objects with CCD imaging. The first application of the five-gene (luxCDABE) bacterial bioluminescence system from Vibrio fischeri (Engebrecht et al., 1983) as a reliable reporter of gene expression was described in 1985 (Engebrecht et al., 1985). The luxCDABE gene cluster encodes the heterodimeric luciferase (luxAB) along with the enzymes required for the production of its substrate, tetradecanal (luxCDE). When growing under aerobic conditions, bacteria bearing the complete lux genes have all the biochemical requirements for light emission, without any need for cell disruption or an enzymatic assay (Meighen, 1993). Other advantages of the lux reporter include the rapid appearance of the bioluminescent signal following the induction of gene expression (Van Dyk et al., 1994) and the wide range of options for measuring light production (Vollmer, 1998). Furthermore, the lux-encoded products have a short functional life and the light signal is swiftly turned over after its production. This allows the kinetics of the luminescent output to reflect promoter functioning in real time more reliably than those reporters that tend to accumulate. On the other hand, the downside of lux-based reporters is the requirement for an active cellular metabolism and oxygen for the luciferase reaction to generate the bioluminescent signal (Vollmer & Van Dyk, 2004). One way to overcome these problems, at least in part, is the use of only the luciferase part of the system (luxAB) as a reporter and adding n-decanal as an exogenous substrate (Meighen, 1993).
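The point that a rapidly turned-over signal tracks promoter activity more faithfully than an accumulating one can be illustrated with a minimal kinetic sketch. It assumes a one-variable model, dR/dt = synthesis_rate * P(t) - decay_rate * R, in which the promoter P(t) is fully ON until a chosen time and OFF thereafter. The rate constants and time units are arbitrary, hypothetical values chosen for illustration, not measured parameters of lux or any other reporter.

```python
def simulate_reporter(decay_rate: float, synthesis_rate: float = 1.0,
                      t_off: float = 50.0, t_end: float = 100.0,
                      dt: float = 0.01) -> list:
    """Forward-Euler integration of dR/dt = synthesis_rate * P(t) - decay_rate * R.

    P(t) is 1 for t < t_off (promoter ON) and 0 afterwards (promoter OFF).
    Returns the reporter level recorded after every time step.
    """
    steps = int(round(t_end / dt))
    level = 0.0
    trace = []
    for i in range(steps):
        t = i * dt
        promoter_on = 1.0 if t < t_off else 0.0
        level += dt * (synthesis_rate * promoter_on - decay_rate * level)
        trace.append(level)
    return trace


def fraction_remaining(trace: list) -> float:
    """Signal left at the end of the run, relative to the peak level reached."""
    return trace[-1] / max(trace)


if __name__ == "__main__":
    short_lived = simulate_reporter(decay_rate=1.0)   # fast turnover (assumed)
    stable = simulate_reporter(decay_rate=0.01)       # slow turnover (assumed)
    print("short-lived reporter, fraction remaining:", fraction_remaining(short_lived))
    print("stable reporter, fraction remaining:", fraction_remaining(stable))
```

With these assumed values, the short-lived reporter has essentially vanished by the end of the run, whereas more than half of the stable reporter signal persists long after the promoter has been switched off.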
Despite the value of the lacZ and lux reporters, the first account of the GFP from Aequorea victoria (Cody et al., 1993) as a readout of gene expression (Chalfie et al., 1994) in virtually all types of (aerobic) organisms soon became a landmark in the history of Molecular Biology and Biotechnology. Since its first applications, the ease of measuring fluorescence without either disruption of the cells or addition of extra substrates has turned GFP and its many variants (Crameri et al., 1996) into a powerful tool to examine gene expression and targeting at the level of single cells (Southward & Surette, 2002). The uses range from the determination of protein localization in microbial cells (Phillips, 2001), to labelling and separation of individual cells in mixed populations using FACS (Tombolini & Jansson, 1998), studies of multiple-species bacterial communities in biofilms (Møller et al., 1998) and many others (Errampalli et al., 1999). For this review, however, we just tackle the applications of GFP as a reporter of gene transcription when combined with a signal-sensing module. In this respect, we find in the literature a considerable number of plasmid constructs that use GFP as a descriptor of bioavailable toluene and related compounds (Stiner & Halverson, 2002), arsenite and arsenate in drinking water (Stocker et al., 2003), octane diffusion in oil droplets (Jaspers et al., 2001a) or the presence of phenanthrene (Tecon et al., 2006). Yet, GFP proteins are not altogether optimal as reporters, because (a) their maturation leading to fluorescent emission is a slow, oxygen-dependent and pH-sensitive process (although some super-folders have been produced; Fisher & DeLisa, 2008), (b) once formed, the lifespan of the fluorescent proteins is quite long (although shorter-lived variants are indeed available; Miller et al., 2000). 
This not only limits their application to aerobic conditions but also leads to serious kinetic bottlenecks in the calculation of the temporal parameters that rule the input/output transfer function. Finally, (c) quantification of GFP output requires specialized laboratory instrumentation (fluorescence microscopy or a fluorimeter). Furthermore, while lux reporters emit light as a result of an enzymatic (and therefore reusable) process, fluorescence is associated with the production of individual GFP molecules, thereby resulting in comparatively lower sensitivity (Kohlmeier et al., 2007). As a consequence, virtually all signal-sensing devices based on GFP variants are assembled in multicopy plasmids rather than in the genome of the host bacterium. Only a few chromosomally inserted gfp reporter strains have been constructed for phenolics (Goulian & van der Woude, 2006), salicylate (Huang et al., 2005) or herbicides such as 2,4-dichlorophenoxyacetic acid (Fuchslin et al., 2003).
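The kinetic bottleneck caused by long-lived reporters can be illustrated with a minimal sketch, assuming a one-variable model in which the reporter is synthesised in proportion to promoter activity and lost by first-order decay. The function names, the two half-lives (10 min and 24 h) and the unit synthesis rate are illustrative assumptions and are not taken from the studies cited above.

```python
import math

def simulate_reporter(promoter_activity, half_life_min, dt_min=1.0, synthesis_rate=1.0):
    """Euler integration of dR/dt = synthesis_rate * activity - decay * R.

    promoter_activity: promoter state at each time step (arbitrary units).
    half_life_min: assumed functional half-life of the reporter, in minutes.
    Dilution by growth is not modelled separately; it can be lumped into the half-life.
    """
    decay = math.log(2) / half_life_min
    level, trace = 0.0, []
    for activity in promoter_activity:
        level += dt_min * (synthesis_rate * activity - decay * level)
        trace.append(level)
    return trace

def remaining_fraction(trace, pulse_end):
    """Signal left at the end of the run relative to the end of the induction pulse."""
    return trace[-1] / trace[pulse_end - 1]

# Hypothetical experiment: promoter ON for 60 min, then OFF for 120 min.
pulse = [1.0] * 60 + [0.0] * 120
stable = simulate_reporter(pulse, half_life_min=1440.0)    # long-lived reporter
unstable = simulate_reporter(pulse, half_life_min=10.0)    # short-lived reporter

print(remaining_fraction(stable, 60), remaining_fraction(unstable, 60))
```

Under these assumptions, the long-lived reporter still shows most of its signal two hours after the promoter has been switched off, whereas the short-lived one has returned close to zero, which is why accumulating reporters blur the timing of promoter activity.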
Fortunately, one does not necessarily have to choose between GFP and enzymatic (lacZ or lux) readouts of gene expression, because some cassettes have been engineered to offer the best of both possibilities. On the one hand, converting lacZ fusions into GFP counterparts by homologous recombination with adequate vectors is straightforward (Goulian & van der Woude, 2006). On the other hand, it is also possible to use dual reporters to examine gene expression in single cells while simultaneously measuring it at the population level. For instance, the α-lac fragment has been fused to the N-terminus of a full-length GFP. When placed in a suitable host strain, the resulting protein exhibits both enzymatic β-galactosidase activity and fluorescence emission (Martin et al., 2009). This bifunctional reporter is claimed to have a wide dynamic range and is instrumental for the characterization of gene expression in single cells as well as in cultures. With a similar purpose, a dual reporter gfp-lux has been constructed in which the order and the translation signals of the native luxCDABE operon of Photorhabdus luminescens were modified for optimal expression in a wider variety of hosts. The resulting luxABCDE operon was then placed downstream of a promoterless gfp gene, with optimal results (Qazi et al., 2001). This gfp-luxABCDE cassette has been very useful for comparing various parameters related to gene expression in single cells and in complete populations (Qazi et al., 2004).
These developments, however, do not completely overcome the problems with GFP mentioned above, specifically the need for oxygen for proper folding and maturation of the chromophore. Fortunately, GFP is not the only fluorescent protein that nature offers. In more recent times, new products capable of emitting fluorescence without oxygen have been proposed as GFP alternatives (Drepper et al., 2007). Such autofluorescent proteins are derived from the photoactive N-terminal light–oxygen–voltage domains of blue light bacterial photoreceptors of B. subtilis (the YtvA protein; Losi et al., 2002) and P. putida (the SB2 protein; Krauss et al., 2005). These domains, which were optimized through site-directed mutagenesis in order to enhance their natural but weak fluorescence, use FMN as their chromophore. When expressed in E. coli (and probably in other bacteria), such engineered proteins produced optical signals in the same range as the blue (BFP) and the cyan (CFP) variants of GFP. Yet, the main advantage of these proteins is that they emit under both aerobic and anaerobic conditions, as shown when they were introduced into the facultative anaerobe Rhodobacter capsulatus. These interesting proteins have been improved further for utilization in eukaryotic hosts such as Saccharomyces cerevisiae and Candida albicans strains, which have been made to fluoresce under anoxic conditions (Tielker et al., 2009). Alas, the system still depends on the availability of FMN for these proteins to yield the optical output, and FMN may not be available under all physiological conditions or in all cellular compartments. In any case, the utilization of oxygen-insensitive fluorescent proteins will certainly expand the applications of such types of reporters in gene-regulatory networks beyond the current state of affairs (Kohlmeier et al., 2007).
One interesting development regarding the engineering of output devices in gene expression circuits is the conversion of the enzymatic reaction brought about by a reporter product into an electric signal. Not surprisingly, the first report of an electroanalytical set-up for online and in situ quantification of gene expression (Biran et al., 1999) dealt with the monitoring of the E. coli β-galactosidase using p-aminophenyl-β-D-galactopyranoside (PAPG) as the substrate (Kulys et al., 1980). Transformation of PAPG by the enzyme yields p-aminophenol (PAP), which can be oxidized at an electrode. If a fixed voltage is applied, the current generated by PAP oxidation can then be measured (Rosen & Rishpon, 1989) and the β-galactosidase activity can be easily followed (Biran et al., 1999). This nondisruptive stratagem allows near-real-time monitoring of gene expression and can thus be applied to a variety of engineered genetic constructs. For instance, a whole-cell biosensor for the detection of cadmium was constructed, which consisted of the lacZ gene expressed under the Cd(II)-responsive promoter zntA. Using an electrochemical device along the lines just discussed, Biran et al. (2000) were able to detect nanomolar concentrations of environmental cadmium (e.g. in freshwater, seawater and soil) within minutes. Using a similar approach, it has been possible to monitor electrochemically the damage to DNA caused by genotoxic agents such as 4-nitroquinoline-1-oxide. In this case (Paitan et al., 2003), the standard Chromotest strain (Quillardet & Hofnung, 1985), which bears a fusion between the SOS promoter PsfiA and lacZ, was adapted for this type of output recognition. Fortunately, electrochemical detection is not restricted to lacZ and many other enzymes can potentially be used as reporters amenable to this strategy, provided that suitable substrates that produce redox compounds are available.
For instance, alkaline phosphatase (PhoA) catalyses the transformation of p-aminophenyl phosphate into PAP and therefore its activity can be electrochemically determined as well (Paitan et al., 2004). This feature was exploited for the construction of a sensor strain able to indicate the presence of aromatic hydrocarbons such as toluene and xylenes in the environment. In this case, the concept was to fuse the xylene-responsive promoter Ps of the TOL plasmid of P. putida mt-2 (along with the XylR transcriptional factor; Ramos & Marques, 1997) to a phoA reporter gene. As explained before, this set-up allowed reliable monitoring of the evolution of environmental xylene by just following the electric current caused by PAP on the electrodes of the device (Paitan et al., 2004). Hopefully, this approach will expand the use of enzymes as gene expression reporters as suitable redox substrates and products become available. Furthermore, devices for monitoring electrochemical reporters can be miniaturized (Popovtzer et al., 2005), allowing a sensitive real-time detection of target chemicals in situ by means of portable, lightweight equipment.
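The link between the measured current and the amount of PAP oxidized follows from Faraday's law, and can be sketched as below. This is a minimal illustration rather than the procedure of the cited studies: the function names are hypothetical, the number of electrons exchanged per PAP molecule is left as a parameter (set here to 2 as an assumption), and it is assumed that the baseline-corrected current arises only from PAP oxidation.

```python
FARADAY = 96485.332  # Faraday constant, coulombs per mole of electrons

def product_rate_from_current(current_amperes, electrons_per_molecule=2):
    """Moles of product oxidized per second for a given steady current.

    Faraday's law: I = n * F * (dN/dt), hence dN/dt = I / (n * F).
    electrons_per_molecule (n) is an assumption the user should verify
    for the redox couple and electrode conditions actually used.
    """
    if electrons_per_molecule <= 0:
        raise ValueError("electrons_per_molecule must be positive")
    return current_amperes / (electrons_per_molecule * FARADAY)

def baseline_corrected_rates(currents, baseline, electrons_per_molecule=2):
    """Convert a series of current readings into oxidation rates (mol/s).

    The baseline (current recorded without substrate or without induction)
    is subtracted first; negative values are clipped to zero.
    """
    return [
        product_rate_from_current(max(i - baseline, 0.0), electrons_per_molecule)
        for i in currents
    ]

# Hypothetical readings in amperes, with a 5 nA background current.
readings = [5e-9, 25e-9, 105e-9]
print(baseline_corrected_rates(readings, baseline=5e-9))
```

The resulting rates are proportional to reporter enzyme activity only as long as the substrate is not limiting and the electrode collects a constant fraction of the PAP produced, which would have to be checked by calibration.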
Ice nucleation as a reporter of gene expression
The protein product of the inaZ gene from Pseudomonas syringae S203 is responsible for ice nucleation and confers the ability to catalyse ice formation in supercooled water. This interesting property, which has also been described in other bacteria such as Erwinia herbicola, P. fluorescens, Pseudomonas viridiflava and Xanthomonas campestris (Govindarajan & Lindow, 1988), represents one of the very few cases of phase transition caused by a biological component. From a mechanistic point of view, it turns out that the product of the inaZ gene provides an authentic physical template for ice-crystal formation, rather than behaving as an enzyme (Wolber et al., 1986). In a quite innovative development, Lindow's laboratory (Govindarajan & Lindow, 1988) proposed that such ice nucleation activity can be quantified conveniently through an extremely simple droplet freezing assay. The advocates of this procedure claim that inaZ gene fusions are c. 10⁵–10⁶-fold more sensitive in measuring transcriptional activity than equivalent lacZ fusions (Lindgren et al., 1989). If correct, this means that the user can detect the formation of as few as one InaZ protein in a population of ≤10⁶ cells. The extraordinary sensitivity of inaZ makes it a reporter of choice for a large number of applications, in particular those that require in situ sensing of very minute concentrations of a given chemical or physical input. This includes the detection of available iron or nitrate in the rhizosphere (Loper & Henkels, 1997; DeAngelis et al., 2007), determination of sugar availability on plant leaves (Miller et al., 2001) and recognition of intercellular signalling molecules, for example N-acyl homoserine lactones (AHLs; DeAngelis et al., 2007). Given all the claimed advantages of ice nucleation with respect to other reporters and the ease of its measurement, it comes as a surprise that so few researchers have used it thus far (Fig. 5) and that its application is mostly limited to studies on plant–microorganism interactions. Whether or not the use of inaZ increases in the future, this approach is undoubtedly a smart and elegant strategy to quantify gene expression with altogether unsophisticated equipment (Wolber et al., 1986). This is in sharp contrast to the increasingly intricate methodologies for examining optical-output reporters discussed above.
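How a droplet freezing assay can be turned into a number is sketched below. The calculation assumes the commonly used Poisson-based estimate, in which the concentration of active ice nuclei is derived from the fraction of droplets that remain unfrozen at the test temperature; the function name and the example figures are hypothetical, and the cited studies may have used a different normalization.

```python
import math

def ice_nuclei_per_cell(n_droplets, n_frozen, droplet_volume_ml, cells_per_ml):
    """Estimate active ice nuclei per cell from a droplet freezing assay.

    Assumes nuclei are Poisson-distributed among droplets, so that the
    fraction of unfrozen droplets equals exp(-nuclei_per_ml * volume).
    cells_per_ml refers to the suspension actually dispensed as droplets.
    """
    if n_droplets <= 0 or droplet_volume_ml <= 0 or cells_per_ml <= 0:
        raise ValueError("droplet number, volume and cell density must be positive")
    if not 0 <= n_frozen < n_droplets:
        # If every droplet freezes, the suspension must be diluted and re-assayed.
        raise ValueError("n_frozen must be at least 0 and below n_droplets")
    unfrozen_fraction = (n_droplets - n_frozen) / n_droplets
    nuclei_per_ml = -math.log(unfrozen_fraction) / droplet_volume_ml
    return nuclei_per_ml / cells_per_ml

# Hypothetical assay: 40 droplets of 10 microlitres, 20 frozen, 1e6 cells per ml.
print(ice_nuclei_per_cell(40, 20, droplet_volume_ml=0.01, cells_per_ml=1e6))
```

Because the estimate diverges when all droplets freeze, assays of this kind are normally run over a dilution series so that at least one dilution leaves some droplets unfrozen.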
Production of odours
Although far less quantitative, odour is yet another signal that can be used as an output in bacterial reporter systems where a gross diagnosis of the state of the gene expression circuit, rather than a detailed measure of its level, is required. If necessary, odorous biological compounds can be analysed using GC coupled to MS. However, their main value as reporters is that they can be detected simply by the human or the animal nose (the latter being especially sensitive in the case of dogs). The production of a distinct odour as a proxy of gene expression overcomes not only the need for any equipment but also the limitations of the optical reporters discussed above. Odour can be sensed at any time of the day and at all levels of light. One interesting enzyme for this application is a 46-kDa protein named cystalysin, from Treponema denticola. Cystalysin removes sulphhydryl groups from S-containing compounds (e.g. cysteine, cystathionine and S-aminoethyl-l-cysteine), producing inter alia the strong, if unpleasant, odorant H2S (Chu et al., 1999). Another attractive protein to the same end is the l-methionine-α-deamino-γ-mercaptomethane-lyase (METase). This enzyme, which is found in microorganisms such as Pseudomonas, Trichomonas and Clostridium (Kreis & Hession, 1973; Inoue et al., 1995; Hori et al., 1996; McKie et al., 1998), catalyses the conversion of l-methionine into the intensely odorous compound methanethiol (also known as methyl mercaptan), a colourless gas with a smell like rotten cabbage (Yoshimura et al., 2000). Although not yet described in the academic literature, the METase of P. putida has been patented as a reporter system on the basis of its capacity to generate large amounts of methanethiol, which can be unambiguously detected as an indicator of gene expression (Nicklin et al., 2007).
Fortunately, not all odour reporters have to be disagreeable, and a number of recent efforts have been directed towards the assembly and validation of genes and pathways leading to pleasant scents. For instance, an E. coli strain has been reported to produce the smell of bananas during the stationary phase (http://openwetware.org/wiki/IGEM:MIT/2006/Blurb). To this end, the enzymes that catalyse the biosynthesis of isoamyl alcohol and its further conversion into isoamyl acetate (the molecule that confers the distinct banana aroma) were assembled as a reporter system and placed behind a stationary-phase promoter. The same strain also carried a second odour reporter consisting of two enzymes, the first producing salicylic acid from chorismate and the second converting it into methyl salicylate, the typical scent of wintergreen. In this case, the two-enzyme reporter was expressed from an exponential-phase promoter. While these ingenious approaches may appear unusual, they also open exciting perspectives for merging sophisticated signal-sensing gene circuitry with very simple detection schemes. In situ detection of explosive residues from inconspicuous antipersonnel mines is certainly one of the instances where odour reporters (whether disagreeable or pleasing) may find a fertile field of application. Measurement of odour signals and their qualification as pleasant or unpleasant is, however, a considerable problem that is being tackled, for instance, with the so-called artificial nose (eNose) technology (Haddad et al., 2010), an issue that is beyond the scope of this article.
Reporting signal sensing by means of population behaviour
Although the downstream response of sensor systems typically involves one or more reporter genes/products, the choice of outputs is not limited to them. The growing ease of gene cloning and even the complete synthesis of multicomponent circuits allow the straightforward interfacing of distinct signal-sensing modules with complex regulatory networks. One interesting possibility is to subject a genetically encoded community trait to a predetermined signal that is entered into the system at the user's will. One of these cases has been mentioned in passing above: the placing of the cheZ gene of E. coli under the control of a theophylline-responsive riboswitch (Lynch et al., 2007). CheZ, one of the proteins involved in the chemotaxis signal transduction pathway, helps translate the binding of ligands to chemoreceptors into a change in the rotational direction of the flagellar motor. This is because CheZ dephosphorylates CheY, the protein that determines the sense of the flagellum motion. In the absence of a suitable chemoattractant, CheY is phosphorylated, thereby causing the flagellum to rotate clockwise and making cells tumble and not migrate in semi-solid agar. In fact, E. coli mutants lacking CheZ cannot dephosphorylate CheY-P and are nonmotile. In contrast, the presence of a chemoattractant makes CheZ dephosphorylate CheY, which induces a counterclockwise motion of the flagellum and thus smooth swimming. On this basis, the group of Gallivan reasoned that by placing the expression of CheZ under the control of a theophylline riboswitch, the chemotactic behaviour of the entire bacterial population could be reprogrammed to navigate towards this drug (Lynch et al., 2007). And this is exactly what happened: by conditioning the translation of the cheZ mRNA to the addition of theophylline, an E. coli strain was generated that could rotate its flagella counterclockwise only in the presence of the compound and thus migrated towards this stimulus.
The specificity of the system was indicated by the fact that the same engineered cells did not migrate towards caffeine, a structural analogue of theophylline. Moreover, the engineered chemotaxis platform made cells move following the gradient of the attractant. Such a property allowed the set-up of a high-throughput procedure for selecting new expression platform sequences in the riboswitch that improved induction rates (Topp & Gallivan, 2007a, b). Besides being of academic interest, these approaches have an immense application potential in environmental biotechnology: one could, for instance, design a motile bacterium with biodegradative capacities to seek and destroy recalcitrant pollutants such as atrazine (Sinha et al., 2010).
The importance of these experiments is considerable, as they were a proof of concept that extant regulatory circuits controlling a gross community trait can be altogether reprogrammed by just manipulating one of the components of the signal transmission program. One such trait is biofilm formation, a complex phenomenon that virtually all bacteria are able to display by coordinating the behaviour of an entire microbial population (Monds & O'Toole, 2009; Nadell et al., 2009). Although many distinct genes are involved in the process, occasionally one of them may act as a bottleneck for the entire biofilm-building progression. One such gene in E. coli is traA, which encodes the pilin subunit of the conjugative pilus, so that traA mutants are unable to settle on surfaces (Ghigo, 2001). On this basis, the conditional expression of the traA gene can be engineered to control biofilm formation. In this case, surface colonization becomes the gross reporter of the signal under which the expression of traA is placed. This circumstance has been exploited, for instance, to make biofilm buildup dependent on a forward-designed bistable switch circuit responsive to either quorum-sensing (QS) signals or a transient activation of the SOS pathway (Kobayashi et al., 2004). In this last case, the resulting strain acquired a steady ability to form biofilms in response to DNA damage. As with the example of the theophylline riboswitch, this instance highlights the possibility of interfacing à la carte predetermined inputs with complex outputs by integrating new modules into pre-existing regulatory networks.
Finally, coordinated community behaviour has also been engineered through the genetic rewiring of components of the QS system, so that bacteria eventually perform as a macroscopic biosensor. One prominent case involved the set-up of a synthetic multicellular system in which engineered E. coli receiver cells were programmed to form patterns of differentiation based on chemical gradients of the AHL signal that was synthesized by sender cells (Basu et al., 2005). In another case, the engineered sensor circuit was designed to produce an oscillatory output visible to the naked eye. This required not only the design of the corresponding genetic circuit for alternating production of the output (for example, GFP) in single cells but also the confinement of the reporter bacteria in a microfluidic chamber (Danino et al., 2010). This allowed the synchronization of promoter activity with the oscillations of the diffusible autoinducer signals, resulting in the emergence of spatiotemporal expression waves at millimetre scales. That QS signals allow temporal coordination of population behaviour adds considerable value to the search for novel autoinducers and novel autoinducer-responsive transcriptional factors. To this end, various procedures to identify such regulatory elements from culturable bacteria (Steindler & Venturi, 2007; Kumari et al., 2008) and from metagenomes (Hao et al., 2010) have been proposed. These fascinating developments, which are beyond the scope of this review, merge molecular microbiology and hard-core engineering for the sake of robust programming of bacterial behaviour. The large number of potential applications of these approaches is easy to foresee.
Conclusions and outlook
Complex regulatory networks are built through the connection of discrete input/output modules that are reminiscent of the components of electronic circuits. In natural, extant networks, the output of a given regulatory node is understood as an input to the next one. This is precisely the challenge at stake when engineering artificial regulatory schemes, although in most cases it can be overcome through an optimum combination of transcriptional factors, promoters, ribozymes and small molecules. The transfer functions between inputs and outputs in any node of a network can be measured and parametrized in vivo using quantifiable reporter products, making it possible to build regulatory models and control circuits with a high degree of confidence (Ronen et al., 2002; Kalir & Alon, 2004). One pending issue, however, is the development of standardized methods for describing absolute gene expression figures in terms of PoPS and RiPS, as well as conversion tables for reliably translating the activity units of one reporter into those of the others and into PoPS. In a different direction, new downstream reporters can be envisaged to bring complex circuit designs into practical use in various environmental and biosensing endeavours. Furthermore, the output of the engineered reporters in individual cells can be synchronized or coordinated with components of the motility apparatus and the QS system, so that the macroscopic behaviour of an entire population can also be rationally programmed. Finally, the growing discovery of regulatory elements based on RNA (Nahvi et al., 2002) and the implementation of strong selection systems (Beisel & Smolke, 2009; Sinha et al., 2010) will be paramount to evolve both proteins and aptamers as components of artificial gene circuits (Win et al., 2009).
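As an illustration of what parametrizing a transfer function may involve, the sketch below assumes that the input/output relationship of a node follows a Hill function, a common but by no means universal choice that is not prescribed by the studies cited above. The function names are hypothetical, and the basal and maximal outputs are assumed to be known from uninduced and saturating conditions.

```python
import math

def hill(x, basal, vmax, K, n):
    """Reporter output for inducer concentration x under a Hill model."""
    return basal + vmax * x**n / (K**n + x**n)

def fit_hill(inputs, outputs, basal, vmax):
    """Estimate K and n by linear regression on a Hill plot.

    With f = (y - basal) / vmax, the model gives
    log(f / (1 - f)) = n * log(x) - n * log(K),
    so the slope is n and the intercept is -n * log(K).
    Points with x <= 0 or f outside (0, 1) are skipped.
    """
    xs, ys = [], []
    for x, y in zip(inputs, outputs):
        f = (y - basal) / vmax
        if x > 0 and 0 < f < 1:
            xs.append(math.log(x))
            ys.append(math.log(f / (1 - f)))
    if len(xs) < 2:
        raise ValueError("need at least two usable data points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("inputs must span more than one concentration")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(-intercept / slope), slope  # (K, n)

# Synthetic, noise-free data generated from assumed parameters K = 10, n = 2.
doses = [1.0, 3.0, 10.0, 30.0, 100.0]
signal = [hill(d, basal=5.0, vmax=100.0, K=10.0, n=2.0) for d in doses]
K_est, n_est = fit_hill(doses, signal, basal=5.0, vmax=100.0)
print(K_est, n_est)
```

With real measurements, the linearized fit is sensitive to noise near the basal and saturated ends of the curve, so a nonlinear least-squares fit would usually be preferable; the linear version is used here only to keep the example short and self-contained.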
The work in the authors' laboratory was supported by generous research grants from the Spanish Ministry of Science and Innovation (CONSOLIDER), by contracts of the Framework Program of the EU (MICROME, BACSINE) and by Funds from the Autonomous Community of Madrid.