Targeting of proteins to the twin‐arginine translocation pathway

Abstract The twin‐arginine protein transport (Tat pathway) is found in prokaryotes and plant organelles and transports folded proteins across membranes. Targeting of substrates to the Tat system is mediated by the presence of an N‐terminal signal sequence containing a highly conserved twin‐arginine motif. The Tat machinery comprises membrane proteins from the TatA and TatC families. Assembly of the Tat translocon is dynamic and is triggered by the interaction of a Tat substrate with the Tat receptor complex. This review will summarise recent advances in our understanding of Tat transport, focusing in particular on the roles played by Tat signal peptides in protein targeting and translocation.


| INTRODUCTION
The transport of proteins across lipid membranes is an essential biological process. In prokaryotes, the general secretory (Sec) and twin-arginine translocation (Tat) pathways operate in parallel to transport proteins across the cytoplasmic membrane. The Sec system transports unfolded proteins through a narrow channel, in a process that can be either co-translational or post-translational (Figure 1a; reviewed in Collinson, Corey, & Allen, 2015; Tsirigotaki, Geyter, Sostaric, Economou, & Karamanou, 2017). Following translocation, proteins fold in the extracellular compartment, usually aided by periplasmic chaperones (Stull, Betton, & Bardwell, 2018). The Tat pathway, by contrast, exports proteins that are folded in the cytosol and is therefore strictly post-translational (Figure 1a; Berks, 2015; Cline, 2015; Hamsanathan & Musser, 2018). Both of these pathways are also able to integrate hydrophobic segments of transmembrane proteins into the bilayer.
The Sec pathway is ubiquitous and essential as the majority of both extracytoplasmic proteins and polytopic membrane proteins require this system for their localisation. The Tat pathway transports a smaller number of substrates and, as a consequence, is not essential for survival under most growth conditions (reviewed in Palmer & Berks, 2012). Indeed, the Tat pathway is absent from some classes of bacteria and archaea. Nevertheless, the Tat system plays an important role in prokaryote physiology as it transports a subset of proteins that must be exported in a folded state. These proteins are primarily those that noncovalently bind prosthetic groups in the cytoplasm, for example, iron-sulphur clusters or the molybdopterin cofactor (Sargent et al., 1998; Weiner et al., 1998). One of the most important Tat substrates is the Rieske iron-sulphur protein (Figure 2a) of the cytochrome bc1 and b6f respiratory complexes, which is essential for bacterial photosynthesis and for many types of respiratory metabolism (Aldridge, Spence, Kirkilionis, Frigerio, & Robinson, 2008; Bachmann, Bauer, Zwicker, Ludwig, & Anderka, 2006; De Buck et al., 2007; Hinsley, Stanley, Palmer, & Berks, 2001; Keller, Keyzer, Driessen, & Palmer, 2012; Meloni et al., 2003). The Tat machinery is conserved in the chloroplasts and mitochondria of plants but, with the exception of homoscleromorph sponges, it has been lost from the mitochondria of animals (Carrie, Weissenberger, & Soll, 2016; Petru et al., 2018; Pett & Lavrov, 2013; Settles et al., 1997).

| Tat SIGNAL PEPTIDES
Proteins are targeted to the Sec and Tat pathways by N-terminal signal peptides that are superficially very similar. They share a tripartite structure with a positively charged n-region, a hydrophobic h-region of at least 12 amino acids with a propensity for helix formation, and a polar c-region containing a cleavage site for signal peptidase (Dalbey & Wickner, 1985; Lüke, Handford, Palmer, & Sargent, 2009; Yahr & Wickner, 2001; Figure 1b). One of the key differences between the two targeting signals is that the n-regions of Tat signal peptides almost always have a pair of consecutive arginines, for which the Tat system is named, whereas the Sec signal peptide n-regions are simply positively charged with no specific sequence constraints (Berks, 1996). Numerous studies have highlighted the twin-arginines in Tat signal peptides as essential for efficient Tat transport, with even conservative substitution with lysine being poorly tolerated (e.g., DeLisa, Samuelson, Palmer, & Georgiou, 2002; Stanley, Palmer, & Berks, 2000). By contrast, there appears to be no mechanistic difference between lysine and arginine in Sec signal peptides (Sasaki, Matsuyama, & Mizushima, 1990).
FIGURE 1 Targeting to the Sec and Tat pathways. (a) The Sec pathway transports unfolded proteins. During co-translational targeting to Sec, the signal sequence is recognised at the translating ribosome by ribosome-bound signal recognition particle (SRP) and the nascent chain is guided via the SRP receptor to the Sec translocon, where the energy of protein synthesis is harnessed to drive protein transport. In the post-translational pathway, the substrate is maintained in an unfolded conformation and guided to the Sec translocon by the ATPase, SecA. ATP hydrolysis by SecA provides the driving force for Sec-dependent post-translational protein export (Collinson et al., 2015; Lycklama a Nijeholt & Driessen, 2012; Rapoport, Li, & Park, 2017; Tsirigotaki et al., 2017). The Tat pathway transports folded proteins without the requirement for targeting factors. (b) Signal peptides that target to Sec and Tat pathways share a similar tripartite organisation with a positively charged n-region, hydrophobic h-region and polar c-region containing a signal peptidase cleavage site (AxA). Tat signal peptides have an almost invariant pair of arginines that are embedded within a SRRxFLK motif (Berks, 1996). A helix destabilising residue (#), often a glycine, serine or proline towards the C-terminal end of the h-region, provides flexibility at this region of the signal peptide (Hamsanathan et al., 2017). A basic residue (+) is frequently found in the Tat signal peptide c-region and serves as a Sec avoidance motif (Bogsch et al., 1997). The arrow indicates the position of signal peptide cleavage. Amino acid sequences of two E. coli Sec signal peptides, OmpA (post-translational Sec targeting; Fekkes et al., 1998) and DsbA (co-translational targeting; Schierle et al., 2003), are shown; basic residues in the n-region and the signal peptidase cleavage site in the c-region are underlined and shown in bold. Two well-studied E. coli Tat signal peptides, SufI and TorA, are also shown. Residues that match the twin-arginine consensus are in red, the Sec avoidance signal in bold typeface and the signal peptidase cleavage site in underline.

The paired arginines of the Tat signal peptide are part of a larger motif (S-R-R-x-F-L-K; Figure 1b). The amino acids at the other motif positions are only semi-conserved and none of them is essential for Tat transport (Stanley et al., 2000). After the twin-arginines, the consensus phenylalanine has the highest frequency (e.g., it is found in two thirds of Escherichia coli Tat signals; Palmer, Sargent, & Berks, 2010). Mutational analysis has indicated that amino acid hydrophobicity at this position is an important factor and substitutions that reduce hydrophobicity decrease transport efficiency (Stanley et al., 2000).
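The motif description above lends itself to a simple sequence scan. The following is a minimal sketch, not a validated Tat predictor: the strict pattern encodes the S-R-R-x-F-L-K consensus from the text, whereas the relaxed pattern (arginine pair, any residue, then a hydrophobic residue), the 35-residue search window and both example sequences are assumptions introduced purely for illustration.

```python
import re

# Strict consensus taken from the text: S-R-R-x-F-L-K (x = any residue).
STRICT = re.compile(r"SRR.FLK")
# Relaxed pattern (an illustrative assumption): only the arginine pair is
# required, followed by any residue and then a hydrophobic residue.
RELAXED = re.compile(r"RR.[FLIVM]")


def find_twin_arginine(seq, n_terminal_window=35):
    """Return (start_index, matched_text, kind) for motifs near the N-terminus.

    n_terminal_window is a hypothetical cut-off limiting the search to the
    region where a signal peptide n-region would be expected.
    """
    head = seq[:n_terminal_window].upper()
    hits = []
    for kind, pattern in (("strict", STRICT), ("relaxed", RELAXED)):
        for match in pattern.finditer(head):
            hits.append((match.start(), match.group(), kind))
    return hits


# Hypothetical sequences, not taken from any database.
tat_like = "MSDKSRRDFLKASAAGLAGLALGA"
sec_like = "MKKLLAVAVLALSGCAQA"

print(find_twin_arginine(tat_like))
print(find_twin_arginine(sec_like))
```

Because the non-arginine positions are only semi-conserved, a real analysis would probably rely on the relaxed pattern or on a dedicated prediction tool rather than on the strict consensus alone.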
A second critical difference between Sec and Tat signal peptides is the degree of h-region hydrophobicity. Highly hydrophobic Sec signal sequences interact with the signal recognition particle (SRP) when they emerge from the ribosome exit tunnel to engage in co-translational translocation. By contrast, moderately hydrophobic Sec signals escape SRP and mediate post-translational translocation through interaction with SecA and other chaperones (Tsirigotaki et al., 2017). Tat signal peptide h-regions are generally less hydrophobic than Sec signals and contain significantly more glycines and fewer leucines (Cristóbal, Gier, Nielsen, & Heijne, 1999). However, there is an overlap in hydrophobicity scores for naturally occurring Sec and Tat signal peptides and more than half of E. coli Tat signals are able to mediate some degree of productive engagement with the Sec pathway if they are fused to a Sec-compatible reporter protein (Tullman-Ercek et al., 2007).
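The comparison of h-region hydrophobicity can be made concrete with a small calculation. This sketch assumes the Kyte-Doolittle hydropathy scale, which is one common choice and not necessarily the scale used in the studies cited above; the two h-region sequences are hypothetical and chosen only to contrast a glycine-rich segment with a leucine-rich one.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle hydropathy over a peptide segment."""
    segment = segment.upper()
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def residue_fraction(segment, residue):
    """Fraction of positions in the segment occupied by the given residue."""
    segment = segment.upper()
    return segment.count(residue.upper()) / len(segment)


# Hypothetical h-regions for illustration only.
tat_like_h = "AGAGALGAAGSLGA"  # glycine-rich, few leucines
sec_like_h = "LLALLALLLAVLLA"  # leucine-rich

for name, h_region in (("Tat-like", tat_like_h), ("Sec-like", sec_like_h)):
    print(name,
          round(mean_hydropathy(h_region), 2),
          round(residue_fraction(h_region, "G"), 2),
          round(residue_fraction(h_region, "L"), 2))
```

A single averaged score cannot separate the two classes cleanly, which is in keeping with the overlap in hydrophobicity scores noted above for naturally occurring Sec and Tat signal peptides.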
Evidence discussed below suggests that the helix-destabilising residues often found towards the C-terminal end of the h-region (Figure 1b) are required to allow the signal peptide to undergo conformational changes during interaction with the Tat machinery.
Finally, Tat signal peptides frequently contain at least one basic amino acid in their c-regions. This is not required for targeting to the Tat system and can be readily substituted with a neutral or even negatively charged amino acid without affecting the rate of Tat transport (Stanley et al., 2000). Instead, it has been shown that the positive charge acts as a Sec-avoidance motif, reducing functional engagement of the signal peptide with the Sec pathway (Blaudeck, Sprenger, Freudl, & Wiegert, 2001; Bogsch, Brink, & Robinson, 1997; Cristóbal et al., 1999), consistent with earlier studies that found that c-region basic residues interfere with the function of Sec signal peptides (Geller, Zhu, Cheng, Kuhn, & Dalbey, 1993; Li, Beckwith, & Inouye, 1988).

FIGURE 2 The Sec and Tat pathways cooperate for the biogenesis of some bacterial Rieske proteins. (a) The topological arrangements of selected bacterial Rieske proteins are shown. The vast majority of bacterial Rieske proteins have a single N-terminal transmembrane helix that is inserted in the membrane by the Tat pathway (Bachmann et al., 2006; De Buck et al., 2007; Goosens, Monteferrante, & Dijl, 2014). Most Actinobacterial Rieske proteins have three transmembrane helices, but some have five (Keller et al., 2012; Tooke et al., 2017). In each case, the final transmembrane helix is integrated by Tat. (b) Biogenesis pathway for polytopic Rieske proteins through Sec, to enable initial transmembrane helix insertion (adapted from Keller et al., 2012; Tooke et al., 2017) and subsequently, (c) through Tat to allow insertion of the final transmembrane helix and transport of the soluble domain.

| FUNCTIONAL OVERLAP BETWEEN Tat AND Sec TARGETING SEQUENCES
In prokaryotes and plastids, the Tat pathway always coexists with Sec. It is imperative that Sec and Tat substrate proteins are sorted to the correct transport pathway. The Sec system cannot tolerate folded proteins, which can lead to lethal jamming of the machinery (Cosma, Danese, Carlson, Silhavy, & Snyder, 1995; van Stelten, Silva, Belin, & Silhavy, 2009), whereas the Tat system is unable to transport most unfolded proteins (e.g., DeLisa, Tullman, & Georgiou, 2003; Halbig, Wiegert, Blaudeck, Freudl, & Sprenger, 1999; Santini et al., 1998). Correct targeting is par- However, the final transmembrane helix resembles a typical Tat signal sequence, in accord with a requirement for extracellular iron-sulphur proteins to acquire their cofactors in the cytoplasm (Berks, 1996; Berks, Sargent, & Palmer, 2000; Keller et al., 2012).
Detailed mechanistic analysis has indicated that it is a combination of low relative hydrophobicity and the presence of numerous positive charges at the C-terminal side of the helix that renders the Sec system unable to fully translocate the C-terminus of this domain across the membrane. As a result, the Sec apparatus releases the transmembrane domain, which most probably forms a re-entrant loop in the membrane (Tooke et al., 2017; Figure 2b). Following cofactor insertion into the Rieske domain, the membrane-tethered Tat signal sequence is recognised by the Tat system to complete localisation and assembly of the protein (Figure 2b).
The finding that the Tat signal peptides of such dual-targeted proteins initially engage with the Sec apparatus has significance for the targeting mechanism of soluble Tat substrates. Analysis of the known and predicted Tat substrates in E. coli and Salmonella shows that the overwhelming majority have the basic c-region or mature domain N-terminus required to avoid Sec transport. The inference is that such signal peptides often initially engage with the Sec pathway, abort at a late stage when the C-terminal positive charges are recognised, and are then inserted in the membrane as re-entrant loops. This would mean that the Tat system frequently recognises membrane-associated signal peptides (Bageshwar, Whitaker, Liang, & Musser, 2009; Ma & Cline, 2000; Musser & Theg, 2000; Shanmugham, Wong Fong Sang, Bollen, & Lill, 2006).
Despite the overwhelming conservation of the twin-arginines in Tat signal peptides, recent studies demonstrate that they are not mechanistically essential for operation of the Tat pathway.
Specifically, inactivating substitutions in either the paired arginines or their binding site in the Tat translocon can be overcome by increasing the hydrophobicity of the signal peptide h-region (Ulfig et al., 2017). Some of these hydrophobic suppressors are able to direct significant levels of export, approaching 30% of wild-type transport activity, and it was noted that even signal peptides with a marked increase in hydrophobicity (approaching that of signals that target the SRP pathway) could productively engage with Tat (Ulfig et al., 2017).
Collectively these results indicate that the functional requirements for Tat signal peptides are remarkably similar to Sec, that is, one or more positive charge in the signal peptide n-region coupled with a relatively hydrophobic h-region. Indeed, it has been shown that two canonical Sec signal peptides, OmpA and DsbA ( Figure 1b)   . However, in vivo, it is unlikely that such substrates would ever reach the Tat machinery because their signal peptides would interact with either SRP or SecA and be channelled into the Sec pathway. This places extraordinary constraints on Tat signal sequences which must evolve to escape recognition by these targeting factors. Indeed, it is likely that the twin-arginine motif and its cognate recognition site on the translocon, arose to increase the affinity of the Tat system for the weakly hydrophobic signal peptides.
By the same token, although signal peptides with paired arginines are compatible with the Sec pathway, only 0.02% of E. coli Sec signals contain this feature, whereas paired lysines are much more common. This implies that there may also be evolutionary constraints acting on Sec targeting sequences.
Interestingly, while some Tat substrate proteins are clearly incompatible with the Sec pathway because they must be folded in the cytoplasm, some protein families are compatible with either export route. A good example of this is the cell wall amidase family, which in E. coli has three members: AmiA, AmiB and AmiC. While AmiA and AmiC are Tat substrates, AmiB (which has 40% sequence identity to AmiC) is a Sec substrate (Bernhardt & de Boer, 2003; Ize, Stanley, Buchanan, & Palmer, 2003). Similarly, many solute binding proteins that would normally be expected to utilise the Sec pathway are Tat substrates in Streptomyces bacteria (Joshi et al., 2010; Widdick et al., 2006). Intriguingly, the Tat machinery is localised to the tips of growing hyphae in Streptomyces coelicolor, so it is plausible that the Tat-dependent export of these proteins may reflect a requirement for them to be secreted at the region of active growth (Willemse et al., 2012).

| Tat SIGNAL PEPTIDES TRIGGER ASSEMBLY OF THE ACTIVE Tat TRANSLOCON
In the resting state the Tat receptor complex comprises multiple (probably three or four) copies of TatA, TatB and TatC in a 1:1:1 ratio (Alcock et al., 2016; Bolhuis, Mathers, Thomas, Barrett, & Robinson, 2001; Habersetzer et al., 2017). Crosslinking studies alongside sequence co-evolution analysis and molecular sim-

(Ramasamy et al., 2013; Rodriguez et al., 2013; Rollauer et al., 2012; Zhang et al., 2014). TatC has six transmembrane helices with the N- and C-terminus located at the cytoplasmic side of the membrane. Transmembrane helices are numbered.

Contacts between the opposite face of the TatB transmembrane helix and the first transmembrane helix of an adjacent TatC facilitate oligomerisation of the complex (Alcock et al., 2016; Blümmel et al., 2015). While a complex of TatB and TatC is stable to purification and retains the ability to interact with Tat signal peptides (Bolhuis et al., 2001; de Leeuw et al., 2002; Tarry et al., 2009), in vivo TatA is also associated with the complex in the resting state (Alcock et al., 2016; Aldridge, Ma, Gérard, & Cline, 2014; Habersetzer et al., 2017; Zoufaly et al., 2012).

Tat transport is initiated by the interaction of a Tat signal peptide with the receptor, with the twin-arginine motif recognised by a conserved surface patch on the cytoplasmic face of TatC (Alami et al., 2003; Rollauer et al., 2012). Following initial binding, the signal peptide transitions to bind more deeply within the receptor complex (Alami et al., 2003; Blümmel et al., 2015; Gérard & Cline, 2007; Hamsanathan et al., 2017; Figure 4). This promotes a reorganisation of the receptor complex in which TatB is displaced from its resting state binding site on TatC, allowing this site to be occupied by TatA (Alcock et al., 2016; Habersetzer et al., 2017). In this activated state of the receptor complex, we expect that TatB now occupies the helix 6 binding site. Crosslinking experiments also suggest that TatC molecules adopt a tail-to-tail orientation following

The precise order of these events is unclear. However, based on current evidence, it is likely that receptor re-organisation is triggered by interaction of the signal peptide h-region with TatB, consistent with the extensive contacts TatB makes with this region of the signal peptide (Alami et al., 2003; Gérard & Cline, 2006; Panahandeh, Maurer, Moser, Delisa, & Müller, 2008). This mechanistic model is supported by genetic suppressor analysis, where a group of suppressor substitutions was identified in the transmembrane helix of TatB that restored Tat transport activity to signal peptides with inactivating substitutions of the twin-arginine motif and to TatC variants that had inactivating substitutions in the twin-arginine recognition site. Biochemical analysis of these suppressors revealed signal peptide-independent structural reorganisation of the receptor complex and, for the strongest suppressor, TatB F13Y, constitutive TatB vacation of the TM5 site and occupancy of the TM6 site (Tooke and Palmer, unpublished).
The structure of the signal peptide-activated form of the receptor complex is not known. However, it has been shown that covalently attaching the twin-arginine motif to its binding site on TatC does not inhibit translocation of a substrate across the membrane, indicating that the twin-arginine residues remain at the cytoplasmic face of the membrane (Gérard & Cline, 2006). Extensive crosslinks have been detected throughout the signal peptide h-region with TatB (Alami et al., 2003; Gérard & Cline, 2006; Panahandeh et al., 2008) and a site-specific crosslink observed between a cysteine residue in the C-terminal end of the h-region and a cysteine in TatC TM5 (Aldridge et al., 2014). Applying these constraints to modelling the TatC

Step 2. The signal peptide transitions to bind more deeply into the receptor, inserting in a hairpin conformation. The deep insertion of the signal peptide displaces TatB from its resting state binding site on TatC to occupy the TatA binding site at TMH6. A TatA molecule is now recruited to the binding site vacated by TatB.
Step 3. The positioning of TatA at the TM5 binding site allows the further recruitment and nucleation of TatA molecules to form a large oligomer.
Step 4. The signal peptide hairpin unhinges and the substrate passes across the membrane facilitated by the TatA oligomer.
Step 5. The signal peptide is cleaved and the mature domain is released at the periplasmic side of the membrane. Following substrate translocation, the TatA oligomer dissociates and the Tat receptor returns to the resting state.

TatC variants that have dominant negative activity (i.e., that inhibit Tat transport activity in the presence of a wild type copy of TatC) strongly supports the inference that hetero-oligomerisation of the receptor complex is a functional requirement (Cléon et al., 2015).
A recent study has confirmed that Tat signal peptides bind to the receptor complex in a hairpin conformation. Fluorescence quenching experiments place the C-terminal end of the signal peptide h-region at the tip of the hairpin, directly preceding the helix-destabilising residue (Figure 1b; Hamsanathan et al., 2017). The second arm of the hairpin would then be formed from the signal peptide c-region and potentially residues at the N-terminus of the mature domain (Figure 5). Indeed, crosslinking of both of these regions to TatB has been detected (Gérard & Cline, 2006; Hamsanathan et al., 2017). A role for the early mature domain of the substrate protein in receptor binding is supported by the isolation of suppressor substitutions in this region that can compensate for inactivating twin-arginine substitutions (Ulfig & Freudl, 2018). Collectively, these results point to a model where the signal peptide may make contact with two separate TatB molecules; Figure 2b).

The mechanism of TatA oligomer assembly is not understood. However, it has been speculated that the concave face of TatC may form a platform to support multimerisation (Rollauer et al., 2012). Intriguingly, activation of the receptor complex results in TatA occupancy at the TatC TM5 binding site, which lies at the edge of this face and could potentially act as a nucleation point for TatA polymerisation (Alcock et al., 2016; Habersetzer et al., 2017). Some support for this mechanistic model comes from the work of Aldridge et al., who observed recruitment of Tha4 (the thylakoid orthologue of TatA) to a site at the concave face of TatC under protein transport conditions (Aldridge et al., 2014). At present, it is not clear whether TatA forms an oligomer of fixed size or a series of size-variable assemblies (Beck et al., 2013; Dabney-Smith et al., 2006; Dabney-Smith & Cline, 2009; Gohlke et al., 2005; Leake et al., 2008; Oates et al., 2005; Richter & Brüser, 2005). Further studies are required to understand the formation and arrangement of the TatA oligomer.
Evidence suggests that unhinging of the signal peptide hairpin may be a critical step in substrate translocation and deliberate locking of the hairpin by internal crosslinking inhibits transport (Hamsanathan et al., 2017 2018) and will not be described in detail here. However, according to current models, the assembled TatA oligomer forms the substrate translocation pathway either through formation of a (size-variable) channel or by promoting localised membrane weakening and transient bilayer disruption (Brüser & Sanders, 2003; Gohlke et al., 2005; Leake et al., 2008; Rodriguez et al., 2013). Following passage of substrate across the membrane, the signal peptide is cleaved (Figure 4) and the TatA oligomeric pore dissociates as the assembled translocation system rearranges to the resting state. Currently, almost nothing is known about the mechanism of Tat translocon disassembly or whether it is an obligate step for each round of substrate transport.
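The transport cycle described in this section can be summarised as an ordered sequence of states. The sketch below is a bookkeeping aid only: the state names are labels invented here, the ordering follows the model outlined in the text, and the assumption that the translocon disassembles after every round is made for simplicity even though, as noted above, it is not known whether disassembly is obligate.

```python
from enum import Enum


class TatState(Enum):
    """Hypothetical labels for the stages of the Tat transport model."""
    RESTING = "resting receptor complex containing TatA, TatB and TatC"
    SIGNAL_BOUND = "twin-arginine motif bound at the cytoplasmic face of TatC"
    RECEPTOR_ACTIVATED = "signal peptide hairpin inserted; TatB displaced, TatA at TM5 site"
    TATA_OLIGOMER = "further TatA molecules recruited to form a large oligomer"
    TRANSLOCATING = "hairpin unhinges and the substrate crosses the membrane"
    RELEASED = "signal peptide cleaved and mature domain released"


# Enum members iterate in definition order, which is the order of the model.
CYCLE = list(TatState)


def next_state(state):
    """Advance one step; RELEASED wraps to RESTING (assumed disassembly)."""
    index = CYCLE.index(state)
    return CYCLE[(index + 1) % len(CYCLE)]


def run_cycle(start=TatState.RESTING):
    """Return the states visited in one full round, ending back at start."""
    visited = [start]
    state = next_state(start)
    while state is not start:
        visited.append(state)
        state = next_state(state)
    visited.append(start)
    return visited


for step in run_cycle():
    print(f"{step.name}: {step.value}")
```

A linear cycle is the simplest reading of the model; it does not capture the possibility that the precise order of receptor reorganisation events differs from the one assumed here.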

| FUTURE PERSPECTIVES
Significant progress towards understanding the mechanism of protein transport by the Tat pathway has been catalysed by the determination of high-resolution structures for TatC and the helical regions of TatA and TatB (Ramasamy et al., 2013; Rodriguez et al., 2013; Rollauer et al., 2012; Zhang et al., 2014). However, we still lack a molecular-level understanding of protein translocation, which ideally requires structural resolution of protein complexes and transport intermediates. The highly dynamic nature of the Tat system makes this particularly challenging, but progress may be facilitated through isolation of mutations that lock the translocon in intermediate states. The identification of a substitution that promotes constitutive translocon assembly (e.g., E. coli TatB F13Y) could offer insight into the nature of the assembled translocon and has the potential to address questions including whether the assembled TatA oligomer is of fixed size and how TatA molecules are scaffolded. Finally, it is unclear how translocon disassembly is initiated and whether this process is related to the mechanism by which the Tat system fails to transport some unfolded proteins (Panahandeh et al., 2008; Richter & Brüser, 2005).

ACKNOWLEDGEMENTS
We thank Professor Ben Berks (University of Oxford) for critical comments on the manuscript. Work in the authors' laboratories is