The Tat protein export pathway

Authors

  • Ben C. Berks,

    1. Centre for Metalloprotein Spectroscopy and Biology, School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, UK.
  • Frank Sargent,

    1. Centre for Metalloprotein Spectroscopy and Biology, School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, UK.
  • Tracy Palmer

    1. Centre for Metalloprotein Spectroscopy and Biology, School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, UK.
    2. Department of Molecular Microbiology, John Innes Centre, Colney, Norwich NR4 7UH, UK.

Ben C. Berks. E-mail b.berks@uea.ac.uk; Tel. (+44) 1603 592186; Fax (+44) 1603 592250. Tracy Palmer. E-mail Tracy.Palmer@bbsrc.ac.uk; Tel. (+44) 1603 456900; Fax (+44) 1603 454970.

Abstract

The Tat (twin-arginine translocation) system is a bacterial protein export pathway with the remarkable ability to transport folded proteins across the cytoplasmic membrane. Preproteins are directed to the Tat pathway by signal peptides that bear a characteristic sequence motif, which includes consecutive arginine residues. Here, we review recent progress on the characterization of the Tat system and critically discuss the structure and operation of this major new bacterial protein export pathway.

Introduction

Proteins that are exported to the bacterial periplasm are usually synthesized with cleavable N-terminal extensions, termed signal peptides. The vast majority of periplasmic proteins are exported by the general secretory (Sec) pathway (Pugsley, 1993; Danese and Silhavy, 1998; Fekkes and Driessen, 1999). The Sec machinery involves a membrane-embedded SecYEG translocation complex, together with the ATP-hydrolysing SecA protein. A major feature of the Sec mechanism is that proteins are translocated in an extended conformation and are often bound by SecB or other cytoplasmic chaperones to prevent folding before export.

It has recently become clear that most bacteria possess a second general protein export pathway that is quite distinct from the Sec apparatus (Berks, 1996; Yates et al., 1997; Santini et al., 1998; Sargent et al., 1998a; Weiner et al., 1998). This Sec-independent pathway has been termed the Tat (for twin-arginine translocation) system (Sargent et al., 1998a), because precursors are targeted to the pathway by signal peptides bearing a characteristic sequence motif that includes two consecutive and invariant arginine residues. An alternative Mtt (membrane targeting and transport) nomenclature has been applied to the pathway by Weiner et al. (1998). The Mtt nomenclature is based on the premise that certain pathway substrates, namely the extrinsic membrane subunits of E. coli dimethyl sulphoxide (DMSO) reductase, of E. coli formate dehydrogenase-O (Benoit et al., 1998) and of the tetrachloroethene dehalogenase of Dehalospirillum multivorans (Neumann et al., 1994; 1998; Miller et al., 1997), are not transported but are instead attached by the ‘Mtt’ apparatus to the cytoplasmic face of partner integral membrane proteins. We have reviewed the ‘Mtt’ viewpoint in detail elsewhere (Berks, 1996; Sargent et al., 1998a) and have nothing further to add at this juncture.

The most remarkable characteristic of the Tat pathway is that it apparently functions to transport folded proteins of variable dimensions across the cytoplasmic membrane, a feat that must be achieved without rendering the membrane freely permeable to protons and other ions. In most cases, the substrates of this pathway are proteins that bind one of a range of cofactors in the cytoplasm and are thus folded before export. Such proteins function predominantly in respiratory and photosynthetic electron transport chains and are vital for many types of bacterial energy metabolism. Some cofactorless proteins may also be transported by the Tat pathway (Berks, 1996; Brüser et al., 1998; N. R. Stanley, F. Sargent, T. Palmer and B. C. Berks, unpublished observations), probably because they either require cytoplasmic factors for folding or fold too rapidly or tightly for transport by the Sec apparatus.

Recent work has shown that the bacterial Tat system is very closely related to the ‘ΔpH-dependent’ protein import pathway of the plant chloroplast thylakoid membrane. The bacterial and thylakoid transporters require similar components (Settles et al., 1997; Mori et al., 1999) and similar targeting signals (Chaddock et al., 1995). Indeed, bacterial Tat signal peptides are able to direct thylakoid import specifically by the ΔpH-dependent pathway with an efficiency that is indistinguishable from authentic plant substrates (Mori and Cline, 1998; Wexler et al., 1998; Halbig et al., 1999a). Consequently, although we consider the bacterial Tat system in depth in this review, we also highlight the most important complementary studies of the chloroplast ΔpH-dependent translocase. The bacterial and plant systems do, however, differ in the treatment of precursors before the transport step because, in contrast to well-characterized ΔpH-dependent pathway precursors, bacterial Tat substrates bind cofactors. Thus transport in bacteria has to be co-ordinated with cofactor insertion. More details of the plant ΔpH-dependent transport system can be found in the recent reviews by Robinson et al. (1998), Settles and Martienssen (1998), Dalbey and Robinson (1999) and Keegstra and Cline (1999), with the last three references also giving some consideration to the bacterial Tat system.

Here, we review recent advances in our understanding of the Tat protein transport system and highlight those issues that need to be addressed in determining its structure and operation.

Tat targeting signals

Sec pathway signal peptides in Gram-negative bacteria are on average 24 amino acids in length and comprise three distinct regions: an N-terminal positively charged region (n-region), a hydrophobic α-helical region (h-region) and a c-region that contains the site of cleavage by signal peptidase (von Heijne, 1985; Izard and Kendall, 1994; Cristóbal et al., 1999). Tat pathway signal peptides have a similar tripartite organization to Sec signal peptides but exhibit a number of distinctive features, the most notable of which is a conserved (S/T)-R-R-x-F-L-K sequence motif at the n-region/h-region boundary, in which the consecutive arginine residues are invariant and the other motif residues occur at a frequency of more than 50% (Berks, 1996). It has been suggested that the initial residue in the motif acts as an N-cap for an α-helix formed by the h-region (Berks, 1996). Consistent with this proposal, a potentially helix-breaking proline is often present at the end of the h-region. Indeed, recent analysis has shown that bacterial Tat signal peptides, but not Sec signal peptides, are characterized by a high occurrence of proline residues at position −6 to the signal peptidase cleavage site (Cristóbal et al., 1999). The c-region of Tat signal peptides also characteristically contains basic amino acids (Brüser et al., 1998; Wexler et al., 1998; Cristóbal et al., 1999), whereas Sec pathway precursors show a bias against positively charged residues in the vicinity of the signal peptidase cleavage site (von Heijne, 1986). Bacterial Tat signal peptides are on average 14 amino acids longer than Sec signal peptides, with most of this additional length being caused by an extended n-region (Cristóbal et al., 1999).
Further, the h-region of Tat signal peptides is significantly less hydrophobic than that of Sec signal peptides due, in the main, to a higher occurrence of the amino acids glycine and threonine and a significantly lower abundance of leucine residues (Cristóbal et al., 1999).
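
As an illustration of how the consensus described above might be searched for in a sequence, the following minimal Python sketch locates every pair of consecutive arginines and counts how many of the four flanking consensus positions, (S/T) before the pair and F, L and K after the variable position, are also matched. The function name, the scoring scheme and the test sequences are our own assumptions for illustration; the sequences are invented and are not real signal peptides, and a match does not by itself establish Tat targeting.

```python
import re

# Flanking consensus positions of (S/T)-R-R-x-F-L-K, given as offsets
# relative to the first arginine (offset 0). Only the RR pair is invariant.
FLANKING_CONSENSUS = {-1: "ST", 3: "F", 4: "L", 5: "K"}


def find_twin_arginine_sites(seq):
    """Return a list of (index_of_first_R, flanking_matches) for each RR pair.

    flanking_matches counts how many of the four non-arginine consensus
    positions are satisfied (0 to 4). This scoring is illustrative only.
    """
    seq = seq.upper()
    hits = []
    # A zero-width lookahead reports overlapping pairs, e.g. both in "RRR".
    for match in re.finditer(r"(?=RR)", seq):
        i = match.start()
        score = 0
        for offset, allowed in FLANKING_CONSENSUS.items():
            j = i + offset
            if 0 <= j < len(seq) and seq[j] in allowed:
                score += 1
        hits.append((i, score))
    return hits


# Invented sequences, for illustration only.
print(find_twin_arginine_sites("MKQSTRRDFLKAAGLA"))
print(find_twin_arginine_sites("MKKLLAAGLA"))
```

Reporting a count rather than demanding a full match reflects the statement that the non-arginine motif residues are each found in only a little over half of Tat signal peptides.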

It has been shown that the Tat signal peptide of Escherichia coli trimethylamine N-oxide reductase (TorA) directs exclusively Tat-dependent export of the periplasmic P2 domain of E. coli leader peptidase (Lep), a protein that is normally a substrate of the Sec pathway (Cristóbal et al., 1999), whereas a Sec signal peptide (from PelB) directs the two-subunit Tat substrate E. coli hydrogenase-2 complex (HybOC) to the Sec apparatus (Rodrigue et al., 1999). Thus, in E. coli, the signal peptide alone mediates mutually exclusive sorting of precursor proteins between the Tat and Sec pathways. Which of the structural features of signal peptides allow functional interaction with the Tat or Sec apparatus and which features prevent interaction with the alternative pathway?

Several experimental studies have shown that both arginine residues of the twin-arginine consensus motif are absolutely required to route a protein through the Tat pathway successfully. A complete block on export was observed when the twin-arginine residues of Zymomonas mobilis glucose-fructose oxidoreductase (GFOR) were conservatively substituted by lysine either individually or together (Halbig et al., 1999b). Similarly, conservative substitution of both arginine residues of the signal peptide of the TorA:P2 fusion with a pair of lysines abolished export in E. coli (Cristóbal et al., 1999). These observations are in marked contrast to the behaviour of Sec signal peptides, in which mutations preserving the n-region basic charge have no effect on the transport process (Sasaki et al., 1990). In other studies, export of the nitrous oxide reductase of Pseudomonas stutzeri was shown to be blocked when the first arginine residue in the consensus motif was substituted by the oppositely charged amino acid aspartate (Dreusch et al., 1997), and export of Wolinella succinogenes hydrogenase was prevented when both arginines in the small subunit signal peptide were replaced with glutamines (Gross et al., 1999). At some variance with the absolute dependence of export on the paired arginine residues observed in these studies are early experiments with a Desulfovibrio vulgaris hydrogenase Tat signal peptide/β-lactamase chimera expressed in E. coli. These experiments indicated that substitution of the first consensus arginine with a range of amino acids, including lysine, only partially inhibited export of the fusion protein (Nivière et al., 1992). If the export observed in these experiments was indeed mediated by the Tat apparatus, then these results indicate that, in some signal peptide contexts, both consensus arginine residues may not be obligatory for Tat targeting.

The functional importance of the difference in hydrophobicity between Tat and Sec signal sequence h-regions has been investigated by substantially increasing the hydrophobicity of the h-region of the TorA:P2 fusion (Cristóbal et al., 1999). Export of P2 was rendered Tat independent and sensitive to depletion of the Sec pore protein SecE. Thus, increasing the h-region hydrophobicity redirects the protein through the Sec translocon. It was not established in these experiments whether the mutant fusion protein was targeted to the Sec translocon by the post-translational or SRP-dependent co-translational route. However, when the cell was asked to process a construct containing the wild-type TorA signal peptide and full-length Lep, a protein whose transmembrane helices are known SRP-dependent targeting signals (de Gier et al., 1996), membrane insertion was Sec dependent and Tat independent (Cristóbal et al., 1999). The topological organization of the transmembrane helices of Lep imposes an N-terminal–cytoplasmic transmembrane orientation on the TorA signal peptide of the fusion protein, which is then processed, presumably by signal peptidase. Thus, co-translational Sec targeting not only overrides Tat targeting information but can also force an otherwise Sec-incompatible wild-type TorA signal sequence into the Sec translocon.

Export of the TorA:P2 h-region mutant is at least an order of magnitude slower than transport of well-studied authentic Sec substrates, suggesting that hydrophobicity is not the only feature of Tat signal peptides that prevents them acting as efficient Sec targeting signals. One obvious candidate for an additional Sec-repelling feature was suggested by studies of the Tat-analogous thylakoid ΔpH-dependent pathway, for which it has been shown that the signal peptide c-region basic residues, although not required for routing by the ΔpH-dependent pathway, block Sec-dependent thylakoid import (Bogsch et al., 1997). When the two c-region arginines of the TorA:P2 h-region mutant were replaced with the neutral polar residues asparagine and glutamine, export became very rapid and remained Sec dependent (Cristóbal et al., 1999), suggesting that the c-region basic residues of bacterial Tat signal peptides also interfere with Sec pathway export. Note that, in this signal peptide double mutant, the twin-arginine consensus motif remains intact, whereas export occurs via the Sec system. Therefore, the twin-arginine consensus has no Sec-avoidance role. The ‘Sec-avoidance’ function of c-region basic residues might be connected to the long-recognized phenomenon that basic residues are poorly tolerated in the vicinity of the signal peptidase cleavage site of Sec pathway precursors (e.g. Geller et al., 1993; and references therein). Notably, as has also been observed for the TorA:P2 system, the export defect caused by such basic residues can be substantially suppressed by increasing the h-region hydrophobicity (MacIntyre et al., 1990). It has been established experimentally that the cytoplasmic membrane potential inhibits insertion of precursors with such basic residues into the Sec translocon (Geller et al., 1993). 
However, if this is a direct electrophoretic effect, it is debatable whether this could be the mechanistic basis of Sec avoidance, as the same phenomenon is seen in thylakoids in which there is a negligible transmembrane potential difference.

In summary, current data suggest that targeting to the bacterial Tat pathway requires the paired arginine residues of the consensus motif together with a more hydrophilic h-region than is found in Sec signal peptides, whereas Tat signal peptides avoid functional interaction with the Sec apparatus by a combination of the relatively low h-region hydrophobicity and possession of c-region basic residues. No studies have so far addressed the role of the non-arginine amino acids of the Tat consensus motif in the transport process even though around 90% of Tat signal peptides have at least two of these additional residues and, to our knowledge, only one potential Tat signal peptide (that of Methylophilus methylotrophus MauM) lacks such residues (Berks, 1996).
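
The two Sec-avoidance features summarized above, h-region hydrophobicity and c-region basic residues, can be put in rough numerical terms. The sketch below is a minimal illustration that assumes the Kyte-Doolittle hydropathy scale (the studies cited may have used a different measure) and assumes that the h-region and c-region boundaries have already been assigned by other means. The function names and the example segments are hypothetical and invented for illustration.

```python
# Kyte-Doolittle hydropathy values. The choice of scale is an assumption
# made for this sketch and is not taken from the studies cited in the text.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average hydropathy per residue over a peptide segment."""
    segment = segment.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def count_basic(segment):
    """Number of lysine and arginine residues in a peptide segment."""
    segment = segment.upper()
    return segment.count("K") + segment.count("R")


# Invented segments: a leucine-rich h-region of the kind typical of Sec
# signal peptides, and a glycine/threonine-rich one of the kind described
# for Tat signal peptides.
sec_like_h = "LLLLALLLLA"
tat_like_h = "GTAGTAGGTA"
print(mean_hydropathy(sec_like_h), mean_hydropathy(tat_like_h))

# Invented c-region containing two arginines.
print(count_basic("TPRRATA"))
```

Any threshold separating 'Tat-like' from 'Sec-like' values would have to be calibrated against experimentally characterized signal peptides; none is proposed here.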

What is the nature of the interaction between Tat signal peptides and the membranous Tat translocase? It has been suggested (Fincher et al., 1998) that, by analogy with protein transport by the Sec pathway and across the endoplasmic reticulum (ER), the Tat system may operate via a loop mechanism in which both the N-terminus of the signal peptide and the bulk of the passenger protein remain at the cytoplasmic side of the membrane after the initial translocon–precursor interaction (Kuhn et al., 1994). In the classic Sec loop, the tip of the loop containing the signal peptidase cleavage site is exposed at the periplasmic face of the membrane with the two arms of the loop being formed by the inverted signal peptide and approximately the first 20 amino acids of the mature protein. The mature protein arm is subsequently segmentally extruded by SecA insertion cycles. Is an analogous model viable for Tat transport? Weak evidence is available that the N-terminus of the Tat signal peptide does indeed remain at the cytoplasmic side of the membrane. The Rieske iron–sulphur protein subunits of bacterial cytochrome bc1 and b6f complexes appear to have uncleaved Tat signal peptides that also act as transmembrane anchor helices, with the twin-arginine end of the helix motif located at the cytoplasmic side of the membrane (van Doren et al., 1993). This would be consistent with a cytoplasmic location for the signal peptide N-terminus during transport, provided that the final orientation of the Rieske protein helix is actually determined by the Tat translocase. In addition, Fincher et al. (1998) have shown that a thylakoid ΔpH-dependent pathway substrate can still be imported into isolated thylakoids when a non-transported protein is fused to the precursor N-terminus, demonstrating that the N-terminus of the precursor must remain on the stromal (equivalent to cytoplasmic in bacteria) side of the membrane. 
This experiment is, however, open to the criticism that the fusion used contains a linker region (actually the stromal targeting domain of the precursor) between the thylakoid signal peptide and the N-terminally fused protein that is sufficiently long to span the membrane and thus allow the signal peptide to be inserted in either orientation.

Assuming, nevertheless, that the N-terminus of the Tat signal peptide remains at the cytoplasmic side of the bacterial cell membrane, a classical loop model in which the mature polypeptide N-terminus forms the second arm of the loop cannot apply to Tat transport if the N-terminus of the mature protein is folded before and during transport. Inspection of the crystal structures of Tat substrates shows that, in general, the mature protein lacks a substantial (> 10 amino acids) length of disordered polypeptide at the N-terminus that could be used in a precursor loop structure. It remains possible that the mature protein N-terminus does not attain its final conformation until after signal peptide cleavage. Indeed, for the high-potential iron–sulphur protein (HiPIP) from Chromatium vinosum, in vitro studies suggest that the N-terminal region of the protein can be unfolded while the C-terminus remains co-ordinated to the iron–sulphur cluster cofactor (Bentrop et al., 1999). If the mature protein is fully folded at the translocon binding step, the signal peptide might pivot around the consensus motif binding site during substrate transport. Alternatively, at the start of transport, when the mature domain is at the cytoplasmic side of the membrane, the signal peptide alone could form a loop that would flex to a fully extended conformation across the membrane by the end of the translocation process.

Folding and cofactors

A fundamental feature of the Tat system is that it is capable of transporting folded proteins. This has been demonstrated directly for the Tat-analogous thylakoid ΔpH-dependent pathway, which was shown to translocate a chimeric substrate comprising a dihydrofolate reductase (DHFR) domain fused to the C-terminus of a ΔpH-dependent pathway precursor, even in the presence of methotrexate, which binds with high affinity to the DHFR domain causing tight folding (Hynds et al., 1998). Similar experiments have also been reported with a ΔpH-dependent pathway precursor/bovine pancreatic trypsin inhibitor (BPTI) chimera, in which the fusion protein was successfully imported into thylakoids by the ΔpH-dependent pathway even when the, admittedly rather small, 6.5 kDa BPTI domain was permanently folded using irreversible disulphide cross-linkers (Clark and Theg, 1997).

In the absence of an in vitro assay for the bacterial Tat system, a similar direct demonstration that the pathway can translocate folded proteins has not yet been possible. There is, however, a very considerable body of indirect evidence from in vivo studies supporting the idea that transportation of cofactor-containing folded proteins is a fundamental feature of the bacterial Tat apparatus. An emerging feature of the biogenesis of many Tat substrates is that cofactor insertion is assisted by dedicated cytoplasmic assembly factors that recognize and bind the precursor. For example, E. coli hydrogenase-2 requires the assembly factors HybG (Menon et al., 1994), a homologue of the HypC protein that binds the apoform of the catalytic subunit of cytoplasmic hydrogenase-3 (Drapal and Böck, 1998), and HybD (Fritsche et al., 1999), the dedicated protease that undertakes C-terminal processing of the catalytic subunit after nickel cofactor insertion. Such specific protein–protein interactions imply that the cofactorless precursor protein already possesses substantial tertiary structure in the cytoplasm. For many of the precursor proteins that accumulate in the cytoplasm of E. coli Tat mutants, it has been possible to show that they contain cofactor (Bogsch et al., 1998; Sargent et al., 1998a, 1999; Weiner et al., 1998), demonstrating that cofactor insertion before export is feasible. Similarly, removal or mutation of the Tat signal peptide stops export but can still allow cofactor acquisition (Santini et al., 1998; Gross et al., 1999; Hilton et al., 1999; Rodrigue et al., 1999). If cofactor insertion is blocked, Tat precursors accumulate in the cytoplasm, not the periplasm, suggesting that cofactor binding is a prerequisite for export (Berks, 1996; Bernhard et al., 1996; Santini et al., 1998) and, indeed in one case, it has been shown that the cytoplasmically located apoprotein is exported when cofactor insertion is restored (Santini et al., 1998). 
In an elegant variation on this line of experimentation, Halbig et al. (1999b) have shown that point mutations of Z. mobilis GFOR, which reduce the affinity of the protein for its tightly bound NADP cofactor, drastically slow the rate of protein export, suggesting that cofactor binding is a prerequisite for the export of this Tat substrate. A further consideration is the existence of heterodimeric periplasmic proteins for which export depends on a Tat signal peptide located on just one of the subunits, the presence of both partner subunits and cofactor insertion into each subunit (reviewed by Berks, 1996; and for a recent study, see Rodrigue et al., 1999). These requirements demand subunit–subunit recognition and hence protein folding. Indeed, the jamming of the Sec apparatus observed when one such protein complex was retargeted to the Sec pathway in signal peptide swapping experiments has been attributed to folding of the precursor (Rodrigue et al., 1999). A single signal peptide may even be capable of targeting a three-subunit enzyme through the Tat pathway, because the E. coli yagTSR gene products, which are homologous to the three subunits of the crystallographically defined carbon monoxide dehydrogenase of Oligotropha carboxidovorans (Dobbek et al., 1999), have a single Tat signal peptide associated with the YagT precursor. A further remarkable example of co-translocation of multiple proteins by the same twin-arginine-containing signal sequence can be found in Gram-positive Streptomycetes, in which secretion of the apoprotein of the copper-dependent enzyme tyrosinase (MelC2) is mediated by the signal peptide-bearing chaperone MelC1 (Lee et al., 1988; Chen et al., 1992; Leu et al., 1992), with post-translocation release of MelC1 requiring activation of the tyrosinase by copper ion insertion (Chen et al., 1992; Tsai and Lee, 1998).

Given that the Tat system is capable of translocating folded proteins, does the transporter mechanistically require that the substrate be folded? The only data bearing on this problem come from the plant thylakoid system, in which it has been shown that a chimeric protein in which DHFR was fused to the C-terminus of the thylakoid ΔpH-dependent substrate pre23K is efficiently imported into thylakoids by the Sec-independent route, even if the DHFR portion of the chimeric protein is truncated (Hynds et al., 1998). Such a deletion will affect proper folding of the DHFR domain, suggesting that the ΔpH-pathway transporter is mechanistically capable of transporting at least partially misfolded proteins. In contrast, a separate study found that even small truncations of pre23K itself severely retard thylakoid import (Roffey and Theg, 1996), raising the possibility that the intact N-terminal 23K domain of the pre23K-DHFR′ chimera was triggering the transport process and allowing import of the misfolded C-terminal DHFR domain. However, incorporation of amino acid analogues, which were demonstrated to destabilize 23K, into either pre23K-DHFR or pre23K did not significantly affect thylakoid import (Hynds et al., 1998). Thus, although additional experiments in this area are needed, the preliminary conclusion is that the Tat transporter mechanism does not require a correctly folded protein as substrate. However, as the physiological function of the Tat system is the transport of folded proteins, it is reasonable to ask whether the system nevertheless contains elements that check the folding state of the substrates and prevent the functional interaction of precursor with the translocon until the protein is properly folded. 
Proofreading would presumably not be a necessity for those Tat substrates that do not bind cofactors, if the reason such proteins are targeted to the Tat pathway is because Sec-dependent export is too slow to prevent the protein forming a Sec-incompatible structure. However, for the majority of bacterial Tat substrates that do bind cofactors, it is clearly important that the protein contains the cofactor before export is attempted. A mechanism for checking the cofactor loading of such precursors would be logical. Similar considerations would apply to proteins without cofactor that are targeted to the Tat system, because they require cytosolic folding factors. How could such ‘proofreading’ be undertaken?

A general mechanism would have to monitor the folding state of the substrate. Various possibilities exist. An unfolded protein exposes large hydrophobic regions of polypeptide, and this might be sensed either directly by the Tat system or indirectly because the protein was bound to general folding catalysts or chaperones. In this context, it is interesting to note that, in E. coli, the GroE chaperone system is required for the maturation of the hydrogenase-1 but not the hydrogenase-2 isoenzyme, both enzymes being substrates for the Tat pathway (Rodrigue et al., 1996). Alternatively, the twin-arginine signal peptide may be sufficiently hydrophobic that it is non-specifically sequestered by the hydrophobic regions of the mature protein during folding. Upon attaining the final three-dimensional structure, the signal peptide would be expelled from the protein interior and would then be available to interact with the Tat translocon. It is known that Sec signal peptides can interfere with the folding of their passenger proteins in vitro (Park et al., 1988), but the physiological significance of these observations is debatable.

Although it is possible that the Tat system monitors the export readiness of potential substrates by assessing their folding state, it is more likely that the signal mechanism is linked to the presence of cofactor in the mature portion of the protein. In such a mechanism, the mature protein would be envisaged to fold partially before cofactor insertion and sequester the signal peptide by specific protein–protein interactions. Upon cofactor acquisition, a conformational change in the mature protein would release the signal peptide from its binding site to allow the cofactor-containing precursor to interact with the translocon. Alternatively, the signal peptide might be bound by an accessory protein that attaches itself to the apo form of the precursor and both disengages from the precursor and releases the signal peptide after cofactor insertion (essentially the ‘shelter model’ of Santini et al., 1998). It is possible that such accessory proteins could be the assembly factors (‘chaperones’) involved in cofactor insertion. Note, however, that signal peptides are not necessarily required for cofactor insertion into Tat substrates (Santini et al., 1998; Hilton et al., 1999; Rodrigue et al., 1999) and, therefore, signal peptides are not the major binding determinant for assembly factors. In all the above models, the signal peptide is bifunctional, acting both to direct export and to signal the cofactor status of the passenger protein. Thus, in addition to the common features of twin-arginine signal peptides required for Tat targeting, signal peptides for cofactor-containing proteins should have distinct structural features allowing specific protein–protein interactions with the mature protein and/or its assembly factors. 
Consistent with this proposal, signal peptides for similar cofactor-containing proteins from different bacteria exhibit marked sequence conservation in addition to the twin-arginine motif (http://www.blackwell-science.com/products/journals/contents/berks.htm), suggesting that the mature protein exerts evolutionary pressure on the signal peptide sequence. Particularly striking examples are the signal peptides of [Ni-Fe] hydrogenase small subunits (Fig. 1; see also Voordouw, 1992) and the enormously long (57 amino acids) signal peptides of methylamine dehydrogenase small subunits (Berks, 1996).

Figure 1. Alignment of periplasmically located [Ni-Fe] hydrogenase small subunit Tat signal peptides. Residues that are found in 50% or more of the sequences are blocked in black, with conservative substitutions of these residues blocked in grey. The consensus twin-arginine motif that is common to all types of bacterial Tat signal peptides is indicated at the foot of the alignment. The sequences marked * were produced by the Salmonella typhi Sequencing Group at the Sanger Centre and can be obtained from http://www.sanger.ac.uk/Projects/S_typhi/.

Those heterodimeric proteins that are targeted to the Tat apparatus by a signal peptide on just one subunit face an additional ‘proofreading’ problem. How does the subunit with the signal peptide determine that the partner subunit is present and so delay its own export accordingly? In the [Ni-Fe] hydrogenases, the signal peptide is located on the iron–sulphur cluster-containing small subunit. Export does not occur until both the small subunit binds the large subunit and maturation of the large subunit [Ni-Fe] cofactor is complete (Bernhard et al., 1996; Maier and Böck, 1996; Rodrigue et al., 1999). In turn, maturation of the large subunit requires the small subunit, although conflicting experimental data make it uncertain whether this necessitates the small subunit having a functional signal peptide (Bernhard et al., 1996; Gross et al., 1999; Rodrigue et al., 1999). A possible scenario that would explain these observations starts with the signal peptide of the isolated small subunit being sequestered at what will become the intersubunit interface of the mature enzyme. Initially, an immature form of the large subunit binds to the small subunit without displacing the signal peptide. This interaction allows completion of the processing of the large subunit, which, in adopting its final conformation, extrudes the signal peptide for interaction with the Tat apparatus. In support of this model, we note that the processed N-terminus of the small subunit of the periplasmic [Ni-Fe] hydrogenase of Desulfovibrio gigas is positioned at the intersubunit boundary (Volbeda et al., 1995), suggesting that it would be physically reasonable for the signal peptide to associate with the interface. The strategy adopted to sequester the signal peptide until heterodimer formation is achieved may differ between metalloprotein types. 
For example, in the distinct periplasmic [Fe] hydrogenases, the signal peptide-bearing small subunit does not constitute a domain structure, but instead comprises a string of α-helices that form a belt around the cofactor-containing large subunit (Nicolet et al., 1999). Clearly, the small subunit can only form this structure in the presence of the large subunit. Thus, if the isolated small subunit were to shelter the signal peptide in this enzyme, a very large conformational change would have to occur on binding to the large subunit. The large subunit undergoes a C-terminal processing event (Hatchikian et al., 1999), which may be linked to proofreading, because similar processing is not seen on homologous cytoplasmic [Fe] hydrogenases (Peters et al., 1998). Intriguingly, the proteolytically processed ends of the two periplasmic [Fe] hydrogenase subunits are adjacent in the mature enzyme (Nicolet et al., 1999), perhaps indicating a functional interaction between the released peptides.

Control of protein assembly before export will also be important for Tat substrates in which the mature enzyme is a homo-oligomer, with each subunit bearing its own Tat targeting signal. Examples include the homodimeric copper enzyme nitrous oxide reductase (Coyle et al., 1985), the homotrimeric enzyme copper nitrite reductase (Godden et al., 1991) and the homotetrameric enzyme GFOR (Kingston et al., 1996). It is likely that each subunit in these enzymes is exported independently, because the signal peptide of each subunit is correctly processed even though signal peptidase only acts on sites correctly positioned in the bilayer headgroups (Nilsson et al., 1994; Paetzel et al., 1998) and would not be expected to trim extra signal peptides that do not mediate the export process. Mechanisms must therefore exist by which oligomerization of these proteins in the cytoplasm is avoided.

In an earlier study, we proposed that, for periplasmic proteins containing cofactors, there exists an absolute correlation between the type of cofactor bound and whether the protein was exported by a Tat or a Sec signal sequence (Berks, 1996). In one respect, this division was somewhat unsatisfactory, because proteins composed of similar β-barrel domains and binding copper atoms as their only cofactors were divided into Tat- and Sec-targeted groups solely on the basis of the number of copper atoms present. The artificial nature of this division was emphasized when the availability of additional sequence data suggested that the enzyme nitrite reductase, which binds two copper atoms per subunit, is transported in some organisms by the Sec pathway and in other organisms by the Tat system (Prudêncio et al., 1999). Furthermore, it has become clear that, even for Tat pathway proteins, copper insertion occurs subsequent to transport (Dreusch et al., 1997; and the Streptomyces tyrosinase studies described above), and thus there is no obligate requirement for the precursor to fold in the cytoplasm to enable cofactor binding. Possibly, certain β-barrel-based copper proteins are exported via the Tat pathway not in order to allow cytoplasmic cofactor insertion, but because the cell has increasing difficulty in either preventing or reversing rapid folding of the stable β-barrel structures as the number of domains (and hence the number of copper atoms bound) increases.

A second exception to the cofactor-pathway correlation is the periplasmic flavocytochrome fumarate reductase of Shewanella sp. (Pealing et al., 1992). In this enzyme, the presence of Sec-associated (Thöny-Meyer and Künzler, 1997) c-type haem cofactors on the same polypeptide chain that binds the typical Tat cofactor FAD results, as judged by signal peptide type, in the protein being targeted through the Sec rather than the Tat pathway. Again, folding state is probably the key to understanding the routing of this enzyme. Covalent attachment of haem to c-type cytochromes relies on a group of specialized assembly proteins that are localized to the periplasmic side of the cell membrane and apparently recognize and attach haem to a specific amino acid motif in the unfolded protein (Kranz et al., 1998). As synthesis of c-type cytochromes requires unfolded periplasmic protein as substrate, and as it is certainly feasible for FAD to be transported across biological membranes (e.g. Tzagoloff et al., 1996), we suggest that the mechanism of haem insertion has to take precedence over the requirements of flavin insertion in this protein. This apparent biosynthetic anomaly is all the more remarkable because, in a highly homologous enzyme from Wolinella succinogenes (Simon et al., 1998), the c-type cytochrome and FAD-binding regions are separate polypeptides with, in accordance with the cofactor–signal sequence correlation, the c-type cytochrome having a Sec signal peptide and the flavoprotein bearing a twin-arginine signal peptide.

Tat system components

Studies in E. coli have identified four genes coding for integral membrane proteins involved in the Tat export pathway (Bogsch et al., 1998; Sargent et al., 1998a; Weiner et al., 1998). The tatA (also termed mttA1), tatB (mttA2) and tatC (mttB) genes are the first three cistrons of the tatABCD operon (Weiner et al., 1998) and are located at 86 min on the E. coli chromosome, whereas the fourth gene, tatE, forms an independent transcriptional unit positioned at chromosomal minute 14 (Fig. 2). Note that, in the initial report on the tatABCD operon (Weiner et al., 1998), tatA and tatB were erroneously presented as a single open reading frame (Sargent et al., 1998a). TatA, TatB and TatE are sequence-related proteins and are homologous to the Hcf106 component of the maize ΔpH-dependent thylakoid import pathway (Settles et al., 1997; Chanal et al., 1998; Sargent et al., 1998a; Weiner et al., 1998). TatA and TatE are almost 60% identical in sequence, whereas the TatB sequence is more divergent, sharing ≈ 25% amino acid identity with the other two proteins. Mutagenesis has shown that the tatB and tatC genes are essential Tat pathway components (Bogsch et al., 1998; Weiner et al., 1998; Sargent et al., 1999), but that tatA and tatE mutations only fully block export by the Tat system when combined in a tatA tatE double mutant strain (Sargent et al., 1998a). Thus, although a TatA/E-like protein is required for Tat pathway export, TatA and TatE are sufficiently similar in structure that each can, at least partially, functionally substitute for the other (Sargent et al., 1998a).

Figure 2.

The organization of tat genes, where present, in bacteria of known genome sequence. Escherichia coli and Haemophilus influenzae are γ-Proteobacteria, Helicobacter pylori is an ε-Proteobacterium, Rickettsia prowazekii is an α-Proteobacterium, Bacillus subtilis and Mycobacterium tuberculosis are low and high G + C Gram-positive bacteria, respectively, Synechocystis is a Cyanobacterium, Aquifex aeolicus is a member of the Aquificales, and Archaeoglobus fulgidus is an Archaeon (Euryarchaeote). Genes encoding structurally related proteins in different organisms have the same fill pattern: tatA/E, black; tatB, white; tatA/B/E family but not further classifiable, stippled; tatC, hatched; tatD, waves. Slanted vertical lines represent the divisions between groups of linked tat genes. The product of the A. fulgidus gene AF1344 is so divergent in sequence from other TatC proteins that there must be some doubt as to whether it has the standard TatC function.

TatA/B/E and TatC proteins are coded by all complete genomes of prokaryotes possessing proteins with twin-arginine signal peptides (Fig. 2), but are absent from the genomes of those organisms (primarily intracellular parasites, methanogens and those bacteria with a solely fermentative metabolism) that do not export proteins of this type. Where a Tat system is present, the genome generally encodes one TatC and two TatA/B/E homologues (Fig. 2), although a few organisms have one additional copy of TatA/B/E or TatC and, exceptionally, Rickettsia prowazekii has only a single TatA/B/E-like protein, possibly as a consequence of ongoing genome reduction (Fig. 2). Recent data suggest that the plant thylakoid ΔpH-dependent transport system also requires at least two TatA/B/E-type proteins (Mori et al., 1999).

Why does an organism have two TatA/B/E-like proteins and what is the function of additional copies of these proteins or the TatC protein? One obvious possibility is that the multiple homologous Tat proteins interact with specific subsets of Tat precursors. Indeed, in P. stutzeri, the gene cluster encoding the Tat pathway enzyme nitrous oxide reductase also codes for a TatA/B/E homologue (orf57; Glockner and Zumft, 1996), raising the possibility that this protein has a dedicated function in the export of nitrous oxide reductase. This type of interpretation is not, however, supported by current data from E. coli, in which no difference in substrate specificity between TatA/E and TatB has been established in tatAE and tatB mutants (Sargent et al., 1998a; 1999; Weiner et al., 1998). An early report that tatB was not required for the export of hydrogenase-2 (Chanal et al., 1998) has subsequently been found to be incorrect (Sargent et al., 1999). At this juncture, we think it most probable that the Tat transporter requires structurally related, but non-identical, TatA/E and TatB proteins for mechanistic reasons rather than for the purpose of interacting with different substrates.

The region of sequence similarity between TatA/B/E-like proteins is predicted to form a single N-terminal transmembrane helix followed by a cytoplasmically located (or stromally located for the chloroplast protein) amphipathic helix (Fig. 3; Settles et al., 1997). We note here that the conserved amphipathic region is basic with, on average, an ≈ 3:1 ratio of positively to negatively charged amino acids. It is possible that this predicted helical region could lie at the lipid–water interface, as depicted in Fig. 3, with the basic amino acids interacting electrostatically with the phospholipid head groups. A non-conserved, water-soluble region of variable length follows the amphipathic helix. For the chloroplast Hcf106 protein, Settles et al. (1997) have confirmed a thylakoid membrane location, and their protease accessibility studies support the idea that a major part of the Hcf106 protein is accessible from the stromal phase. Sequence conservation among the TatA/B/E-like proteins is low, with only one absolutely conserved amino acid, a glycine at the transmembrane helix/amphipathic helix boundary. The conserved glycine is normally followed by a proline residue. Between them, these residues would be expected to form a flexible non-helical region that could act as a bend or hinge between the two adjacent helices. Experimental evidence for the importance of the helix junction region in TatB function is provided by the observation that the export-defective tatB allele isolated by Weiner et al. (1998) substitutes a leucine for the conserved TatB proline. Given that most organisms with the Tat system have at least two TatA/B/E homologues and that, in E. coli, TatA/E and TatB have been shown to be functionally distinct, we wondered whether we could find sequence features that would consistently divide the TatA/B/E proteins of an organism into separate ‘TatA/TatE-like’ and ‘TatB-like’ classes. No such easy division appeared and so, in Fig. 2, only genes coding for proteins that are unambiguously more similar to E. coli TatA/E than TatB or vice versa are labelled as such. Presumably, there has either been a degree of sequence interchange between these two similar groups of proteins over time or their function is constrained by general structural properties rather than specific sequence features.
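The basic character of the amphipathic region noted above can be checked for any sequence by counting charged residues. The following is a minimal sketch only: the function name and the toy sequence are hypothetical and are not taken from any real Tat protein, and histidine is assumed to be uncharged.

```python
# Residues counted as charged at neutral pH (assumption: histidine is ignored).
POSITIVE = set("KR")
NEGATIVE = set("DE")


def count_charges(sequence):
    """Return (positive, negative) residue counts for a one-letter protein sequence."""
    seq = sequence.upper()
    positive = sum(1 for residue in seq if residue in POSITIVE)
    negative = sum(1 for residue in seq if residue in NEGATIVE)
    return positive, negative


def charge_ratio(sequence):
    """Ratio of positively to negatively charged residues (None if no negatives)."""
    positive, negative = count_charges(sequence)
    if negative == 0:
        return None
    return positive / negative


# Hypothetical toy sequence, invented purely to illustrate a 3:1 ratio.
TOY_AMPHIPATHIC_REGION = "KRAKELGDKVRK"

if __name__ == "__main__":
    print(count_charges(TOY_AMPHIPATHIC_REGION))
    print(charge_ratio(TOY_AMPHIPATHIC_REGION))
```

Applied to aligned amphipathic regions from real TatA/B/E homologues, the same count would be expected to give values scattered around the average stated in the text, not exactly 3.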

Figure 3.

Sequence-derived topology predictions for the E. coli tat gene products.

The predicted topological structure of TatC has six transmembrane helices arranged such that the N- and C-termini of the polypeptide chain are located at the cytoplasmic side of the membrane (Fig. 3; Sargent et al., 1998a). Fifteen amino acids are completely conserved among the eubacterial TatC proteins of known sequence, all but one of which are found in predicted helices three and four, the cytoplasmic loop between helices two and three or the cytoplasmic N-terminal tail. TatC homologues are encoded not only in bacterial genomes but also in certain plastid genomes, where they are presumably involved in the thylakoid ΔpH-dependent protein import system. In addition, TatC homologues are encoded in the mitochondrial genomes of algae, higher plants and some protists, suggesting that a Tat system operates in at least some mitochondria (Bogsch et al., 1998; Weiner et al., 1998). Potential substrates for this mitochondrial Tat system are, however, not obvious.

Are there further essential tat genes waiting to be identified? Phylogenetic analysis would suggest that this is probably not the case because, whereas genes coding for TatC or TatA/B/E proteins show genetic linkage in many cases, we have been unable to identify any uncharacterized genes consistently linked to the known tat genes. In E. coli, the tatA operon contains a fourth gene, which might be expected to code for an additional component of the Tat system. The product of this tatD gene is a water-soluble cytoplasmic protein (M. Wexler, F. Sargent, E. Bogsch, N. R. Stanley, C. Robinson, B. C. Berks and T. Palmer, unpublished results), which has two further E. coli homologues encoded by the unlinked ycfH and yjjV genes (Fig. 2). Despite the co-transcription of tatD with tatABC, a strain with in frame chromosomal deletions in all three of tatD, ycfH and yjjV exhibits no detectable defect in Tat operation (M. Wexler, F. Sargent, E. Bogsch, N. R. Stanley, C. Robinson, B. C. Berks and T. Palmer, unpublished results). Thus, a TatD-like protein is not an obligate component of the Tat pathway, a finding that is consistent with the observations that the occurrence of TatD homologues does not match the phylogenetic distribution of Tat systems, and that genes encoding TatD are not linked to those for other Tat components in non-enteric bacteria (Fig. 2).

The Tat transport system: structural considerations

The Tat system apparently performs the remarkable feat of transporting folded proteins of varying cross-sectional area across a coupling membrane without rendering the membrane freely permeable to ions (Teter and Theg, 1998). At least three possible general mechanisms can be envisaged by which this might occur. The integrity of the membrane could be preserved by encapsulating the transported protein in a proteolipid vesicle and fusing this with the cytoplasmic membrane. Alternatively, the Tat apparatus could form an aqueous pore of sufficient internal diameter to accommodate the passage of even the largest folded substrate, with the pore gated at both the cytoplasmic and periplasmic ends, such that only one of the gates can be open at any given time. A further option is that the Tat translocon forms a seal of continuously variable diameter around the substrate during transport.

The vesicular mechanism is highly unlikely. There is no evidence in most bacteria for intracellular vesicular structures. Further, if a cytoplasmically located vesicle were initially formed by budding from the cytoplasmic membrane (which it must surely do if transport is to be a cyclical process), the vesicle would encapsulate periplasmic, not cytoplasmic, proteins, and the export substrate would still need to be moved across a bilayer to be imported into the vesicles.

If the Tat system does not operate by a vesicular mechanism, then the Tat proteins must provide some sort of transmembrane pore. Assuming that the pore is formed from the known integral membrane Tat components, the possible structural arrangement of these proteins is constrained by the physical dimensions of known Tat substrates. Although there are no crystal structures of the largest proteins transported by the E. coli Tat pathway, in some instances three-dimensional structures for analogous proteins from other sources are available. The structure of TMAO reductase is known from the closely related organism Shewanella massilia (90 kDa mature protein; Czjzek et al., 1998), and the diameter of this enzyme at the narrowest cross-section is around 50 Å. The HyaAB (100 kDa mature protein) and HybOC (96 kDa; Sargent et al., 1998b) complexes of E. coli hydrogenases-1 and -2 are structurally related to the Desulfovibrio gigas hydrogenase (88 kDa; Volbeda et al., 1995), which is 50–60 Å across at the narrowest cross-section. The largest protein complex known to be transported by any Tat system is the 142 kDa FdnGH subcomplex of E. coli formate dehydrogenase-N (Berg et al., 1991). Even assuming that the subcomplex is spherical in shape, the diameter will not be above 70 Å. At the other end of the size range, the smallest known natural substrates of the E. coli Tat system have a mature protein molecular mass of around 20 kDa (e.g. NrfC; Hussain et al., 1994; Weiner et al., 1998), whereas the smallest candidate Tat substrate in any bacterium, the 9 kDa high-potential iron–sulphur protein (HiPIP) of Chromatium vinosum, has minimum and maximum diameters of around 20 Å and 30 Å (Carter et al., 1974). 
Taken together, these considerations suggest that the protein translocating pore of the Tat apparatus has a maximum diameter of 60–70 Å and that the pore must be able to maintain an ionic seal around substrates with maximal cross-sectional areas differing by a factor of at least four.
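The size arguments above can be reproduced with a rough calculation. The sketch below assumes a compact, spherical protein with a partial specific volume of 0.73 cm³/g, a commonly used average that is not stated in the text; the function names are illustrative. It estimates the minimum diameter for a given mass and the ratio of cross-sectional areas for two diameters.

```python
import math

# Assumed average partial specific volume of a protein, in cm^3 per gram.
PARTIAL_SPECIFIC_VOLUME = 0.73
AVOGADRO = 6.022e23
CUBIC_ANGSTROM_PER_CM3 = 1e24


def min_sphere_diameter(mass_da):
    """Diameter (Å) of a solid sphere with the volume of a protein of this mass."""
    volume = PARTIAL_SPECIFIC_VOLUME * mass_da / AVOGADRO * CUBIC_ANGSTROM_PER_CM3
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 2.0 * radius


def area_ratio(large_diameter, small_diameter):
    """Ratio of circular cross-sectional areas for two diameters."""
    return (large_diameter / small_diameter) ** 2


if __name__ == "__main__":
    # Masses quoted in the text: FdnGH subcomplex (142 kDa) and HiPIP (9 kDa).
    for name, mass in (("FdnGH", 142_000), ("HiPIP", 9_000)):
        print(f"{name}: {min_sphere_diameter(mass):.1f} Å")
    # A 60 Å pore versus the 30 Å maximum diameter quoted for HiPIP.
    print(area_ratio(60.0, 30.0))
```

Under these assumptions, the 142 kDa subcomplex comes out just under 70 Å and the 9 kDa HiPIP falls within the 20–30 Å range quoted from the crystal structure, and a 60 Å versus 30 Å comparison gives the factor of four in cross-sectional area.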

How might the Tat proteins be arranged in order to provide a pore of the required size? The membrane-integral portions of bacterial cytoplasmic membrane proteins are invariably composed of transmembrane α-helical elements. All of TatA, TatB, TatC and TatE are predicted to contain such membrane-spanning helices (above and Fig. 3), and these regions are obvious candidates for forming the transmembrane pore. In order to get an idea of the minimal number of Tat proteins required to form the translocon, it is informative to ask how many transmembrane α-helices aligned parallel to the membrane normal would be needed to enclose an aqueous pore with an internal diameter of 70 Å. Porcine ribonuclease inhibitor, which has a large arc of parallel helices (Kobe and Deisenhofer, 1993), or the inner palisade of helices of Rhodopseudomonas acidophila light-harvesting complex 2 (McDermott et al., 1995) would be reasonable structural analogues for such a ring. Scaling from these models suggests that at least in the region of 21–23 helices would be required. This is larger than the sum (nine) of the predicted helices in TatABCE, so we can infer that multiple copies of at least some Tat components are required to form the translocon.
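The helix count can be approximated with simple geometry as a cross-check on the scaling from known structures. This sketch is not the procedure used in the text: it assumes that the helix axes lie on a circle of the stated diameter and that neighbouring axes are about 10 Å apart, an assumed typical packing distance. Both parameters are adjustable.

```python
import math

# Sum of predicted transmembrane helices in TatA, TatB, TatE (one each) and TatC (six).
HELICES_IN_TATABCE = 1 + 1 + 1 + 6


def helices_in_ring(ring_diameter, axis_spacing=10.0):
    """Minimum whole number of helices spaced axis_spacing Å apart around a ring.

    ring_diameter is the diameter (Å) of the circle through the helix axes;
    the 10 Å default spacing is an assumption, not a value from the text.
    """
    circumference = math.pi * ring_diameter
    return math.ceil(circumference / axis_spacing)


def min_copies(ring_diameter, helices_per_set=HELICES_IN_TATABCE):
    """Minimum number of complete TatABCE helix sets needed to close the ring."""
    return math.ceil(helices_in_ring(ring_diameter) / helices_per_set)


if __name__ == "__main__":
    print(helices_in_ring(70.0))
    print(min_copies(70.0))
```

With a 70 Å ring and the assumed spacing, the estimate is 22 helices, within the 21–23 range obtained from the structural analogues and well above the nine helices available from single copies of the four proteins.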

The width of a lipid bilayer including the phospholipid head groups is around 50 Å. Thus, some of the larger Tat substrates could probably not be accommodated completely within the thickness of the membrane. This suggests that some substrates may protrude above one or both membrane surfaces during the transport process and, if a gated pore structure applies, would probably require portions of the Tat translocon to extend into the aqueous phase(s) to provide shutters at both ends of the pore.

Settles and Martienssen (1998) suggested, on the basis of possible sequence similarity between the thylakoid TatA/B/E homologue Hcf106 and an Archaeal SecY protein, that the Tat translocon evolved from and, by implication, is structurally and mechanistically related to the Sec translocon. Although we reserve judgement on the significance of this sequence comparison, a mechanistic similarity to the SecYE and structurally related endoplasmic reticulum (ER) translocon pore complex is perhaps not as unlikely as it originally seemed. Recent studies of co-translational protein transport across the ER suggest that, during the export event, the diameter of the translocon channel is 40–60 Å (Hamman et al., 1997). This is much larger than is required to accommodate a signal peptide together with a protein strand in extended conformation and approaches the apparent maximum size of the Tat pore. During co-translational protein transport, the ER (and presumably Sec) translocon is sealed by the bound ribosome at the cytoplasmic side of the membrane. SecA probably has an analogous role during post-translational protein export. In the absence of ribosomes, the pore of the ER translocon narrows to 9–15 Å in diameter, and the luminal end is sealed (Hamman et al., 1998). Thus, an ER/Sec-type translocon can produce a variable-diameter gated protein channel that, when fully open, approaches the maximum diameter inferred for the Tat pore. Furthermore, as was inferred above for the Tat transporter, the ER and Sec translocons are also composed of multiple copies of their integral membrane subunits (Hanein et al., 1996; Meyer et al., 1999).

The possibility that the Tat translocon operates by a mechanism completely distinct from that of the ER/Sec translocon should also be considered. Noting the highly distinctive structure of the TatA/B/E-like proteins, we suggest a more radical structural model, in which multiple copies of the TatA/B/E-like proteins form the translocon pore. In this ‘sea anemone’ model, the transmembrane helices form a ring in the membrane, with the C-terminal regions controlling access to the pore at the cytoplasmic side of the bilayer. Such an approximately radially symmetrical circular structure might allow maintenance of the permeability barrier by means of an iris mechanism, in which the individual subunits slide against each other as the diameter of the transported protein varies. Alternatively, exchange of TatA/B/E-like proteins between translocon-bound and monomeric states within the membrane could expand and contract the pore in small increments to fit the pore size to the transported protein. A multimeric TatA/B/E structure would also allow the possibility that the amphipathic helical regions, instead of lying along the bilayer–water interface, as depicted in Fig. 3, could instead be positioned with the hydrophobic face interacting with the transmembrane helices or the membrane interior, and the hydrophilic face lining the transport channel.

Could such an arrangement in which the channel walls must pack non-specifically around the hydrophilic surface of the transported protein be sufficiently tight to maintain the permeability barrier of the cytoplasmic membrane? It can probably be assumed that ions leaking through the Tat seal would need to retain at least their inner hydration shell, giving an effective diameter for the transported ions of approaching 10 Å. It is not unreasonable to expect non-specific protein packing to exclude entities of this size even if the effective diameter is at times reduced somewhat by statistical fluctuations or the presence of polar proteinaceous groups in the solvation shell. However, protons (and hydroxyl ions) will migrate extremely rapidly within a network of hydrogen-bonded water molecules and/or protein hydroxyl groups by random exchange of hydrogen bonds for covalent bonds (the Grotthuss mechanism). It is not clear whether substrate-translocon packing would be able to prevent the formation of such hydrogen-bonded arrays around the transported protein. There must therefore be some doubt as to whether a dynamic seal mechanism is feasible for the Tat transporter.

Which Tat proteins are involved in twin-arginine signal peptide recognition? The transporter clearly recognizes particular amino acid sequences within Tat signal peptides, indicating that the contact is mediated by rather specific protein–protein interactions. It seems reasonable that the amino acid residues involved in forming the binding site for the twin-arginine consensus sequence would be evolutionarily highly conserved and should include at least some polar residues to bond to the guanido groups of the arginine residues. Some workers have proposed, without evidence, that TatA/B/E form a ‘receptor’ for the Tat system (Settles et al., 1997; Chanal et al., 1998), but such proteins contain few highly conserved amino acids, and these are non-polar. In contrast, TatC proteins exhibit considerably higher sequence conservation including conserved polar residues and, on this basis, we suggest that TatC is most likely to be the signal peptide-binding component of the Tat transport system.

It is likely that protein transport through the Tat apparatus is energized solely by the transmembrane proton electrochemical gradient, as demonstrated for the Tat-analogous thylakoid ΔpH-dependent import system (Mould and Robinson, 1991). Consistent with this, export by the Tat pathway in whole E. coli or Z. mobilis cells is sensitive to protonophores (Wiegert et al., 1996; Santini et al., 1998; Cristóbal et al., 1999), although it should be noted that a direct effect of the protonmotive force on the translocon would be exceedingly difficult to demonstrate by in vivo experiments. The classic Sec pathway inhibitor azide, which prevents ATP hydrolysis by SecA, has been reported to block export via the bacterial Tat pathway either partially (Santini et al., 1998) or completely (Wiegert et al., 1996). However, azide is a rather non-specific inhibitor, and so it is quite possible that it affects the Tat system indirectly, for example by inhibition of the terminal reductases of the protonmotive force-generating electron transport chains. The mechanism of coupling protonmotive force to protein transport in the Tat system is unknown.

Concluding comments

Our analysis of the E. coli genome indicates that 22 gene products have plausible twin-arginine signal peptides. For comparison, it has been estimated that perhaps 20% of the ≈ 4300 proteins of the entire E. coli proteome are exported from the cytoplasm via the Sec pathway (Pugsley, 1993). Clearly, exported proteins are preferentially targeted to the Sec rather than the Tat pathway. What little evidence there is suggests that this is not because Sec-dependent proteins are incompatible with the Tat pathway (Nivière et al., 1992; Robinson et al., 1994; Henry et al., 1997; Cristóbal et al., 1999). Conceivably, transport by the Sec system is either more rapid or less energetically costly than Tat-mediated transport. Overall protein flux through the E. coli Tat system is substantially lower than through the Sec pathway with comparative transit half-times in the order of a few minutes (Santini et al., 1998; Sargent et al., 1998a; Bogsch et al., 1998) and a few seconds respectively. However, the relative concentration of translocating sites for the two systems has not been established. For the thylakoid ΔpH-dependent system, Teter and Theg (1998) have estimated that a transporter translocates one ≈ 150-amino-acid protein per minute. In the same time, each E. coli Sec translocon is estimated to have transported proteins containing 10 000 amino acids (Pugsley, 1993). Thus, it is plausible that the bacterium's preference for the Sec over the Tat mechanism for general secretion is based on the relative rates of protein transport.
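The figures quoted in this paragraph can be combined into a short back-of-the-envelope comparison. The sketch below uses only the numbers given in the text; the variable and function names are illustrative, and the thylakoid rate is used as a stand-in for the bacterial Tat system, which is an assumption.

```python
# Figures quoted in the text.
PROTEOME_SIZE = 4300                  # approximate number of E. coli proteins
SEC_EXPORTED_PERCENT = 20             # estimated share exported via Sec
TAT_CANDIDATES = 22                   # gene products with plausible twin-arginine signals
TAT_RESIDUES_PER_MINUTE = 150         # one ~150-residue protein per minute (thylakoid system)
SEC_RESIDUES_PER_MINUTE = 10_000      # estimate per Sec translocon


def sec_substrate_count():
    """Estimated number of Sec-exported proteins in the proteome."""
    return PROTEOME_SIZE * SEC_EXPORTED_PERCENT // 100


def tat_to_sec_substrate_fraction():
    """Tat candidates as a fraction of the estimated Sec substrates."""
    return TAT_CANDIDATES / sec_substrate_count()


def rate_ratio():
    """How many times faster a Sec translocon moves residues than a Tat-type transporter."""
    return SEC_RESIDUES_PER_MINUTE / TAT_RESIDUES_PER_MINUTE


if __name__ == "__main__":
    print(sec_substrate_count())
    print(f"{tat_to_sec_substrate_fraction():.1%}")
    print(f"{rate_ratio():.1f}")
```

On these figures, Sec substrates outnumber candidate Tat substrates by roughly 40 to 1, and the per-translocon transport rate differs by a factor of between 60 and 70. This comparison says nothing about the number of translocons per cell, which, as noted in the text, has not been established.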

Our knowledge of the Tat system is in its infancy. Further studies of this widely occurring and mechanistically distinctive protein transport system promise to be of great interest.

Acknowledgements

We wish to thank the many colleagues with whom we have discussed aspects of the Tat system. Work in the authors' laboratories is supported by the Biotechnology and Biological Sciences Research Council, The Leverhulme Trust, the University of East Anglia and The Royal Society.
