Architecture of the Helicobacter pylori Cag-type IV secretion system


L. Terradot, Institut de Biologie et Chimie des Protéines, Biologie Structurale des Complexes Macromoléculaires Bactériens, UMR 5086 CNRS Université de Lyon, IFR128, 7 Passage du Vercors, F-69367 Lyon Cedex 07, France
Fax: +33 472 722604
Tel: +33 472 722652
G. Waksman, Institute of Structural and Molecular Biology, UCL and Birkbeck, Malet Street, London, WC1E 7HX, UK
Fax: +44 (0)207 631 6803
Tel: +44 (0)207 631 6833


Type IV secretion systems (T4SS) are macromolecular assemblies used by bacteria to transport material across their membranes. T4SS are generally composed of a set of twelve proteins (VirB1–11 and VirD4), which together form a dynamic machine powered by three ATPases. T4SS are widespread in pathogenic bacteria, where they are often used to deliver effectors into host cells. For example, the human pathogen Helicobacter pylori encodes a T4SS, the Cag-T4SS, which mediates the injection of the toxin CagA. We review the progress made in the past decade in our understanding of T4SS architecture. We then apply this knowledge to derive a model of the structure of the H. pylori Cag system, and use recent protein–protein interaction data to refine this model.


Abbreviations

cagPAI, cytotoxin-associated gene-pathogenicity island

CTD, C-terminal domain

EM, electron microscopy

NTD, N-terminal domain

T4SS, type IV secretion system


Secretion systems are widespread in bacteria, for which they provide a means of exchange between the extra- and intracellular milieus. Seven different systems have been described in Gram-negative bacteria [1]. Amongst them, the type IV secretion systems (T4SS) have attracted considerable attention during the past 10 years because of their role in bacterial pathogenesis. T4SS are macromolecular devices that bacteria use to transport various macromolecules, including proteins, DNA or nucleic acid/protein complexes, across the cell envelope [2,3]. These systems are remarkably versatile and have been classified into three different groups according to their function [3]. T4SS in the first group are used to transfer DNA from one cell to another in a process referred to as conjugation [4]. Incorporating new DNA sequences is a major selective advantage for these organisms, which can rapidly acquire new genetic features and adapt to changes in the environment. Such mechanisms are involved in the spread of antibiotic resistance genes among pathogenic bacteria [5]. One of the most studied T4SS of this group is the Agrobacterium tumefaciens VirB/D system, which is able to deliver nucleoprotein complexes into plant cells, leading to crown gall disease [6]. The artificial utilization of a modified VirB/D system has proved instrumental in the production of genetically modified plants [7]. The second group of T4SS, exemplified by the ComB system of H. pylori and the GGI system of Neisseria gonorrhoeae, is used for DNA uptake from, and release to, the extracellular milieu [8]. The third group consists of T4SS that transfer protein effectors and are used by numerous pathogenic bacteria, including Bordetella pertussis, Legionella pneumophila, Brucella spp., H. pylori and Bartonella spp. In these species, T4SS can be described as molecular pumping devices that facilitate host–pathogen interaction and/or inject toxins into the host cell [5].

In the human pathogen H. pylori, T4SS of the three different groups have been identified. The most conserved one is the so-called ComB system (group 2), which is considered to mediate the import and integration of environmental DNA fragments into the bacterial genome [9]. Another one (named Tfs3, group 1) is less conserved and appears to play important roles in determining the remarkable plasticity of the H. pylori genome [10–12]. The last T4SS (group 3) is present only in the most virulent H. pylori strains, which produce a major toxin, the CagA protein [13]. This toxin and the T4SS apparatus responsible for its transfer to the eukaryotic cell are encoded by the so-called cytotoxin-associated gene-pathogenicity island (cagPAI), a 40 kbp DNA fragment that is transmitted horizontally; see the accompanying review by Tegtmeyer et al. [14]. CagA is considered a paradigm for bacterial carcinogenesis [15]. Briefly, once injected into the host cell, CagA is phosphorylated and interacts with more than 20 different human proteins involved in signal transduction [16]. As a consequence of CagA action, some of the major functions of epithelial cells are disturbed, such as cell–cell adhesion, signalling, adherence and proliferation [17]. CagA translocation and phosphorylation are required to trigger the so-called ‘hummingbird phenotype’, a form of cell scattering [18]. The Cag-T4SS also delivers bacterial peptidoglycan, which triggers the Nod1 response and induction of the nuclear factor-κB pathway [19].

How these effectors are injected into the host cell is still poorly understood. However, the last decade has seen a number of spectacular advances in the structural definition of individual components of the T4SS and, more recently, of how they associate into large macromolecular complexes. We review these advances and show how, when integrated within the larger context of the numerous functional studies on the Cag proteins, they yield a clearer picture of the architecture and function of the Cag-encoded T4SS.

Architecture of T4SS

The T4SS apparatus generally consists of twelve proteins named VirB1–11 and VirD4, based on the nomenclature used for the A. tumefaciens T4SS (Fig. 1). They assemble to form three interlinked subparts: a cytoplasmic/inner membrane complex, a double membrane-spanning channel and an external pilus (Fig. 1). Variations exist among the different types of T4SS but the composition of the compartments is generally conserved. The cytoplasmic/inner membrane complex is composed of three NTPases (VirB4, VirB11 and VirD4), VirB6 and VirB8; the trans-membrane pore complex (VirB7, VirB9 and VirB10; also termed ‘the core complex’) forms a channel from the inner to the outer membrane; and the external pilus generally consists of the VirB2 and VirB5 proteins. Other components are essential for the formation of the T4SS complex: VirB1 allows for the insertion of the system in the periplasm and VirB3, the function of which is unknown, is often associated with VirB4. In certain bacteria, some of these components are absent. This is the case for the H. pylori ComB system, where the apparatus is used to import DNA at the outer membrane and relies on the other competence system, ComEC, to transport DNA into the cytoplasm [20].
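As a reading aid, the grouping of components described above can be written as a small lookup table. The Python sketch below is illustrative only: the group labels and the helper name `subassembly_of` are chosen here and are not established nomenclature, and VirB1 and VirB3 are placed in a separate "accessory" group simply because the text does not assign them to one of the three subparts.

```python
# Illustrative grouping of the canonical VirB/D components by subassembly,
# following the description in the text. Group labels are assumptions made
# for this sketch, not standard nomenclature.
T4SS_SUBASSEMBLIES = {
    "cytoplasmic/inner membrane complex": ["VirB4", "VirB11", "VirD4", "VirB6", "VirB8"],
    "core complex": ["VirB7", "VirB9", "VirB10"],
    "pilus": ["VirB2", "VirB5"],
    "accessory": ["VirB1", "VirB3"],  # not assigned to a subpart in the text
}

# The three NTPases that power the machine.
NTPASES = {"VirB4", "VirB11", "VirD4"}


def subassembly_of(component: str) -> str:
    """Return the subassembly label for a VirB/D component name."""
    for group, members in T4SS_SUBASSEMBLIES.items():
        if component in members:
            return group
    raise KeyError(f"unknown component: {component}")


# Flat list of every component named in the grouping.
all_components = sorted(
    name for members in T4SS_SUBASSEMBLIES.values() for name in members
)
```

Listing the components this way makes it easy to check that the four groups together account for the twelve proteins of the prototypal system.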

Figure 1.

 (A) Schematic view of the cagPAI encoded by the H. pylori strain 26695. Numbers correspond to the HP0XXX number of the ORF [57] represented by arrows; see also the accompanying review by Fischer [21]. (B) Schematic representation of the prototypal T4SS VirB/D from A. tumefaciens (left) and comparison with components of the Cag-T4SS (right). Cytoplasmic NTPases are coloured in blue, proteins forming the core trans-membrane complex are indicated in various shades of green, and pilus components in yellow/orange. Integral trans-membrane segments or proteins are depicted as squares. Note the presence of additional components (coloured in pink) that have been shown to participate in the Cag-T4SS complex. In addition, the effector CagA (coloured in black) has been located at the tip of the pilus.

The conserved components of the cagPAI-encoded T4SS (Cag-T4SS) were first identified by sequence comparison with those of the VirB/D system; see the accompanying review by Fischer [21]. However, although the archetypal VirB/D system has 12 components, the cagPAI encodes 27 proteins. Note that several nomenclatures exist for cagPAI proteins, which are listed in the review by Fischer [21]. Except for the VirB/D system homologues, these proteins are unique to H. pylori; see the accompanying reviews by Fischer [21] and Cendron and Zanotti [22]. The role of most Cag proteins in the T4SS apparatus is not clear, although several of them are essential for the secretion of CagA or the induction of interleukin-8 [23].

The NTPase battery of the Cag-T4SS

The Cag-T4SS powering machinery is composed of three cytoplasmic NTPases, HP0525 (VirB11), HP0524 (VirD4) and HP0544 (putative VirB4 homologue), which supply the energy necessary to assemble the apparatus and secrete CagA. Structural information is available for the VirD4 homologue TrwB from the Escherichia coli conjugative plasmid R388 and for HP0525, but not for VirB4. These NTPases have the canonical Walker A and B motifs, and couple NTP hydrolysis to conformational changes. These might in turn be coupled to unfolding or transfer of the substrate [24].

Coupling protein VirD4 (HP0524)

VirD4 proteins are also named ‘coupling’ proteins because they can recruit substrates to the T4SS apparatus, although this function is absent in some systems. The most studied VirD4 protein is probably TrwB, which is required for the translocation of DNA by the E. coli R388 conjugation system. The crystal structure of TrwB showed that the protein is a globular hexamer with an orange-like shape, with each subunit forming an orange segment (Fig. 2A) [25]. The protein is composed of a 70-residue N-terminal trans-membrane segment (absent in the crystal structure) and two domains, a nucleotide-binding domain and an all-α domain. TrwB binds ATP at the interface between subunits and hydrolysis is stimulated by DNA. In the Cag-T4SS, VirD4 is encoded by the hp0524 gene and the resulting protein is much larger (748 residues). HP0524 is essential for CagA translocation but not for interleukin-8 induction [23]. Evidence has recently been provided that HP0524 interacts with CagA, suggesting that HP0524 might also act as a coupling protein in the Cag-T4SS, as in other systems [26,27].

Figure 2.

 Structures of cytoplasmic NTPases. Side and top views of the crystal structures of T4SS NTPases shown as ribbons. (A) Structure of TrwB, the VirD4 homologue from the E. coli conjugative plasmid R388 [25]. (B) Structure of HP0525 [28], the VirB11 homologue from the H. pylori Cag-T4SS, with the NTDs and the CTDs coloured in light and dark blue, respectively. (C) Structure of the HP0525/HP1451 complex [33]. HP0525 protomers are coloured in light blue and are in the same orientation as in (B). The two molecules of HP1451 are coloured in pink and magenta.

VirB11 (HP0525) and its regulation

HP0525 is essential for CagA secretion [23]. The structure of HP0525 showed that the protein also forms hexamers, with each subunit consisting of two domains, an N-terminal domain (NTD) and a RecA-like C-terminal domain (CTD) containing the motifs found in all members of the traffic ATPase family [2]. Each domain of HP0525 forms a hexameric ring that surrounds a central chamber. The CTD ring has a grapple-like shape, with two helices from each of the monomers pointing into the centre of the ring to form the claws of the grapple [28]. VirB11 proteins are dynamic assemblies, the conformations of which depend on their nucleotide-binding state [29]. Indeed, in HP0525 crystal structures, the nucleotides bind at the NTD–CTD interface and stabilize the interaction between the two domains. When nucleotides are bound, the NTDs are ordered and point inwards in a closed ring conformation. By contrast, in the absence of nucleotide, the NTDs become disordered and point outwards from the centre of the hexamer, leaving an open NTD ring. Such important structural changes generated by nucleotide binding, together with functional analysis of other T4SS components, suggest that VirB11 proteins play a role in substrate translocation by participating in the local unfolding of effector proteins during translocation [30].

Nucleotides are not solely responsible for structural changes in HP0525. A protein, HP1451, was originally identified as an interaction partner of HP0525 in a high-throughput yeast two-hybrid experiment and the interaction was confirmed biochemically [31,32]. The structure of the complex between HP0525 and a large portion of HP1451 revealed that two molecules of the latter interact with the hexamer of HP0525 [33] (Fig. 2C). The HP1451 monomer structure consists of two consecutive KH domains that are used by the protein to interact with several parts of the HP0525 NTDs. The two HP1451 molecules lock the HP0525 NTDs in the closed state and obstruct the chamber. From this structure, HP1451 was suggested to play an inhibitory role for HP0525 ATPase activity and CagA transfer, a hypothesis supported by ATPase assays and in vivo observation of complex formation. It is therefore likely that HP1451 acts as a negative regulator of toxin secretion [33], thereby controlling part of the secretion process.
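The conformational behaviour of HP0525 described in this section can be summarized as a simple decision rule. The Python sketch below is a deliberately coarse simplification written for illustration: the class `HP0525State` and the function `ntd_ring_conformation` are hypothetical names introduced here, and the rule treats HP1451 binding as overriding the nucleotide state, which is an assumption of the sketch rather than a result reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HP0525State:
    """Minimal description of an HP0525 hexamer (illustrative only)."""
    nucleotide_bound: bool
    hp1451_bound: bool = False


def ntd_ring_conformation(state: HP0525State) -> str:
    """Return the NTD ring conformation expected from the structural data.

    Assumption of this sketch: HP1451 binding takes precedence over the
    nucleotide state, because HP1451 locks the NTDs in the closed state.
    """
    if state.hp1451_bound:
        return "closed (locked by HP1451, chamber obstructed)"
    if state.nucleotide_bound:
        return "closed"
    # Without nucleotide the NTDs are disordered and point outwards.
    return "open"


apo = HP0525State(nucleotide_bound=False)
with_nucleotide = HP0525State(nucleotide_bound=True)
inhibited = HP0525State(nucleotide_bound=True, hp1451_bound=True)
```

Calling `ntd_ring_conformation` on the three example states returns "open", "closed" and the HP1451-locked closed form, respectively.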

HP0544 (VirB3/B4)

Little is known about HP0544 (also named CagE). On the basis of sequence signatures, HP0544 contains motifs conserved in both VirB3 and VirB4 proteins [34]. This is reminiscent of the Campylobacter jejuni T4SS, where VirB3 and VirB4 appear to be fused into a single protein. Thus, it is possible that CagE may also combine VirB3 and VirB4 functions. In other T4SS, VirB4 is known to make numerous interactions with other T4SS components, including VirB3, VirB8, VirB10, VirB11 and VirD4 [2]. Yet, in A. tumefaciens, it does not interact with the substrate DNA. VirB4 activity is required for T4SS function and might also have a structural role at the inner membrane.

The translocation pore of the Cag-T4SS

Until recently, the periplasmic core of the T4SS was considered to be composed of VirB8, VirB7, VirB9 and VirB10. The structures of these proteins or parts of these proteins have been solved individually (VirB8, VirB9 [35]) or in a complex (VirB9:VirB7 [36]; VirB7-VirB9-VirB10 [37,38]) and have provided important information on the architecture of the periplasmic core (Fig. 3A). Compared to these structures, the homologues from the cagPAI are significantly different. Indeed, similarities between VirB10 and HP0527 and between VirB9 and HP0528 reside only in the C-terminal portion (Fig. 3C). This discrepancy is particularly apparent for HP0527, which consists of almost 2000 residues, of which only 400 at the C-terminus correspond to VirB10 (approximately 400 residues in all T4SS). The remaining 1600 residues have a unique composition with a number of tandem-repeat regions; see the accompanying review by Fischer [21]. HP0532 and HP0530 are considered putative homologues of VirB7 and VirB8, respectively, although the similarities are very poor and it can be anticipated that their properties might also be different (Fig. 3B).

Figure 3.

 Periplasmic core complex. (A) Ribbon representation of the crystal structures of VirB8 (Brucella suis) and ComB10 (VirB10 homologue from H. pylori) and NMR structure of the TraO/TraN complex (VirB7/9 homologues from the pKM101 plasmid) [35,36]. (B) Cryo-EM structure of the T4SS core complex at 15 Å resolution composed of TraO, TraN and TraF (VirB7/9/10). The 1.05 MDa complex spans the entire periplasmic space and forms channels in the inner and outer membranes. It is subdivided into two layers: the I layer inserting into the inner membrane and the O layer inserting into the outer membrane [38]. (C) Graphic representation of the VirB homologues from the Cag-T4SS: HP0527 (VirB10), HP0528 (VirB9). The coloured circles represent the homologous regions with VirB/D systems protein structures. (D) Crystal structure of the O-layer with the individual components coloured as in (A) [37].

Structure of the translocation pore

Recently, two major advances have provided the molecular details of the translocation pore of a T4SS. First, the cryo-electron microscopy (EM) structure of a VirB7-VirB9-VirB10 complex, from the plasmid pKM101 T4SS, was determined at 15 Å resolution (Fig. 3C). The three proteins form a large assembly of approximately 1 MDa that spans both the inner and the outer membranes and contains 14 copies of each protein [38]. The structure is cylindrical (length 185 Å, diameter 185 Å) with two distinct layers termed ‘I’ and ‘O’ connected by narrow linkers and with a central channel spanning the entire structure (Fig. 3C). The channel is open on the cytoplasmic side (55 Å opening) but constricted on the outer-membrane side (10 Å). Two chambers, one in each of the layers, are clearly visible.
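The stoichiometry quoted above lends itself to a quick consistency check. The short Python sketch below uses only the figures given in this review (14 copies of each of three proteins, and the 1.05 MDa mass stated in the legend of Fig. 3) to derive the number of polypeptide chains and the mass of one VirB7-VirB9-VirB10 protomer; the variable names are chosen here for illustration, and the per-protomer mass is an average implied by these numbers, not a measured value.

```python
# Figures taken from the text and the legend of Fig. 3.
COPIES_PER_PROTEIN = 14
CORE_PROTEINS = ("VirB7", "VirB9", "VirB10")
COMPLEX_MASS_KDA = 1050.0  # 1.05 MDa

# Total number of polypeptide chains in the core complex.
total_chains = COPIES_PER_PROTEIN * len(CORE_PROTEINS)

# Mass of one heterotrimeric protomer (one copy of each protein),
# assuming the complex contains nothing but these three proteins.
mass_per_protomer_kda = COMPLEX_MASS_KDA / COPIES_PER_PROTEIN

print(f"{total_chains} chains, about {mass_per_protomer_kda:.0f} kDa per protomer")
```

Under these assumptions, the complex contains 42 chains and each VirB7-VirB9-VirB10 protomer accounts for roughly 75 kDa.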

The I and O layers contact the inner and outer membranes, respectively. Each displays a double-walled architecture, but the two layers differ in composition and structure [38]. The I layer consists of the N-terminal parts of VirB9 and VirB10 and is inserted into the inner membrane. It has a large central chamber with a narrower 55 Å base that forms a cup. The O layer has two different parts, a main body and a cap that is inserted in the outer membrane, and is formed by the CTDs of VirB9 and VirB10 and the full-length VirB7.

The second advance has been the crystal structure of the O-layer [37]. This structure reveals a number of surprising features. First, VirB10 forms the outer membrane channel. Indeed, the inside of the O-layer is lined by VirB10, and VirB10 contributes the part crossing the outer membrane. Because VirB10 is also known to insert into the inner membrane, this endows VirB10 with the remarkable and unique property of spanning both membranes. Second, the structure spanning the outer membrane is helical, rather than β-stranded as in the vast majority of proteins forming pores in the outer membrane. Indeed, each VirB10 subunit projects a helical segment through the outer membrane, and 14 of these segments form the outer membrane channel. Although, in the EM structure, this channel was constricted/closed (only a 10 Å hole was observed; see above), in the X-ray crystal structure, the channel is open. Third, VirB9 interacts closely with VirB7, and 14 VirB7/VirB9 complexes form an outer ring stabilizing the VirB10 tetradecameric channel. Finally, the CTD of VirB10 exhibits an extended sequence of approximately 30 residues at its N-terminus, termed the ‘lever arm’, which embraces three consecutive VirB10 subunits in the tetradecameric structure and forms a platform inside the O-layer. Interestingly, this platform is located at a different level in the crystal and EM structures. This observation, coupled with the fact that the outer membrane channel is closed in the EM structure but open in the crystal structure, has led to the suggestion that the lever arms might regulate the open/closed state of the channel.

Although the VirB7-VirB9-VirB10 (HP0532-HP0528-HP0527) complex appears to be conserved, additional Cag-T4SS-specific proteins participate in the periplasmic complex (Fig. 1); for details, see the accompanying review by Fischer [21]. For example, HP0532 (VirB7) does not interact directly with HP0528 (VirB9) but requires HP0537 (CagM), which stabilizes the translocation pore complex formed by HP0528, HP0527 and HP0532 [34]. Moreover, other protein–protein interactions occur between the core complex and the periplasmic proteins HP0538, HP0522, HP0530 and HP0537. In particular, HP0522 was found to be part of a large complex involving several Cag proteins, including cytoplasmic, periplasmic and pilus components, and therefore might be an important part of the outer membrane complex of the T4SS [39].

The T4SS pilus

T4SS pili are generally composed of two proteins, VirB2 and VirB5. VirB2 is considered the major pilin subunit and VirB5, although less abundant, decorates the external part of the appendage formed by VirB2. In the VirB/D system, VirB2 is a small protein processed into a 7.2 kDa T-pilin that is cyclized before pilus formation [40,41]. HP0546 was proposed to be a functional homologue of VirB2 based on sequence similarity [34,42,43]. HP0546 is present in membrane fractions and at the bacterial surface but is only a minor component of the Cag-T4SS specific pilus (see below) [43].

The structure of TraC, the VirB5 homologue from the pKM101 T4SS, showed that the protein consists of a helix bundle capped by a globular domain [44] (Fig. 4). Mutational studies of TraC have suggested that VirB5 proteins play a role in adhesion, mediating cell–cell interaction during conjugation [44]. There is no obvious homologue of VirB5 in the cagPAI. However, a detailed analysis of the HP0539 (also named CagL) sequence suggested that it could be a structural homologue of VirB5, which is consistent with its interaction with host-cell receptors and its location at the Cag-T4SS specific pilus (see below) [34,45].

Figure 4.

 A model of Cag-T4SS pilus assembly upon contact with the cellular receptor integrin α5β1. Proteins directly interacting with the α5β1 integrin are HP0539 [49], HP0527, HP0540 and CagA [48]. A possible function of these interactions would be to trigger a signal for oligomerization of HP0527, assembly of the Cag-T4SS injection machinery, and locking of the α5β1 receptor, as suggested in [48]. The pilus substructure is composed of the VirB2 functional homologue HP0546 (indicated in yellow), which is initially present only at some areas of the cell surface [43]. The protein HP0539 (CagL) could be a homologue of VirB5 and binds to α5β1 via an RGD motif. The structure of the VirB5 homologue TraC [44] is shown in ribbon representation (left). After sequential binding of CagA, HP0527 and possibly HP0540 to the receptor, the assembly of the injection apparatus would take place and effectors (CagA and peptidoglycan fragments) could be translocated.

Cag-T4SS specific pilus

The Cag-T4SS pilus is remarkably unusual. Its composition appears more complex than that of the prototypal VirB2/B5 pilus produced by other T4SS. The Cag-T4SS pilus involves not only the HP0546 and HP0539 proteins, but also HP0527, HP0528 and HP0532, as well as CagA. By contrast with other T4SS, there is no evidence suggesting that the VirB2 homologue HP0546 is the main component of the needle-like structure described in two studies [46,47]. Perhaps the most surprising finding is that part of the translocation core complex is also involved in the external structure of the pilus. Indeed, HP0527 (VirB10) and HP0532 (VirB7) associate with the pilus surface and were detected by immunogold labelling [46,47]. HP0527 is able to interact with itself [34]. The central region of HP0527 interacts with the C-terminal portion and this interaction could provide a means of oligomerization to form a super-structure in direct prolongation of the translocation pore (Fig. 4). Pilus formation might be coupled with receptor binding because Cag-T4SS assembly first requires contact with epithelial cells [46]. The receptor involved is likely to be the α5β1 integrin. Indeed, several Cag proteins, including HP0527, HP0539, HP0540 and CagA itself, were shown to bind to different domains of the integrin α5β1 [48,49]. This suggests that HP0540 might also be exposed at the surface of the Cag-T4SS pilus. Some of these results are conflicting (see the accompanying review by Fischer [21]) and a general consensus has yet to emerge. However, all studies emphasize the role of the Cag-T4SS pilus in mediating interactions with α5β1.

Concluding remarks

How the Cag-T4SS is assembled is still poorly understood. Very recently, HP0523 was proposed to act as a lytic transglycosylase, suggesting that HP0523 might be the H. pylori homologue of VirB1 [50]. VirB1 proteins are important for piercing the peptidoglycan layer in the periplasm. Formation of the double-membrane spanning core complex by the HP0532-HP0528-HP0527 (VirB7-VirB9-VirB10) proteins is likely to occur next because these proteins assemble spontaneously in other T4SS [38]. Because HP0527 is a major component of the Cag-T4SS pilus, core complex formation might be coupled with pilus formation. Subsequent steps might include recruitment of the cytoplasmic/inner membrane ATPase complex. Once assembled, the Cag-T4SS delivers two types of effectors: the CagA protein and peptidoglycan fragments. These have different effects on the cell and it is unclear whether they are secreted together. Little structural information is available on the main effector CagA, which cannot be produced as a recombinant protein under standard conditions [51]. However, a crystal structure of a C-terminal fragment of CagA in complex with microtubule affinity-regulating kinase 2 (MARK2) was recently determined [52]. This structure revealed 12 residues of CagA bound to the kinase active site, demonstrating that the toxin inhibits the enzyme by mimicking its natural substrate [52]. However, only these 12 residues of the 120-residue fragment bound to the kinase were visible, suggesting that the remaining part of the polypeptide was unfolded in the crystal. This is somewhat reminiscent of bacterial effectors delivered by other systems, which are unfolded during translocation [53]. It is therefore possible that a large part of CagA is unfolded during and after translocation, although more studies are necessary to decipher the structural details of this process. Although structural data are accumulating on T4SS, a number of specific questions remain unanswered concerning the Cag-T4SS machinery. This is illustrated by the structural studies of ‘non-T4SS’ Cag proteins, which have revealed that these proteins do not resemble known structures [54–56]; see the accompanying review by Cendron and Zanotti [22]. Therefore, although evolutionarily related to other T4SS, the Cag-T4SS displays numerous specific features, and more studies will be necessary to obtain a more complete understanding of this fascinating machinery, which is involved in one of the main steps of H. pylori infection.
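The homologue assignments discussed throughout this review can be collected in a single table. The Python sketch below is offered as a summary aid, not as an authoritative annotation: the dictionary name `CAG_HOMOLOGUES`, the helper `cag_homologue` and the wording of the status notes are introduced here, and the notes paraphrase the level of evidence given in the text. VirB6 is omitted because no Cag counterpart is discussed in the text.

```python
# VirB/D component -> (Cag protein, status as described in this review).
# Status strings are paraphrases written for this sketch.
CAG_HOMOLOGUES = {
    "VirB1": ("HP0523", "proposed; acts as a lytic transglycosylase"),
    "VirB2": ("HP0546", "proposed functional homologue"),
    "VirB3": ("HP0544", "putative; CagE carries VirB3 and VirB4 motifs"),
    "VirB4": ("HP0544", "putative; CagE carries VirB3 and VirB4 motifs"),
    "VirB5": ("HP0539", "possible structural homologue (CagL)"),
    "VirB7": ("HP0532", "putative; poor similarity"),
    "VirB8": ("HP0530", "putative; poor similarity"),
    "VirB9": ("HP0528", "similarity limited to the C-terminal portion"),
    "VirB10": ("HP0527", "similarity limited to the C-terminal portion"),
    "VirB11": ("HP0525", "homologue; crystal structure available"),
    "VirD4": ("HP0524", "homologue; interacts with CagA"),
}


def cag_homologue(virb_name: str) -> str:
    """Return the HP number assigned to a VirB/D component in this sketch."""
    return CAG_HOMOLOGUES[virb_name][0]


# Cag proteins of the putative core complex (VirB7-VirB9-VirB10).
core_complex = [cag_homologue(name) for name in ("VirB7", "VirB9", "VirB10")]
```

Here `core_complex` evaluates to the HP0532-HP0528-HP0527 trio referred to in the text, and both VirB3 and VirB4 map to the single protein HP0544.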


This work was funded by grant 082227 from the Wellcome Trust to G.W. and by an ATIP-Avenir and Ligue contre le cancer grant to L.T.