Structural and functional aspects of unique type IV secretory components in the Helicobacter pylori cag-pathogenicity island

Authors


G. Zanotti, Department of Biological Chemistry, University of Padua, Viale G. Colombo 3, 35121 Padua, Italy
Fax: +39 0498073310
Tel: +39 0498276409
E-mail: giuseppe.zanotti@unipd.it

Abstract

Helicobacter pylori cytotoxin-associated gene-pathogenicity island (cagPAI) is responsible for the secretion of the CagA effector through a type IV secretion system (T4SS) apparatus, as well as of peptidoglycan and possibly other not yet identified factors. Twenty-nine different polypeptide chains are encoded by this cluster of genes, although only some of them show a significant similarity with the constitutive elements of well characterized secretion systems from other bacteria. The other cagPAI components represent almost unique proteins in this scenario. The majority of the T4SS include approximately fifteen components, taking into account either the transmembrane complex subunits, ATPases or substrate factors. The composition of the cagPAI is very complex: it includes proteins most likely involved at different levels in the pilus assembly, stabilization and processing of secreted substrate, as well as regulatory particles possibly involved in the control of the entire apparatus. Despite recent findings with respect to components that play a role in the interaction with the host cell, the function of several cagPAI proteins remains unclear or unknown. This is particularly true for those that represent unique members with no clear similarity to those of other T4SS and no obvious evidence of involvement in the secretion of CagA or induction of pro-inflammatory responses. We summarize what is known about these accessory components, both from a molecular and structural point of view, as well as their putative physiological role.

Abbreviations

cagPAI: cytotoxin-associated gene-pathogenicity island
IL: interleukin
T4SS: type IV secretion system

Introduction

Cytotoxin-associated gene-pathogenicity island (cagPAI) characterizes the type I strains of Helicobacter pylori (i.e. the virulent strains) responsible for most gastroduodenal diseases, including active chronic gastritis, peptic ulcers, gastric adenocarcinoma and mucosa-associated lymphoid tissue lymphoma [1–3]. CagA, a major antigenic factor of the bacterium, is the main signature of the cagPAI-positive strains. Indeed, cagPAI confers on H. pylori the capability to express and translocate the CagA protein inside the host cell through a secretion machinery, which is encoded by the components of the PAI; see the accompanying review by Fischer [4]. Once translocated, CagA associates with the inner side of the membrane and is phosphorylated at EPIYA motifs by host tyrosine kinases. The phosphorylation triggers a series of interactions between CagA and human proteins that interfere with the signalling cascades at multiple levels, resulting in a dramatic change of cellular morphology, known as the ‘hummingbird phenotype’, and a remarkable enhancement of cellular motility, causing cell scattering [5]; see the accompanying review by Tegtmeyer et al. [6].

The entire cagPAI region is 37 kb long, including approximately 29 genes [7], which encode the components of a type IV secretion system (T4SS), homologous to the VirB/D4 machinery of Agrobacterium tumefaciens, the best characterized T4SS, which is regarded as the prototype among the members of that family [8]. T4SS are multicomponent membrane-spanning transport systems ancestrally related to the conjugation processes, which can be responsible for diverse processes such as DNA transfer, DNA uptake and release, and translocation of proteins that have an effector role in the target cell. Eleven out of the 29 Cag proteins can be ascribed to the secretion machinery itself or have been proposed to represent functional homologues of VirB proteins [9–11]. For a more detailed description of the correspondence between cag and virB/D4 genes, see the accompanying review by Fischer [4].

A second major effect on the host cells, which is elicited only by H. pylori strains harbouring a functional cagPAI, is the activation of nuclear transcription factor-κB and the induction of pro-inflammatory production of cytokines, mainly interleukin (IL)-8 [12]. The activation of a pro-inflammatory signalling cascade has been confirmed to be Nod1 cytoplasmic receptor dependent and a model has been proposed [13] according to which bacterial peptidoglycan is the key effector of such a response. Indeed, the peptidoglycan muropeptides could be transferred to and internalized into the gastric epithelial cells only if a functional cagPAI is present. However, it is yet to be established whether this kind of stimulus, resulting from muropeptide secretion, is promoted by a syringe-like mechanism, analogous to CagA, or whether intimate contact alone between the host cell surface and the complex encoded by the cagPAI could induce a facilitated internalization of H. pylori peptidoglycan fragments. CagA-deficient mutants do not affect that response to a wide extent, whereas, in the absence of a functional cagPAI, peptidoglycan release may still occur but with much lower efficiency.

Electron microscopy studies indicate that, upon attachment of H. pylori to gastric epithelial cells, pilus-like structures are formed between the bacteria and host cells. Partial characterizations of the secretion organelle allow a description of some of the interactions occurring in the H. pylori T4SS, mainly those concerning components that belong to the external pilus, protruding from the bacterial surface toward the extracellular milieu. At least five of the VirB functional/structural homologues of this syringe-like complex have been localized in this way: CagC (VirB2), CagL (VirB5), CagT (VirB7), CagX (VirB9) and CagY (VirB10) [9,14,15]. At the same time, yeast two-hybrid system approaches combined with immunoprecipitation studies allow a description of the interactions that were assumed to involve Cag proteins [16,17]. These findings, combined with previous analysis on single components, have allowed the proposal of preliminary models of the H. pylori T4SS.

Systematic studies have established that some Cag proteins are essential or important for CagA translocation, whereas others are involved in IL-8 secretion, or both [18,19]. A few are apparently unnecessary for any of these effects. Despite the fact that many of the cagPAI proteins have been demonstrated to be involved in the CagA translocation and/or IL-8 induction/peptidoglycan release, very little is known about their specific function, and this is particularly true for those components that are unique to the Cag apparatus.

In this minireview, we concentrate on this last class of cagPAI proteins, and summarize what is known about them both from a molecular and structural point of view, as well as their putative physiological roles.

The unique members of the Cag-T4SS

Cagζ (cag1/HP0520), Cagε (cag2/HP0521)

Very little is known about the proteins encoded by these two genes. Both were found not to be necessary for either translocation of CagA or for IL-8 induction [18]. cagζ (cag1/HP0520) was found to be among the most highly expressed genes in an H. pylori transcript profile analysis from infected human gastric mucosa, together with cagC (cag25/HP0546), whereas Cagε was not detected at all. Interestingly, the expression levels of cagC and cagζ reached values analogous to genes important for bacterial survival and homeostasis, such as catalase, urease and NapA; by contrast, the transcript abundance of other cag genes appeared to be much lower, indicating that cagPAI consists of multiple operons tightly controlled by different promoter regions [20]. The presence of the cagζ gene was also found to be related to specific ethnic groups, although an association with virulence and disease has not yet been demonstrated [21].

Cagδ (cag3/HP0522)

Cagδ is a 55 kDa protein essential both for the secretion of CagA and the induction of IL-8. The corresponding sequence is characterized by a clear N-terminal signal sequence for export into the periplasmic space, although no transmembrane helices can be predicted by the most common software. Sequence alignments with a nonredundant database allow the identification of some weak but interesting similarities to proteins associated with adhesion and interaction with eukaryotic cells, such as the surface-exposed Streptococcus pneumoniae choline-binding protein A.

When H. pylori bacteria were investigated in the absence of host cell contact, Cagδ was found to enrich in the membrane fractions, even if a minor contribution was also present in the soluble pool. Moreover, it was shown to co-purify with other Cag proteins, mainly CagT, Cagβ, CagD and Cagζ, which represent the most specific interaction partners. Other Cag-T4SS members were coimmunoprecipitated with Cagδ, even if with a lower abundance: CagM, CagX, CagC, CagE and Cagα [22]. Previous yeast two-hybrid experiments demonstrated that Cagδ associates both in homotypic oligomers and heterotypic complexes with CagV (VirB8), CagT (VirB7), CagM and CagG [16]. In particular, the interaction with CagT, a core component involved in the Cag-T4SS outer membrane sub-complex, was identified by multiple independent techniques and more accurately elucidated. Size exclusion chromatography analysis allowed the isolation of both Cagδ-independent oligomers and large Cagδ-CagT complexes, reminiscent of what occurs in vivo. Finally, pulse-chase assays demonstrated a mutual correlation between expression levels and the stability of Cagδ and CagT proteins [22]. Taken together, these results suggest that Cagδ represents a unique and essential core component of the Cag-T4SS.

CagZ (cag6/HP0526)

CagZ protein, a 23 kDa soluble protein, was found to be absolutely essential for the translocation of CagA but not for the induction of IL-8 [18]. The crystal structure [23] shows that it consists of a single compact L-shaped domain, composed of seven α-helices, including approximately 70% of the total residues (Fig. 1A). The protein fold can be described as an up-and-down bundle: four long, twisted stretches run antiparallel to each other. A twist in each stretch produces an L-shaped molecule: its longest arm has dimensions of approximately 15 × 25 × 60 Å, whereas the shortest is 15 × 14 × 30 Å. The side chains located at the protein interior are all hydrophobic, with the exception of three residues; thus, packing of the entire bundle is driven by hydrophobic forces. By contrast, the molecular surface is heavily charged: 26 negatively and 21 positively charged side chains, out of a total of 199 residues. The presence of a flexible C-terminal tail and the heavily charged surface suggest that CagZ may participate in the interaction of effector proteins with one or more components of the H. pylori T4SS on the cytoplasmic side of the inner membrane. One or more clusters of surface-exposed amino acids have been suggested to represent structural motifs of some relevance for protein activity (NEST prediction, ProFunc server; http://www.ebi.ac.uk/thornton-srv/databases/profunc/).
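The charge composition quoted for CagZ can be checked with a few lines of Python. The sketch below is illustrative only: the function name charge_summary and the toy sequence are hypothetical, and counting only Asp/Glu as negative and Lys/Arg as positive (ignoring histidine and the chain termini) is an assumption rather than the method used in the structural work.

```python
# Minimal sketch: count charged side chains in a protein sequence.
# Assumption: D/E are negative, K/R are positive; His and termini are ignored.
NEGATIVE = set("DE")
POSITIVE = set("KR")


def charge_summary(sequence):
    """Return counts of charged residues and simple derived quantities."""
    seq = sequence.upper()
    negative = sum(residue in NEGATIVE for residue in seq)
    positive = sum(residue in POSITIVE for residue in seq)
    length = len(seq)
    return {
        "length": length,
        "negative": negative,
        "positive": positive,
        "charged_fraction": (negative + positive) / length if length else 0.0,
        "net_charge": positive - negative,
    }


# Hypothetical toy sequence, NOT the CagZ sequence.
toy_sequence = "MKDEELLKRAG"
toy_summary = charge_summary(toy_sequence)
print(toy_summary)

# Using the counts reported in the text for CagZ (26 negative, 21 positive, 199 residues).
reported_fraction = (26 + 21) / 199
reported_net = 21 - 26
print(f"charged fraction: {reported_fraction:.3f}, net charge: {reported_net}")
```

With the reported counts, about 24% of the CagZ side chains are charged and the protein carries a net excess of five negative charges under this simple counting scheme.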

Figure 1. (A) Cartoon representation of CagZ protein fold. The L-shape is clearly visible. (B) Two views of the electrostatic potential surface of CagZ. Overall, the surface is strongly hydrophilic, with patches of positive and negative charges.

An exhaustive search of structural similarities in the Protein Data Bank did not provide any remarkable information about the function of the protein. The only weak and partial similarity found by the most common servers involves Rab GTPases, comprising proteins that regulate the maturation and transport of endoplasmic-reticulum-derived vesicles in eukaryotic cells (ProFunc server analysis: http://www.ebi.ac.uk/thornton-srv/databases/profunc/).

CagZ has been detected by 2D differential in-gel electrophoresis from H. pylori cultures in vitro [16], demonstrating that it is expressed at a relatively high abundance compared to other Cag proteins. No processing of a possible N-terminal signal peptide was observed, which is in agreement with predictions based on the sequence alone. This supports the idea that its surface and charge distribution make it prone to be involved in protein assemblies, most likely from the cytoplasmic side. Finally, CagZ has been proposed to interact with multiple Cag components, not only non-VirB homologues, such as CagF, CagM, CagG and CagI, but also some T4SS core components, such as CagV, CagY and the ATPase CagE [16].

CagS (cag13/HP0534)

The cagS gene is located immediately after the cluster of cagPAI genes whose putative products show homologies with the VirB proteins that define the structural core of T4SS. Experimental evidence showed that, similar to CagZ, CagS is expressed at a reasonable abundance in H. pylori cultures in vitro [16].

Primary sequence analyses do not show any strong similarity of CagS with proteins of known function, except for some weak similarities with components involved in peptidoglycan biosynthesis, which belong to the FemABX family of enzymes [24]. The crystal structure of CagS has been determined at a resolution of 2.3 Å [25]. The protein is a single compact domain, with an all-α structure (Fig. 2). Ten α-helices, labelled from A to J, can be distinguished. Helices B, E, F and H, arranged in an up-and-down topology, form the structural core, whereas the short helices C, D and G on one side, and A, I and J on the other, form appendages that confer a ‘peanut shape’ on the overall structure. The ten helices that form the protein are held together mainly by hydrophobic forces, with few hydrophilic interhelical interactions. The model lacks 20 amino acids at the N-terminus and 25 at the C-terminus, which are flexible or disordered. Some poorly defined electron density is present near the N-terminus, although it was not possible to trace the polypeptide chain, with the exception of a few helical turns.

Figure 2. (A) Cartoon view of CagS protein. Methionine residues are represented by ball and sticks. A cluster of four methionine residues is visible in the lower part. (B) van der Waals representation which shows the ‘peanut’ shape of the protein.

The model shows a highly charged surface, with 24 positively and 24 negatively charged residues. In particular, the tertiary structure defines a negatively charged region including several glutamate and aspartate residues confined to the portion of the molecule involving α-helix A, the nearby loop, the first and last turns of helices E and F, and the C-terminal helices I and J. In addition, the N- and C-termini are lysine-rich, in accordance with the basic isoelectric point of CagS. These lysine-rich, unmodelled N- and C-terminal appendages might form positively charged branches that play a potential role in interactions with other Cag proteins. As in the case of CagZ, although to a lesser extent, some putative interactions with the other cagPAI components have been detected by a yeast two-hybrid system, involving mainly CagZ and CagM proteins. Another peculiar feature of the molecule is the presence of fourteen methionine residues in a sequence of 199 amino acids, which is an unusually high content compared to other proteins. Four of them, M69, M130, M133 and M138, define a cluster in the 3D structure, located approximately on the inner side of the peanut. The crystallographic model of CagS does not show any clear evidence of architectural similarity to other known structures, with the exception of a weak structural homology with the phosphotransfer domain (HPt) of CheA, a histidine protein kinase that controls chemotaxis response in bacteria [26]. This homology is too weak to be considered as providing any clues with respect to protein function. Even a primary sequence alignment with a nonredundant database shows very limited similarities, the one with the best score being that with phosphocholine cytidylyltransferases, comprising rate-limiting enzymes for surfactant phospholipid synthesis.

CagQ (cag14/HP0535), CagP (cag15/HP0536)

The products of these two short genes correspond to putative proteins of 126 and 114 amino acids, respectively. They are both predicted to be membrane proteins [27] and they were found not to be necessary for either translocation of CagA or for IL-8 induction in H. pylori 26695 strain [18]. CagP appears to play some role in H. pylori adherence to gastric epithelial cells, in addition to classical adhesins, because mutations in its gene may affect bacterium pathogenicity by reducing either the ability of the bacteria to attach to gastric epithelial cells or the intensity of bacteria–host cell interactions [28].

CagM (cag16/HP0537)

CagM is a 43.7 kDa protein, characterized by a putative N-terminal signal sequence, and at least three transmembrane helices can be predicted from the sequence pattern. It has been detected in the membrane-bound fractions isolated from both in vivo (from gastric patient isolates) and in vitro H. pylori cultures by 2D-electrophoresis and MS studies [16,29]. In particular, the results of the 2D differential in-gel electrophoresis analysis performed by Busler et al. [16] are suggestive of an N-terminal processing, as hypothesized by the predictions.

Systematic mutagenesis analysis clearly showed that ΔcagM mutants are neither able to produce an efficient CagA translocation, nor to release peptidoglycan degradation fragments [13,18,30], thus suggesting an essential role for the cagM gene product. By using a reporter assay in human gastric cancer cells, CagM (along with cagPAI coded protein, CagL) has also been demonstrated to promote the activation of nuclear factor-κB.

More recent experiments with ΔcagE, ΔcagM and ΔcagA isogenic mutant strains of H. pylori provided preliminary evidence that these genes could be involved in the repression of the gene coding for the catalytic subunit of a human gastric H/K-ATPase (Saha, 2008). Generally, this effect might be stimulated by a functional Cag-T4SS, allowing H. pylori to inhibit acid secretion by gastric cells and induce episodes of transient hypochlorhydria that facilitate bacterial colonization. Evidence for protein–protein interactions involving CagM has been observed by yeast two-hybrid analysis. In such experiments, CagM was found to form complexes with many other Cag proteins, both those belonging to the core apparatus, including CagX, CagY, CagT, CagV and Cagδ, and other Cag components such as the ATPase CagE, CagF, CagG, CagZ and CagS [16]. In a different study employing a similar approach, interactions with CagX and partial interactions with CagT were confirmed, whereas those with CagF and CagY were not [17]. Furthermore, both studies identified a clear tendency for CagM to self-associate, forming homotypic oligomers. An analogous behaviour was observed when we purified a recombinant CagM construct expressed in E. coli for structural studies. Oligomers composed of five to six subunits were isolated and partially characterized by gel filtration and preliminary electron microscopy analysis (L. Cendron, unpublished results).

In any case, the proposed rich network of interactions of this protein agrees with its functional relevance and localization studies, where it was found to enrich both in the inner and partially in the outer membrane fractions. A model has been proposed according to which CagM, together with CagX and CagT, associates in the outer membrane basal body of the Cag-T4SS, and the results obtained are in good agreement with the main studies in this respect.

CagN (cag17/HP0538)

Full-length CagN protein (306 amino acids, 35 kDa) has been demonstrated to be processed at the C-terminus, giving rise to a product of approximately 24 kDa (CagN1–216), most likely by a mechanism that is not dependent on other cagPAI proteins [31]. Interestingly, the first 24 amino acids are intact in the endogenous protein, despite the fact that it shows a clear N-terminal hydrophobic pattern and a putative cleavage site that can be easily predicted. The entire primary sequence is predicted to be largely unfolded (foldindex, http://bip.weizmann.ac.il/fldbin/findex). CagN localization studies demonstrated that it is not delivered into the host cell together with CagA but, in contrast, it remains localized at the bacterial membrane, most likely anchored by an N-terminal hydrophobic helix [31]. cagN gene deletion appears not to abolish directly the main consequences of a functional cagPAI (i.e. CagA translocation and IL-8 induction), even if a variable efficiency of both processes has been observed [18].
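The sizes quoted for full-length and processed CagN can be sanity-checked from the residue counts alone. The following sketch assumes the common rule of thumb of roughly 110 Da per amino acid residue; this average, the constant name and the function name estimate_mass_kda are illustrative assumptions and do not come from the cited work, and an exact mass would require the actual sequence.

```python
# Rough molecular mass estimate from chain length.
# Assumption: ~110 Da average residue mass (rule of thumb, not sequence-specific).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues):
    """Approximate protein mass in kDa from the number of residues."""
    if n_residues < 0:
        raise ValueError("number of residues must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Residue counts taken from the text: full-length CagN and the CagN1-216 product.
for label, n_residues in (("CagN full length", 306), ("CagN 1-216", 216)):
    print(f"{label}: ~{estimate_mass_kda(n_residues):.1f} kDa")
```

Under this approximation the 216-residue product comes out close to the reported 24 kDa, and the full-length protein falls within a few percent of the reported 35 kDa, which is the level of agreement one would expect from such a crude estimate.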

Recombinant truncated forms of CagN have been produced (His6-CagN25–306, CagN25–216-His6) and partially characterized in our laboratory. Although the construct lacking only the N-terminal hydrophobic region was strongly prone to aggregation, the one also truncated at the C-terminus yielded a soluble protein that behaves as a monomer in solution, showing a secondary structure content of 13% β-sheet and 30% α-helix, with a certain fraction not being ascribed to any well characterized secondary structure motifs (L. Cendron, unpublished results).

CagI (cag19/HP0540), CagH (cag20/HP0541)

Very little is known about CagI (41.5 kDa) and CagH (39 kDa), despite the fact that knockout studies demonstrated they are essential for CagA translocation and tyrosine phosphorylation. CagH was also shown to be involved in IL-8 induction in epithelial cells, whereas a cagI deletion mutant does not affect that ability [18]. CagH is also predicted to be secreted out of the inner membrane as a result of an N-terminal signal sequence, whereas only one of three different algorithms tested suggests the presence of a putative hydrophobic helix in the mature protein. CagI is most likely a nonsecretory protein, anchored to the inner membrane toward the periplasmic space as a result of an N-terminal hydrophobic helix, spanning residues 26–51, thus supporting the idea that it might be involved in the translocation as a putative effector protein rather than being a component of the T4SS apparatus. Finally, interaction studies indicated that CagI might interact with CagZ and CagG, and weak evidence of interaction with Cagβ was also observed [16].

CagG (cag21/HP0542)

The product of this gene is a 16 kDa protein with a very acidic isoelectric point and a predicted N-terminal signal peptide with a putative cleavage site between residues 27 and 28. A weak homology with the flagellar motor switch protein or toxin co-regulated pilus biosynthesis protein D has been detected [32]. Yeast two-hybrid screens, as noted elsewhere, indicate that CagG is involved in multiple protein–protein interactions with CagM, Cagδ, CagF and CagZ and, to a minor extent, with CagT and the VirD4 homologous Cagβ [16].

Analogous to cagI, cagG deletion mutants are incapable of delivering CagA into gastric epithelial cells, although they retain the capacity to induce IL-8 production, pointing toward a potential effector role for the protein. Other studies provided different evidence, showing a marked reduction of IL-8 production from gastric epithelial cells, as well as a reduced capacity to adhere to epithelial cells in vitro and to colonize Mongolian gerbils in vivo [33]. Similar results were found in an experiment with cagG-deleted strains tested on cultivated KATOIII cells [34].

CagF (cag22/HP0543)

This 268-amino-acid protein was demonstrated to interact with CagA, presumably at the inner bacterial membrane, and this interaction is essential for CagA translocation into the host. These data were used to suggest that CagF might play a chaperone function in the early steps of CagA recruitment and delivery into the T4SS channel [35,36]. Subsequently, CagF was shown to interact with the 100-amino-acid region adjacent to the C-terminal secretion signal of CagA [37]. Weak interactions involving three other Cag proteins (CagZ, CagT and CagM) were also detected. Localization studies indicated that it is present both in the membrane fractions and in the cytoplasm. A His6-tagged construct in our hands behaves as a soluble protein, even if it has a clear tendency to form oligomers of different sizes, coexisting with a major fraction approximately corresponding to a monomer. Detergent treatments appeared to reduce the contribution of very large unspecific oligomers and favour the presence of dimers and/or monomers.

CagD (cag24/HP0545)

The cagD locus is present in a majority of clinical isolates, although little is known about its role. Its amino acid sequence contains a predicted signal sequence for secretion in the periplasmic space.

The crystal structure of CagD, solved in two different crystal forms at medium resolution (2.2 Å and 2.75 Å for the monoclinic and the hexagonal forms, respectively), shows that, in both cases, CagD is a homodimer, where the two monomers are covalently linked by a disulfide bridge (Fig. 3) [19]. In both crystal forms, the N-terminal domain is not visible in one monomer and absent in the other, as a result of proteolysis. Consequently, the model is available only for residues 47–176. The visible part of each monomer folds as a single domain, characterized by a β-sheet flanked by α-helices. Five β-strands, labelled from A to E, are all contiguous. The N-terminus of the monomer includes strand A, an α-helix and a long stretch, and the C-terminal portion includes two relatively long α-helices and a final β-strand, F, which protrudes from the core of the monomer and runs anti-parallel to the same strand of a second monomer, allowing for the formation of the dimer. The surface of interaction between monomers also involves portions of strands D and E of the two monomers, which are held together not only by the S-S bridge between the two Cys172 residues, but also by hydrogen bonds. A second, intramolecular disulfide bridge, between Cys120 and Cys133, helps to stabilize the 3D structure. The dimer presents a large crevice between the two monomers, and the 46 N-terminal amino acids of one monomer could partially fill this cavity.

Figure 3. (A, B) Two different views of CagD dimer. Disulfide bridges are shown in yellow. It is possible to see the two β-strands, one per monomer, that favour dimerization of the protein. (C) The electrostatic potential surface of CagD dimer.

The CagD overall fold is relatively common: the most relevant among proteins that present a similar fold is the SycT chaperone of the Yersinia enterocolitica type III secretion system [38]. In particular, SycT shares the same topology in the region including the N-terminal helix and the β-sheet element, whereas it displays a remarkable difference in the orientation of the α-helices located at the C-terminus. Analogous to the Yersinia SycT chaperone, CagD presents all the main α-helical motifs grouped on just one side of the β-strands, whereas all the other type III secretion system chaperones display a third α-helix on the opposite side, and the latter is extensively involved in the dimerization process. However, the dimeric arrangement of CagD is quite different from that of SycT as well as that of other members of this family.

Finally, when crystallized in the presence of Cu(II), the protein shows the presence of the ion coordinated in a small cavity of the surface at the polar opposites of each monomer, close to another dimer present in the crystal packing and partially involving it in the coordination. It is likely that the presence of Cu(II) is an artefact of the crystallization, although the possibility that the protein can physiologically bind cations cannot be ruled out completely.

Disruption of the cagD gene was first reported to have an intermediate and variable effect on CagA delivery and IL-8 induction phenotype [18]. Subsequently, CagA tyrosine phosphorylation and IL-8 assays have demonstrated that CagD is involved in CagA translocation into the host epithelial cells, although it is not an absolute requirement for T4SS pilus assembly [19].

As suggested by the presence of a secretion signal at the protein N-terminus, CagD is mainly found in the periplasmic space, partially associated with the inner membrane. Interestingly, it was found to be secreted into the culture supernatant, and this was shown not to result from generic bacterial lysis. Moreover, in an H. pylori infection experiment with AGS cells, significant amounts of CagD were found to be associated with the host cell membranes, and this interaction appeared to be independent of CagA translocation or the components of the T4SS, such as CagF. Because this localization was independent of the various tested cag mutants, these findings may indicate that CagD is released into the supernatant during host cell infection by an unknown independent mechanism and then binds to the host cell surface or is incorporated in the pilus structure.

Taken together, these results suggest that CagD may serve as a multifunctional component of the T4SS, which is involved in CagA secretion at the inner membrane and may localize outside the bacteria to promote additional effects on the host cell; however, whether these effects are required for CagA translocation or trigger CagA-independent virulence functions remains unclear.

Conclusions

Despite several studies carried out during the last 15 years on cagPAI, several questions about its components still remain unanswered. Those members that are not strictly structural are, in this sense, particularly puzzling because the role they play in the process of CagA secretion or IL-8 induction is still unknown or uncertain. However, partial maps of the H. pylori transmembrane core apparatus and external pilus have been defined as a result of recent localization and interaction studies. Together with the VirB/D homologues CagV, CagX, CagY, CagT, Cagα, Cagβ and CagE, the proteins Cagδ, CagM and CagZ have been identified as part of a wide network of interactions, with the first two most likely as unique oligomeric core components. CagF has been recognized to act as a chaperone of the major effector CagA, with CagL as a component playing a role in pilus adhesion to gastric epithelial cells. Finally, for a few other Cag proteins, localization in the bacterial compartments has been characterized. As described in this minireview, the 3D structure of only three of these unique components (CagZ, CagS and CagD) is now available, along with Cagα ATPase.
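The interaction network referred to above can be made concrete by tabulating the partners listed in this minireview for each unique Cag protein. The sketch below is illustrative only: the names REPORTED, build_network and rank_by_degree are hypothetical, homotypic (self) interactions are left out, weak and strong interactions are treated alike, and the co-purification data for Cagδ are not included, so the resulting counts reflect only how many distinct partners are named in the text and not any experimentally derived ranking.

```python
from collections import defaultdict

# Partners named in the text for each protein (mostly yeast two-hybrid data).
# Assumption: every listed interaction is treated as an unweighted, undirected edge.
REPORTED = {
    "Cagδ": ["CagV", "CagT", "CagM", "CagG"],
    "CagZ": ["CagF", "CagM", "CagG", "CagI", "CagV", "CagY", "CagE"],
    "CagS": ["CagZ", "CagM"],
    "CagM": ["CagX", "CagY", "CagT", "CagV", "Cagδ",
             "CagE", "CagF", "CagG", "CagZ", "CagS"],
    "CagI": ["CagZ", "CagG", "Cagβ"],
    "CagG": ["CagM", "Cagδ", "CagF", "CagZ", "CagT", "Cagβ"],
    "CagF": ["CagA", "CagZ", "CagT", "CagM"],
}


def build_network(reported):
    """Build an undirected adjacency map; sets remove duplicate edges."""
    network = defaultdict(set)
    for protein, partners in reported.items():
        for partner in partners:
            network[protein].add(partner)
            network[partner].add(protein)
    return dict(network)


def rank_by_degree(network):
    """Sort proteins by number of distinct partners, then by name."""
    return sorted(
        ((protein, len(partners)) for protein, partners in network.items()),
        key=lambda item: (-item[1], item[0]),
    )


network = build_network(REPORTED)
ranking = rank_by_degree(network)
for protein, degree in ranking[:3]:
    print(protein, degree)
```

In this simple tally CagM has the largest number of named partners, followed by CagZ and CagG, which is consistent with the picture of CagM and CagZ as highly connected components.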

The lack of a clear picture of the biological function and organization of some cagPAI components is also a major obstacle to structural studies because most of these gene products possibly do not act as single proteins, but perhaps as subunits of larger complexes, or they are made to act in concert with other partners. For these reasons, further accurate studies on the interactions among cagPAI components will be relevant not only to clarify the function of these proteins, but also for future structural investigations.

Acknowledgements

We acknowledge all the PhD students and post-doctoral researchers who have contributed to the structural studies on the cag proteins over the years. This work was supported by the Ministero dell’Istruzione, dell’Università e della Ricerca, MIUR (PRIN 2007LHN9JL) and by the University of Padua, Italy.
