Proteome variability among Helicobacter pylori isolates clustered according to genomic methylation

Authors

  • I. Vitoriano,

    1. Faculdade de Engenharia, Universidade Católica Portuguesa, Rio de Mouro, Portugal
  • J.M.B. Vítor,

    1. Faculdade de Farmácia, iMed.UL, Universidade de Lisboa, Lisbon, Portugal
  • M. Oleastro,

    1. Departamento de Doenças Infeciosas e Departamento de Genética Humana, Instituto Nacional Saúde Dr. Ricardo Jorge, Rio de Mouro, Portugal
  • M. Roxo-Rosa,

    Corresponding author
    1. Departamento de Doenças Infeciosas e Departamento de Genética Humana, Instituto Nacional Saúde Dr. Ricardo Jorge, Rio de Mouro, Portugal
    2. BioFIG-Center for Biodiversity, Functional & Integrative Genomics, Faculdade de Ciências, Universidade de Lisboa, Lisbon, Portugal
    • Faculdade de Engenharia, Universidade Católica Portuguesa, Rio de Mouro, Portugal
  • F.F. Vale

    Corresponding author
    • Faculdade de Engenharia, Universidade Católica Portuguesa, Rio de Mouro, Portugal

Correspondence

Mónica Roxo-Rosa, Instituto Nacional Saúde Dr. Ricardo Jorge, Av. Padre Cruz, 1649-016 Lisboa, Portugal. E-mail: roxorosa@hotmail.com

and

Filipa F. Vale, Faculdade de Engenharia, Universidade Católica Portuguesa, Estrada Octávio Pato, 2635-631 Rio de Mouro, Portugal. E-mail: filipavale@fe.lisboa.ucp.pt

Abstract

Aims

To understand whether the variability found in the proteome of Helicobacter pylori relates to the genomic methylation, virulence and associated gastric disease.

Methods and Results

We applied the Minimum-Common-Restriction-Modification (MCRM) algorithm to genomic methylation data of 30 Portuguese H. pylori strains, obtained from the sensitivity of each genome to digestion by Type II restriction enzymes. All the generated dendrograms presented three clusters with no association with gastric disease. Comparative analysis of two-dimensional gel electrophoresis (2DE) maps obtained for total protein extracts of 10 of these strains, representative of the three main clusters, revealed that among 70 matched protein spots (in a universe of 300), 16 were differentially abundant (P < 0·05) among clusters. Of these, 13 proteins appear to be related to the cagA genotype or gastric disease. The abundance of three protein species, DnaK, GlnA and HylB, appeared to be dictated by the methylation status of their gene promoter.

Conclusions

Variations in the proteome profile of strains with common geographic origin appear to be related to differences in cagA genotype or gastric disease, rather than to clusters organized according to strain genomic methylation.

Significance and Impact of the Study

The simultaneous study of the genomic methylation and proteome is important to correlate epigenetic modifications with gene expression and pathogen virulence.

Introduction

Helicobacter pylori is a spiral-shaped, flagellated, Gram-negative bacterium that infects the gastrointestinal tract of more than half of the world's human population (Atherton 2006). Although most patients remain asymptomatic, in the absence of treatment, both infection and the resulting inflammation persist throughout their lives. Acting as a quasispecies, that is, each strain can undergo genetic alteration in vivo (by recombination and mutation), the strains of these hypermutable bacteria appear to be unique but highly adapted to the human stomach (Israel et al. 2001; Salama et al. 2007), able to survive the extreme conditions of the stomach and evade the human immune system (De and Bereswill 2007). The genetic variability among H. pylori strains is also related to differences in their virulence (Oleastro et al. 2009; Panayotopoulou et al. 2010; Vitoriano et al. 2011b; Rizzato et al. 2012). This, in addition to host genetic susceptibility and environmental factors, determines that, in about 20% of the infected individuals, gastritis progresses to more severe gastric diseases, namely peptic ulcer disease or gastric cancer (Atherton 2006). Indeed, genetic analysis confirmed that, among others, the presence of cytotoxin-associated gene A (cagA) and toxic allele (s1 allele) of the vacuolating cytotoxin gene (vacA) is more common in H. pylori strains associated with peptic ulcer disease or gastric cancer (Wen and Moss 2009).

Typing methods are in general useful for understanding the natural history, epidemiology, mode of transmission, reservoirs and clinical implications of bacteria (Owen et al. 2001). Typing studies have clearly shown that H. pylori strains from the same geographic region are more closely related to each other than strains from other geographical regions (Falush et al. 2003; Gressmann et al. 2005; Vale et al. 2008, 2009; McClain et al. 2009; Kawai et al. 2011; Lehours et al. 2011; Correa and Piazuelo 2012). However, even within the same region, strains vary greatly in their virulence and, therefore, should present variations in their proteome (Wu et al. 2008; Franco et al. 2009; Bernarde et al. 2010; Vitoriano et al. 2011a,b). There is still a lack of knowledge about the potential influence of bacterial proteome on the strains' similarity in typing methods. The multilocus sequencing typing (MLST) technique is a powerful, classically used method that enables the discrimination of bacterial strains according to their allelic profile of constitutive housekeeping genes required for maintenance of basic cellular functions (Sullivan et al. 2005; Yamaoka 2009). However, for bacteria with a high number of expressed Type II DNA methyltransferases and companion restriction endonucleases, as is the case of H. pylori, the genomic methylation typing method, followed by analysis with the Minimum-Common-Restriction-Modification (MCRM) clustering algorithm, is particularly useful (Vale and Vitor 2007; Vale et al. 2008). Indeed, the advantage of its use arises from the fact that it is based on the conservation of restriction and modification systems assumed by others (Kusano et al. 1995). This assumption was verified by our follow-up study of 10 patients, for a period of 6 months to 4 years, that revealed substantial conservation (94·5%) in methyltransferase expression (Vale and Vitor 2007).
Corroborating this is the genome sequencing of four chronically infected Colombians (isolation intervals from 3 to 16 years) plus one human volunteer experimentally infected with H. pylori that revealed the genome stability in the absence of mixed infection (Kennemann et al. 2011). Therefore, we have developed the MCRM algorithm (Vale et al. 2008) based on: the selfish behaviour of restriction and modification systems, determined by the fact that in their absence the spread of the bacterial population is inhibited by postsegregational death (Kusano et al. 1995; Naito et al. 1995; Handa and Kobayashi 1999); and the assumption that each strain evolves by acquiring new restriction and modification systems, without losing those it already owns. The algorithm works by first grouping strains that share a common minimum set of restriction and modification systems and then gradually adding strains according to the number of restriction and modification systems acquired. Both MLST and MCRM cluster H. pylori strains with common geographical origin (Falush et al. 2003; Gressmann et al. 2005; Vale et al. 2008, 2009) or that were isolated from members of the same family (Raymond et al. 2004). Because restriction endonucleases recognize specific sequences, their use offers an inexpensive way to make a first evaluation of DNA methylation. Modern methods to study the methylome of bacteria use single-molecule real-time sequencing, which yields the DNA sequence and also detects base modifications through differences in the DNA polymerase kinetics when the template DNA is methylated (Eid et al. 2009; Flusberg et al. 2010; Korlach et al. 2010; Korlach and Turner 2012).

Still unknown was, however, the influence of the proteome of each strain on cluster organization. We considered the proteome not in the strict sense of the synthesized proteins (protein expression concept) but in the broader sense of protein species (protein speciation concept, that is, synthesized proteins and their variants resulting from post-translational processing and modifications) present in each strain (Patterson and Aebersold 2003; Jungblut et al. 2008). In the light of the protein speciation concept, our previous studies of proteome variability among H. pylori clinical isolates (Vitoriano et al. 2011a,b) by two-dimensional gel electrophoresis (2DE) analysis followed by mass spectrometry (MS) reinforced the idea that the genomic variability of these bacteria is not fully reflected in their proteomes. Nevertheless, strains associated with different gastric diseases differ in the abundance of some protein species, explaining differences in virulence (Bernarde et al. 2010; Vitoriano et al. 2011a,b). Fluorescence difference gel electrophoresis (DIGE) has been recently exploited (Wu et al. 2008; Franco et al. 2009; Momynaliev et al. 2010) as an alternative approach to the classical 2DE in the search for H. pylori virulence biomarkers. Other technologies involving metabolic labelling, such as the stable isotope labelling with amino acids in cell culture (SILAC) (Harsha et al. 2008), have been extensively applied to Escherichia coli cells, but to our knowledge, they have not been used for H. pylori proteomics yet. Notwithstanding their enhanced sensitivity, which is positive for comparative proteomics of closely related H. pylori strains, the higher costs associated with these technologies still limit their use.

Here, we aimed to understand whether, besides sharing the expression of particular DNA methyltransferases (i.e. sharing the presence of particular active methyltransferases), the strains that cluster together in the MCRM analysis are also more closely related in terms of their proteome profile or whether, instead, it is the associated disease status and/or the presence of CagA virulence factor that dictate differences in the abundance of protein species among strains. For that, we studied a group of 30 Portuguese H. pylori clinical isolates, heterogeneous in their hosts' age, gender and gastric disease. Strains were first typed by applying the MCRM algorithm to their genome-methylation status data, and the variability of the proteome was then evaluated in a small subset of strains.

Methods

Helicobacter pylori strains characterization

The analysed H. pylori strains belong to the collection of bacterial strains from the Department of Infectious Diseases of the National Institute of Health Dr. Ricardo Jorge (Lisbon, Portugal). Strains were selected to have a common geographical origin [this was assured by considering only strains from unrelated Portuguese Caucasian patients, taking into account the described homogeneity of this human population (Rosser et al. 2000)] and to be heterogeneous regarding their cagA and vacA genotype and the age, gender and gastric disease of the patient from whom each of them was collected. These included 30 clinical isolates from unrelated Portuguese patients, 14 presenting with nonulcer dyspepsia (NUD), 10 with peptic ulcer disease (PUD), and six with gastric cancer (GC) (Table 1), with average ages of 18 (range 3–62), 31 (range 5–66) and 65 years (range 43–80), respectively. The first sequenced H. pylori strain (strain 26695, ATCC 700392), that is, the reference NUD-associated English strain (here assigned NUD-26695), was also included (Tomb et al. 1997).

Table 1. Helicobacter pylori strains

            Helicobacter pylori    Patient
Strain      cagA    vacA           Histologic analysis    Age        Gender
32/00       +       s1             NUD                    3          M
94/99       +       s1             NUD                    5          M
228/99      −       s2             NUD                    8          M
323/99      +       s1             NUD                    12         F
375/01      +       s1             NUD                    5          M
440/02      −       s2             NUD                    9          F
514/02      −       s2             NUD                    14         F
600/99      −       s2             NUD                    10         F
795/99      +       s1             NUD                    31         M
796/99      +       s1             NUD                    20         M
855/99      +       s1             NUD                    62         M
864/99      −       s2             NUD                    57         F
1094/03     −       s2             NUD                    4          M
1622/05     +       s1             NUD                    15         F
68/00       −       s1             PUD                    5          M
93/00       +       s1             PUD                    29         M
147/00      −       s1             PUD                    56         F
483/99      +       s1             PUD                    66         F
624/99      +       s1             PUD                    65         M
713/99      +       s2             PUD                    29         M
1603/05     +       s1             PUD                    18         F
1615/05     +       s2             PUD                    14         M
1661/05     +       s1             PUD                    14         M
1692/05     +       s1             PUD                    18         F
B22/99      −       s2             GC                     43         M
B23/99      +       s1             GC                     62         M
B46/95      +       s2             GC                     80         M
B63/98      +       s1             GC                     Unknown    M
JP1/95      −       s1             GC                     Unknown    F
P3/92       −       s2             GC                     73         F
26695       +       s1             NUD                    Unknown    Unknown

cagA column: +, strain containing the gene; −, strain lacking the gene. vacA column: s1, toxic form of the gene; and s2, nontoxic form of the gene. NUD, nonulcer dyspepsia; PUD, peptic ulcer disease; GC, gastric cancer; M, male; F, female. All strains were isolated from Portuguese patients, except the sequenced NUD-26695 strain (from an English patient with nonulcer dyspepsia).

Cluster analysis of restriction enzyme susceptibility

A binary matrix (Table S1) was recorded with previously obtained data (Vale et al. 2009) on the hydrolysis of the genomic DNA of each strain with 28 different Type II restriction endonucleases (AseI, BseRI, BssHII, BssKI, BstUI, DdeI, DpnI, DpnII, EagI, FauI, Fnu4HI, FokI, HaeIII, HhaI, HpaII, Hpy188I, Hpy188III, Hpy99I, HpyCH4III, HpyCH4IV, HpyCH4V, MspI, NaeI, NlaIII, Sau3AI, Sau96I, ScrFI and TaqI). Hydrolysis suggests unmethylated DNA (entries ‘0’ in Table S1), and unhydrolysed DNA suggests protection by methylation (entries ‘1’ in Table S1), thus the presence of an active methyltransferase. This matrix was then used for cluster analysis (i.e. dendrogram construction) with the MCRM algorithm (Vale et al. 2008). Briefly, the algorithm starts by choosing one of the strains in the set that contains the fewest restriction and modification systems, which is hypothesized to be the one that has the core set of the most abundant restriction and modification systems expressed among the typed strains. We consider the hypothesis that this core set of restriction and modification systems was the first to be acquired by H. pylori, so that they exhibit a large dissemination (expression) among several daughter strains. The set of strains is then divided into two groups: the group that has the core set of restriction and modification systems and the group that does not. The algorithm then runs recursively until all restriction and modification systems are considered and all strains are positioned in the dendrogram. A complete description of the algorithm can be found in our previous publication (Vale et al. 2008).
The MCRM algorithm is based on: the selfish behaviour of restriction and modification systems, in which the loss of Type II restriction and modification gene complexes inhibits the propagation of a cell population and causes chromosome breakage; and on the assumption that each strain evolves by acquiring new restriction and modification systems, without losing those that it already possesses. MCRM algorithm was implemented using Matlab® R12 and is available for download at http://www.ff.ul.pt/~jvitor/Bioinformatics/MCRM_algorithm.zip. The produced dendrograms were checked for similarity considering the following parameters: rate of k/nM, where k is the number of methyltransferases commonly expressed by all strains and nM is the number of screened methyltransferases; discriminatory capacity of the method, that is, the strain's frequency per cluster, given by nj/N, where nj is the number of clustered strains in a given similarity level and N is the number of tested strains (Maslow et al. 1993); Simpson's index of diversity, which reflects the capacity of the method to distinguish unrelated strains (Maslow et al. 1993); and typeability, that is, the proportion of strains that are assigned to a specific type by the typing system (Maslow et al. 1993).
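The recursive splitting described above can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the published Matlab implementation: the function name `core_set_partition`, the tie-breaking rule (alphabetical by strain name), the nested-dictionary output and the toy strains A to D are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def core_set_partition(strains: Dict[str, FrozenSet[str]]) -> List[dict]:
    """Recursively group strains by shared core sets of active R-M systems.

    Illustrative simplification only; not the published Matlab implementation.
    """
    if not strains:
        return []
    # Seed: the strain with the fewest systems (ties broken by name, an assumption).
    seed = min(strains, key=lambda name: (len(strains[name]), name))
    core = strains[seed]
    # Strains that carry the whole core set keep only their additional systems.
    with_core = {n: s - core for n, s in strains.items() if core <= s}
    without_core = {n: s for n, s in strains.items() if not core <= s}
    node = {
        "core": sorted(core),
        # Strains with nothing beyond the core are placed at this node.
        "strains": sorted(n for n, extra in with_core.items() if not extra),
        # Strains with additional systems are split again, one level deeper.
        "children": core_set_partition({n: e for n, e in with_core.items() if e}),
    }
    # Strains lacking the core set are partitioned separately at the same level.
    return [node] + core_set_partition(without_core)


# Hypothetical toy input: strain name -> methyltransferases scored '1'.
toy = {
    "A": frozenset({"M.HhaI", "M.NaeI"}),
    "B": frozenset({"M.HhaI", "M.NaeI", "M.DpnII"}),
    "C": frozenset({"M.HhaI", "M.NaeI", "M.DpnII", "M.TaqI"}),
    "D": frozenset({"M.HhaI", "M.FokI"}),
}
forest = core_set_partition(toy)
```

In this toy case, strain A defines the first core set, B and C are added below it as they acquire further systems, and D, which lacks part of that core, starts a separate branch.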

Proteome comparison

We analysed the digitalized images (eight bit grey values, 200 ppi, ImageScanner, GE Healthcare) of Coomassie-stained 2DE gels previously obtained for the soluble proteome of 10 H. pylori strains (i.e. strains NUD-228/99, NUD-375/01, NUD-440/02, NUD-514/02, NUD-1094/03, NUD-1622/05, PUD-93/00, PUD-147/00, GC-B23/99 and GC-P3/92) (Vitoriano et al. 2011a). These were made by resolving bacterial protein extracts, prepared from the total biomass recovered from a grown plate (of each strain), by high resolution 2DE. Isoelectric focusing was run in 18 cm Immobiline DryStrips with a nonlinear wide-range pH gradient from 3 to 11 (GE Healthcare, Uppsala, Sweden) using an Ettan IPGphor 3 unit (GE Healthcare) for a total of 100 kVh, during which the voltage was gradually increased up to 5000 V for a total of 66 h. For the second dimension, equilibrated and blocked strips were applied onto 7–16% (w/v) gradient polyacrylamide gels and run overnight at 1 W per gel (Ettan DALTsix System; GE Healthcare). Now, computer-assisted analysis, that is, spot detection, gel-to-gel matching and comparative statistical analysis, was performed using the ImageMaster™ 2D Platinum 6.0 software (GE Healthcare) taking the NUD-228/99 2DE map as reference. The automatically detected spots and gel-to-gel matches were carefully checked and, whenever necessary, corrected to achieve reliable results. The intensities of Coomassie staining were normalized taking into account the standardized relative intensity volume of spots (%Vol, i.e. the volume of each spot over the volume of all spots in the gel) reflecting the protein spots' abundance. Differences in protein abundance among strains were statistically assessed by the two-sample t-test for n observations, where n is the number of analysed strains. Differences were considered statistically significant for P < 0·05.
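The normalization and comparison steps described above can be sketched in Python. This is a minimal sketch, not the ImageMaster procedure: the function names and the numeric inputs are hypothetical, and an equal-variance (pooled) form of the two-sample t-test is assumed because the text does not state which variant was used. Only the t statistic and its degrees of freedom are computed; the P value would then be read from the t distribution.

```python
import math
from statistics import mean, variance


def percent_volume(spot_volumes):
    """%Vol of each spot: its volume over the summed volume of all spots in the gel."""
    total = sum(spot_volumes.values())
    return {spot: 100.0 * vol / total for spot, vol in spot_volumes.items()}


def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic with pooled variance, plus degrees of freedom.

    Assumes equal variances; each group needs at least two observations.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical raw spot volumes from one gel.
gel = {"spot1": 50.0, "spot2": 150.0, "spot3": 300.0}
vols = percent_volume(gel)

# Hypothetical %Vol values of one spot in two groups of strains.
t_stat, dof = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Because %Vol is a within-gel ratio, it compensates for gel-to-gel differences in total staining before the groups are compared.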

Search for methyltransferases recognition sequences at the gene promoters

The promoter region comprised nucleotides between −100 and the primary transcriptional start site (TSS) (Sharma et al. 2010) of each gene of interest, that is, those encoding the proteins found to be differentially abundant in our analysis. The promoter region was searched for recognition sequences for the 28 used restriction endonucleases. This in silico analysis was carried out using the NEBcutter V2.0 software (Roberts et al. 2010) and the genome sequence of NUD-26695 deposited in the NCBI database and considering the TSSs defined by others (Roncarati et al. 2007; Sharma et al. 2010). Then, we checked the methylation status of each strain for the methyltransferases whose recognition sequences were present at those promoters and, as before, we considered a methylated status of ‘1’ for all entries (Table S1; Vale and Vitor 2007).
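The promoter search was performed with NEBcutter; the underlying idea can nevertheless be sketched in a few lines of Python. In this sketch the function name `find_sites`, the reduced IUPAC table and the promoter sequence are hypothetical, and only three of the 28 recognition sequences are listed (GATC, TGCA and GCNGC, as given in the Results). All three are palindromic, so scanning one strand is sufficient here; non-palindromic sites would also require the reverse complement.

```python
import re

# Subset of IUPAC nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
}

# Recognition sequences quoted in the Results for three of the enzymes.
RECOGNITION_SITES = {"DpnII": "GATC", "HpyCH4V": "TGCA", "Fnu4HI": "GCNGC"}


def find_sites(promoter, recognition_sites):
    """Return 0-based start offsets of each recognition site in the promoter."""
    sequence = promoter.upper()
    hits = {}
    for enzyme, site in recognition_sites.items():
        # A lookahead lets overlapping occurrences be reported as well.
        pattern = "(?=(" + "".join(IUPAC[base] for base in site) + "))"
        hits[enzyme] = [m.start() for m in re.finditer(pattern, sequence)]
    return hits


# Hypothetical promoter fragment, not taken from the NUD-26695 genome.
promoter = "TTGATCAAGCAGCTTTGCATT"
hits = find_sites(promoter, RECOGNITION_SITES)
```

An offset i in a promoter of length L corresponds to position i − L relative to the TSS, so the hits can be mapped back onto the −100 to TSS window.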

Results

Cluster analysis of restriction enzyme susceptibility

Cluster analysis of the genome-methylation status data of the 30 epidemiologically unlinked Portuguese H. pylori strains (selected from unrelated Portuguese Caucasian patients), plus the sequenced NUD-26695 strain (Tomb et al. 1997), using the MCRM algorithm, produced 10 dendrograms (Fig. 1a shows the most frequent one) with a similarity level of 14·3% (k/nM = 4/28 × 100, where k is the number of methyltransferases commonly expressed by all strains and nM is the number of screened methyltransferases). Of the four transversally expressed methyltransferases that underlie this value, two (M.HhaI methylating the sequence Gm5CGC and M.NaeI methylating the sequence GCCGGC in an undetermined cytosine base) were previously described as possibly conserved in H. pylori (Vale and Vitor 2007) and the other two (M.BssHII methylating the sequence GCGCGC in an undetermined cytosine-5 base and M.BseRI methylating the sequence GAGGAG in an undetermined adenosine-N6 base) as highly frequent methyltransferases in this species (Vale and Vitor 2007; Vale et al. 2009). Also, the highly frequent methyltransferase M.NlaIII found in this study was previously considered conserved by others (Xu et al. 1997). At this similarity level, the discriminatory capacity of the method was less than 5% (nj/N = 0·03, i.e. about 3%), as recommended by Maslow et al. (1993). Furthermore, both the Simpson's index (Maslow et al. 1993) and typeability reached values of 100%, revealing that all strains were considered different, in agreement with the bacterial quasispecies model (Israel et al. 2001; Salama et al. 2007). The dendrograms produced with the MCRM algorithm applied to data on the genome-methylation status of the 30 epidemiologically unlinked Portuguese H. pylori strains included in the study (Fig. 1a) thus confirmed the good discriminatory power of this typing method.
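The figures quoted above can be checked with a small Python sketch. The function names and the toy matrix are hypothetical. Simpson's index of diversity is written in its usual form, D = 1 − Σ nj(nj − 1) / (N(N − 1)), and each strain is assumed to form its own type (nj = 1), which is consistent with the reported 100% typeability and Simpson's index.

```python
def similarity_level(matrix):
    """k/nM as a percentage: systems scored '1' in every strain over systems screened."""
    n_m = len(matrix[0])
    k = sum(all(row[j] == 1 for row in matrix) for j in range(n_m))
    return 100.0 * k / n_m


def strain_frequency(n_j, n_total):
    """Strain frequency per cluster, nj/N, as a percentage."""
    return 100.0 * n_j / n_total


def simpson_diversity(type_sizes):
    """Simpson's index of diversity: 1 - sum(nj * (nj - 1)) / (N * (N - 1))."""
    n = sum(type_sizes)
    return 1.0 - sum(s * (s - 1) for s in type_sizes) / (n * (n - 1))


# Hypothetical 3-strain x 4-enzyme matrix (1 = protected, 0 = hydrolysed).
toy_matrix = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
toy_similarity = similarity_level(toy_matrix)    # two of four columns are all '1'

reported_similarity = round(100.0 * 4 / 28, 1)   # k = 4, nM = 28
global_frequency = strain_frequency(1, 31)       # 31 strains, assuming nj = 1
restricted_frequency = strain_frequency(1, 11)   # 11 strains, assuming nj = 1
diversity = simpson_diversity([1] * 31)          # every strain its own type
```

Under these assumptions the sketch gives 14·3% for k/nM, about 3·2% and 9·1% for nj/N with 31 and 11 strains, and a Simpson's index of 1 (100%).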

Figure 1.

Cluster analysis. (a) Global dendrogram produced by the MCRM algorithm for the 31 Helicobacter pylori strains analysed (30 genetically and epidemiologically unlinked Portuguese clinical isolates, plus the reference strain NUD-26695); (b) Restricted dendrogram produced by the MCRM algorithm for 11 of those strains (eight belonging to the three main clusters of the Global dendrogram, C1, C2 and C3; two being outside of any cluster (OG cluster); and the reference strain NUD-26695). For each strain, the cagA genotype (+, strain containing the gene; −, strain lacking the gene); the vacA genotype (s1, toxic form of the gene; and s2, nontoxic form of the gene); and the associated disease by the prefix NUD (Nonulcer dyspepsia), PUD (Peptic ulcer disease), or GC (Gastric Cancer) in the strain's name, are indicated.

Taking the cut-off of k/nM = 0·35, we observed the presence of three main clusters of strains, termed C1, C2 and C3, in all the generated dendrograms (Fig. 1a). As expected (Vale et al. 2009), the NUD-26695 strain was not included in any of these clusters owing to its different geographical origin. Moreover, clusters did not reflect the association of the strain with a specific disease; instead, nonulcer dyspepsia-, peptic ulcer disease- and gastric cancer-associated H. pylori strains appeared mixed throughout the dendrogram. A total of 11 strains, eight belonging to the three main clusters (strains NUD-375/01 and PUD-93/00 from C1, strains NUD-1094/03 and GC-P3/92 from C2, and strains NUD-228/99, NUD-440/02, NUD-1622/05 and GC-B23/99 from C3), two being outside of any cluster [NUD-514/02 and PUD-147/00, termed outsiders group (OG)] and the reference strain NUD-26695, were analysed once again with the MCRM algorithm (Fig. 1b). The previous cluster pattern was maintained in the 10 newly produced dendrograms (Fig. 1b shows the most frequent one), for which the discriminatory capacity of the method was 9·1%, and, as before, both the Simpson's index and typeability were 100%.

Proteome analysis according to methylation status

Aiming to understand the influence of the protein spots' abundance pattern on dendrogram organization, we evaluated the consistency of the soluble proteome of the strains within each cluster. Given the complexity and high costs of the technology, proteomic studies are, in general, limited with respect to sample size (Smith et al. 2007; Franco et al. 2009; Zhang et al. 2009, 2011). Our study is not an exception, and therefore, to achieve our goal, we sought to minimize this issue by selecting for the analysis the previously obtained 2DE maps (Vitoriano et al. 2011a) of the group of 11 epidemiologically and genetically unlinked H. pylori strains used in the restricted dendrogram (Fig. S1). Of approximately 300 protein spots detected on each 2DE map (Vitoriano et al. 2011a), we selected for the analysis only those (a total of 70) for which the gel-to-gel matching, that is, the identification of the correspondent spots over at least 50% of the 2DE gels, was previously confirmed (Vitoriano et al. 2011a). Computer-assisted analysis was performed through the statistical evaluation of changes in the %Vol of each matched protein spot, with the strains grouped according to cluster organization (C1, C2, C3 and OG). Among the 70 analysed protein spots, 16 showed a differential abundance (P < 0·05) among the four groups (nine in C1 vs C2; eight in C2 vs C3; five in C1 vs C3; and four, two and three in OG vs C1, C2 and C3, respectively) (Fig. S1 and Table 2). Strains presenting the highest number of differentially abundant proteins were those belonging to the more closely related clusters (i.e. C2 and C3). The comparative analysis of the 2DE maps of strains belonging to clusters C1, C2 and C3 with those of strains in the OG group showed that, despite their distance in the dendrogram, the strains of the OG group do not differ greatly from one another in terms of abundance of the 70 protein spots analysed.
The identification of some of these protein spots was confirmed by previously reported MS data, obtained from these 2DE gels (Vitoriano et al. 2011a), while others were identified by comparison with the 2DE database (Jungblut et al. 2010, Jungblut et al. 2012). Overall, differentially abundant proteins include the following: five unknown protein spots (spots 4, 5, 7, 10, and 14); heat shock protein 70 (DnaK, spot 1), involved in protein folding; flagellin A (FlaA, spot 2) and the hypothetical protein HP0958 (HP0958, spot 9), involved in bacterial motility; one species of glutamine synthetase (GlnA, spot 3), inorganic pyrophosphatase (Ppa, spot 12) and flavodoxin A (FldA, spot 13), involved in bacterial metabolic pathways; the chemoreceptor haemolysin secretion protein (HylB, spot 6); one species of the translation elongation factor EF-Tu (EF-Tu, spot 8) and the 50S ribosomal protein L7/L12 (Rpl7/L12, spot 15), involved in the translation process; one subunit of alkyl hydroperoxide reductase (AhpC, spot 11), involved in cell detoxification; and the virulence factor neutrophil-activating protein (NapA, spot 16). The previously reported reproducibility and consistency of these 2DE maps (Vitoriano et al. 2011a) can be confirmed in Fig. 2, where the gel sector around spots 1 and 2 is presented for the 11 strains analysed. No methyltransferase was identified in this group of proteins. However, as the basis of our cluster analysis was the differential expression of these enzymes among strains, it is reasonable to think that these proteins were excluded from the group of the 70 analysed spots, especially considering that the 2DE database (Jungblut et al. 2010, Jungblut et al. 2012) does not contain any of the tested methyltransferases. Nevertheless, we cannot rule out the possibility that one of those five unknown protein spots is a methyltransferase.

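Table 2. Differentially abundant protein spots

Protein spots found to be differentially abundant among 11 heterogeneous H. pylori strains clustered based on their genomic methylation status (clusters C1, C2 and C3 and group OG). For each spot, the entries also show the relative abundance of the same protein spot when the strains are regrouped according to their cagA genotype and associated gastric disease, and whether the variation reflects cagA genotype or gastric disease. For each spot, the average %Vol ± SD, within the respective class, is indicated. ↑ spots presenting higher average %Vol in the former class than in the second one. ↓ spots presenting lower average %Vol in the former class than in the second one. The ORF indicates the correspondent ORF in the NUD-26695 strain.

  (a) Statistically significant variations (P < 0·05).
  (b) Identification based on the standard H. pylori 2DE gel of the database (Jungblut et al. 2010, Jungblut et al. 2012).
  (c) Identified in (Vitoriano et al. 2011a,b).
  (d) Identified in (Vitoriano et al. 2011b).

Spot 1. DnaK (b); ORF HP0109
  Genomic methylation status, %Vol ± SD: OG 0·743 ± 0·108; C1 0·953 ± 0·287; C2 0·767 ± 0·339; C3 1·104 ± 0·221
  Variation (P): OG vs C3 ↓ (2·17E-02 (a))
  Variation reflects cagA genotype or gastric disease? No

Spot 2. FlaA (c); ORF HP0601
  Genomic methylation status, %Vol ± SD: OG 2·183 ± 0·426; C1 2·280 ± 0·783; C2 3·203 ± 0·268; C3 3·580 ± 1·192
  Variation (P): C1 vs C3 ↓ (7·19E-02); C1 vs C2-3 ↓ (4·97E-02); OG vs C2 ↓ (2·75E-02 (a)); OG vs C3 ↓ (6·48E-02)
  Variation reflects cagA genotype or gastric disease? Disease
  Gastric disease, %Vol ± SD: NUD 3·194 ± 1·808; PUD 2·443 ± 0·692; GC 3·824 ± 0·666
  Variation (P): PUD vs GC ↓ (4·71E-02 (a))

Spot 3. GlnA (b); ORF HP0512
  Genomic methylation status, %Vol ± SD: OG 0·172 ± 0·057; C1 0·166 ± 0·022; C2 0·294 ± 0·105; C3 0·139 ± 0·044
  Variation (P): C2 vs C3 ↑ (2·07E-02 (a))
  Variation reflects cagA genotype or gastric disease? Disease
  Gastric disease, %Vol ± SD: NUD 0·159 ± 0·055; PUD 0·163 ± 0·033; GC 0·316 ± 0·112
  Variation (P): NUD vs GC ↓ (1·16E-02 (a)); PUD vs GC ↓ (8·15E-02)

Spot 4. Unknown
  Genomic methylation status, %Vol ± SD: OG 0·358 ± 0·064; C1 0·242 ± 0·094; C2 0·450 ± 0·164; C3 0·220 ± 0·077
  Variation (P): C2 vs C3 ↑ (4·40E-02 (a)); C1 vs C2 ↓ (7·32E-02); OG vs C3 ↑ (3·90E-02 (a))
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ 0·228 ± 0·092; cagA− 0·348 ± 0·142; P = 9·55E-02

Spot 5. Unknown
  Genomic methylation status, %Vol ± SD: OG 0·222 ± 0·038; C1 0·140 ± 0·098; C2 0·331 ± 0·057; C3 0·224 ± 0·046
  Variation (P): C2 vs C3 ↑ (1·90E-02 (a)); C1 vs C2 ↓ (4·18E-02 (a)); C1 vs C2-3 ↓ (4·87E-02 (a)); OG vs C2 ↓ (3·28E-02 (a))
  Variation reflects cagA genotype or gastric disease? cagA/disease
  cagA genotype, %Vol ± SD: cagA+ 0·183 ± 0·084; cagA− 0·261 ± 0·073; P = 8·33E-02
  Gastric disease, %Vol ± SD: NUD 0·218 ± 0·068; PUD 0·197 ± 0·085; GC 0·380 ± 0·094
  Variation (P): NUD vs GC ↓ (5·87E-03 (a)); PUD vs GC ↓ (4·59E-02 (a))

Spot 6. HylB (b); ORF HP0599
  Genomic methylation status, %Vol ± SD: OG 0·413 ± 0·123; C1 0·168 ± 0·089; C2 0·472 ± 0·083; C3 0·458 ± 0·160
  Variation (P): C1 vs C2 ↓ (4·97E-03 (a)); C1 vs C3 ↓ (1·20E-02 (a)); C1 vs C2-3 ↓ (2·40E-03 (a)); OG vs C1 ↑ (3·28E-02 (a))
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ 0·266 ± 0·123; cagA− 0·509 ± 0·172; P = 3·70E-03 (a)

Spot 7. Unknown
  Genomic methylation status, %Vol ± SD: OG 0·245 ± 0·031; C1 0·331 ± 0·100; C2 0·291 ± 0·083; C3 0·173 ± 0·062
  Variation (P): C2 vs C3 ↑ (3·99E-02 (a)); C1 vs C3 ↑ (4·76E-02 (a))
  Variation reflects cagA genotype or gastric disease? No

Spot 8. EF-Tu (a); ORF HP1205
  Genomic methylation status, %Vol ± SD: OG 0·158 ± 0·039; C1 0·035 ± 0·043; C2 0·132 ± 0·036; C3 0·104 ± 0·060
  Variation (P): C1 vs C2 ↓ (2·95E-02 (a)); C1 vs C3 ↓ (6·34E-02); C1 vs C2-3 ↓ (2·30E-02 (a)); OG vs C1 ↑ (5·85E-03 (a))
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ 0·071 ± 0·063; cagA− 0·127 ± 0·057; P = 6·50E-02

Spot 9. HP0958 (d); ORF HP0958
  Genomic methylation status, %Vol ± SD: OG 0·247 ± 0·282; C1 0·026 ± 0·052; C2 0·463 ± 0·165; C3 0·227 ± 0·138
  Variation (P): C2 vs C3 ↑ (4·70E-02 (a)); C1 vs C2 ↓ (1·70E-03 (a)); C1 vs C3 ↓ (1·75E-02 (a)); C1 vs C2-3 ↓ (6·98E-03 (a))
  Variation reflects cagA genotype or gastric disease? cagA/disease
  cagA genotype, %Vol ± SD: cagA+ 0·116 ± 0·136; cagA− 0·328 ± 0·236; P = 3·79E-02 (a)
  Gastric disease, %Vol ± SD: NUD 0·151 ± 0·154; PUD 0·257 ± 0·277; GC 0·432 ± 0·202
  Variation (P): NUD vs GC ↓ (1·68E-02 (a))

Spot 10. Unknown
  Genomic methylation status, %Vol ± SD: OG 0·119 ± 0·045; C1 Absent; C2 0·099 ± 0·024; C3 0·051 ± 0·064
  Variation (P): C1 vs C2 ↓ (1·93E-04 (a)); C1 vs C2-3 ↓ (3·89E-02 (a)); OG vs C1 ↑ (1·17E-03 (a))
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ Absent; cagA− 0·113 ± 0·039; P = 2·14E-07 (a)

Spot 11. AhpC (c); ORF HP1563
  Genomic methylation status, %Vol ± SD: OG 4·060 ± 1·191; C1 3·279 ± 1·146; C2 3·407 ± 0·850; C3 5·271 ± 1·254
  Variation (P): C2 vs C3 ↓ (3·44E-02 (a)); C1 vs C3 ↓ (3·51E-02 (a))
  Variation reflects cagA genotype or gastric disease? Disease
  Gastric disease, %Vol ± SD: NUD 4·652 ± 1·410; PUD 2·865 ± 0·815; GC 4·465 ± 1·138
  Variation (P): NUD vs PUD ↑ (4·10E-02 (a)); PUD vs GC ↓ (9·50E-02)

Spot 12. Ppa (b); ORF HP0620
  Genomic methylation status, %Vol ± SD: OG 0·209 ± 0·033; C1 0·450 ± 0·156; C2 0·196 ± 0·113; C3 0·212 ± 0·093
  Variation (P): C1 vs C2 ↑ (6·23E-02); C1 vs C3 ↑ (3·57E-02 (a)); C1 vs C2-3 ↑ (9·85E-03 (a))
  Variation reflects cagA genotype or gastric disease? cagA/disease
  cagA genotype, %Vol ± SD: cagA+ 0·377 ± 0·159; cagA− 0·186 ± 0·083; P = 6·31E-03 (a)
  Gastric disease, %Vol ± SD: NUD 0·172 ± 0·062; PUD 0·606 ± 0·187; GC 0·457 ± 0·262
  Variation (P): NUD vs PUD ↓ (1·33E-05 (a)); NUD vs GC ↓ (5·74E-03 (a))

Spot 13. FldA (d); ORF HP1161
  Genomic methylation status, %Vol ± SD: OG 0·383; C1 2·206 ± 0·898; C2 0·239 ± 0·191; C3 1·276 ± 0·715
  Variation (P): C2 vs C3 ↓ (5·78E-02); C1 vs C2 ↑ (1·76E-02 (a)); C1 vs C3 ↑ (9·88E-02); C1 vs C2-3 ↑ (2·26E-02 (a))
  Variation reflects cagA genotype or gastric disease? No

Spot 14. Unknown
  Genomic methylation status, %Vol ± SD: OG 0·422 ± 0·289; C1 0·604 ± 0·139; C2 0·227 ± 0·139; C3 0·538 ± 0·356
  Variation (P): C1 vs C2 ↑ (9·06E-03 (a))
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ 0·638 ± 0·292; cagA− 0·331 ± 0·220; P = 2·05E-02 (a)
  Gastric disease, %Vol ± SD: NUD 0·355 ± 0·199; PUD 0·577 ± 0·248; GC 0·704 ± 0·393
  Variation (P): NUD vs GC ↓ (4·83E-02 (a))

Spot 15. Rpl7/L12 (a); ORF HP1199
  Genomic methylation status, %Vol ± SD: OG 0·374 ± 0·198; C1 1·298 ± 0·277; C2 0·405 ± 0·345; C3 0·967 ± 0·231
  Variation (P): C2 vs C3 ↓ (3·80E-02 (a)); C1 vs C2 ↑ (1·31E-02 (a)); C1 vs C3 ↑ (8·20E-02); C1 vs C2-3 ↑ (2·92E-02 (a)); OG vs C1 ↓ (4·68E-03 (a)); OG vs C3 ↓ (1·23E-02 (a))
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ 1·150 ± 0·322; cagA− 0·534 ± 0·352; P = 2·98E-03 (a)

Spot 16. NapA (a); ORF HP0243
  Genomic methylation status, %Vol ± SD: OG 1·306 ± 0·358; C1 2·404 ± 0·933; C2 0·956 ± 0·346; C3 1·818 ± 0·591
  Variation (P): C2 vs C3 ↓ (3·38E-02 (a)); C1 vs C2 ↑ (3·55E-02 (a)); C1 vs C2-3 ↑ (5·83E-02); OG vs C1 ↓ (9·04E-02)
  Variation reflects cagA genotype or gastric disease? cagA
  cagA genotype, %Vol ± SD: cagA+ 2·095 ± 0·941; cagA− 1·397 ± 0·508; P = 5·19E-02

Figure 2.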

Sector of the Coomassie-stained 2DE maps of the 11 Helicobacter pylori strains studied (10 Portuguese genetically and epidemiologically unlinked strains and the reference strain Hp NUD-26695). The indicated spot numbers (1 and 2) are the same as those used in Table 2. The full image of each of these 2DE maps was previously reported (Vitoriano et al. 2011a).

To verify the relationship between C2 and C3, the more closely related clusters, we next set the strains of both clusters in the same group (C2-3) and compared it with C1. Eight protein spots (spots 5, 6, 8, 9, 10, 12, 13 and 15) were shown to be differentially abundant between these groups (P < 0·05) (Fig. 3). From these, the abundance of spots 6, 8, 10, 12 and 13 was consistent between C2 and C3, but different from C1 (Fig. 3). HylB (spot 6) and EF-Tu (spot 8) were more abundant in C2-3, while Ppa (spot 12) and FldA (spot 13) were less abundant in C2-3 compared with C1. Interestingly, the unknown protein (spot 10) was not detected in C1. However, for the remaining spots (spots 5, 9 and 15), a differential abundance was still observed when C2 was compared with C3 (Table 2).

Figure 3.

Comparative analysis of the abundance level of the 16 protein spots that were differentially abundant between C1, C2, C3 and OG. Strains were grouped into five classes (OG, C1, C2, C3 and C2-3). For each spot, column bars indicate the average of %Vol ± SD. Brackets indicate significantly different values within the pair (P < 0·05).

Methyltransferase recognition sequences at the promoters

To evaluate the influence, if any, of the tested restriction and modification systems on the abundance of the 11 identified proteins (Table 2), we searched the promoters of their encoding genes for the recognition sequences of the 28 restriction endonucleases used (Table 3). For this in silico analysis, we considered the genome sequence of the NUD-26695 strain deposited in the NCBI database and the TSSs determined by others (Roncarati et al. 2007; Sharma et al. 2010). Table 3 and Fig. 4a show the presence of the M.DpnII recognition site (5′-GATC-3′) at the promoter of the dnaK operon of strain NUD-26695. Because its genome was efficiently hydrolysed by DpnII (Table S1; Vale et al. 2009), PUD-147/00 (cluster OG, Fig. 1b) was the only strain unmethylated at 5′-GATC-3′. Interestingly, among the 11 H. pylori strains studied, this one presented the lowest abundance of DnaK (spot 1). This analysis suggests that the lack of methylation at the 5′-GATC-3′ sequence in the promoter of the dnaK operon leads to a decrease in DnaK abundance. Similar results were found for GlnA (spot 3), whose encoding gene has a recognition sequence for M.HpyCH4V (5′-TGCA-3′) at its promoter (Table 3, Fig. 4b). Again, the H. pylori strains unmethylated at 5′-TGCA-3′ (Table S1; Vale et al. 2009), namely PUD-147/00 (cluster OG, Fig. 1b), NUD-375/01 (cluster C1, Fig. 1b) and NUD-1622/05 (cluster C3, Fig. 1b), presented a lower abundance of GlnA compared with the other eight methylated strains. In contrast, methylation by M.Fnu4HI at the 5′-GCNGC-3′ sequence, found at the promoter of the hylB operon in the NUD-26695 genome (Table 3, Fig. 4c), apparently down-regulates the abundance of HylB (spot 6). In fact, the abundance of this protein is significantly lower (P < 0·05) in strains NUD-375/01 (cluster C1, Fig. 1b), PUD-93/00 (cluster C1, Fig. 1b) and NUD-1622/05 (cluster C3, Fig. 1b), all of which presumably carry this promoter sequence in its methylated form. Although interesting, these correlations need further experiments to be fully established.
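The in silico search described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in this study: the function names and the promoter fragment are hypothetical, and only the recognition sequences (with the degenerate codes N and W defined in the footnote of Table 3) are taken from the text. Only the given strand is scanned, which is sufficient for palindromic sites such as 5′-GATC-3′ or 5′-TGCA-3′; nonpalindromic sites would also require a search of the reverse complement.

```python
import re

# Degenerate nucleotide codes used in Table 3 (N = A, T, C or G; W = A or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def site_regex(recognition_seq):
    """Translate a recognition sequence such as 'GCNGC' into a regex pattern."""
    return "".join(IUPAC[base] for base in recognition_seq.upper())


def find_sites(promoter, recognition_seq):
    """Return the 0-based start positions of all matches, overlaps included."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + site_regex(recognition_seq) + "))")
    return [m.start() for m in pattern.finditer(promoter.upper())]


# Hypothetical promoter fragment for illustration only;
# it is NOT the real dnaK, glnA or hylB promoter sequence.
toy_promoter = "TTGATCAAGCAGCTTTGCATATAAT"

# Recognition sequences quoted in the text and in Table 3.
sites = {
    "DpnII": "GATC",
    "HpyCH4V": "TGCA",
    "Fnu4HI": "GCNGC",
    "Hpy99I": "CGWCG",
}

hits = {enzyme: find_sites(toy_promoter, seq) for enzyme, seq in sites.items()}
for enzyme, positions in hits.items():
    # '+' / '-' mirrors the presence/absence notation of Table 3.
    print(enzyme, "+" if positions else "-", positions)
```

In a real analysis, the scanned window would be the promoter region upstream of the experimentally determined TSS, and positions would be reported relative to the TSS (position +1), as in Fig. 4.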

Table 3. Restriction endonuclease (REase) (and respective methyltransferase) recognition sequences at promoter sequences in the NUD-26695 genome

REase | Recognition sequence (5′→3′) | Genes of the strain NUD-26695: dnaK (HP0109); flaA (HP0601); glnA (HP0512); hylB (HP0599); EF-Tu (HP1205); HP0958; ahpC (HP1563); ppa (HP0620); fldA (HP1161); rpl7/l12 (HP1203); napA (HP0243)

AseI | ATTATT
BseRI | GAGGAG(N10)
BssHII | GCGCGC
BssKI | CCNGG
BstUI | CGCG | +
DdeI | CTNAG | +
DpnI | GmATC | ++
DpnII | GATC | ++
EagI | CGGCCG
FauI | CCCGC(N4) | +
Fnu4HI | GCNGC | +
FokI | GGATG(N9)
HaeIII | GGmCC
HhaI | GCGC | +++
HpaII | CmCGG
Hpy188I | TCNGA
Hpy188III | TCNNGA | +
Hpy99I | CGWCG
HpyCH4III | ACNGT
HpyCH4IV | ACGT
HpyCH4V | TGCA | ++
MspI | mCCGG
NaeI | GCCGGC
NlaIII | CATG | +
Sau3AI | GATC | ++
Sau96I | GGNCC
ScrFI | CCNGG
TaqI | TCGmA

+, indicates the presence of the recognition sequence; −, indicates the absence of the recognition sequence; N designates A, T, C or G nucleotides; W designates A or T nucleotides; m indicates that the methyltransferase/restriction endonuclease system only recognizes that sequence if the subsequent nucleotide is methylated.
Figure 4.

Analysis of the promoter regions of dnaK, glnA and hylB. Boxes indicate the sequences 5′-GATC-3′ (specific for M.DpnII, Dam methylation), 5′-TGCA-3′ (specific for M.HpyCH4V) and 5′-GCNGC-3′ (specific for M.Fnu4HI) at the analysed regions of the promoters of dnaK (a), glnA (b) and hylB (c), respectively. For dnaK, recognition sites were found in the promoter region, which comprises 156 nucleotides upstream of the primary TSS (Roncarati et al. 2007; Sharma et al. 2010). The TSSs were determined according to Sharma et al. (2010). The numbers refer to the positions with respect to the TSS (position +1). The start codon is in boldface, and the Pribnow box (tgnTAtaAT) is underlined.

Proteome analysis according to cagA genotype

Interestingly, the five H. pylori strains in cluster C1 in the global dendrogram were cagA-positive (Fig. 1a), while strains in C2 and in OG in the restricted dendrogram were all cagA-negative (Fig. 1b). To clarify the influence of the cagA genotype on the abundance of those 16 protein spots, the 2DE maps of the 11 H. pylori strains were regrouped into two classes (cagA-positive and cagA-negative) and re-evaluated in terms of changes in the %Vol of their spots. Statistical analysis revealed that, of those 16 protein spots, six (spots 6, 9, 10, 12, 14 and 15) were differentially abundant (P < 0·05) and four others (spots 4, 5, 8 and 16) showed a tendency for differential abundance (P < 0·1) between these two classes (Fig. 5a, Table 2). All these proteins showed the same pattern of abundance whether comparing cagA-positive with cagA-negative strains, cluster C1 with C2, or C1 with OG (Table 2). Indeed, cagA-positive 2DE maps, including the ones from C1, presented significantly lower levels of HylB (spot 6) and of hypothetical protein HP0958 (spot 9), and higher levels of Ppa (spot 12), spot 14 (unknown protein) and Rpl7/l12 (spot 15). Moreover, spot 10 (unknown protein) was only detected in cagA-negative 2DE maps, including the ones from C2, C3 and OG. For these proteins, it is likely that the differences in abundance detected when comparing the clusters are related more to the cagA genotype of the strains than to the clustering itself.
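The regrouping step can be sketched in Python as follows. This is only an illustration under stated assumptions: the %Vol values are hypothetical (they are not the measured data behind Table 2), and the statistical test used in this study is not restated in this section, so an exact two-sided permutation test on the difference of class means is used here as a stand-in. The function names are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def summarize(values):
    """Mean and sample SD, the form in which %Vol is reported (mean ± SD)."""
    return mean(values), stdev(values)


def permutation_p(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means.

    Every way of relabelling the pooled values into groups of the original
    sizes is enumerated; the P value is the fraction of relabellings whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical %Vol values of one spot in 11 strains, regrouped by genotype.
cag_pos = [1.4, 1.1, 0.9, 1.5, 1.2]
cag_neg = [0.3, 0.6, 0.2, 0.8, 0.5, 0.4]

pos_mean, pos_sd = summarize(cag_pos)
neg_mean, neg_sd = summarize(cag_neg)
p_value = permutation_p(cag_pos, cag_neg)

print(f"cagA-positive: {pos_mean:.3f} ± {pos_sd:.3f}")
print(f"cagA-negative: {neg_mean:.3f} ± {neg_sd:.3f}")
print(f"P = {p_value:.4f}",
      "(differentially abundant)" if p_value < 0.05 else "(not significant)")
```

With only 11 strains, exhaustive enumeration of the relabellings is cheap (462 splits for groups of five and six), which is why an exact test rather than a sampled one is sketched here.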

Figure 5.

Comparative analysis of the abundance level of protein spots in the Helicobacter pylori strains that were shown to be differentially abundant (P < 0·05) according to: (a) the cagA genotype of the strain (cagA-negative □; cagA-positive ■); and (b) the disease with which each strain was associated (NUD, PUD and GC). For each spot, column bars indicate the average %Vol ± SD. Brackets indicate significantly different values within the pair (P < 0·05).

Proteome analysis according to associated gastric disease

We have previously demonstrated that clinical isolates of H. pylori from the same geographical region but associated with different gastric diseases present differences in the abundance of some proteins (Vitoriano et al. 2011b). Therefore, although the clusters obtained showed no association with gastric disease, our next step was to compare the abundance of those 16 protein spots by having the 2DE maps of the 11 H. pylori strains reorganized into three classes according to their associated gastric disease, that is, nonulcer dyspepsia, peptic ulcer disease and gastric cancer classes. Seven of those protein spots (spots 2, 3, 5, 9, 11, 12 and 14) were shown to be differentially abundant among these three classes (P < 0·05) (Fig. 5b, Table 2). For these seven proteins, it is likely that the variation in abundance detected when comparing C1, C2, C3 and OG may be somehow related to the associated gastric disease of each strain. In fact, FlaA (spot 2) was significantly more abundant in gastric cancer-2DE maps compared with peptic ulcer disease-2DE maps and thus in C2 and C3 (clusters containing strains associated with gastric cancer) compared with C1 or OG (clusters containing strains associated with peptic ulcer disease). The two protein spots that were more abundant in gastric cancer-2DE maps compared with nonulcer dyspepsia and peptic ulcer disease-2DE maps, that is, one species of GlnA (spot 3) and an unknown protein spot (spot 5), were also more abundant in C2. In addition, Ppa (spot 12), hypothetical protein HP0958 (spot 9) and the unknown spot 14 were found to be more abundant in 2DE maps of strains associated with more severe diseases (peptic ulcer disease and gastric cancer). One subunit of AhpC (spot 11) was also shown to be less abundant in peptic ulcer disease-2DE maps when compared with nonulcer dyspepsia- and gastric cancer-2DE maps.

Discussion

The cluster analysis performed revealed the presence of three main clusters (C1, C2 and C3) in all generated dendrograms (Fig. 1a), suggesting that the divergence nodes are preserved and reinforcing the correctness of the topology at that branch. Eight strains belonging to the three main clusters, two strains whose branch was outside of any cluster, and the reference H. pylori strain NUD-26695 were analysed once again with the MCRM algorithm, originating dendrograms with the same pattern of clustering (Fig. 1b).

We moved forward to assess whether the genomic methylation typing of strains with common geographic origin reflects differences in their proteomes among clusters. In the light of the protein speciation concept (Patterson and Aebersold 2003; Jungblut et al. 2008), we compared the abundance of 70 protein spots matched through the 2DE maps for the 11 strains of the restricted dendrogram. This analysis revealed 16 protein species showing significant differences in their abundance (P < 0·05) among clusters (Table 2). Under these experimental conditions, and analysing only 70 matched proteins, the most distant clusters presented the lowest number of differentially abundant protein spots between them. Therefore, our data suggest that clusters based on genome-methylation status do not reflect proteome proximity among strains. Indeed, in the group of 16 differentially abundant protein spots, only five were suggestive of reflecting similarities between the proteomes of clustered strains, because their abundance was consistent between closer groups (C2 and C3) but different from that of more distant groups (C1 and OG). These were revealed when C2 and C3 were settled into the same group (C2-3) and compared with C1. These include the following: the chemosensor protein HylB (spot 6) and a species of the elongation factor EF-Tu (spot 8), which had similar abundance levels in C2 and C3 and were significantly less abundant in C1 (P < 0·05); the two metabolic enzymes Ppa (spot 12) and FldA (spot 13), which were found to be significantly less abundant in C2 and C3 compared with C1 (P < 0·05); and the unknown protein spot 10, which was not detected in the 2DE maps of the strains in C1.

Although our cluster analysis was based on the different expression of methyltransferases among strains, these enzymes were not included in the group of the 11 differentially abundant proteins identified (Table 2). This may be because methyltransferases correspond to: any of the five unknown proteins (spots 4, 5, 7, 10 and 14); proteins with amounts below the detection threshold; or proteins that fall outside the detection window of isoelectric point and molecular weight. However, the most plausible explanation is that the gel-to-gel matching technique used here, which led to the selection of 70 proteins from a total of 300, immediately excluded methyltransferases. In fact, our approach essentially selects proteins that are expressed in the majority of the tested strains, implying the exclusion of methyltransferases because their encoding genes are strain specific (Alm et al. 1999).

It is well established that within the same geographic region, H. pylori strains vary in their virulence, ranging from those that cause no symptoms in infected patients to those that cause severe gastric diseases (peptic ulcer disease and gastric cancer) (Vitoriano et al. 2011a,b). The fact that a greater number of differentially abundant protein spots was observed when cluster C1 (cagA-positive) was compared with C2 (cagA-negative) prompted us to analyse 2DE maps according to the cagA genotype of the strains. The CagA protein is an important H. pylori virulence factor considered to be a bacterial biomarker of strains associated with peptic ulcer disease and gastric cancer (Atherton 2006). This protein is coded by the cag pathogenicity island (PAI), which, besides cagA, includes the genes encoding a type IV secretion system. Once within the cytoplasm of the human gastric cells, CagA induces abnormal proliferation, disruption of tight junctions, cytoskeleton rearrangements and IL-8 secretion in the host cell (Wen and Moss 2009). Of the 16 protein spots that were differentially abundant among C1, C2, C3 and OG, 10 proved to be correlated with the cagA genotype of the strain, six of them with statistical significance. While HylB (spot 6) and the hypothetical protein HP0958 (spot 9) were less abundant (P < 0·05) in cagA-positive 2DE maps, Ppa (spot 12), Rpl7/l12 (spot 15) and the unknown protein spot 14 were more abundant (P < 0·05) in these strains (Fig. 5a). In addition, the unknown protein spot 10 was absent in cagA-positive 2DE maps. These results are consistent with those obtained when C1, composed of only cagA-positive H. pylori strains, was compared with C2 or OG (mainly cagA-negative strains). Although not reaching statistical significance (P < 0·1), the unknown protein spots 4 and 5 and the elongation factor EF-Tu (spot 8) were less abundant in cagA-positive 2DE maps. In contrast, NapA (spot 16), which in addition to its role in activation of the host immune system is involved in protecting bacteria under oxidative stress conditions (Wang et al. 2006b), was more abundant in cagA-positive 2DE maps (P < 0·1). Again, the same pattern of protein abundance was observed when comparing C1 with C2 or OG. Whether these differences result from natural selection during colonization by cagA-positive strains, or whether the abundance of these proteins is directly influenced by cagA as a regulator or by the CagA protein (effects that appear masked in the dendrogram), remains unknown, and further studies are necessary to clarify this.

Another way of looking at the virulence of the strain is to compare the strains grouped according to the gastric disease affecting the patient from whom they were isolated (Wu et al. 2008; Franco et al. 2009; Bernarde et al. 2010; Vitoriano et al. 2011a,b). Therefore, we analysed the pattern of abundance of those 16 protein spots, having the 11 selected H. pylori strains regrouped into nonulcer dyspepsia, peptic ulcer disease and gastric cancer classes. One species of GlnA (spot 3), a key enzyme in the metabolism of nitrogen assimilation (Marais et al. 1999), and the unknown protein spot 5 were significantly more abundant in gastric cancer-2DE maps (P < 0·05). This result is consistent with other studies, which showed that GlnA is abundantly immunodetected in the sera of patients with gastric cancer (Lin et al. 2006). In a recent study, we linked the virulence of the strain to amino acid metabolism (Vitoriano et al. 2011b). The hypothetical protein HP0958 (spot 9), Ppa (spot 12) and the unknown protein spot 14 were also more abundant in H. pylori strains associated with severe pathologies (peptic ulcer disease and gastric cancer). Corroborating the association between the cagA-positive genotype and the virulence of the strain, the latter two were more abundant in cagA-positive strains, as mentioned earlier (Table 2). In contrast, the hypothetical protein HP0958 (spot 9) was less abundant in cagA-positive strains, which is in agreement with our previous results (Vitoriano et al. 2011b). FlaA (spot 2), essential for bacterial motility (O'Toole et al. 2000), was significantly less abundant in the peptic ulcer disease-2DE maps than in those associated with gastric cancer. The same pattern of abundance was observed for one subunit of AhpC (spot 11), the most abundant antioxidant protein in H. pylori (Chuang et al. 2006; Wang et al. 2006a). These results are in agreement with the study that proposed this protein as a biomarker for gastric cancer (Bernardini et al. 2007) and with our previous study, which showed that levels of the AhpC-related antioxidant protein, catalase, are lower in strains associated with duodenal ulcer compared with nonulcer dyspepsia strains (Vitoriano et al. 2011b).

Taken together, of the 70 matched protein spots, 16 differ in abundance among the 2DE maps of the genomic methylation status-based clusters. However, differences in the abundance pattern of 13 of them appear to reflect differences in the virulence of the strains. The abundance pattern of the other three, that is, DnaK (spot 1), the unknown protein spot 7 and FldA (spot 13), although not correlated with the cagA genotype or gastric disease, may be correlated with cluster proximity or with other unknown factors not detected in the dendrogram. Proteome analyses performed after other genotyping methods, such as MLST, are expected to produce similar results; that is, the genotyping methods reflect genomic variability but not protein diversity. However, only future studies including total methylome and proteome analysis may confirm this.

In prokaryotes, DNA methylation occurs primarily within the context of restriction and modification systems, and in particular situations, as is the case of the Dam methyltransferase of enteric bacteria, it has been shown to exert a regulatory role on bacterial gene expression and virulence (Heithoff et al. 1999). The present work reinforces the idea that DNA methylation may also have a role in H. pylori gene regulation. In fact, our data suggest a correlation between the abundance of three of those 11 identified proteins and the methylation of the promoters of their encoding genes. Strains presumably methylated at the 5′-GATC-3′ and 5′-TGCA-3′ sequences within the promoters of dnaK and glnA, respectively, showed a higher abundance of these two proteins (spots 1 and 3). Others have shown that the expression of the dnaK operon is controlled by hpyIM, a gene encoding a methyltransferase specific for 5′-CATG-3′ sites (Donahue et al. 2002). Additionally, our data suggest that presumed methylation of the hylB promoter, at 5′-GCNGC-3′, leads to a lower abundance of HylB protein (spot 6). Future work will include the inactivation of these restriction and modification systems to fully establish these correlations.

Despite the typing power of genomic methylation, strains within methylation-based clusters do not share a particular proteome. Indeed, differences in protein abundance among strains seem to be more closely related to the cagA genotype of the strains or to their association with gastric diseases than to genome-methylation status. This is in line with the fact that, within the same geographic region, strains vary greatly in their virulence. Therefore, we conclude that the genomic methylation typing methods discriminate mostly based on differences in the nonexpressed genome of H. pylori strains. Notwithstanding, DnaK, GlnA and HylB protein abundance appears to be dictated by the methylation status of the respective gene promoter. Although the results obtained here need further confirmation, this represents a unique strategy for relating methylome and proteome in closely related bacterial strains. Further studies should include the determination of the methylation status of the gene promoters of the analysed strains, using for instance single-molecule real-time sequencing, which allows the determination of modifications in the DNA sequence. Indeed, the future of methylome analysis will be single-molecule real-time sequencing, which determines the modified bases as a part of the routine sequencing procedure. In a recent study, six bacterial genomes were resequenced and all predicted N6-methyladenine and N4-methylcytosine methyltransferases were detected unambiguously. In all six genomes, a number of new N6-methyladenine and N4-methylcytosine methylation patterns were discovered and the DNA methyltransferases responsible for those methylation patterns were assigned (Murray et al. 2012). This methodology cannot yet detect the 5-methylcytosine modification, but this limitation is expected to be overcome soon (R. Roberts, personal communication). Also, comparative proteomics using technologies more sensitive than classical 2DE, such as DIGE and SILAC (Harsha et al. 2008), alone or combined with 2DE, may be helpful in such future studies. Evaluation of protein species-specific regulation, that is, of the abundance of all protein species for a given expressed protein, which requires a broad MS identification of protein spots, may also bring important novelties to this field.

Acknowledgements and Disclosures

This work was supported by a PPCDT/SAL-IMI/57297/2004 research grant from the Fundação para a Ciência e a Tecnologia (FCT, Portugal) and by a Research Fellowship 2011 from the Sociedade Portuguesa de Gastrenterologia (Portugal). I. Vitoriano was recipient of SFRH/BD/38634/2007 (FCT) doctoral fellowship. Jorge Vítor's laboratory was funded by New England Biolabs, Inc. (USA). The authors thank two anonymous reviewers for their valuable contributions and Patrícia Fonseca for editorial support.

Ancillary