Present address: Department of Dermatology, Aarhus University Hospital, Aarhus, Denmark.
Evolution of the paralogous hap and iga genes in Haemophilus influenzae: evidence for a conserved hap pseudogene associated with microcolony formation in the recently diverged Haemophilus aegyptius and H. influenzae biogroup aegyptius
Article first published online: 9 DEC 2002
Volume 46, Issue 5, pages 1367–1380, December 2002
How to Cite
Kilian, M., Poulsen, K. and Lomholt, H. (2002), Evolution of the paralogous hap and iga genes in Haemophilus influenzae: evidence for a conserved hap pseudogene associated with microcolony formation in the recently diverged Haemophilus aegyptius and H. influenzae biogroup aegyptius. Molecular Microbiology, 46: 1367–1380. doi: 10.1046/j.1365-2958.2002.03254.x
- Issue published online: 9 DEC 2002
- Accepted 9 September, 2002.
Certain non-capsulate strains belonging to the Haemophilus influenzae/Haemophilus aegyptius complex show unusually high pathogenicity, but the evolutionary origin of these virulent phenotypes, termed H. influenzae biogroup aegyptius, is as yet unknown. The aim of the present study was to elucidate the mechanisms of evolution of two paralogous genes, hap and iga, which encode the adhesion and penetration Hap protein and the IgA1 protease respectively. Partial sequencing of hap and iga genes in a comprehensive collection of strains belonging to the H. influenzae/H. aegyptius complex revealed considerable genetic polymorphism and pronounced mosaic-like patterns in both genes, but no evidence of intrastrain recombination between the two genes. A conserved hap pseudogene was present in all strains of H. aegyptius and H. influenzae biogroup aegyptius, each of which constituted distinct subpopulations as revealed by phylogenetic analysis. There was no evidence for a second, functional copy of the hap gene in these strains. The perturbed expression of the Hap serine protease appears to be associated with the formation of elongated bacterial cells growing in chains and a distinct colonization pattern on conjunctival cells, previously termed microcolony formation. The fact that individual hap pseudogenes differed from the ancestral sequence by zero to two positions within a 1.5 kb stretch suggests that the silencing event happened ≈ 2000–11 000 years ago. Divergence of H. aegyptius and H. influenzae biogroup aegyptius occurred subsequent to this genetic event. The loss of Hap protein expression may be one of the genetic events that facilitated exploitation of the conjunctivae as a new niche.
Haemophilus influenzae is a Gram-negative bacterium exclusively associated with man. Members of this species are commensals of the upper respiratory tract particularly during childhood. Occasionally, resident strains may spread to the paranasal sinuses, the middle ear and the conjunctivae and cause local infection. However, particular virulence is associated with serotype b strains and strains belonging to the so-called biogroup aegyptius, both of which may cause invasive infections. H. influenzae serotype b was one of the three leading causes of bacterial meningitis until the implementation of vaccination in many countries, and H. influenzae biogroup aegyptius causes a septic paediatric disease known as Brazilian purpuric fever (BPF). This disease, first recognized in Brazil in 1984, is characteristically preceded by purulent conjunctivitis and develops into a fulminant sepsis with circulatory collapse and haemorrhagic lesions in the skin, and has a high mortality rate (Brazilian Purpuric Fever Study Group, 1987). Isolates from Brazilian BPF cases belong to a limited number of clones that are genetically distinct from isolates from an indistinguishable syndrome observed in Australia (McIntyre et al., 1987; Brenner et al., 1988; Wild et al., 1989; Musser and Selander, 1990; Tondella et al., 1995).
The species Haemophilus aegyptius was formally designated by Pittman and Davis in 1950 based on isolates from seasonal purulent conjunctivitis in Texas and is believed to represent the bacterium originally observed in Egypt by Koch (1883) in pus from eyes with conjunctivitis and cultured from cases of conjunctivitis in the USA by Weeks (1886) (hence the term ‘Koch–Weeks bacillus’). DNA homology studies reveal that H. aegyptius and H. influenzae are closely related, and it has been argued that they constitute a single species (Casin et al., 1986). Nevertheless, a number of phenotypic differences have been demonstrated including an elongated rod shape of H. aegyptius (Pittman and Davis, 1950; Mazloum et al., 1982), and the two taxa seem to differ with respect to pathogenic potential, although systematic evidence for this assumption is lacking. Since the discovery of BPF, H. influenzae biogroup aegyptius and H. aegyptius have often been treated as synonyms because of their mutual association with conjunctivitis, and the fact that they share a number of phenotypic traits (urease activity, inability to ferment xylose, lack of indole production, and lack of ornithine decarboxylase activity) (Brenner et al., 1988; Musser and Selander, 1990; St Geme et al., 1991). However, there is no direct genetic evidence to support the assumption that they are indistinguishable.
Although the polysaccharide capsule is crucial for the ability of H. influenzae serotype b to cause invasive disease, BPF case isolates are non-encapsulated, yet resistant to the bactericidal action of normal human sera (Porto et al., 1989). The underlying mechanism is unknown. A significant difference between the pathogenesis of BPF clone- and H. influenzae serotype b-associated disease is the port of entry into the bloodstream. Although the initial site of infection in BPF is the conjunctivae, serotype b strains colonize and invade through mucosal membranes of the upper respiratory tract. This apparent difference in tropism suggests that the bacteria use different mechanisms for adherence to and/or invasion of particular cell types.
Two virulence factors of H. influenzae, which are thought to be associated with adherence and invasion in different ways, are the IgA1 protease and the Hap protein. The two proteins are encoded by the paralogous iga and hap genes (Poulsen et al., 1989; St Geme et al., 1994). The IgA1 protease cleaves human IgA1 in the hinge region, which presumably enables the bacteria to evade the protective functions of this principal mediator of humoral immunity in the upper respiratory tract and in the eye (Kilian et al., 1996). The Hap (Haemophilus adhesion and penetration) protein was identified based on its ability to re-establish the adherence and epithelial cell penetration potential in a non-virulent laboratory strain (St Geme et al., 1994). Both proteins are serine proteases and belong to the so-called autotransporter family of proteins found in Gram-negative bacteria (Jose et al., 1995). All known proteins of this family are produced by mucosal pathogens, and most have been ascribed virulence properties; they include, among others, a number of proteins that interact with eukaryotic cells (Hendrixson et al., 1997). Like the IgA1 protease (Pohlner et al., 1987; Poulsen et al., 1989), Hap is synthesized as a precursor comprising a signal peptide at the N-terminus, followed by a serine protease domain (Haps) and a C-terminal domain (Hapβ) that forms a beta-barrel structure in the bacterial outer membrane. This pore enables translocation of the protease domain to the bacterial surface, where it gains proteolytic activity and eventually releases itself by autoproteolysis (Pohlner et al., 1987; Hendrixson et al., 1997). It has therefore been suggested that Hap protein has dual functions: an unprocessed cell-associated form mediating adherence to epithelial cells and a secreted serine protease with as yet unknown substrate specificity (St Geme et al., 1994; Hendrixson et al., 1997). 
Support for the former function comes from experiments reported by Hendrixson and St Geme (1998), in which a strain retaining uncleaved cell-associated Hap as a result of a point mutation in the active site serine showed enhanced adherence to and microcolony formation on epithelial cells in vitro.
Analysis of the predicted amino acid sequences of the two H. influenzae proteins, Hap and IgA1 protease, revealed an overall 30–35% identity and 51–55% similarity (St Geme et al., 1994), indicating that the genes encoding the two proteins originally emerged as a result of gene duplication in the H. influenzae genome. In the H. influenzae strain Rd genome, the two genes are separated by 0.77 Mb (Fleischmann et al., 1995). H. influenzae and H. aegyptius iga genes show significant genetic polymorphism, which is reflected in antigenic diversity and in two distinct cleavage specificities (Mulks et al., 1982; Kilian et al., 1983; Poulsen et al., 1992; Lomholt and Kilian, 1995). Previous studies have indicated that strains of H. aegyptius produce a protease that cleaves the Pro-231–Ser-232 peptide bond in the IgA1 hinge region (type 1 protease), and that the antigenically distinct protease of BPF isolates cleaves the Pro-235–Thr-236 peptide bond (type 2 protease). Strains of H. influenzae may express IgA1 proteases that cleave either of the two bonds, and numerous antigenic types are found (Mulks et al., 1982; Kilian et al., 1983; Carlone et al., 1989; Lomholt and Kilian, 1995). Pronounced mosaicism in iga genes encoding antigenically distinct IgA1 proteases suggests that recombination plays a significant role in generating this diversity (Poulsen et al., 1992). There is no information about sequence diversity and evolutionary patterns of the hap gene.
In the present study, we examined hap and iga genes in a comprehensive collection of strains belonging to the H. influenzae/H. aegyptius complex to elucidate the patterns of evolution of these paralogous genes compared with the overall population structure. We report that hap genes show pronounced mosaic-like patterns, but we found no evidence of recombination between the two paralogous genes. Notably, strains belonging to the cluster of isolates from Brazilian BPF and conjunctivitis and authentic strains of H. aegyptius from cases of conjunctivitis in Texas all harboured a conserved hap pseudogene, which is associated with microcolony formation on epithelial cells in vitro.
The hap gene sequence in a strain of the BPF case clone
The hap gene was amplified by polymerase chain reaction (PCR) using DNA from strain HK871 (Adolfo Lutz Institute no. 287/86 and CDC no. F3034) as template and a number of combinations of primers designed on the basis of previously published sequences of hap and flanking regions from H. influenzae strains N187 and Rd (GenBank accession numbers U11024 and U32710 respectively). The resulting amplicons were sequenced and revealed a stretch of 4003 nucleotides with homology to the open reading frame (ORF) of the two known hap genes (the HK871 hap sequence has been submitted to the DDBJ/EMBL/GenBank databases under accession number AF517153). An alignment showed that the hap nucleotide sequence from strain HK871 was 94% (including eight gaps) and 84% (including 15 gaps) similar to those of strains Rd and N187 respectively. In comparison, the identity between hap genes from strains Rd and N187 was 85% (including 14 gaps). The similarity between the HK871 hap sequence and the published 1380 nucleotide (nt) partial sequence of the paralogous iga gene from the same strain (accession number X86103) was 45%, and the alignment revealed large areas with no significant homology. Notably, the 4003 nucleotide hap sequence in the H. influenzae biogroup aegyptius strain HK871 genome did not constitute an ORF. Compared with the hap genes of H. influenzae strains N187 and Rd, a total of four mutations interrupting the reading frame were found in the HK871 hap sequence: an insertion of two As just after the ATG start codon resulting in a stop codon, TGA, at positions nt 25–27, a deletion of 40 nucleotides corresponding to positions nt 477–517, insertion of an A at position nt 2403 and a C to T transition at position nt 2425 creating a stop codon (all positions are relative to the first base of the ATG start codon in the N187 hap gene). 
The two insertions of A nucleotides were within stretches of seven and six As, respectively, suggesting that the molecular mechanism creating these mutations was slipped strand mispairing. Thus, strain HK871 harbours a non-functional hap pseudogene.
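The effect of such an insertion on the reading frame can be illustrated with a minimal sketch. The sequences below are short hypothetical toy strings, not the HK871 or N187 hap sequences, and the helper names `first_stop_codon` and `homopolymer_runs` are introduced here only for illustration. The sketch shows how two extra As inserted into a run of As just after the start codon shift the frame and expose a premature TGA stop codon.

```python
import re
from typing import List, Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon(seq: str) -> Optional[int]:
    """Return the 1-based position of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 1
    return None

def homopolymer_runs(seq: str, base: str, min_len: int) -> List[Tuple[int, int]]:
    """Return (1-based start, length) for every run of `base` of at least `min_len`."""
    pattern = re.compile(f"{base}{{{min_len},}}")
    return [(m.start() + 1, len(m.group())) for m in pattern.finditer(seq)]

# Hypothetical toy ORF: ATG, a run of six As, three sense codons, then a TAA stop.
intact = "ATG" + "AAAAAA" + "CTGACCGTT" + "TAA"

# Toy mutant: two extra As inserted directly after the ATG start codon.
mutant = intact[:3] + "AA" + intact[3:]

print(homopolymer_runs(intact, "A", 6))  # runs of As that could favour slippage
print(first_stop_codon(intact))          # stop codon only at the end of the toy ORF
print(first_stop_codon(mutant))          # premature stop created by the frameshift
```

In the toy mutant the shifted frame reads ATG AAA AAA AAC TGA, so translation would terminate after four codons, analogous in principle to the early TGA described for HK871.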
Sequence diversity in the hap gene
For an additional 25 strains including three serotype b strains and 21 non-capsulated isolates from cases of conjunctivitis and BPF and one non-encapsulated strain isolated from pharynx, we amplified by PCR and sequenced a segment of the first half of the hap gene corresponding to nucleotide positions 207–1742 in the N187 hap gene (Fig. 1). For another three strains (H. influenzae serotype c strain HK635, H. influenzae HK1220 isolated from pharynx and non-BPF clone H. influenzae biogroup aegyptius strain HK1226), the PCR amplification was unsuccessful presumably because of lack of match with the primers used. The segment selected for sequencing included the active site serine and the 40 nt frameshift deletion present in the HK871 hap sequence. Together with the three complete hap sequences described above, this enabled us to compare this partial 1.5 kb hap sequence from a total of 28 strains of H. influenzae and H. aegyptius.
The partial hap sequences were highly variable. The similarity among the most diverse sequences was only 74%, and the alignment of the 28 sequences included a total of 15 gaps. Nevertheless, 14 strains showed the same partial hap sequence except for four polymorphic positions within a stretch of 1476 nucleotides, amounting to a total of three alleles termed 1A, 1B and 1C (Table 1). All hap sequences in this cluster contained the 40 nt frameshift deletion, indicating that they represented hap pseudogenes as identified in HK871. The remaining 14 partial hap sequences represented nine different alleles, termed 2–10. All these constituted an ORF and included the potential of encoding the amino acid sequence GDSG with the active site serine. Some of these gene sequences showed a high degree of similarity. Notably, one strain from an Australian case of BPF-like disease (HK1213) was identical over the entire stretch of 1476 nt to two serotype b strains (HK393 and HK368). The third serotype b strain, HK715, which belongs to the distinct evolutionary lineage II of H. influenzae serotype b defined by Musser et al. (1988), was very different (Fig. 1).
Occurrence of the 40 nt frameshift deletion characteristic of the hap pseudogene was examined by PCR in an additional 27 Haemophilus strains isolated from cases of BPF or conjunctivitis and 28 strains isolated from sites other than the eye. PCR was performed on whole bacteria using one primer spanning the site of deletion in the pseudogene combined with one 500 nt downstream. In this assay, a PCR amplicon indicates the presence of a hap pseudogene characterized by the deletion, whereas lack of a product does not necessarily imply that a given strain harbours a functional hap gene. A 0.5 kb amplicon was detected in 18 strains all of biovar III isolated from cases of conjunctivitis or BPF in Brazil or Texas. No other strain isolated from infections at other sites or from subjects in other parts of the world showed this particular hap pseudogene.
To examine the possibility that strains with the hap pseudogene might, in addition, harbour a functional hap gene, we performed Southern blot analysis on EcoRI-digested whole-cell DNA from 10 of the strains with the pseudogene and from eight of the strains with an ORF in the partial hap sequence. A DNA fragment from H. influenzae strain HK275 corresponding to positions nt 1501–1912 in the H. influenzae strain N187 hap gene (St Geme et al., 1994) was used as a probe for hybridization. This fragment has only a low degree of homology to the known iga gene sequences, and it contains a region of ≈ 100 nt that is highly conserved among the hap sequences. Thus, the probe is assumed to be specific for the hap sequences. All strains examined showed a single EcoRI fragment hybridizing with the probe, indicating that a single copy of the hap sequences is present in the genome of H. influenzae and H. aegyptius (data not shown). This is in agreement with the published genome sequence of H. influenzae strain Rd (HI no. 0247 represents the hap gene in Fleischmann et al., 1995).
(DDBJ/EMBL/GenBank database accession numbers for the determined hap gene sequences are AF517129, HK1213; AF517130, HK1221; AF517131, HK1227; AF517132, HK1231; AF517133, HK1232; AF517134, HK1233; AF517135, HK1240; AF517136, HK1245; AF517137, HK1247; AF517138, HK1248; AF517139, HK1249; AF517140, HK272; AF517141, HK274; AF517142, HK275; AF517143, HK284; AF517144, HK286; AF517145, HK292; AF517146, HK295; AF517147, HK367; AF517148, HK368; AF517149, HK369; AF517150, HK393; AF517151, HK61; AF517152, HK715; AF517153, HK871.)
Mosaic-like structures in the hap gene
The alignment of the 28 partial hap sequences showed several mosaic-like structures. For example, as illustrated in the four segments of the alignment shown in Fig. 1, strain HK715 was identical to the cluster of 14 strains harbouring the hap pseudogene in segment nt 746–845, identical to strains HK274 and HK275 in segment nt 1381–1480 and similar to strains HK61, HK284, HK292 and Rd in segment nt 1615–1711. Likewise, strain Rd was identical to strains HK1213, HK368 and HK393 in segments nt 444–543 and nt 746–845, whereas in segment nt 1381–1480, it was highly homologous to strains HK272, HK286 and HK295 and, in segment nt 1615–1711, it was similar to HK284 and HK292. Another example was the cluster of strains HK1213, HK368 and HK393, which was almost identical to the cluster that includes strains HK272, HK274, HK275, HK286 and HK295 in segment nt 1615–1711, whereas in the three other segments shown, these two clusters were very different. Such mosaic structures in the sequence alignment provide evidence of horizontal transfer of hap gene sequences between strains.
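The segment-by-segment comparison behind this argument can be sketched as follows. This is an illustrative sketch only: the three aligned sequences are hypothetical toy strings, the function names are assumptions, and gap characters are simply counted as mismatches, which may differ from the scoring used for the alignment in Fig. 1.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # 1-based, inclusive alignment coordinates

def percent_identity(a: str, b: str) -> float:
    """Percentage of identical columns between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def segment_identities(a: str, b: str, segments: List[Segment]) -> Dict[Segment, float]:
    """Percent identity of two aligned sequences within each segment."""
    return {(s, e): percent_identity(a[s - 1:e], b[s - 1:e]) for s, e in segments}

# Hypothetical toy alignment: strain_x matches strain_p in the first segment
# and strain_q in the second, the signature of a mosaic sequence.
strain_p = "ACGTACGTAC" + "GGCCGGCCGG"
strain_q = "TTGATTGATT" + "CATCATCATC"
strain_x = "ACGTACGTAC" + "CATCATCATC"

segments = [(1, 10), (11, 20)]
print(segment_identities(strain_x, strain_p, segments))
print(segment_identities(strain_x, strain_q, segments))
```

A sequence whose closest relative changes from one segment to the next, as with the toy `strain_x`, is what the text describes as a mosaic-like structure.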
Diversity of an equivalent part of the iga gene
The strains included in the study all had IgA1 protease activity. For 12 strains, we successfully amplified by PCR and sequenced a segment of the first third of the iga gene corresponding to nucleotide positions 150–1527 in the HK368 iga gene (accession no. M87492, the numbering is from the first nucleotide in the ATG start codon). This sequence includes the substrate cleavage site determinant (Grundy et al., 1990) and the active site serine (Bachovchin et al., 1990; Poulsen et al., 1992). Combined with published iga gene sequences, we were able to compare this iga gene fragment from a total of 20 strains, 16 of which were also included in the alignment of partial hap gene sequences presented in Fig. 1. In all, except for one strain, the partial iga gene sequences constituted an ORF in agreement with the observed IgA1 protease activity. The exceptional strain HK266 produced very low IgA1 cleaving activity compared with the other strains. The partial iga gene sequence of HK266 contained a stretch of 10 guanines that separates the ORF into two parts. We propose that the homopolymeric G tract is subject to slipped strand mispairing resulting in loss of a single nucleotide in the tract, thereby restoring the ORF in a small fraction of the bacterial cells, consistent with the low IgA1 protease activity. Slipped strand mispairing in a homopolymeric tract as a mechanism of phase variation has been observed previously in several genes from Gram-negative bacteria (e.g. Wang et al., 2000; Karlyshev et al., 2002; Shafer et al., 2002).
The alignment of the 20 partial iga sequences confirmed the high diversity of this part of the iga gene reported previously (Poulsen et al., 1992; Lomholt and Kilian, 1995). A number of gaps were introduced in the sequences by the program pileup to obtain the alignment, and the overall similarity between the most diverse iga sequences was only 63% (Fig. 2). In agreement with previous observations (Poulsen et al., 1992), we found examples of mosaic-like organization of the homologies among the partial iga sequences, although the phenomenon was not nearly so pronounced as in the hap gene alignment.
We found no correlation between distinct alleles of the hap and iga genes, except for a cluster of five strains, HK1245, HK369, HK1213, HK1248 and HK1231, with almost identical iga sequences, of which four harboured the hap pseudogene (Fig. 2). Notably, strain HK1213, which had hap sequences identical to two serotype b strains, had an iga sequence quite distinct from these. In general, strains with identical hap gene sequences had quite different iga alleles (Fig. 2). Strain HK871 representing the BPF case clone had an iga gene sequence very different from other isolates with the hap pseudogene, and three strains with hap gene type 2 were all different in the iga gene sequence. Analogously, strains Rd and HK715 had identical iga gene sequences and very different hap sequences.
The partial iga and hap gene sequences analysed encode homologous regions of the IgA1 protease and Hap. A pairwise comparison of each of 20 iga gene sequences with each of 28 hap sequences revealed that, within larger stretches (>100 nt), none of the iga sequences showed a high degree of homology (>90%) with any of the hap gene sequences. The overall similarity between iga and hap in this first third of the genes was around 50%. In conclusion, iga and hap seem to have evolved independently within the genomes of H. influenzae and H. aegyptius.
The hap pseudogene is restricted to particular evolutionary lineages
To evaluate whether the presence of the hap pseudogene was associated with a particular subpopulation of the H. influenzae/H. aegyptius complex, we performed phylogenetic analysis of the strain population based on 15 gene loci examined indirectly by multilocus enzyme electrophoresis (MLEE) and four biochemical assays. All 15 enzyme loci analysed by MLEE were polymorphic among the 58 strains, with an average of 4.5 alleles per locus (range 2–10). A total of 45 electrophoretic types (ETs) were identified by MLEE analysis. The mean genetic diversity per locus (H) was 0.470 ± 0.068. This estimate of genetic diversity is very similar to the value reported by Musser et al. (1988) for non-encapsulated and capsulated H. influenzae (0.467). The index of association (IA) calculated on the basis of all strains was 1.826 ± 0.181 and, using ETs as the unit, the IA was 1.406 ± 0.206. Both of these values are indicative of linkage disequilibrium and indicate that accumulation of mutations has played a more important role than horizontal gene transfer in the evolution of the genome of non-capsulated H. influenzae in general. This is in agreement with observations for capsulated H. influenzae (Musser et al., 1988; Feil et al., 2001).
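The two statistics quoted above can be illustrated with a minimal sketch. The text does not state the formulas used, so the sketch assumes the estimators commonly applied to MLEE data: per-locus diversity h = n/(n − 1) × (1 − Σp²) and IA = VO/VE − 1, where VO is the observed variance of pairwise allelic mismatches and VE = Σh(1 − h). The allele profiles are hypothetical toy data, not the 58 strains of this study, and no standard errors are computed.

```python
from collections import Counter
from itertools import combinations
from statistics import pvariance

def locus_diversity(alleles):
    """Gene diversity at one locus: h = n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

def mean_diversity(profiles):
    """Mean of h over all loci; each profile is a tuple of alleles, one per locus."""
    loci = list(zip(*profiles))
    return sum(locus_diversity(locus) for locus in loci) / len(loci)

def index_of_association(profiles):
    """I_A = V_O / V_E - 1 (assumed definition; undefined if every locus is monomorphic)."""
    # Number of loci at which each pair of strains differs.
    mismatches = [sum(a != b for a, b in zip(p, q))
                  for p, q in combinations(profiles, 2)]
    v_observed = pvariance(mismatches)
    v_expected = sum(h * (1 - h)
                     for h in (locus_diversity(locus) for locus in zip(*profiles)))
    return v_observed / v_expected - 1

# Hypothetical two-locus profiles for four strains.
linked = [(1, 1), (1, 1), (2, 2), (2, 2)]     # alleles fully associated across loci
shuffled = [(1, 1), (1, 2), (2, 1), (2, 2)]   # every allele combination present

print(round(mean_diversity(linked), 3))
print(round(index_of_association(linked), 3))
print(round(index_of_association(shuffled), 3))
```

With the same allele frequencies in both toy sets, only the fully associated profiles give a clearly positive IA, which is the pattern interpreted in the text as linkage disequilibrium.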
Division A in the phylogenetic tree (Fig. 3) branched out into two major clusters (A1 and A2), of which the former included authentic strains of H. aegyptius (including the type strain) and the latter BPF clone isolates. We therefore refer to these two clusters as H. aegyptius and H. influenzae biogroup aegyptius respectively. The strains outside division A included the type strain of H. influenzae and several encapsulated H. influenzae strains of serotype b and c that were all considered to be typical of H. influenzae according to biochemical, morphological and physiological criteria (Kilian, 1976). All 38 strains in division A, with the exception of strains HK1212 and HK1213 from Australian cases of BPF-like infections, had the characteristic silencing deletion in the hap gene. In contrast, none of the 20 H. influenzae strains outside division A showed the hap gene deletion. The restricted occurrence of the hap pseudogene in strains belonging to particular clusters in Fig. 3 concurs with the conclusion that the genetic population structure is basically clonal.
Adherence to conjunctival epithelial cells
Three representative strains of each of the clusters, H. influenzae, H. aegyptius and H. influenzae biogroup aegyptius, were examined in adherence assays using conjunctival epithelial cells. Examination by light microscopy revealed striking differences between strains. The quantitative aspect is not considered in this study, but it was evident that, although all strains of H. aegyptius and H. influenzae biogroup aegyptius adhered in large numbers, significant differences were observed among the three H. influenzae strains examined. A further striking feature observed with all strains was that some epithelial cells were heavily colonized, whereas others were completely devoid of adhering bacteria.
Representative patterns are shown in Fig. 4. With all strains of H. aegyptius and H. influenzae biogroup aegyptius, adherent bacteria were present as large clusters of elongated bacteria often occurring as end-to-end chains (Fig. 4A–C). Similar elongated chains of bacteria were observed in Gram-stained smears of the same bacteria grown in liquid culture (Fig. 4F). This morphological feature was common to all strains belonging to H. aegyptius and H. influenzae biogroup aegyptius. In contrast to this pattern, adhering H. influenzae occurred as individual coccobacilli evenly distributed across individual epithelial cells (Fig. 4D–E). The same coccobacillary morphology was observed with the two Australian strains, HK1212 and HK1213, which had apparently regained a functional hap gene.
The majority of bacterial virulence factors show significant antigenic diversity as a consequence of the selection pressure exerted by the immune system of the host. This is reflected in significant sequence diversity in the related structural genes, usually exceeding that in other parts of the bacterial genome. The results of this study are in full agreement with this pattern. Both the hap and the iga gene of H. influenzae, which encode evolutionarily related but functionally distinct virulence factors, showed extensive genetic diversity. Comparison of hap gene sequences revealed distinct mosaic-like patterns similar to those observed previously in the iga gene (Poulsen et al., 1992). This indicates that the mechanism creating the genetic diversity is a combination of accumulation of mutations producing the basic variation and recombination forming novel combinations on which immunological selection may work. The high degree of diversity among the hap genes is most likely a result of immunological selection for variation in the surface-exposed Hap protein.
Gene duplication followed by specialization of each copy is thought to be a major mechanism in the evolution of genes (Jensen and Gu, 1996). The hap and iga genes are examples of duplicated genes that have diverged into distinct functions (St Geme et al., 1994). The significant sequence homology remaining between these paralogous genes and their mosaic-like structure inspired us to examine the relative significance of intrastrain recombination between the two genes and horizontal transfer of gene sequences between strains. Lack of a more pronounced sequence homology between comparable stretches of the two genes in individual strains, combined with lack of concordance between the genetic relationships of the two genes among the examined strains (Figs 1 and 2), clearly shows that intragenomic recombination between iga and hap has played a limited, if any, role in the recent evolution of these two genes. Conceivably, many years of independent diversification to optimize functional differences have rendered the identities between iga and hap too limited to allow for efficient homologous recombination.
Horizontal transfer of gene sequences between strains, which appears to be the dominant mechanism behind the mosaic-like patterns, is facilitated by the presence in the hap gene of the H. influenzae DNA uptake signal AAGTGCGGT. In strain Rd, this signal is found 69 bp upstream of the ATG start codon and, in strain N187, it is found 1985 bp downstream (St Geme et al., 1994; Fleischmann et al., 1995).
Given that the Hap protein is considered a virulence factor (St Geme et al., 1994), it was surprising that all authentic strains of H. aegyptius and H. influenzae biogroup aegyptius isolates from cases of BPF, including the BPF case clone, had a hap pseudogene, in contrast to all other non-capsulated and capsulated strains of H. influenzae. The hap gene was silenced by several mutational events, including a nonsense mutation, insertions of one or two nucleotides interrupting the reading frame and an out-of-frame deletion of 40 nt. We provide evidence that these strains do not have a duplicated functional hap gene in addition to the pseudogene, which suggests that the Hap protein is not essential for the ability of these bacteria to colonize and invade the conjunctival epithelium.
The important question is whether lack of Hap protein expression contributes to the particular virulence of H. aegyptius and BPF isolates. Studies reported by St Geme et al. (1991) revealed no difference in the ability to adhere to or invade conjunctival cells among eight isolates that, based on our data, must have included Hap-positive and Hap-negative phenotypes. Our observations confirm that lack of expression of the Hap protein does not eliminate the ability to adhere (Fig. 4). Conversely, the colonization pattern observed for strains with the hap pseudogene was strikingly similar to the colonization pattern described as microcolony formation by Hendrixson and St Geme (1998). Microcolony formation was achieved experimentally by these authors in an H. influenzae strain in which they attempted to study selectively the function of surface-associated Hap protein by interfering with its autorelease from the bacterial surface through a point mutation that affects its serine protease activity. Our results strongly suggest that lack of Hap serine protease activity rather than enhancement of its surface function is associated with microcolony formation on epithelial cells. Furthermore, it is conceivable that the elongated cell morphology and chain formation, which has long been considered one of the distinguishing but unexplained characteristics of H. aegyptius (Pittman and Davis, 1950; Mazloum et al., 1982), is a direct consequence of the lack of expression of the secreted Hap protease. However, it is not clear whether this change in phenotype affects virulence. Studies reported by Qiu et al. (1998) indicated that human lactoferrin is capable of abolishing Hap-mediated adherence of H. influenzae. It is possible that lack of this target for lactoferrin is an advantage for strains of H. aegyptius and H. influenzae biogroup aegyptius, which may have compensated by enhanced expression of surface pili (Weyant et al., 1990; Reid et al., 1996) or other non-pilus adhesins (St Geme et al., 1991).
Feavers and Maiden (1998) recently proposed that the loss of expression of porA was one of the steps in the emergence of the gonococcus from a population that is represented today by the meningococcus. The porA gene is present in the gonococcal genome as a relatively well-conserved pseudogene. The scenario revealed by our study may be a parallel to this, and supports the hypothesis that selective loss of a property may allow bacterial pathogens to exploit a new niche.
Theoretically, a gene sequence may spread by clonal expansion or by horizontal gene transfer combined with positive selection. Only in the former case will the presence of the gene sequence correlate with the overall structure of the population, as was the case with the hap pseudogene (Fig. 3). Conservation of the hap pseudogene in the two separate phylogenetic clusters representing H. aegyptius and H. influenzae biogroup aegyptius (Fig. 3) suggests that the silencing mutations occurred before their evolutionary separation. The finding that the characteristic frameshift deletion in the hap gene was present exclusively in members of the two clusters representing H. aegyptius and H. influenzae biogroup aegyptius indicates that the genetic event resulting in this deletion occurred in a common ancestor. Assuming that sequence diversity at synonymous sites accumulates by mutations at a relatively constant rate (molecular clock hypothesis), this situation provides an opportunity to estimate the approximate time in evolution when these two clusters of eye pathogens separated. The total number of polymorphic sites within a hap gene stretch of 1476 nt was four among the 14 strains examined in detail (Table 1). A parsimonious analysis of the polymorphism shown reveals that the theoretical ancestral sequence is allele 1A (i.e. the sequence that assumes the least number of mutations to explain the polymorphism observed). This is supported by the fact that the characteristic nucleotides of allele 1A were regularly found among the strains of H. influenzae. In this background, the mean number of mutations within the stretch of 1476 nt among the 14 strains is one (the accumulated number of mutations required to explain the hap gene phylogeny of the 14 hap-typed strains shown in Fig. 3 is 14). Whittam (1996) and Guttman and Dykhuizen (1994) have calculated the rate of accumulation of synonymous mutations in Escherichia coli to be 6 × 10⁻⁹ and 3 × 10⁻⁸, respectively, per year. 
Assuming that the molecular clock rates for Haemophilus and E. coli are comparable, that bacteria causing acute infections undergo approximately 10 times more generations per year (≈ 3000) and that all sites in a pseudogene may be considered synonymous in this context, it can be calculated that ≈ 2000–11 000 years have elapsed since the deletion occurred in an H. influenzae strain that became the common ancestor of H. aegyptius and H. influenzae biogroup aegyptius. Based on these estimates, we conclude that separation of H. aegyptius and H. influenzae biogroup aegyptius, which occurred subsequent to this event, was recent in evolutionary time.
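The arithmetic behind this age estimate can be sketched as follows. This is a minimal illustration, not the authors' own calculation: it assumes that the two E. coli rates are expressed per site per year, that the tenfold higher number of generations simply multiplies the rate by 10, and that the mean of one mutation per 1476 nt is the per-site divergence from the ancestral allele. The function name and its parameters are hypothetical.

```python
def silencing_age_years(mutations_per_strain, stretch_nt,
                        rate_per_site_per_year, generation_factor=10.0):
    """Years needed to accumulate the observed divergence.

    Assumptions (not stated explicitly in the text): the rate is per
    site per year, and a higher generation count scales it linearly.
    """
    divergence_per_site = mutations_per_strain / stretch_nt
    effective_rate = rate_per_site_per_year * generation_factor
    return divergence_per_site / effective_rate


# Values quoted in the text: mean of 1 mutation in a 1476 nt stretch,
# E. coli rates of 3e-8 and 6e-9, and a tenfold generation factor.
youngest = silencing_age_years(1, 1476, 3e-8)
oldest = silencing_age_years(1, 1476, 6e-9)
print(f"estimated age: {youngest:.0f} to {oldest:.0f} years")
```

Under these assumptions, the faster rate gives an age a little above 2000 years and the slower rate a little above 11 000 years, which is consistent with the range quoted above.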
The two strains (HK1212 and HK1213) isolated from Australian infectious cases that were indistinguishable from BPF had a hap gene without the deletion, in spite of the fact that our phylogenetic analysis suggests that they diverged from H. aegyptius after the genetic event that led to the silencing of the hap gene. The finding that the partial hap gene sequence was identical to the corresponding sequence in two virulent serotype b strains (Fig. 1), whereas the iga gene was almost identical to that of H. aegyptius isolates (Fig. 2) suggests that this clone may have regained a functional hap gene as a result of horizontal gene transfer from a serotype b strain. The observation that these two strains showed a coccobacillary morphology in contrast to the elongated cells of H. aegyptius and H. influenzae biogroup aegyptius further supports the conclusion that the Hap protein plays a role in determining cell morphology and cell division.
To summarize, both hap and iga genes show considerable sequence diversity, and each shows clear evidence of interstrain recombination. In contrast, intrastrain recombination between these two paralogous genes played a limited, if any, role in their genetic diversification. A conserved hap pseudogene that is associated with elongated cells occurring in chains and microcolony formation on epithelial cells was present in all strains of H. aegyptius and in strains of H. influenzae biogroup aegyptius, which constituted distinct subpopulations that diverged recently in evolutionary time.
Fifty-eight strains labelled H. influenzae, H. aegyptius or H. influenzae biogroup aegyptius were studied in detail. Confirmation of the characteristic growth requirements for haemin and NAD and detection of tryptophanase (indole production), urease and ornithine decarboxylase activities were performed as described previously (Kilian, 1976). Among the 58 strains, 49 had been isolated from cases of conjunctivitis or from the blood of patients suffering from BPF. They were all non-capsulated and had the biochemical properties described for H. influenzae biovar II (indole positive, urease positive, ornithine decarboxylase negative) or III (indole negative, urease positive, ornithine decarboxylase negative) (Kilian, 1976). One pharyngeal non-capsulated isolate with the characteristics of H. influenzae biovar I (indole positive, urease positive, ornithine decarboxylase positive) and five strains of H. influenzae that were encapsulated on initial isolation were included for comparison: three strains of serotype b, one strain of serotype c and one strain of serotype d (strain Rd, now a rough mutant). The strains are listed in Fig. 3 with information on their origin and assignment to biovar and serovar. Brazilian and Australian isolates were received from Drs M. C. Brandileone and I. M. Landgraf, Adolfo Lutz Institute, Sao Paulo, Brazil, and strains 18a, 178a, 758, 763, 46 and 24, all designated H. aegyptius, were a generous gift from Dr M. Pittman (deceased), National Institutes of Health, Bethesda, MD, USA. The remaining strains were from our own collection, of which most have been described in detail (Kilian, 1976). In addition to these 58 strains, 27 clinical isolates of H. influenzae were examined for the presence of the hap pseudogene by a PCR assay as described in the text. Thirteen of these were isolated in Denmark and 14 in Brazil. Two isolates were biovar I, and the remainder were either biovar II (n = 23) or biovar III (n = 2).
Six of the isolates were serotype b, and 21 were non-capsulated. The strains were isolated from pharynx, sinusitis, otitis media, chronic bronchitis or cerebrospinal fluid.
The strains were cultivated on heated blood agar (chocolate agar) incubated at 37°C in air plus 5% CO2. Brain–heart infusion (BHI) broth (Difco Laboratories) supplemented with 5 mg each of haemin and nicotinamide adenine dinucleotide (NAD) per litre was used as fluid growth medium.
Determination of IgA1 protease cleavage type
A loopful of bacteria harvested from a chocolate agar culture was suspended in a solution of purified human myeloma IgA1 and incubated overnight at 37°C. Cleavage of the IgA1 substrate was demonstrated by SDS-PAGE, and the site of cleavage (Pro-231–Ser-232, type 1 cleavage; Pro-235–Thr-236, type 2 cleavage) was determined by including IgA1 cleaved by reference strains for which the specificity had been identified by amino acid sequence analysis (Kilian et al., 1983).
Analysis of the phylogenetic relationships of the bacterial strains was performed on the basis of the results of MLEE combined with selected biochemical characteristics. Bacterial lysates for MLEE were prepared by sonication, electrophoresed in starch gels and selectively stained for activity of each of 15 metabolic enzymes as described by Selander et al. (1986). The enzymes assayed were phosphoglucomutase (PGM), glutamate dehydrogenase (GDH), 6-phosphogluconate dehydrogenase (6PG), glucose-6-phosphate dehydrogenase (G6P), carbamylate kinase (CDK), phosphoglucose isomerase (PGI), adenylate kinase (ADK), malate dehydrogenase (MDH), glutamate oxaloacetic transaminase (GOT), malic enzyme (ME), peptidase (PEP), leucine aminopeptidase (LAP), glyceraldehyde-3-phosphate dehydrogenase (G3P), nucleoside phosphorylase (NSP) and hexokinase (HEX). Absence of detectable activity was treated as missing data. Electromorphs (allozymes) of each enzyme equated with alleles at the corresponding structural gene locus were scored and assigned a number according to decreasing rate of anodal migration. Analyses of mean genetic diversity per gene locus (H) and the index of association (IA) were based on the MLEE data set and were performed using the programs ETDIV and ETLINK, respectively, from T. S. Whittam (http://www.foodsafe.msu.edu/whittam#Programs). The MLEE data combined with data for tryptophanase, urease and ornithine decarboxylase activities and cleavage type of IgA1 protease constituted the data set for phylogenetic analysis. A pairwise distance matrix was produced with the program ETMEGA from T. S. Whittam, and the phylogenetic tree was constructed using the neighbour-joining algorithm in the MEGA version 2.1 software package (Kumar et al., 2001).
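As an illustration of the kind of statistics derived from such allele profiles, the sketch below computes a mean genetic diversity per locus and a pairwise distance between two strains. It is a hedged sketch only: the formulas actually implemented in the cited programs are not given in the text, so the widely used unbiased gene diversity h = (n/(n − 1))(1 − Σp²) and a simple proportion-of-mismatched-loci distance are assumed here. The function names and the toy profiles are hypothetical, and missing data are coded as `None` in line with the treatment of undetectable enzyme activity.

```python
from collections import Counter


def locus_diversity(alleles):
    """Unbiased gene diversity at one locus; None marks missing data."""
    scored = [a for a in alleles if a is not None]
    n = len(scored)
    if n < 2:
        return 0.0
    freqs = [count / n for count in Counter(scored).values()]
    return (n / (n - 1)) * (1.0 - sum(p * p for p in freqs))


def mean_diversity(profiles):
    """Mean diversity per locus (H) over all loci in the profiles."""
    loci = list(zip(*profiles))  # transpose: one tuple of alleles per locus
    return sum(locus_diversity(locus) for locus in loci) / len(loci)


def pairwise_distance(a, b):
    """Proportion of mismatched loci, skipping loci missing in either strain."""
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        raise ValueError("no locus scored in both strains")
    return sum(x != y for x, y in compared) / len(compared)


# Toy data: four strains scored at three loci (allele numbers are invented).
profiles = [
    (1, 2, 1),
    (1, 2, 2),
    (2, 2, None),
    (2, 1, 1),
]
print(mean_diversity(profiles))
print(pairwise_distance(profiles[0], profiles[2]))
```

A matrix of such pairwise distances is the kind of input a neighbour-joining routine expects; the tree construction itself is not reproduced here.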
Segments of the iga and hap genes were amplified by PCR using Ready To Go PCR beads (Pharmacia Biotech) combined with 10 pmol of each primer and whole bacteria boiled in water for 10 min as template. The thermocycling programme used for the PCRs consisted of denaturation at 94°C for 5 min and 30 cycles of 94°C for 1 min, 60°C for 1 min and 72°C for 2 min, followed by an extension at 72°C for 8 min. The PCR products were analysed by agarose gel electrophoresis. When no amplicons were detected, the PCR was repeated with an annealing temperature of 55°C. The PCR products were used as templates in the sequencing reactions after purification on Wizard minicolumns (Promega). For DNA sequencing, we used the same primers as for the PCR as well as internal primers designed on the basis of previous sequences. Individual sequence reactions were done with a Thermo Sequenase dye terminator cycle sequencing kit (Amersham Life Science) and analysed with an Applied Biosystems DNA sequencer. All nucleotide sequences were determined on both DNA strands.
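For readers who want the cycling conditions in machine-readable form, the programme described above can be written out as a simple list of steps. This is only an illustrative encoding: the `Step` class and function names are hypothetical, and ramping times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int      # block temperature in degrees Celsius
    minutes: float   # hold time at that temperature


def pcr_programme(annealing_c=60, cycles=30):
    """Steps as described in the text; annealing_c=55 gives the retry run."""
    initial = [Step(94, 5)]
    cycle = [Step(94, 1), Step(annealing_c, 1), Step(72, 2)]
    final = [Step(72, 8)]
    return initial + cycle * cycles + final


def total_hold_minutes(programme):
    """Sum of hold times only; ramping between steps is not modelled."""
    return sum(step.minutes for step in programme)


standard = pcr_programme()
retry = pcr_programme(annealing_c=55)
print(total_hold_minutes(standard))
```

With the default settings the hold times add up to 133 min (5 + 30 × 4 + 8), and the retry run differs only in the annealing temperature.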
For both the iga and hap genes, we selected for sequencing a segment encoding the N-terminal part of the preprotein. This area in the iga gene is known to be highly variable among H. influenzae strains and, as we subsequently found in this study, this also applies to the selected part of the hap gene. For amplification of these segments of the iga and hap genes, a large number of primers were designed on the basis of homologies among the known iga and hap gene sequences, taking into account which areas were assumed to be conserved within each of the genes as well as between the two. Generally, primers expected to be gene specific were combined with each other. In addition, to maximize the chance of amplifying the gene segment of interest, a primer expected to be specific for the particular gene was combined with one matching homologous sequences present in the known iga and hap sequences (details of the primers are available upon request). However, PCRs using different combinations of primers resulted in successful amplifications from only some of the strains tested. The resulting amplicons were sequenced and, for each of the two genes, a region common to all the sequences determined was used for the alignment. Sequence alignments were performed with the programs GAP and PILEUP from the GCG package (The Genetics Computer Group, University of Wisconsin, Madison, WI, USA).
Southern blot analysis
Approximately 2 µg of whole-cell DNA was digested with EcoRI, separated according to size by 1% agarose gel electrophoresis and blotted and fixed onto Nytran nylon membranes (Schleicher and Schuell) by the Southern blotting procedure (Sambrook et al., 1989). The probe used for hybridization was the product of a PCR containing whole-cell DNA from strain HK275 combined with the primers 5′-CGTGGTGGTCGCTTAGATCTTAAC-3′ and 5′-GTGGTATACCTTCCATTTCTGACC-3′, which amplifies a fragment corresponding to nucleotide positions 1736–2150 in the H. influenzae strain Rd hap gene (accession no. U32710). The PCR product was electrophoresed in an agarose gel, eluted by the Gene Clean procedure (BIO 101) and labelled with [32P]-dATP using a random-primed DNA labelling kit (Roche Molecular Biochemicals). The hybridization was performed at 60°C as described previously (Sambrook et al., 1989) except that the filters were soaked in 1% (v/v) Triton X-100 before prehybridization, and 0.1% Na-pyrophosphate was included in all solutions. The final post-hybridization wash was at 60°C in 1× SET, 0.1% SDS and 0.1% Na-pyrophosphate.
PCR for detection of deletion in hap
The supernatant of whole bacteria in water boiled for 10 min served as template in the PCR using Ready To Go PCR beads. The forward primer 5′-CTGAACCGATAGGTATGACTATCC-3′ was designed to cover the deletion of 40 nt, and the reverse primer 5′-GGATTAAATAACTTCACAGTTCCTAC-3′ located 500 nt downstream was selected in order to obtain an appropriate size of the amplicons. The thermocycling programme was as described for the DNA sequencing. The resulting PCR products were analysed by agarose gel electrophoresis.
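The logic of a presence/absence PCR of this kind can be illustrated with a small in silico sketch. Several things here are assumptions rather than facts from the text: the templates are synthetic, primer binding is modelled as an exact match, and the toy places the forward primer site so that it overlaps the 40-nt segment removed in the deleted template, so that the deleted template gives no product. The text does not state which allele yields an amplicon in the actual assay. The function names are hypothetical; only the two primer sequences are taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product for exact primer matches, or None if no product."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


FORWARD = "CTGAACCGATAGGTATGACTATCC"      # primer sequence from the text
REVERSE = "GGATTAAATAACTTCACAGTTCCTAC"    # primer sequence from the text

# Synthetic template: arbitrary filler between the two primer sites.
filler = "ACGT" * 120
intact = "TTTT" + FORWARD + filler + reverse_complement(REVERSE) + "AAAA"

# Assumed scenario: a 40-nt deletion that removes part of the forward site.
cut = intact.find(FORWARD) + 10
deleted = intact[:cut] + intact[cut + 40:]

print(amplicon_length(intact, FORWARD, REVERSE))
print(amplicon_length(deleted, FORWARD, REVERSE))
```

In this toy setting the intact template gives a product and the deleted template gives none, so the band on the agarose gel reports the state of the primer site. A real assay would also need to tolerate mismatches, which this exact-match sketch does not model.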
Examination of the adherence pattern of nine representative strains (HK266, HK275, HK389, HK367, HK1227, HK1233, HK1237, HK1238, HK1232) was performed using human conjunctival epithelial cells (clone 1-5c-4 Wong–Kilbourne derivative of Chang conjunctiva, ATCC CCL-20.2, American Type Culture Collection, Manassas, VA, USA) cultured in Eagle medium with Earle's salts and non-essential amino acids (Gibco Invitrogen) supplemented with 10% heat-inactivated fetal calf serum (FCS; Biochrom KG) and 2.0 mM L-glutamine. For each experiment, 5 ml of tissue culture medium with ≈ 2 × 10⁵ cells ml⁻¹ was added to a 5 cm Petri dish containing a glass microscope coverslip. The cell cultures were then incubated at 37°C in air plus 5% CO2 for 2–3 days.
The bacteria used for adherence assays were grown overnight in BHI broth supplemented with 10% (v/v) Levinthal broth (Statens Serum Institut). A volume corresponding to 10⁷ bacteria was inoculated directly into the tissue culture medium. After incubation for 4 h at 37°C in air plus 5% CO2, the supernatant was removed, and the cell monolayers were rinsed twice with tissue culture medium. The cells with adherent bacteria were then Gram-stained, and the coverslips were mounted with Aqua mount (BDH) on a microscope slide and examined by light microscopy.
This study was supported by the Danish Medical Research Council Grant 22-01-0265 and by the Velux Foundation. We thank Drs Ilka Maria Landgraf and Maria C. Brandileone at the Adolfo Lutz Institute, Sao Paulo, Brazil, for generously providing strains for this study, and Dr Mikkel H. Schierup, Center for Bioinformatics, University of Aarhus, for helpful discussions concerning estimation of age. Birte Esmann, Tove Findal and Anni Skovbo provided excellent technical assistance.
- (1990) Inhibition of IgA1 proteases from Neisseria gonorrhoeae and Haemophilus influenzae by peptide prolyl boronic acids. J Biol Chem 265: 3738–3743.
- Brazilian Purpuric Fever Study Group (1987) Brazilian purpuric fever: epidemic purpura fulminans associated with antecedent purulent conjunctivitis. Lancet ii: 757–761.
- (1988) Biochemical, genetic, and epidemiologic characterization of Haemophilus influenzae biogroup aegyptius (Haemophilus aegyptius) strains associated with Brazilian purpuric fever. J Clin Microbiol 26: 1524–1534.
- (1989) Potential virulence-associated factors in Brazilian purpuric fever. J Clin Microbiol 27: 609–614.
- (1986) Deoxyribonucleic acid relatedness between Haemophilus aegyptius and Haemophilus influenzae. Ann Microbiol (Paris) 137B: 155–163.
- Feavers and Maiden (1998) A gonococcal porA pseudogene: implications for understanding the evolution and pathogenicity of Neisseria gonorrhoeae. Mol Microbiol 30: 647–656.
- (2001) Recombination within natural populations of pathogenic bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proc Natl Acad Sci USA 98: 182–187.
- (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
- (1990) Localization of the cleavage site specificity determinant of Haemophilus influenzae immunoglobulin A1 protease genes. Infect Immun 58: 320–331.
- Guttman and Dykhuizen (1994) Clonal divergence in Escherichia coli as a result of recombination, not mutation. Science 266: 1380–1383.
- (1998) The Haemophilus influenzae Hap serine protease promotes adherence and microcolony formation, potentiated by a soluble host protein. Mol Cell 2: 841–850.
- (1997) Structural determinants of processing and secretion of Haemophilus influenzae Hap protein. Mol Microbiol 26: 505–518.
- (1996) Evolutionary recruitment of biochemically specialized subdivisions of family I within the protein superfamily of aminotransferases. J Bacteriol 178: 2161–2171.
- (1995) Common structural features of IgA1 protease-like outer membrane protein autotransporters. Mol Microbiol 18: 378–380.
- (2002) A novel paralogous gene family involved in phase-variable flagella-mediated motility in Campylobacter jejuni. Microbiology 148: 473–480.
- Kilian (1976) A taxonomic study of the genus Haemophilus, with the proposal of a new species. J Gen Microbiol 93: 9–62.
- Kilian et al. (1983) Molecular biology of Haemophilus influenzae IgA1 proteases. Mol Immunol 20: 1051–1058.
- (1996) Biological significance of IgA1 proteases in bacterial colonization and pathogenesis: critical evaluation of experimental evidence. APMIS 104: 321–338.
- (1883) Bericht über die Thätigkeit der Deutschen Cholerakommission in Aegypten und Ostindien. Wien Med Wochenschr 33: 1548–1551.
- Kumar et al. (2001) Molecular Evolutionary Genetics Analysis Software. Tempe, AZ, USA. Arizona State University.
- (1995) Distinct antigenic and genetic properties of the immunoglobulin A1 protease produced by Haemophilus influenzae biogroup aegyptius associated with Brazilian purpuric fever in Brazil. Infect Immun 63: 4389–4394.
- (1987) Brazilian purpuric fever in central Australia. Lancet ii: 112.
- (1982) Differentiation of Haemophilus aegyptius and Haemophilus influenzae. Acta Pathol Microbiol Immunol Scand Section B 90: 109–112.
- (1982) Relationship between the specificity of IgA proteases and serotypes in Haemophilus influenzae. J Infect Dis 146: 266–274.
- (1990) Brazilian purpuric fever: evolutionary genetic relationships of the case clone of Haemophilus influenzae biogroup aegyptius to encapsulated strains of Haemophilus influenzae. J Infect Dis 161: 130–133.
- (1988) Evolutionary genetics of the encapsulated strains of Haemophilus influenzae. Proc Natl Acad Sci USA 85: 7758–7762.
- (1950) Identification of the Koch–Weeks bacillus (Haemophilus aegyptius). Proc Soc Exp Biol Med 118: 671–679.
- (1987) Gene structure and extracellular secretion of Neisseria gonorrhoeae IgA protease. Nature 325: 458–462.
- the Brazilian Purpuric Fever Study Group (1989) Resistance to serum bactericidal activity distinguishes Brazilian purpuric fever (BPF) case strains of Haemophilus influenzae biogroup aegyptius (H. aegyptius) from non-BPF strains. J Clin Microbiol 27: 792–794.
- (1989) Cloning and sequencing of the immunoglobulin A1 protease gene (iga) of Haemophilus influenzae serotype b. Infect Immun 57: 3097–3105.
- (1992) A comparative genetic study of serologically distinct Haemophilus influenzae type 1 immunoglobulin A1 proteases. J Bacteriol 174: 2913–2921.
- (1998) Human milk lactoferrin inactivates two putative colonization factors expressed by Haemophilus influenzae. Proc Natl Acad Sci USA 95: 12641–12646.
- (1996) Duplication of pilus gene complexes of Haemophilus influenzae biogroup aegyptius. J Bacteriol 178: 6564–6570.
- Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
- Selander et al. (1986) Methods of multilocus enzyme electrophoresis for bacterial population genetics and systematics. Appl Environ Microbiol 51: 873–884.
- (2002) Phase variable changes in genes lgtA and lgtC within the lgtABCDE operon of Neisseria gonorrhoeae can modulate gonococcal susceptibility to normal human serum. J Endotoxin Res 8: 47–58.
- (1991) Surface structures and adherence properties of diverse strains of Haemophilus influenzae biogroup aegyptius. Infect Immun 59: 3366–3371.
- (1994) A Haemophilus influenzae IgA protease-like protein promotes intimate interaction with human epithelial cells. Mol Microbiol 14: 217–233.
- (1995) Brazilian purpuric fever caused by Haemophilus influenzae biogroup aegyptius strains lacking the 3031 plasmid. J Infect Dis 171: 209–212.
- (2000) Lewis antigens in Helicobacter pylori: biosynthesis and phase variation. Mol Microbiol 36: 1187–1196.
- (1886) The bacillus of acute catarrhal conjunctivitis. Arch Ophthalmol (old series) 15: 441–451.
- (1990) Purification and characterization of a pilin specific for Brazilian purpuric fever-associated Haemophilus influenzae biogroup aegyptius (H. aegyptius) strains. J Clin Microbiol 28: 756–763.
- Whittam (1996) Genetic variation and evolution processes in natural populations of Escherichia coli. In Escherichia coli and Salmonella. Neidhardt, F.C., Curtiss, R., III, Ingraham, J.L., Lin, E.C.C., Low, K.B., Magasanik, B., et al. (eds). Washington, DC: American Society for Microbiology Press, pp. 2708–2720.
- (1989) Brazilian purpuric fever in Western Australia. Med J Aust 150: 344–346.