A novel integrative and conjugative element (ICE) of Escherichia coli: the putative progenitor of the Yersinia high-pathogenicity island

Authors


Summary

Diversification of bacterial species and pathotypes is largely caused by horizontal transfer of diverse DNA elements such as plasmids, phages and genomic islands (e.g. pathogenicity islands, PAIs). A PAI called high-pathogenicity island (HPI) carrying genes involved in siderophore-mediated iron acquisition (yersiniabactin system) has previously been identified in Yersinia pestis, Y. pseudotuberculosis and Y. enterocolitica IB strains, and has been characterized as an essential virulence factor in these species. Strikingly, an orthologous HPI is a widely distributed virulence determinant among Escherichia coli and other Enterobacteriaceae which cause extraintestinal infections. Here we report on the HPI of E. coli strain ECOR31 which is distinct from all other HPIs described to date because the ECOR31 HPI comprises an additional 35 kb fragment at the right border compared to the HPI of other E. coli and Yersinia species. This part encodes for both a functional mating pair formation system and a DNA-processing region related to plasmid CloDF13 of Enterobacter cloacae. Upon induction of the P4-like integrase, the entire HPI of ECOR31 is precisely excised and circularised. The HPI of ECOR31 presented here resembles integrative and conjugative elements termed ICE. It may represent the progenitor of the HPI found in Y. pestis and E. coli, revealing a missing link in the horizontal transfer of an element that contributes to microbial pathogenicity upon acquisition.

Introduction

Horizontal gene transfer, the intraspecies and interspecies exchange of genetic information, plays an essential role in the evolution of bacteria (Jain et al., 1999; de la Cruz et al., 2000; Ochman et al., 2000; Jain et al., 2002). Three major mechanisms, transformation, transduction and conjugation (Davison, 1999), provide bacterial populations with access to a horizontal gene pool, enabling them to rapidly respond to environmental challenges (Hacker and Kaper, 2000). For example, many bacterial pathogens contain clusters of virulence genes not present in closely related non-pathogenic strains or species. These gene clusters may be located on transmissible phages or plasmids, but are often found as so-called pathogenicity islands (PAIs) on the chromosome (Groisman and Ochman, 1996; Hacker et al., 1997). The acquisition of a pathogenicity island is likely to have been a key step in the evolution of the pathogen. However, specific transmissibility of PAIs has not yet been demonstrated and their evolutionary origin remains unknown. The high-pathogenicity island (HPI), first described in Yersinia pestis, Y. pseudotuberculosis and Y. enterocolitica IB biotype, displays the characteristics of a typical pathogenicity island: (i) it is a large chromosomal DNA fragment (35–45 kb); (ii) it carries virulence genes, namely the yersiniabactin siderophore system essential for the expression of the high-virulence phenotype in yersiniae; (iii) it is inserted at the 3′-end of a tRNA gene (asn tRNA); (iv) its G + C content differs from that of the remainder of the chromosome; and (v) it is flanked by repeated sequences (Carniel et al., 1996; Bearden et al., 1997; Pelludat et al., 1998). A unique characteristic of the HPI is its wide distribution in various members of the family Enterobacteriaceae, above all in extraintestinal pathogenic isolates of E. coli (ExPEC) (Schubert et al., 1998; Karch et al., 1999; Bach et al., 2000; Schubert et al., 2000; Oelschläger et al., 2003).
In ExPEC strains, the HPI has been found to be functional and more closely associated with virulence than other ‘traditional’ virulence factors (Johnson and Stell, 2000). Furthermore, the HPI has been shown to be involved in the pathogenicity of ExPEC strains (Johnson and Stell, 2000; Schubert et al., 2002). The mode of HPI mobilization and transfer, however, has not yet been determined and, although these processes are presumed to involve bacteriophages, no bacteriophage-related sequences besides the P4-like integrase gene are detectable on the HPI. Here we have characterized the HPI of E. coli strain ECOR31 (HPIECOR31), the structure of which is distinct from the HPI of both Yersinia and other E. coli. The HPIECOR31 reveals an additional 35 kb DNA fragment at the 3′-border, exactly at the position where an IS100 element is inserted into the HPI of Y. pestis and Y. pseudotuberculosis. This DNA fragment encompasses three distinct regions encoding: (i) a complete and functional mating pair formation system related to IncX plasmid R6K of E. coli; (ii) a putative nic site (oriT) together with a DNA-processing region related to plasmid CloDF13 of Enterobacter cloacae (MobB, MobC), and (iii) ORFs displaying a weak homology to chromosomal genes of Vibrio cholerae. Induction of the phage P4-like integrase results in precise excision and circularisation of the complete HPIECOR31. Thus, the entire HPIECOR31 structurally resembles a subgroup of integrative and conjugative elements (Hochhut and Waldor, 1999; Burrus et al., 2002a) and is suggestive of being a mobilizable progenitor of the HPI found in both E. coli and Yersinia species.

Results

Characterization of the right border of the HPI in E. coli strain ECOR31

To determine and compare the insertion locus of the HPI in E. coli, we subjected the E. coli collection of reference (ECOR) to a PCR survey using primers covering all four asn tRNA genes present in E. coli. In all except one of the HPI-positive ECOR strains, the HPI was found to be located adjacent to the asnT tRNA gene, as has been previously described for other E. coli strains (Schubert et al., 1999). The E. coli strain ECOR31, however, revealed a different insertion locus (asnV tRNA) and a considerably larger 3′-part of the HPI, which lacks the deletion of the 3′-border described for all other HPI-positive E. coli strains (Fig. 1). In order to further characterize the 3′-border of the HPIECOR31 (right border, RB-HPIECOR31), we constructed a cosmid library and screened 800 cosmid clones by colony blot hybridization. We chose two of the cosmids, pDU17 and pDU18 (Table 1), which covered the entire right part of the HPIECOR31 together with the neighbouring chromosome (Figs 2 and 4), and determined the complete sequence by a shotgun approach. Sequencing of the cosmids revealed that the RB-HPIECOR31 is highly homologous to the HPI of Y. pestis/Y. pseudotuberculosis up to 3427 nucleotides downstream of the stop codon of the fyuA/psn gene (99.8% identity, Fig. 1). In Y. pestis/Y. pseudotuberculosis a copy of the IS100 insertion element had been found at this position, followed by a fragment of 250 bp and 519 bp in Y. pseudotuberculosis and Y. pestis, respectively, and a 17 bp direct repeat (attO) which represents the ultimate border of the HPI (Buchrieser et al., 1998; Hare et al., 1999). In contrast, the RB-HPIECOR31 is composed of a 34 480 bp DNA fragment that reveals homology to conjugative plasmids (Fig. 2, Table 2). The overall G + C content of the RB-HPIECOR31 is 47.9%, which is close to the average for the E. coli K-12 genome (50.8%). However, the G + C content varies significantly within the RB-HPIECOR31 (25–66%), suggesting a composite structure derived from diverse sources.
Twenty-nine ORFs larger than 150 nt were identified corresponding to a total coding region of 80.9% (Fig. 2, Table 2). Seven translated ORFs were found to show no significant or only a very low similarity to protein sequences in the databases. With regard to the nucleotide homology and the putative function of the gene products, the RB-HPIECOR31 region exhibits a modular structure with three distinct regions (Fig. 2).
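
The regional variation in G + C content described above is usually visualised with a sliding-window profile. The following is a minimal sketch of that calculation, not the procedure used in this study: the function names, the window and step sizes, and the toy sequence are all illustrative assumptions.

```python
def gc_content(seq: str) -> float:
    """Return the G + C percentage of a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G + C percentage of every full window, as (start, percent) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence (not taken from the island): an AT-rich half
# followed by a GC-rich half, mimicking a composite structure.
toy = "ATATATATAT" + "GCGCGCGCGC"
profile = windowed_gc(toy, window=10, step=5)
```

On a real sequence, a window of several hundred nucleotides would be more appropriate; abrupt shifts in the profile are what would hint at segments of different origin.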

Figure 1.

Comparison of the structure of the HPI in E. coli ECOR31, Y. pestis and other E. coli strains (e.g. E. coli strain CFT073). Grey arrows show the integration site of the HPI (asparagine tRNA genes asnT or asnV) with the attO site indicated as hatched boxes representing the 17 bp direct repeat (DR) flanking the HPI of Y. pestis and E. coli ECOR31. Dashed black lines show the neighbouring chromosome. Black arrows indicate the int and fyuA genes marking the boundary of the core of the HPI. RB-HPI ECOR31 defines the location of the 34 480 bp right border of the HPI of E. coli strain ECOR31.

Table 1. Bacterial strains, plasmids, cosmids and oligonucleotides used in this study.
Strains, plasmids, cosmids or oligonucleotides | Relevant characteristic(s) and description | Reference(s) or source
Strains
ECOR 31 | Strain 31 of the E. coli collection of reference (ECOR); Tcr, Knr, Apr, Cms, Tps | Ochman and Selander (1984)
DH10B | F− mcrA Δ(mrr-hsdRMS-mcrBC) Φ80dlacZΔM15 ΔlacX74 endA1 recA1 deoR (ara leu)7697 araD139 galU galK nupG rpsL λ− | Invitrogen
XL-1 Blue MR′ | Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 lac | Stratagene
TH2 | supE44 hsdS20 (rB− mB−) recA13 ara-14 proA2 lacY1 galK2 rpsL20 xyl-5 mtl-1 thi trpR624 | Takara Shuzo Co. Ltd.
TH2-Spec | Spontaneous spectinomycin resistant mutant of E. coli TH2 | This study
TH2-Te | Tetracycline resistant derivative of E. coli TH2 | K. Hantke, Tübingen
S17-1 | RP4-2 (Tc::Mu, Kn::Tn7), Tpr Smr | Simon et al. (1988)
Plasmids, cosmids
SuperCos1 | Cosmid vector; 7.6 kb; Apr Knr | Stratagene
pACYC184 | Cloning vector, P15A replicon, Cmr Tcr | Invitrogen
pGP1-2 | Cloning vector, P15A replicon, contains the T7 RNA polymerase gene, Knr | Tabor et al. (1985)
pGP1-2DHFR | pGP1-2 with EZ::TN DHFR-1 insertion within the Knr-cassette, Tpr | This study
pCR4-TOPO | TA cloning vector; 3.9 kb; Apr Knr | Invitrogen
pDU17 | SuperCos1 cosmid vector carrying 36-kbp fragment of the HPIECOR31 | This study
pDU18 | SuperCos1 cosmid vector carrying 35-kbp fragment of the HPIECOR31 | This study
pDU19 | pDU18 carrying EZ::TN <DHFR-1> transposon insertion within the pilX5 gene | This study
pDU20 | pDU18 carrying EZ::TN <DHFR-1> transposon insertion within the pilX6 gene | This study
pDU21 | pDU18 carrying EZ::TN <DHFR-1> transposon insertion within the mobB gene | This study
pDU25 | pACYC184 carrying the 1838-bp DNA fragment with the putative oriT of HPIECOR31; Tcr | This study
pDU26 | pDU18 carrying EZ::TN <DHFR-1> transposon insertion within the helicase gene | This study
pDU30 | pT7-5 expression vector tagged with a Cmr cassette carrying the int gene of E. coli ECOR31 | This study
Oligonucleotides, position (nt)
oriT.for, 16221 | 5′-GCCGTATGAATGACATCTATTTCCGTC-3′ | This study
oriT.rev, 18087 | 5′-GCTTATTTAAAATCATCATGCCTCCTTCGT-3′ | This study
P1 | 5′-CCGCCATTACTTACAACCAGA-3′ | This study
P2 | 5′-AGAAGGCTTGAGGGTGCGGATTT-3′ | This study
P3 | 5′-CACAACTGCGGCTTCACTCAAA-3′ | This study
P4 | 5′-CGTGTGCAAATTATCGAC-3′ | This study

Figure 2.

Genetic organization of the 34 480 bp RB-HPIECOR31 with three distinct DNA regions I to III. The locations and orientation of the ORFs described in this study are indicated by arrows. Genes of the Mpf gene cluster of region I are indicated by grey shading, black arrows indicate genes encoding for the DNA-processing system of region II with the IS element IS630 homologue depicted as hatched arrow (tnpA), and white arrows indicate ORFs of region III. Designations above the schemes represent genes with homology to other bacterial alleles as shown in Table 2. The black triangle upstream of region I shows the end of homology to the HPI of Y. pestis/pseudotuberculosis, the black bar downstream of region III indicates the end of HPIECOR31 built by the 17 bp direct repeat (DR).

Figure 4.

Structural and functional analyses of RB-HPIECOR31. The central, long horizontal line represents the map of the RB-HPIECOR31 with the three DNA regions depicted as grey, black and white boxes. The small horizontal arrows point in the direction of transcription of the genes for RB-HPIECOR31. The arrowhead under region II indicates the position of the putative nic site. Below the RB-HPIECOR31 map, horizontal lines indicate the DNA segments remaining in several HPIECOR31 derivatives. The name of each cosmid/plasmid is shown on the right-hand side. Open triangles indicate the position of trimethoprim resistance cassettes inserted by EZ::TN DHFR-1 mutagenesis with dashed lines projecting the respective position on the map of the RB-HPIECOR31. Functional properties of deletion derivatives (right): Mob: +, functional Mob region as deduced from the ability of the particular derivative (helper plasmid) to mobilize the indicator plasmid pDU25 (carrying oriTHPI) from E. coli donor strain DH10B bearing the respective helper plasmid to the E. coli recipient strain TH2; –, absence of functional Mob region.

Table 2. Characteristics of ORFs and deduced amino acid sequences present in the sequenced DNA fragment.
ORF no. | Name | Product size (amino acids) | ORF location (start, stop)a | ORF product exhibits homology to: | Source | Identity (%)b | Range (AA)b | Accession no.
01 | pilX1 | 236 | 6363, 7073 | Pilx1 protein, mating pair formation | IncX plasmid R6K, E. coli | 58 | 205 | CAC20138.1
02 | pilX2 | 97 | 7073, 7366 | VirB2 protein (TraC protein), mating pair formation | Plasmid pSB102, R. meliloti | 36 | 91 | CAC79181.1
03 | pilX3-pilX4 | 912 | 7379, 10117 | N-terminal: VirB3 protein | Plasmid pSB102, R. meliloti | 36 | 102 | CAC79180.1
 |  |  |  | C-terminal: Pilx4 protein | IncX plasmid R6K, E. coli | 44 | 810 | CAC20141.1
04 | pilX5 | 235 | 10135, 10842 | ORF6 of plasmid pYC, PilX5 homologue | Plasmid pYC, Y. pestis | 41 | 235 | AAF05102.1
05 | ORF5 | 80 | 10850, 11092 | Lipoprotein Eex homologue | IncX plasmid R6K, E. coli | 36 | 58 | CAC20143.1
06 | pilX6 | 357 | 11096, 12169 | ORF5 of plasmid pYC, PilX6 homologue | Plasmid pYC, Y. pestis | 37 | 249 | AAF05101.1
07 | ORF7 | 45 | 12261, 12398 | no significant homology |  |  |  |
08 | pilX8 | 227 | 12391, 13074 | Pilx8 protein, mating pair formation | IncX plasmid R6K, E. coli | 37 | 179 | CAC20146.1
09 | pilX9 | 302 | 13071, 13979 | Pilx9 protein, mating pair formation | IncX plasmid R6K, E. coli | 42 | 283 | CAC20147.1
10 | pilX10 | 416 | 14023, 15273 | hypothetical protein, PilX10 homologue | Xylella fastidiosa Dixon | 73 | 333 | NZ_AAAL01000169.1
11 | pilX11 | 341 | 15263, 16288 | Pilx11 protein, mating pair formation | IncX plasmid R6K, E. coli | 46 | 320 | CAC20149.1
12 | yggA | 101 | 16718, 17023 | YggA protein | R721 plasmid, E. coli | 55 | 98 | BAB12662.1
13 | ORF13 | 101 | 17057, 17362 | no significant homology |  |  |  |
14 | ORF14 | 100 | 17473, 17775 | no significant homology |  |  |  |
15 | mobB | 629 | 18042, 19931 | MobB protein, coupling protein/relaxase | Enterobacter cloacae, plasmid CloDF13 | 34 | 423 | CAB62409.1
16 | mobC | 248 | 19941, 20687 | MobC protein, involved in relaxation | E. cloacae, plasmid CloDF13 | 31 | 216 | CAB62410.1
17 | vrlS | 964 | 23642, 20748 | VrlS, putative DEAH helicase, ATP-dependent | Dichelobacter nodosus | 30 | 436 | AAC33388.1
18 | vrlR | 260 | 25034, 24255 | VrlR protein, unknown function | Dichelobacter nodosus | 29 | 121 | AAC33387.1
19 |  | 315 | 26285, 25338 | Antirestriction protein | Mesorhizobium loti | 44 | 280 | BAB52493.1
20 | tnpA | 343 | 26818, 27849 | Transposase of insertion element IS630 | Shigella sonnei | 94 | 343 | CAA29389.1
21a | ORF21a | 86 | 28381, 28641 | no significant homology |  |  |  |
21b | ORF21b | 200 | 28651, 29253 | no significant homology |  |  |  |
22 | vc0181 | 178 | 30135, 29599 | VC0181, conserved hypothetical protein | Vibrio cholerae | 57 | 143 | AAF93357.1
23 | vc0180 | 538 | 31815, 30199 | VC0180, conserved hypothetical protein | Vibrio cholerae | 50 | 485 | AAF93356.1
24 | vc0179 | 432 | 33113, 31815 | VC0179, conserved hypothetical protein | Vibrio cholerae | 61 | 434 | AAF93355.1
25 | vc0178 | 361 | 34198, 33113 | VC0178, patatin-related protein | Vibrio cholerae | 64 | 344 | AAF93354.1
26 | ORF26 | 572 | 36325, 34607 | hypothetical protein | Desulfovibrio desulphuricans | 26 | 302 | ZP_00131100.1
27 | ORF27 | 297 | 37306, 36413 | hypothetical protein | Burkholderia fungorum | 26 | 149 | ZP_00028189.1
28 | ORF28 | 293 | 38283, 37402 | hypothetical protein, patatin-like phospholipase | Nitrosomonas europaea | 68 | 256 | ZP_00004075.1
a. Nucleotide position from start to stop codon in the HPIECOR31 sequence with regard to the start of the fyuA gene (start of fyuA as nt 1).
b. Presented as the percentage amino acid identity between the 3′-part of HPIECOR31 and the best hit as determined with blast and fasta. The range is the number of amino acids over which this identity exists.
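
The product sizes in Table 2 can be cross-checked against the ORF coordinates. The sketch below assumes, following footnote a, that each span runs from the first base of the start codon to the last base of the stop codon, and that a start coordinate larger than the stop coordinate denotes the reverse strand. The function name `product_size` is hypothetical; the four rows are copied from Table 2.

```python
def product_size(start: int, stop: int) -> int:
    """Amino acids encoded by an ORF with inclusive start/stop coordinates.

    Either strand is accepted (start > stop on the reverse strand). The stop
    codon is assumed to lie inside the span and encodes no residue.
    """
    length_nt = abs(stop - start) + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return length_nt // 3 - 1


# (name, start, stop, listed product size) taken from Table 2
rows = [
    ("pilX1", 6363, 7073, 236),
    ("mobB", 18042, 19931, 629),
    ("vrlS", 23642, 20748, 964),   # reverse strand
    ("tnpA", 26818, 27849, 343),
]
checks = {name: product_size(start, stop) == size
          for name, start, stop, size in rows}
```

A row that fails this check would point either to a coordinate convention different from the one assumed here or to a transcription error in the table.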

The HPI of E. coli ECOR31 encodes for a complete type IV secretion system (T4SS) and genes for mobilization

Region I is located between positions 5450 and 16 449 (accession no. AY233333), and carries 11 ORFs. The deduced amino acid sequences of the 11 ORFs showed homology either to PilX proteins involved in the conjugal pilus assembly of the conjugative plasmid R6K of E. coli, or to respective proteins encoded by plasmid pYC of Y. pestis and plasmid pSB102 of Rhizobium meliloti (Table 2). The gene organization of region I is identical to the pilX region of IncX plasmid R6K; all ORFs are in the same orientation, most are either overlapping or separated by only a few nucleotides, suggesting that they comprise an operon and are translationally coupled. Thus, region I encodes a putative mating pair formation system (Mpf) as found on self-transmissible conjugative plasmids.

Region II is located between positions 16 450 and 26 714 and harbours ORF12 to ORF19. Besides ORFs with no significant homology to sequences in the GenBank database (ORF13 and ORF14), region II carries ORFs with homology to genes involved in mobilization of conjugative plasmids: ORF15 and ORF16 reveal homology to genes mobB and mobC of plasmid CloDF13 of Enterobacter cloacae encoding atypical mobilization proteins (Núñez and de la Cruz, 2001). In accordance with the structure of plasmid CloDF13, an oriT with a putative nic site (5′-GGTTG/GTCGCG-3′) was localized 240 bp upstream of the putative mobB gene (Fig. 3). However, this nic site reveals two nucleotide substitutions compared to the corresponding site of plasmid CloDF13 (Núñez and de la Cruz, 2001). Unlike the nic site of CloDF13, the putative nic site of HPIECOR31 is part of an inverted repeat structure (IR2, Fig. 3), resulting in two copies of the proposed nic sequence in inverted orientation relative to each other. The presence of two independent nic sites suggests that either of the two DNA strands could be nicked and transferred into the recipient cell. This structure, with an oriT (nic) site on each of the two strands of a conjugative element, is exceptional and has so far only been described for the R6K plasmid; it poses interesting regulatory questions as to the activation of the two sites (Avila et al., 1996). Two further translated ORFs of region II show homology to proteins involved in plasmid mobilization, namely a putative helicase of Dichelobacter nodosus (ORF17), and a putative antirestriction protein of Mesorhizobium loti (ORF19). The putative antirestriction protein encoded by ORF19 further exhibits high similarity to the N-terminal 300 amino acids of the antirestriction protein ArdC encoded by the IncW plasmid pSa of E. coli, and to the protein transport domain of the TraC1 primase of the promiscuous E. coli plasmid RP4, a protein which is transferred to recipients during conjugation.
As both ArdC and TraC1 are able to bind single-stranded DNA and escort plasmid DNA during conjugation (Rees and Wilkins, 1990; Belogurov et al., 2000), the putative protein encoded by ORF19 may play a role in the conjugative transfer of the island. A complete copy of the RP4-type traC gene is not located on the RB-HPIECOR31, as the primase domain is missing. Thus, region II carries genes encoding for a putative coupling protein and proteins involved in cleavage of the oriT site, which is likely to be located within region II upstream of ORF15 (mobB). A 1147 bp DNA fragment with 88.5% identity to insertion sequence IS630 represents the border with region III and carries ORF20, which shares 94% identity with the transposase TnpA of insertion element IS630.
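
A nic sequence that lies within an inverted repeat appears once on each strand, which is why either strand could in principle be nicked. The sketch below illustrates this with the putative nic sequence from the text (5′-GGTTGGTCGCG-3′, slash removed). The helper names and the toy sequence with its spacers are assumptions made for illustration; the toy sequence is not the actual oriT region.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_on_both_strands(seq: str, motif: str) -> list[tuple[int, str]]:
    """Locate a motif on the given (+) strand and on the complementary (-) strand.

    Positions are 0-based offsets on the (+) strand. Assumes the motif is not
    its own reverse complement, otherwise each site would be reported twice.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


NIC = "GGTTGGTCGCG"  # putative nic sequence reported in the text
# Hypothetical inverted repeat: the motif, a spacer, then its reverse complement.
toy = "AAAA" + NIC + "TTTTT" + reverse_complement(NIC) + "AAAA"
hits = find_on_both_strands(toy, NIC)
```

In the toy sequence the motif is found once per strand, which is the arrangement the inverted repeat IR2 is proposed to create.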

Figure 3.

Gene organization of the HPIECOR31 Mob region.
A. Comparison of the gene organization of the Mob region of plasmid CloDF13 (accession number AJ224861) and the Mob region of HPIECOR31. Above the gene boxes (box arrowheads point in the direction of transcription), some key sequence coordinates (in bp) are shown. The brackets mark the map position of the nic site and give a sequence alignment of the conserved sequence found in the nic sites of both Mob regions. The black arrowhead indicates the position of the nic site in the complementary DNA strand.
B. The sequence of the putative oriT region of HPIECOR31 reveals inverted repeats (IR) and one direct repeat (DR), indicated by black arrows above the sequence. Unlike CloDF13, the putative nic site of HPIECOR31 is part of an inverted repeat structure (IR2), resulting in a duplication of the proposed nic sequence. Black arrowheads indicate the position of the nic site in the DNA strand and black boxes mark the conserved nic site sequence motif.

Region III, located between positions 27 861 and 38 928 and extending from ORF21a to ORF28 (Table 2), encodes putative products with either low homology to chromosomal genes found in Vibrio cholerae (ORF22–ORF25), or with no significant similarity to sequences in the GenBank database. Unlike all HPIs found in E. coli so far, the integrated HPIECOR31 is flanked by a 17 bp direct repeat sequence which has been characterized in Yersinia spp. as the core (attO) of the HPI attachment site, similar to the att sites of site-specifically integrating bacteriophages (Figs 1 and 5) (Campbell, 1992). In conclusion, the RB-HPIECOR31 consists of three regions that provide a mating pair formation system and a DNA-processing region for conjugative transfer. It is noteworthy that no conserved repABC genes or any other indication of a plasmid replicon have been found in the RB-HPIECOR31. Together with the remaining 5′-part of the HPI, which represents the known structure of the HPI in E. coli (P4-like integrase gene at the 5′-border, integration at an asn tRNA gene), the HPIECOR31 fulfils all structural criteria of the recently defined family of integrative and conjugative elements, ICE (Hochhut et al., 1999; Burrus et al., 2002a).

Figure 5.

Recircularisation of the HPIECOR31.
A. Proposed model of site-specific integration and excision of the HPIECOR31. The HPIECOR31 integrates specifically into a chromosomal attachment site, attB. Integration and excision of the element are mediated by an integrase (int). The left and right HPIECOR31-chromosome junctions, attL and attR, are formed by the recombination between the chromosomal attB and attP of HPIECOR31. After chromosomal excision and circularisation, an element-specific att site (attP) is created. The HPIECOR31 is represented by black lines with the integrase gene shown as a black arrow, whereas the neighbouring chromosomal DNA is shown in grey. The attP site (P-O-P′) is depicted as black hatched boxes (P-P′) enclosing the O-part (black box) which forms the core of the attachment site; the attB site is shown in grey. The small black arrows below the figure (P1, P2, P3 and P4) indicate the location and orientation of the primers used for the detection of the extrachromosomal circular form of the HPIECOR31 (P2 and P3), as well as for the identification of the attB site (P1 and P4) after excision of the chromosomal HPIECOR31.
B. PCR analysis for the recircularised form of the HPI (primers P2 and P3) and the attB site (primers P1 and P4); lanes 1 and 4, DH5α used as control; lanes 2 and 5, ECOR31 without induction of the int gene; lanes 3 and 6, ECOR31 (pDU30) after induction of the int gene.

The T4SS (pilX-homologue) encoded by the RB-HPIECOR31 is functional and serves as a mating pair formation system

In order to determine the functional properties of the mating pair formation system (Mpf) encoded by RB-HPIECOR31, we isolated the cosmid pDU18 which carries the entire pilX gene cluster. We further generated pDU18 derivatives by in vitro transposon mutagenesis (EZ::TN <DHFR-1> Insertion Kit, Epicentre) and in each mutant confirmed the insertion site of the resistance cassette within the pilX gene cluster by means of sequencing. Cosmid pDU18 and the two derivatives pDU19 and pDU20, carrying Tn insertions within the pilX5 and pilX6 gene, respectively, were used in mating experiments together with the SuperCos cosmid vector (Fig. 4). The cosmid pDU18 was self-transmissible in matings from E. coli DH10B (donor) to E. coli TH2 (recipient) with a transfer frequency of 10⁻⁵. In contrast, the conjugal transfer of the SuperCos vector, of pDU19 and of pDU20 was reduced to below detectable levels. When introduced into an E. coli donor strain that provides an intact Mpf, e.g. E. coli strain S17-1, the transfer frequencies of pDU19 and pDU20 to the recipient were as high as that of the parental pDU18 cosmid. These results indicate that the Mpf system encoded by the HPIECOR31 is functional and mediates conjugal transfer.

The RB-HPIECOR31 bears an origin of transfer (oriT) and encodes a functional DNA-processing system active at this oriT

Within conjugative systems the oriT contains the nic site and is typically located next to the DNA-processing genes. On the basis of structural analogy to plasmid CloDF13 of Enterobacter cloacae (Núñez and de la Cruz, 2001), a 1838 bp DNA fragment of the RB-HPIECOR31 between ORF11 and ORF15 (mobB) was considered likely to harbour an origin of transfer (oriTHPI) (Fig. 3). The sequence of this 1838 bp region contains several features, including direct and indirect repeats and an area with a high A + T content, that are commonly found in the oriT of conjugative elements (Fig. 3) (Ippen-Ihler and Skurray, 1993). As the origin of transfer is the only element required in cis on a DNA molecule for transfer (Lanka and Wilkins, 1995), we cloned the fragment into a vector lacking an oriT in order to determine if the putative mobilization proteins MobB and MobC of RB-HPIECOR31 could mediate the transfer of a plasmid carrying oriTHPI. For this, the 1838 bp DNA fragment of the putative oriTHPI was amplified by PCR (Table 1, oriT.for; oriT.rev), and cloned into the pACYC184 vector (Invitrogen) that is not ordinarily mobilizable, resulting in plasmid pDU25 (Figs 2 and 4). Matings were performed using the donor strain E. coli DH10B carrying the pDU18 cosmid that provides both the conjugal pilus system and the putative coupling proteins, together with E. coli TH2-Te as the recipient. The exconjugants were selected on LB agar with appropriate antibiotics in order to monitor both pDU18 and pDU25 in each experiment. As expected, the SuperCos vector alone was not able to mediate the transfer of pDU25 (pACYC184 + oriTHPI). In contrast, using the E. coli strain DH10B carrying cosmid pDU18 as the donor, we were able to detect a transfer of pDU25 at a frequency of 10⁻⁴ transconjugants/donor, but not of the pACYC184 vector.
Ten of 100 tested exconjugants containing pDU25 showed resistance to tetracycline and kanamycin, indicating that co-transfer of pDU25 with pDU18 did occur at a low frequency. Consistently, derivatives of cosmid pDU18 carrying a mutated mobB gene (pDU21) or a mutation of the putative helicase gene (pDU26) could not effect transfer of the tester plasmid pDU25 (Fig. 4). The results demonstrate that: (i) the RB-HPIECOR31 provides an origin of transfer structurally related to the oriT of plasmid CloDF13, and that (ii) the Mob proteins encoded by RB-HPIECOR31 are functional, recognize the oriTHPI and serve as an HPI-encoded nickase in a conjugative machinery that possibly mediates the transfer of the entire HPIECOR31. However, additional studies are required to precisely identify the nic site within this region.
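
Transfer frequencies such as those reported above are expressed as transconjugants per donor cell. The sketch below shows that arithmetic under stated assumptions: the function name is hypothetical and the colony counts are invented to reproduce a frequency of the reported order of magnitude; they are not data from this study. Only the co-transfer fraction (10 of 100 exconjugants) is taken from the text.

```python
def transfer_frequency(transconjugants_per_ml: float, donors_per_ml: float) -> float:
    """Transconjugants per donor cell, from viable counts on selective plates."""
    if donors_per_ml <= 0:
        raise ValueError("donor count must be positive")
    return transconjugants_per_ml / donors_per_ml


# Hypothetical viable counts (CFU/ml), chosen only to illustrate the arithmetic.
freq = transfer_frequency(transconjugants_per_ml=2.0e4, donors_per_ml=2.0e8)

# Fraction of pDU25 exconjugants that had also received pDU18 (from the text).
cotransfer_fraction = 10 / 100
```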

Induction of the HPI-encoded integrase gene promotes precise excision and circularisation of HPIECOR31

Bacteriophages such as λ, as well as certain conjugative transposons such as CONSTINs (e.g. the SXT element of V. cholerae), all integrate and excise from the chromosome via a circular extrachromosomal intermediate (Campbell, 1992; Hochhut et al., 1999). Functional integrase and excisionase proteins are required for the site-specific recombination between the attL and attR sequences, leading to the generation of a circular extrachromosomal intermediate of the respective element. We tested whether a similar intermediate is formed after the excision of the HPIECOR31 from the chromosome. As no extrachromosomal HPI fragments could be isolated from the ECOR31 strain with several different types of plasmid preparation procedures, we used a more sensitive PCR assay for the detection of an extrachromosomal circular form of the HPIECOR31. For this, oligonucleotide primers oriented towards the right and left HPIECOR31-chromosome junctions were designed (primers P2 and P3, Fig. 5). These primers will amplify a product only if the right (attR) and left (attL) ends of the integrated HPIECOR31 excise and circularise to yield an extrachromosomal form of the element (Fig. 5). Using these primers, a PCR product with the expected size (486 bp) was amplified from E. coli strain ECOR31. However, the appearance of this PCR product carrying the recircularised junction (attP) of the HPIECOR31 was rather inconsistent, probably as a result of the variable expression of the integrase gene. Therefore, in order to enhance expression of the int gene, we introduced the plasmid pDU30, which carries an inducible integrase gene under control of the T7 promoter (Tabor and Richardson, 1985), into the E. coli ECOR31 strain. Induction of integrase expression resulted in high, constant levels of excision and re-circularisation of the HPIECOR31 (Fig. 5).
The junction formed by the circularisation of the HPIECOR31 was further analysed by subcloning and sequencing of the 486 bp PCR products derived from both ECOR31 and ECOR31 (pDU30). Identical sequences were found in these plasmids which included the 17 bp direct repeat sequence (DR; 5′-CCAGTCAGAGGAGCCAA-3′; designated attO in Fig. 5) predicted to be formed by circularisation of the right (attR) and left (attL) ends of the integrated element. The sequences left and right of attO are identical with the ultimate ends of HPIECOR31, representing a complete attP site with the structure attP-O-P′ (Fig. 5). These results indicate that a circular extrachromosomal form of the HPI is present in ECOR31. As we were able to detect an intact attB sequence of E. coli by PCR after the induction of the integrase gene (Fig. 5), excision of the element does not leave an integrated copy in the chromosome. There was no indication of a plasmid replicon within the HPIECOR31, which suggests that the circular form may be a non-replicative transfer intermediate.
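
The logic of the PCR assay can be modelled on strings: recombination between the two copies of the 17 bp core leaves one copy in the chromosome (attB) and one on the circle (attP), and the sequence spanning the circle's join exists only after excision. The sketch below is a simplified illustration, not a description of the recombination mechanism. The flanks and island body are short invented sequences and the function names are hypothetical; only the 17 bp repeat is taken from the text.

```python
ATT_O = "CCAGTCAGAGGAGCCAA"  # 17 bp direct repeat reported in the text


def excise(integrated: str, core: str = ATT_O) -> tuple[str, str]:
    """Split an integrated locus into (chromosome, circular element).

    The circular element is returned as a linear string starting at its single
    core copy; the chromosome keeps the other copy (the restored attB).
    """
    left = integrated.find(core)
    right = integrated.rfind(core)
    if left == -1 or left == right:
        raise ValueError("expected two copies of the core repeat")
    element = integrated[left:right]                      # core + island body
    chromosome = integrated[:left] + integrated[right:]   # flank + core + flank
    return chromosome, element


def circular_contains(circle: str, probe: str) -> bool:
    """True if probe occurs in the circular molecule, also across its join."""
    return probe in circle + circle[: len(probe) - 1]


# Hypothetical toy locus: short invented flanks and island body.
left_flank, island, right_flank = "TTTT", "AAACCCGGGTTT", "GGGG"
integrated = left_flank + ATT_O + island + ATT_O + right_flank
chromosome, element = excise(integrated)

# Sequence spanning the join of the circle: island end, core, island start.
junction = island[-4:] + ATT_O + island[:4]
```

In this model the junction is absent from the integrated locus and present only in the circular form, which mirrors why the outward-facing primers P2 and P3 yield a product only after circularisation.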

Discussion

In this study we report on the pathogenicity island HPI of E. coli strain ECOR31 which reveals both common and rather distinct structural features compared to HPIs of other E. coli strains and Yersinia species, as well as to other PAIs found in members of the Enterobacteriaceae family. As with all HPIs described so far, the HPIECOR31 (i) is inserted at a tRNA locus, (ii) contains a P4-like integrase gene at one border, and (iii) encodes a virulence determinant, the yersiniabactin siderophore system. However, the HPIECOR31 is twice as large as all known HPIs as a result of an additional DNA region within the right part of the HPI, which is not present in any other HPI described to date. It encodes a DNA-processing system, pilus assembly and mating pair formation functions typical for conjugative DNA transfer. No such complete set of conjugative plasmid functions has yet been found on other pathogenicity islands. The genetic information for conjugal pilus formation is most similar in its sequence and organization to R6K, a conjugative plasmid derived from E. coli, whereas the mobilization system resembles the transfer system of plasmid CloDF13 of Enterobacter cloacae (Núñez and de la Cruz, 2001). However, HPIECOR31 is not simply a plasmid that integrates, as it lacks the highly conserved repABC genes and other genes required for either autonomous replication or segregation. This suggests that the circular form of the HPIECOR31 detectable upon induction of the integrase gene may be a non-replicative transposition or transfer intermediate. The chromosomal integration and excision of the HPIECOR31, and the recombination events underlying these processes, are closely related to site-specific recombination found in lambdoid phages. Like lambda, the HPI forms a circular extrachromosomal intermediate.
Recombination occurs between sites of local identity on the element (attP) and the chromosome (asn tRNA, attB) during integration and between attL and attR during excision. The excision and integration reactions have previously been demonstrated for the HPI of Y. pestis using a minimal integrative module of the pathogenicity island (Rakin et al., 2001). These steps were independent of homologous recombination and solely required the functional HPI-encoded int gene that belongs to the P4 integrase family. However, the mobility of the HPI differs greatly among the three pathogenic Yersinia species and E. coli. In Y. pseudotuberculosis, precise excision of the HPI occurs spontaneously at a frequency of ∼10⁻⁴ (Buchrieser et al., 1998). The HPI of Y. pseudotuberculosis has retained the ability to excise and reintegrate into the chromosome. In contrast to Y. pseudotuberculosis, the excision of the Y. pestis HPI is not precise but occurs as part of a much larger chromosomal deletion of a 102 kb genome fragment which is flanked by two copies of the IS100 insertion element. This spontaneous excision of the 102 kb segment occurs at very high frequencies (2 × 10⁻³), probably by homologous recombination between the two IS100 copies that flank the unstable region (Fetherston et al., 1992; Fetherston and Perry, 1994). It may therefore mask a lower deletion frequency of the HPI alone (Buchrieser et al., 1998). The HPI of Y. enterocolitica appears to be rather stable, probably as a result of a dysfunctional integrase caused by a frame shift mutation. The HPI hitherto found in E. coli appears to be the most stable. It is characterized by a deletion of the 3′-border that results in a loss of the attR (Schubert et al., 1999). Additionally, the HPIs of several E. coli strains reveal a distinct deletion within the integrase gene (Karch et al., 1999). Both observations are likely to account for the inability of all E. coli HPIs described so far to undergo precise excision and mobilization. The HPI of ECOR31 presented in this study, however, is mobilizable and circularisation of the entire HPI is detectable upon expression of the integrase gene.

It is of note that the HPI of Y. pestis is almost identical to the HPI of E. coli, sharing more than 99% nucleotide sequence identity, and is more distantly related to the HPI of Y. enterocolitica (96%). This points to a recent horizontal transfer event between Y. pestis and E. coli. However, although its distribution among different members of the family Enterobacteriaceae is also indicative of recent and widespread horizontal transmission, the HPI has not yet been shown to be transferable. This is also true for almost all other PAIs described, as the mechanism(s) of acquisition of PAIs by their bacterial hosts has not yet been elucidated. In addition, PAIs have thus far not been shown to encode genes involved in self-mobilization. The structure and the mobilization of the HPIECOR31 are reminiscent of both temperate bacteriophages and conjugative plasmids. The element encodes a lambda family recombinase (P4-like integrase) that is required for its excision from the chromosome, and circularisation by recombination between the right and left ends of the integrated element. Generation of this extrachromosomal intermediate is likely to be an essential step in the successful transfer of the HPIECOR31 and must precede conjugative transfer to recipient cells. Once transferred to the recipient, the HPIECOR31 is thought to integrate site-specifically into the 3′ end of the asn tRNA gene. The integration of the HPI, like the excision, requires the HPI int gene as has been shown by an in vitro model (Rakin et al., 2001). Accordingly, the excision of the HPIECOR31 is dependent on the presence of the HPI-encoded integrase. Interestingly, the recent report of Tauschek et al. (2002) describes a functional integrase of a novel LEE pathogenicity island of a rabbit enteropathogenic E. coli strain. This integrase can mediate site-specific integration of foreign DNA at the pheU tRNA locus of E. coli DH1, indicating possible mechanisms of mobilization and integration of this particular LEE PAI. Recently, Sullivan and colleagues reported a 502 kb chromosomally integrated element of Mesorhizobium loti that is transferable to non-symbiotic mesorhizobia in the environment (Sullivan and Ronson, 1998; Sullivan et al., 2002). As with the HPIECOR31, this symbiosis island of mesorhizobia lacks plasmid replication genes, suggesting that it forms a distinct type of site-specific conjugative transposon. However, besides a specific mobilization and conjugal transfer of PAIs, other mechanisms are likely to contribute to their wide distribution, e.g. phage transfer by generalized transduction. Indeed, a bacteriophage-mediated transfer of a PAI between bacterial isolates has been reported for the Gram-negative bacterium Vibrio cholerae and the Gram-positive bacterium Staphylococcus aureus. It has been suggested that the Vibrio pathogenicity island (VPI) is at the same time a novel filamentous phage, though this observation is still under debate (Karaolis et al., 1999; Faruque et al., 2003). In addition, an alternate transfer mode of the VPI between Vibrio cholerae isolates has recently been shown to be mediated by the generalized transducing phage CP-T1 (O’Shea and Boyd, 2002). In S. aureus the 15.2 kb genomic island SaPI1, which encodes the toxic shock syndrome toxin and requires a helper bacteriophage 80α to excise and replicate, is transduced to recipient strains at very high frequencies (Ruzin et al., 2001). However, it is unclear whether the 15.2 kb SaPI1 entirely conforms to the definition of a PAI or may rather represent a defective phage that requires a helper phage similar to the P2/P4 phage interaction (Boyd et al., 2001; Ruzin et al., 2001).

It is tempting to speculate that the HPIECOR31 described here is a still transferable progenitor of the HPI found in yersiniae and other E. coli. Interestingly, the 35 kb DNA region of the HPIECOR31 is located exactly at the position of the IS100 insertion found within the 3′-border of the HPI of Y. pestis and Y. pseudotuberculosis. This IS100 copy has been described as an atypical one which lacks the target duplications indicative of transposition (Hare et al., 1999). This observation could easily be explained by a recombination-deletion event between two IS100 copies that had previously inserted into a larger HPI, e.g. the HPIECOR31. Furthermore, the insertion of the IS100 copy at the 3′-border of the HPI of Y. pestis and Y. pseudotuberculosis interrupts a small ORF which shares homology with the haemolysin expression modulating protein Hha of E. coli. The HPIECOR31 carries an intact ORF with homology to Hha throughout the entire ORF. This suggests that the IS100 insertion found in the HPI of Y. pestis/Y. pseudotuberculosis, and the multiple insertions of IS elements at the right border of the HPI of Y. enterocolitica, may have deleted larger right-border DNA regions such as that found in E. coli ECOR31. Our study has shown that the HPI of E. coli strain ECOR31 is reminiscent of large site-specific integrative and conjugative elements (ICE) as defined by Burrus et al. (2002a, b). The HPIECOR31 described here is the first example of a large ICE found in E. coli, and according to the classification of Burrus and co-workers we therefore propose that the HPIECOR31 be designated as ICEEc1. It appears likely that the ICEEc1 is transferred via a type IV mating pore in a mechanism analogous to plasmid-mediated conjugation. Questions of particular interest include the signals and mechanisms underlying horizontal transfer of the ICEEc1 in the environment. We suggest, finally, that further study of the ICEEc1 transfer will provide an important body of knowledge related to the evolution and spread of pathogenicity islands in general.

Experimental procedures

Bacterial strains and DNA methods

The E. coli ECOR31 strain is part of the E. coli Collection of Reference (ECOR) (Ochman and Selander, 1984), which was obtained from Thomas Whittam (Pennsylvania State University, State College, PA, USA). Isolation of plasmids, cosmids and genomic DNA, as well as cloning of DNA fragments were performed using standard techniques (Ausubel et al., 1989). A cosmid library of total DNA from E. coli strain ECOR31 was constructed according to the manufacturer's recommendations using the SuperCos 1 Cosmid Vector Kit and the Gigapack III Gold Packaging Extract (Stratagene, Heidelberg, Germany). Polymerase chain reaction products representing parts of the HPI (irp1, irp2 and fyuA) were used for screening the cosmid library and further DNA hybridization analyses (Schubert et al., 1998).

DNA sequencing and phylogenetic analysis

DNA was sequenced using the BIG Dye Deoxy Termination Kit according to the manufacturer's instructions together with a model 377 DNA sequencing system (Applied Biosystems, Weiterstadt, Germany). In order to sequence the 3′-part of HPIECOR31, a shotgun library of overlapping cosmids was prepared from mechanically sheared DNA. Fragments with sizes of 1.5–2.0 kb were separated by agarose gel electrophoresis, end repaired and cloned into the pZErO-2.1 vector (Invitrogen, Heidelberg, Germany). DNA of random plasmid clones was isolated, purified and used as templates for shotgun sequencing reactions as described above. To complete the project, directed sequencing with custom primers was carried out on PCR products amplified from genomic DNA. Approximately 500 sequences were used for the final assembly. Given an average read of 500 bp, approximately 250 kb of unique reads were generated. Management and analysis of nucleotide sequence data was performed using the Lasergene sequence analysis software system (DNASTAR, Madison, WI, USA). Homology searches were performed by comparing the sequences with the public DNA and protein databases using the programs blastn and blastx (Altschul et al., 1997) (http://www.ncbi.nlm.nih.gov) and fasta (http://www.ebi.ac.uk/fasta3).

Construction of pilX, mobB, and helicase mutants by EZ::TN transposon insertion

The cosmid pDU18 carrying region I (pilX operon, mating pair formation system) and region II (putative DNA-processing region) was subjected to in vitro mutagenesis using the EZ::TN <DHFR-1> Insertion Kit (Epicentre) (Table 1, Fig. 4), and transformed into E. coli DH10B. Insertion points were confirmed by sequencing using primers provided with the EZ::TN <DHFR-1> Insertion Kit (Epicentre). Plasmids carrying <DHFR-1> EZ::TN insertions into the pilX operon (pDU19, pDU20), the mobB gene (pDU21) and the helicase gene (pDU26) were isolated and investigated in mating experiments (Table 1, Fig. 4).

Bacterial conjugations

Conjugation experiments were carried out on Columbia agar plates containing 5% sheep blood (Becton Dickinson, Heidelberg, Germany) essentially as described previously (Waldor et al., 1996). In the matings between E. coli strains, donor strains (E. coli strain DH10B carrying different recombinant plasmids of RB-HPIECOR31) were streaked together with the recipient strain E. coli TH2 (Takara Bio, Shiga, Japan) or its antibiotic resistant derivatives TH2-Spec and TH2-Te. All plate matings were carried out overnight at 37°C with a donor-to-recipient ratio of approximately 1:1. To quantify conjugative transfer, dilutions were plated on LB media containing appropriate antibiotics for selection of plasmid-containing recipients. The frequency of conjugation was determined by dividing the number of transconjugant cells by the total number of recipient cells.
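The frequency calculation described above can be sketched as follows. This is a minimal illustration, assuming that transconjugants and recipients are both enumerated as colony counts on selective plates and converted to colony-forming units per ml before division. The function names, the plated volume, the dilutions and the colony counts are hypothetical and are not taken from the experiments reported here.

```python
def cfu_per_ml(colonies: int, fold_dilution: int, plated_ul: int) -> float:
    """Colony-forming units per ml of the undiluted mating suspension.

    fold_dilution is the total dilution factor (e.g. 10**6 for a 10^-6
    dilution); plated_ul is the volume spread on the plate in microlitres.
    """
    if fold_dilution <= 0 or plated_ul <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies * fold_dilution * 1000 / plated_ul


def conjugation_frequency(transconjugants_per_ml: float,
                          recipients_per_ml: float) -> float:
    """Transconjugants divided by total recipients, as stated in the text."""
    if recipients_per_ml <= 0:
        raise ValueError("recipient count must be positive")
    return transconjugants_per_ml / recipients_per_ml


# Hypothetical plate counts, for illustration only.
transconjugants = cfu_per_ml(colonies=42, fold_dilution=10, plated_ul=100)
recipients = cfu_per_ml(colonies=84, fold_dilution=10**6, plated_ul=100)
print(f"{conjugation_frequency(transconjugants, recipients):.1e}")
```

Whether the recipient count includes the transconjugants depends on the selection used on the recipient plates; the sketch simply takes whatever count is supplied as the denominator.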

Nucleotide sequence accession number

The nucleotide sequence reported in this study has been deposited in the GenBank database under accession number AY233333.

Acknowledgements

We would like to thank Kirsten Weinert for her excellent technical assistance and Klaus Hantke, Tübingen, for providing the E. coli TH2-Te strain. We acknowledge Rainer Haas and Sally Darlington for critical review of the manuscript. This study was supported by a grant from the Deutsche Forschungsgemeinschaft to S.S. (SCH 1494/1–1).
