The selC-associated SHI-2 pathogenicity island of Shigella flexneri


  • Jeremy E. Moss,

    1. Skirball Institute, New York University Medical Center, 540 First Avenue, New York, NY 10016, USA.
    2. Department of Microbiology, New York University Medical Center, 540 First Avenue, New York, NY 10016, USA.
  • Timothy J. Cardozo,

    1. Skirball Institute, New York University Medical Center, 540 First Avenue, New York, NY 10016, USA.
  • Arturo Zychlinsky,

    1. Skirball Institute, New York University Medical Center, 540 First Avenue, New York, NY 10016, USA.
    2. Department of Microbiology, New York University Medical Center, 540 First Avenue, New York, NY 10016, USA.
  • Eduardo A. Groisman

    1. Howard Hughes Medical Institute, Department of Molecular Microbiology, Washington University School of Medicine, 660 S. Euclid Ave, St. Louis, MO 63110, USA.

Arturo Zychlinsky. E-mail; Tel. (+1) 212 263 7058; Fax (+1) 212 263 5711.


Pathogenicity islands are chromosomal gene clusters, often located adjacent to tRNA genes, that encode virulence factors present in pathogenic organisms but absent or sporadically found in related non-pathogenic species. The selC tRNA locus is the site of integration of different pathogenicity islands in uropathogenic Escherichia coli, enterohaemorrhagic E. coli and Salmonella enterica. We show here that the selC locus of Shigella flexneri, the aetiological agent of bacterial dysentery, also contains a pathogenicity island. This pathogenicity island, designated SHI-2 (Shigella island 2), occupies 23.8 kb downstream of selC and contains genes encoding the aerobactin iron acquisition siderophore system, colicin V immunity and several novel proteins. Remnants of multiple mobile genetic elements are present in SHI-2. SHI-2-hybridizing sequences were detected in all S. flexneri strains tested and parts of the island were also found in other Shigella species. SHI-2 may allow Shigella survival in stressful environments, such as those encountered during infection.


Pathogenicity islands (PAIs) are clusters of chromosomal virulence genes present in pathogenic strains of a species and absent or sporadically distributed in closely related non-pathogenic strains (Groisman and Ochman, 1996; Hacker et al., 1997). The majority of PAIs identified to date are found in close association with tRNA genes. For example, the PAI-4 and PAI-5 PAIs of the uropathogenic strain of Escherichia coli (UPEC) J96 are located next to the pheV and pheR tRNA genes (Blum et al., 1995; Swenson et al., 1996), and the PAI-2 PAI of UPEC strain 536 is located next to the leuX tRNA gene (Blum et al., 1994). The selC tRNA locus is the site of integration for three distinct PAIs: PAI-1 in UPEC strain 536 (Blum et al., 1994); the locus of enterocyte effacement (LEE) in enteropathogenic E. coli (EPEC) and enterohaemorrhagic E. coli (EHEC) (McDaniel et al., 1995; Perna et al., 1998); and SPI-3 in Salmonella (Blanc-Potard and Groisman, 1997). It is also the attachment site for the E. coli retronphage φR73 (Sun et al., 1991).

Despite the overwhelming genetic similarity between Shigella spp. and E. coli (Keusch and Acheson, 1998), these bacteria differ in the relationships they establish with their hosts. Whereas most strains of E. coli are non-invasive, all four species of Shigella invade the colonic epithelium and induce acute inflammation (Zychlinsky et al., 1994). This bacterial invasion and the accompanying inflammatory response result in dysentery, a disease characterized by blood and mucus in the stool. Shigella-induced dysentery is dependent on the virulence genes in a large plasmid present in all clinical isolates (Sansonetti et al., 1981; 1982).

Several chromosomal genes are also involved in Shigella virulence. These include: sodB, LPS synthesis genes, genes encoding iron acquisition proteins and, in Shigella dysenteriae, genes encoding Shiga toxin (Hale, 1991; O'Brien et al., 1992; Wyckoff et al., 1998). In Shigella flexneri serotype 2a, the genes encoding the enterotoxin ShET-1 and an immunoglobulin A protease homologue are located in a pathogenicity island of unknown chromosomal location (Rajakumar et al., 1997).

Here, we describe a second pathogenicity island of Shigella located at the selC locus of S. flexneri serotype 5a. This region, designated SHI-2 for Shigella island 2, contains genes encoding the aerobactin iron acquisition siderophore system, colicin V (ColV) immunity and several previously undescribed proteins. SHI-2-hybridizing DNA is present in all S. flexneri strains tested, and parts of the island are present in other Shigella species.

Results and Discussion

The selC loci in Shigella flexneri and E. coli K-12 are different

In E. coli K-12, orf307 is located ≈2 kb downstream of the selC gene. orf307 is highly conserved between Salmonella enterica and E. coli K-12 (Blanc-Potard and Groisman, 1997), bacterial species more distantly related than S. flexneri and E. coli. To assess whether the region downstream of selC in S. flexneri serotype 5a was similar to the corresponding region in E. coli K-12, we attempted to amplify the fragment between selC and orf307 from the S. flexneri strain M90T by PCR. The sequence of the PCR primers was based on the E. coli K-12 sequence (Blattner et al., 1997). We used two different selC and three different orf307 primers designed to amplify the selC–orf307 intergenic region and followed PCR protocols optimized for obtaining both short and long DNA fragments. Regardless of the primers or protocol used, we could amplify a fragment of the predicted size from E. coli K-12, but not from S. flexneri chromosomal DNA (data not shown).

To determine whether orf307 was present in S. flexneri, we performed PCR amplifications using two different 5′ and three different 3′ primers corresponding to DNA sequences within orf307. We did not obtain PCR products with S. flexneri DNA as a template, although appropriately sized bands were amplified from E. coli K-12 DNA (data not shown). These results suggest that orf307 is either absent or altered in the S. flexneri chromosome and raised the possibility of the selC locus harbouring horizontally acquired sequences.

Molecular genetic characterization of SHI-2

We cloned the region downstream of selC in S. flexneri by introducing a plasmid library from M90T Mu cts into a selC mutant strain of E. coli. Clones complemented by a functional selC gene were identified based on their ability to reduce nitrate with formate. Two selC+ colonies were obtained in this fashion, and plasmid DNAs extracted from these clones were characterized by restriction digests and sequence analyses. Figure 1 shows the genetic organization of the 23.8 kb region downstream of selC in strain M90T, which we designated Shigella island 2 (SHI-2). We suggest the she island, previously described by Rajakumar et al. (1997), be referred to as SHI-1 for Shigella island 1.

Figure 1. Genetic and physical map of the M90T SHI-2 pathogenicity island. A. Genetic map of SHI-2 regions, including ORFs, similar to insertion sequences (labelled with black boxes), and the position of the selC and nlpA genes. B. Physical map of SHI-2 based on restriction digestion of pJM1 and pJM4 plasmid DNA with EcoRI (E), HindIII (H), and DraI (D). C. SHI-2 DNA sequences present in plasmids pJM1, pJM2, pJM3 and pJM4. Arrows indicate that DNA regions that extend beyond the ends of SHI-2 are present in the plasmid. D. G+C content of SHI-2 determined by using a 101 nucleotide window, as described in the Experimental procedures. The line at 51% represents the G+C content of the Shigella chromosome.

The Shigella DNA sequence beyond the 3′ end of SHI-2 exhibits sequence identity to a region located ≈3.8 kb downstream of selC in the E. coli K-12 chromosome. Thus, M90T appears to lack 3.8 kb of DNA present in E. coli K-12 that includes the yicK (orf394) and yicL (orf307) genes and part of the nlpA gene. The absence of orf307 downstream of selC in M90T may explain our inability to amplify this gene by PCR. The integrations of other PAIs into selC also appear to have caused deletions (Blum et al., 1994; McDaniel et al., 1995; Blanc-Potard and Groisman, 1997; Perna et al., 1998). Integration of SHI-2, however, is associated with the largest loss of DNA at selC described to date.

The average G+C content of SHI-2 (48.6%) is slightly lower than that of the rest of the Shigella chromosome (51%). Interestingly, the G+C content varies dramatically across the island (Fig. 1D). Although DNA corresponding to SHI-2 insertion sequences and genes encoding aerobactin (see below) have a G+C content similar to that of the Shigella chromosome (49.4–54.3% and 52.7% respectively), several of the novel ORFs of SHI-2 have G+C contents well below 51% (Table 2). This finding is consistent with the idea that these SHI-2 sequences were horizontally acquired.

Table 2. Proteins encoded by novel ORFs of SHI-2. H represents predicted helical stretches and E represents predicted β sheets in the secondary structure section. Numbers in parentheses for homology searches are the NCBI accession numbers and statistical similarity respectively. Numbers in parentheses for transmembrane prediction are the number of predicted transmembrane helices.

The ends of SHI-2

The 5′ region of SHI-2 (i.e. downstream of selC) exhibits sequence identity with the corresponding regions of two other selC-associated PAIs. Like the selC loci in EPEC, EHEC and UPEC strain 536, the discontinuity between E. coli K-12 and S. flexneri begins 16 bp downstream of selC and, thus, this is where the LEE, PAI-1 and SHI-2 PAIs begin. In contrast, Salmonella SPI-3 begins 12 bp downstream of selC (Fig. 2A). The first 284 nucleotides of SHI-2 exhibit 100% sequence identity with the corresponding region in SA100, which is another S. flexneri strain (Vokes et al., 1999), 76% identity to LEE in EPEC, 79% identity to the first 100 bp of LEE in EHEC (Fig. 2B), and 81% identity to the first 37 bp of PAI-1 (no sequence information is available for PAI-1 beyond this point). The SPI-3 island does not exhibit substantial identity with SHI-2 in the first 284 nucleotides downstream of selC (Fig. 2B). The high degree of sequence identity between SHI-2, LEE and PAI-1 suggests that these islands either integrated using similar mechanisms or evolved from a common ancestor.

Figure 2. 5′ flanking regions of selC-associated PAIs. A. Alignment of the DNA sequences of S. flexneri M90T, EHEC O157:H7 EDL933, EPEC E2348/69, S. enterica serovar Typhimurium 14028s, UPEC 536, and E. coli K-12, 25 nucleotides downstream of the 3′ end of selC (sequences corresponding to selC are shaded). The Salmonella SPI-3 sequence begins 12 bp beyond the end of selC (first arrow). Shigella SHI-2, UPEC PAI-1, and EPEC and EHEC LEE begin 16 bp from the end of selC (second arrow). B. Percent nucleotide identity of the first 50, 100 and 284 nucleotides of SHI-2 with the corresponding regions in other selC loci PAIs. Alignments were carried out using GeneWorks, allowing for best fit. Nucleotide 285 of SHI-2 is the first nucleotide of the SHI-2 CP4 family integrase.
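The prefix comparisons in Fig. 2B can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and simply counts identical columns over the first n positions; it does not reproduce the GeneWorks best-fit alignment itself. The function name `percent_identity` and the two short example strings are hypothetical and are not SHI-2 or LEE sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, n: int) -> float:
    """Percent identity over the first n columns of two pre-aligned sequences.

    Gap columns ('-') are counted as mismatches (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if n <= 0 or n > len(aligned_a):
        raise ValueError("n must lie between 1 and the alignment length")
    a, b = aligned_a[:n].upper(), aligned_b[:n].upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / n


# Hypothetical toy alignment, for illustration only.
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAC"
for n in (4, 10):
    print(n, percent_identity(toy_a, toy_b, n))
```

Applied to real alignments, the same function would be called with n = 50, 100 and 284 to tabulate values of the kind shown in Fig. 2B.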

The 399 bp at the 3′ end of SHI-2 are 97% identical to part of an IS600 insertion sequence. This SHI-2 sequence is located immediately upstream of the disrupted nlpA gene (Fig. 1). Unlike some other PAIs (Hacker et al., 1997), no repeats flank SHI-2.

Putative integrase and transposase genes in SHI-2

A number of regions with sequences similar to those of genes from known mobile genetic elements are interspersed throughout SHI-2 (Fig. 1 and Table 1). Near the 5′ end of the island, there is an open reading frame (ORF) with a high degree of sequence similarity to the integrases of the CP4 family of phages. The selC locus is the attachment site for prophages of the CP4 family including φR73 (Sun et al., 1991). Like SHI-2, the LEE pathogenicity island of EHEC O157:H7 EDL933 encodes an integrase similar to those of the CP4 family of phages near the selC gene. The EHEC LEE, however, contains not only the integrase gene, but also an entire 7.5 kb putative CP4-like prophage (Perna et al., 1998).

Table 1. Potential gene products in SHI-2 similar to proteins from mobile genetic elements. Percent identity was obtained by Advanced blast searches. *It has not been demonstrated that these proteins are transposases, but they are similar (blast) to known transposases.

Other SHI-2 DNA segments have sequence identity to the transposase genes of IS3, IS629, IS2, IS1 and IS600 (Fig. 1). Although DNA fragments corresponding to parts of IS DNA are found in SHI-2, there are no intact insertion sequences in the island. The products of the putative integrase and transposase genes encoded within SHI-2 might have been involved in the assembly of the island, either as a unit or in a stepwise fashion. Although the association of mobile genetic elements with PAIs is well established (Hacker et al., 1997), the abundance and clustering of these elements in SHI-2 is unique.

Most of the SHI-2 ORFs related to insertion sequences encode proteins that appear to be mutated or disrupted when compared with described insertion sequence gene products (Table 1). For example, the gene products of SHI-2 ORFs with similarity to the IS3, IS629 and IS2 transposases are either rearranged or contain premature stop codons; thus, it is unlikely that they encode functional proteins. Also, the SHI-2 IS1 is predicted to produce a non-functional transposase because the IS1 gene insA is not present in the island (Mahillon and Chandler, 1998).

The aerobactin operon

The aerobactin operon, which encodes an iron-regulated siderophore system for iron acquisition, is found in a variety of enteric bacteria both in the chromosome and in plasmids (de Lorenzo and Martinez, 1988). Aerobactin is associated with increased virulence in Shigella because S. flexneri mutants that cannot synthesize functional aerobactin are attenuated in the rabbit ileal loop model of infection (Lawlor et al., 1987; Nassif et al., 1987). The aerobactin operon was previously mapped to 82 min in the Shigella chromosome, the region containing selC (Griffiths et al., 1985). We found that SHI-2 contains the genes encoding aerobactin synthesis and transport in Shigella. In contrast to Shigella, certain E. coli strains harbour the aerobactin operon in a plasmid, pColV, which also contains a number of virulence genes as well as genes for ColV synthesis, export and immunity.

The aerobactin operon consists of the iuc genes (A–D) that encode proteins responsible for the synthesis of the aerobactin siderophore and iutA, which encodes the receptor for aerobactin complexed to iron (Crosa, 1989). The SHI-2 iucA, iucB, iucC and iucD predicted gene products are 92%, 94%, 95% and 97% identical to their E. coli pColV counterparts respectively. The SHI-2 iutA gene product is slightly more divergent from that encoded in pColV, sharing only 85% amino acid identity, with the most divergent residues corresponding to the signal sequences of the proteins. Finally, the 408 nucleotides upstream of iucA are 94% identical to the known promoter sequence of pColV aerobactin. The aerobactin operon is, therefore, well conserved between Shigella SHI-2 and E. coli pColV.

Colicin V immunity

The genes encoding ColV immunity and aerobactin synthesis and transport are linked in E. coli pColV and in the S. flexneri chromosome (Payne, 1989; Waters and Crosa, 1991). ColV is a molecule produced by certain enteric bacteria that kills sensitive strains. Genes encoding ColV production and immunity are linked in pColV but not in S. flexneri, which does not synthesize ColV (data not shown). We mapped ColV immunity to the region between 4.4 and 11.9 kb in SHI-2. We showed that ColV-sensitive E. coli K-12 strains harbouring pJM1 (which includes nucleotides 1–4400 of SHI-2) remain ColV sensitive, but those containing pJM2 (with nucleotides 1–11 900 of SHI-2), pJM3 (with nucleotides 1–16 400 of SHI-2), or pJM4 (with nucleotides 100–23 790 of SHI-2) are immune to ColV (data not shown). Vokes et al. (1999) identified a gene encoding immunity to ColV in a pathogenicity island of S. flexneri strain SA100 that is 100% identical to shiD in the 4.4–11.9 kb region of M90T SHI-2. The ColV immunity gene of E. coli pColV, cvi (GenBank accession no. AJ223631), does not exhibit sequence similarity to shiD. Moreover, there is no substantial similarity between any known colicin immunity gene and the DNA sequence of shiD. It is therefore unclear how the protein encoded by this gene confers immunity to ColV.
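The mapping argument above is a form of deletion mapping and can be written out as a short Python sketch. The insert coordinates are those given in the text. The function name `candidate_positions` is hypothetical, and the sketch assumes that the immunity determinant lies entirely outside any insert that fails to confer immunity. This is a simplification, because a gene that straddles nucleotide 4400 would be only partly present in pJM1.

```python
# SHI-2 nucleotides carried by each plasmid, as stated in the text.
IMMUNE = {"pJM2": (1, 11900), "pJM3": (1, 16400), "pJM4": (100, 23790)}
SENSITIVE = {"pJM1": (1, 4400)}


def candidate_positions(immune_inserts, sensitive_inserts):
    """Positions present in every immune insert and in no sensitive insert."""
    immune_sets = [set(range(start, end + 1)) for start, end in immune_inserts]
    shared = set.intersection(*immune_sets)
    for start, end in sensitive_inserts:
        shared -= set(range(start, end + 1))
    return shared


region = candidate_positions(IMMUNE.values(), SENSITIVE.values())
print(min(region), max(region))  # 4401 11900
```

The result, nucleotides 4401 to 11 900, corresponds to the 4.4–11.9 kb interval reported above.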

Immunity to ColV may be an important adaptation for Shigella in environmental niches that Shigella shares with bacteria inside and outside the host. ColV immunity may protect Shigella from toxic molecules released by other bacteria. Furthermore, Vokes et al. (1999) showed that S. flexneri produces an as yet uncharacterized colicin and that the predicted gene product of SHI-2 shiD confers immunity to this molecule as well.

Novel ORFs of SHI-2

In SHI-2, there are seven genes encoding proteins that are at least 100 amino acids in length. These ORFs apparently encode globular proteins, with short (10–20 amino acid) stretches of putative transmembrane helices or flexible linkers between structural domains (Table 2) (Altschul et al., 1990; 1997; Bairoch et al., 1997; Frishman and Argos, 1996). Five of the seven putative proteins appear to be integral inner membrane proteins based on transmembrane helix prediction (tmpred; Hofmann and Stoffel, 1993) and protein localization analysis (psort; Nakai and Kanehisa, 1991). The genes for two of these proteins, shiF and shiG, are transcribed in reverse orientations from the same SHI-2 region (Fig. 1) and include a portion of the aerobactin promoter sequence. The protein encoded by shiF has a lipid attachment site and shares similarity with proteins in the tetracycline resistance transporter family (NCBI accession no. 3892629). The shiC gene product is slightly similar to CagO (zega; Abagyan and Batalov, 1997), a protein of unknown function encoded on the cag pathogenicity island of Helicobacter pylori (Censini et al., 1996).

The gene products of shiA and shiB are predicted (tmpred and psort) to be soluble proteins. The predicted fold of the shiA gene product is characteristic of the quinone reductase/NADPH oxidoreductase proteins (zega; Abagyan and Batalov, 1997; Protein Data Bank code 1qrd). Although no definitive motif has been reported for this class of proteins, an alignment between the proteins encoded by shiA and 1qrd shows strong conservation in the putative first and third beta–alpha loops. Interestingly, these are the primary residues in contact with the FAD and quinone moieties in the active site of the protein and are the only two conserved stretches of residues in a multiple alignment of a large family of quinone reductases. Secondary structure prediction of the shiA product also shows a strong concordance with the actual secondary structure of 1qrd. Furthermore, the predicted transmembrane helix in the protein encoded by shiA (tmpred) corresponds to a highly hydrophobic, buried helix in 1qrd. Because the postulated role for the quinone reductases in mammalian cells involves protection from free radicals (Beyer, 1992; Ernster and Dallner, 1995), it is possible that the protein encoded by shiA in Shigella may have an analogous role.

Location and distribution of SHI-2 in enteric bacteria

We determined the distribution of SHI-2 in enteric bacteria by carrying out Southern hybridization experiments using genomic DNA extracted from a variety of different species. As probes, we used PCR-generated fragments corresponding to regions of the island containing novel sequences, labelled 1–7 in Fig. 3. Salmonella typhi, Salmonella typhimurium, Yersinia enterocolitica, Citrobacter freundii, Serratia marcescens, Enterobacter aerogenes, and Klebsiella pneumoniae genomic DNA did not hybridize to any of the probes from SHI-2 (data not shown).

Figure 3. Distribution of SHI-2 DNA segments and colicin V sensitivity profile of Shigella and E. coli strains. Southern hybridization assays were performed as described in Experimental procedures. +, Hybridization to probe specific for the region; −, no hybridization observed. Colicin V sensitivity assays were performed as described in the Experimental procedures. I, immune to ColV; S, sensitive.

To examine the distribution of SHI-2 sequences among Shigella and E. coli strains, we performed similar Southern hybridization experiments (Fig. 3). Genomic DNA from all four S. flexneri strains tested hybridized to each of the five probes used. The S. sonnei strains analysed contained regions 4 and 5 (Fig. 3), and, in some cases, region 1 (by convention, we call region 1 the area detected by probe 1) but not regions 2 or 3. Some S. boydii and S. dysenteriae strains contained region 5, but none contained regions 1, 2, 3 or 4. pColV isolated from E. coli LG1522 did not possess regions 1–5. None of the four pathogenic E. coli strains tested appeared to contain an intact SHI-2. EPEC Dec 1B and EHEC Dec 3B harboured DNA that hybridizes weakly to probe 3. In addition, this EHEC strain also contained region 4.

All strains of Shigella tested, except for a toxigenic S. dysenteriae strain, contained DNA that hybridizes to two probes in different parts of the aerobactin operon (Fig. 3). This conservation implies that aerobactin-mediated iron acquisition is important for Shigella. In contrast, we did not detect sequences that hybridized to the aerobactin operon in DNA isolated from either non-pathogenic or pathogenic E. coli strains.

To determine whether SHI-2 was located at selC in S. sonnei and in S. flexneri strains other than M90T, we performed a PCR reaction using primers corresponding to the selC gene and the SHI-2 CP4 integrase-like gene. The predicted 800 bp fragment was obtained using chromosomal DNA from all S. flexneri and S. sonnei strains tested, but not from E. coli MC4100 (data not shown). Thus, the SHI-2 junction at the selC end appears to be conserved in S. flexneri and S. sonnei.

ColV immunity is sporadically distributed in Shigella strains (Fig. 3). S. flexneri serotypes 2a and 5a are immune to ColV whereas S. flexneri serotypes 1a and 2b are sensitive. This is surprising because S. flexneri serotypes 1a and 2b appear to contain the entire SHI-2 (Fig. 3). By Southern blot analysis, however, we observed some restriction digest polymorphisms between Shigella strains in regions 1–7 (data not shown). Thus, polymorphisms between resistant and sensitive S. flexneri strains, notably in region 4 which contains the putative ColV immunity gene of SHI-2, may have resulted in an alteration or deletion of the ColV immunity gene. Furthermore, all three of the S. sonnei strains analysed were highly sensitive to ColV and the hybridization pattern of S. sonnei DNA obtained with probe 4 is identical to that of the sensitive S. flexneri strains (data not shown). Therefore, the presence of this shared region, which differs from that of the S. flexneri ColV resistant strains, again correlated with ColV susceptibility. Two of three S. boydii and three of four S. dysenteriae strains tested were not sensitive to ColV. Because these strains do not contain DNA that hybridizes to probe 4 (Fig. 3), it is likely that S. boydii and S. dysenteriae ColV immunity are encoded elsewhere.

The selC-associated pathogenicity island described by Vokes et al. (1999) isolated from the S. flexneri strain SA100 is nearly identical to the M90T SHI-2 between selC and the aerobactin operon. The region 3′ to the aerobactin operon, however, is very different. The island of strain SA100 encodes genes for a second IS2 element and another novel protein instead of the IS600 present in M90T. The reason for the difference between these two strains is not clear. It is possible that a primordial SHI-2 acquired by a Shigella ancestor has since evolved into unique PAIs in different strains. This hypothesis can explain three additional findings: (1) the presence in S. sonnei of DNA hybridizing to regions 1, 4, 5, 6, 7 and the aerobactin operon, but not regions 2 and 3; (2) the detection of region 5 and the aerobactin operon, but not the 5′ end of SHI-2 in some strains of S. dysenteriae and S. boydii; and (3) the finding that S. flexneri serotypes 1a and 2b contain DNA recognized by all SHI-2-specific probes tested, but still remain phenotypically ColV sensitive.


We propose that SHI-2 is a PAI located at the selC locus of S. flexneri. SHI-2 fits the definition of a PAI because it: (1) is a chromosomal locus that contains known virulence genes; (2) is found in pathogenic but absent from related non-pathogenic bacterial species; (3) contains multiple cryptic mobile genetic elements; (4) is associated with a tRNA locus; and (5) has regions with a G+C content different from that of the rest of the chromosome. SHI-2 appears to be a PAI that facilitates bacterial survival in stressful environments. Aerobactin allows SHI-2-containing bacteria to survive in low-iron environments (de Lorenzo and Martinez, 1988; Crosa, 1989). ColV immunity allows bacteria harbouring SHI-2 to survive in environments where they compete with other bacteria (Waters and Crosa, 1991). A predicted ORF of SHI-2 encodes a protein similar to a tetracycline transporter and the product of another putative ORF may play a role in scavenging free radicals (Table 2).

Finally, Maurelli et al. (1998) proposed that Shigella virulence requires the absence of certain genes normally present in E. coli K-12. Therefore, it is possible that both the gain of SHI-2 as well as the absence of the selC–nlpA intergenic region are important for Shigella virulence. Alternatively, the three genes absent from M90T and present in E. coli K-12 may encode protein products that are redundant or non-essential in Shigella.

Experimental procedures

Bacterial strains and growth conditions

The strains used were: S. flexneri, (1) M90T serotype 5a (Sansonetti et al., 1982), (2) 12022 serotype 2b (American Type Culture Collection), (3) 9199 serotype 1a (ATCC), (4) 2457T serotype 2a (Du Pont et al., 1969); S. sonnei, (1) Sonnei 4 (Guichon and Zychlinsky, 1997), (2) Sonnei Rivera (Guichon and Zychlinsky, 1997), (3) 25931 (ATCC); S. boydii, (1) 9207 (ATCC), (2) 12029 (ATCC), (3) 49812 (ATCC); S. dysenteriae, (1) 12180 (ATCC), (2) Dys 2 (kind gift of P. Sansonetti, Institut Pasteur, Paris, France), (3) Dys 3 (P. Sansonetti), (4) Shiga (P. Sansonetti); E. coli, (1) MC4100 (E. Groisman laboratory collection, Washington University School of Medicine, St. Louis, MO, USA), (2) EPEC Dec 1B (Pennsylvania State University E. coli center), (3) EPEC Dec 5A (Pennsylvania State University E. coli center), (4) EHEC Dec 3B (Pennsylvania State University E. coli center), (5) EIEC 1 (Pennsylvania State University E. coli center), (6) LG1522, pColV+ (P. Sansonetti), (7) FM460, ΔselC400::kan (Yale University E. coli Genetic Stock Center). Shigellae were grown in trypticase soy broth (TSB) and all other strains were grown in Luria–Bertani Broth (LB) at 37°C. Kanamycin was used at 50 μg ml−1 and chloramphenicol at 34 μg ml−1.

Cloning of the M90T selC region

We constructed plasmid libraries from the S. flexneri strain M90T Mu cts using the mini-Mu replicon elements Mud5166 and Mud5005, which can clone DNA fragments as large as 23 and 31 kb respectively (Groisman, 1991). The M90T Mu cts Mud5166-generated library was transduced into E. coli FM460 Mu cts, a strain lacking a functional selC gene. selC+ bacteria can reduce nitrate with formate and form white colonies on MacConkey nitrate plates (Barrett et al., 1979). We screened for selC-containing clones using MacConkey nitrate plates supplemented with chloramphenicol and kanamycin under anaerobic conditions (BBL GasPak Pouch System). Two positive clones, containing plasmids which were designated pJM1 and pJM2, were obtained and these plasmids were used as templates to sequence the M90T DNA. Taking advantage of the sequence information obtained from pJM2, we amplified a DNA fragment corresponding to a region of SHI-2 by PCR. We used pJM2 DNA as a template and primers 855 (5′-CGCATAGTGATCGAGCGC-3′) and 856 (5′-GCTTATCTGCCCGGAACG-3′) in this reaction. The resulting fragment was labelled and used as a probe to screen a M90T Mu cts Mud5005 library in E. coli MC4100 Mu cts by colony hybridization. Among the kanamycin-resistant transductants, we obtained two hybridizing colonies, which contained plasmids that were designated pJM3 and pJM4.

Molecular biological techniques

PCR reactions were performed with purified chromosomal DNA as template using Taq polymerase as specified by the manufacturer (Gibco BRL). To amplify long DNA fragments, we used the TaqPlus Long PCR System (Stratagene). Primers corresponding to selC and orf307 were constructed based on the published E. coli K-12 sequences (Blattner et al., 1997). To amplify orf307 and the selC–orf307 intergenic region, the following primers were used: selC towards orf307, (1) 756 (5′-GGAAGATCGTCGTCTCCGGTGAGGC-3′), (2) 462 (5′-ATCCAGTTGGGGCCGCCAGCGGTCCCGGGCAG-3′); orf307 towards selC, (1) 727 (5′-CCCCTTTCTGGTGGAACCCAT-3′), (2) 729 (5′-ATGCCCCAGAACAACGCGGC-3′), (3) 788 (5′-GGCTTTGCTCCATGATGTATTGCGC-3′); and orf307 towards nlpA, (1) 728 (5′-ATGGGTTCCACCAGAAAGGGG-3′), (2) 787 (5′-GCGCAATACATCATGGAGCAAAGCC-3′). For detection of inserts at the selC locus of Shigella strains, we used primers 756 (see above) and 103 (5′-GGGCGCTTGCCAATGTAG-3′).
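Several of the orf307 primers listed above anneal to the same sites on opposite strands, which can be checked by taking reverse complements. The following Python sketch is illustrative only; the helper name `reverse_complement` is hypothetical, and the primer sequences are copied from the list above.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# orf307 primers as listed in the text (5' to 3').
PRIMERS = {
    "727": "CCCCTTTCTGGTGGAACCCAT",
    "728": "ATGGGTTCCACCAGAAAGGGG",
    "787": "GCGCAATACATCATGGAGCAAAGCC",
    "788": "GGCTTTGCTCCATGATGTATTGCGC",
}

for forward, reverse in (("727", "728"), ("788", "787")):
    same_site = reverse_complement(PRIMERS[forward]) == PRIMERS[reverse]
    print(forward, reverse, same_site)  # both pairs print True
```

Primers 727 and 728, and likewise 788 and 787, are exact reverse complements, consistent with their use in opposite orientations (towards selC and towards nlpA respectively).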

Colony hybridization experiments were performed using the Amersham ECL kit for direct nucleic acid labelling (RPN 3000). Southern hybridization experiments were performed on purified chromosomal DNA digested with DraI and HindIII. After digestion, the DNA was subjected to electrophoresis on a 1% agarose gel and transferred to a Hybond N+ membrane by capillary action. Hybridization was performed using the Amersham kit (RPN 3000) at 42°C, as specified by the manufacturer. The SHI-2 DNA regions detected by the different probes used are shown in Fig. 3. Probe 1 primers, (1) 14 (SHI-2 nt 1950–1967, 5′-GGCCCATTGCTGGGGGAG-3′), (2) 21 (nt 2792–2775, 5′-CCACAGGTGCTCCTGCTG-3′); probe 2 primers, (1) 863 (nt 5051–5068, 5′-GTCACCTCTGGTTGAGAG-3′), (2) 115 (nt 5610–5593, 5′-CCTGTAATAGATGCGGCA-3′); probe 3 primers, (1) 920 (nt 7273–7290, 5′-ACACTAGATATCACCGGC-3′), (2) 120 (nt 7797–7780, 5′-GTCATTGTCTGCACCTGG-3′); probe 4 primers, (1) 150 (nt 9301–9318, 5′-CCAGACCACCAGGCTGAT-3′), (2) 121 (nt 10540–10523, 5′-GCGCTCGATCACTATGCG-3′); probe 5 primers, (1) 913 (nt 12488–12505, 5′-CATCCTCAACTCGTCATC-3′), (2) 129 (nt 13893–13876, 5′-CCATAACAACAGCGGTGG-3′); probe 6 primers, (1) 923 (nt 16698–16715, 5′-ACTGGGCGGCGGAAGACC-3′), (2) 133 (nt 17685–17668, 5′-CCACGGATGTACCGGCAG-3′); probe 7 primers, (1) 15 (nt 17668–17685, 5′-CTGCCGGTACATCCGTGG-3′), (2) 19 (nt 20242–20225, 5′-GGTTTGCGCCATCCGCTA-3′).
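The expected length of each probe follows directly from the outermost primer coordinates given above. The Python sketch below computes these lengths and the overlap between adjacent probes; the names `PROBE_SPANS`, `amplicon_length` and `overlap` are hypothetical, and the coordinates are the SHI-2 nucleotide positions listed for the first and last base of each primer pair.

```python
# Outermost SHI-2 coordinates (nt) of each probe, taken from the primer list.
PROBE_SPANS = {
    1: (1950, 2792),
    2: (5051, 5610),
    3: (7273, 7797),
    4: (9301, 10540),
    5: (12488, 13893),
    6: (16698, 17685),
    7: (17668, 20242),
}


def amplicon_length(start: int, end: int) -> int:
    """Length in bp of a fragment spanning start..end inclusive."""
    return end - start + 1


def overlap(span_a, span_b) -> int:
    """Number of positions shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(span_a[1], span_b[1]) - max(span_a[0], span_b[0]) + 1)


for probe, (start, end) in PROBE_SPANS.items():
    print(f"probe {probe}: {amplicon_length(start, end)} bp")

# Probes 6 and 7 share the 18 nt primer site at nt 17668-17685.
print(overlap(PROBE_SPANS[6], PROBE_SPANS[7]))  # 18
```

On these coordinates the probes range from 525 bp (probe 3) to 2575 bp (probe 7), and only probes 6 and 7 overlap.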

The SHI-2 DNA sequence was obtained by sequencing pJM1, pJM2, pJM3 and pJM4 plasmid DNA isolated using Qiagen midipreps. Sequencing was performed on an ABI 377 sequencer by the Rockefeller DNA Sequencing Group using the ABI dye terminator cycle sequencing kit with AmpliTaq DNA polymerase. Mu-attL (5′-CCAATGTCCTCCCGGTTTTT-3′) and -attR (5′-GTTTTTCGTGCGCCGCTTCA-3′) primers were used to determine the location and orientation of Shigella DNA in pJM1, pJM2, pJM3 and pJM4 relative to the E. coli K-12 sequence. These primers correspond to the 5′ and 3′ ends of the mini Mu element oriented towards the cloned DNA. The entire SHI-2 sequence was obtained by chromosome walking using sequence information obtained from each round of sequencing. DNA sequence alignments were performed using GeneWorks and DNAstar. SHI-2 was sequenced completely from both strands a total of three times. The DNA sequence was submitted to the GenBank database under accession number AF141323.

G+C content analysis

A plot of G+C content along the 23.8 kb sequence was generated by scanning a window of 101 nucleotides along the sequence, counting the number of Gs and Cs in the window and dividing this number by 101 (total number of nucleotides). The resulting number is always between 0.0 (no G+C) and 1.0 (all G+C) and is assigned to the centre nucleotide in a window.
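The sliding-window calculation described above can be sketched in Python as follows. This is an illustrative reimplementation and not the program used for Fig. 1D; the function name `gc_profile` is hypothetical, positions are reported 1-based, and a running count is used so that each step costs constant time.

```python
def gc_profile(seq: str, window: int = 101):
    """Return (centre_position, gc_fraction) pairs for every full window.

    centre_position is 1-based; gc_fraction is between 0.0 and 1.0.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    seq = seq.upper()
    if len(seq) < window:
        return []
    half = window // 2
    # G+C count in the first window, then update it as the window slides.
    count = sum(1 for base in seq[:window] if base in "GC")
    profile = [(half + 1, count / window)]
    for i in range(window, len(seq)):
        count += (seq[i] in "GC") - (seq[i - window] in "GC")
        profile.append((i - half + 1, count / window))
    return profile


# Toy sequence and a short window, for illustration only.
for centre, fraction in gc_profile("GCGCGCATATAT", window=5):
    print(centre, fraction)
```

With the 101 nucleotide window used here, the first value is assigned to nucleotide 51 and the last to the nucleotide 50 positions from the end of the sequence.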

Colicin V immunity assays

Strain LG1522 is a pColV-containing E. coli strain that produces ColV. This strain was grown on LB plates for 2 days at 37°C. The plates were inverted over chloroform to kill LG1522. Soft agar containing the strain to be tested was overlaid and the plate was incubated overnight. An immune bacterial strain grew as a lawn overnight, whereas a sensitive strain had a zone of clearing.


We would like to thank Shelly Payne for sharing unpublished data and for promoting co-submission of our manuscripts. This work was supported by grants from the National Institutes of Health AI 37720 and GM54900. E.A.G. is an Associate Investigator of the Howard Hughes Medical Institute.