A chimeric ribozyme in Clostridium difficile combines features of group I introns and insertion elements


  • Veit Braun,

    1. Verfügungsgebäude für Forschung und Entwicklung, Institut für Medizinische Mikrobiologie und Hygiene, Johannes Gutenberg-Universität, 55101 Mainz, Germany.
    • †The first two authors contributed equally to this paper.

  • Markus Mehlig,

    1. Verfügungsgebäude für Forschung und Entwicklung, Institut für Medizinische Mikrobiologie und Hygiene, Johannes Gutenberg-Universität, 55101 Mainz, Germany.
    • †The first two authors contributed equally to this paper.

  • Michael Moos,

    1. Verfügungsgebäude für Forschung und Entwicklung, Institut für Medizinische Mikrobiologie und Hygiene, Johannes Gutenberg-Universität, 55101 Mainz, Germany.
  • Maja Rupnik,

    1. Verfügungsgebäude für Forschung und Entwicklung, Institut für Medizinische Mikrobiologie und Hygiene, Johannes Gutenberg-Universität, 55101 Mainz, Germany.
  • Bettina Kalt,

    1. Verfügungsgebäude für Forschung und Entwicklung, Institut für Medizinische Mikrobiologie und Hygiene, Johannes Gutenberg-Universität, 55101 Mainz, Germany.
  • David E. Mahony,

    1. Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, Nova Scotia, Canada B3H 4H7.
  • Christoph Von Eichel-Streiber

    Corresponding author
    1. Verfügungsgebäude für Forschung und Entwicklung, Institut für Medizinische Mikrobiologie und Hygiene, Johannes Gutenberg-Universität, 55101 Mainz, Germany.


CdISt1, a DNA insertion of 1975 bp, was identified within tcdA-C34, the enterotoxin gene of the Clostridium difficile isolate C34. Located in the catalytic domain A1-C34, CdISt1 combines features of two genetic elements. Structures characteristic of group I introns were found within the first 434 nt; nucleotides 435–1975, which encode the two transposase-like proteins TlpA and TlpB, represent the remainder of an IS605-like insertion element. We show that the entire CdISt1 is accurately spliced from tcdA-C34 primary transcripts and that purified TcdA-C34 toxin is of regular size and catalytic activity. A search for CdISt1-related sequences demonstrates that the element is widespread in toxinogenic and non-toxinogenic C. difficile strains, indicating the mobility of CdISt1. In strain C34, we characterize 10 CdISt1 variants; all are highly homologous to CdISt1 (> 93% identity), integrated in bacterial open reading frames (ORFs), show the typical composite structure of CdISt1 and are precisely spliced from their primary transcripts. CdISt1-like chimeric ribozymes appear to combine the invasiveness of an insertion element with the splicing ability of a group I intron, rendering transposition harmless for the interrupted gene.


Among the large number of genetic elements that have been identified in pro- and eukaryotes, two of the most interesting are insertion elements and group I introns. Insertion elements (IS-elements) are loosely defined as small, phenotypically cryptic DNA elements showing a simple genetic organization. More than 500 different IS-elements have been identified in a wide range of bacterial species (Mahillon and Chandler, 1998). The genetically compact IS-elements generally encode no functions other than those necessary for their mobility. These include in particular one or occasionally two transposases covering nearly the entire length of the IS-element. Transposases act in concert with host proteins at each end of the element to mediate insertion into new sites (Mahillon and Chandler, 1998). Most IS-elements exhibit short terminal inverted repeats of between 10 and 40 bp, which are involved in binding of the transposase as well as in the cleavage and strand transfer reactions that result in transposition of the element. On insertion, IS-elements usually generate short direct repeats (typically 2–9 bp) of the target DNA sequences, which flank the integrated IS-element. Transposition activity of IS-elements is generally maintained at a low level, because each new transposition could give rise to detrimental or even lethal mutations in the host cell (Doolittle et al., 1984; Mahillon and Chandler, 1998).

Group I introns are large catalytic RNAs which carry out self-splicing, resulting in their own excision from a precursor mRNA and ligation of the flanking exon sequences (Jaeger et al., 1997). Group I introns share a common secondary structure and conserved sequences which form the catalytic core (Michel and Westhof, 1990). Due to non-conserved DNA stretches and inserted open reading frames (ORFs) they vary extensively in length (Lambowitz and Belfort, 1993; Belfort et al., 1995). Many of the ORFs code for endonucleases involved in site-specific transposition of the intron to intronless alleles of the same gene after cleavage of the target site (Belfort and Perlman, 1995). This process is called homing.

Group I introns have been detected in a wide range of organisms, organelles and genes. They occur in mitochondrial and chloroplast genomes, viruses and nuclear rRNA genes of various fungi, plants and protists (Damberger and Gutell, 1994). In eubacteria, group I introns have been identified exclusively either inserted in tRNA genes (Kuhsel et al., 1990; Xu et al., 1990; Reinhold-Hurek and Shub, 1992; Biniszkiewicz et al., 1994; Paquin et al., 1997) or phage-encoded genes (Bechhofer et al., 1994; Goodrich-Blair and Shub, 1994; Young et al., 1994; Mikkonen and Alatossava, 1995; van Sinderen et al., 1996; Lazarevic et al., 1998; Landthaler and Shub, 1999). Despite the increasing number of introns reported in Gram-negative and Gram-positive species, group I introns have not yet been identified in any species of the genus Clostridium. However, a single group II intron was found in the conjugative transposon, Tn5397, of Clostridium difficile (Mullany et al., 1996).

C. difficile is a human and animal pathogen producing two high molecular weight toxins, the enterotoxin TcdA and the cytotoxin TcdB (Hatheway, 1990). These toxins are responsible for pseudomembranous colitis (PMC) and many instances of antibiotic-associated diarrhoea (AAD) in humans (Knoop et al., 1993). Both toxin genes are organized in the pathogenicity locus (PaLoc) of C. difficile (Braun et al., 1996), a genetic element resembling pathogenicity islands (PAI) (Hacker et al., 1997). TcdA and TcdB irreversibly block eukaryotic signal transduction by covalent attachment of a glucose moiety to small GTPases from the Ras superfamily (Herrmann et al., 1998). Blockade of GTPases results in breakdown of the cellular cytoskeleton and induction of rounding of toxin treated cells (Thelestam et al., 1997).

In this work, we aim to characterize the insertion CdISt1 located in the ORF of the enterotoxin TcdA-C34 of C. difficile strain C34. CdISt1 shows structures characteristic for group I introns and IS-elements. With regard to its special features and its complex structure, we propose that CdISt1 constitutes a novel class of chimeric ribozymes that are specifically adapted to survive and spread in eubacterial genomes.


Identification of CdISt1

Domain A1 (encoding TcdA's catalytic domain) of the tcdA gene of C. difficile strain C34 was amplified, using the primer pair ACD1C/ACD2N (Rupnik et al., 1997). Compared with domain A1 of the reference strain 10463 (3.1 kb), the PCR product A1-C34 (5.1 kb) was about 2 kb larger (Fig. 1A and B). To explain this difference in size, we cloned PCR product A1-C34. Sequence analysis identified a 1975 bp insertion which we designated as CdISt1. CdISt1 is integrated 498 nt downstream of the start of tcdA-C34 and contains structural elements typical for group I introns and IS-elements.

Figure 1.

PCR analysis of tcdA-C34 and properties of the purified TcdA-C34.

A. Partial map of tcdA-C34 and flanking genes of C. difficile strain C34. Genes tcdA, C and E of the PaLoc are shown as open boxes; arrows indicate the direction of transcription. The chequered box highlights CdISt1. Localization of PCR product A1 used to characterize the catalytic domain of tcdA is shown above the map. Relative positions of primers ICheck1–4, ACD1C and ACD2N used for characterization of CdISt1 are indicated by arrow heads below the map.

B. PCR analysis of domain A1 of C. difficile strains C34 (A1-C34) and 10463 (A1-10463). PCR products obtained using paired primers ACD1C and ACD2N (Table 4) and chromosomal DNA as a template are compared on an agarose gel. The difference between A1-C34 and A1-10463 represents the size of CdISt1 (1975 bp, deduced from our sequence analysis). (DNA marker sizes in kb: 10, 8, 6, 5, 4, 3.5, 3, 2.5, 2, 1.5, 1, 0.75).

C. SDS–PAGE of purified toxins TcdA-C34 and TcdA-10463. TcdA-C34 and TcdA-10463 were purified from supernatants of C. difficile strains C34 and 10463, respectively, separated by SDS–PAGE (7.5% gel; 2 µg of each toxin) and stained with Coomassie blue.

D. Catalytic activity of TcdA-C34. TcdA-C34 was used to catalyse the transfer of [14C]glucose to the indicated GTPases (provided as GST–GTPase fusion proteins). The autoradiogram detects target GTPases glucosylated by TcdA-C34.

Group I intron structures in CdISt1

Analysing the sequence of CdISt1, we detected structural elements characteristic of group I introns (Fig. 2A and B). These included the conserved nucleotides U and G at the putative 5′-splice site and 3′-splice site respectively. Furthermore, we found conserved P, Q, R and S sequences (P: AAUUGCUGGAAA; Q: AAUCAGCAGG; R: CUUCAACGACUA; S: AAGAUAUAGUCU; Fig. 2A) matching well to the consensus sequences reported by Cech (1988) (P: AAUUNCNNGAAN, Q: AAUNNGNAGC, R: GUUCAGAGACUANA and S: AAGAUAUAGUCC). The interaction of P and Q forms domain P4; the interaction of R and S forms domain P7 (Fig. 2A). Both are part of the catalytic core of group I introns (Cech et al., 1994). Using P4 and P7 as structural guides, we could fold the 5′ sequence of CdISt1 to fit the typical secondary structure of group I introns. The proposed secondary structure shows the characteristic base pairing regions P1–P9 (Fig. 2A) necessary for proper folding and function of group I introns (Jaeger et al., 1997). Most group I introns form a P10 pairing to support binding and co-ordination of the 3′-splice site during the second step of RNA self-splicing (Michel and Westhof, 1990; Jaeger et al., 1997). However, analysing the exon sequences 3′ to CdISt1, we could not find an appropriate structure for such a P10 pairing in tcdA.
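The agreement between the CdISt1 P, Q, R and S sequences and the consensus motifs quoted above can be checked position by position. The following Python sketch is only an illustration: it assumes an ungapped, left-aligned comparison in which N matches any nucleotide, and it compares only the overlapping positions when the two sequences differ in length (as for R). The alignment actually used for the comparison in the text may differ, and the function names are hypothetical.

```python
# Sequences and consensus motifs are copied from the paragraph above.
OBSERVED = {
    "P": "AAUUGCUGGAAA",
    "Q": "AAUCAGCAGG",
    "R": "CUUCAACGACUA",
    "S": "AAGAUAUAGUCU",
}
CONSENSUS = {
    "P": "AAUUNCNNGAAN",
    "Q": "AAUNNGNAGC",
    "R": "GUUCAGAGACUANA",
    "S": "AAGAUAUAGUCC",
}


def count_mismatches(observed: str, consensus: str) -> tuple[int, int]:
    """Return (mismatches, compared_positions).

    Assumption: ungapped comparison starting at the first nucleotide of
    both sequences; 'N' in the consensus matches any nucleotide.
    """
    compared = min(len(observed), len(consensus))
    mismatches = sum(
        1 for o, c in zip(observed, consensus) if c != "N" and o != c
    )
    return mismatches, compared


def summarize() -> dict[str, tuple[int, int]]:
    """Mismatch counts for all four conserved elements."""
    return {
        name: count_mismatches(OBSERVED[name], CONSENSUS[name])
        for name in OBSERVED
    }


if __name__ == "__main__":
    for name, (mismatches, compared) in summarize().items():
        print(f"{name}: {mismatches} mismatch(es) over {compared} positions")
```

Under these assumptions P matches the consensus at every defined position, Q and S each differ at their last nucleotide, and R differs at three of the twelve compared positions.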

Figure 2.

Characteristics of CdISt1.

A. Proposed secondary structure of CdISt1. Sequences of CdISt1 are given in capital letters, exon sequences in small letters; nucleotides highly conserved in group I introns are in bold and italic (Michel and Westhof, 1990); characteristic pairings are indicated by P1–P9; the P1 pairing contains the 5′-splice site with its conserved uridine-guanosine pairing (in bold u–G); domains P4–P6 and P3–P7–P8 are known to form the catalytic core of group I introns (Jaeger et al., 1997); sequences P, Q, R, S are highly conserved regions of the catalytic core; their extent is indicated by lines parallel to the nucleotide sequence (for a comparison to consensus P, Q, R, S sequences see text); 3′ to the P9.2 pairing, at a distance of 196 nt, two ORFs encoding TlpA and TlpB (transposase-like proteins) are depicted as open boxes (the arrow indicates the direction of transcription). The 3′ terminal nucleotide of CdISt1 is a conserved guanosine. Beginning and end of the 30 bp stretch conserved in CdISt1 and IS8301 (see C) are marked with asterisks. IGS, internal guide sequence.

B. Schematic drawing of the sunY intron of phage T4 (subgroup IA2) according to Jaeger and Michel (1997).

C. DNA sequence comparison of the insertion site and the conserved 30 bp stretch at the 5′ end of CdISt1 and IS8301. Short vertical lines indicate identical nucleotides. In front of the long vertical line the sequence of the 5′-integration site is given.

D. Schematic drawing of the insertion element IS8301 (not to scale). The genes of the two transposases TnpB and TnpC are depicted as open boxes (the arrow indicates the direction of transcription). Dotted lines indicate regions of homology between IS8301 and CdISt1; the percentages of similarity between TlpA and TlpB of CdISt1 and TnpB and TnpC of IS8301, respectively, are given (Table 1). The sequence of the 5′-insertion site is shown in front of IS8301.

E. Comparison of the DDE domains of TlpB and TnpC. The DDE motif is shown in large bold letters. Short vertical lines indicate identical amino acids, asterisks indicate conservatively substituted amino acids. The numbers in parentheses show the distances in amino acids between the given amino acids.

From analysis of its primary and proposed secondary structure (Fig. 2A), CdISt1 could not easily be assigned to one of 11 defined subgroups of group I introns (Shub et al., 1988; Michel and Westhof, 1990). CdISt1 most closely resembled introns from the IA2 subgroup (Fig. 2B), because it possesses a relatively short domain 6, two hairpins in domain 7 (P7.1 and P7.2) and an extended domain 9 (Michel and Westhof, 1990; Jaeger et al., 1997). Unusually for group IA2 introns, CdISt1 has an extended domain P5, composed of several pairings (Fig. 2A).

Insertion element structures in CdISt1

Downstream of the P9 pairings, CdISt1 harbours two ORFs of 105 bp (bp 631–738 of CdISt1) and 1116 bp (bp 740–1858) which are oriented in the same direction and separated by a single nucleotide (Fig. 2A). The larger ORF starts with an alternative initiation codon (GTG) (a shorter version of this ORF might initiate at an ATG start codon located 33 bp downstream) and shows sequences which could act as a ribosome binding site. The ORFs were designated as tlpA and tlpB, because both encode transposase-like proteins (Table 1).
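The ORF coordinates and sizes quoted above can be reconciled with the protein lengths in Table 1 by simple arithmetic. The sketch below assumes that each coordinate range is inclusive and ends with one stop codon, which would explain why the spans (108 and 1119 nt) are 3 nt longer than the stated ORF sizes (105 and 1116 bp). That reading is an inference from the numbers, not something the text states, and the helper names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based coordinate range."""
    return end - start + 1


def sense_codons(start: int, end: int) -> int:
    """Number of amino-acid-coding codons in the range.

    Assumption: the range contains a whole number of codons and its
    last codon is a stop codon.
    """
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3 - 1


def gap_between(end_first: int, start_second: int) -> int:
    """Number of nucleotides lying strictly between two ranges."""
    return start_second - end_first - 1


if __name__ == "__main__":
    # Coordinates from the text: tlpA bp 631-738, tlpB bp 740-1858.
    print("tlpA:", span_length(631, 738), "nt,", sense_codons(631, 738), "aa")
    print("tlpB:", span_length(740, 1858), "nt,", sense_codons(740, 1858), "aa")
    print("spacer:", gap_between(738, 740), "nt")
```

With these assumptions the ranges yield 35 and 372 codons, in line with the sizes given for TlpA and TlpB in Table 1, and a single nucleotide separates the two ORFs.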

Table 1. Transposases homologous to TlpA and TlpB.

Bacterium(a) | Protein | Homologue type | Genetic element | Size(b) (in aa) | Amino acids identical(c) (in %) | Amino acids substituted(d) (in %) | Overlap(e) (in aa) | Accession(f)
Clostridium difficile | TlpA | IS200 homologue | CdISt1 | 35 | 100 | 100 | |
Deinococcus radiodurans | TnpB | IS200 homologue | IS8301 | 140 | 62 | 79 | 29 | BAA32389
Wolbachia sp. | transposase | IS200 homologue | ISW1 | 144 | 43 | 58 | 32 | BAA73610
Helicobacter pylori | TnpA | IS200 homologue | IS606 | 64 | 40 | 63 | 30 | AAD06404
Helicobacter pylori | TnpA | IS200 homologue | IS605 | 138 | 40 | 63 | 30 | AAD11513
Vibrio cholerae | TnpA | IS200 homologue | IS1004 | 145 | 46 | 61 | 32 | CAA91526
Clostridium difficile | TlpB | IS1341 homologue | CdISt1 | 372 | 100 | 100 | |
Deinococcus radiodurans | TnpC | IS1341 homologue | IS8301 | 408 | 48 | 63 | 363 | BAA32390
Thermophilic bacterium PS3 | transposase like protein of PS3IS | | | | | | |
Dichelobacter nodosus | ORF375 | IS1341 homologue | IS1253 | 375 | 41 | 55 | 359 | AAB12365
Borrelia burgdorferi | | IS1341 homologue | ND | 355 | 39 | 58 | 350 | AAC97568
Escherichia coli | | IS1341 homologue | ColIb-P9 | 391 | 36 | 55 | 359 | CAB41498
Shigella sonnei | transposase | IS1341 homologue | ColIb-P9 | 393 | 36 | 55 | 359 | BAA75131
Synechocystis sp. | transposase | IS1341 homologue | ND | 400 | 37 | 55 | 356 | BAA17870
Helicobacter pylori | TnpB | IS1341 homologue | IS605 | 427 | 38 | 57 | 366 | AAC44690
Helicobacter pylori | TnpB | IS1341 homologue | IS606 | 442 | 28 | 44 | 368 | AAD06403

a. Bacteria listed according to the best fit to TlpA or TlpB.
b. The size of the entire ORF according to the Accession No. listed to the right.
c. Amino acid residues identical to TlpA and TlpB respectively.
d. Amino acids identical or conservatively substituted compared with TlpA and TlpB respectively.
e. Number of aa proposed as the overlap by computer analysis.
f. Database Accession No.
ND, not determined.

The deduced protein sequence of tlpA has homologies to IS200-like transposases of Deinococcus radiodurans, Wolbachia spp., Helicobacter pylori and Vibrio cholerae (Table 1) (Bik et al., 1996; Censini et al., 1996; Kersulyte et al., 1998). Whereas IS200-like transposases typically consist of 135–145 aa (Table 1), tlpA codes for only 35 aa, which correspond to the C-terminal end of the homologous IS200-like transposases (Fig. 2A and D). We assume that tlpA is the truncated remainder (3′ part) of a progenitor gene which encoded a complete IS200-like transposase. Unlike tlpA, the tlpB gene codes for a complete putative transposase, TlpB (372 aa) (Table 1). Transposases homologous to TlpB (Table 1), and likewise situated next to IS200-like transposases, are found in composite insertion elements from the IS605 family.

TlpA and TlpB of CdISt1 display the highest homologies to transposases TnpB and TnpC from the insertion element IS8301 of Deinococcus radiodurans (Table 1). IS8301 (Accession No. AB016803) has the typical structures of an IS605-like insertion element (Fig. 2D). Further members of the IS605 family, encoding transposases homologous to TlpA and TlpB, are IS605 (coding for TnpA-IS605 and TnpB-IS605) and IS606 (coding for TnpA-IS606 and TnpB-IS606) of Helicobacter pylori (Table 1) (Censini et al., 1996; Kersulyte et al., 1998; Mahillon and Chandler, 1998).

Apart from the high homologies between the transposases of CdISt1 and IS8301, we identified further similarities between the two elements. (i) The two genes for the transposases in CdISt1 and IS8301 are oriented in the same direction, in contrast to typical IS605-like insertion elements. Moreover, TlpB (CdISt1) and TnpC (IS8301) contain similar DDE domains (Fig. 2E), a motif which has been proposed to be directly involved in catalysis by transposases (Polard and Chandler, 1995). (ii) A database search using the nucleotide sequence of CdISt1 demonstrated that both genetic elements contain a highly conserved DNA stretch of 30 bp (93% identity) at a similar position at the 5′ end (nt 14–43 of CdISt1; nt 11–40 of IS8301) (Fig. 2C). As transposases are known to specifically bind regions close to the ends of their cognate genetic elements while transposing (Polard and Chandler, 1995), it is tempting to assume that the conserved 30 bp DNA stretch is an important structure for mobility. (iii) Both CdISt1 and IS8301 show a target site specificity which is characteristic for insertion elements of the IS605 group (Kersulyte et al., 1998). Both elements insert with their 5′ ends directly downstream of the conserved pentanucleotide TTGAT (Fig. 2C). (iv) Finally, neither CdISt1 nor IS8301 duplicates target sequences during insertion. This is unusual for the majority of IS-elements, but characteristic for transposition of an IS-element of the IS605 group (Mahillon and Chandler, 1998). The similarities between CdISt1 and IS8301 indicate that both elements might use a similar mechanism for transposition.
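The target site specificity described in point (iii) suggests a simple way to list candidate integration positions in a DNA sequence: every position immediately downstream of a TTGAT pentanucleotide. The Python sketch below illustrates only this string search. It scans the given strand alone, the example sequence is invented, and an occurrence of the motif says nothing about whether an element would actually insert there.

```python
def candidate_insertion_sites(dna: str, motif: str = "TTGAT") -> list[int]:
    """Return 0-based indices of the first base after each motif occurrence.

    Overlapping occurrences are reported; only the strand passed in is
    scanned (searching the reverse complement is left out of this sketch).
    """
    dna = dna.upper()
    sites = []
    pos = dna.find(motif)
    while pos != -1:
        sites.append(pos + len(motif))
        pos = dna.find(motif, pos + 1)
    return sites


if __name__ == "__main__":
    # Invented toy sequence, not taken from any C. difficile gene.
    toy = "ACGTTGATCCATTGATG"
    print(candidate_insertion_sites(toy))
```

For the toy sequence the function reports the positions following the two TTGAT occurrences. Since no target duplication is described for CdISt1, the sketch does not attempt to model flanking direct repeats.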

RT-PCR analysis of CdISt1

Group I introns are self-splicing RNA molecules. To obtain evidence for the splicing of CdISt1 from precursor mRNA in vivo, total RNA isolated from C. difficile C34 was used as a template for RT-PCR analysis. Use of primer pairs ICheck1/ICheck2 and ICheck3/ICheck4 (Fig. 1A) showed transcription over the 5′ and 3′ junction of CdISt1, respectively (Fig. 3A), indicating that transcription of the tcdA gene is not abolished by insertion of CdISt1. Using primer pair ICheck1/ICheck4, which amplifies the entire CdISt1 and parts of the flanking exons, we got PCR products of 2200 bp with chromosomal DNA and of 225 bp with cDNA as templates (Fig. 3A). The difference in size of the PCR products corresponds to the length of CdISt1 (1975 bp), revealing excision of the CdISt1 sequences from the primary transcript. Sequencing of the cloned RT-PCR product verified the assumed splice sites and precise excision of CdISt1.
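The size argument in the paragraph above amounts to one subtraction: an amplicon spanning the insertion should shrink by the length of CdISt1 once the element has been spliced out. A minimal Python sketch of that check follows, using only the figures quoted in the text; the function name is hypothetical.

```python
CDIST1_LENGTH_BP = 1975  # length of CdISt1 from the sequence analysis


def expected_spliced_size(genomic_amplicon_bp: int,
                          insert_bp: int = CDIST1_LENGTH_BP) -> int:
    """Expected RT-PCR product size if the insert is excised precisely."""
    if genomic_amplicon_bp < insert_bp:
        raise ValueError("amplicon is shorter than the insert it should span")
    return genomic_amplicon_bp - insert_bp


if __name__ == "__main__":
    # ICheck1/ICheck4 product on chromosomal DNA: 2200 bp.
    print(expected_spliced_size(2200))
```

The result agrees with the 225 bp product obtained from cDNA. Agreement in size alone does not establish the exact splice junctions, which is why the cloned RT-PCR product was also sequenced.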

Figure 3.

RT-PCR analysis of CdISt1.

A. RT-PCR analysis to detect splicing of CdISt1. Using primer pairs given in Table 4, RT-PCR products were generated and separated on 3% agarose gels. Templates used for RT-PCR analysis were chromosomal DNA (lanes 1, 4 and 7) and cDNA generated with (lanes 2, 5 and 8) or without (lanes 3, 6 and 9) adding reverse transcriptase. Amplifications without reverse transcriptase remained without product, showing that the RT-PCR products of lanes 2, 5 and 8 were generated from isolated pure RNA rather than from contaminating DNA. Lane M, size marker (sizes in bp: 501, 489, 404, 331, 242, 190, 147, 111, 110). Sizes of the PCR products are: lanes 1 and 2, 397 bp (5′ junction between tcdA-C34 and CdISt1); lanes 4 and 5, 199 bp (3′ junction); lane 7, 2200 bp (CdISt1 and flanking exon regions; the band of the expected size is marked with an arrow; the additional band is a nonspecific amplification product); lane 8, 225 bp (spliced mRNA).

B. RT-PCR analysis to detect cyclization of CdISt1. ICheck2/ICheck3 were used as primers and cDNA as a template to produce two RT-PCR products (I and II) indicating intramolecular cyclization of CdISt1 at two different sites (lane 1). cDNA synthesized without adding reverse transcriptase was used as a negative control (lane 2).

C. Cyclization sites of excised CdISt1 mRNA. Sequencing of cloned RT-PCR products I and II revealed the two cyclization sites. Cyclization takes place upstream of nt 2 (I) or 94 (II) of the CdISt1 sequence by attack of the 3′ terminal G.

Excised group I introns can optionally cyclize via an additional transesterification (Lambowitz and Belfort, 1993). Performing an RT-PCR analysis with primer pair ICheck2/ICheck3, we obtained two RT-PCR products indicating intramolecular cyclization of CdISt1 and/or CdISt1-related sequences (Fig. 3B). Sequence analysis of cloned RT-PCR products revealed two cyclization sites. Cyclization occurred upstream of nt 2 and, alternatively, of nt 94 of the CdISt1 sequence by attack of the 3′ terminal G (Fig. 3C).

Self-splicing of CdISt1

In CdISt1, the 3′-splice site is more than 1500 nt downstream of P9.2 (Fig. 2A). Despite the extraordinary distance between P9 and the 3′-splice site, our RT-PCR analysis proved that CdISt1 is precisely excised from precursor mRNA in vivo (Fig. 3A). To check the splicing efficiency of CdISt1, we performed an in vitro self-splicing assay. We generated two different precursor RNAs from plasmids pCR-ISt1 and pCR-ISt5. Both contain CdISt1 (1975 nt) and the same 3′ exon (150 nt) but differ in the length of the 5′ exon. pCR-ISt1 and pCR-ISt5 contain 5′ exons of 308 nt and 477 nt respectively (Fig. 4). The calculated sizes of the ligated exons are 458 nt (pCR-ISt1) and 627 nt (pCR-ISt5). Splicing was monitored over a time course of 5 min (Fig. 4, lane 2) to 180 min (Fig. 4, lane 7). Our self-splicing assay revealed that CdISt1 sequences (ISt, 1975 nt) were efficiently excised from both precursor RNA species and, simultaneously, ligated exons (LE) of the predicted sizes were produced by the splicing process (Fig. 4). Although some by-products are observable after an incubation time of more than 120 min (Fig. 4, lanes 6 and 7), the self-splicing assay proved that the main products of the splicing process are ligated exons representing the correct tcdA sequence and the precisely excised ribozyme CdISt1 (Fig. 4).
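The calculated sizes of the splicing products follow directly from the exon and intron lengths given above. The Python sketch below reproduces that arithmetic for both precursors. The full precursor lengths it prints are derived here by summation and are not quoted in the text; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """Lengths (in nt) of the parts of an in vitro splicing precursor."""
    name: str
    five_prime_exon: int
    intron: int
    three_prime_exon: int

    @property
    def precursor_length(self) -> int:
        # Unspliced transcript: 5' exon + intron + 3' exon.
        return self.five_prime_exon + self.intron + self.three_prime_exon

    @property
    def ligated_exons(self) -> int:
        # Product after precise excision of the intron.
        return self.five_prime_exon + self.three_prime_exon


# Values from the text: CdISt1 is 1975 nt, the shared 3' exon is 150 nt.
PRECURSORS = [
    Precursor("pCR-ISt1", five_prime_exon=308, intron=1975, three_prime_exon=150),
    Precursor("pCR-ISt5", five_prime_exon=477, intron=1975, three_prime_exon=150),
]

if __name__ == "__main__":
    for p in PRECURSORS:
        print(f"{p.name}: precursor {p.precursor_length} nt, "
              f"ligated exons {p.ligated_exons} nt, intron {p.intron} nt")
```

The ligated exon sizes come out as 458 nt and 627 nt, the values stated above, while the excised intron is the same 1975 nt species for both precursors.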

Figure 4.

Self-splicing of CdISt1 precursors containing a short (pCR-ISt1) or a long (pCR-ISt5) 5′ exon. The lengths of the exon and intron sequences are diagrammed at the bottom of the figure. The 5′ exon of precursor pCR-ISt1 contains 239 nt of natural tcdA sequence upstream of the 5′-splice site plus 69 nt of vector sequences; the 5′ exon of precursor pCR-ISt5 contains 408 nt of natural tcdA sequence upstream of the 5′-splice site plus 69 nt of vector sequences. The 3′ exons of both precursors contain 108 nt of natural tcdA sequence downstream of the 3′-splice site and 42 nt of vector sequences. Splicing reactions were carried out at 37°C as described in Experimental procedures. Aliquots were removed at 0 min (lane 1), 5 min (lane 2), 10 min (lane 3), 20 min (lane 4), 60 min (lane 5), 120 min (lane 6) and 180 min (lane 7) after the splicing reaction was started by adding GTP, and were electrophoresed on a 4% polyacrylamide/8 M urea gel heated to 50°C. PRE indicates the respective precursors, LE denotes the ligated exon products, ISt marks the position of the linear CdISt1 RNA. Mlow, low range RNA ladder (sizes in nt: 1000, 800, 600, 400, 300, 200); Mhigh, high range RNA ladder (sizes in nt: 6000, 4000, 3000, 2000, 1500, 1000, 500, 200).

Effects of CdISt1 on expression and properties of TcdA-C34

CdISt1 is inserted in that part of the tcdA-C34 gene which codes for the catalytic domain of TcdA-C34 (Wagenknecht-Wiesner et al., 1997). We therefore checked whether insertion of the element affected the translation and catalytic activities of TcdA-C34. As has been described previously for C. difficile reference strain 10463 (Hundsberger et al., 1997), we found that strain C34 also produces about three times more TcdA than TcdB (data not shown). SDS–PAGE revealed that the size of TcdA-C34 is comparable to that of the reference toxin TcdA-10463 (Fig. 1C). To exclude the possibility that purified TcdA-C34 is the product of a second tcdA-C34 gene, we performed a Southern blot analysis which demonstrated the absence of a further tcdA-C34 gene in the genome of C. difficile (data not shown). Additionally, testing of the enzymatic activities of TcdA-C34 revealed that the toxin glucosylated GTPases Rho, Rac, Cdc42 and Rap1, but not Ras (Fig. 1D). Thus, TcdA-C34 shows the same catalytic activities as reference toxin TcdA-10463 (Just et al., 1995). These experiments prove that insertion of CdISt1 in tcdA-C34 had neither a significant effect on the expression rate of TcdA-C34 nor any effect on the size of this protein or on its catalytic activity, verifying efficient and precise splicing of this element in C. difficile.

CdISt1-related sequences in different C. difficile strains

Southern hybridization experiments were performed to search for CdISt1-related sequences in the genomes of the non-toxinogenic strain 42373 and the toxinogenic strains 10463 and C34. A DNA fragment of CdISt1’s 5′ end (bp 5–580 of CdISt1) was used as a DNA probe to screen HincII or HindIII digested chromosomal DNA. As there are target sequences for both restriction enzymes within CdISt1 (HindIII at position 776; HincII at position 1115 of CdISt1), each hybridization signal corresponds to a single CdISt1 copy under these conditions. Southern hybridizations detected at least 11 CdISt1-related sequences in the genome of C. difficile C34, four in the reference strain 10463 and six in the non-toxinogenic isolate C. difficile 42373 (Fig. 5). In order to check a broader range of C. difficile strains, a collection of approximately 200 clinical isolates was screened for the presence of CdISt1 by PCR. While none of the isolates contained CdISt1 inserted in its tcdA gene, CdISt1-specific PCR products were obtained in all 200 strains (data not shown).
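The reasoning that each hybridization signal corresponds to one CdISt1 copy rests on the probe lying entirely on one side of the quoted restriction sites within the element, so that each copy contributes a single probe-bearing fragment. The Python sketch below checks this for the positions given above and also converts the probe coordinates into the numbering used in the Figure 5 legend. It assumes that the accession numbering starts at the start of tcdA-C34, so that the offset equals the 498 nt insertion position; this is inferred from the arithmetic, not stated in the text. Only the two sites quoted in the text are considered, and all names are illustrative.

```python
PROBE_IN_ELEMENT = (5, 580)          # probe, bp positions within CdISt1
CUT_SITES_IN_ELEMENT = {"HindIII": 776, "HincII": 1115}
INSERTION_OFFSET = 498               # CdISt1 lies 498 nt downstream of the tcdA-C34 start


def site_in_interval(position: int, interval: tuple[int, int]) -> bool:
    """True if a position falls inside an inclusive interval."""
    return interval[0] <= position <= interval[1]


def enzymes_cutting_probe() -> list[str]:
    """Enzymes whose quoted site lies inside the probe region."""
    return [
        name
        for name, position in CUT_SITES_IN_ELEMENT.items()
        if site_in_interval(position, PROBE_IN_ELEMENT)
    ]


def shift_interval(interval: tuple[int, int], offset: int) -> tuple[int, int]:
    """Translate element coordinates into flanking-sequence coordinates."""
    return interval[0] + offset, interval[1] + offset


if __name__ == "__main__":
    print("enzymes cutting inside the probe:", enzymes_cutting_probe())
    print("probe in shifted numbering:",
          shift_interval(PROBE_IN_ELEMENT, INSERTION_OFFSET))
```

Neither quoted site falls within the probe, and shifting bp 5–580 by 498 gives 503–1078, the range named in the Figure 5 legend.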

Figure 5.

Southern hybridization to detect CdISt1-related sequences. CdISt1-related sequences were detected by Southern hybridization using a partial CdISt1 fragment as the DNA probe (nt 503–1078 of Accession No. AJ131844). The DNA probe was chosen such that each signal corresponds to a single insertion of CdISt1 in HincII or HindIII digested chromosomal DNA. Chromosomal DNA (15 µg each) of reference strain 10463 (lanes 1 and 4), strain C34 (lanes 2 and 5) and the non-toxinogenic isolate 42373 (lanes 3 and 6) was digested with the indicated restriction enzymes (HincII: lanes 1–3; HindIII: lanes 4–6). Hybridization of CdISt1 to DNA of strain 10463 (lanes 1 and 4) yielded four bands (in lane 4 the largest band resulted from incomplete digestion), six signals on DNA of isolate 42373 (lanes 3 and 6), and at least 11 on DNA of strain C34 (lanes 2 and 5).

Properties of CdISt1 variants in strain C34

By inverse PCR (iPCR), we identified the integration sites of 10 further CdISt1 variants (CdISt1a–CdISt1j) in the genome of strain C34. Sequencing of the iPCR products revealed that all variants are integrated in bacterial ORFs directly downstream of the conserved pentanucleotide TTGAT (Table 2). The deduced amino acid sequences of eight of these ORFs show homologies to known bacterial proteins (Table 3).

Table 2. Characteristics of CdISt1 variants.

(in nt) | Identity (%): Overall(a) | Intron component(b) | TlpA(c) | TlpB(d) | Integration site(e)

a. Identity of the nucleotide sequence compared with CdISt1.
b. Identity of the nucleotide sequence of the intron component (nt 1–440) compared with CdISt1.
c. Identity of the amino acid sequence compared with TlpA of CdISt1.
d. Identity of the amino acid sequence compared with TlpB of CdISt1.
e. The arrow marks the integration site.
Δ, deleted.

Table 3. Integration sites of selected CdISt1 variants.

CdISt1 variant | Homologue ORF | Organism | Accession No.
CdISt1a | Alcohol dehydrogenase | Thermotoga maritima | AAD35205
CdISt1c | Virulence factor | Escherichia coli | P75932
CdISt1d | Putative AraC-type regulator | Ruminococcus flavefaciens | CAB51935
CdISt1e | Aspartate racemase | Archaeoglobus fulgidus | 2649148
CdISt1f | Putative transcriptional regulator | Mycobacterium leprae | CAA18564
CdISt1g | Putative integral membrane protein | Streptomyces coelicolor | CAB52363
CdISt1i | 16S pseudouridylate synthase | Thermotoga maritima | AAD35352
CdISt1j | Pyruvate-formate lyase-activating | Clostridium pasteurianum | CAA63749

All identified CdISt1 variants were amplified by PCR and subsequently cloned into vector pCR2.1. Sequence analysis revealed that CdISt1 variants a–j are highly homologous to CdISt1 (overall sequence identity > 93%, Table 2). In addition, the characteristic composite structure of CdISt1 is conserved in all CdISt1 variants. Whereas six out of 10 variants analysed have lost nearly the whole tlpA gene, all variants retained the group I intron component of CdISt1 and the complete tlpB gene (Table 2).
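Identity values such as those reported for the variants are computed from a pairwise alignment. The Python sketch below shows one common definition: matching columns divided by all aligned columns, with gap-containing columns counted as mismatches. How the values in Table 2 were actually calculated is not stated in the text, so this definition is an assumption, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumptions: '-' marks a gap; columns where both sequences have a gap
    are ignored; a column with a gap in one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Invented toy alignment: 28 identical positions out of 30.
    reference = "A" * 30
    variant = "A" * 28 + "CC"
    print(round(percent_identity(reference, variant), 1))
```

Other conventions, for example dividing by the length of the shorter sequence or excluding gapped columns, give slightly different percentages for the same alignment.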

Using primers given in Table 4, we performed an RT-PCR analysis to check splicing of CdISt1 variants a–j. With cDNA as a template, RT-PCR products corresponding to the accurately spliced mRNA species were generated (Fig. 6, lanes 2, 6, 10 and data not shown). These experiments indicate that all CdISt1 variants are excised from their primary transcripts. In all cases, precise splicing was verified by sequencing of the cloned RT-PCR products. Controls using chromosomal DNA of strain C34 as a template (Fig. 6, lanes 1, 5, 9) never gave PCR products of the size of the corresponding RT-PCR products, demonstrating that the RT-PCR products of lanes 2, 6, 10 do not originate from a second copy of the respective ORF.

Table 4. Primers used in this study.
Primer | Sequence (5′–3′) | Amplificate(a)

a. With flanking exon regions.

Figure 6.

RT-PCR analysis of CdISt1 variants d–f. The RT-PCR analysis of CdISt1 variants d–f is shown as an example for all CdISt1 variants. RT-PCR products were generated with primers given in Table 4 and separated on a 3% agarose gel. Templates used for RT-PCR analysis were chromosomal DNA (lanes 1, 5 and 9) and cDNA generated with (RT+; lanes 2, 6 and 10) or without adding reverse transcriptase (RT–; lanes 3, 7 and 11). Respective negative controls (adding H2O instead of template) are shown in lanes 4, 8 and 12. The calculated sizes of RT-PCR products indicating precise splicing of the respective CdISt1 variant are: CdISt1d, 314 bp; CdISt1e, 247 bp; CdISt1f, 228 bp. The amplifications without reverse transcriptase remained without products, showing that the RT-PCR products of lanes 2, 6 and 10 were generated from isolated pure RNA rather than from contaminating DNA. Lane M, size marker (sizes in bp: 501, 489, 404, 331, 242, 190, 147, 111, 110).


The pathogenic clinical isolate C. difficile C34 was collected in the course of a nosocomial diarrhoea study (Mahony et al., 1991). Characterizing the catalytic domain of tcdA-C34 by PCR and sequence analysis, we identified an intervening segment of 1975 bp (designated as CdISt1) in the tcdA-C34 gene, located 498 bp downstream of its ATG-start codon. Characterization of CdISt1 revealed that it comprises typical features of two different genetic elements: group I introns and IS-elements.

The 434 bp at the 5′ end of CdISt1 show the typical structures and key features of group I introns (Fig. 2A). RT-PCR analysis demonstrated that CdISt1 (1975 bp) is precisely excised from the precursor mRNA in vivo (Fig. 3A). Furthermore the self-splicing assay revealed that accurate splicing of CdISt1 is obviously an efficient process and not a rare event (Fig. 4). In consequence, C. difficile strain C34 produces an enzymatically and biologically active enterotoxin TcdA-C34 (Fig. 1D). Proof of cyclization of the excised CdISt1-mRNA (Fig. 3B and C) verified the ribozyme activity of this genetic element. Thus, CdISt1 exerts the characteristic catalytic activity of a group I intron.

Many group I introns have ORFs looped out of their pairings P1, P2, P6, P8 or P9 (reviewed in Lambowitz and Belfort, 1993). Usually these ORFs encode either maturases or, more often, endonucleases. The two ORFs of CdISt1 differ from the ORFs of ‘classical’ group I introns in both localization and function. Neither of the two CdISt1 ORFs shows homology to maturases or endonucleases. Instead, tlpA and tlpB encode proteins homologous to putative bacterial transposases of two unrelated IS-elements, IS200 and IS1341 (Table 1). The association of IS200 and IS1341 homologues is characteristic of composite insertion elements of the IS605 type (Kersulyte et al., 1998). However, as the 5′ end of tlpA is evidently deleted, what remains in CdISt1 is the structure of a truncated insertion element from the IS605 family (Fig. 2A). Strikingly, this IS-element-like structure, which harbours tlpA and tlpB, cannot be assigned to any of the domains P1–P9 but is located more than 190 bp downstream of P9.2 (Fig. 2A). Thus, CdISt1 shows the structure of a chimeric genetic element, composed of a group I intron fused to the 3′ region of a truncated insertion element from the IS605 family (Fig. 2).

We screened over 200 toxinogenic C. difficile strains for the existence of CdISt1. Although we detected CdISt1-related sequences in all investigated isolates, none except strain C34 contained CdISt1 integrated in its tcdA gene (data not shown). There are two possible explanations for the presence of CdISt1 in the tcdA gene of strain C34. Either CdISt1 was generated directly in the tcdA gene of strain C34 by successive integration of a group I intron and an IS-element or, alternatively, CdISt1 is mobile and moved into tcdA-C34 as a chimeric genetic element. To address this question, we characterized the structure and activity of 10 further CdISt1-related sequences identified in the genome of strain C34 (Fig. 5). The analysis revealed that all these elements are chimeras that are highly homologous to CdISt1 and share its characteristic composite structure (Fig. 6 and Table 2). The existence of CdISt1-like elements at several locations in the genome indicates that CdISt1 is mobile and may transpose as a complete genetic unit. Furthermore, sequence analysis of CdISt1, its variants CdISt1 a–j and their integration sites revealed that these genetic elements show all features characteristic of an IS605-like mobility mechanism (Kersulyte et al., 1998): (i) CdISt1 and its variants insert with their 5′ end downstream of a strictly conserved A+T-rich pentanucleotide (TTGAT), and no sequence specificity is observed next to their 3′ ends (Table 2); (ii) all CdISt1-related elements integrated without duplication of target sequences; and (iii) all elements lack terminal inverted repeats. These data, together with the striking similarities found between CdISt1 and the insertion element IS8301 (see Results and Fig. 2), lead us to assume that the IS-element component mediates the spread of CdISt1.
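The target-site rule in point (i) can be illustrated with a short scan for the conserved pentanucleotide. The following Python sketch is illustrative only and is not part of the original analysis: the function names and the demo sequence are hypothetical, and a motif hit merely marks a position that would be compatible with the reported insertion pattern, not a predicted insertion.

```python
import re

TARGET = "TTGAT"  # conserved pentanucleotide found 5' of every insertion (Table 2)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_insertion_points(seq: str) -> list[tuple[int, str]]:
    """List positions lying immediately downstream of TTGAT on either strand.

    Each entry is (index, strand). The index is a 0-based coordinate on the
    given strand; an element inserted there would sit between index-1 and
    index. A lookahead is used so that overlapping motifs are not missed.
    """
    seq = seq.upper()
    sites = []
    # Forward strand: insertion point is just after the motif.
    for m in re.finditer(f"(?={TARGET})", seq):
        sites.append((m.start() + len(TARGET), "+"))
    # Reverse strand: the motif appears as its reverse complement (ATCAA),
    # and "downstream" on that strand is just before it in forward coordinates.
    for m in re.finditer(f"(?={reverse_complement(TARGET)})", seq):
        sites.append((m.start(), "-"))
    return sorted(sites)


# Hypothetical demo sequence, invented for illustration only.
demo = "GGCTTGATACCGATCAAGG"
print(candidate_insertion_points(demo))
```

For the invented demo sequence, the scan reports one forward-strand site after the TTGAT at positions 3 to 7 and one reverse-strand site before the ATCAA at positions 12 to 16.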

The splicing ability of the 10 identified CdISt1 variants was investigated by RT-PCR analysis (Fig. 6 and data not shown). All CdISt1 variants are accurately spliced from the corresponding primary transcripts, demonstrating their ribozyme activity in several different ORFs. Thus, integration of CdISt1 into genes usually does not severely impair, let alone abolish, expression of the respective proteins. In addition, we found that CdISt1 is spliced efficiently and precisely even in a heterologous, eukaryotic system (unpublished data). Splicing is thus a typical feature of CdISt1-related sequences.

We have shown that the characteristic chimeric structure of CdISt1 and its typical activity are conserved in all analysed CdISt1 variants. Therefore, we conclude that CdISt1 represents a functional genetic unit composed of an insertion element component, probably mediating mobility, and a group I intron component, responsible for the ribozyme activity of CdISt1.

Splice-site recognition of group I introns relies on pairings with exon sequences (Lambowitz and Belfort, 1993). This holds true for CdISt1, too. To form the P1 domain the internal guide sequence (IGS) of the intron component has to pair with the exon sequences flanking its 5′-splice site (Fig. 2A). Because the P1 domain co-ordinates the 5′-splice site to the catalytic core of the group I intron, its correct formation is a prerequisite for precise splicing (Jaeger et al., 1997). To maintain the splicing activity of CdISt1 after transposition, a productive mobility mechanism should always place CdISt1 in an appropriate sequence context to enable formation of a P1 domain. As mentioned above, we found that CdISt1 and all its variants are site-specifically inserted downstream of the conserved pentanucleotide TTGAT (Fig. 2C and Table 2). On the RNA level, the final four nucleotides of this nucleotide stretch (UGAU) can pair with part of the IGS of CdISt1 (GUCA) to co-ordinate the 5′-splice site correctly (Fig. 2A). Thus, its target site specificity apparently ensures that CdISt1 always integrates in sites that support formation of an appropriate P1 domain for the splicing process (Fig. 2A,C). Obviously the mobility mechanism, a presumed function of the IS-element component, and the ribozyme activity mediated by the intron component of CdISt1 are well adapted to each other.
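The pairing between the exon end and the IGS described above can be checked mechanically. The sketch below is a minimal illustration, assuming only the two four-nucleotide stretches quoted in the text (UGAU and GUCA); the function name is hypothetical, and the check covers base complementarity only, not the full P1 geometry shown in Fig. 2A.

```python
# Watson-Crick pairs plus the G-U wobble pair, which RNA helices tolerate.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_antiparallel(strand_a: str, strand_b: str) -> list[tuple[str, str, bool]]:
    """Pair strand_a (5'->3') against strand_b (also written 5'->3').

    strand_b is read in reverse so that each base of strand_a faces its
    partner in an antiparallel helix. Returns (base_a, base_b, pairs?).
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    return [(a, b, (a, b) in PAIRS) for a, b in zip(strand_a, reversed(strand_b))]


exon_end = "UGAU"  # last four nucleotides of the transcribed TTGAT target
igs_part = "GUCA"  # part of the internal guide sequence quoted in the text

for a, b, ok in pair_antiparallel(exon_end, igs_part):
    if not ok:
        label = "mismatch"
    elif {a, b} == {"G", "U"}:
        label = "wobble pair"
    else:
        label = "Watson-Crick pair"
    print(f"{a}-{b}: {label}")
```

Read this way, the first three positions form Watson-Crick pairs (U-A, G-C, A-U) and the final exon nucleotide forms a U-G wobble pair, which is consistent with the U·G pair commonly found at the 5′-splice site of group I introns.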

CdISt1 might be regarded as a molecular symbiosis specialized to survive in prokaryotes because it appears to have the advantages of both precursor components, the intron and the insertion element. On the one hand, the group I intron component might profit from the mobility mechanisms provided by the IS-element component. In contrast to homing endonucleases responsible for the mobility of group I introns in eukaryotes (Jurica and Stoddard, 1999), transposases of insertion elements are adapted to mediate mobility in bacterial genomes (Polard and Chandler, 1995; Mahillon and Chandler, 1998). Therefore, transposases may provide a well-suited mechanism for the spread of group I introns in prokaryotes. Indeed, our data show that CdISt1 was able to spread very successfully in C. difficile genomes (Fig. 5). On the other hand, the IS-element component could take advantage of the ribozyme activity of the intron component. The transposition activity of ‘classical’ IS-elements is generally maintained at a low level, because each new transposition could give rise to detrimental or even lethal mutations in the host cell (Doolittle et al., 1984; Mahillon and Chandler, 1998). In CdISt1, this disadvantage of IS-elements is compensated for, because its IS-element component can ‘utilize’ the splicing activity of the group I intron component. In fact, we showed that CdISt1 does not prevent expression of a functional TcdA-C34 (Fig. 1C and D). The same characteristics should apply to all identified CdISt1 variants because they are precisely spliced from their precursor mRNAs (Fig. 6). The risk of causing mutations through transposition of CdISt1-like genetic elements is thus significantly reduced. Therefore, we hypothesize that, in contrast to ‘classical’ IS-elements and group I introns, functional chimeric ribozymes such as CdISt1 are mobile genetic elements with the ability to insert into bacterial genes without severely affecting their transcription and translation.

In conclusion, the study of C. difficile strain C34 led us to the identification of the chimeric genetic element CdISt1. CdISt1 appears to combine the invasiveness of an insertion element with the splicing ability of a group I intron, rendering transposition harmless for the interrupted gene. We assume that CdISt1 constitutes a novel group of ribozymes that are specifically adapted to survive and spread in eubacterial genomes.

Experimental procedures

Bacterial strains

C. difficile strain 10463 was a gift from N. M. Sullivan (Virginia Polytechnic Institute, Blacksburg, VA, USA), with the non-toxinogenic strain 42373 donated by M. Delmée (Unité de Microbiologie, Université de Louvain, Brussels, Belgium). C. difficile C34 is a clinical isolate from Halifax, NS, Canada. C. difficile strains were grown in an anaerobic chamber using brain–heart infusion (BHI) or Wilkins–Chalgren broth. The Escherichia coli host strain TOP10 was used according to the manufacturer's recommendations (Invitrogen).

Protein preparation and glucosyltransferase activity

C. difficile toxin TcdA was prepared as described previously (von Eichel-Streiber et al., 1987). The purity of toxin preparations was assayed by SDS–PAGE. Glucosyltransferase reactions were performed as described by Wagenknecht-Wiesner et al. (1997), using recombinant GTPases Rho, Rac, Cdc42, H-Ras and Rap1 as target proteins. Proteins were separated by SDS–PAGE. Radiolabelled bands were detected by PhosphorImager analysis (Molecular Dynamics).

Standard PCR, cloning of PCR products and sequencing

PCR amplifications were performed as described (Soehn et al., 1998) in a Hybaid Omnigene Cycler (MWG-Biotech), using the primer pairs given in Table 4. PCR products were cloned into plasmid pCR2.1 using the TOPO TA cloning kit in accordance with the manufacturer's recommendations (Invitrogen). A LI-COR 4000L automatic DNA sequencer from MWG-Biotech and the Thermosequenase Cycle Sequencing Kit from Amersham Pharmacia Biotech were used for DNA sequencing. Primers used for sequencing were IRD-800-labelled (MWG-Biotech). To verify the reliability of the PCR-based sequence, sense and antisense strands of three independent clones were sequenced. The sequence data of A1-C34 (including CdISt1) have been deposited in the EMBL database under Accession No. AJ131844.

Nucleic acid analysis

Analysis of DNA and deduced amino acid sequences was performed with Lasergene software from DNASTAR. Analysis of RNA secondary structures was carried out with mfold version 3.0, available at http://www.ibc.wustl.edu/~zuker/ (Mathews et al., 1999; Zuker et al., 1999).

RNA preparation and RT-PCR

Isolation of total RNA from C. difficile cells of the exponential growth phase, purification of the RNA to remove any traces of chromosomal DNA and cDNA synthesis using hexanucleotides as primers were performed as described previously (Hundsberger et al., 1997). Oligonucleotides used for RT-PCR experiments are listed in Table 4.

Amplification was achieved by denaturing at 95°C (1 min), primer annealing at 52°C (1 min) and extension at 72°C (1 min), repeated for 30 cycles.
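As a quick sanity check on run length, the cycling programme above can be written down as data. This is a minimal sketch: the class and function names are hypothetical, and the total covers only the programmed hold times, since ramp times and any initial denaturation or final extension are not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# One cycle as described in the text: 95°C, 52°C and 72°C for 1 min each.
CYCLE = (
    Step("denaturation", 95, 60),
    Step("annealing", 52, 60),
    Step("extension", 72, 60),
)
N_CYCLES = 30


def total_hold_minutes(cycle: tuple[Step, ...], n_cycles: int) -> float:
    """Sum of programmed hold times in minutes (ramping not included)."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


print(f"{total_hold_minutes(CYCLE, N_CYCLES):.0f} min of hold time")
```

With these values the programmed holds add up to 90 min.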

Preparation of precursor RNA

Precursor RNA was transcribed by T7 RNA polymerase from plasmids pCR-ISt1 and pCR-ISt5 linearized with BamHI in a standard reaction mixture for 60 min at 37°C (40 mM Tris-HCl pH 7.9, 6 mM MgCl2, 2 mM spermidine, 10 mM NaCl, 10 mM DTT, 1 mM each NTP, 50 units of ribonuclease inhibitor, 30 units of T7 RNA polymerase and 1 µg of linearized DNA). Transcription products were 2433 nt (pCR-ISt1) and 2602 nt (pCR-ISt5) in length. After RNA synthesis, the template DNA was digested with 10 units of DNase (RNase-free) for 15 min at 37°C. Next, 1 µl of 500 mM EDTA, 20 µl of 3 M sodium acetate and 170 µl of TE10/1 were added to the reaction mixture, which was then extracted with 300 µl of acidic phenol (water-saturated)/chloroform (1:1). The upper phase was transferred to a new reaction tube and the RNA was precipitated with 0.7 volumes of isopropanol. Precursor RNA was dissolved in 50 µl of H2O and used directly in self-splicing assays.

Self-splicing reactions

Splicing of precursor RNAs was carried out in 100 mM (NH4)2SO4, 10 mM MgCl2, 100 mM HEPES pH 7.5 at 37°C. The splicing reaction was started by the addition of GTP (100 µM). Aliquots were removed at specified times and added to an equal volume of gel loading buffer containing 20 mM EDTA. Samples were electrophoresed on 4% polyacrylamide/8 M urea gels heated to 50°C. The RNA was visualized using a transilluminator (260 nm) after staining the gel with ethidium bromide.
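The stop step can be checked with simple dilution arithmetic. The sketch below assumes that volumes are additive and that the loading buffer itself contains no Mg2+ (neither is stated in the text); the function name is hypothetical. Because EDTA chelates Mg2+ at roughly 1:1 stoichiometry and group I splicing depends on Mg2+, an excess of EDTA after mixing is what would be expected to halt the reaction.

```python
def after_mixing(conc_mm: float, parts: float, total_parts: float) -> float:
    """Concentration (mM) of a component after dilution into the mixed sample."""
    return conc_mm * parts / total_parts


MG_IN_REACTION_MM = 10.0  # MgCl2 in the splicing buffer (from the text)
EDTA_IN_BUFFER_MM = 20.0  # EDTA in the gel loading buffer (from the text)

# One part reaction aliquot plus one part loading buffer ("equal volume").
mg_final = after_mixing(MG_IN_REACTION_MM, 1, 2)
edta_final = after_mixing(EDTA_IN_BUFFER_MM, 1, 2)

print(f"Mg2+ after mixing: {mg_final} mM")
print(f"EDTA after mixing: {edta_final} mM")
print(f"EDTA:Mg2+ molar ratio: {edta_final / mg_final}")
```

Under these assumptions the mixed sample contains 5 mM Mg2+ and 10 mM EDTA, a twofold molar excess of chelator.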

iPCR, cloning of iPCR products and sequencing

Inverse PCR was performed as described by Rolfs et al. (1992) using 5 µg chromosomal DNA of strain C34 digested with the restriction enzymes HincII, HindII, EcoRV, PstI or XbaI. iPCR reactions were carried out with the primers ICheck2 and ICheck3 (Table 4). Cloning and sequencing of the iPCR amplificates were performed as described for the standard PCR.

Southern hybridization

Chromosomal DNA (15 µg) of C. difficile strains 10463, C34 and 42373 was digested with restriction enzymes HincII and HindIII. The DNA was separated on a 1% agarose gel and vacuum blotted onto a positively charged nylon membrane. Southern hybridization was performed using the RENAISSANCE Random Primer Fluorescein Labeling kit with Antifluorescein-HRP (NEN) in accordance with the manufacturer's recommendations. The hybridization temperature was 50°C. The stringency washes were performed at the same temperature using 6.0 × SSC, 1.0% SDS and 6.0 × SSC, 0.1% SDS as wash buffers. The membrane was exposed to a BIOMAX MR film (Eastman Kodak Company) and developed after 20 min.


This work was supported by grant Ei 206/10-1 from the Deutsche Forschungsgemeinschaft. Some of the data reported here will be presented in the PhD thesis of M.M. C.v.E. wants to express his special gratitude to the Johannes Gutenberg-University in Mainz for providing his group with laboratory space in the Verfügungsgebäude für Forschung und Entwicklung. D.E.M. acknowledges the past support of the National Health Research and Development Program of Canada (Grant no. 6603-1252-54) when strain C34 was first isolated.

