Antibiotic resistance in Gram-negative bacteria is often due to the acquisition of resistance genes from a shared pool. In multiresistant isolates these genes, together with associated mobile elements, may be found in complex conglomerations on plasmids or on the chromosome. Analysis of available sequences reveals that these multiresistance regions (MRR) are modular, mosaic structures composed of different combinations of components from a limited set arranged in a limited number of ways. Components common to different MRR provide targets for homologous recombination, allowing these regions to evolve by combinatorial evolution, but our understanding of this process is far from complete. Advances in technology are leading to increasing amounts of sequence data, but currently available automated annotation methods usually focus on identifying ORFs and predicting protein function by homology. In MRR, where the genes are often well characterized, the challenge is to identify precisely which genes are present and to define the boundaries of complete and fragmented mobile elements. This review aims to summarize the types of mobile elements involved in multiresistance in Gram-negative bacteria and their associations with particular resistance genes, to describe common components of MRR and to illustrate methods for detailed analysis of these regions.
Antibiotic resistance, especially simultaneous resistance to multiple classes of antibiotics (multiresistance), is an increasing global problem. Gram-negative bacteria, in particular the Enterobacteriaceae, are adapted to exchanging genetic information and antibiotic resistance in these organisms is often due to the acquisition of genes from a shared pool (Iredell & Partridge, 2010). Genes in this pool are not intrinsically mobile and appear to have been captured from the chromosomes of various species, where they may have originally had other functions (Martinez et al., 2009). Such capture involves two different types of mobile genetic elements: those able to transfer genes between DNA molecules, referred to here as mobile elements, for example insertion sequences (IS) (Chandler & Mahillon, 2002), gene cassettes (Partridge et al., 2009), integrons (Cambray et al., 2010) and transposons (Grinsted et al., 1990; Grindley, 2002), and those able to transfer between cells, for example, conjugative and mobilizable plasmids (Carattoli, 2009; Smillie et al., 2010) and integrative conjugative elements (ICE; Waldor, 2010; Wozniak & Waldor, 2010).
The reservoir of potential antibiotic resistance genes detected in different environments, for example by metagenomic analysis (D'Costa et al., 2006; Sommer et al., 2009), appears to be extremely large (Martinez et al., 2009). However, it is not possible to predict which of these genes will emerge into the pool accessible to potentially pathogenic Gram-negative species (Courvalin, 2005, 2008), which contains only a limited variety of mobilized resistance genes (Martínez et al., 2007; Martinez et al., 2009). Successful mobilization of a resistance gene into this pool presumably requires a combination of events. The relevant mobile element must have access to the organism that carries the resistance gene in question, the mobile element must happen to interact with the gene to allow ‘capture’ (Baquero, 2004) and must be able to transfer the gene to a plasmid or a similar genetic vehicle that is able to enter the relevant pool. It seems likely that such successful capture/mobilization events would be rare and the subsequent acquisition of a gene from this pool by many different bacteria would be more efficient than relying on additional capture events.
Mobilized resistance genes identified in Gram-negative bacteria also vary in their distribution, suggesting that entry into the appropriate pool is not always sufficient for a new gene to become widely established (Shaw et al., 1993; Walsh, 2006; Partridge et al., 2009). Under strong selection, such as the use of an antibiotic, the first element to enter the pool that confers an advantage would have the highest chance of spreading and becoming predominant, but as new genes enter the pool and selective pressures change, new elements may take over (Martinez et al., 2009).
Available evidence indicates that in Gram-negative bacteria, particularly the Enterobacteriaceae, the resistance genes and associated mobile elements carried on plasmids are often found clustered together in large multiresistance regions (MRR). Similar conglomerations are also found in ICE (Wozniak et al., 2009) and various chromosomal ‘resistance islands’, for example SGI1 in Salmonella spp. (Hall, 2010) and AbaR in Acinetobacter baumannii (Post & Hall, 2009; Adams et al., 2010). Insertions into a particular region of a plasmid or a chromosome may disrupt vital functions, preventing replication or conjugation, for example, or disrupting cell growth or division, and insertions into the chromosome usually occur at specific sites. Certain hot-spots for the integration of different genetic information have been identified in the Escherichia coli genome (Touchon et al., 2009), SXT elements insert into the prfC gene of Vibrio cholerae (Hochhut & Waldor, 1999) and SGI variants insert at the same position between the thdF and yidY genes in Salmonella (Doublet et al., 2005), Proteus mirabilis (Boyd et al., 2008) and potentially other bacteria (Doublet et al., 2007). There is also evidence for the targeted insertion of transposons into plasmids (Sota et al., 2007).
If a mobile element is inserted into a location where it has no deleterious effects, either due to such targeting or by chance, it may then act as a ‘founder element’ (Parks & Peters, 2007) that provides a target for further insertions that also do not disrupt vital functions. Such events may be exceptionally rare ‘one-offs,’ but structures derived from them may spread widely if they provide a strong selective advantage. As we only see the ‘tip of the iceberg’ of structures that are successful in terms of the biased sets of samples that have been studied, rather than the results of all possible insertion events, the observed accretion of mobile elements and associated resistance genes into MRR makes sense.
Examination of the available sequences of MRR on plasmids and resistance islands suggests that they are modular, mosaic structures, consisting of different combinations of highly conserved components from a limited set (Baquero, 2004). The number of different configurations of these components is also smaller than would be expected if they were assembled by random interactions (Baquero, 2004). While some combinations can be explained by the targeted insertion mentioned above, some elements, including many IS, do not appear to target specific sites. Instead, common components present in different MRR provide homologous regions that allow ‘combinatorial evolution’ based on new interactions between existing pieces (O'Brien, 2002; Baquero, 2004; Walsh, 2006). The composition of a particular MRR will depend on the availability of different components and some interactions may be more favourable than others. Combined with selection pressure, this may result in ‘winner’ combinations that are very common (Baquero, 2004).
Accretion of resistance genes into MRR means that a bacterium may be able to rapidly acquire a combination of resistance genes en bloc and that the biological success of a resistance gene is dependent on its wider genetic context (Walsh, 2006). The association of a particular resistance gene with others may allow co-selection for its maintenance by several different antibiotics. The insertion of a resistance gene into an MRR may also enable incorporation into new mobile entities that may not have some of the limitations of the original capturing element (O'Brien, 2002). Thus, although the mechanisms of action of the products of many resistance genes are well characterized and many of the associated mobile elements have been studied, our understanding of the complexities of multiresistance is still very limited (Walsh, 2006). There are clearly rules that govern how MRR arise, evolve and spread (Baquero, 2004), and potentially their associations with different types of plasmids, but understanding these rules requires comparison of many examples.
Until recently, relatively few complete MRR or plasmids had been sequenced (Frost et al., 2005), but advances in technology are leading to the availability of increasing amounts of sequence data. Unfortunately, meaningful comparison of MRR is often hampered by inconsistent annotation and incomplete analysis (Iredell & Partridge, 2010), as most automated sequence annotation programs focus on identifying genes and the potential function of their products (or individual domains) by homology to known examples. This is of limited use in DNA segments relating to antibiotic resistance in Gram-negative bacteria, where very similar or even identical genes with known functions have often already been identified many times. In these sequences, identifying precisely which genes are present and their genetic context, i.e. the associated mobile elements and the boundaries between these elements, is more important. This type of detailed analysis of all of the components of a sequenced MRR still usually requires manual input (Frost et al., 2005) in addition to automated annotations. This review aims to summarize the characteristics of the most common components of available MRR and the principles needed to fully analyse sequences relating to antibiotic resistance from Gram-negative bacteria and their plasmids, using examples to illustrate these principles and to demonstrate ways of analysing these complex sequences.
Classic IS are compact mobile elements usually bounded by short, identical or imperfect inverted repeats (IR). One or two genes encoding transposase proteins generally cover almost the entire region between these IR, which are defined as IRL, at the left end relative to the direction of transcription of the transposition gene(s), and IRR, at the right end. Recognition of the IR by the transposase protein enables the movement of the IS to a new location by a ‘cut and paste’ and/or a ‘copy and paste’ process, depending on the particular IS (Chandler & Mahillon, 2002). Transposition of most IS generates direct repeats (DR) of a characteristic length, usually 2–14 bp (Chandler & Mahillon, 2002). Although classical IS as originally defined do not carry resistance genes within them, two copies of the same IS (or two closely related IS) that happen to insert either side of a gene can capture it as part of a composite transposon that can move as a unit. IS may also provide complete or partial promoters that can drive the expression of captured (or adjacent) genes.
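The inverted-repeat relationship described above can be checked with a few lines of code. The following is a minimal sketch, not a published tool: it assumes the candidate element has already been extracted as a plain DNA string on the strand that reads from IRL to IRR, and the function names and the toy sequence are hypothetical. It counts mismatches between the left end and the reverse complement of the right end, so that a perfect IR gives 0 and an imperfect IR gives a small positive number.

```python
# Hypothetical helper names; the sequences used with them are toy examples,
# not real IS sequences.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat_mismatches(element: str, ir_length: int) -> int:
    """Count mismatches between the left end of `element` and the
    reverse complement of its right end, over `ir_length` bases."""
    if ir_length <= 0 or 2 * ir_length > len(element):
        raise ValueError("ir_length must be positive and at most half the element length")
    left = element[:ir_length].upper()
    right = reverse_complement(element[-ir_length:]).upper()
    return sum(a != b for a, b in zip(left, right))


# Toy element: an 8-bp left end, an arbitrary core, and a perfect inverted copy.
toy = "GGCACTGT" + "ATGAAACCC" + reverse_complement("GGCACTGT")
print(inverted_repeat_mismatches(toy, 8))  # 0 for this constructed example
```

In practice the IR length is not known in advance, so one might call the function for a range of lengths and look for the longest span with few mismatches. That choice is a judgment call rather than part of any definition given here.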
IS related to known families but carrying ‘passenger’ genes are now being identified, blurring the distinction with transposons, and have been designated tIS (Siguier et al., 2009). Two other types of IS, ISEcp1-like elements and ISCR, are unusual in that a single copy of the element is able to capture and move resistance genes.
ISEcp1 and related elements
ISEcp1 is flanked by 14-bp IR, but appears to move adjacent regions by failing to recognize IRR and instead using a weakly related downstream sequence (designated IRalt here) in combination with IRL (Poirel et al., 2003, 2005a; Lartigue et al., 2006). DR (5 bp) flanking the entire ‘transposition unit’, i.e. IRL to IRalt, are created on insertion. Several other elements (ISEnca1, ISSm2, IS1247) appear to mobilize adjacent regions, including antibiotic resistance genes, in the same way as ISEcp1 (van der Ploeg et al., 1995).
Gene cassettes are the smallest mobile elements associated with antibiotic resistance and each consists of a gene, often preceded by a ribosome-binding site, but usually not a promoter, and an attC recombination site (or 59-base element). The attC sites of different cassettes vary in length and sequence, but include conserved regions at their ends. Gene cassettes can transiently exist as circular molecules (Collis & Hall, 1992), but do not encode the mechanism for their own movement and are usually found inserted into integrons (In; Stokes & Hall, 1989) or, rarely, at secondary sites (Francia et al., 1993; Recchia et al., 1994).
The minimum components of an integron are an intI gene and an attI recombination site, plus a Pc promoter. The intI gene encodes an IntI integrase that catalyses site-specific recombination between the attI1 site and the attC site of the gene cassette (or between two attC sites) to insert or release cassettes (Collis et al., 1993). Several cassettes may be inserted in tandem into the same integron to create a cassette array, with the expression of cassette-borne genes driven by Pc. Integrons have been divided into classes on the basis of their IntI sequences and chromosomal integrons with arrays of tens to hundreds of cassettes, usually with related attC sites, are found in some species. Chromosomal integrons collectively provide a reservoir of cassettes that can be acquired by mobile or potentially mobile integrons associated with antibiotic resistance (Rowe-Magnus et al., 2001). The latter contain fewer cassettes, often with quite different attC sites, with class 1 integrons being the most common in available MRR. The mechanism by which genes acquire attC sites remains unknown, although generation from mRNA (Recchia & Hall, 1997) and the involvement of group IIC-attC introns (Léon & Roy, 2009) have been suggested.
Unit transposons – the Tn3 family
Unit or complex transposons (Tn) as originally defined are larger than IS and carry antibiotic resistance and/or other genes in addition to genes encoding transposition functions. The Tn3 transposon family (Grindley, 2002) includes two subgroups: Tn3-like and Tn21-like transposons. Both types are bounded by 38-bp IR and include a transposase gene (tnpA), a resolvase gene (tnpR) and a resolution site (res). These transposons move by a replicative process that involves the recognition of the IR by TnpA and the generation of a cointegrate intermediate consisting of the donor and the recipient molecules separated by two copies of the transposon. The cointegrate is resolved by TnpR-mediated site-specific recombination between directly oriented res sites and transposition creates 5-bp DR (Grindley, 2002).
The two subgroups are distinguished by differences in both sequence and organization. In members of the Tn3-like subgroup, res lies between tnpA and tnpR, which face in opposite directions (Grindley, 2002). In transposons of the Tn21-like subgroup (Grinsted et al., 1990) tnpA and tnpR are in the same orientation, with res near the start of tnpR. In Tn3-like and some Tn21-like transposons resistance gene(s) lie beyond tnpR, but the means by which they were captured is not known. Many Tn21-like transposons include a mercury resistance (mer) operon beyond the res site and the resistance gene(s) are carried as part of class 1 integrons inserted in or near the res site. Related structures including only tnpR and tnpA genes (e.g. Tn5403) are still defined as transposons (Grindley, 2002).
Associations between particular resistance genes and particular mobile elements
Many different genes conferring resistance to a particular antibiotic or class/group of antibiotics have been (and continue to be) identified in Gram-negative bacteria. However, examination of the available sequences that include context information as well as the resistance gene suggests that, as might be expected from the considerations outlined in the Introduction, each particular resistance gene is generally immediately associated with the same mobile element (Table 1). In some cases, the source of a captured resistance gene can be identified, when the same (or a very closely related) DNA segment is found both associated with a mobile element on a plasmid and without the mobile element on the chromosome of certain species (Table 2). For example, plasmid-borne ampC genes are derived from the chromosomal ampC genes of various Enterobacteriaceae and ancestors of most of the blaCTX-M groups of β-lactamases have been identified on the chromosomes of different Kluyvera species (Table 2). Capture of the blaCTX-M-2 gene from the Kluyvera ascorbata chromosome by ISEcp1 and transfer to a plasmid has also been demonstrated (Lartigue et al., 2006). In other cases, the source of the gene remains unknown, even if an associated mobile element has been identified, for example blaTEM genes associated with Tn3-like transposons.
Table 1. Examples of observed associations between antibiotic resistance genes and mobile elements
The same mobile element (or type of mobile element) may be associated with genes conferring resistance to different classes of antibiotics and genes conferring similar resistance phenotypes may be associated with quite different mobile elements (Table 1), but not every type of mobile element is associated with every type of resistance gene. For example, no plasmid-mediated ampC genes (Class C in Table 1), which originate from Enterobacteriaceae, have been found in gene cassettes, which may be generated in ‘environmental’ species. Conversely, in Gram-negative bacteria, most of the genes encoding aminoglycoside-modifying enzymes, the exact sources of which are unknown, are found in gene cassettes. This may reflect the limited possibilities for interaction between mobile elements and genes originating from species that are not normally found together.
At least one gene, blaCTX-M-14a, does appear to have been captured in two separate events: either by ISEcp1 or by ISCR1 (Valverde et al., 2009). In other cases, different extents of the source chromosome have apparently been captured by the same mobile element. For example, blaSHV genes are found in two different composite transposons, both flanked by IS26, apparently due to separate mobilizations of chromosomal regions of Klebsiella pneumoniae of different lengths giving rise to two different lineages of genes (Ford & Avison, 2004). Available sequences also include examples of minor gene variants associated with different mobile elements. For example, genes designated catA2 (formerly catII) are apparently found associated with ISCR1 or as part of an IS26-mediated composite transposon. Closer examination indicates that these genes and flanking regions are only 96% identical and could have been captured from slightly different sources. These genes are designated catA2a (ISCR1-associated) and catA2b (IS26-associated) here. Similarly, the catA1 genes and flanking regions associated with IS1 (catA1a here) and with IS26 (catA1b, also known as pp-cat) are about 98% identical.
In other cases where the same resistance gene is apparently closely associated with different mobile elements, detailed examination reveals a more complex picture. In one sequence where blaCTX-M-14a is apparently associated with ISCR1 (EU056266) (Bae et al., 2008), the IRR end of ISEcp1 is present between ISCR1 and blaCTX-M-14a, suggesting initial capture by ISEcp1. Another example is provided by the blaVEB-1 gene, first identified as part of a gene cassette in an array in a class 1 integron (Poirel et al., 1999). blaVEB-1 and minor variants have since been found in contexts other than cassette arrays, associated with ISCR1 (Naas et al., 2006), ISCR2 (Poirel et al., 2009) or a 135-bp repeated element (Re; see Zong et al., 2009 and references therein). Examination of the sequences surrounding blaVEB indicates that it is still part of the same gene cassette, i.e. associated with the expected attC site, although the cassette is missing the first 7 bp in some contexts.
If a resistance gene was first captured some time ago, it may no longer be possible to identify exactly which element was responsible. For example, minor variants of the aphA1 gene have been found in at least two different composite transposons flanked by IS26, Tn4352 (Wrighton & Strike, 1987) and Tn6020 (Post & Hall, 2009), as well as in Tn903, flanked by IS903 (Oka et al., 1981), and in Tn2680, flanked by IS26 but also carrying IS903 (Mollet et al., 1985a). The region common to these structures extends beyond the aphA1 gene itself but, as the chromosomal source is not known, it is more difficult to identify the region that was initially captured and the element responsible.
Recombination is important in the assembly and evolution of resistance regions
Individual mobile elements and specific gene capture processes may be crucial in the entry of genes into the mobile pool, but repeated identification of the same assemblies of a few components, with the same boundaries between them, in different contexts suggests that recombination between common components also plays a large role in the assembly and evolution of MRR. In addition to homologous recombination, which can occur between essentially any closely related sequences, some MRR also show evidence of resolvase-mediated site-specific recombination in the res sites of Tn3-family and Tn5053-family transposons.
Homologous recombination requires breaking and joining of DNA strands and its likelihood is dependent on the length of the homologous region as well as the degree of relatedness. The complexities of recombination mechanisms will not be discussed here (reviewed by Clark & Sandler, 1994); rather, some of the potential consequences of this process will be described, as an aid to analysing MRR.
Recombination can occur between regions in the same DNA molecule or on different DNA molecules in the same cell. Recombination between two inversely oriented copies of the same element on the same DNA molecule will result in inversion of the segment between them (Fig. 2a). If the repeated element is an IS or transposon that creates DR, then examination of adjacent sequences may provide evidence of inversion (Fig. 2a). Recombination between two directly oriented copies of a duplicated element in the same DNA molecule will release a circular molecule, which carries one copy of the element and the region that lay between the elements, and leave a single copy of the element (Fig. 2b). Such circles may potentially reinsert at a different location by recombination with another copy of the same element, or with a sequence that matches any part of the circle.
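The two intramolecular outcomes described above can be illustrated by treating a DNA molecule as a string. This is a deliberately simplified sketch under stated assumptions: the molecule is linear and single-stranded in representation, the repeated element occurs exactly twice, and the function names and toy sequences are hypothetical. Recombination between inversely oriented copies inverts the intervening segment, while recombination between directly oriented copies releases a circle carrying one copy of the element plus the intervening segment.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_between(molecule: str, element: str) -> str:
    """Model recombination between `element` and a downstream inverted copy:
    the segment between the two copies is inverted, the copies stay in place."""
    first = molecule.find(element)
    if first < 0:
        raise ValueError("element not found")
    start = first + len(element)
    second = molecule.find(reverse_complement(element), start)
    if second < 0:
        raise ValueError("no downstream inverted copy found")
    return molecule[:start] + reverse_complement(molecule[start:second]) + molecule[second:]


def delete_between(molecule: str, element: str) -> Tuple[str, str]:
    """Model recombination between two directly oriented copies of `element`.
    Returns (remaining molecule with one copy, released circle written
    linearly as element + intervening segment)."""
    first = molecule.find(element)
    if first < 0:
        raise ValueError("element not found")
    second = molecule.find(element, first + len(element))
    if second < 0:
        raise ValueError("no second direct copy found")
    circle = molecule[first:second]
    remaining = molecule[:first] + molecule[second:]
    return remaining, circle


element = "AAGGTC"  # toy repeated element, not a real IS
inverted = invert_between("TT" + element + "CCATA" + reverse_complement(element) + "GG", element)
remaining, circle = delete_between("TT" + element + "CCATA" + element + "GG", element)
```

Applying `invert_between` twice to the same molecule restores the original arrangement, which mirrors the reversibility of inversion by homologous recombination. The sketch ignores where within the repeated element the crossover occurs, since identical copies make this invisible in the product.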
Recombination between copies of the same element on different DNA molecules (e.g. two plasmids) may result in the fusion of these molecules (Fig. 2c). ‘Double crossover’ between each of two pairs of repeated elements flanking different DNA segments can result in exchange of these segments between the two locations (Fig. 2d). Recombination can also occur between elements that are closely related rather than identical, forming hybrids that may be apparent from an unequal distribution of nucleotide differences in sequence alignments.
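An unequal distribution of nucleotide differences, as mentioned above for hybrid elements, can be made visible by counting mismatches in windows along an alignment. The sketch below assumes that the two sequences have already been aligned and are of equal length (gap characters, if any, are simply compared as symbols). The function name, window size and toy sequences are illustrative assumptions rather than an established method.

```python
from typing import List


def windowed_differences(seq_a: str, seq_b: str, window: int) -> List[int]:
    """Count mismatches between two pre-aligned, equal-length sequences
    in consecutive non-overlapping windows of `window` positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    counts = []
    for start in range(0, len(seq_a), window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        counts.append(sum(x != y for x, y in zip(a, b)))
    return counts


# Toy parents differing at every third position, and a hybrid whose first
# half matches parent_a and whose second half matches parent_b.
parent_a = "A" * 20
parent_b = "".join("C" if i % 3 == 1 else "A" for i in range(20))
hybrid = parent_a[:10] + parent_b[10:]

print(windowed_differences(hybrid, parent_a, 5))  # [0, 0, 2, 2]
print(windowed_differences(hybrid, parent_b, 5))  # [2, 1, 0, 0]
```

The complementary pattern of the two profiles, with differences from one parent confined to one end and differences from the other parent confined to the opposite end, is the kind of signal that suggests a crossover between related elements. With real data the window size would need to be chosen relative to the overall level of divergence.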
Site-specific recombination in res sites of Tn3-like and Tn5053-like transposons
In Tn3 family and Tn5053 family transposons, resolvase-mediated site-specific recombination between res sites is required to resolve the cointegrate intermediates of transposition (Grindley, 2002). The res sites of Tn3-like transposons contain three subsites (resI, resII and resIII) (Rogowsky & Schmitt, 1984), with recombination occurring at a specific AT dinucleotide in resI in res sites that are aligned in the same orientation (Grindley, 2002). The res sites of Tn5053-family transposons include six IRs (r1-r6), with recombination taking place between r1 and r2 (Kholodii et al., 1995).
Recombination between the res sites of related, but different, Tn3 family (e.g. Partridge & Hall, 2004, 2005) or Tn5053 family (e.g. Mindlin et al., 2001; Labbate et al., 2008; Petrovski & Stanisich, 2010) transposons can also contribute to the evolution of MRR and exchange of components. If adjacent parts of a sequenced region are found to match two different transposons, then determining whether the boundary between the two matching regions is close to resI could indicate whether res-mediated recombination has taken place. Simple diagrams showing the sequences of the res sites of various Tn3-like and Tn5053-family transposons can be found in a number of references (e.g. Partridge & Hall, 2004, 2005; Labbate et al., 2008) and related sites in other similar transposons can be identified from these.
Characteristics of common MRR components and modules
The following sections provide details about some of the most common components and combinations of these components in currently available MRR, mainly focusing on the Enterobacteriaceae. It is not possible to describe all known components and some not listed here may be more frequently identified in the future, but similar principles would be expected to apply to identifying these elements and analysing sequences that contain them.
IS26 is very common in MRR, with >10 copies present in some plasmids. This 820-bp element has a single reading frame encoding the transposase, perfect 14-bp IR (Table 3) and generates 8-bp DR. IS26 creates replicon fusions (cointegrates), suggesting that transposition is accompanied by replication and that homologous recombination is necessary to resolve the cointegrates (Chandler & Mahillon, 2002). A number of IS later found to be identical or closely related to IS26 were originally given different names (e.g. IS6, IS15Δ, IS46, IS140, IS160, IS176). IS15 corresponds to one copy of IS26 inserted inside another (Labigne-Roussel & Courvalin, 1983) and a remnant of this structure, consisting of adjacent complete and partial copies of IS26, is found in Tn6020 in an AbaR (Post & Hall, 2009). Examples of different fragments of IS26 adjacent to a complete copy are also found in available sequences.
Table 3. Sequences defining the ends of common mobile elements
If both IR are identical, the sequence is only shown once. For IS where the IR differ, IRL is shown above IRR. For transposons where the IR differ, IRtnp is shown above, with the stop codon of the tnpA gene in lowercase. For Tn21-like transposons the position of IS4321/IS5075 insertion is in bold.
The underlined A in IRL is G in a few examples of IS26.
The lowercase letters indicate the bases that lie outside the IR, but that are still part of the IS (Partridge & Hall, 2003b).
As the terIS ends of ISCR1 and ISCR3 have not really been defined, the boundaries with conserved adjacent sequences (the 3′-CS or groEL) are given.
Tn1721 includes two identical copies of IRtnp, represented by the top sequence.
Tn5393 has 81-bp IR (1 nt difference), but the outer 38 bp are related to IR of other Tn3 family transposons and only these are shown here.
The underlined A residues in IRstr can be Gs and the underlined G can be an A. RSF1010 includes an IR with all of these changes (Fig. 8).
IS26 is the flanking element in a number of different composite transposons that each carry a single antibiotic resistance gene (Table 1; Fig. 3). These are found inserted directly into some plasmid backbones and also as part of larger MRR, but IS26 can cause adjacent deletions (Mollet et al., 1985b), removing the evidence of insertion provided by DR.
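Whether DR evidence of insertion survives can be tested directly once the ends of an element or composite transposon have been located. The following minimal sketch assumes 0-based coordinates with an exclusive end, as in Python slicing, and a known DR length (8 bp for IS26, as stated above). The function name and the toy sequence are hypothetical.

```python
from typing import Optional


def flanking_direct_repeat(sequence: str, start: int, end: int, dr_length: int) -> Optional[str]:
    """Return the direct repeat if the `dr_length` bases immediately before
    `start` match those immediately after `end` (exclusive); otherwise None."""
    if dr_length <= 0:
        raise ValueError("dr_length must be positive")
    if start < dr_length or end + dr_length > len(sequence):
        return None  # not enough flanking sequence to compare
    left = sequence[start - dr_length:start]
    right = sequence[end:end + dr_length]
    return left if left == right else None


# Toy example: a placeholder element flanked by an 8-bp duplication.
dr = "GCTTACGG"
element = "NNNNNNNNNNNN"  # stands in for the inserted structure
toy = "AAAA" + dr + element + dr + "TTTT"
start = 4 + len(dr)
end = start + len(element)
print(flanking_direct_repeat(toy, start, end, 8))  # GCTTACGG
```

A result of `None` does not show that the structure did not transpose. As noted above, adjacent deletions can remove one of the repeats, and recombination with another copy of the same element can separate the original pair.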
IS1 and Tn9
IS1 was one of the first IS to be identified in bacteria and is one of the smallest at 768 bp. IS1 has 23-bp IR (Table 3) and usually creates DR of 9 bp, but 8, 10 and 14-bp DR have also been observed (Chandler & Mahillon, 2002). IS1 includes two overlapping ORFs, known as insA and insB, which are fused by translational frameshifting to produce the functional transposase. IS1 generates both simple insertions and cointegrates and, like IS26, can cause adjacent deletions (Turlan & Chandler, 1995). In plasmids and other sequences relating to antibiotic resistance, IS1 is most often found alone or as part of a derivative of Tn9, which consists of two directly oriented copies of IS1 flanking a region that includes the catA1a gene (Fig. 3d), in MRR.
IS10 and Tn10
IS10 is 1329 bp in length, has 22-bp IR (Table 3), transposes by a ‘cut and paste’ mechanism that creates 9-bp DR and shows some target site specificity (Chandler & Mahillon, 2002). IS10 is commonly found as part of the composite transposon Tn10 (Lawley et al., 2000) that carries tetA(B) encoding a tetracycline efflux protein, tetR(B) encoding a tetracycline repressor protein, the tetC transcriptional regulator gene, tetD and several other genes (Fig. 3b).
ISEcp1 (also called ISEc9), first identified in E. coli in about 1999 (AJ242809), is a member of the IS1380 family. ISEcp1 is 1656 bp in length and is flanked by 14-bp IR (Table 3). As stated above, ISEcp1 uses IRL in combination with IRalt sequences to move adjacent genes, inserting a ‘transposition unit’ at new locations, flanked by 5-bp DR. The availability of matching chromosomal regions including some captured genes (e.g. Fig. 4a) indicates that different-sized transposition units may be moved following the insertion of ISEcp1 adjacent to a gene (Fig. 4b). In other cases, additional segments adjacent to insertion of the first transposition unit may be picked up in subsequent transposition events (Fig. 4c). This process is also evident from transposition experiments using cloned ISEcp1-resistance gene combinations, where adjacent vector sequence may also be captured and moved (Poirel et al., 2005a; Wachino et al., 2006). ISEcp1 therefore has the potential to mobilize regions that include more than one resistance gene.
As IRalt do not fit an easily defined consensus (Lartigue et al., 2006), different methods may be needed to identify the end of the transposition unit distal to IRL. If the ancestral resistance gene has been identified in a chromosomal sequence, it may be possible to identify (or confirm) the end of the transposition unit from the boundary of homology with this sequence (see Fig. 4a). If the transposition unit is inserted within another mobile element or well-characterized region, the IRalt end should become apparent when the remainder of the sequence is annotated (see Fig. 4b). Searching the sequence beyond the resistance gene for a DR of the 5-bp sequence found immediately adjacent to IRL may identify the boundary of a potential transposition unit. The sequences flanking this region can then be joined (minus one copy of the DR) and used to search for identical or closely related sequences that represent an uninterrupted ancestor. If none of these approaches yield results, it may be possible to identify putative IRalt by the boundary of identity to another known sequence (Partridge, 2007; Zong et al., 2010).
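The DR-based search described above, followed by joining of the flanking sequences, can be sketched as follows. This is an illustrative outline under several assumptions: the position of the first base of IRL is already known, the sequence is oriented so that the transposition unit extends to the right of IRL, and the DR is exactly 5 bp. Function and variable names are hypothetical. Because any given 5-bp sequence is expected by chance roughly once per 1024 positions (4^5) in random sequence, each hit is only a candidate boundary that needs to be confirmed, for example by finding an uninterrupted match to the joined flanks.

```python
from typing import List


def candidate_unit_ends(sequence: str, irl_start: int, search_from: int, dr_length: int = 5) -> List[int]:
    """Return start positions, at or after `search_from`, of copies of the
    direct repeat found immediately upstream of IRL."""
    if irl_start < dr_length:
        raise ValueError("not enough sequence upstream of IRL to read the DR")
    dr = sequence[irl_start - dr_length:irl_start]
    hits = []
    pos = sequence.find(dr, search_from)
    while pos != -1:
        hits.append(pos)
        pos = sequence.find(dr, pos + 1)
    return hits


def join_flanks(sequence: str, irl_start: int, dr_copy_start: int, dr_length: int = 5) -> str:
    """Remove the putative transposition unit and one DR copy, giving a
    putative uninterrupted target to use as a database query."""
    return sequence[:irl_start] + sequence[dr_copy_start + dr_length:]


# Toy sequence: left flank + DR + unit (IRL to IRalt) + DR + right flank.
left, dr, right = "GGATCCGATT", "TACGA", "GCGCGCATAT"
unit = "CCTAGAATTTGGGCATGCAAGCTT"  # placeholder for the transposition unit
toy = left + dr + unit + dr + right
irl_start = len(left) + len(dr)

hits = candidate_unit_ends(toy, irl_start, search_from=irl_start)
ancestor = join_flanks(toy, irl_start, hits[0])
```

In this constructed example the single hit lies immediately after the unit and `ancestor` equals the left flank, one DR copy and the right flank. With real sequences `search_from` would normally be set beyond the end of the resistance gene, and each candidate would be evaluated in turn.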
In some cases, an ‘inside-out’ ISEcp1 structure has been identified, for example ISEcp1 associated with blaCTX-M-3b in pK29 (EF382672) and with blaCTX-M-62 (1 nt difference) in pJIE137 (EF219134). The arrangement in pJIE137 suggests the insertion of a transposition unit into another copy of ISEcp1, followed by recombination between duplicated regions (Zong et al., 2010). Similarly, in some IncA/C plasmids and partially sequenced regions, blaCMY-2 is duplicated and associated with complete and partial copies of ISEcp1 (Welch et al., 2007; Verdet et al., 2009; Call et al., 2010). The insertion of one ISEcp1-blaCMY-2 transposition unit into a related unit, followed by different recombination events could explain these different structures (Fig. 4d). A Tn10-mediated composite transposon inserted into an SXT/R391-related ICE (Harada et al., 2010) and a similar structure in a partial plasmid sequence (Verdet et al., 2009) could also be derived from one of these structures (Fig. 4d).
A few related elements that appear to operate by the same mechanism as ISEcp1 have been identified adjacent to antibiotic resistance and other genes. ISEnca1, found upstream of aph(2″)-Ie (AY939911) (Chen et al., 2006), is 91% identical to ISEcp1. IS1247 is found in several sequences (e.g. AJ971344) as part of a transposition unit carrying an aac(3)-II gene [suggested name aac(3)-IIf; see section on aac(3)-II genes and regions] and a putative rifampin ADP-ribosyl transferase (suggested name arr6) inserted into the ere(A)2 cassette and flanked by 4-bp DR (van der Ploeg et al., 1995). A partial ISSm2 (∼88% nucleotide identity to IS1247) is found adjacent to the aac(3)-IIb gene (van der Ploeg et al., 1995).
Gene cassettes and cassette arrays
Over 130 different (<98% identical) gene cassettes carrying antibiotic resistance genes and found in mobile resistance integrons are listed in a recent review, with suggested nomenclature, references to exemplar sequences and instructions for identifying cassette boundaries and attC sites (Partridge et al., 2009). Several modifications to cassettes have been observed, including insertions of group IIC-attC introns and IS1111-attC elements into the attC site, specific deletions in the attC site and the creation of hybrid cassettes (see Partridge et al., 2009 and references therein). Certain cassettes appear to be much more common than others in available sequences and surveys and some cassette arrays (e.g. |dfrA17|aadA5| and |dfrA12|gcuF|aadA2|) also seem to be particularly common (Partridge et al., 2009).
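The <98% identity criterion used above to count cassettes as different can be illustrated with a small sketch. It assumes the two sequences have already been aligned to equal length by some other tool and treats every differing column as a mismatch; the function names and the default threshold argument are hypothetical, and how gaps should be scored is left open.

```python
def percent_identity(a, b):
    """Percentage of identical columns in two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_distinct_cassette(a, b, threshold=98.0):
    """True if the pair falls below the identity threshold."""
    return percent_identity(a, b) < threshold
```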
Class 1 integrons – creation and definitions
The first integrons were discovered due to their association with antibiotic resistance genes (Stokes et al., 2006). Subsequent identification of other ‘mobile resistance integrons’ and many different chromosomal integrons led to classification by IntI sequences and the original integrons became known as class 1. The ancestor of the first type of class 1 integron structures to be identified was created by the acquisition of intI1 and attI1 from a chromosomal integron by a Tn5053-family transposon (Fig. 5a) (Stokes et al., 2006; Gillings et al., 2008). This coupling of the gene-acquisition and expression properties of the integron with the mobility of the transposon was clearly an important ‘winner’ event (Labbate et al., 2008). The best-known example of this type of structure was named Tn5090 when it was first sequenced (X72585; Rådström et al., 1994) but, as these authors had suggested, it was later found to equate to Tn402, previously identified as a transposable region on the same plasmid, R751 (Shapiro & Sporn, 1977). Names such as Tn5090/Tn402 or Tn402(Tn5090) have been used, but are cumbersome and this transposon is referred to as Tn402 in the remainder of this review. Tn402 is bounded by the 25-bp IR of the ancestral transposon, named IRi, at the intI1 end, and IRt, at the tni end (Fig. 3b; Table 3).
While a number of complete Tn402-like transposons carrying intI1 and attI1 have now been identified, the most commonly encountered structures in available sequences are ‘clinical’ or ‘sul1-type’ class 1 integrons. These were derived from a Tn402-like transposon that included qacE as the final cassette by the incorporation of the sul1 (formerly sulI) sulphonamide resistance gene, truncation of qacE and loss of part of tni402 (Stokes et al., 2006; Gillings et al., 2008) (Fig. 5c). The original ‘structural’ definition of a class 1 integron included all of the Tn402-like transposon from IRi to IRt, if the latter was present, while the current ‘functional’ definition of an integron includes only the minimal integron components intI and attI, which are sufficient to specify the class. The first definition is useful when annotating MRR sequences, as it indicates the extent of ‘mobile unit’ that can potentially be transposed and the term ‘class 1 In/Tn’ will be used to refer to these structures in the remainder of this review.
Linking of genes conferring resistance to antiseptics and to an early class of antibiotics (sulphonamides) to different cassette-encoded resistance mechanisms was clearly a ‘winner’ event, despite the fact that it resulted in defective transposon derivatives that are no longer able to transpose themselves. However, as predicted (Kholodii et al., 1995; Brown et al., 1996; Partridge et al., 2002a), the movement of such transposons with intact IRi and IRt catalysed by Tni proteins provided in trans has now been demonstrated, with RecA-mediated recombination resolving cointegrates formed in the absence of the res site (Petrovski & Stanisich, 2010). Coupled with the res site-hunter characteristics of Tn5053-family elements, this has allowed the spread of class 1 In/Tn and the resistance genes they carry as ‘passengers’ on Tn21-like transposons (see section on Tn21-subgroup transposons carrying class 1 In/Tn).
Class 1 In/Tn structures
In ‘clinical’ class 1 In/Tn two conserved segments (CS) with defined sequences flank the cassette array, also known as the variable region. The 5′-CS starts at IRi and includes intI1 and a Pc promoter responsible for the expression of cassette genes (see Jové et al., 2010 for a recent summary of Pc variants). The 5′-CS ends with the sequence AAACAAAG within the core site of attI1 (see Partridge et al., 2009) and the adjacent T corresponds to the start of the first cassette of the array (or position 1 of the 3′-CS if the integron is ‘empty’). The final cassette is followed by the 3′-CS, with position 1 defined as the first T of the sequence TTAGAT, corresponding to the start of the remnant qacE cassette. The truncated qacEΔ1 gene overlaps with sul1 and two ORFs of unknown function (orf5 and orf6) are present in the longest examples of the 3′-CS. The 3′-CS may be followed by certain IS or may directly abut various extents of the truncated tni402 region, which ends with IRt (see Table 4 for examples).
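As a rough illustration of how the two boundary motifs given above might be used, the following sketch reports where the cassette array would start and lists candidate 3′-CS starts. The function name is hypothetical. Because AAACAAAG and TTAGAT are short, matches can occur by chance (including inside cassettes), so in practice the hits would need to be confirmed by comparison with full-length reference 5′-CS and 3′-CS sequences; this sketch only uses the first AAACAAAG found and the forward strand.

```python
FIVE_CS_END = "AAACAAAG"   # last bases of the 5'-CS (from the text)
THREE_CS_START = "TTAGAT"  # first bases of the 3'-CS (from the text)


def variable_region_candidates(seq):
    """Return (array_start, candidate_3cs_starts) as 0-based indices.

    Returns None if the 5'-CS end motif is not found.
    """
    seq = seq.upper()
    end5 = seq.find(FIVE_CS_END)
    if end5 == -1:
        return None
    array_start = end5 + len(FIVE_CS_END)
    candidates = []
    pos = seq.find(THREE_CS_START, array_start)
    while pos != -1:
        candidates.append(pos)  # each hit is only a candidate boundary
        pos = seq.find(THREE_CS_START, pos + 1)
    return array_start, candidates
```

A candidate equal to `array_start` corresponds to the ‘empty’ arrangement, in which the 3′-CS directly follows the 5′-CS.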
Table 4. Different class 1 In/Tn structures
Positions 1-4733 of tni402 correspond to 34376-39108 of R751 in U67194.4.
CA at the junction between the 3′-CS and tni402 belongs to neither segment, but could be a remnant of IS1326.
AT at the junction between the 3′-CS and tni402 could belong to either segment and position numbers include them in both.
GC at the junction between the 3′-CS and tni402 could belong to either segment and position numbers include them in both.
‘Complex’ class 1 In/Tn with ISCR1 and associated resistance genes generally have a second partial copy of the 3′-CS that may be followed by a region matching one of the typical class 1 In/Tn structures.
The first few class 1 In/Tns to be characterized were given integron numbers (Stokes & Hall, 1989) intended to specify the entire structure from IRi to IRt (if the latter was present), including defined extents of the 3′-CS and tni402 and different IS (Table 4). ‘Integron’ numbers are now commonly used to refer only to cassette arrays (a list is available from the Annotation page of INTEGRALL; Table 5), but in most cases, simply stating the cassettes in the array is more helpful. Some early integron numbers, however, remain useful for referring to structures that persist in currently available sequences (Table 4).
In0 and In2 contain the same extents of the 3′-CS and tni402, while In5 has a longer version of 3′-CS and less of tni402 (Fig. 5c), and either no IS, IS1326 alone or both IS1326 and IS1353 may be associated with both of these types of class 1 In/Tn (Table 4). In4 includes a partially duplicated IS6100 element flanked by inverted 123 and 152-bp fragments of the end of the tni402 region that both include IRt (Partridge et al., 2001a) and a similar structure without the IS6100 duplication is found in several class 1 In/Tn (Fig. 5c). Variants lacking part or all of the 3′-CS (Fig. 5c; Table 4) are also found, presumably resulting from IS6100-mediated deletions into the 3′-CS or cassettes, for example the dfrA14 cassette is often followed by a sequence interrupted by IS6100 (Partridge et al., 2001b). So-called ‘complex’ class 1 In/Tn carry ISCR1 and resistance genes that are not part of cassettes, usually between partial duplications of the 3′-CS (see section on ISCR elements and ‘complex’ class 1 In/Tn).
Structures in which related cassette arrays are followed by a region that includes a transposase gene (usually annotated as IS440, although the boundaries of the IS have not been identified), sul3 (formerly sulIII; sulphonamide resistance) and the mef(B) macrolide efflux gene (Liu et al., 2009), rather than the 3′-CS, are increasingly being reported (see Partridge et al., 2009 for examples).
The conserved nature of the 5′-CS and 3′-CS flanking cassette arrays means that entire cassette arrays can be exchanged between class 1 In/Tn by recombination in both of these regions (see Fig. 2d; Partridge et al., 2002a, b), explaining the occurrence of common cassette arrays in class 1 In/Tn with different structures or in different locations. Recombination may also occur between related cassettes common to different arrays to create hybrid cassettes (e.g. Gestal et al., 2005).
The chrA region
A region that includes a gene potentially encoding chromate resistance (chrA) and an ORF (usually annotated as orf98 or padR) has now been identified in several MRR, usually adjacent to the 3′-CS of class 1 In/Tn (Fig. 5f). The boundary between this chrA region and the 3′-CS is defined by a Tn21-like 38-bp IR (designated IRchrA here; Table 4). The entire region is 86% identical to part of a Tn21-like transposon found in pCNB1 (EF079106), an IncP1-β plasmid from Comamonas (Ma et al., 2007), suggesting that a related transposon may be the origin of this fragment and it is interesting that IRt adjacent to IS6100 lies close to the presumed res site (Fig. 5f). The chrA region is usually followed by IRt-IS6100 and a region designated mph(A) here (see section on Macrolide phosphotransferase regions).
ISCR elements and ‘complex’ class 1 In/Tn
Unusual class 1 integrons containing a ‘common region’ between duplications of the 3′-CS were identified in the 1990s (Stokes et al., 1993). The name was later shortened to CR1 when two related regions, designated CR2 and CR3, were identified (Partridge & Hall, 2003a). When it became apparent that these regions are related to IS91-like elements, they were renamed ISCR (Toleman et al., 2006b). Over 20 different ISCR elements have now been identified and related elements can be grouped into families, with differences in the G+C content between families suggesting different origins.
Simple insertions of ISCR elements have not yet been seen and regions containing these elements can be quite complex, including multiple ISCR copies, some of which are truncated, duplications of other regions and multiple resistance genes. The oriIS ends of ISCR1, ISCR2 and ISCR3 (Table 3) and other ISCR have been identified from boundaries with regions carrying different resistance genes found downstream of rcr and from similarity to each other and to oriIS of IS91, IS801 and IS1294 (Partridge & Hall, 2003a; Toleman et al., 2006b). However, as successive aberrant transposition events may allow ISCR to acquire multiple segments adjacent to terIS (Fig. 6d), it can be difficult to identify the original ends of these elements and unravel the events that created the structures containing them, unless the origin and boundaries of each segment can be identified.
ISCR2 has been found associated with a limited number of resistance genes (Table 1; Partridge & Hall, 2003a; Toleman et al., 2006b) and several different sequences have been found upstream of the rcr gene of ISCR2, allowing a terIS to be proposed (Table 3). Regions including ISCR2 have mainly been identified on IncA/C plasmids and their relatives, the SXT ICE (Wozniak et al., 2009). These regions seem to be related and often consist of complete or partial copies of ISCR2 separating regions containing different resistance genes. The generation of these structures may rely on direct insertion or homologous recombination between copies of ISCR2. Fragments of IS91 with an oriIS end can move if replicase is provided by an intact copy of the element (Schlör et al., 2000; Garcillán-Barcia et al., 2002) and similar events could potentially explain some structures carrying ISCR2. Defined fragments of ISCR2 are also found on small IncQ plasmids or in a common multicomponent structure (see section on Tn5393 and modules containing fragments of this transposon).
ISCR1 has only ever been found adjacent to position 1313 of the 3′-CS of class 1 In/Tn and this boundary has been used to define the end of the element. This end does not include typical terIS features and it has been suggested that ISCR1 may be a truncated version of an element that inserted adjacent to the 3′-CS and then was fused to this region by a deletion (Toleman et al., 2006b). However, the 2154-bp region from the boundary with the 3′-CS to the end of oriIS is about 300–650 bp longer than related elements (IS91, 1830 bp; IS801, 1512 bp; IS1294, 1688 bp; ISCR2 from terIS in Table 3, 1847 bp). Thus, it is possible that the element defined as ISCR1 had already acquired region(s) before being inserted into the 3′-CS, with selection pressures and sampling bias potentially explaining why the same structure has always been seen. In either case, the association of ISCR1 with the 3′-CS created a ‘winner’ combination that is able to capture different resistance genes (Table 1) and insert them into any class 1 In/Tn with a 3′-CS and that can potentially mobilize different parts of the class 1 In/Tn that lies upstream (Toleman et al., 2006b).
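The length comparison above can be checked with simple arithmetic. The lengths are those quoted in the text; the variable names are illustrative only.

```python
ISCR1_REGION_BP = 2154  # boundary with the 3'-CS to the end of oriIS

# Lengths of related elements as quoted in the text (bp)
related_lengths = {"IS91": 1830, "IS801": 1512, "IS1294": 1688, "ISCR2": 1847}

# How much longer the ISCR1 region is than each related element
excess = {name: ISCR1_REGION_BP - bp for name, bp in related_lengths.items()}

shortest_excess = min(excess.values())
longest_excess = max(excess.values())
```

The differences range from 307 bp (ISCR2) to 642 bp (IS801), consistent with the stated range of about 300–650 bp.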
In class 1 In/Tn with ISCR1, the acquired resistance genes lie adjacent to the oriIS end of ISCR1. This is explained by a model that requires a combination of rolling circle transposition and homologous recombination (Toleman et al., 2006b). Replication from oriIS through ISCR1 and into the 3′-CS would create a circular molecule that contains part of the 3′-CS (represented by Fig. 6b). This could insert adjacent to a resistance gene and another round of rolling circle replication with a different sequence acting as terIS would create a circular molecule containing ISCR1, part of the 3′-CS and an adjacent captured region (represented by Fig. 6d). Additional rounds of the same process could add further regions containing resistance genes. The rescue of circular intermediates by homologous recombination with the 3′-CS of a class 1 In/Tn would result in the observed structures, in which partial duplications of the 3′-CS flank the ISCR1-resistance gene(s) region (Fig. 6e). Several examples of class 1 In/Tn include ISCR1 associated with more than one resistance gene (e.g. Fig. 6f). Different extents of the class 1 In/Tn adjacent to ISCR1 could also be captured, allowing recombination in the 5′-CS or cassette array, or recombination with an ISCR1 element already in a class 1 In/Tn could yield a variety of structures (see Toleman et al., 2006b for examples). Once an ISCR1-resistance gene combination has been placed in a class 1 In/Tn, it could also move to other class 1 In/Tn as a circular molecule generated by homologous recombination between the flanking duplicated 3′-CS (see Fig. 2b; Partridge & Hall, 2003a).
A number of elements 75–95% identical to ISCR3 have been identified (Toleman & Walsh, 2008). ISCR3 itself has been found associated with several different resistance genes, while generally only one or two examples of other ISCR3-like elements are currently available in GenBank, each associated with a particular resistance gene (Table 1). A region related to the groEL gene of Xanthomonas spp. is found adjacent to the presumed terIS end of ISCR3 and some related elements, suggesting that an ancestor inserted adjacent to groEL and picked up part of this gene (Toleman & Walsh, 2008). Examination of alignments of ISCR elements with the same boundary with groEL as ISCR3 itself reveals an uneven distribution of nucleotide differences, suggesting that some are hybrid or mosaic structures. Complete and partial duplications of these elements are associated with several resistance genes (Fig. 6i) and rescue of circular transposition intermediates (represented in Fig. 6g) and other events (Fig. 6h) involving recombination between related rather than identical elements could generate these structures. Additional recombination between different, but related groES-groEL regions (Toleman & Walsh, 2008) may explain differences in groEL sequences further from the ISCR element (Fig. 6i).
Several class 1 In/Tn in Pseudomonas aeruginosa are inserted into a transposon related to Tn5051, which has only been partially sequenced, but appears to be a hybrid of the mer region of Tn501 (Brown et al., 1985) and a different tnp region (Mindlin et al., 2001) created by recombination in the res site. Tn501 has orf2, equivalent to urf2M, between the res site and merD and in six examples in GenBank a class 1 In/Tn carrying a blaIMP-13, blaVIM-1, blaVIM-2 or blaGES-1 gene cassette is inserted into the same position in this region (Fig. 7a). Single examples of class 1 In/Tn inserted at two other positions, both in the res site, have also been identified (Fig. 7a).
The insertion of the class 1 In/Tn outside the res site of Tn21 seems to have created a ‘winner’ combination that could explain the number of variants of this transposon found in available sequences (Liebert et al., 1999). Tn21 may also have an advantage in that it includes an ORF (tnpM), consisting of part of urf2M, proposed to encode a protein that enhances transposition (Hyde & Tu, 1985), as shown recently for an equivalent protein (TniM) from a Tn5036-family transposon (Petrovski et al., 2011). Tn1696 and other transposons that lack an equivalent of the urf2M region would not be expected to produce such a protein.
The closely related IS4321 and IS5075 and their minor variants, which belong to the same family as IS1111-attC elements, target the 38-bp IR of Tn21-subfamily transposons, but have not yet been found in the Tn3-subfamily. These IS do not create DR and are unusual in that their ends do not correspond to their IR and require careful annotation (Table 3; Partridge & Hall, 2003b). IS4321-like elements have always been found inserted into the same position (Table 3), with the IRL end towards the middle of the transposon, an arrangement that presumably prevents further TnpA-mediated transposition.
Tn1721 (Allmeier et al., 1992) is a Tn21-like transposon with an unusual structure that includes three 38-bp IR and a partial duplication of the tnpA gene (Fig. 7b). One end consists of tnpA, tnpR, a res site and an ORF encoding a protein related to methyl-accepting chemotaxis proteins (mcp in Fig. 7b and originally called orfI) flanked by 38-bp IR (Table 3). This structure is also known as Tn1722 and can move independently (Grinsted et al., 1990). The other end of Tn1721 includes the tetA(A) tetracycline resistance gene and the regulatory gene tetR(A). Tn1721 may have evolved by internal deletion of a composite transposon consisting of two copies of Tn1722 flanking the tet(A) region (Allmeier et al., 1992; Grindley, 2002).
Complete copies of Tn1721 are rare in sequences currently available in GenBank, but different fragments of this transposon are often found, including adjacent to the rep region of several IncPα plasmids. There is one example of a class 1 In/Tn inserted into the res site of Tn1721 (Fig. 7b) and several res-type hybrids of tnp1721/tnp21 with a class 1 In/Tn inserted at the usual Tn21 location, for example in pAPEC-O2-R (AY214164). A 4.8-kb ISEcp1 transposition unit containing different blaCTX-M-9-like variants and IS903 (e.g. AF458080; Poirel et al., 2003) or a truncated version (e.g. HM440049; Zong et al., 2011) is found inserted into the same position in Tn1721/Tn1722 in several plasmids (Fig. 7b).
Tn1, Tn2, Tn3 and relatives carrying blaTEM genes
Nearly 200 TEM β-lactamases, including extended-spectrum (ESBL), inhibitor resistant (IRT) and combination (CMT) types, have been identified (http://www.lahey.org/Studies/). The first examples, TEM-1 and TEM-2, differ by a Gln to Lys substitution at amino acid 39 (Ambler numbering; Ambler et al., 1991) due to a single base change. These and all other TEM β-lactamases are encoded by minor variants of the same gene, with a single amino acid difference sufficient for assignment of a new number. blaTEM-1 variants with different combinations of silent mutations at characteristic positions have also been identified (‘frameworks’, distinguished by letters e.g. blaTEM-1a, blaTEM-1b; see Supporting Information, Fig. S1). The positions are numbered using a scheme that includes 208 nt upstream of the start codon (Sutcliffe, 1978), as changes in this region yield promoters of different strengths (P3, Pa/Pb, P4, P5; Lartigue et al., 2002). Leflon-Guibout et al. (2000) proposed that the framework and the promoter variant present should also be specified for all blaTEM genes identified, for example blaTEM-33f (P4).
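Converting between positions in this numbering scheme and positions counted from the start codon is a fixed offset. The sketch below assumes that position 1 of the scheme is the first of the 208 upstream nucleotides, so that the first base of the start codon is position 209; this reading of the scheme, the function names and the convention of numbering upstream bases −1, −2, … (with no position 0) are assumptions to be checked against the reference sequence.

```python
UPSTREAM_NT = 208  # nucleotides upstream of the start codon included in the scheme


def cds_to_scheme(cds_pos):
    """Map a position relative to the start codon (1 = first base of the start
    codon, -1 = base immediately upstream) to the 208-nt-offset scheme."""
    if cds_pos == 0:
        raise ValueError("there is no position 0 in this convention")
    return cds_pos + UPSTREAM_NT if cds_pos > 0 else cds_pos + UPSTREAM_NT + 1


def scheme_to_cds(pos):
    """Inverse of cds_to_scheme; upstream (promoter) positions are negative."""
    offset = pos - UPSTREAM_NT
    return offset if offset >= 1 else offset - 1
```

For example, under these assumptions `scheme_to_cds(208)` gives −1, i.e. the base immediately upstream of the start codon, which lies in the promoter region.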
The closely related transposons identified as the carriers of blaTEM genes were originally collectively designated TnA (Hedges & Jacob, 1974), but were later distinguished as Tn1, Tn2, Tn3, Tn801, etc. depending on the blaTEM variant present and the plasmid they were derived from. Detailed analysis of the sequences of Tn1 (blaTEM-2), Tn2 (blaTEM-1b) and Tn3 (blaTEM-1a) indicated that most of the differences between them were confined to short regions flanking the res site, suggesting that they were generated by a combination of site-specific and homologous recombination between ancestral transposons (Partridge & Hall, 2005). Analysis of complete or almost complete sequences of transposons carrying blaTEM genes now available in GenBank (Bailey et al., 2011) suggests that they still fall into the three groups represented by Tn1, Tn2 and Tn3.
Tn1 and Tn801 were the names given to transposons carrying blaTEM-2 from a group of closely related IncP1α plasmids (RP1, RP4, R68 and RK2). Tn1 from the compiled sequence representing this group of plasmids (L27758, BN000925; Pansegrau et al., 1994) has 7 nt differences outside the blaTEM gene/promoter region from Tn1 derived from various other plasmids and from Tn801 transposed from a derivative of RP1 (Burland et al., 1998; Brinkley et al., 2006). These transposons all carry genes based on the 1f framework, although the gene from Tn1R7K was named blaTEM-1c (Revilla et al., 2008), but some encode TEM-2 or other variants (Bailey et al., 2011). An association between Tn1-like transposons carrying various blaTEM-2 derivatives and tnp1696 has been noted (Novais et al., 2010).
Tn2, the transposon from RSF1030 defined as carrying blaTEM-1b, was originally only partially sequenced (X54607; Chen & Clowes, 1987), but part of the complete sequence of a transposon initially called Tn2* (AY123253.3; Partridge & Hall, 2005) is identical to this region. Many examples of a complete or a partial transposon carrying blaTEM-1b or derivatives in GenBank are identical or very closely related to this sequence, suggesting that it can now simply be referred to as Tn2 (Bailey et al., 2011). The complete sequences of transposons carrying blaTEM-1c (HM749967, EU935740) have only six differences from Tn2 outside the blaTEM gene/promoter region and the name Tn2a has been suggested (Bailey et al., 2011). Complete and partial copies of Tn2 and minor variants appear to be much more common than Tn1 and Tn3 in available sequences, but are often incorrectly annotated as Tn3 (Bailey et al., 2011). In several sequences, Tn2 is found inserted into the same position in mer21 or with a common boundary with one or the other end of this region (Partridge & Hall, 2004; Novais et al., 2010) and fragments of Tn2 interrupted by IS26 at different positions, but with the blaTEM gene intact, are also common (Bailey et al., 2011).
The first available sequence of Tn3, carrying blaTEM-1a, transposed from R1 (V00613) (Heffron et al., 1979) includes a 9-bp duplication not present in the sequence recently obtained directly from R1 (HM749966) (Bailey et al., 2011). Tn1331 is a derivative of Tn3 that carries a cassette array between short duplications (Fig. 7c; Tolmasky & Crosa, 1993).
Tn3-like transposons also exhibit transposition immunity, i.e. the presence of one copy of the element in a potential target molecule significantly reduces the insertion of a second copy (Grindley, 2002). Thus, if an MRR includes two copies or fragments of one of these mobile elements, it is unlikely that both were inserted by direct transposition.
Tn4401 carrying blaKPC genes
The blaKPC gene, encoding a class A β-lactamase capable of conferring resistance to carbapenems, has recently become a problem in several parts of the world. This gene is found flanked by ISKpn6 and ISKpn7 within variants of Tn4401 (Fig. 7d), a Tn3 family transposon with an unusual organization. The Tn4401 structure may have been generated in a manner similar to capture of genes by ISEcp1 (Naas et al., 2008). In this scheme, a transposon containing the tnpA and tnpR genes of Tn4401 first inserted upstream of blaKPC. ISKpn6 then inserted upstream of blaKPC and ISKpn7 downstream, disrupting one IR of the original transposon, and a sequence located further downstream of blaKPC was then used as the second IR in subsequent transposition events. Tn4401 has been found to be associated with Tn1331 in several plasmids, including pLRM24 (Rice et al., 2008), p12 and p15 (FJ223605-6; Gootz et al., 2009) and pKpQIL (GU595196; Leavitt et al., 2010).
Tn5393 and modules containing fragments of this transposon
Tn5393 is a Tn3-like transposon that carries the aminoglycoside phosphotransferase genes strA and strB and is flanked by 81-bp IR, the outermost 38 bp of which are related to those of other Tn3-family transposons (Table 3). IS1133 is inserted in the first example of Tn5393 identified (Chiou & Jones, 1993), but the ‘original’ transposon without IS1133, called Tn5393c, has since been identified (Fig. 8a; L'Abée-Lund & Sørum, 2000). Complete copies of Tn5393-like transposons and some with insertions and/or deletions are found in genomes, plasmid backbones and other transposons.
Part of Tn5393 (including strA, strB and IRstr) is found adjacent to sul2 and a fragment of ISCR2 in the small (8.7 kb) IncQ plasmid RSF1010 (Fig. 8b; Scholz et al., 1989). This structure may have been created by the transposition of Tn5393 into the ISCR2-sul2 structure, followed by a deletion event (Yau et al., 2010). Related IncQ plasmids carry similar regions, in some cases with additional resistance genes inserted (Meyer, 2009). A module apparently derived from RSF1010 (Fig. 8c) is found in several large MRR and has been designated Tn6029 (Cain et al., 2010). It is flanked by directly oriented copies of IS26, with a third, internal, copy of IS26 in the opposite orientation. The 8-bp sequence adjacent to IRL of the internal IS26 is the reverse complement of the 8 bp adjacent to IRR of one flanking IS26 (Fig. 8c), indicating that the segment between these elements has been inverted. Reversing this process and removing the internal IS26 would regenerate the RSF1010-like structure.
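The comparison of flanking sequences used above to infer an inversion amounts to a reverse-complement test. A minimal sketch follows; the function names and the 8-bp test sequences are invented for illustration and are not taken from any of the plasmids discussed.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flanks_suggest_inversion(flank_next_to_irl, flank_next_to_irr):
    """True if one flank is the exact reverse complement of the other."""
    return reverse_complement(flank_next_to_irl) == flank_next_to_irr.upper()
```

With the invented flanks `"ACGGTCAA"` and `"TTGACCGT"` the check returns `True`, the pattern expected when the segment between two oppositely oriented copies of an IS has been inverted.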
aac(3)-II genes and regions
aac(3)-II genes were among the most common encoding aminoglycoside-modifying enzymes identified in early studies and three types, aac(3)-IIa (X13543), aac(3)-IIb (M97172) and aac(3)-IIc (X54723), were distinguished (Shaw et al., 1993). A recent study of human and animal E. coli isolates suggests that genes of this family are still common, but identified two different types: aac(3)-IId (97% identical to aac(3)-IIa; EU022314) and aac(3)-IIe (96% identical to aac(3)-IIa; EU022315; Ho et al., 2010). Little context information is available for any of these genes, but several plasmid and other sequences in GenBank include genes closely related to either aac(3)-IId or to aac(3)-IIe. Genes within each group have similar flanking regions, but are associated with different combinations of mobile elements (Fig. 9a and b). It would be useful to distinguish between these groups in annotations and the names aac(3)-IId and aac(3)-IIe seem appropriate. Another gene, designated aac(3)-IIf here, is found in several sequences in GenBank, associated with the ISEcp1-like element IS1247 (e.g. AJ971344; Fig. 9c). It is interesting that the most closely related gene, aac(3)-IIb (79% identical), is associated with the related (88% identical) element ISSm2 (van der Ploeg et al., 1995).
Macrolide phosphotransferase regions
Several regions including mph genes encoding different macrolide 2′-phosphotransferases (∼30–40% amino acid identity) have been found in MRR in Gram-negative bacteria. The mph(A) region (Fig. 10a) includes mph(A) and genes encoding a protein required for high-level erythromycin resistance (Noguchi et al., 1995) and a transcriptional repressor (Noguchi et al., 2000b; Szczepanowski et al., 2004, 2005), often annotated as mrx and mphR(A), respectively. The mph(A) region is bounded at one end by IS26 and is often found adjacent to IRt-IS6100 marking the end of the chrA region (Fig. 5f). Another region flanked by two different IS encodes proteins about 35–40% identical to those encoded by the mph(A) region, but the genes are organized differently (Fig. 10b; Szczepanowski et al., 2007). The mph gene in this region has been assigned the name mph(F) by the MLS resistance gene website (Table 5), as its original name, mph(E), was assigned to another gene.
The gene now designated mph(E), previously called mph or mph2, plus msr(E), previously called mel or mef(E) and encoding an ATP-binding cassette transporter, are found in several MRR (Schlüter et al., 2007; Kadlec et al., 2011), usually flanked by IS26 and a second IS (Fig. 10c). A region that includes mph(B) and a gene encoding a putative penicillin-binding protein is inserted into tni402 in several plasmids (Fig. 10d), apparently flanked by 12-bp DR (Noguchi et al., 2000a). Examination of the sequence suggests that the ends of the mph(B) region, including these DR, correspond to fragments of an ISCR element (about 68% identical to ISCR2).
Annotating, analysing and comparing MRR
The actions of individual mobile elements in concert with homologous recombination can lead to very complex conglomerations of complete and fragmented versions of the components described above with insertions, deletions and rearrangements. Detailed analysis of these regions and consistent annotation allows more meaningful comparisons of different structures and a better understanding of how they may have arisen and evolved. As MRR components are highly conserved, it is also important to carefully check segments of new sequences against existing examples of the same components to reduce errors and give confidence in minor, but real, changes that may help to identify epidemiological relationships. The following sections give suggestions about how to analyse and annotate the sequences of MRR, illustrated by pIP1206 (AM886293; Périchon et al., 2008), a plasmid with a complex MRR with multiple inversions (Fig. 11).
Annotating antibiotic resistance genes
Newly obtained sequence data are usually analysed using automated annotation programs, which will find potential ORFs, but often only identify the general function of the gene family (e.g. ‘β-lactamase’; Fig. 11a). As resistance genes found in newly sequenced regions are often identical or closely related to well-characterized genes, identifying exactly which resistance genes are present, as shown in Fig. 11b, is important.
Unfortunately, the available nomenclature for antibiotic resistance genes can be extremely confusing. Different nomenclature systems exist for some gene families, for example those encoding aminoglycoside-modifying enzymes. In this case, two systems both distinguish genes encoding N-acetyltransferases (aac), O-adenylyltransferases (aad or ant) and O-phosphotransferases (aph), but indicate the site of modification in different ways [e.g. aac(6′) or aacA; aac(3) or aacC]. One system uses additional Roman numerals to distinguish the different phenotypes conferred (Shaw et al., 1993), while the other does not and is often used for genes found in gene cassettes (Partridge et al., 2009). In other cases, older names using Roman numerals have been replaced by names with Arabic numerals (e.g. sulI, sulII, sulIII vs. sul1, sul2, sul3 and dfrXVII vs. dfrA17), but both types of names continue to be used. It is also not uncommon to find different names/numbers used to indicate the same gene in different locations and, conversely, the same name/number being used for different genes. General databases of antibiotic resistance genes (ARDB, ARGO; Table 5) also use varying nomenclature.
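When comparing annotations from different sources, older names can be mapped to current ones before any comparison is made. The sketch below contains only the name pairs mentioned above; the dictionary and function names are hypothetical, and a usable table would have to be compiled from the curated nomenclature resources for each gene family.

```python
# Older Roman-numeral names and their Arabic-numeral equivalents,
# limited to the examples given in the text.
LEGACY_TO_CURRENT = {
    "sulI": "sul1",
    "sulII": "sul2",
    "sulIII": "sul3",
    "dfrXVII": "dfrA17",
}


def normalize_gene_name(name):
    """Return the current name for a known legacy name, else the input unchanged."""
    return LEGACY_TO_CURRENT.get(name.strip(), name.strip())
```

Names not in the table are returned unchanged rather than guessed, because the same name/number is sometimes used for different genes.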
Several groups keep track of and assign names/numbers to different families of antibiotic resistance genes (Table 5) and consulting their websites and submitting sequences before preparing GenBank entries and/or publication enables the correct name to be used or an appropriate name to be assigned to a novel gene. Recently defined nomenclature systems generally favour the use of letters for distinguishing subgroups of genes that belong to the same overall family (e.g. qnrA, qnrB) and numbers to distinguish more closely related genes within each subgroup (e.g. qnrA1, qnrA2, qnrB1, qnrB2) (Jacoby et al., 2008). Different tetracycline resistance operons include a resistance gene, tetA, a regulatory gene, tetR, and other associated genes and it is important to indicate which type is present by the use of a letter or a number following the name, for example tetA(A) (Levy et al., 1999).
In some cases, it is the encoded proteins, rather than the genes, that are numbered, the most obvious example being the various β-lactamase families. Subscript letters are sometimes used to distinguish slightly different genes encoding a particular β-lactamase, for example the blaTEM genes discussed above and different blaCTX-M variants.
Annotating mobile elements and their fragments
Automated sequence analysis programs generally annotate full-length transposase genes, but commonly just as a ‘transposase’, ‘integrase/recombinase’ or even just as hypothetical proteins (Fig. 11a; Kichenaradja et al., 2010) and predicted ORFs may cross boundaries between mobile elements. The IR and/or ends of mobile elements and partial copies are often not annotated in sequences in GenBank, or a mobile element is given the same coordinates as the transposase gene it contains. Determining exactly which mobile elements are present and their boundaries, as shown in Fig. 11b, is important to extract the most useful information from MRR sequences.
The boundaries of complete versions and some fragments of common mobile elements can rapidly be identified using sequence analysis software to search for their IR sequences (Table 3). A number of websites can also assist in annotating mobile elements (Table 5). ISfinder provides a blast search function that can be used to identify known IS or their relatives in a sequence, with links to pages that provide information about each IS, such as IR lengths and sequences, DR lengths and other characteristics. Newly identified IS can also be submitted for the assignment of names/numbers, while a separate website is dedicated to ISCR elements. IScan can detect specified IS in genomes and ISbrowser has been designed to visualize IS locations on expertly annotated genomes, including plasmids.
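A search for IR sequences of the kind described above can be sketched in a few lines. This is an illustration under stated assumptions, not a substitute for the tools in Table 5: it finds exact matches only, whereas real IR copies may carry mismatches, and the IR used in any test should be taken from Table 3 or ISfinder rather than from this example. The function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_all(seq, motif):
    """0-based start positions of every exact, possibly overlapping, match."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def find_ir_hits(seq, ir):
    """Locate an IR on both strands.

    Returns sorted (position, strand) pairs, where '+' means the IR as given
    and '-' means its reverse complement, as expected at the opposite end of
    an element bounded by inverted repeats.
    """
    seq, ir = seq.upper(), ir.upper()
    hits = [(pos, "+") for pos in find_all(seq, ir)]
    rc = reverse_complement(ir)
    if rc != ir:  # avoid reporting a palindromic IR twice
        hits += [(pos, "-") for pos in find_all(seq, rc)]
    return sorted(hits)
```

A '+' hit followed by a '-' hit at a distance matching the known length of the element suggests a complete copy, while an unpaired hit may mark the end of a fragment.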
Several websites provide information about gene cassettes and integrons. XXR can be used to identify attC sites, ACID provides annotations of integrons and tools to detect cassettes and integrons in novel sequence data, while INTEGRALL is a collection of data on integrons, including lists of known cassette arrays. Instructions for identifying cassette boundaries and attC sites can be found in a recent review, which also lists all gene cassettes found in mobile resistance integrons, with suggested nomenclature and references to exemplar sequences (Partridge et al., 2009). The Repository of Antibiotic-resistance Cassettes (RAC) provides updated lists and sequences can be submitted for the annotation of gene cassettes (G. Tsafnat & S.R. Partridge, unpublished data). The transposon number registry lists recently identified transposons and assigns numbers to newly identified examples, while ACLAME is a website dedicated to mobile genetic elements.
Finding evidence of insertions
Identifying sequences of the expected DR length adjacent to each copy of IS and transposons that create such repeats (Fig. 11c) can be very helpful in analysing MRR. Matching sequences of the expected DR length flanking a mobile element or composite transposon provide direct evidence of insertion, as in Fig. 11c, where one copy of IS1 has matching 8-bp flanking sequences. Potential DR that are not immediately adjacent to the ends of the mobile element, that overlap with the IR, that differ by a base or two or that are not the expected length are unlikely to be real evidence of an insertion. However, mobile elements may occasionally create DR of an atypical length; for example, Tn5393 in pEFER (CU928144) and the ISEcp1-blaCTX-M-17 transposition unit in pIP843 (Cao et al., 2002) both appear to be flanked by 6-bp DR, rather than the expected 5 bp. In the first case, an uninterrupted version of the flanking sequence is available to confirm this. Identifying DR flanking terminal fragments of a mobile element bounding an MRR can also provide evidence of potential ‘founder’ elements.
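The check for flanking DR described above can be expressed as a short function. This is a minimal sketch that assumes the element boundaries have already been determined (0-based, end-exclusive coordinates) and, in line with the criteria above, accepts only exact matches immediately adjacent to both ends. The function name is hypothetical.

```python
def flanking_dr(seq, start, end, dr_len):
    """Return the DR if matching repeats flank an element, else None.

    The element occupies seq[start:end]. The dr_len bases immediately to the
    left of the element are compared with the dr_len bases immediately to the
    right; only an exact match is reported.
    """
    if start - dr_len < 0 or end + dr_len > len(seq):
        return None  # not enough flanking sequence to compare
    left = seq[start - dr_len:start].upper()
    right = seq[end:end + dr_len].upper()
    return left if left == right else None
```

Calling the function with the expected DR length first, and with one base more or less only when that fails, mirrors the caution above about atypical DR lengths.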
Tools for visualizing sequence comparisons, such as Mauve and Artemis/ACT (Table 5), may be useful in identifying rearrangements, particularly those involving large segments, but more detailed analysis may be required to understand how these may have taken place. The identification of a sequence of the DR length adjacent to one end of a repeated element and the reverse complement of this sequence adjacent to the other end of another copy, as shown in Fig. 2a, suggests inversion by homologous recombination between these elements. In Fig. 11c, identifying such sequences suggests two inversions by recombination in IS26, which could have occurred in either order. Figures 11d and 11e simulate the effects of reversing these events by inverting the regions between the relevant copies of IS26. This recreates a structure in which all three copies of IS26 are flanked by DR, indicative of insertions. Removing these IS plus one copy of the DR, as shown in Fig. 11f, yields uninterrupted versions of other recognizable MRR components or the plasmid backbone, generating the presumed ancestral structure.
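The reasoning used for Fig. 11c–f can be mimicked on a sequence string. The sketch below is a simplified model with hypothetical function names: it assumes two inverted copies of the same element at known coordinates (0-based, end-exclusive, the first copy upstream of the second), requires exact matches and requires the signature at both pairs of ends, which may be too strict for real data.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def inversion_signature(seq, copy_a, copy_b, dr_len):
    """True if flanking sequences suggest inversion between two copies.

    The sequence outside copy_a should be the reverse complement of the
    sequence on the inner side of copy_b, and vice versa.
    """
    (a_start, a_end), (b_start, b_end) = copy_a, copy_b
    outer_a = seq[a_start - dr_len:a_start].upper()
    inner_a = seq[a_end:a_end + dr_len].upper()
    inner_b = seq[b_start - dr_len:b_start].upper()
    outer_b = seq[b_end:b_end + dr_len].upper()
    return (outer_a == reverse_complement(inner_b)
            and outer_b == reverse_complement(inner_a))


def undo_inversion(seq, copy_a, copy_b):
    """Simulate reversing an inversion by inverting the region between copies."""
    a_end, b_start = copy_a[1], copy_b[0]
    return seq[:a_end] + reverse_complement(seq[a_end:b_start]) + seq[b_start:]


def remove_insertion(seq, start, end, dr_len):
    """Remove an element at seq[start:end] plus one copy of its DR."""
    return seq[:start] + seq[end + dr_len:]
```

Applying `undo_inversion` and then checking each copy for matching DR, before calling `remove_insertion`, follows the order of steps shown in Fig. 11d–f.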
Identifying common boundaries between mobile elements
Papers describing MRR often indicate the extents of regions that match part of one other known MRR sequence, presumably the first listed in a blastn search result. This is usually not very helpful if the region in question is a common MRR component or a common module composed of several components. Searches with short sequences that overlap the boundaries between mobile elements or other distinct regions can be used to determine how common these combinations are in available sequences and suggest relationships with other MRR.
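Short boundary-spanning query sequences of the kind suggested above are easy to extract once the boundary position is known. The sketch below uses hypothetical function names and, as a stand-in for a blastn search against GenBank, simply counts exact matches on either strand in a local collection of sequences. The flank length of 20 bases is an arbitrary assumption.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def junction_probe(seq, boundary, flank=20):
    """Return the sequence spanning a boundary between two components.

    `boundary` is the 0-based index of the first base of the right-hand
    component; up to `flank` bases are taken from each side.
    """
    if not 0 < boundary < len(seq):
        raise ValueError("boundary must lie inside the sequence")
    return seq[max(0, boundary - flank):min(len(seq), boundary + flank)].upper()


def sequences_with_junction(probe, sequences):
    """Names of sequences containing the probe exactly, on either strand.

    `sequences` maps a name (e.g. an accession) to a DNA string.
    """
    probe = probe.upper()
    rc = reverse_complement(probe)
    return sorted(name for name, s in sequences.items()
                  if probe in s.upper() or rc in s.upper())
```

The number of sequences returned gives a rough indication of how common a particular combination of adjacent components is in the collection searched.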
Examples of MRR structures, evolution and relationships
Many sequences of complete plasmids and genomic islands carrying MRR from human clinical samples, animals and various environments are now available in GenBank. Detailed analysis and comparison of these sequences reveals examples of relationships between MRR in different contexts and variations that illustrate the principles outlined in this review. Some examples from plasmids and chromosomes of different species of Enterobacteriaceae from different times and locations (see Table 6 for details) are shown in Fig. 12 and discussed in the following sections. Other structures, such as SGI variants (see references in Table 1 in Levings et al., 2008), AbaR variants (see references in Table 1 in Post & Hall, 2009 plus Adams et al., 2010) and the SXT-like ICE (Garriss et al., 2009), also share components with these MRR and display similar variations in structure, but will not be discussed here.
Table 6. Details of the sources of MRR shown in Fig. 12
R100 (also called NR1; IncFII), one of the earliest ‘resistance transfer factors’ identified (Nakaya et al., 1960), carries Tn2670, also known as the resistance determinant (r-det). This nested structure consists of a composite transposon related to Tn9 into which Tn21 is inserted (Fig. 12a). This structure illustrates how resistance genes in MRR can have several levels of mobility, as the aadA1a gene cassette could move independently (either mediated by IntI1 or by homologous recombination in the CS), as part of In2 (if the necessary Tni proteins are provided) or as part of Tn21 (transposition of the entire transposon or one-ended transposition, TnpR-mediated site-specific recombination or homologous recombination of segments). The whole of Tn2670 can also move as a composite transposon (Iida et al., 1981) or as a circle generated by homologous recombination (Silver et al., 1980).
Evolution and spread of MRR
A number of available MRR sequences include the tnp21 and/or mer21 regions adjacent to parts of Tn9 with the same boundaries as in Tn2670, but contain different cassette arrays, insertions of different modules and/or various deletions (Fig. 12b). The three structures shown are from two different species, from a plasmid and the chromosome and from at least two locations (Table 6), and illustrate how large complex MRR may persist, evolve and spread. All are flanked by directly oriented copies of IS1 and DR and could have been inserted as composite transposons or as circular molecules.
Complex structures can indicate relationships between MRR
The R100 backbone carries Tn10 (see Fig. 3b) in addition to Tn2670 and in some MRR a Tn2670-related structure abuts a truncated version of Tn10 (Fig. 12c). One end of this region in several plasmids is defined by a multi-IS structure containing IS1 flanked by DR in an IS10 element that abuts 149 bp of the IRR end of IS26, which is followed by a remnant of an IS4321-like element interrupting a 38-bp IR (Fig. 12c). This complex structure was presumably only generated once and MRR containing the entire structure or truncated versions are likely to be derived from one another.
MRR become more complex, but can also degenerate
The MRR of pRMH760 could have been created by the incorporation of a circular Tn2670-like element into a Tn1696-like transposon by homologous recombination in the 3′-CS and TnpR-mediated site-specific recombination between the res sites of Tn21 and another transposon (Partridge & Hall, 2004). This MRR includes two commonly encountered structures: Tn2 inserted into mer21 (Novais et al., 2010) and Tn4352 (Fig. 3c) inserted into tni402. Like the examples in the previous section, pRMH760 illustrates how MRR can become very large and complex.
The boundaries of the pRMH760 MRR match those between the MRR and backbones of several IncA/C plasmids (Fig. 12d) and the MRR in these plasmids appear to be derivatives of the pRMH760 MRR structure that have undergone deletions of progressively larger segments. These could be explained by the insertion of directly oriented IS26 elements, followed by recombination between them. The IR of all Tn21-like transposon fragments in these structures have insertions of IS4321 or IS5075, but differences between pRMH760 and the others suggest insertion at two different times.
Complex MRR containing ‘old’ components plus important ‘new’ resistance genes
Most of the MRR illustrated in Fig. 12a–d carry mainly ‘old’ resistance genes that apparently emerged some time ago. The MRR illustrated in Fig. 12e include selections of these same common MRR components, but also one or more important resistance genes that have apparently emerged more recently. The pCTX-M3 MRR, which carries the armA 16S rRNA methylase gene, is bounded by the ends of Tn2 flanked by 5-bp DR. This suggests that Tn2 was first inserted into the plasmid backbone and underwent multiple insertions of additional components (Gołębiewski et al., 2007). pCTX-M360, with a very similar backbone and a complete copy of Tn2 inserted in the same place and flanked by the same DR, has since been identified (Zhu et al., 2009).
pEK499 carries the blaCTX-M-15 gene, which has become dominant worldwide, as part of an MRR that includes a cassette array that is flanked by IS26 elements rather than the 5′-CS and 3′-CS of class 1 In/Tn. Although the outer ends of the IS26 are not flanked by DR, it is possible that this cassette array was originally captured from a class 1 In/Tn as a composite transposon. The pEK499 MRR is related to the MRR of pC15-1a, the first blaCTX-M-15 plasmid to be sequenced (Boyd et al., 2004; see Fig. 13b), but some regions are missing and pEK499 includes a class 1 In/Tn and other regions not found on pC15-1a.
The pTN48 MRR includes another important gene, blaCTX-M-14, as part of an ISEcp1 transposition unit flanked by DR. An erm(B) gene is also present with ISCR14 in a region flanked by directly oriented copies of IS26, with matching 8-bp sequences adjacent to their inner ends. This suggests that a circular molecule containing one copy of IS26 flanked by DR of this sequence (Fig. 12e), presumably generated by ISCR14 rolling circle replication, was inserted by recombination with an IS26 element in a pre-existing MRR. The region containing erm(B), including the small fragment beyond IRR of IS26 in the circular molecule, is closely related to part of several plasmids and transposons found in Gram-positive bacteria (Brisson-Noël et al., 1988).
Metagenomic sequencing data relating to multiresistance
Data from several recent studies using metagenomic approaches to examine resistant isolates and/or plasmids seem to confirm the dominance of a limited set of resistance genes, MRR components and combinations of components. In one study, pooled plasmid DNA from sets (about 100 each) of clinical K. pneumoniae isolated at two different times was sequenced using Illumina technology. Sequences were then mapped to three plasmids from a single K. pneumoniae isolate from the same hospital sequenced by conventional methods (Zhao et al., 2010). Detailed analysis of one plasmid (pKF3-140) allows the coverage in a specific section of the MRR to be directly related to known components (Fig. 13a). The regions with the highest coverage in both sets correspond to IS26 and high coverage of the 5′-CS and 3′-CS suggests that ‘clinical’ class 1 In/Tn are also common. Despite being one of the most frequently reported cassette arrays (Fig. 12; Partridge et al., 2009) |dfrA17|aadA5| has a relatively low coverage, but few examples of this array in GenBank are from K. pneumoniae (<10) compared with E. coli (>75) and dfrA17 is apparently rare in K. pneumoniae (Brolund et al., 2010). Coverage of the sul2/strAB region is apparently higher in the earlier set, while coverage of chrA/mph(A) is higher in the later set. The right-hand end of the pKF3-140 MRR sequence shown corresponds to part of the aac(3)-IId region, but has apparently undergone an IS26-mediated rearrangement, and ISCfr1, which is not found in all structures carrying this resistance gene (Fig. 9a), has lower coverage. Similar results were obtained by mapping 454 pyrosequencing data from mixed resistance plasmid populations obtained from uncultured organisms in a wastewater treatment plant to known plasmid sequences (Szczepanowski et al., 2008).
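Relating coverage to known components, as in Fig. 13a, amounts to averaging per-base read depth over each annotated region. The following is a minimal sketch that assumes the per-base depths have already been obtained from a read-mapping tool and that component coordinates (0-based, end-exclusive) come from a detailed annotation. The function name and the labels in any example are hypothetical.

```python
def mean_coverage(depth, features):
    """Mean read depth for each annotated component.

    depth    -- list of per-base read depths along the reference sequence
    features -- dict mapping a component label to its (start, end) coordinates
    """
    result = {}
    for label, (start, end) in features.items():
        if not 0 <= start < end <= len(depth):
            raise ValueError("coordinates out of range for " + label)
        window = depth[start:end]
        result[label] = sum(window) / len(window)
    return result
```

Comparing these means between pooled samples from different times would give a simple numerical counterpart to the visual comparison of the two sets.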
In another study, fragments conferring resistance to various antibiotics cloned from two human gut microbiomes were sequenced (Sommer et al., 2009). Sequences identified as transposases corresponded to fragments of IS26, Tn2, Tn2a, ISEcp1 and Tn1721 and fragments from one individual included the blaTEM-1b, blaCTX-M-15, aac(6′)-Ib-cr and aac(3)-IIe resistance genes. The presence of distinctive boundaries between different genetic components previously found almost exclusively in related MRR carrying blaCTX-M-15 (Fig. 13b) suggests that the sequences are derived from related plasmids. Further mapping of this type of short read data to common MRR components may yield additional information.
The information provided in this review, particularly in Figs 11 and 12, illustrates how transferable multiresistance in Gram-negative bacteria appears to be generated by rare gene capture events mediated by different mobile genetic elements, clustering of resistance genes and associated mobile elements and combinatorial evolution between a limited number of shared components.
In addition to the actions of individual mobile elements, homologous recombination is clearly extremely important in both the movement of resistance genes (e.g. cassette arrays, ISCR1-associated genes) and the creation and evolution of MRR. The accumulation of common components in large regions (see Fig. 12) increases the targets for homologous recombination events. If the mobile element that captures a resistance gene is one of these components, or quickly transfers the gene to a region containing such elements, the emerging gene may be able to spread very rapidly between existing structures. A number of plasmids carrying the globally successful blaCTX-M-15 gene have now been sequenced and in most cases this gene seems to be associated with not only ISEcp1 but also Tn2, a very common MRR component, in large related MRR, although clearly particular plasmids and bacterial strains are also very important in the dissemination of this gene. blaKPC, initially associated with a ‘new’ transposon, seemed to spread only locally at first, but is now expanding its range, which may partly reflect an increasing association with more typical MRR components. If a captured gene is ‘unlucky’ and does not become associated with common MRR components, its spread may be very limited. Like blaCTX-M-15, rmtC is associated with ISEcp1, but its wider context(s) are not known and rmtC is rarely identified compared with the armA and rmtB 16S rRNA methylase genes found in large complex MRR (see Fig. 12). There are many other examples of resistance genes that have only been identified very rarely and that may have emerged only to disappear again.
One mobile element in particular, IS26, seems to be very important in relation to multiresistance in Gram-negative bacteria, being found in many different species on plasmids and the chromosome. As well as being able to form a number of composite transposons (Fig. 3), recombination between inverted copies of IS26 also seems to be responsible for the generation of many different rearrangements in MRR. Some MRR also include segments corresponding to the ‘payloads’ of different composite transposons separated by a single IS26 element, for example the adjacent Tn4352 (Fig. 3c) and Tn6029 (Fig. 8c) regions in pAKU_1 and pRSB107 (Fig. 12c) and the aac(3)-II and erm(B) regions in pTN48 (Fig. 12e). Such structures presumably arise by recombination between IS26 elements flanking different composite transposons or by incorporation of circular molecules by homologous recombination (see Fig. 2b) into an IS26 element that is already part of a composite transposon. While IS26 can insert resistance genes, it can also cause extensive rearrangements and resistance genes may also be deleted, for example the aac(3)-IIe region is found in the pC15-1a MRR (Fig. 13b), but not the related pEK499 MRR (Fig. 12e). Thus, not all of the actions of IS26 may be beneficial, but the presence of this element in MRR potentially provides great flexibility to create different structures, some of which are likely to be selected for.
The combination of all of the factors discussed above has allowed Gram-negative bacteria to resist the actions of antibiotics and there is no reason to suppose that this will change as new antibiotics are developed. There are clearly huge reservoirs of potential antibiotic resistance genes in various environments and identifying and characterizing them could identify new mechanisms of resistance that may inform the development of new antimicrobial agents (Cantón, 2009). However, relatively few resistance genes appear to dominate in clinically important bacteria and it is not possible to predict which genes may become mobilized and available to the pathogenic bacteria that cause infections (Courvalin, 2005, 2008). Predicting which emerging genes are likely to become the most problematic and which other resistance genes they are likely to become associated with, which is crucial for the management of multiantibiotic resistance, might be possible if we had the right data.
Currently available plasmid and other sequences relating to multiresistance can at best provide only hints towards a full understanding of the processes involved, as we are missing many informative structures. Sequenced examples of plasmids and other regions have generally been selected somewhat randomly, so the available set is likely to be biased, and sequences are often poorly or incompletely analysed, so that important information is lost. Large-scale sequencing of plasmids and/or genomes is becoming more and more feasible and cost-effective, but examples from different geographic locations and times need to be carefully and systematically selected to provide the most useful information (O'Brien, 2002). Resolving nomenclature issues and developing improved bioinformatic methods tailored to the characteristics of the complex sequences involved in multiresistance in Gram-negative bacteria are also needed to make the best use of these data.
I would like to thank Jon Iredell, for interesting discussions, for encouraging me to think about the bigger picture and for comments on this manuscript. Working with Ruth Hall for several years gave me an excellent introduction to this field and supervising Zhiyong Zong and collaborating with Guy Tsafnat also contributed to the ideas in this review. Thanks are also due to the Editor and two reviewers for suggesting improvements, Marilyn Roberts for help with MLS resistance gene nomenclature, Hatch Stokes for useful discussion during revision and Andrew Ginn for help with checking the manuscript. I was partly funded by grant no. 512396 from the National Health and Medical Research Council of Australia.