Horizontal transfer of genomic islands (GEIs), that is, chromosomal regions encoding functions that can be advantageous for the host, plays a key role in bacterial evolution, but their mechanisms of transfer remained elusive for a long time. Recent data suggest that numerous GEIs belong to noncanonical classes of mobile genetic elements (MGEs) that can transfer by conjugation. Among them, the integrative and conjugative elements encode their own excision, conjugative transfer, and integration, whereas the integrative mobilizable elements are autonomous for excision and integration but require the conjugation machinery of helper elements to transfer. Others can self-transfer but require the recombination machinery of the recipient cell to integrate. All these MGEs evolve by acquisition, deletion, or exchange of modules, that is, groups of genes involved in the same function. Moreover, composite GEIs can result from the insertion of a MGE within another or from the site-specific integration of an incoming MGE into one of the recombination sites flanking a resident GEI (tandem accretion). Tandem accretion enables the cis-conjugative mobilization of highly degenerated and nonautonomous GEIs, the cis-mobilizable elements. All these mechanisms contribute to the plasticity and complex evolution of GEIs and explain the highly diverse tableau revealed by more and more genome comparisons.
Horizontal gene transfer (HGT) between bacterial strains and species is a key mechanism of genome evolution (Ochman et al., 2000; Gogarten & Townsend, 2005; Dagan et al., 2008; Skippington & Ragan, 2011; Treangen & Rocha, 2011). A successful HGT event can be split into three successive steps. The first step is the DNA transfer from one cell to another, involving free DNA (by transformation), encapsidated DNA (by transduction or lysogenization), or cell-to-cell contact (by conjugation). The second step is the inheritance of the acquired DNA by the daughter cells during division. It involves either the replication of the incoming DNA as a plasmid or its integration/transposition into a replicon. The third step is the evolutionary success of the strain. It may result from advantageous functions encoded by the transferred genes that allow better adaptation of the recipient cell to its environment or colonization of novel niches.
HGT events frequently correspond to the acquisition of a mobile genetic element (MGE; Frost et al., 2005), that is, a DNA fragment that can move from cell to cell (intercellular mobility) and/or within a genome (intracellular mobility) and that carries some or all sequences and genes involved in its mobility (Toussaint & Merlin, 2002). The MGEs, in addition to genes involved in their mobility, frequently carry adaptation genes that contribute to the success of their transfer (Frost et al., 2005; Rankin et al., 2011). Besides traditional classes of MGEs, such as conjugative plasmids and integrated prophages, more and more data suggest that various other types of MGEs, which use novel combinations of transfer and maintenance mechanisms, are important mediators of HGT among prokaryotes (Burrus et al., 2002a; Osborn & Böltner, 2002; Toussaint & Merlin, 2002; Brochet et al., 2008; Wozniak & Waldor, 2010; Guerillot et al., 2013). In particular, genome analyses and comparisons have shown that the transfer of another element class, the genomic islands (GEIs), also plays a key role in bacterial evolution and adaptation (Hacker & Kaper, 2000; Hacker & Carniel, 2001; Dobrindt et al., 2004; Juhas et al., 2009).
In essence, a GEI can be described as a chromosomal segment acquired by HGT that carries gene sets enhancing the fitness of its host. Initially, two classes of transferred chromosomal elements were arbitrarily defined on the basis of their size: the GEIs (> 10 kb) and the genomic islets (< 10 kb). All these elements will be referred to as GEIs in this review.
GEIs carry genes that can contribute to the adaptation of their host to its specific ecological niche, such as pathogenesis, symbiosis, novel metabolic pathways, and/or resistances to antibiotics, prophages, or heavy metal cations (Dobrindt et al., 2004).
For all GEIs, various lines of evidence, such as their presence in only a few strains, an unusual G + C content, or unusual codon usage, show that the elements were acquired by HGT. However, GEIs do not belong to traditional classes of MGEs (i.e. insertion sequences, type I and II transposons, plasmids, prophages), and their mechanisms of transfer were rarely identified. Many GEIs carry genes that may be involved in their own intracellular (transposase or recombinase genes) or intercellular (conjugation genes) mobility. Numerous GEIs harbor a tyrosine recombinase gene and are flanked by direct repeats, corresponding to the 3′ end of a gene (e.g. a tRNA gene), strongly suggesting that they have integrated by site-specific recombination in the 3′ end of this gene.
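As an illustration of the compositional argument above, the following minimal Python sketch flags sequence windows whose G + C fraction departs from that of the whole sequence. It is not a method taken from the cited studies: the function names, the window and step sizes, and the deviation threshold are all hypothetical choices made for this toy example, and real analyses usually combine several signals (codon usage, strain distribution, flanking repeats).

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (case-insensitive); 0.0 for empty input."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome: str, window: int, step: int, threshold: float):
    """Return (start, gc) for every window whose G + C fraction differs from
    the whole-sequence value by more than `threshold` (an arbitrary cutoff)."""
    background = gc_fraction(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - background) > threshold:
            hits.append((start, gc))
    return hits


# Toy sequence (hypothetical): an AT-rich backbone with a GC-rich insert.
toy = "AT" * 50 + "GC" * 20 + "AT" * 50
hits = atypical_windows(toy, window=40, step=20, threshold=0.3)
for start, gc in hits:
    print(start, gc)
```

Only the windows overlapping the GC-rich insert are reported here; on real genomes the threshold would have to be calibrated against the background variation of the host.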
GEIs could thus correspond to unconventional classes of MGEs or to nonmobile elements deriving from other MGEs. For example, the staphylococcal pathogenicity islands (SaPIs) are satellite prophages that are mobilized in trans by unrelated autonomous prophages (Christie & Dokland, 2012). These pirate elements, integrated in the bacterial chromosome, encode their own excision/integration and replication, but redirect the assembly of structural proteins encoded by helper prophages to package their own genome and use the lysin of the helper prophages to release their own virions. Furthermore, more and more data suggest that a large part of GEIs would be MGEs that encode their own transfer by conjugation or mobilization by conjugative elements (Guglielmini et al., 2011).
The present review focuses on the increasing evidence for the high prevalence of GEIs/integrated MGEs that can transfer by conjugation (autonomously or not). We will describe their huge diversity and the variety of mechanisms enabling their conjugative mobilization, in particular the one resulting from tandem accretion after site-specific recombination. Accordingly, many and perhaps most GEIs would not correspond to a single element acquired by a unique HGT event. They would rather consist of a mosaic of elements that had integrated in tandem and/or within another element and had undergone subsequent deletions or reorganizations. This review will not detail the functions encoded by these elements as they have already been detailed in various recent review articles (Dobrindt et al., 2004; Juhas et al., 2009; Wozniak & Waldor, 2010).
The prolific bestiary of conjugative and trans-mobilizable GEIs
Integrative and conjugative elements: autonomous in transfer and integration
Definition of integrative and conjugative elements
Until the 1990s, only a few GEIs encoding their self-conjugative transfer and integration into the genome of a recipient cell had been described in different bacterial taxa. The nomenclature of these MGEs, such as Tn916 from Enterococcus faecalis, pSAM2 from Streptomyces ambofaciens, pRS01 from Lactococcus lactis, CTnDOT from Bacteroides thetaiotaomicron, SXT from Vibrio cholerae, or the symbiosis island from Mesorhizobium loti, was highly heterogeneous as elements were named conjugative transposons (CTns), integrative plasmids, GEIs, sex factors, or remained unclassified (for a review see Burrus et al., 2002b). Burrus et al. (2002b) proposed to group all the GEIs encoding their self-conjugative transfer and integration under the name of integrative and conjugative elements (ICEs). Thereafter, even if a large majority of them integrate only in a specific site, ICEs were considered as synonyms of conjugative transposons (Roberts et al., 2008). According to this initial definition, ICEs are elements that encode their own excision by site-specific recombination, their transfer by conjugation, and their integration, regardless of the integration specificity and the conjugation mechanism. Since then, the production of circular forms by TnGBS1 and TnGBS2, self-conjugative and integrative GEIs detected in Streptococcus agalactiae, was shown not to rely on a site-specific recombinase but on a DDE transposase (Brochet et al., 2009; Guerillot et al., 2013). As a consequence, the definition of ICEs should also include the conjugative GEIs that circularize and integrate by means of a DDE transposase.
Like other MGEs, ICEs display a modular structure, that is, genes and sequences involved in the same function are clustered together. A large diversity of combinations of integration/excision, conjugation, and adaptation modules has been described among ICEs (Fig. 1, Table 1).
The activity of transfer and/or integration of underlined elements was not shown. When two names were used or the initial name was replaced, the less common or initial name is given between brackets.
The elements are sorted according to their origin (bacterial division, genus, species). Obligate intracellular bacteria are underlined with dotted lines. α, alphaproteobacteria; β, betaproteobacteria; γ, gammaproteobacteria; act., actinobacteria; arc., archaea; bac., bacteroidetes; chl., chlamydiae; fir., firmicutes.
Integration sites are given only if numerous events of integration were analyzed or if features characteristic of a site-specific integration were found. 3′ (or 5′), integration in the 3′ (or 5′) end of the target gene. ND, not determined.
ds: double-stranded DNA-processing conjugation module; MPF: mating pair formation system; MOB: relaxase; Rep: replication module; ss: single-stranded DNA-processing conjugation module; Tra: FtsK-like conjugative DNA-translocation protein. The MPF, MOB, Tra and Rep subtype is indicated if known (according to Ghinet et al. 2011; Guglielmini et al. 2011, 2013) or if a significant E-value was obtained using the online web tool CONJscan-T4SSscan (http://mobyle.pasteur.fr/cgi-bin/portal.py#forms::CONJscan-T4SSscan) (Eddy 1998, 2008, 2011).
Only proved or putative functions that could be useful for the host are mentioned. Amp, ampicillin; EtB, ethidium bromide; Cam, chloramphenicol; Em, erythromycin; Hg, mercury; Kan, kanamycin; ND, not determined; RM I, type I restriction-modification system; RM II, type II restriction-modification system; Str, streptomycin; Spc, spectinomycin; Sth, streptothricin; Suf, sulfamethoxazole; Sul, sulphonamide; Tet, tetracycline; Trm, trimethoprim; Van, vancomycin.
Numerous closely related (proved or putative) ICEs are not mentioned in this table.
The ICE integration/excision modules, like those of prophages, generally encode one recombinase, usually a tyrosine recombinase, but can also encode a serine recombinase or a DDE transposase, each enzyme being able to recognize different kinds of direct and/or inverted repeats flanking the ICEs (Wozniak & Waldor, 2010; Table 1). Recently, the recombination modules of three putative ICEs found in various streptococci, ICESsuBM4071 (Holden et al., 2009), Tn1806 (Camilli et al., 2011), and ICE10750-RD.2 (Beres & Musser, 2007), were found to be composed of three adjacent genes encoding distantly related serine recombinases.
In most tyrosine recombinase-mediated reactions, a recombination directionality factor (RDF), also known as excisionase, helps to reverse the direction of the recombination to excision (Groth & Calos, 2004). However, some tyrosine recombinases are bidirectional enzymes that efficiently support both integration and excision without any GEI-encoded RDF. For example, the excision of the GEI Ecoc54N from Escherichia coli, probably deriving from an ICE, does not require a RDF (Antonenka et al., 2006). Like most serine recombinases (Groth & Calos, 2004), TndX encoded by Tn5397 from Clostridium difficile, the only ICE-encoded serine recombinase that has been well characterized, is a bidirectional enzyme that catalyzes both integration and excision without RDF involvement (Wang et al., 2006). However, it cannot be excluded that some other serine recombinases of ICEs are associated with RDFs in the excision process, like various serine integrases of prophages (Bibb et al., 2005; Ghosh et al., 2006; Panis et al., 2010; Khaleel et al., 2011).
In both tyrosine and serine recombinase-mediated recombinations, two monomers are bound to each of the two short stretches of identical bases where crossing-over occurs (Groth & Calos, 2004). A short stretch and the adjacent sequence where recombinase and cofactors bind define an attachment (att) site (Fig. 2). The tyrosine or serine recombinase catalyzes the integration and the excision of the element, whereas ICEs may also encode a RDF that controls directionality of the reaction toward excision (Fig. 2). The site-specific recombination between the attachment sites attR and attL flanking an ICE leads to the formation of an attI site carried by the ICE circular form and an attB site located on the host chromosome (Fig. 2). More precisely, a tyrosine recombinase-mediated reaction involves two staggered single-strand exchanges on either side of a 6- to 8-bp region of DNA sequence homology (called the ‘spacer’ or ‘overlap region’; Rajeev et al., 2009). Usually, the sequence homology extends beyond the short strand exchange region, and therefore, the ICEs are flanked by direct repeats whose length can reach 60 bp, as for ICEBs1 (Lee et al., 2007). By contrast, serine recombinases make staggered single-strand exchanges on either side of a 2-bp sequence (Groth & Calos, 2004).
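The bookkeeping of att sites during excision and integration can be sketched with plain strings. The Python example below is only a toy model under strong simplifying assumptions: each att site is reduced to its shared core (the direct repeat), the strand-exchange chemistry and the recombinase-binding arms are ignored, and all sequences and function names are hypothetical. Excision between the two flanking copies of the core leaves one copy on the chromosome (attB) and one on the circular form (attI); integration reverses the operation.

```python
def excise(chromosome: str, core: str):
    """Recombine the first two copies of `core` (standing in for attL and attR).

    Returns (chromosome_after, circle): the chromosome keeps one core copy
    (attB), and the circle, written linearly starting at its own core copy
    (attI), carries the element body.
    """
    first = chromosome.find(core)
    second = chromosome.find(core, first + len(core)) if first != -1 else -1
    if second == -1:
        raise ValueError("element is not flanked by two copies of the core")
    circle = chromosome[first:second]                 # core + element body
    remaining = chromosome[:first] + chromosome[second:]
    return remaining, circle


def integrate(chromosome: str, circle: str, core: str) -> str:
    """Integrate a circle (linearized at its core, attI) into the first
    chromosomal copy of the core (attB), recreating the two flanking repeats."""
    if not circle.startswith(core):
        raise ValueError("circle must be written starting at its core")
    site = chromosome.find(core)
    if site == -1:
        raise ValueError("no attB core found in the chromosome")
    return chromosome[:site] + circle + chromosome[site:]


# Hypothetical toy sequences.
LEFT, CORE, BODY, RIGHT = "AAAACCCC", "TGTGTA", "CATCATCATCAT", "GGGGAAAA"
integrated = LEFT + CORE + BODY + CORE + RIGHT
empty_site, circular_form = excise(integrated, CORE)
restored = integrate(empty_site, circular_form, CORE)
print(empty_site, circular_form, restored == integrated)
```

Because the repeat includes the end of the target gene, the model also shows why integration at the 3′ end of a gene need not disrupt it: the chromosome carries an intact copy of the core both before and after integration.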
Up to now, three families of DDE transposases of ICEs have been found: the IS30 transposases for ICE6013 and its close relative Tn6012 (Han et al., 2009; Smyth & Robinson, 2009), a second family for TnGBS1 and TnGBS2 (Brochet et al., 2009), and a third family distantly related to the ISLre2 family for ICEA from Mycoplasma agalactiae (Dordet Frisoni et al., 2013). These ICEs exhibit short (10–40 bp long) terminal inverted repeat sequences (Brochet et al., 2009; Han et al., 2009; Smyth & Robinson, 2009; Guerillot et al., 2013). Their insertion generates short direct repeats that flank the MGE and correspond to a duplication of the target DNA. It is not known whether the circularization of TnGBS1/TnGBS2, the best known ICEs encoding a DDE transposase, is replicative, leaving an integrated copy of the GEI embedded in the original site as observed for IS911 circularization, or proceeds through a cut-and-paste mechanism (e.g. IS10 transposition mechanism), generating a double-strand circular transposon (Brochet et al., 2009; Guerillot et al., 2013).
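The origin of the flanking direct repeats can be shown with a small string model. The Python sketch below is an illustration only: the function name, the toy sequences, and the 4-bp duplication length are hypothetical (the text does not give the duplication length for these ICEs), and the staggered cuts and gap repair that produce the duplication in vivo are collapsed into a single copy step.

```python
def insert_with_tsd(target: str, element: str, position: int, tsd_length: int) -> str:
    """Insert `element` into `target` so that the `tsd_length` bases starting
    at `position` end up on both sides of the element (target-site duplication)."""
    if tsd_length < 0 or position < 0 or position + tsd_length > len(target):
        raise ValueError("insertion site does not fit in the target")
    duplicated = target[position:position + tsd_length]
    # Left flank keeps the site; the same bases reappear at the start of the right flank.
    return target[:position] + duplicated + element + target[position:]


# Hypothetical toy data: a 4-bp site "CCGG" and an element with inverted ends.
target = "AAACCGGTTT"
element = "CAGGTTTTCCTG"
product = insert_with_tsd(target, element, position=3, tsd_length=4)
print(product)
```

In the product, the element is flanked on each side by a copy of the site "CCGG", which was present only once in the empty target.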
A large array of integration sites
The various types of integration/excision module confer on ICEs integration specificities that range from high to low, with intermediate cases of integration in preferential sites (Table 1).
ICEs encoding tyrosine recombinase integrate in a large array of specific sites including not only the 3′ end of tRNA genes but also the 3′ or 5′ end of genes encoding various housekeeping proteins (Table 1). As one of the direct repeats flanking ICEs includes one of the ends of the target gene, their integration does not disrupt it (Fig. 2). Besides, gene remnants can also be suitable integration sites (Ghinet et al., 2011). ICEs presenting high integration specificity can also integrate into alternative or secondary attachment sites, especially if the preferred site is absent. Nevertheless, the integration of ICEBs1, a site-specific ICE from Bacillus subtilis, into secondary sites, is detrimental to both ICEBs1 and the host cell (Menard & Grossman, 2013).
Some ICEs encoding a tyrosine recombinase, such as CTnDOT from Bacteroides, have a lower but nonrandom integration specificity and can integrate in various sites (Cheng et al., 2000). Tn916, Tn1549, and numerous closely related ICEs from firmicutes have a very low specificity of integration in most bacteria, including C. difficile CD196 and 42373, and can integrate in many sites (Clewell et al., 1995a; Roberts et al., 2003; Hussain et al., 2005; Tsvetkova et al., 2010). Surprisingly, in the C. difficile strain CD37, Tn916 has a unique conserved attB site (Mullany et al., 1991). This variability in integration specificity could be due to host factors (Roberts & Mullany, 2011).
The integration specificity of ICEs encoding a serine recombinase has been determined only for Tn5397 (Wang et al., 2006). In the donor and the transconjugants of C. difficile, Tn5397 integrates into a specific site belonging to an unknown ORF and therefore disrupts it. Nevertheless, in B. subtilis transconjugants, Tn5397 inserts into different sites. Apart from a conserved GA dinucleotide at the recombination site, there is little sequence conservation between these target sites (Wang et al., 2006).
TnGBS1 and TnGBS2, two elements from S. agalactiae encoding related DDE transposases, insert in various intergenic regions located 15 or 16 bp upstream from the −35 box of sigma A promoters (Brochet et al., 2009; Guerillot et al., 2013). More precisely, TnGBS2 is inserted upstream from the promoter of the nrdF gene in 65/119 bacteria tested, whereas TnGBS1 does not have such a strongly preferred target site. These data and analysis of the sequences flanking related putative ICEs in genomes show that the insertion of these ICEs does not disrupt genes and does not significantly affect the transcription of genes controlled by the targeted promoters. By contrast, other ICEs that encode a DDE transposase such as ICE6013 from Staphylococcus aureus (Smyth & Robinson, 2009) and ICEA from M. agalactiae (Dordet Frisoni et al., 2013) do not seem to have any preferential insertion specificity and insert in intergenic or intragenic sites.
Most ICEs were found in only one copy per analyzed genome or per transconjugant obtained from recipients devoid of a resident element. Nevertheless, transconjugants carrying tandemly integrated copies of ICEclc, an ICE encoding chlorobenzene degradation, were isolated after growth of Pseudomonas knackmussii B13 in the presence of this aromatic organic compound as the sole carbon source (Ravatn et al., 1998). For most ICEs that can integrate in several or numerous different sites, multiple identical or closely related copies can also be found in different locations of some genomes; this holds for ICEs encoding a tyrosine integrase such as Tn916 (Clewell & Gawron-Burke, 1986), as well as for ICEs encoding a DDE transposase such as TnGBS1 (Brochet et al., 2009; Guerillot et al., 2013).
Different conjugation modules
Two types of conjugation module are found in both plasmids and ICEs. The most common processes single-stranded DNA and encodes a relaxase (MOB), a mating pair formation system (MPF), and a coupling protein (CP; Fig. 3a; de la Cruz et al., 2010). MOB usually belongs to a complex of proteins known as the relaxosome. Up to now, eight MOB families and six MPF families have been described, and various MOB/MPF combinations are found among ICEs (Table 1, Smillie et al., 2010; Guglielmini et al., 2011, 2013). In this conjugative process, a MOB protein is functionally linked to a MPF by a CP, the latter two together constituting a membrane-spanning multiprotein complex named a type IV secretion system (T4SS). More precisely, a MOB is a transesterase acting as a dimer that initiates DNA transfer by catalyzing a site- and strand-specific nick at the origin of transfer (oriT) of its cognate MGE (Fig. 3a). A relaxase monomer, covalently bound to the 5′ end of the DNA, is then recognized by the CP and transported through the MPF to a recipient cell (Fig. 3b). The rolling-circle replication of the element is concomitant with its transfer in both the donor and the recipient (Fig. 3c). Finally, the relaxase monomers complete the transfer by recircularizing the two MGE copies (Fig. 3d). Altogether, the conjugative transfer of a MGE by single-stranded DNA process can be considered as an intercellular mode of replication (de la Cruz et al., 2010).
The second type of conjugation module processes double-stranded DNA and has only been described in ICEs (known as actinomycete integrative and conjugative elements or AICEs) and conjugative plasmids of multicellular actinomycetes such as Streptomyces (te Poele et al., 2008; Ghinet et al., 2011). The conjugative transfer of double-stranded DNA proceeds in two steps involving different proteins: intermycelial transfer between donor and recipient mycelia and intramycelial spread between cells of the recipient mycelium (Smokvina et al., 1991; Hagège et al., 1993; Possoz et al., 2001). All AICEs encode a DNA translocator (Tra) belonging to the FtsK-SpoIIIE protein family, a rolling-circle replicase (Rep), and a few proteins involved only in the transfer of the element between cells belonging to the same mycelium. The Tra proteins assemble as a hexameric channel between cell membranes, bind to a specific double-stranded DNA sequence on the AICE circular form, the cis-acting locus of transfer (clt), and thereafter translocate the circular AICE to a recipient cell belonging to another mycelium (Vogelmann et al., 2011). As the double-stranded DNA translocation is not a conservative mechanism and cells of actinomycetes have multiple copies of genomes, the intracellular rolling-circle replication of the AICE mediated by Rep proteins is mandatory after excision and prior to transfer to the recipient cell for the AICE dissemination to be successful (Hagège et al., 1993).
Barriers to ICE transfer
Three types of ICE-encoded mechanisms inhibiting redundant conjugative transfer have been described. The first is found in the ICEs belonging to the SXT/R391 family, where the entry exclusion is mediated by the interaction between the Eex inner membrane protein with the TraG protein, a component of the MPF (Marrero & Waldor, 2007a, b). The second, encoded by the pSAM2 ICE, relies on a protein named Pif (pSAM2 immunity factor; Possoz et al., 2003). The mode of action of Pif is not known, but its presence in the recipient bacterium inhibits both the excision and the initiation of transfer from the donor, thus reducing the pSAM2 transfer rate by a factor of 2000. The third involves a repressor encoded by a recipient cell that reduces the integration efficiency of an incoming ICE. The frequency of ICEBs1 transfer is indeed reduced 1000-fold when its repressor, ImmR, is expressed in the recipient cell (Auchtung et al., 2007). By contrast, some other ICEs do not inhibit redundant conjugative transfer. For example, the presence in a recipient cell of an ICE of the Tn916 family, which exhibits low integration specificity, does not impede transfer of a related element (Norgren & Scott, 1991).
The conjugative transfer of ICEs and other MGEs can also be inhibited by recipient-encoded mechanisms that recognize and cleave incoming DNA: restriction modification (RM; for an extensive review see Tock & Dryden, 2005) and clustered regularly interspaced short palindromic repeats (CRISPR) systems (for extensive reviews see Westra et al., 2012; Wiedenheft et al., 2012). For instance, various CRISPR spacers of S. agalactiae target three families of ICEs widespread in streptococci (TnGBS1, TnGBS2, and ICE_515_tRNALys), and the presence of a spacer targeting TnGBS2 in the recipient cell strongly reduces the transfer frequency (Lopez-Sanchez et al., 2012). Nevertheless, like phages or conjugative plasmids, some ICEs encode mechanisms that prevent the action of RM or CRISPR systems. Thus, a gene encoding an ArdA antirestriction protein is found in several ICEs from S. agalactiae (Brochet et al., 2008) and in various ICEs belonging to the Tn916 family (Rice et al., 2007; Serfiotis-Mitsa et al., 2008; Roberts & Mullany, 2009). Another antirestriction protein, ArdC, is encoded by ICEA from M. agalactiae (Marenda et al., 2006) and by the proteobacterial elements ICEKpI (Lin et al., 2008), ICEEcI (Schubert et al., 2004), and ICEMlSymR7A (Sullivan et al., 2002). A gene encoding a small anti-CRISPR protein has also been very recently described in the ICE PAGI-5 from Pseudomonas aeruginosa (Bondy-Denomy et al., 2012).
ICEs or RICEs (replicating ICEs)?
Once single-stranded transferring ICEs had entered a new cell, their maintenance was originally considered to rely exclusively on integration, as they were not thought to be able to replicate intracellularly (Burrus et al., 2002a). However, proteobacterial ICEs, such as pKLC102 and ICEHin1056, initially described as plasmids, can be present as multiple circular copies per cell (Kiewitz et al., 2000; Dimopoulou et al., 2002). Recently, the maintenance of two ICEs, TnGBS1 and TnGBS2 from the firmicute S. agalactiae, was found to be ensured by theta-replication and partition modules they share with different plasmids (Guerillot et al., 2013). In the same way, the proteobacterial ICEs of the pKLC102/ICEclc family (including ICEHin1056 and PAPI-1; Dimopoulou et al., 2002; Klockgether et al., 2004; Mohd-Zain et al., 2004; Qiu et al., 2006; Juhas et al., 2007) and the archaeal ICE pNOB8 from Sulfolobus (She et al., 1998) also encode an active partition system they share with plasmids (Schumacher, 2012). Moreover, several recent copy-number analyses of different ICEs of firmicutes showed a multicopy state when carrier cells were treated with a DNA-damaging agent, mitomycin C, suggesting an inducible replication mechanism (Lee et al., 2010; Carraro et al., 2011; Sitkiewicz et al., 2011). The maintenance of one of these ICEs, ICEBs1 (Lee et al., 2010), and that of ICEMlSymR7A (Ramsay et al., 2006) have been shown to involve a unidirectional rolling-circle replication mode initiated at the oriT by the cognate ICE-encoded MOB. Thus, maintenance by rolling-circle replication could be a common feature among single-stranded transferring ICEs, due to the joint involvement of the relaxase in the processes of replication and conjugative transfer. Altogether, these data show that the border between ICEs and conjugative plasmids is vague.
The host range of ICEs has been poorly studied and can be very different from one element to another. The transfer of the ICE pRS01 has only been obtained within the Lactococcus genus (Gasson et al., 1995), whereas elements of the Tn916 family have been detected or have been transferred into a large array of Gram-positive and some Gram-negative bacteria, especially Firmicutes and Proteobacteria (Clewell et al., 1995b; Roberts & Mullany, 2009). Most of the studied ICEs transfer at least between closely related species and genera (Boccard et al., 1989; Shoemaker et al., 2001; Burrus & Waldor, 2003; Auchtung et al., 2005; Burrus et al., 2006). Different steps of the conjugation process can limit the transfer range of an ICE. Indeed the absence of transfer of ICESt3 to L. lactis was hypothesized to rely on the conjugation machinery (Bellanger et al., 2009), whereas the host restriction of TnGBS1 and TnGBS2 is likely due to the requirement for transient replication before integration (Guerillot et al., 2013).
Cargo genes of ICEs
ICEs range in size from 11 kb (pSAM2 from S. ambofaciens, Pernodet et al., 1984) to 674 kb (PAISt from Streptomyces turgidiscabies, Kers et al., 2005; Huguet-Tapia et al., 2011). This difference in size largely reflects the cargo genes, which range from only one gene encoding tetracycline resistance in Tn916 to a very high number in PAISt. These cargo genes are highly variable even between ICEs carrying closely related conjugation and recombination modules. They encode functions that are not involved in the intra- and intercellular mobility of the ICE but that may confer on the host a significant selective advantage or may even change its lifestyle, such as antibiotic, heavy metal or phage resistance, sucrose catabolism, bacteriocin synthesis, pathogenicity, or symbiosis (Fig. 1, Table 1, see Dobrindt et al., 2004; Juhas et al., 2009; Wozniak & Waldor, 2010 for reviews).
Integrative and mobilizable elements: nonautonomous in transfer, autonomous in integration
Definition of integrative mobilizable elements
Besides self-transferable GEIs (i.e. ICEs), some other GEIs encode their own excision and integration but do not carry all the genes necessary for their conjugative transfer (Table 2). They use the mating pore (MPF) of a co-resident ICE or conjugative plasmid to transfer by conjugation. As they are mobilizable in trans, these elements were named integrative mobilizable elements (IMEs; Burrus et al., 2002b). According to the initial definition, the IMEs encode a recombinase and only some conjugation proteins. However, some chromosomal elements, which encode a DDE transposase and/or do not encode any conjugation protein, were recently found to be mobilized in trans by ICEs (Achard & Leclercq, 2007; Daccord et al., 2010). In this review, to include the latter, the term IME covers all the elements that encode their own circularization and their own integration, regardless of the mechanism and/or specificity of circularization and integration, and that are mobilized in trans by conjugative elements, regardless of the mechanism.
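The working definitions of ICEs and IMEs used in this review can be restated as a small decision rule. The Python sketch below is only a reading aid under stated assumptions: the class, its field names, and the fallback label "unclassified" are hypothetical, and the feature values given for Tn916 and NBU1 simply restate how these elements are described in the text (a self-transferring ICE and an IME mobilized by Bacteroides ICEs, respectively).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    own_excision_integration: bool  # encodes its own circularization and integration
    own_conjugative_transfer: bool  # encodes all it needs to self-transfer by conjugation
    mobilized_in_trans: bool        # transferred using a helper conjugative element


def classify(element: Element) -> str:
    """Apply the review's working definitions; anything else stays unclassified."""
    if not element.own_excision_integration:
        return "unclassified"
    if element.own_conjugative_transfer:
        return "ICE"
    if element.mobilized_in_trans:
        return "IME"
    return "unclassified"


# Feature values below are illustrative restatements of the text.
tn916 = Element("Tn916", True, True, False)
nbu1 = Element("NBU1", True, False, True)
for element in (tn916, nbu1):
    print(element.name, classify(element))
```

The rule deliberately ignores the type of recombinase (tyrosine, serine, or DDE transposase) and the integration specificity, in line with the broadened definitions adopted here.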
Int, integrase type; DDE, DDE transposase; ICE, integrative and conjugative element; IME, integrative mobilizable element; Ser, serine recombinase; Tyr, tyrosine recombinase; Ser duo: two distantly related serine recombinases encoded by two adjacent genes.
This table includes all proved IMEs and some putative ones. The activity of transfer and/or integration of underlined elements has not been demonstrated. When two names were used or the initial name was replaced, the less common or initial name is given between brackets. Some of these elements were described as ICEs but do not encode a conjugation pore.
The elements are sorted according to their origin (bacterial division, genus, species). α, Alphaproteobacteria; β, Betaproteobacteria; γ, Gammaproteobacteria; act., Actinobacteria; bac., Bacteroidetes; fir., Firmicutes.
Integration sites are given only when numerous events of integration were analyzed and/or when features characteristic of a site-specific integration were found.
Only proved or putative functions that could be useful for the host are mentioned. Amp, ampicillin; Cfx, cefoxitin; Cam, chloramphenicol; Erm, erythromycin; Hg, mercury; Kan, kanamycin; Ln, lincomycin; ND, not determined; RM I, type I restriction modification system; Str, streptomycin; Spc, spectinomycin; Sul, sulfonamides; Tet, tetracycline; Trm, trimethoprim; Van, vancomycin.
The element does not excise. If the mobilization module is inserted on a plasmid, the plasmid is mobilizable in trans by the ICEs CTnDOT and CTnERL.
Two different but very close copies of Tn4453 were found in the same strain of Clostridium difficile. The sequenced copy (Tn4453a) shares 89% identity with Tn4451 from Clostridium perfringens and has identical gene content.
Published putative IME(s) that have closely related mobilization modules but have at least some different passenger gene(s) is (are) not mentioned in this table.
T4SSscan gives only weak evidence for a relaxase (E-value 3 × 10⁻⁴). The sequences involved in the mobilization were not identified.
Like ICEs, IMEs display a large diversity of combinations of integration/excision, mobilization, and adaptation modules (Fig. 4, Table 2). The integration/excision modules of IMEs, like those of ICEs and integrated prophages, can encode a tyrosine recombinase, a serine recombinase, or a DDE transposase (Table 2). Recently, the recombination module of the putative IME Tn6104 from C. difficile was found to carry two adjacent genes encoding distantly related serine recombinases (Brouwer et al., 2011). IMEs have, like ICEs, a large range of integration specificity. Most known IMEs encode a tyrosine recombinase and integrate in the 3′ end of genes encoding various tRNAs or conserved proteins (Table 2). However, the tyrosine integrases encoded by three IMEs from Bacteroides, Tn4399, Tn4555, and Tn5520, confer on them a lower specificity of integration (Hecht & Malamy, 1989; Smith & Parker, 1993; Vedantam et al., 1999). The serine recombinase encoded by the IME Tn4451 from Clostridium perfringens also leads to a low specificity of integration of this element (Adams et al., 2002). By contrast, a putative IME encoding a serine recombinase, described as Erm(TR) element or ICESp2907, was found to disrupt, in the same position, related unknown ORFs both in the donor strain and the transconjugants (Giovanetti et al., 2012). Therefore, this element integrates site-specifically in a target gene, leading to its interruption. Another putative IME encoding a serine recombinase, ICE6180-RD.1, seems to be site-specifically integrated in the 3′ end of rumA, which encodes a 23S rRNA methyltransferase (Beres & Musser, 2007). Finally, MTnSag1 from S. agalactiae and the related element tISCpe8 from C. perfringens encode a DDE transposase belonging to the IS1595 family and have a low specificity of integration (Achard & Leclercq, 2007; Lyras et al., 2009).
For the majority of IMEs, only one copy was found per genome or per transconjugant. However, four IMEs, which have a low integration specificity, can be found in multiple (identical or very closely related) copies in different locations of the same genome: Tn4555 and Tn4399 from Bacteroides that encode a tyrosine recombinase (Hecht & Malamy, 1989; Smith & Parker, 1993), Tn4453 from C. difficile that encodes a serine recombinase (Lyras & Rood, 2000), and MTnSag1 from S. agalactiae that encodes a DDE transposase (Achard & Leclercq, 2007).
All the known IMEs, including ATE-1, the only IME ever found in actinomycetes (Billington et al., 2002), transfer as single-stranded DNA. The mechanism of mobilization in trans of the majority of IMEs is similar to the well-known transfer of various mobilizable plasmids such as those of the IncQ family (Meyer, 2009). Like all mobilizable plasmids, IMEs carry their own oriT. Moreover, most IMEs encode their own relaxase (MOB) that recognizes and nicks oriT, and they frequently encode some other relaxosome proteins (Li et al., 1995; Murphy & Malamy, 1995; Crellin & Rood, 1998; Smith & Parker, 1998; Vedantam et al., 2006). They do not encode their own CP or MPF but use those of unrelated ICEs or conjugative plasmids. However, besides a putative MOB and proteins of the relaxosome, some putative IMEs from α- and γ-proteobacteria, that is, the IncP island (Lavigne et al., 2005), BcenGI2 (Graindorge et al., 2012) and their relatives, encode a protein homologous to VirB6, a component of the MPF. The IMEs NBU1 (Li et al., 1993), NBU2 (Wang et al., 2000c), Tn4399 (Hecht & Malamy, 1989; Murphy & Malamy, 1993), and Tn5520 (Vedantam et al., 1999) from Bacteroides encode their own relaxase and are mobilized by very different elements, that is, ICEs from Bacteroides (Stevens et al., 1990) and plasmids from proteobacteria belonging to the IncP family. This suggests that, like mobilizable plasmids (Meyer, 2009), these elements are mobilizable by a large array of unrelated or distantly related conjugative elements. Sequence analysis of the IME SGI1 from Salmonella enterica revealed three genes encoding proteins related to those of the MPF of the conjugative plasmid R27 but failed to find a MOB or any other conjugation protein (Boyd et al., 2001). Unlike the IMEs of Bacteroides, SGI1 is mobilized in trans only by conjugative plasmids belonging to the IncA/C incompatibility group and not by other types of conjugative elements (Douard et al., 2010).
Recently, a family of unusual IMEs, known as mobile genomic islands (MGIs), was found in proteobacteria (Daccord et al., 2010, 2012). These elements encode their own integrase, which catalyzes their excision and integration, but otherwise carry only an oriT that is related to those of the ICE SXTMO10 and its relatives (highly significant 63% identity on 282 bp). Three MGIs from Vibrio, MGIVflInd1, MGIVchUSA1, and MGIVvuTai1, were found to be mobilized in trans by ICEs belonging to the SXT family (Daccord et al., 2010). In their absence, these IMEs do not excise and thus do not transfer. When a coresident ICE is present, the ICE-encoded transcriptional activators SetCD trigger the expression of genes encoding the integrase and the RDF of the IMEs, promoting their site-specific excision (Daccord et al., 2012). Once the IME has excised, its oriT is recognized by the ICE-encoded MOB, and the IME is transferred as a single-stranded DNA molecule through the MPF encoded by the ICE (Daccord et al., 2010). As for the transfer of all other tested IMEs, the mobilization of a MGI to a recipient is independent of any cotransfer of the helper ICE: the transconjugants harbor the MGI alone (the most frequent case), the ICE alone, or the two elements. Even more unusual IMEs from firmicutes, MTnSag1 (Achard & Leclercq, 2007) and its relative tISCpe8 (Lyras et al., 2009), were found to be mobilizable by Tn916. These elements, which can be considered as insertion sequences harboring a passenger gene (Siguier et al., 2009), only encode a lincosamide resistance and a DDE transposase, which is responsible for the formation of a circular extrachromosomal form of the IME and for its insertion (Achard & Leclercq, 2007). The mobilization in trans of MTnSag1 and tISCpe8 involves an oriT that is located within the resistance gene and is probably not related to that of Tn916 (50% identity on only a 60-bp region included in oriT; Achard & Leclercq, 2007).
IMEs are poorly known, and neither conjugation immunity nor maintenance as an extrachromosomal circular form by replication has been studied. However, some of them, such as SGI1 (Boyd et al., 2001), the IncP island (Lavigne et al., 2005), BcenGI2 (Graindorge et al., 2012), the tet(O) fragment (Brenciani et al., 2011), or IME_2603_rpsI (Brochet et al., 2008), encode homologs of proteins involved in the control of replication and/or the partition of plasmids. This suggests that, as for ICEs, replication and partition of excised IMEs could contribute to their maintenance. Almost all IMEs also encode functions that are not involved in their intra- and intercellular mobility but may confer a significant selective advantage on their host. The most frequently identified function is resistance to antibiotics (Table 2), but resistance to arsenate, RM systems, bacteriocin synthesis, or c-di-GMP synthesis has also been described (Table 2). These genes represent a significant part of these elements, ranging from 1.7 kb (Achard & Leclercq, 2007) to about 40–50 kb for SGI1 and its numerous relatives (Boyd et al., 2001). One exception is Tn5520, an IME from Bacteroides fragilis that encodes only two proteins, a tyrosine integrase and a MOB, and does not encode any adaptive function (Vedantam et al., 1999).
Elements autonomous in transfer, nonautonomous in integration
Some elements, deriving from an ICE or carrying an ICE, encode their own conjugative transfer but integrate by homologous recombination. The analysis of the genome of B. thetaiotaomicron 5482 revealed CTn4-bt, a GEI that is related to known ICEs from Bacteroides (Xu et al., 2003) but is not self-transmissible. However, in the presence of the ICE CTnERL from B. fragilis, or of its two-component regulatory system rteA-rteB alone, it transfers only to a recA+ strain already harboring CTn4-bt and thereafter integrates by homologous recombination with the resident element (Moon et al., 2007). Therefore, CTn4-bt encodes all the functions necessary for its transfer, except the regulation, which is provided by CTnERL, and the integration. Another example of a GEI encoding its transfer but integrating by homologous recombination is Tn5385, a 65-kb element from E. faecalis CH19 and CH116 that looks like a class I transposon (Rice & Carias, 1998). It is bounded by two copies of the insertion sequence IS1216 and carries an ICE closely related to Tn916. Two types of Tn5385 transconjugants have been identified. In the first type, Tn5385 has inserted by means of two homologous recombination events between each of the chromosomal regions that flank this MGE in the donor and the corresponding homologous sequences in the recipients. The second type of transconjugant would result from Tn5385 excision by recombination between the terminal IS1216 copies and, after the transfer, insertion by recombination between Tn5381 (within the excised Tn5385) and a previously transferred Tn5381 copy integrated in the recipient chromosome.
Furthermore, other mobile elements, the integrative mobile elements exploiting Xer (IMEXs), hijack XerC and XerD, that is, the tyrosine recombinases that resolve the chromosome dimers arising during chromosome replication (by site-specific recombination between the dif sites), for their own integration and/or excision (Das et al., 2013). Besides poorly known GEIs and some prophages, such as CTXφ, IMEXs include a family of closely related GEIs from Neisseria gonorrhoeae and Neisseria meningitidis, the ‘gonococcal genetic islands’ or GGIs (for a review see Ramsey et al., 2011). These elements are integrated within the dif site and are flanked by direct repeats. Furthermore, XerCD were found to mediate the excision of the 57-kb GGI from N. gonorrhoeae by site-specific recombination (Snyder et al., 2005; Domínguez et al., 2011), and sequence analysis suggests that the GGIs from N. meningitidis have been acquired by site-specific integration (Woodhams et al., 2012). The GGIs carry a complete conjugation module encoding a MOB, a CP, and a MPF. The MPF catalyzes the secretion of single-stranded chromosomal DNA from an oriT carried by the GGIs into the extracellular environment (Hamilton et al., 2005). The secreted chromosomal DNA is effective in natural transformation, but the GGIs were not found to transfer by natural transformation (Hamilton et al., 2005). Therefore, although their conjugative transfer has not been shown, it seems likely that the GGIs use XerCD to excise and integrate but encode their own transfer by conjugation. Like various ICEs and IMEs, the GGIs encode putative ParA and ParB proteins, suggesting that the partition of the excised element could contribute to the maintenance of the GEI in addition to the main mechanism, that is, integration.
Evolution of conjugative and mobilizable GEIs and consequences on mobility
Acquisition, deletion, or exchange of module(s) are key mechanisms of evolution of ICEs, IMEs, and other MGEs (Toussaint & Merlin, 2002). Sequence comparisons indeed reveal numerous exchanges of recombination, conjugation, or adaptation modules between ICEs (see e.g. Burrus et al., 2002a; Carraro et al., 2011; Roberts & Mullany, 2011). Exchanges of integration and conjugation modules between ICEs probably play a significant role in the evolution of their host specificity (Burrus et al., 2002a, b).
Exchanges of modules also occur between ICEs and other types of mobile elements. Numerous examples of closely related adaptation modules from ICEs and plasmids have been reported (Burrus et al., 2002a; Sullivan et al., 2002). Plasmids and ICEs carry conjugation modules belonging to the same types (single-strand or double-strand) or subtypes (different types of MPFs or MOBs). On a phylogenetic tree of VirB4, a highly conserved ATPase found in all MPF types, conjugative plasmids and ICEs were found to be intermingled at large phylogenetic distances, suggesting that both types of elements have exchanged conjugation modules along their evolutionary history (Guglielmini et al., 2011). At closer distances, the VirB4 proteins encoded by an ICE are generally found to be more similar to VirB4 proteins encoded by other ICEs; the reciprocal was also found for conjugative plasmids. Only a few examples of close relationships between the conjugation modules of ICEs and plasmids have been reported, suggesting that such exchanges exist but are rare events. For instance, the conjugation modules of the ICE known as the pathogenicity island from E. faecalis UW3114 (PAI_UW3114) and of its nonfunctional relative found in the strain MMH594 are closely related to those of the enterococcal conjugative plasmids pAM373 and pAD1 (Burrus et al., 2002a; Shankar et al., 2002; Laverde Gomez et al., 2011). Moreover, the conjugation modules of the plasmids pMET1 from Klebsiella pneumoniae and pCRY from Yersinia pestis are closely related to those of ICEKp1 from K. pneumoniae and ICEEc1 from E. coli (Soler Bistué et al., 2008). Furthermore, the proteins encoded by the conjugation module of the putative ICE Tn1806 from Streptococcus pneumoniae are very closely related (88–95% identity) to those of the plasmid pAPRE01 from Anaerococcus prevotii (Camilli et al., 2011).
An exchange event between plasmids and AICEs of actinomycetes has also been reported: the plasmid pJV1 and the ICE pSLS from Streptomyces encode not only a related putative Tra protein involved in double-strand conjugative transfer between different mycelia but also closely related intramycelial transfer genes (Ghinet et al., 2011).
The majority of integrases of ICEs, IMEs, and integrated prophages belong to the tyrosine and serine recombinase families, suggesting that exchanges of recombination modules might occur. To our knowledge, none of the published phylogenetic analyses comparing integrases from prophages and ICEs revealed any convincing exchange (Boyd et al., 2009; Napolitano et al., 2011; Van Houdt et al., 2012). However, these analyses used only very few integrases encoded by ICEs and therefore may not be significant. Furthermore, close relationships between recombination modules of ICEs and prophages have never been reported. As previously indicated, the DDE transposases encoded by ICEs belong to three distinct families that also include transposases encoded by insertion sequences. In particular, TnGBS1 and TnGBS2 encode related DDE transposases but carry unrelated conjugation modules (Brochet et al., 2009) and therefore belong to two distinct families of ICEs. A phylogenetic analysis using DDE transposases from insertion sequences and ICEs related to TnGBS1 and TnGBS2 showed that the last common ancestor of the TnGBS1 family acquired its DDE transposase gene from an element belonging to the TnGBS2 family and that the last ancestor of the TnGBS2 family probably gained its DDE transposase gene from an insertion sequence (Guerillot et al., 2013). A close relationship between mobilization or recombination modules of IMEs and other types of elements has never been reported (probably because of the very low number of identified IMEs).
Degenerated GEIs deriving from ICEs, IMEs, or IMEXs
ICEs, IMEs, and IMEXs carry modules involved in intercellular mobility/transfer, intracellular mobility/maintenance, and adaptation of their host to the environment. All of these modules contribute to the evolutionary success of the MGEs within bacterial populations. However, only the adaptation modules can contribute to the fitness of the MGE hosts. Therefore, after MGE acquisition, mutations that inactivate mobility modules would not be counterselected in the descendants of the cell. As expected, more and more sequence comparisons show that various GEIs derive from ICEs, IMEs, and IMEXs by mutations or deletions of their mobility modules. The size range of these elements is very large, from 75 bp to 680 kb.
In various GEIs, one or a few gene(s) or sequence(s) necessary for intracellular and/or intercellular mobility are inactivated or deleted. For example, orfD of ICE_2603_tRNALys, an element from S. agalactiae closely related to the functional ICE ICE_515_tRNALys, is disrupted by the insertion of an IS1381 copy (Puymège et al., 2013). This conjugation gene encodes a protein homologous to VirB4, that is, a crucial protein of the MPF, explaining the absence of transfer of this element. In addition, the gene encoding the DDE transposase of a variant of ICE6013 from S. aureus is disrupted by the insertion of Tn552, explaining the absence of a circular form for this variant (Smyth & Robinson, 2009). Other GEIs carry mobility modules but lack one of their ends (and therefore one of the recombination sites). For example, the analysis of the genome of Bradyrhizobium japonicum USDA110 suggests that a giant ICE has been split by a large-scale genome rearrangement leading to two degenerated GEIs: a 680-kb symbiosis island harboring one of the att sites and the conjugation module; and a 6-kb GEI harboring the other att site and the integrase gene (Kaneko et al., 2002).
Other GEIs underwent large deletions. The comparison between ICEEc1 from E. coli and the pathogenicity island HPI from Y. pestis KIM showed that the latter has lost its conjugation module. Similarly, the pathogenicity island HPI from E. coli ECOR31 lost its conjugation module and one of its att sites (Schubert et al., 2004). Furthermore, comparison of several GEIs (named CIMEs) with ICESt1 and ICESt3 from Streptococcus thermophilus showed that these elements lost their regulation, conjugation, and recombination modules but retained their attL and attR sites (Pavlovic et al., 2004). Another element from S. thermophilus, ΔCIME308, lost not only its regulation, conjugation, and recombination modules but also its attL site (Pavlovic et al., 2004). Such decay concerns not only ICEs but also IMEs (Brochet et al., 2008) and IMEXs (Woodhams et al., 2012). Consistent with such decay, a careful re-analysis of the sequences of ICESt1 and of the closely related GEI CIME302 revealed tiny remnants of another ICE integrated in these elements (Fig. 5). ICESt1 is integrated site-specifically in the 3′ end of the fda gene encoding a fructose-1,6-diphosphate aldolase (Burrus et al., 2000). It belongs to a family of ICEs that possess closely related conjugation modules but can have very different recombination modules and thus different specific sites of integration (Carraro et al., 2011), including the 3′ end of the rpsI gene that encodes the S9 ribosomal protein. First, the comparison of ICESt1 with the sequenced genome of S. thermophilus LMG18311 revealed, in the 3′ end of rpsI, a composite GEI corresponding to the tandem integration of a truncated ICE (that possesses a remnant of a conjugation module) and of a CIME (possessing a remnant of a regulation module; Fig. 5). Both elements possess very closely related attR sites and a gene or a pseudogene encoding an integrase that is very different from that of ICESt1.
However, the comparison of these attR sites with ICESt1 and CIME302 reveals closely related short internal sequences (111 and 75 bp, respectively), showing that these two elements include tiny remnants of another ICE (Fig. 5).
The absence or the inactivation of a few mobility genes could be complemented by the presence of a functional related element. The presence of several copies of the same or closely related elements was observed for most ICEs and IMEs that integrate with a low specificity (Hecht & Malamy, 1989; Smith & Parker, 1993; Lyras & Rood, 2000; Achard & Leclercq, 2007; Smyth & Robinson, 2009; Dordet Frisoni et al., 2013). Furthermore, the same strain can carry ICEs that possess related conjugation modules but very different integration modules. For instance, Streptococcus suis BM407 harbors two putative ICEs, ICESsuBM4071 and ICESsuBM4072, that carry a conjugation module closely related to that of Tn5252 (Holden et al., 2009). The latter encodes a tyrosine recombinase and integrates in the 3′ end of rplL encoding the ribosomal protein L7/L12, while ICESsuBM4071 carries a triplet of adjacent genes encoding distantly related serine recombinases and disrupts an ORF encoding a protein homologous to luciferin monooxygenase. In addition, complementation between distantly or very distantly related elements might exist. Indeed, an engineered plasmid harboring the transfer origin of the ICE Tn1549 from Enterococcus, its relaxase gene and mobC, a gene encoding a protein that probably belongs to the relaxosome, is mobilizable in trans by the conjugative plasmid RP4 in E. coli (Tsvetkova et al., 2010). Experimental data show that at least some degenerated elements have retained an unusual intercellular or intracellular mobility. Besides the degenerated GEI CTn4-bt, which possesses a complete conjugation module but needs the presence of the related ICE CTnERL to transfer (see section about 'Elements autonomous in transfer, nonautonomous in integration'), B. thetaiotaomicron 5482 harbors three other defective elements (CTn1-bt, CTn2-bt, and CTn3-bt).
All three elements were also found to be mobilized in trans by CTnERL and to integrate by homologous recombination, but they require not only the regulatory genes rteA and rteB from CTnERL like CTn4-bt but also other genes from this ICE (Moon et al., 2007). The ermF region inserted in CTnDOT, an ICE from B. thetaiotaomicron, has typical features of an IME but cannot excise (Whittle et al., 2001). Indeed, the comparison of the ermF region with the very closely related element Tn6031 from Sphingomonas that is able to excise suggests that one of its recombination sites is truncated (Ghosh et al., 2009). Moreover, an engineered plasmid harboring the mobilization module of the ermF region is mobilized in trans by the ICEs CTnDOT and CTnERL showing that this module is functional (Whittle et al., 2001). Furthermore, all the transfer assays using the pathogenicity island SPI-7CT18 from S. enterica failed, and the comparison of its sequence with the functional element ICESb2 suggests that it is a degenerated GEI that is no longer able to self-transfer (Baker et al., 2008; Seth-Smith et al., 2012). However, it mobilizes in trans the IncQ plasmid R300B (Baker et al., 2008).
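The categories discussed so far differ mainly in which mobility functions an element still encodes. The following Python sketch summarizes that logic as a coarse decision rule. The `Element` fields and the `classify` function are hypothetical names introduced here for illustration, not an established annotation scheme, and real elements (for example CTn4-bt, which needs a helper-encoded regulator) do not always fit a single category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical summary of the mobility functions a GEI still encodes."""
    own_recombination: bool  # functional integration/excision module
    own_conjugation: bool    # complete conjugation module (MOB, CP, MPF, oriT)
    own_ori_t: bool          # oriT that a helper conjugative element can use
    intact_att_sites: int    # conserved flanking recombination sites (0 to 2)


def classify(e: Element) -> str:
    """Coarse classification following the definitions used in the text."""
    if e.own_conjugation:
        if e.own_recombination:
            return "ICE"
        # Integration relies on host functions (homologous recombination, XerCD).
        return "autonomous in transfer, nonautonomous in integration"
    if e.own_recombination and e.own_ori_t:
        # Excises and integrates on its own but needs a helper to transfer.
        return "IME"
    if not e.own_recombination and e.intact_att_sites == 2:
        # Lost its mobility modules but kept both attL and attR.
        return "CIME"
    return "other degenerated GEI"


# Illustrative inputs (assumed feature sets, not curated annotations).
print(classify(Element(True, True, True, 2)))
print(classify(Element(True, False, True, 2)))
print(classify(Element(False, False, False, 2)))
print(classify(Element(False, False, False, 1)))
```

The order of the tests matters: transfer autonomy is checked first, so an element is only called an IME or a CIME once a complete conjugation module has been ruled out.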
Matryoshkas: MGEs within MGEs
The acquisition of novel modules can result from the integration of a MGE within another one followed by a subsequent reorganization. This evolution of conjugative or mobilizable GEIs can involve a large array of mechanisms and types of MGEs including not only ICEs or IMEs but also other integrative/transposable elements and plasmids.
‘Nonmobilizable MGEs’ within ICEs and IMEs: low integration specificity
MGEs that do not encode their own conjugative transfer or mobilization (‘nonmobilizable MGEs’) can integrate within ICEs or IMEs. Most ISs and transposons encoding DDE transposases have a low specificity of integration. Nevertheless, such elements are frequently found integrated in ICEs. For example, 64 of the 104 ISs found in the genome of B. japonicum USDA110 are carried by a 680-kb GEI related to ICEMlSymR7A (Kaneko et al., 2002). Various types of transposons or structures that are flanked by related ISs and could be composite transposons have also been identified in numerous ICEs, IMEs, or related GEIs (e.g. see Beaber et al., 2002; Klockgether et al., 2004; Juhas et al., 2007; Levings et al., 2008; Ding et al., 2009; Smyth & Robinson, 2009).
Besides inactivating mobility genes by their insertion, ISs can also promote deletion or inversion of adjacent sequences. For example, ISs are associated with the deletion of the conjugation module of the HPI island from Y. pestis KIM (Schubert et al., 2004) or the attL deletion from ΔCIME308, a GEI from S. thermophilus related to ICESt3 (Pavlovic et al., 2004). Furthermore, a very large inversion (1 226 011 bp) disrupting a GEI very closely related to the ICE Tn5276 from L. lactis MG1363 was probably due to the transposase of ISLL6, an IS carried by the ICE. Another very large inversion disrupting a Tn5276-related GEI from L. lactis NCDO763 was mediated by homologous recombination between a copy of IS905 carried by the GEI and an inverted copy located elsewhere in the chromosome (Wegmann et al., 2007). If the ISs or transposons do not disrupt any gene or structure necessary for the ICE or IME mobility, the transfer of the element should lead to the transfer of the internal ISs or transposons. Therefore, the ICE would mobilize the carried MGEs in cis, and these MGEs could be considered as aggressive hitchhikers using conjugative or mobilizable elements to invade other strains and species. However, some of these hitchhikers, the transposons, harbor adaptation genes that can increase the fitness of the bacterial host, contributing to the evolutionary success of ICEs or IMEs in the bacterial population. For example, the functional ICE SXTMO10 from V. cholerae harbors a putative class II transposon that encodes sulfamethoxazole, trimethoprim, streptomycin, and chloramphenicol resistances (Beaber et al., 2002).
The integration of ‘nonmobilizable’ plasmids in ICEs or IMEs was also reported. It results from: (1) homologous recombination between closely related IS copies carried by the two elements, (2) intermolecular replicative transposition of IS, or (3) unidentified mechanisms. For instance, the cointegration of ‘nonmobilizable’ plasmids encoding lactose catabolism and of the ICE pRS01 notably led to its discovery (Anderson & McKay, 1984). The analysis of one of these events showed that the replicative transposition of the plasmidic ISS1S into the ICE has promoted their cointegration (Polzin & Shimizu-Kadota, 1987). The elements resulting from cointegration between lactose plasmids and pRS01 are conjugative plasmids. Similarly, the insertion of the ‘nonmobilizable’ plasmid pEG920 in various positions of the IME NBU1 from Bacteroides leads to plasmids mobilizable in trans by the conjugative plasmid RP4 (Shoemaker et al., 1993). By contrast, the insertion of the ‘nonmobilizable’ plasmid pPHDP10 from Photobacterium damselae into ICEPdaSpa1, an ICE closely related to SXTMO10, leads to a functional ICE that acquired the genes of the plasmid including a toxin-encoding gene (Osorio et al., 2008).
‘Nonmobilizable MGEs’ within ICEs and IMEs: structure-specific integration of a MGE
Some other ISs or transposons target structures of conjugative or mobilizable elements entering the recipient cell. Unlike most ISs and transposons that encode a DDE transposase, the ISs belonging to the IS91/IS608/ISCR family encode a transposase related to the Rep/Relaxase superfamily of proteins (Guynet et al., 2008). These elements transpose by rolling-circle replication and exclusively integrate in single-stranded DNA regions. Putative composite transposons integrated in various ICEs belonging to the SXT family from proteobacteria carry an IS belonging to this family, ISCR2, and genes encoding resistance to various antibiotics (Beaber et al., 2002; Toleman & Walsh, 2011). In the same way, a region of the SGI1-related IME SGI2, which could correspond to a transposon, carries a complex integron including an IS belonging to this family, ISCR3, and genes encoding resistance to various antibiotics (Levings et al., 2008; Toleman & Walsh, 2011). The replication of each IS belonging to this family is initiated at its oriIS end and then proceeds through the entire element. Replication terminates at the other end, the terIS sequence (for a review see Toleman & Walsh, 2011). However, in 1–10% of replications, the terminus is not recognized and replication proceeds into the adjacent DNA until a surrogate terIS is recognized, leading to the transposition of the IS and the adjacent sequence (Tavakoli et al., 2000). The analysis of the sequences flanking ISCR2 in SXT-related ICEs and ISCR3 in SGI1-related IMEs strongly suggests that these ISs are involved in the accumulation of genes encoding resistance to various antibiotics in these elements (see Toleman & Walsh, 2011).
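To give a sense of scale for the 1–10% read-through frequency quoted above, the short Python sketch below computes the chance that at least one of n replication events runs past terIS. It assumes that events are independent and share the same read-through probability, which is a simplification introduced here and not a claim from the cited studies; the function name is hypothetical.

```python
def prob_at_least_one_readthrough(p: float, n: int) -> float:
    """Probability that at least one of n replication events misses terIS.

    p is the per-event read-through probability (0.01 to 0.10 in the text).
    Independence between events is an assumption of this sketch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Complement of "every one of the n events terminates correctly".
    return 1.0 - (1.0 - p) ** n


# Tabulate the two bounds quoted in the text for a few event counts.
for p in (0.01, 0.10):
    for n in (1, 10, 100):
        print(f"p={p:.2f}  n={n:3d}  P(>=1 read-through)={prob_at_least_one_readthrough(p, n):.3f}")
```

Under these assumptions, even the lower bound of 1% gives a probability of roughly 0.63 after 100 replication events, which fits the idea that occasional read-through is enough to accumulate adjacent resistance genes over time.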
Furthermore, a widely distributed family of unit transposons, the Tn7 family, also targets incoming conjugative elements. Tn7 possesses two transposition systems that share three common subunits, TnsA, TnsB (a DDE transposase), and TnsC, but differ in the target selection protein, that is, either TnsD or TnsE (for a review see Parks & Peters, 2009). The first pathway directs transposition into a specific site of the chromosome (attTn7), which corresponds to a single position within the transcriptional terminator of glmS, the gene that encodes the glucosamine-fructose-6-phosphate aminotransferase. The second one directs transposition into single-stranded replicating MGEs such as incoming conjugative or mobilizable elements. Hence, these elements take advantage of both the stability of the chromosome and the mobility of all elements able to transfer by conjugation for persistence and dissemination among bacteria that carry the conserved attTn7 site (Parks & Peters, 2007). Some ICEs such as ICEEc2 from E. coli carry a Tn7-related transposon. The Tn7 found in ICEEc2 carries an integron encoding resistance to trimethoprim, streptothricin, and streptomycin (Roche et al., 2010).
‘Nonmobilizable MGEs’ within ICEs and IMEs: site-specific integration of a MGE
Other ‘nonmobilizable’ MGEs, the integron cassettes and group II introns, integrate in specific sites that can be carried by ICEs or IMEs. An integron is constituted by an intI gene encoding an integrase, a cassette promoter, a recombination site attI, and an array of gene cassettes (see Cambray et al., 2010 for a review). Cassettes generally contain a promoterless gene that can be expressed from the cassette promoter. These nonautonomous mobile elements are flanked by two recombination sites attC but do not encode any protein involved in their mobility. The integrase catalyzes the recombination between the attC sites flanking a cassette, promoting the excision of a covalently closed single-stranded intermediate and its integration preferentially in the attI site. Integrons have been identified in all the numerous IMEs belonging to the SGI1 family (Doublet et al., 2005; Levings et al., 2008) and in some ICEs such as members of the SXT family from V. cholerae (Hochhut et al., 2001b; Iwanaga et al., 2004). LexA, the repressor of the SOS response, represses the integrase gene of integrons (Guerin et al., 2009; Cambray et al., 2011). Upon conjugative transfer, incoming single-stranded DNA, such as that of the ICE SXT, induces the SOS response and the intI gene in the transconjugant, triggering rearrangements of cassette arrays (Baharoglu et al., 2010). Although this has not been tested, this induction should also lead to cassette exchanges between a resident integron and an integron carried by an incoming MGE such as SXT.
Bacterial group II introns are ribozymes that splice autocatalytically from their pre-mRNAs (for a review see Toro et al., 2007). Most of these elements are also mobile retroelements that can insert either specifically in the same site of cognate intronless alleles (retrohoming) or aspecifically in other sites (retrotransposition). Mobile group II introns encode a multifunctional protein that is directly involved in their mobility. The group II intron Ll.LtrB, which is integrated in the relaxase gene from the ICE pRS01 of L. lactis, was the first bacterial group II intron shown to splice and move in vivo (Mills et al., 1997 and references therein). Conjugative transfer of pRS01 and/or engineered plasmids carrying the intron induces invasion of the recipient by Ll.LtrB by retrohoming and/or retrotransposition (Belhocine et al., 2004, 2005). This intron can also integrate in the cognate position of the distantly related relaxase genes of the ICE Tn5252 and of the conjugative plasmid pCF10 (Staddon et al., 2004). Furthermore, the introns inserted in the relaxase gene of the mobilizable plasmid pAH82 from L. lactis and of pRS01 share 99.8% nucleotide identity, despite the low similarity between the encoded relaxases (29.0% identity; O'Sullivan et al., 2001). Taken together, these data indicate that exchanges of introns between distant conjugation modules of ICEs and plasmids occur. Group II introns have also been found in various positions of other ICEs and of an IME, generally in the conjugation modules, such as genes encoding MOB, CP, or MPF proteins (virB4 ATPase, cell wall hydrolase) (Mullany et al., 1996; Paulsen et al., 2003; Pavlovic et al., 2004; Bacic et al., 2005; Beres & Musser, 2007; Brenciani et al., 2011).
Prophages integrated within ICEs
Some ICEs or related GEIs carry putative prophages. The comparison of GEIs closely related to ICESb1 revealed a prophage site-specifically integrated in the samA gene of the GEI SPI-7CT18 from S. enterica (Pickard et al., 2003). In the same way, the comparison of the putative ICE Tn6164 from C. difficile with the closely related element Tn1806 from S. pneumoniae showed that Tn6164 carries a complete prophage (Corver et al., 2012). However, the interactions between these prophages and ICEs have not been studied.
ICEs or IMEs within ‘nonmobilizable’ plasmids
ICEs or IMEs showing low specificity of integration can insert in plasmids. The first identified and best known ICE, Tn916, has a low specificity of integration and can transpose into ‘nonmobilizable’ plasmids. It has never been found to mobilize these plasmids in cis (Clewell et al., 1995a), likely because its conjugation module is only expressed when it is in its circular form (Celli & Trieu-Cuot, 1998). Several ICEs from Bacteroides, such as CTnXBU4422, can also transpose into ‘nonmobilizable’ plasmids. However, unlike plasmids carrying Tn916, the plasmids carrying this ICE can transfer by conjugation, showing that the ICE mobilizes the plasmid in cis (Shoemaker & Salyers, 1990; Salyers et al., 1995). Furthermore, the cointegrates resulting from the insertion of some IMEs of Bacteroides (Tn4399, cLV25 and Tn5520) in ‘nonmobilizable’ plasmids are mobilized in trans by ICEs from Bacteroides and/or IncP plasmids (Murphy & Malamy, 1993; Vedantam et al., 1999; Bass & Hecht, 2002). Therefore, these IMEs mobilize the plasmids in cis. In the same way, the IME Tn4451 from Clostridium can integrate in ‘nonmobilizable’ plasmids and thereafter mobilize them in cis (Crellin & Rood, 1998; Lyras et al., 1998).
ICEs within ICEs or conjugative plasmids
Some ICEs are inserted into conjugative plasmids or into ICEs. ICEs that encode a tyrosine recombinase with a low specificity of integration, such as the ICEs Tn916 and Tn1549, can be found in various conjugative plasmids of firmicutes (Clewell & Gawron-Burke, 1986; Garnier et al., 2000). The conjugative transfer of such a plasmid then leads to the mobilization in cis of the ICE it carries. Furthermore, ICEs belonging to the Tn916 family have been frequently found integrated into streptococcal ICEs having conjugation modules closely related to that of Tn5252 (Ayoubi et al., 1991; Ding et al., 2009; Mingoia et al., 2011). The first reported composite element of this type, Tn5253, harbors Tn5251, an ICE differing from Tn916 by only 79 bp (Ayoubi et al., 1991; Santoro et al., 2010). Tn5253 was found to transfer, leading to the mobilization in cis of Tn5251 (Ayoubi et al., 1991), but Tn5251 can also transfer alone from a strain harboring Tn5253 (Santoro et al., 2010). Whereas almost all known composite ICEs have been found in firmicutes and involve a Tn916-related ICE, an ICE from Bacteroides closely related to CTn3Bf, CTn12256, carries another unrelated ICE (Wang et al., 2011). This passenger ICE is closely related to CTnDOT, an ICE that encodes a tyrosine recombinase catalyzing a low-specificity but nonrandom integration. From a donor strain carrying this structure, the CTnDOT-related element can transfer either alone or as a passenger of CTn12256. Such a structure would extend the host range of CTnDOT, as the free-standing form of CTnDOT could not transfer itself to Prevotella ruminicola B14 or mobilize plasmids to that species, whereas CTn12256 could do both (Wang et al., 2011). A composite structure was also found in ICE6013 from S. aureus, an ICE that encodes a DDE transposase conferring a low specificity of integration. The composite element corresponds to a copy of ICE6013 inserted in another copy of the same element (Smyth & Robinson, 2009).
IMEs within ICEs or conjugative plasmids
Some IMEs are also integrated into conjugative plasmids or ICEs. The IME Tn4451 from C. perfringens that encodes a low-specific serine integrase was initially identified as a transposon carried by a conjugative plasmid (Abraham & Rood, 1987; Crellin & Rood, 1998). This IME can be transferred as a passenger of the conjugative plasmid (mobilization in cis) or alone (mobilization in trans). The ermF region, inserted into the ICE CTnDOT from B. thetaiotaomicron, corresponds probably to a degenerated IME (Ghosh et al., 2009). Furthermore, sequence re-analysis of four ICEs from firmicutes revealed that they carry typical IMEs encoding one or two serine recombinases and a MOB but neither CP nor MPF proteins. ICE2096-RD.2 from Streptococcus pyogenes (Beres & Musser, 2007), ICESp2905 from S. pyogenes (Brenciani et al., 2011), and Tn6103 from C. difficile (Brouwer et al., 2011) carry 1, 2, and 3 IMEs, respectively, while ICESluvan, a Tn5252-related element from Streptococcus lutetiensis, carries an ICE closely related to ICE1549 that itself carries a putative IME (the 9-kb element; Bjorkeng et al., 2013). Furthermore, three types of transconjugants can be recovered from ICESp2905 transfer assays (Giovanetti et al., 2012). The first one corresponds to the transfer of the whole composite ICE, that is, the mobilization in cis of the IMEs described as Erm(TR) element and Tet(O) element. The second one corresponds to the transfer of a composite element devoid of Erm(TR) (ICESp2906), suggesting that Erm(TR) element has excised from ICESp2905 and was subsequently lost. The third one corresponds to the transfer of the Erm(TR) element alone, that is a putative mobilization in trans. These last transconjugants have been observed only when a strain harboring ICESp1108, an ICE closely related to ICESp2905, was used as recipient strain. 
In donor and transconjugants, the Erm(TR) element is integrated in the same positions of the closely related ORFs orf8 from ICESp2905 and orf10 from ICESp1108, both belonging to the conjugation modules of the ICEs. Therefore, this IME integrates in a specific site of a conserved ORF of these ICEs. Although this integration disrupts this ORF, the transfer of the whole ICESp2905 and not only of ICESp2906 devoid of the IME has been observed. Therefore, the Erm(TR) element encodes two alternative mechanisms of transfer, one hijacking the complete recombination and conjugation machineries of ICESp2905 (cis mobilization after site-specific integration), probably without any damage to the helper element, and the other hijacking only the CP and MPF of unknown conjugative plasmids and ICEs (trans mobilization), perhaps including those of ICESp2905 itself.
The analysis of ICESt1 and related GEIs from S. thermophilus revealed that their structure is composite, suggesting that they evolved by site-specific accretion and subsequent deletions (Pavlovic et al., 2004). In brief, an ICE or an IME (GEI1) encoding a site-specific integrase is transferred by conjugation into a recipient cell already carrying a related GEI (GEI2) integrated in the attB site targeted by these two GEIs. The incoming element can integrate into one of the two att sites flanking GEI2 (Fig. 6), leading to the tandem accretion of the two elements. The two possible resulting structures (GEI3 or GEI4) would be composite GEIs, GEI1 and GEI2 being separated by a chimeric recombination site. However, if GEI1 and GEI2 carry identical or almost identical sequences, the integration can also result from homologous recombination between the resident and incoming elements. If the two GEIs are identical, the tandems resulting from these two mechanisms will be indistinguishable. Although experimental transfer leading to accretion has been tested and obtained for very few elements (Hochhut et al., 2001a; Possoz et al., 2001; Bellanger et al., 2011; Puymège et al., 2013), more and more genome analyses report existing GEI tandems. Tandems are observed for the three recombinase types. Although prophages can be considered as GEIs and tandem formation is not rare for these elements, we do not address in this review the formation of polylysogenic cells, that is, cells carrying a tandem of prophages (Mandal et al., 1974; Kholodii & Mindlin, 1985; Hayashi et al., 2001; Eppinger et al., 2007; Asadulghani et al., 2009).
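The accretion model described above can be illustrated with a minimal sketch. It assumes a deliberately simplified representation in which an integrated locus is a list of labels bounded by attL and attR, and in which the recombination site left between the two elements is labelled attI; the function name `accrete` and the list encoding are illustrative assumptions, not part of any published tool, and the sketch does not say which of the two outputs corresponds to GEI3 or GEI4.

```python
def accrete(locus, incoming, target):
    """Integrate an incoming element into one att site flanking a resident array.

    locus    -- list such as ["attL", "GEI2", "attR"] (hypothetical encoding)
    incoming -- name of the incoming element, e.g. "GEI1"
    target   -- "attL" or "attR", the flanking site used for integration
    Returns a new list in which the two elements are separated by an
    internal site, labelled "attI" here.
    """
    if locus[0] != "attL" or locus[-1] != "attR":
        raise ValueError("locus must start with attL and end with attR")
    if target == "attL":
        # Incoming element lands on the attL side of the resident array.
        return ["attL", incoming, "attI"] + locus[1:]
    if target == "attR":
        # Incoming element lands on the attR side of the resident array.
        return locus[:-1] + ["attI", incoming, "attR"]
    raise ValueError("target must be 'attL' or 'attR'")


resident = ["attL", "GEI2", "attR"]
for site in ("attL", "attR"):
    print(site, "->", accrete(resident, "GEI1", site))
```

Because the function returns a locus in the same encoding, it can be applied repeatedly to mimic successive accretions; it does not model homologous recombination or subsequent deletions.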
Accretion catalyzed by a tyrosine recombinase: experimentally created GEI tandems
Tandem integrations of two identical or almost identical GEIs resulting from conjugative transfer were described for different families of elements. They probably result from the integration of a second copy of an element in the attL or attR sequence flanking a previously integrated copy, but may also result from the integration of this second copy by homologous recombination. For instance, the transfer of ICESt3 from S. thermophilus to a recipient cell already bearing a resident copy of this ICE leads to the creation of a GEI tandem in all the transconjugants (Bellanger et al., 2011). Furthermore, 20% of the transconjugants recovered after the transfer of pSAM2 to a recipient cell carrying an immunity defective copy of this ICE have a pSAM2 tandem (Possoz et al., 2001). Similarly, a tandem amplification of ICEclc, an ICE encoding chlorobenzene degradation, was obtained in a transconjugant of Pseudomonas putida when the strain was cultivated in the presence of this aromatic organic compound as the sole carbon source (Ravatn et al., 1998). The transfer of the IME SGI-1 from S. enterica to a recipient cell devoid of this element also leads to the recovery of tandems in over 40% of cases (Doublet et al., 2008). Such structures could result from successive acquisitions of SGI-1 monomers followed by site-specific accretion. However, they could also be obtained if the relaxase covalently bound to the transferring DNA in the recipient does not catalyze the re-circularization of the element at the first encountered oriT, allowing the transfer of a SGI-1 concatemer (Doublet et al., 2008).
Site-specific accretions of an incoming ICE with a related or unrelated resident GEI integrated in attB were also obtained. ICE_515_tRNALys, an element from S. agalactiae, possesses a conjugation module related to that of ICESt3 but has a different recombination module (Brochet et al., 2008). Streptococcus agalactiae strains already harboring in attB an IME or a CIME unrelated to ICE_515_tRNALys or another ICE related to the incoming ICE have been used as recipients. The transfer of ICE_515_tRNALys to these strains always led to its preferential integration in the attR site of the resident GEI that includes the 3′ end of tRNALys gene, regardless of the relationship of the two GEIs (Puymège et al., 2013). In the same way, the transfer of ICESt3 to recipients carrying the related GEIs ICESt1 or CIMEL3catR3 leads to its preferential integration in the attR site of the resident GEI, which includes the 3′ end of the fda gene (Bellanger et al., 2011). SXT and R391 are two related ICEs integrated in the 5′ end of the prfC gene (Waldor et al., 1996; Murphy & Pembroke, 1999). The transfer of SXT to a cell bearing R391 generates SXT/R391/prfC and R391/SXT/prfC tandems in a similar proportion (Hochhut et al., 2001a). By contrast, 90% of the tandems recovered after transfer of R391 to a cell carrying SXT display a R391/SXT/prfC structure (Hochhut et al., 2001b). Finally, a transfer mimicking accretion between an ICE and a remnant GEI was also obtained by the integration of an incoming ICEclc copy in a DNA fragment containing the attR site of this ICE (Sentchilo et al., 2009).
Accretion catalyzed by a tyrosine recombinase: reports of existing GEI tandems
More and more tandems that could result from ancient accretions are reported in the literature. Prospecting for existing tandems can be done by looking for a complete or truncated internal attachment site (attI) inside a GEI (Fig. 7). This search is much easier when internal sequences highly similar to the direct repeats flanking the GEI are present. For example, a sequence inside ICESt1 from S. thermophilus that is almost identical to its flanking direct repeats was found to be part of an internal functional attL site (Pavlovic et al., 2004). Attachment sites are not restricted to the direct repeats but also contain a mandatory adjacent sequence to which the recombinase and/or recombinase cofactors bind, that is, the arms of the att sites (Grindley et al., 2006). Hence, a sequence homologous to the arm of an att site appearing inside a GEI is also evidence of an ancient tandem accretion (Fig. 7b). Sequence analysis of the GEIs of the ICESt1 family also revealed several truncated internal att sites corresponding to remnant arms of att sites (Pavlovic et al., 2004). Some other accretions can be deduced from the presence of internal complete or truncated integrase gene(s). For instance, SPI-7CT18 from S. enterica is a 134-kb degenerated GEI closely related to ICESb1 and is integrated into the 3′ end of pheU, a tRNAPhe-encoding gene (Pickard et al., 2003; Seth-Smith et al., 2012). An integrase gene (sty4680) is located at its right end, near pheU. A truncated ORF closely related to sty4680 (93% identity), sty4678, is located 1.6 kb to the left of sty4680 (GenBank accession number NC_003198). Moreover, a 611-kb ICE, the symbiosis island, is integrated in a tRNAPhe gene of M. loti MAFF303099 (Sullivan et al., 2002). The mll6432 gene, which is located at the left end of the element near the tRNA gene, encodes the integrase of the symbiosis island. 
A truncated ORF closely related to mll6432 (81% identity), msl6419, is located 12.1 kb on the right of mll6432 (GenBank accession number NC_002678). Such structures may have arisen by tandem accretion of GEIs followed by subsequent deletion of the att site separating them (Fig. 7c).
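The search for internal copies of the flanking direct repeat can be sketched in a few lines. This is only a toy illustration under stated assumptions: the GEI sequence is assumed to start and end with the direct repeat, similarity is measured as a plain mismatch count over ungapped windows, and the function names, the example sequences, and the `max_mismatches` threshold are all hypothetical. Real analyses, such as the blastn searches used for Table 3, tolerate gaps and score alignments statistically.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def internal_repeat_hits(gei, repeat, max_mismatches=1):
    """Report windows inside a GEI that resemble its flanking direct repeat.

    Assumes (hypothetically) that `gei` begins and ends with `repeat`;
    those two flanking copies are skipped. Returns (position, mismatches).
    """
    k = len(repeat)
    last = len(gei) - k  # start of the right-hand flanking copy
    hits = []
    for i in range(1, last):
        d = mismatches(gei[i:i + k], repeat)
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


# Made-up sequences: two blocks separated by a degenerate internal repeat.
repeat = "TTGACCAT"
gei = repeat + "GGGCCCAAAG" + "TTGACGAT" + "CCCGGGTTTA" + repeat
print(internal_repeat_hits(gei, repeat))
```

A hit inside the element is only a candidate internal att site; as discussed above, remnant att arms or truncated integrase genes provide independent evidence of an ancient accretion.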
A large panel of composite GEIs showing evidence of ancient accretions is presented in Table 3. As truncated internal att sites are very rarely sought and therefore rarely found, all the composite GEIs shown in Table 3 carry at least one internal direct repeat. Some of them belong to well-studied families of pathogenicity islands. For instance, a tandem integrated in the 3′ end of a gene encoding a tRNAAsn in Enterobacter hormaechei 05–545 includes an ICE and a high pathogenicity island (HPI), a GEI which encodes virulence genes and can be transferred between different enterobacteria (Paauw et al., 2010). Another example of composite GEI is the locus of enterocyte effacement (LEE) carried by the enterohemorrhagic E. coli strains 12009 and E24377A (Table 3; Ogura et al., 2009). The nature and the number of the GEIs in these tyrosine recombinase-mediated tandems are highly variable (Table 3). In some cases, the tandemly integrated GEIs are unrelated, encoding phylogenetically very distant integrases and/or carrying unrelated att sites. For instance, the GEI ICE_2603_tRNALys is composed of a degenerated ICE and of an unrelated IME, both encoding distantly related integrases (< 30% identity; Brochet et al., 2008). Although most of the composite GEIs only contain one or two internal attachment site(s), an impressive number of direct repeats can be found inside some composite GEIs. For example, the GEI MIT9312-ISL3 from the cyanobacterium Prochlorococcus sp. contains 13 internal direct repeats (Table 3, Coleman et al., 2006). Moreover, some of the elements included in composite GEIs can be very large. For instance, PAISt, a 674-kb ICE from S. turgidiscabies, results from the accretion of a 569-kb element with another 105-kb element (Huguet-Tapia et al., 2011). By contrast, other elements found in composite GEIs can be tiny, such as one of the three elements constituting CIME_515_rpsI, which is 247 bp long (Brochet et al., 2008).
Table 3. GEIs presenting evidence of tandem accretions
The nature of the GEIs is indicated if known. Some newly described GEIs were identified during brief analyses performed to illustrate the search for GEIs carrying internal direct repeats. Homology searches were performed by comparing the sequences with the public DNA databases using the program blastn (Altschul et al., 1997) (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The new GEIs were named according to Brochet et al. (2008). The use of the direct repeats flanking ICEBs1 (Burrus et al., 2002b) as query pinpointed GEI_BamFZB42_tRNALeu, GEI_BamTA208_tRNALeu, and GEI_BsuTU-B-10_tRNALeu. The use of the direct repeats flanking ICEEc1/ICEEh1 (Schubert et al., 2004; Paauw et al., 2010) as query pinpointed GEI_Kpn342_tRNAAsn and GEI_SenCFSAN001992_tRNAAsn.
The elements are sorted according to their origin (bacterial division, genus, species). β, betaproteobacteria; γ, gammaproteobacteria; act., actinobacteria; chla., chlamydiae; chl., chloroflexi; cya., cyanobacteria; fir., firmicutes.
Various ICEs and IMEs encode one to three serine recombinase(s) that catalyze their integration and excision (Ito et al., 2003; Groth & Calos, 2004; Camilli et al., 2011). Site-specific accretions of an incoming ICE or IME encoding a serine recombinase with a resident GEI have never been observed. Nevertheless, after conjugation, tandems of engineered plasmids carrying the serine recombinase-based recombination system of the Streptomyces phage φC31 are frequently observed. They probably result from successive and independent site-specific recombinations, that is, site-specific accretion, but may also result from a site-specific integration and a subsequent homologous recombination of the integrated copy with a free copy of the plasmid (Combes et al., 2002; Eustaquio et al., 2005; Sioud et al., 2009).
Only one composite GEI tandem including an ICE or an IME encoding a serine recombinase has been reported, in the genome of C. difficile QCD-63Q42 (Table 3, Brouwer et al., 2011). This GEI includes two ICEs, the first one related to CTn5 and the other to CTn7, and is integrated in a site corresponding to the CTn7 target site in the genome of C. difficile 630 (Brouwer et al., 2011). However, the analysis of other GEIs encoding a serine recombinase, such as the GEIs integrated in the 3′ end of the ssrA gene in Dehalococcoides (McMurdie et al., 2011) or the staphylococcal chromosome cassettes (SCC), revealed accretions (Table 3). The SCCs, ranging from 0.1 to 34 kb, are integrated in the 3′ end of the orfX gene, which encodes an rRNA methyltransferase in various Staphylococcus strains (IWG-SCC, 2009; Boundy et al., 2013). Most of these elements carry two serine recombinase genes (the ccr genes) that are both required for SCC excision and integration (Wang & Archer, 2010; Misiura et al., 2013). Like CIMEs, numerous SCC cassettes are devoid of recombinase genes and are described as pseudo-SCC elements (ΨSCC; IWG-SCC, 2009). Numerous composite GEIs corresponding to elements integrated in tandem have been described (Mongkolrattanothai et al., 2004; Chongtrakool et al., 2006; Zhang et al., 2009; Chen et al., 2010; Schijffelen et al., 2010; Zong & Lu, 2010; Shore et al., 2011; Urushibara et al., 2012). For instance, S. aureus USA300 harbors an ACME-SCCmec tandem (Diep et al., 2008), whereas Staphylococcus epidermidis ATCC122228 contains a 102-kb tandem composed of six ΨSCC and two SCC loci (Table 3, Takeuchi et al., 2005).
Accretion catalyzed by a DDE transposase
The only composite GEI that includes an ICE or an IME and encodes a DDE transposase involves ICE6013 from S. aureus (Smyth & Robinson, 2009). Although the integration specificity of this ICE is very low, two copies were found to be integrated in tandem. The elements are separated by a 3-bp sequence, that is, the size of the target duplication caused by the insertion of this ICE. Furthermore, we have previously mentioned that the Tn7 transposon, which encodes a DDE transposase, inserts into a specific site of the chromosome (attTn7; see section 'Matryoshkas: MGEs within MGEs'). Many bacteria harbor Tn7-related elements integrated in tandem in this locus (Table 3, Parks & Peters, 2007, 2009). For instance, Clostridium butyricum 5521 carries a tandem composed of three complete or almost complete Tn7-like elements and a fourth truncated one inserted in attTn7 (Table 3, DeBoy & Craig, 1996). The Tn7-like elements are separated by an identical or very similar 5-bp sequence duplicated upon insertion (Parks & Peters, 2007).
Target immunity and difference in integration efficiencies
The Tn7 transposon encodes a mechanism preventing the insertion of a second copy of this MGE in the vicinity of its own insertion site (Arciszewska et al., 1989). In fact, the binding of the Tn7 transposase subunit TnsB to the ends of the transposon prevents the binding of TnsC, a protein essential for transposition (Stellwagen & Craig, 1997; Skelding et al., 2003). However, this phenomenon, called target immunity, has a limited efficiency and prevents only 90% of these tandem insertions (DeBoy & Craig, 1996). In addition, various studies showed that the high integration specificity of a GEI can impair its integration frequency when it transfers to a recipient already carrying a related GEI in attB. This mechanism is different from entry exclusion. For example, the transfer frequency of ICESt3 is reduced up to eightfold when the attB site of the recipient is already occupied by a CIME (Bellanger et al., 2011). Similarly, the integration efficiency of ICEclc is lower in its attR site than in a truncated attR site that lacks its arm and is thus similar to an attB site (Sentchilo et al., 2009). These phenomena are independent of any GEI-encoded function and probably reflect differences between ICE integration efficiencies in an empty attB and in the att sites of a resident GEI. Indeed, the binding of integrase, excisionase, and/or host cofactor(s) to the arms of the attL or attR sites of a resident GEI might interfere with the integration efficiency of an incoming GEI.
Intracellular dynamics of tandem: birth of hybrid GEIs
When GEIs of a tandem are identical or very similar, they constitute a large direct repeat. Therefore, a RecA-dependent homologous recombination could occur and lead to the creation of a circular DNA molecule containing one or several GEIs, whereas a hybrid GEI would remain integrated within the chromosome (Garriss et al., 2009; Bellanger et al., 2011; Fig. 8).
Moreover, hybrid elements can also be produced by GEI-encoded mechanisms. Indeed, homologues of the bacteriophage λ Red recombination genes bet and exo are carried by ICEs belonging to the SXT/R391 family (Garriss et al., 2009; Wozniak et al., 2009; Bellanger et al., 2011; Chen et al., 2011; Garriss et al., 2013). This RecA-independent recombination system involves (1) an exonuclease (Exo) that degrades linear double-strand DNA fragments to produce single-stranded DNA overhangs and (2) a pairing protein (Beta) that binds to these single-stranded DNA overhangs and promotes annealing to complementary DNA strands (Stahl, 1998; Poteete, 2001). Neither ICE excision nor conjugative transfer is necessary to create hybrid ICEs by the SXT/R391 recombination system (Garriss et al., 2013). Nevertheless, conjugation appears to facilitate the segregation of hybrids and may provide a means to select for functional hybrid ICEs. DNA-damaging agents induce the recombination system, the excision, and the transfer of ICEs belonging to the SXT/R391 family (Beaber et al., 2004). Therefore, their mobility and plasticity are both induced by DNA damage.
Another GEI-encoded mechanism that would be able to create hybrid GEIs is the site-specific recombination catalyzed by some relaxases (MOBs) between two oriT copies (Burrus & Waldor, 2004; Ceccarelli et al., 2008). The first site-specific recombination mediated by a relaxase that has been described was concomitant with conjugation of a plasmid carrying two cloned copies of oriT (Gao et al., 1994). During this conjugation, transfer was initiated by nicking at one oriT and was stopped at the second oriT, thus without transferring the whole plasmid. More recently, relaxase-mediated site-specific recombinations were described in the absence of conjugation. In this case, the relaxases not only initiate bacterial conjugation but can sometimes also catalyze site-specific integrations and/or excisions. Thus, the relaxases TrwC of the plasmid R388 from E. coli (Llosa et al., 1994; Draper et al., 2005; César et al., 2006), NikB of the plasmid R64 from Salmonella (Furuya & Komano, 2003), and TraX of the plasmid pAD1 from E. faecalis (Francia & Clewell, 2002) were reported to catalyze co-integrations and/or resolutions of DNA molecules carrying a complete or a truncated copy of their cognate oriT. Numerous GEIs encode a relaxase belonging to the same family as TrwC, NikB, and TraX (MOBF, MOBP, and MOBC, respectively; Tables 1 and 2). For example, the TrwC-related relaxase Zoua, encoded by a GEI from Streptomyces kanamyceticus, can catalyze under selective pressure the amplification of a 145-kb region of this GEI that includes the entire kanamycin biosynthetic gene cluster (Murakami et al., 2011). This up to 36-fold amplification occurs by relaxase-mediated recombination between inverted repeat structures similar to those found in oriT sequences. However, not all the relaxases belonging to the same protein families as TrwC, NikB, and TraX are able to catalyze site-specific recombinations. 
Indeed, the relaxases of the F and pKM101 plasmids are not capable of promoting recombination between two oriT copies even though they belong to the MOBF family (Gao et al., 1994; César et al., 2006). Site-specific recombination between different GEIs integrated in tandem was also proposed to explain the formation of hybrid ICEs of the SXT/R391 family, which encode relaxases belonging to the MOBH family, but this hypothesis has never been demonstrated (Burrus & Waldor, 2004; Ceccarelli et al., 2008). A tandem amplification of ICEclc in a transconjugant of P. putida was obtained when the strain was cultivated on chlorobenzene as sole carbon source (Ravatn et al., 1998). Nevertheless, the role of the ICEclc relaxase, which belongs to the MOBH family, in this amplification was not investigated. Altogether, these data suggest that site-specific recombinations mediated by a relaxase might be a common phenomenon between oriTs of GEIs.
On the whole, the RecA-dependent and the Bet/Exo-like recombination mechanisms as well as site-specific recombinations mediated by a relaxase can lead to the creation of hybrid GEIs and therefore are involved in module rearrangements and exchanges between GEIs.
Intracellular dynamics of tandems: shuffling and loss of GEIs integrated in tandem
The tandem integration of MGEs can lead to their instability. First, each element composing the composite GEI may excise and re-integrate as long as it is flanked by functional recombination sites and one of the elements of the tandem encodes a functional recombinase (Fig. 9). Excision of the whole composite GEI may also occur. The elements of a tandem that are flanked by att sites and excise but do not encode the recombinase should be considered nonautonomous MGEs. More and more studies reveal such tyrosine recombinase-mediated excisions of individual and composite GEIs. For instance, such multiple excisions were found for the LpcGI-1 and LpcGI-2 tandems from Legionella pneumophila (Lautner et al., 2013), the ICE_515_tRNALys-CIME_Nem_tRNALys tandem from S. agalactiae (Puymège et al., 2013), the ICESt3-CIMEL3catR3 tandem from S. thermophilus (Bellanger et al., 2011), and the GI2–GI3 tandem from Bordetella petrii (Lechner et al., 2009). In the same way, the excision of the two GEIs composing the SCCmec-ACME tandem from S. aureus USA300 and the SCCmec-ΨSCC tandem from S. aureus wkz-2 is mediated by the serine recombinases encoded by the SCCmec elements (Jansen et al., 2006; Diep et al., 2008).
Once a GEI has excised, it can integrate back into an attachment site. If it integrates in the site created by its own excision, the integration restores the native tandem structure, whereas if it integrates in a different att site, the respective order of the GEIs in the tandem is modified (Fig. 9). This GEI shuffling was extensively characterized in the progeny of a transconjugant harboring a CIMEL3catR3-ICESt3-fda tandem (the gene in which the GEIs are integrated (fda) is mentioned to indicate their respective order; Bellanger et al., 2011). The independent excision of both GEIs, the excision of the whole tandem, and the formation of an ICESt3-CIMEL3catR3-fda array were demonstrated by PCR amplifications of the different predicted att sites (Bellanger et al., 2011; Fig. 9).
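The excision and re-integration events described above can be enumerated with a small sketch. It assumes a simplified model in which a tandem is an ordered list of element names, any contiguous block flanked by att sites can excise, and the excised block can re-integrate at any att site of what remains in the chromosome (or in the empty attB). The function names and this list encoding are illustrative assumptions; the sketch ignores recombinase requirements and differences in integration efficiency between sites.

```python
def excision_products(tandem):
    """Return (excised block, elements left in the chromosome) for every
    contiguous block of a tandem, including the whole tandem."""
    products = []
    n = len(tandem)
    for start in range(n):
        for end in range(start + 1, n + 1):
            excised = tandem[start:end]
            remaining = tandem[:start] + tandem[end:]
            products.append((excised, remaining))
    return products


def reintegration_outcomes(excised, remaining):
    """Arrays obtained when the excised block re-integrates at each
    available att site of the remaining array."""
    return [remaining[:i] + excised + remaining[i:]
            for i in range(len(remaining) + 1)]


# Element order is read towards the fda gene, as in the text.
tandem = ["CIMEL3catR3", "ICESt3"]
for excised, remaining in excision_products(tandem):
    print("excised:", excised, "| left:", remaining,
          "| after re-integration:", reintegration_outcomes(excised, remaining))
```

In this toy model, re-integration of an excised element either restores the original order or yields the shuffled array, which mirrors the CIMEL3catR3-ICESt3-fda and ICESt3-CIMEL3catR3-fda arrangements mentioned above.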
When excised, a GEI can be lost by dilution during cell division. For instance, GI3 of the GI2–GI3 tandem in B. petrii can be lost from the bacterial population after about 100 consecutive generations in the absence of selective pressure (Lechner et al., 2009). In the same way, CIMEL3catR3 can be lost in the progeny of a transconjugant carrying an ICESt3-CIMEL3catR3 tandem (Bellanger et al., 2011). Nevertheless, ICESt3 is never spontaneously cured in this progeny, regardless of the growth conditions. Indeed, the putative RM II system encoded by this ICE could act as a toxin/antitoxin system killing the daughter cells that have lost the ICE (Kobayashi, 2001; Bellanger et al., 2011). GEIs encoding a serine recombinase can also be lost from tandems by dilution. For example, upon in vitro serial passage of S. aureus 85/2082 carrying a SCCmercury-SCCmec tandem, the strain lost β-lactam resistance, suggesting the loss of at least the SCCmec element (Ito et al., 2001). In addition, the comparison by pulsed-field gel electrophoresis, multilocus sequence typing, and staphylococcal protein A (spa) typing of methicillin-resistant S. aureus and methicillin-susceptible S. aureus strains strongly suggested ccr-mediated spontaneous in vivo deletions of SCCmec (Donnio et al., 2007; Boundy et al., 2012).
In all the cases described here, the whole tandem can also excise as a unique circular form, leaving an empty attB site in the chromosome (Fig. 9), and even re-integrate. This type of excision results from the site-specific recombination between the attR site of the first GEI of a tandem and the attL site of the second one. It also constitutes the first step of cis-conjugative mobilization (see next paragraph). Such excision of a Tn7 tandem as a single unit was hypothesized but not investigated (Parks & Peters, 2009).
Intercellular dynamics of tandem: cis-conjugative mobilization and retromobilization
A whole composite GEI composed of a mobilizing MGE (ICE or IME) carrying a complete recombination module and an oriT, and any other related GEI flanked by att sites, could excise as a circular form, transfer by conjugation, and then integrate in the attB site of a recipient cell (Fig. 9). The mobilized GEI could be an ICE, an IME, or a CIME, that is, a GEI that does not encode any protein involved in conjugative transfer or in recombination but that is flanked by att sites (Pavlovic et al., 2004). Only three cis-conjugative mobilizations of a CIME by a related ICE have been reported so far: the transfer of CIMEL3catR3-ICESt3 in S. thermophilus (Bellanger et al., 2011), ICE_515_tRNALys-CIME_Nem_tRNALys in S. agalactiae (Puymège et al., 2013), and LpcGI-2 in L. pneumophila (Lautner et al., 2013). It should be noted that some of the transconjugants from a donor harboring CIMEL3catR3-ICESt3 have only acquired the CIME. This likely results from the transfer of the whole tandem and its resolution in the recipient, followed by the integration of only the CIME and the loss of the ICE during cell division (Bellanger et al., 2011). Moreover, when using a donor cell carrying a tandem of ICESt3, co-transfer of the two ICESt3 copies tagged by different antibiotic resistance genes was observed at a frequency similar to that of the independent transfer of these two ICEs (X. Bellanger, G. Guédon, unpublished data). This shows that cis-conjugative mobilization of an oriT-carrying GEI by another oriT-carrying element can also occur.
Conjugative transfer and mobilization were usually considered as a gene flow from a donor to a recipient strain. However, gene flow occurring at a high frequency in the two directions was reported in proteobacteria and firmicutes (Mergeay et al., 1987; Szpirer et al., 1999; Haines et al., 2006; Lotareva & Prosorov, 2006; Timmery et al., 2009). These studies described the mobilization in cis or trans of chromosomal genes, mobilizable or nonmobilizable plasmids by a conjugative plasmid from the recipient back into the donor. The word ‘retrotransfer’ is commonly used to characterize this phenomenon of gene capture. DNA conjugative transfer is unidirectional, and retrotransfer thus involves two successive unidirectional transfers, a first one from the donor to the recipient cell and a second one from the recipient (actually a transconjugant) to the donor cell (Ankenbauer, 1997). Moreover, retrotransfer requires transfer of a plasmid carrying a conjugation module to the recipient, but not its replication, and expression in the recipient of genes of the conjugation module (Heinemann & Ankenbauer, 1993a, b; Sia et al., 1996). Retrotransfer is not rare and its frequency can be very high, being in some cases similar to the frequency observed for the transfer in one direction (Mergeay et al., 1987; Haines et al., 2006). Although several captures of chromosomal markers by a conjugative plasmid were described, all these retrotransfers were mediated by a transposon carried by the plasmid (Szpirer et al., 1999).
Retrotransfer or retromobilization involving ICEs has until now been tested for Tn916, Tn925 (an element almost identical to Tn916), and ICESt3. First investigations failed to show any Tn916- or Tn925-mediated retrotransfer of plasmids from B. subtilis and Bacillus thuringiensis (Showsh & Andrews, 1996). Nevertheless, retrotransfer of CIMEL3catR3 by ICESt3 was recently shown (Bellanger et al., 2011). The retrotransconjugants derive from the donor strain but harbor a GEI tandem composed of two ICESt3 copies and one CIMEL3catR3 instead of ICESt3 alone. This retrotransfer probably relies on successive steps: (1) a first transfer of ICESt3 to a recipient cell harboring the CIME, (2) the site-specific accretion of the incoming ICE with the resident CIME, (3) the excision of the whole composite GEI, (4) its transfer to a donor cell already carrying the ICE, and (5) its integration in one of the att sites flanking the resident ICE. It is postulated that any cointegrate made of an ICE and another DNA molecule (GEI, plasmid or chromosome) could be retrotransferred if the conjugation module is expressed. As the Tn916 conjugation module is only expressed when the ICE is in circular form (Celli & Trieu-Cuot, 1998), it is unlikely that a cointegrate carrying this element would be able to retrotransfer, which would explain the results of Showsh and Andrews (1996).
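The five proposed steps can be traced with a toy state model. This sketch assumes that each cell's integrated array is a list of element names, that the transferring cell keeps its own copy after conjugation, and that the side of integration ("left" or "right") is a free parameter; the function names and the left/right convention are hypothetical, and the element order in real retrotransconjugants is not specified here.

```python
def integrate_next_to(resident, incoming, side):
    """Integrate the incoming elements at the att site on one side of the
    resident array (illustrative left/right convention)."""
    if side == "left":
        return incoming + resident
    if side == "right":
        return resident + incoming
    raise ValueError("side must be 'left' or 'right'")


def retrotransfer(donor, recipient, side="right"):
    """Follow steps (1) to (5) and return the composite formed in the
    recipient and the final array of the retrotransconjugant."""
    # (1) a copy of the donor ICE is transferred to the recipient
    incoming = list(donor)
    # (2) site-specific accretion with the resident CIME
    composite = integrate_next_to(recipient, incoming, side)
    # (3) + (4) the whole composite excises and a copy returns to the donor
    returning = list(composite)
    # (5) integration at an att site flanking the resident ICE
    retrotransconjugant = integrate_next_to(donor, returning, side)
    return composite, retrotransconjugant


composite, retro = retrotransfer(["ICESt3"], ["CIMEL3catR3"])
print("composite in recipient:", composite)
print("retrotransconjugant:", retro)
```

Whatever side is chosen, the final array contains two ICESt3 copies and one CIMEL3catR3, consistent with the composition reported for the retrotransconjugants.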
Although several studies have characterized the cis mobilization of a GEI by a related ICE (Lechner et al., 2009; Bellanger et al., 2011; Lautner et al., 2013; Puymège et al., 2013), the study of ICESt3 and CIMEL3catR3 is the only demonstration of GEI retromobilization mediated by an ICE, or of GEI capture by a cell carrying a related ICE. However, given the growing body of experimental evidence and the widespread ability of GEIs to integrate in tandem and of composite GEIs to excise as a single circular form, cis-conjugative mobilization and retromobilization will probably be recognized as key mechanisms of GEI transfer. Regardless of its gene content, a GEI flanked by att sites could thus be mobile, provided that an ICE or an IME is present in the cell to mobilize it.
Despite their intracellular and intercellular mobility, ICEs, unlike plasmids and transposons, are difficult to detect because their site-specific integration does not disrupt the target gene. Therefore, most ICEs identified before 1990, such as Tn916, pRS01 or pSAM2, were atypical. As these elements belong to a previously unknown class of MGEs, many ICEs were initially called ‘plasmids’ (for a review see Burrus et al., 2002a). Such confusion still occurs. For instance, various ICEs were misnamed, such as the ‘phage island’ G9acb from Acinetobacter baumannii (Di Nocera et al., 2011), the ‘integrative plasmid’ from Streptococcus canis (Richards et al., 2012), or the ‘plasmid-like’ element pLP100 from L. pneumophila (Trigui et al., 2013). Moreover, even the detection of ICEs by in silico analysis of genomes is not straightforward. Before the mid-2000s, most genes involved in conjugation were annotated as ‘unknown’ in the databanks, and the domains were not indicated. Furthermore, even now the protein or domain annotation can be imprecise or misleading, hindering the detection of ICEs by sequence analysis. For instance, tyrosine integrases of ICEs are frequently annotated as ‘phage integrases’ or ‘site-specific recombinases, phage integrase family’. Some annotations can also indicate a function unrelated to the real or main function of the protein. For instance, the relaxases of most ICEs belonging to the superfamily ICEBs1/Tn916/ICESt3, probably the most common ICE subclass found in firmicutes, carry two domains: an HTH/XRE DNA-binding domain (cd00093) encoded at the 5′ end, also found in the central repressor of prophages, and a large Rep_trans domain (replication initiation factor; pfam02486) typical of a relaxase. As a result, many of these relaxases are annotated as ‘Cro/CI family transcriptional regulators’ or ‘transcriptional regulators’.
Furthermore, as a result of the modular structure and exchanges, the mobility and regulation genes generally belong to protein families that also include mobility or regulation proteins of conjugative plasmids, integrated prophages, or transposable elements.
The first searches for ICEs in bacterial genomes were performed in the early 2000s. For instance, seven AICEs or AICE-related GEIs (i.e. a previously known AICE, SLP1, and six novel GEIs reported as pSAM2-like elements) were detected in the sequenced chromosome of the actinobacterium Streptomyces coelicolor A3(2) (Bentley et al., 2002). To our knowledge, only four comprehensive and reliable searches for all types of ICEs in all the genomes available for a taxon have been performed. The first search consisted of BLAST analyses performed on all 22 completely or incompletely sequenced genomes of firmicutes that were available in 2001. It used as queries all the complete (only two and those of ICESt1 and of ICEBs1 which was identified during this study) and incomplete available sequences of conjugative modules from firmicutes (Burrus et al., 2002b). This led to the detection, in six of these genomes, of 17 complete conjugative modules associated with recombinases, that is, 17 putative ICEs (including seven ICEs in C. difficile 630 and six in E. faecalis V583). The second one, performed on eight complete or incomplete available genomes of the firmicute S. agalactiae, revealed a total of 16 ICEs (Brochet et al., 2008). The third one, performed on 11 strains of C. difficile, revealed 30 ICEs (Brouwer et al., 2011). The last one, performed on 275 available genomes of actinobacteria, revealed 144 AICEs (i.e. ICEs that transfer as double-stranded DNA) and 17 ICEs using MOB, CP, and MPF to transfer as single-stranded DNA (Ghinet et al., 2011). More and more ICEs are being detected in sequenced genomes, although most genomes have probably not been analyzed from this perspective, at least during the sequencing project. In addition, various comprehensive searches for putative ICEs related to a studied ICE were performed on all the genomes available for the corresponding taxon.
Taken together, all these data show that various families are widespread among proteobacteria (Toussaint et al., 2003; van der Meer & Sentchilo, 2003; Gaillard et al., 2006; Juhas et al., 2007; Ryan et al., 2009; Wozniak et al., 2009), firmicutes (Carraro et al., 2011; Mingoia et al., 2011; Guerillot et al., 2013; Puymège et al., 2013), bacteroidetes (Bacic et al., 2005), and actinobacteria (te Poele et al., 2008; Ghinet et al., 2011). Furthermore, an exhaustive search of conjugation modules based on HMM profiles deduced from known MOBs, CPs, and MPFs, that is, essentially conjugation modules of proteobacterial plasmids was recently performed on 1124 archaeal and bacterial genomes (Guglielmini et al., 2011). It identified 335 complete conjugation modules on chromosomes and only 180 on plasmids. Although the integration/excision modules were not searched, this strongly suggests that ICEs transferring as single-stranded DNA are more frequent than conjugative plasmids. This study found not only numerous ICEs in bacterial divisions where some ICEs were already identified but also some ICEs in archaea and other bacterial divisions such as Cyanobacteria, Acidobacteria, or Fusobacteria. Interestingly, none of the ICEs (transferring as single-stranded DNA) identified in actinobacteria in that study was identified by the analyses reported by Ghinet et al. (2011). Conversely, none of the putative actinobacterial ICEs (transferring as single-stranded DNA) identified by Ghinet et al. was found by Guglielmini et al. (2011). This suggests that the strong bias in the initial dataset for proteobacterial plasmids does not allow the detection of very distant or unrelated conjugation modules, precluding the detection of many ICEs in other bacterial divisions and archaea.
IMEs: a world still to be explored
IMEs are much harder to detect by conjugation than ICEs, as most integrate into specific sites without disrupting the target gene and their transfer requires the presence of a helper element. This explains why so few IMEs differing in gene content have been shown to be mobilizable in trans by conjugative elements, that is, six in bacteroidetes, four in firmicutes, four in proteobacteria, and only one in actinobacteria (Table 2). In addition, the detection of IMEs by in silico analysis of genomes is also more difficult than the detection of ICEs. Indeed, depending on the element, an IME encodes only a few conjugation proteins (proteins belonging to the relaxosome) or only one conjugation protein (a MOB), if any. Furthermore, IMEs found by sequence analysis were frequently confused with other classes of MGEs. For instance, our sequence reanalysis of the ‘ICEs’ GI2 from Burkholderia cenocepacia (Graindorge et al., 2012), ICE6180-RD.1 (Beres & Musser, 2007) and ICESp2907 from S. pyogenes (Giovanetti et al., 2012) revealed that they are IMEs (Table 2). Furthermore, another sequence reanalysis showed that the ‘Stoke Mandeville phage island’ inserted in the ICE CTn027/Tn6105 actually corresponds to three IMEs integrated into Tn6105 and the intervening sequences of this ICE (Brouwer et al., 2011).
The unique comprehensive search of IMEs encoding a MOB, performed on eight complete or incomplete available genomes of the firmicute S. agalactiae, revealed nine IMEs (with one present in three strains; Brochet et al., 2008). Furthermore, the previously discussed exhaustive search performed by Guglielmini et al. (2011) on 1124 archaeal and bacterial genomes identified not only 335 chromosomal conjugation modules but also 402 chromosomal relaxase genes that do not belong to a conjugation module. Although the integration/excision modules were not searched, this suggests that IMEs are even more frequent than ICEs. Furthermore, it is probable that, like for ICEs and for the same reasons, Guglielmini et al. (2011) failed to detect numerous MOB genes that do not belong to a conjugation module and therefore failed to detect numerous IMEs. In addition, numerous GEIs possess a complete integration/excision module (encoding usually a tyrosine recombinase) but do not encode any conjugation protein (see e.g. Brochet et al., 2008; Boyd et al., 2009). These elements could correspond to: (1) IMEs encoding a MOB very distantly related to known MOBs or belonging to a not yet characterized family of MOBs, (2) IMEs that, like MGIs from Vibrio (Daccord et al., 2010, 2013), do not encode any conjugation protein but harbor an oriT related to that of some conjugative elements, (3) a degenerated GEI deriving from an ICE or an IME by deletion of the conjugation or the mobilization module such as the GEI HPI from Y. pestis KIM (Schubert et al., 2004), and (4) GEIs belonging to other poorly known classes of MGEs such as satellite prophages (Christie & Dokland, 2012). Furthermore, at least some tISs encoding a DDE transposase (i.e. insertion sequences with passenger genes) are IMEs, as they are mobilizable in trans by an ICE (Achard & Leclercq, 2007). Altogether, these data suggest that, although very poorly known, IMEs might have a very high prevalence in genomes.
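The counts quoted from Guglielmini et al. (2011) can be put side by side with a few lines of arithmetic. This is only a sketch: the variable names are ours, and the ratios say nothing about integration/excision modules, which were not searched.

```python
# Counts quoted in the text from Guglielmini et al. (2011); variable names are ours.
chromosomal_modules = 335  # complete conjugation modules found on chromosomes
plasmid_modules = 180      # complete conjugation modules found on plasmids
orphan_relaxases = 402     # chromosomal relaxase genes outside any conjugation module

# Share of complete conjugation modules that are chromosomal (putative ICEs).
chromosomal_share = chromosomal_modules / (chromosomal_modules + plasmid_modules)

# Orphan chromosomal relaxases (putative IMEs) per chromosomal conjugation module.
relaxase_to_module_ratio = orphan_relaxases / chromosomal_modules

print(f"{chromosomal_share:.1%}")         # 65.0%
print(f"{relaxase_to_module_ratio:.2f}")  # 1.20
```

About two thirds of the complete conjugation modules are therefore chromosomal, and orphan chromosomal relaxase genes outnumber chromosomal conjugation modules by roughly 20%.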
Prevalence of degenerated GEIs and remnants
The slightly degenerated GEIs that derive from ICEs or IMEs by pseudogenization are generally detected along with related functional ICEs or IMEs and/or are frequently confused with them (Brochet et al., 2008; Seth-Smith et al., 2012). The most impressive number of degenerated ICEs has been reported in the genome of the obligate intracellular α-proteobacterium Orientia tsutsugamushi strain Ikeda: the 185 copies of highly degenerated and related ICEs have a total length of 694 kb, representing 34.7% of the genome of the strain (Nakayama et al., 2008). Highly degenerated elements such as CIMEs (i.e. elements that derive from ICEs or IMEs by deletion of the conjugation/mobilization and recombination modules but have retained att sites) are generally only detectable by careful comparison between the limits of known ICEs or IMEs and the limits of other GEIs that are integrated in the same sites. Hence, since only a limited number of ICEs and very few IMEs have been identified, only a few CIMEs have been reported. For instance, the comparison of ICESt1 with related GEIs integrated in the 3′ end of fda revealed two different ICEs, four different CIMEs, and two GEIs probably deriving from CIMEs by deletion of their left end (Burrus & Waldor, 2003; Pavlovic et al., 2004). The only comprehensive search for CIMEs in genomes, performed on eight complete or incomplete available genomes of the firmicute S. agalactiae, revealed 25 CIMEs, if each element found in composite GEIs is taken into account individually (Brochet et al., 2008). Furthermore, it should be emphasized that, in the absence of known related ICEs/IMEs, these highly degenerated elements will be described as GEIs or will escape detection. In addition, tiny remnants such as the traces found within CIME302 and in ICESt1 (see section 'Degenerated GEIs deriving from ICEs, IMEs, or IMEXs') will be even more difficult to detect.
In the same way, a GEI remnant is integrated in accretion with the putative IME MGIAmaMed1 from Alteromonas macleodii (Daccord et al., 2013). Altogether, these data suggest that, although almost unknown, highly degenerated GEIs deriving from ICEs and IMEs, such as CIMEs or remnants, might be highly frequent in genomes.
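The distinctions drawn in this review between ICEs, IMEs, and CIMEs can be summarized as a simple decision rule. The following is only an illustrative sketch: the `Gei` record, its field names, and the `classify` function are hypothetical, and real elements require careful sequence analysis rather than four yes/no features.

```python
from dataclasses import dataclass


@dataclass
class Gei:
    """Hypothetical feature record for a genomic island (illustration only)."""
    has_att_sites: bool
    has_recombination_module: bool      # integration/excision module
    has_conjugation_module: bool        # complete conjugation module
    has_mob_or_orit: bool = False       # relaxase (MOB) gene and/or oriT


def classify(gei: Gei) -> str:
    """Coarse class following the definitions used in the text."""
    if gei.has_recombination_module:
        if gei.has_conjugation_module:
            return "ICE"
        if gei.has_mob_or_orit:
            return "IME"
        # Could be an IME with an undetected MOB or oriT, or another MGE class.
        return "unresolved GEI with integration/excision module"
    if (gei.has_att_sites and not gei.has_conjugation_module
            and not gei.has_mob_or_orit):
        return "CIME"
    return "other GEI or remnant"


print(classify(Gei(True, True, True)))           # ICE
print(classify(Gei(True, True, False, True)))    # IME
print(classify(Gei(True, False, False)))         # CIME
```

The "unresolved" branch corresponds to the GEIs that carry a complete integration/excision module but no recognizable conjugation protein, for which several interpretations remain open.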
Site-specific accretion: a key mechanism of ICE and IME evolution?
Tandemly integrated GEIs (and/or GEIs carrying internal recombination sites) are found in numerous bacterial genomes and in various integration sites (Table 3). For instance, the search for internal recombination sites revealed 12 composite GEIs within the set of 35 ICEs, IMEs, and related GEIs differing in gene content that were identified in the genomes of eight strains of S. agalactiae (Brochet et al., 2008). This analysis suggests that at least 15 independent accretion events catalyzed by eight different types of tyrosine integrase have occurred. Furthermore, the actual traces may be even more frequent in this dataset, as a search for faint GEI remnants such as the truncated att site found in ICESt1 and related elements (Pavlovic et al., 2004) was not performed.
However, only a very small minority of these structures has been reported. The most plausible explanation for the failure to identify these structures is that GEI analyses were not designed to reveal their chimeric nature and evolutionary dynamics. To illustrate that composite GEIs can be found if searched for, we performed a brief analysis consisting of homology searches with the blastN and blastX programs (Altschul et al., 1997), using the sequences of some well-known ICEs available in public DNA databases for which we had no evidence of accretion. For example, using the direct repeats flanking ICEBs1 (Burrus et al., 2002b) as query pinpointed the composite GEIs GEI_BamFZB42_tRNALeu, GEI_BamTA208_tRNALeu, and GEI_BsuTU-B-10_tRNALeu (Table 3). Those flanking ICEEc1/ICEEh1 (Schubert et al., 2004; Paauw et al., 2010) allowed the identification of GEI_Kpn342_tRNAAsn and GEI_SenCFSAN001992_tRNAAsn. Therefore, it seems that site-specific accretion is a key mechanism of evolution of ICEs, IMEs, and related GEIs, at least of those that integrate site-specifically.
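The idea of using flanking direct repeats as a query can be illustrated with a minimal sketch. Note the assumptions: exact string matching stands in for the blastN searches actually used, the function names are hypothetical, and the repeat and sequence below are made-up toy strings, not real att sites.

```python
def find_repeat_positions(sequence: str, repeat: str) -> list:
    """Return the 0-based start of every exact copy of `repeat` in `sequence`."""
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        start = sequence.find(repeat, start + 1)  # also catches overlapping copies
    return positions


def looks_composite(sequence: str, repeat: str) -> bool:
    """Two copies flank a single element; a third, internal copy hints at accretion."""
    return len(find_repeat_positions(sequence, repeat)) >= 3


# Made-up toy data: three copies of a 6-bp "repeat" delimiting two segments.
toy_repeat = "ATTGCA"
toy_region = toy_repeat + "CCCC" + toy_repeat + "GGGG" + toy_repeat

print(find_repeat_positions(toy_region, toy_repeat))  # [0, 10, 20]
print(looks_composite(toy_region, toy_repeat))        # True
```

Real att sites are often imperfect or truncated, so an approach tolerant of mismatches (as in the homology searches described above) would be needed in practice.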
A growing body of data shows that various GEIs correspond to highly diverse types of noncanonical integrated MGEs. Apart from the best-known ICEs, which encode all functions necessary for their transfer and maintenance, these GEIs can belong to poorly known and/or recently described classes of nonautonomous MGEs that have highly variable degrees of dependence. The very few systematic investigations of these MGEs in bacterial genomes suggest that they are probably widespread in most or all archaeal and bacterial groups. However, various pitfalls and biases hinder their in silico or experimental identification, and their actual abundance remains largely underestimated. It cannot even be excluded that the elements that are the most difficult to identify and the most rarely described are the most widespread. With the availability of an increasing number of sequenced bacterial genomes, the development of new approaches to identify these MGEs is a key challenge for future studies of GEIs. Easy detection of these elements is necessary to understand the real importance and impact of conjugative transfer of GEIs on the evolution of their hosts.
The highly diverse interactions of conjugative elements with their various types of hitchhikers (high- or low-specificity insertion of the hitchhiker, site-specific accretion, structural instability, mobilization in cis or in trans) lead to a complex evolution of the GEIs. This evolution can lead to the inactivation of the element, the acquisition of adaptive modules, or a change in its intracellular or intercellular mobility. One of the main consequences of this complex evolution is that most, if not all, conjugative or mobilizable GEIs are chimeric MGEs that have acquired novel modules from other mobile elements. Another consequence is the frequent decay of these elements. This leads not only to nonmobile GEIs or remnants but also to elements with a reduced, but significant, mobility such as the CIMEs. Therefore, a key challenge for future research will be to determine the interactions between conjugative elements, their various hitchhikers, and the degenerated elements that have retained some mobility.