Identification of bacterial plasmids based on mobility and plasmid population biology

Authors

  • Maria Pilar Garcillán-Barcia,

    1. Departamento de Biología Molecular e Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), Universidad de Cantabria-CSIC-IDICAN, Santander, Spain
  • Andrés Alvarado,

    1. Departamento de Biología Molecular e Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), Universidad de Cantabria-CSIC-IDICAN, Santander, Spain
  • Fernando de la Cruz

    1. Departamento de Biología Molecular e Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), Universidad de Cantabria-CSIC-IDICAN, Santander, Spain

  • Editor: Teresa Coque

Correspondence: Fernando de la Cruz, Departamento de Biología Molecular e Instituto de Biomedicina y Biotecnología de Cantabria (IBBTEC), Universidad de Cantabria-CSIC-IDICAN, C. Herrera Oria s/n, 39011 Santander, Spain. Tel.: +34 942 201 942; fax: +34 942 201 945; e-mail: delacruz@unican.es

Abstract

Plasmids contain a backbone of core genes that remains relatively stable for long evolutionary periods, so it makes sense to speak of plasmid species. The identification and characterization of the core genes of a plasmid species is especially relevant to the study of its epidemiology and modes of transmission. In addition, this knowledge will help to unveil the main routes that genes, for example antibiotic resistance (AbR) genes, use to travel from environmental reservoirs to human pathogens. Global dissemination of multiple antibiotic resistances and virulence traits by plasmids is an increasing threat to the treatment of many bacterial infectious diseases. To follow the dissemination of virulence and AbR genes, we need to identify the causative plasmids and follow their path from reservoirs to pathogens. In this review, we discuss how the existing diversity in plasmid genetic structures gives rise to a large diversity in propagation strategies. We propose that, using an identification methodology based on plasmid mobility types, we can follow the propagation routes of most plasmids in Gammaproteobacteria, as well as their cargo genes, in complex ecosystems. Once the dissemination routes are known, designing antidissemination drugs and testing their efficacy will become feasible.

Introduction

Plasmids occur pervasively in most bacterial species. They are important agents of gene flux. As a paradigmatic example, they are responsible for the appearance and dissemination of multiple antibiotic resistances (multidrug resistance or MDR plasmids), which is an increasingly recognized threat in human medicine (Smith & Romesberg, 2007; Boucher et al., 2009). Each year, about 25 000 patients die in the EU from an infection caused by MDR bacteria (ECDC/EMEA Joint Technical Report, 2009, http://www.ema.europa.eu/docs/en_GB/document_library/Report/2009/11/WC500008770.pdf). Among a multitude of additional examples of new threats posed by MDR bacteria, Yersinia pestis was shown to acquire an MDR plasmid for the first time in 1995 (Welch et al., 2007). Mobile antibiotic resistance (AbR) genes are contained in platforms that include plasmids, integrative and conjugative elements (ICEs), integron cassettes and a variety of transposons and related elements. All these are collectively known as mobile genetic elements (MGEs). MGEs can move by a variety of molecular mechanisms, including conjugation, transformation and transduction. Among the almost infinite range of possibilities, specific MGEs will use preferred routes. Little is known about the constraints that limit the mobility of a given MGE (Slater et al., 2008). It is now increasingly appreciated by the clinical and microbiological communities that, if we knew more about the dynamics and preferred routes of MGE propagation, it would become possible to control, and therefore impede or limit, the dissemination of mobile AbR genes (Bonten et al., 2001; Williams & Hergenrother, 2008). In any case, plasmids are the preferred route for dissemination of AbR, while bacteriophages play a relatively minor role in the process, at least in Gammaproteobacteria (de la Cruz & Davies, 2000; Barlow, 2009; Skippington & Ragan, 2011).
As a first step to ascertain the routes of plasmid propagation, we need a strategy to sort plasmids into groups and then determine which genes the members of each group have in common and how they compare with other sets of plasmids. In other words, we need an informative classification system. As shown in the review by Smillie et al. (2010), we only have a relatively comprehensive picture of plasmid diversity in the phylum Proteobacteria. Other bacterial phyla are poorly characterized by comparison. Thus, this review will emphasize what we have learned from Proteobacteria with just occasional incursions into other bacterial phyla.

In order to control the spread of MDR plasmids, we need to know many more of the variables that affect their movement (preferred hosts and environmental conditions for propagation and/or stable maintenance). To follow their migration routes, from reservoirs to the final human pathogens, we have to be able to track and identify individual plasmids with techniques that should be, ideally, both inexpensive and highly scalable. This means that we need to have in hand a robust plasmid classification method and the corresponding technology for experimental testing. Classical methods of plasmid classification are incompatibility testing (Datta & Hedges, 1971; Taylor et al., 2004), hybridization with replicon probes (Couturier et al., 1988) and PCR-based replicon typing (PBRT) (Gotz et al., 1996; Greated & Thomas, 1999; Carattoli et al., 2005; Garcia-Fernandez et al., 2009; Bertini et al., 2010; Villa et al., 2010). The first is clearly obsolete because of: (1) the need to transfer plasmids to the same given host for analysis, which limits the range of plasmids that can be analyzed, (2) the exponential increase in labor when new Inc groups are discovered and incorporated into the test and (3) the fact that single point mutations can change the Inc group of a plasmid (Lacatena & Cesareni, 1981; Tomizawa & Itoh, 1981). PBRT is widely used and has led to important advances in our knowledge of plasmid diversity and dynamics. It has shown that clinically relevant AbR genes are mainly located on conjugative plasmids belonging to a few widespread replication types. Some of these plasmids were able to transfer to different hosts, causing new outbreaks of MDR bacteria (Boyd et al., 2004; Lavollay et al., 2006; Chowdhury et al., 2011).
In spite of its success, plasmid classification by PBRT also suffers from several drawbacks: (1) the frequent occurrence of multiple replication regions in a single plasmid, which makes an unequivocal classification impossible, (2) the lack of phylogenetic depth due to the diversity and rapid evolution of replicators and (3) the existence of hybrid replication regions that also confuse classification. As a further advance, analysis by plasmid multiple locus sequence type (pMLST) has been used to identify a number of plasmid backbones: IncI1 (Garcia-Fernandez et al., 2008) (http://pubmlst.org/plasmid/), IncN (Garcia-Fernandez et al., 2011; Zong et al., 2011) (http://pubmlst.org/plasmid/), and IncHI1 (Phan et al., 2009) and IncHI2 [by plasmid double locus sequence typing (Garcia-Fernandez & Carattoli, 2010) (http://pubmlst.org/plasmid/)]. Although pMLST quickly detects genes belonging to different plasmid modules, backbone variants often escape detection. Other classification methods are even more robust, but require an analysis of the full sequence of the plasmids (Brilli et al., 2008; Suzuki et al., 2010). Francia et al. (2004) and Garcillán-Barcia et al. (2009) proposed a new classification scheme based on plasmid mobility, the so-called MOB classification system. It was later shown that this typing system could be applied to an in-depth description of all plasmids populating DNA databases (Smillie et al., 2010). This analysis shed light, for the first time, on the more general aspects of plasmid population structure and provided some hints on the likely evolutionary routes that shaped the genetic architecture of present-day plasmids. The MOB classification, together with PBRT, has already been successfully applied in the identification of plasmids from clinical isolates (see, e.g., Valverde et al., 2009; Mata et al., 2010; Curiao et al., 2011).

In the present work, we review some conceptual aspects of the genetic constitution, diversity and dynamics of bacterial plasmids (see Genetic organization of plasmids). This first cartography will inform us about the likely constraints that limit the spread of the AbR elements carried by a given plasmid. We continue by looking at the diversity of plasmids as they appear in DNA sequence databases (see A world of plasmids). We use these data to elaborate a method for plasmid identification and phylogenetic classification (Box 1). Some of the relevant knowledge about plasmid dynamics in populations is described in Establishment module. Once the relevant plasmid groups are identified and classified, knowledge of the properties of each plasmid population should assist us in implementing effective antidissemination strategies (see Population genetics of proteobacterial AbR plasmids). More research is needed to discover a range of antidissemination drugs that will help us in this endeavor. Both in vivo (Fernandez-Lopez et al., 2005) and in vitro (Lujan et al., 2007) approaches can provide us with compounds with which to test various dissemination containment strategies and, ultimately, to propose a plan of action to control the spread of MDR plasmids to clinically relevant human pathogens.

Box 1. A PCR-amplification method using degenerate oligonucleotides for the classification of plasmids: the MOB classification system
The Gammaproteobacteria contain many of the most important bacterial human pathogens, which are easily infected by MDR plasmids. Besides, in Gammaproteobacteria, AbR gene mobility is caused primarily by conjugation (Bennett, 2008; Su et al., 2008). Thus, the MOB classification is a pertinent scheme for the classification of plasmids involved in the dissemination of AbR.
Protein families (protein sequences related by homology, that is, common ancestry and catalyzing the same biochemical reaction) conserve a core atomic structure. Within this core there are some invariant amino acids, which usually form part of the catalytic center (Orengo et al., 2003; Lesk, 2005). In relaxase families, three conserved motifs are conspicuous (Francia et al., 2004; Garcillán-Barcia et al., 2009): a first motif contains the catalytic tyrosine that cleaves the oriT in conjugal DNA processing, a second motif contains an acidic residue (glutamic or aspartic acid) that helps in activating the catalytic tyrosine and a third motif contains three histidines that coordinate a divalent metal cofactor necessary for the cleavage reaction (Guasch et al., 2003). Given the high conservation of these amino acid motifs in relaxases of the same family, degenerate oligonucleotides can be designed that are able to amplify all DNA variants coding for these amino acids (Rose et al., 1998). Thus, they are ideal to amplify sequences that conserve the motifs, but vary in DNA sequence due to a random drift of synonymous mutations.
Alvarado et al. (manuscript in preparation) have validated a set of degenerate primer pairs that can amplify >90% of all known transmissible plasmids in Gammaproteobacteria, including clusters of five out of the six MOB relaxase families: MOBF, MOBP, MOBQ, MOBH and MOBC (see Figure, Box 1). They applied this method to the analysis of several plasmid collections, being able to detect and classify many plasmids previously untypable by other methods. As specific examples, several MOBP11 mercury-resistant conjugative plasmids from a collection of isolates from marine environments (Dahlberg et al., 1997), new MOBP3, MOBQu and MOBC plasmids from a collection of Escherichia coli plasmids from urinary tract infections isolated by Ejrnaes et al. (2006) and new MOBF11, MOBP11 and MOBP12 AmpC β-lactamase-encoding plasmids from a collection of enterobacteria (Mata et al., 2010) were typed.
Schematic representation of the workflow followed in the design of semi-degenerate oligonucleotides under the CODEHOP philosophy (Rose et al., 1998). The left panel represents the MOBF11 motif I amino acid sequence alignment. The middle panel represents the same sequences back-translated to DNA. The right panel depicts a logo constructed from the aligned DNA (http://weblogo.berkeley.edu/logo.cgi). The 5′ section of the oligonucleotide, denominated ‘clamp’, is not degenerate (solid line) and reflects the consensus of the alignment. The 3′-terminal 11–14 nucleotides, a segment denominated ‘core’, are degenerate at the third position of each codon (dashed line) to cover all possible combinations of the given amino acid sequence. As a consequence, in the mixture of degenerate oligonucleotides, there is always a perfect match to a positive target DNA sequence in the core. The clamp provides the extended homology required for efficient exponential amplification of the first PCR products.
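
The back-translation step behind the degenerate 'core' can be illustrated with a short sketch. It is not the CODEHOP program and not the authors' primer-design pipeline; it only collapses, position by position, all codons of each amino acid into an IUPAC ambiguity code and reports how many distinct oligonucleotides the resulting mixture contains. The function names and the test peptide "YDH" are hypothetical and were chosen only for illustration. Note that for amino acids with six codons (Leu, Ser, Arg) this position-wise collapse also covers a few codons of other amino acids, which is one reason real designs keep the degenerate core short.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))

# IUPAC ambiguity codes, keyed by the alphabetically sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_codon(amino_acid):
    """Return (IUPAC codon, degeneracy) covering every codon of one amino acid."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    code, degeneracy = "", 1
    for position in range(3):
        # Collect the bases seen at this codon position and collapse them.
        bases = "".join(sorted({codon[position] for codon in codons}))
        code += IUPAC[bases]
        degeneracy *= len(bases)
    return code, degeneracy


def degenerate_core(peptide):
    """Back-translate a short peptide into a degenerate DNA core.

    Returns the IUPAC sequence and the number of distinct oligonucleotides
    present in the corresponding primer mixture.
    """
    sequence, total = "", 1
    for amino_acid in peptide.upper():
        code, degeneracy = degenerate_codon(amino_acid)
        sequence += code
        total *= degeneracy
    return sequence, total


if __name__ == "__main__":
    # Hypothetical three-residue motif; each residue has two codons that
    # differ only at the third position.
    print(degenerate_core("YDH"))
```

In this toy case every residue is degenerate only at the third codon position, which is the situation described for the core in the legend above; a residue such as leucine would additionally introduce ambiguity at the first position.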

Genetic organization of plasmids

According to a classical view, plasmids have a modular structure, meaning that related functions are clustered in specific segments of the DNA molecule (Thomas, 2000; Osborn & Boltner, 2002; Toussaint & Merlin, 2002; Norman et al., 2009). Usually, it is understood that each plasmid module comes from a different phylogenetic origin and that plasmids are built up by the more or less random juxtaposition of different functional modules (Osborn & Boltner, 2002; Toussaint & Merlin, 2002; Norman et al., 2009). In Fig. 1, we show a scheme of the classical plasmid modules as represented in the IncW conjugative plasmid R388 and the IncQ1 mobilizable plasmid RSF1010. The first thing to be appreciated is that a considerable part of a plasmid genome is taken up by functions related to its own survival or propagation. This is called a ‘plasmid backbone’ and has to be compared with the set of genes that confer adaptive functions to the host (the adaptive or cargo module). Conjugative plasmids require a considerable set of backbone genes, which include not only the modules devoted to propagation, but also a module involved in the establishment in new recipient cells. Mobilizable plasmids can spare most of them because they use those of helper plasmids. Backbone synteny is conserved much more than cargo segments, which vary quickly according to the selective pressures to which plasmids respond. See, for example, the typical cases of REPFII, REPN, REPH, REPP, REPW or REPA/C plasmids, which contain highly conserved backbones interspersed by indels carrying various AbR genes (Heuer et al., 2004; Sota et al., 2007; Revilla et al., 2008; Carattoli, 2009; Fricke et al., 2009; Phan et al., 2009).

Figure 1.

 Modular genetic organization of a conjugative plasmid (R388) and a mobilizable plasmid (RSF1010). The figure shows the genetic organization of both plasmids, in which genes are depicted in different colors according to the functional module to which they belong. The propagation module (coding for the genes involved in conjugation) is divided into two colors, because it contains a module for conjugative DNA processing (MOB, for plasmid mobilization) and a second one responsible for the synthesis of the type IV secretion system that constitutes the conjugation channel (MPF, for mating pair formation). Further details of the genetic constitution of these plasmids can be found in Fernandez-Lopez et al. (2006) and Revilla et al. (2008) for R388 and Meyer (2009) for RSF1010.

Each variant of a module carried by a plasmid relates to its choice of a given evolutionary strategy. We are still largely ignorant of the ‘specialties’ that correspond to each modular variant, but we have hints about some of them.

Replication module

The replication module of a plasmid basically determines the absolute copy number of the plasmid and its stability in different hosts and growth conditions. Copy number determination is important in plasmid population biology because the higher the copy number, the greater the likelihood of replication. Thus, evolution will tend to increase a plasmid's copy number so that it outcompetes other plasmids. This trend is countered by the added burden that a higher copy number imposes on host cells, as well as by other (arguable) sociobiological issues relevant to the control of plasmid copy number (Paulsson, 2002; Watve et al., 2010). It should also be noted that, in some hosts, a plasmid can be inherently unstable, but persist because of overreplication due to propagation (De Gelder et al., 2007; Heuer et al., 2007). Besides, replicons ameliorate rapidly to increase their stability when they enter new hosts (Sota et al., 2010). There are three main groups of replicators: θ-replicators, rolling-circle (RC) replicators and strand-displacement replicators (del Solar et al., 1998). Based purely on epidemiological data, plasmids in Proteobacteria are most frequently θ-replicators, while gram-positive bacteria contain a large fraction of RC replicators. The reasons for these preferences are not obvious, because RC-replicating plasmids can be found and stably replicated in the Proteobacteria (del Solar et al., 1993) and θ-replicating plasmids are abundant in Lactobacillus (Benachour et al., 1995; Asteri et al., 2011). In any case, these associations are probably due to different historical-evolutionary trajectories more than to the ability of a given plasmid to enter one or another type of bacterial cell. Not surprisingly, given the essentiality of the replicators, basic replicons are used widely for plasmid classification by PBRT, as stated in the Introduction.

Stability module

There are three main mechanisms by which plasmids ensure their stability, none of them universal. Thus, they have limited applicability for a general description or identification of plasmid types. They will perhaps be more valuable for niche-specific description, although practically nothing is known about the comparative adaptive value of each given stability system. The simplest stability mechanism is the class of multimer resolution systems, which is included in most θ-replicating plasmids (RC-replicating plasmids do not need multimer resolution systems). Many multicopy plasmids contain just a site, called cer, at which a host-encoded resolvase complex acts, specifically converting multimers into monomers (Summers & Sherratt, 1988; Hodgman et al., 1998). The need for a resolution system is due to the fact that plasmids, because they are represented at several copies per cell, can recombine, forming dimers and higher multimers. However, multimers have a higher chance of being replicated; hence, the population of plasmids will tend to form higher and higher multimers, which are increasingly unstable and are eventually lost. This is known as the dimer-catastrophe hypothesis and is the basis for the requirement of multimer resolution systems (Summers et al., 1993). Small multicopy plasmids endowed with a multimer resolution system are usually stable, so they do not need additional stability systems. However, for large plasmids (larger than 30 kb of DNA sequence), evolution has selected plasmids with a clearly lower copy number (from about 20 copies per cell typical of small multicopy plasmids to four or fewer copies per cell), most probably to compensate for the additional burden of carrying and expressing a larger DNA sequence. In low-copy-number plasmids, random assortment at cell division will result in a high frequency of plasmid loss. Thus, additional stability systems are required. These are toxin/antitoxin (TA) systems (Gerdes et al., 2005; Diago-Navarro et al., 2010) and partition systems (Gerdes et al., 2000; Velmurugan et al., 2003). TA systems kill cells that have lost the plasmid. This is due to the fact that the toxin gene produces a stable product (usually a protein) while the antitoxin gene produces an unstable product (either a protein or an RNA) required to neutralize the toxin, which disappears quickly when the coding DNA is lost. TA systems occur not only in plasmids, but also in chromosomes, and are considered to be genetic elements for DNA stabilization (Szekeres et al., 2007). Finally, partition systems are the most sophisticated stability elements in plasmids. They produce an ordered assortment of the plasmid copies in cell division, in a process analogous to chromosomal distribution in cell mitosis (Gerdes et al., 2000; Velmurugan et al., 2003). Partition systems are sometimes coupled to conjugation systems by a common regulator, needed to balance the physiological requirements of conjugation with those of partition (Guynet et al., 2011). There is much active research on the molecular mechanisms of TA systems and partition systems. The comparative advantages resulting from the carriage of different stability systems in particular plasmids are a subject of interest to plasmid population dynamics.
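
The effect of copy number on loss by random assortment can be made concrete with a small calculation. The sketch below assumes the simplest possible model, which is not taken from the cited literature: at division, each plasmid copy goes independently to either daughter cell with probability 1/2, and no partition, TA or multimer effects operate. Under that assumption, the probability that one of the two daughters receives no copy is 2 × (1/2)^n for n copies. The function name is hypothetical; the copy numbers 20 and 4 are the ones quoted in the text.

```python
def plasmid_free_daughter_probability(copies):
    """Probability that a division yields one plasmid-free daughter cell.

    Assumes purely random assortment: each of the `copies` plasmid molecules
    present at division segregates independently to either daughter with
    probability 1/2. The two events "daughter A is empty" and "daughter B is
    empty" cannot both happen when copies >= 1, so their probabilities add.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 2 * 0.5 ** copies


if __name__ == "__main__":
    for n in (20, 4, 2, 1):
        p = plasmid_free_daughter_probability(n)
        print(f"{n:>2} copies at division: P(plasmid-free daughter) = {p:.3g}")
```

With these assumptions, a plasmid present at four copies produces a plasmid-free daughter in one of every eight divisions, whereas at 20 copies the same event is expected roughly twice per million divisions, which is consistent with the statement that low-copy-number plasmids need dedicated stability systems.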

Conjugation module

There are two classes of plasmids according to their transmissibility by conjugation. Plasmids that contain a full set of conjugation genes are called conjugative. A typical example is the enterobacterial AbR IncW plasmid R388 (Fig. 1). Other plasmids contain only a minimal set of genes that allow them to be mobilized by conjugation when they coexist in the same donor cell with a conjugative plasmid. They are called mobilizable plasmids, and a typical example is the IncQ1 plasmid RSF1010 (Fig. 1). At the population level, conjugative plasmids generally have a low copy number, while mobilizable plasmids tend to have a high copy number (Watve et al., 2010). In contrast to the variety of plasmid replication systems, which appear to be phylogenetically unrelated (del Solar et al., 1998), there seems to be a single predominant mechanism for plasmid conjugation in Proteobacteria. It is based on a DNA-processing mechanism that uses relaxases belonging to the 3H protein family (Garcillán-Barcia et al., 2009). Using the relaxase sequences as an assortment criterion, the MOB classification was developed (Francia et al., 2004; Garcillán-Barcia et al., 2009; Smillie et al., 2010) (see Table 1). This comprehensive classification allows an emphasis to be placed on comparative aspects, a concept that is developed in A world of plasmids. In general, determination of the MOB type is an adequate descriptor of the entire transfer system of a plasmid and, usually, of the complete plasmid backbone.
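
As a rough illustration of how a relaxase-based PCR panel translates into a MOB assignment, the following sketch maps primer pairs to MOB subfamilies. The pairs and subfamilies are a partial transcription of the 'MOB primers' and 'Relaxase type (MOB)' columns of Table 1; the function name, the input format (a dictionary of primer pair to amplification result) and the example isolate are hypothetical and are not part of the published method.

```python
# Partial lookup transcribed from Table 1 (primer pair -> MOB subfamily).
# Only a few rows are included; in Table 1 the pair F12-fw+F1-rv is also
# listed for a plasmid whose relaxase type is given as F1 (pACICU2).
PRIMER_PAIR_TO_MOB = {
    "F11-fw+F1-rv": "MOBF11",
    "F12-fw+F1-rv": "MOBF12",
    "P11-fw+P1-rv": "MOBP11",
    "P12-fw+P1-rv": "MOBP12",
    "H11-fw+H11-rv": "MOBH11",
    "H12-fw+H12-rv": "MOBH12",
    "Q11-fw+Q11-rv": "MOBQ11",
    "C11-fw+C11-rv": "MOBC11",
}


def mob_types(pcr_results):
    """Return the sorted MOB subfamilies whose primer pair gave an amplicon.

    `pcr_results` maps a primer-pair name to True (amplicon observed) or
    False (no amplicon). Primer pairs missing from the lookup raise KeyError
    so that typos are not silently ignored.
    """
    unknown = set(pcr_results) - set(PRIMER_PAIR_TO_MOB)
    if unknown:
        raise KeyError(f"primer pairs not in lookup: {sorted(unknown)}")
    return sorted(
        {PRIMER_PAIR_TO_MOB[pair] for pair, amplified in pcr_results.items() if amplified}
    )


if __name__ == "__main__":
    # Hypothetical isolate carrying two plasmids of different MOB subfamilies.
    isolate = {
        "F11-fw+F1-rv": True,
        "P12-fw+P1-rv": True,
        "Q11-fw+Q11-rv": False,
    }
    print(mob_types(isolate))
```

An isolate can return more than one MOB type, because several plasmids frequently coexist in the same cell; the sketch therefore returns a list rather than a single label.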

Table 1.   Detection and classification of gammaproteobacterial plasmids
Reference plasmid* | Plasmid size (bp) | GenBank accession no. | Relaxase type (MOB)† | T4SS type (MPF)‡ | MOB primers§ | Inc group¶ | Replicon type (REP)∥ | Cargo genes** | Plasmid original host (host range)††
R388 | 33 913 | BR000038.1 | F11 | T | F11-fw+F1-rv | IncW | WoriV (Gotz et al., 1996); W (Carattoli et al., 2005) | dfrB2, sul1 | E. coli (E, P, A, M, L)
R46 | 50 969 | AY046276.1 | F11 | T | F11-fw+F1-rv | IncN‡‡ | NoriV (Gotz et al., 1996); N (Carattoli et al., 2005) | oxa2, aadA1, sul1, tetAR, asrABCDR | S. Typhimurium (E, P, V, A)
pWWO | 116 580 | NC_003350.1 | F11 | T | F11-fw+F1-rv | IncP-9 | P9rep (Greated & Thomas, 1999) | xyl, merRBΔ§§ | Pseudomonas putida (P, E)
F | 99 159 | AP001918.1 | F12 | F | F12-fw+F1-rv | IncFI | FIA+FIB+FIC+FrepB (Carattoli et al., 2005) |  | Escherichia coli (E)
R100 | 94 281 | NC_002134.1 | F12 | F | F12-fw+F1-rv | IncFII | FrepB (Carattoli et al., 2005) | merRTPCAD, yadA, sul1, aadA1, cat, tetAR | Shigella flexneri (E)
pSU316 | ND | X55894.1+M28097.1+M26937.1 (partial sequence) | F12 | F | F12-fw+F1-rv | IncFIII/IncFIV | FrepB (Carattoli et al., 2005) | hly¶¶ | Escherichia coli (E)
pED208 | ∼90 000 | AF411480.1 (partial sequence) | F12 | F | F12-fw+F1-rv | IncFV |  |  | S. Typhi (E)
pIP71 | 85 825 | ftp://ftp.sanger.ac.uk/pub/pathogens/Plasmids/pIP71.cons | F12 | F | F12-fw+F1-rv | Inc9 | FrepB (Carattoli et al., 2005) | bla, cat, aad, sul, tet¶¶ | Escherichia coli (E)
pSLT | 93 939 | NC_003277.1 | F12 | F | F12-fw+F1-rv |  | FIIS (Carattoli et al., 2005) | srgAB, pefBACDI, spvRABCD | S. Typhimurium (E)
pMT | 99 286 | CP001610.1 | F12 | F | F12-fw+F1-rv |  | FIIγ (Villa et al., 2010) | caf1RMA | Yersinia pestis (E)
pKPN4 | 107 576 | NC_009650.1 | F12 | F | F12-fw+F1-rv |  | FIIK (Villa et al., 2010) | blaSHV-12, aac(6′), aadA, blaTEM, merACDE, vagC | Klebsiella pneumoniae (E)
pACICU2 | 64 366 | NC_010606.1 | F1 | F | F12-fw+F1-rv |  | GR6 (Bertini et al., 2010) |  | Acinetobacter baumannii (M)
RP4 | 60 099 | NC_001621.1 | P11 | T | P11-fw+P1-rv | IncP-1α | PtrfA1+PtrfA2 (Gotz et al., 1996); P (Carattoli et al., 2005) | tetAR, aphA, blaTEM, klaB | Pseudomonas aeruginosa (P, E, V)
R751 | 53 423 | NC_001735.4 | P11 | T | P11-fw+P1-rv | IncP-1β | PtrfA2 (Gotz et al., 1996) | dhfrIIC | Enterobacter aerogenes (E, P)
R64 | 120 826 | NC_005014.1 | P12 | I | P12-fw+P1-rv | IncI-1α∥∥ | I1-Iγ (Carattoli et al., 2005) | tetACDR, strAB, asrA1A2BCDHR2 | Salmonella enterica (E)
R387 | 87 645 | ftp://ftp.sanger.ac.uk/pub/pathogens/Plasmids/R387.cons | P12 | I | P12-fw+P1-rv | IncK | K+B/O*** (Carattoli et al., 2005) | catIII, str¶¶ | Shigella flexneri (E)
TP113 | 96 471 | ftp://ftp.sanger.ac.uk/pub/pathogens/Plasmids/TP113.cons | P12 | I | P12-fw+P1-rv | IncB/O | B/O (Carattoli et al., 2005) | aph, mer¶¶ | S. Typhimurium (E)
pCTX-M3 | 89 468 | NC_004464.2 | P13 | I | P131-fw+P1-rv | IncL/M | L/M (Carattoli et al., 2005) | blaCTX-M3, blaTEM1, aacC2, dhfrA12, aadA12, armA, sul1, mel, mph2 | Citrobacter freundii (E)
Rms149 | 57 121 | NC_007100.1 | P14 |  | P14-fw+P1-rv | IncG/IncP6 |  | sul1, aadA5, aac(3)-I, blaAhlyBDHIJ | Pseudomonas aeruginosa (P, E)
pTC-F14 | 14 155 | NC_004734.1 | P14 |  | P14-fw+P1-rv | IncQ2†††,‡‡‡ |  |  | Acidithiobacillus caldus (Ac, E)
R6K | 39 872 | ftp://ftp.sanger.ac.uk/pub/pathogens/Plasmids/R6K.dbs | P3 | T | P3-fw+P3-rv | IncX2 | X (Carattoli et al., 2005) | bla, str¶¶ | Escherichia coli (E)
pOLA52 | 51 602 | NC_010378.1 | P3 | T | P3-fw+P3-rv | IncX1 | - | blaTEM1, blmS, oqxAB, mrkABCDF | Escherichia coli (E)
pFBAOT6 | 84 749 | NC_006143.1 | P41 | T | P41-fw+P41-rv | IncU | U (Garcia-Fernandez et al., 2009) | sul1, aadA2, tetAR | Aeromonas caviae (A, R, P)
pHHV216§§§ | 58 274 | FJ012880.1 | P42 | T |  |  | - | strAB, tetHR, floR, sul2, phcK | Uncultured bacterium (E, M)
ColE1 | 6646 | J01566.1 | P5 |  | P51-fw+P5-rv |  | ColE (Garcia-Fernandez et al., 2009) |  | Escherichia coli (E)
pTPqnrS-1a | 10 066 | AM746977.1 | P5 |  | P51-fw+P5-rv |  | ColE+ColETp (Garcia-Fernandez et al., 2009) | qnrS1 | S. Typhimurium (E)
p9555 | 5673 | AY359464.1 | P5 |  | P52-fw+P5-rv |  |  | tetL | Actinobacillus pleuropneumoniae (Pt, E)
pAsal1 | 6371 | NC_004338.1 | P5 |  | P53-fw+P53-rv |  |  |  | Aeromonas salmonicida (A, E)
R721 | 75 582 | NC_002525.1 | P6 | T |  | IncI2 |  | dhfrI, sat, aadA | S. Typhi (E)
pQ7 | 9042 | NC_014356.1 | Pu |  |  | IncQ3 |  | blaGES-1, blaOXA/aac(6′)-Ib | Escherichia coli (E)
R27 | 180 461 | NC_002305.1 | H11 | F | H11-fw+H11-rv | IncHI1¶¶¶ | HI1 (Carattoli et al., 2005) | tetARCD | S. Typhi (E)
R478 | 274 762 | NC_005211.1 | H11 | F | H11-fw+H11-rv | IncHI2∥∥∥ | HI2 (Carattoli et al., 2005) | tetDCAR, cat, aphA, merRTPCADE, silCSR, copE2ABCDRSE1, terY3Y2XY1W | Serratia marcescens (E)
pCAR1 | 199 035 | NC_004444.1 | H11 | F | H11-fw+H11-rv | IncP-7 |  | carAaBaBbcarCAcAdDFE, antRABC | Pseudomonas resinovorans (P)
R391**** | 88 532 | AY090559.1 | H12 | F | H12-fw+H12-rv | IncJ**** |  | aph, merRTPCA | Providencia rettgeri (E)
pSN254 | 176 473 | NC_009140.1 | H12 | F | H12-fw+H12-rv | IncA/C | A/C (Carattoli et al., 2005) | floR, tetAR, strAB, sul1, sul2, aacC, aadA, blaCMY-2, sugE, mer | Salmonella enterica (E, A)
Rts1 | 217 182 | NC_003905.1 | H12 | F | H12-fw+H12-rv | IncT | T (Carattoli et al., 2005) | cmlR, mphB, aph, pheA, arsA, klaAB | Proteus vulgaris (E)
pKLC102 | 103 532 | AY257538.1 | H2 | G | H2-fw+H2-rv |  |  | vagC | Pseudomonas aeruginosa (P)
RSF1010 | 8684 | NC_001740.1 | Q11 |  | Q11-fw+Q11-rv | IncQ1††† | QoriV+QrepB (Gotz et al., 1996) | strAB, sul1 | Escherichia coli (E, C, Ca, M, P, Pt, V)
p11745 | 5486 | DQ176855.1 | Q12 |  | Q12-fw+Q12-rv |  |  | tetB | Actinobacillus pleuropneumoniae (Pt, E)
pIGWZ12 | 4072 | DQ311641.1 | Qu |  | Qu-fw+Qu-rv |  |  |  | Escherichia coli (E)
p1ABAYE | 5644 | NC_010401 | Qu |  | Qu-fw+Qu-rv |  | GR11 (Bertini et al., 2010) |  | Acinetobacter baumannii (M)
p2ABSDF | 25 014 | NC_010396 | Qu |  |  |  | GR18 (Bertini et al., 2010) |  | Acinetobacter baumannii (M)
p3ABSDF | 24 922 | NC_010398 | Qu |  |  |  | GR7, GR9, GR15 (Bertini et al., 2010) | katE | Acinetobacter baumannii (M)
CloDF13 | 9957 | X04466.1 | C11 | - | C11-fw+C11-rv |  |  |  | Enterobacter cloacae (E)
pYptb32953 | 27 702 | NC_006154.1 | C12 | T | C12-fw+C12-rv |  |  |  | Yersinia pseudotuberculosis (E)
pIGMS31 | 2520 | AY543072.1 | Vu |  |  |  |  |  | Klebsiella pneumoniae (E)
p135040 | NA | GQ861437 (partial sequence) | Vu | NA |  |  | GR19 (Bertini et al., 2010) | bla-oxa143 | Acinetobacter baumannii (M)
pK245 | 98 264 | DQ449578.1 |  |  |  | IncR†††† | R (Garcia-Fernandez et al., 2009) | tetDR, sul2, strAB, cat2, blaSHV2A, aacA2, blaTEM, qnrS | Klebsiella pneumoniae (E)
p1ABSDF | 6106 | NC_010395 |  |  |  |  | GR1, GR12 (Bertini et al., 2010) |  | Acinetobacter baumannii (M)
pACICU1 | 28 279 | NC_010605 |  |  |  |  | GR2, GR10 (Bertini et al., 2010) | bla-oxa58 | Acinetobacter baumannii (M)
p3ABAYE | 94 413 | NC_010404 |  |  |  |  | GR13 (Bertini et al., 2010) | nemA | Acinetobacter baumannii (M)
p4ABAYE | 2726 | NC_010403 |  |  |  |  | GR14 (Bertini et al., 2010) |  | Acinetobacter baumannii (M)
pAB1 | 13 408 | NC_009083 |  |  |  |  | GR17 (Bertini et al., 2010) |  | Acinetobacter baumannii (M)

  • * Plasmid representatives of each gammaproteobacterial MOB subfamily are listed. More than one reference plasmid per MOB subfamily was included when some members belong to different Inc or REP types or when more than one MOB primer pair was used for subfamily identification. Additional reference plasmids lacking relaxases have also been included when PBRT primers were available for their identification.

  • † MOB subfamily to which the plasmid relaxase belongs according to Garcillán-Barcia et al. (2009). A dash indicates the absence of relaxase in the corresponding plasmid.

  • ‡ MPF type according to Smillie et al. (2010). Four groups were defined: MPFT, MPFF, MPFI and MPFG. A dash indicates the absence of type IV secretion system (T4SS) in the corresponding plasmid.

  • § MOB primers used to amplify the corresponding relaxase class. A dash indicates that no primers were developed to identify the plasmid, either because of the scarce number of members (e.g. R721, pIGMS31) or the lack of a relaxase gene (e.g. pK245).

  • ¶ When incompatibility was experimentally tested, the Inc group is provided.

  • ∥ PBRT primers (Gotz et al., 1996; Greated & Thomas, 1999; Carattoli et al., 2005; Garcia-Fernandez et al., 2009; Bertini et al., 2010; Villa et al., 2010). A dash indicates the absence of primers for detecting the corresponding plasmid.

  • ** For each reference plasmid, genes encoding antibiotic resistance (underlined italics), virulence (bold italics), resistance to heavy metals (normal type) and degradation of xenobiotic compounds (italics) are indicated. Obviously, these ‘cargo’ genes are highly variable and other members of each group carry different determinants, as thoroughly reported, for instance, by Cattoir et al. (2008), Carattoli (2009), Chen et al. (2010), Andrade et al. (2011).

  • †† The bacterial species where the plasmid was first isolated is shown. Besides, the taxonomic families of Gammaproteobacteria where either the reference plasmid or other plasmids of the same Inc/REP group have been found (A, Aeromonadaceae; Ac, Acidithiobacillaceae; C, Cardiobacteriaceae; Ca, Caulobacteraceae; E, Enterobacteriaceae; L, Legionellaceae; M, Moraxellaceae; P, Pseudomonadaceae; Pt, Pasteurellaceae; V, Vibrionaceae).

  • ‡‡ pMLST test available for IncN plasmid screening (Garcia-Fernandez et al., 2011; Zong et al., 2011) (http://pubmlst.org/plasmid/).

  • §§ Truncated mer operon.

  • ¶¶ When the complete sequence of the plasmid is not available (pSU316) or it is not annotated yet (pIP71, R387, TP113, R6K), the designation of the cargo genes is tentative, based on the phenotypic traits described in the literature.

  • ∥∥ pMLST test available for IncI1 plasmid screening (Garcia-Fernandez et al., 2008) (http://pubmlst.org/plasmid/).

  • *** B/O primers recognize IncK and IncB/O plasmids (Carattoli et al., 2005).

  • ††† The IncQ1 plasmid RSF1010 exerted incompatibility against the IncQ2 plasmid pTC-F14 (Gardner & Rawlings, 2004). This effect was not shown against other IncQ2 plasmids, such as pTF-FC2.

  • ‡‡‡ The IncQ2 plasmids pTC-F14 and pTF-FC2 are compatible (Gardner & Rawlings, 2004).

  • §§§ A PCR-amplification test based on gene traN of plasmid pHHV216 was used in plasmid identification from agricultural soils (Heuer et al., 2009).

  • ¶¶¶ pMLST test available for IncHI1 plasmid screening (Phan et al., 2009).

  • ∥∥∥ Plasmid double locus sequence typing test available for IncHI2 plasmid screening (Garcia-Fernandez & Carattoli, 2010) (http://pubmlst.org/plasmid/).

  • **** R391 is an ICE formerly considered an IncJ plasmid (Coetzee et al., 1972; Nugent, 1981). A PCR has been developed to detect the integrase gene of IncJ ICEs (McGrath et al., 2006).

  • †††† IncR is not a group properly defined by incompatibility testing, but the PBRT notation of pK245 replicon (Garcia-Fernandez et al., 2009).

  • NA, data not available, due to the lack of complete nucleotide sequence; ND, not determined.

Establishment module

As mentioned before, plasmid backbones of conjugative plasmids contain more genes than those required for replication, stability and propagation. In fact, most conjugative plasmids, even those as small as the REPW-MOBF11 plasmids, seem to conserve an additional DNA region of about 10–20 kb (Fernandez-Lopez et al., 2006), blue-colored in Fig. 1, which contains (among others) genes related to DNA transactions in the recipient cell. This region is not essential for maintenance under laboratory conditions, but seems to be essential for survival in nature, because all plasmids contain variants of it. In general, this part of the plasmid is located in the so-called conjugal leading region (the first to enter recipient cells in conjugation) and contains a set of genes frequently shared by many different plasmids. Examples of such genes are those coding for single-stranded binding proteins, antirestriction systems, etc. These genes are supposed to be important when a plasmid enters a new genetic background, and are thus called establishment genes. A classical example is the primase gene sog of REPI1-MOBP12 plasmids, which is only partially required for conjugation between Escherichia coli cells (Chatfield et al., 1982), but is required to expand the recipient host range to Salmonella and other enterobacteria (Lanka & Barth, 1981). Mutations in genes stbABC located in the leading region of REPN-MOBF11 plasmid pKM101 decreased plasmid stability (Paterson et al., 1999). Homologs of gene ardA are present in the leading region of REPN-MOBF11, REPFrep-MOBF12 and REPI1-MOBP12 plasmids (Chilley & Wilkins, 1995). ArdA acts specifically against type I restriction enzymes, protecting the unmodified plasmid DNA once it has entered the recipient cell (Delver et al., 1991; Read et al., 1992). Other functional antirestriction genes, klcA/ardB, are present in REPN-MOBF11 and REPP-MOBP11 plasmids (Serfiotis-Mitsa et al., 2010). 
The leading region of MOBF12 and MOBP12 plasmids contains gene psiB (named after plasmid SOS inhibition) (Bagdasarian et al., 1980; Golub et al., 1988). It was shown to be transiently expressed in transconjugant cells (Bagdasarian et al., 1992), suppressing the potentially deleterious SOS response produced by the transferred single-stranded DNA (ssDNA) through binding to RecA and the consequent inhibition of all its activities (Bailone et al., 1988; Petrova et al., 2009). Another gene that maps in the leading region of many conjugative plasmids, ssb, encodes a ssDNA-binding protein (Golub & Low, 1985, 1986) that suppressed the UV and temperature sensitivity of chromosomal ssb-1 mutants when tra genes were derepressed (Golub & Low, 1986). Both ssb and psiB are induced in recipient cells following conjugation and therefore help in the installation of the incoming DNA (Jones et al., 1992). ssb mutants exhibited the same conjugative and stability properties as the wild-type strain, but a marked plasmid-mediated SOS inhibition phenotype (Howland et al., 1989). These and other references show that there is patchy information about some of the genes contained in the establishment regions of different plasmids. However, we are far from having a complete picture of the importance of these prevalent genes. This is an area in which research should be conducted to clarify an important issue of plasmid physiology.

Adaptive module

The adaptive module is the most variable part of a plasmid and changes quickly compared with the plasmid backbone. Analysis of the adaptive modules of R plasmids made it clear that plasmids cluster in groups that share a conserved backbone, into which different platforms containing AbR genes insert (Schluter et al., 2007; Welch et al., 2007; Revilla et al., 2008; Phan et al., 2009; Carattoli et al., 2010). Special importance has been given to integrons, one of the most active AbR gene capture platforms (Mazel, 2006). Interestingly, integron integrases were shown to be upregulated during conjugative transfer, increasing gene cassette rearrangements (Baharoglu et al., 2010). It is important to emphasize that a wide variety of plasmids with almost identical backbones, but containing a number of indels in different permissive spots (sometimes even the same site, then called a hot-spot), is frequently observed (Heuer et al., 2004; Sota et al., 2007; Revilla et al., 2008; Fricke et al., 2009; Phan et al., 2009; Carattoli et al., 2010). However, the carriage of cargo genes is not without a cost, because they affect plasmid fitness. For instance, loss of plasmid-borne AbR was observed repeatedly in experimental evolution studies, resulting in plasmid-containing populations that carried deletions of the AbR genes and showed increased fitness (Godwin & Slater, 1979; Dahlberg & Chao, 2003). Besides genes with an obvious selective value (such as AbR in the presence of antibiotics), what other adaptive traits are carried by plasmids? Many genes carried by plasmids code for traits involved in bacterial sociality, such as the production of public goods (which benefit a cell's neighbors) or bacteriocins (which harm a cell's neighbors) (Rankin et al., 2010). As could perhaps be expected, little research deals with the causes and consequences of the carriage of this kind of gene in specific plasmid types.

A world of plasmids

Because of bacterial sequencing projects, or specific plasmid sequencing projects, we presently know the complete DNA sequence of more than 2000 plasmids. Many of these sequences have already been subjected to various types of analysis. From them, we can infer some global characteristics of the genetic constitution of plasmids. For instance, Rankin et al. (2010) analyzed what types of gene are most likely to be found on plasmids and why. Because plasmids are autonomous replicons, selection acts on them in directions not necessarily optimal for their hosts. Thus, plasmid genes can be beneficial or harmful to the carrying host. Moreover, they can help or harm other bacteria in the environment of the host. For instance, genes involved in biofilm formation or genes coding for secreted hydrolases help other bacteria. On the other hand, genes coding for bacteriocins harm other bacteria. These genes that affect other bacteria in the population are called ‘public genes’, as opposed to ‘private genes’, which only affect the fitness of the carrying host, but not of other bacteria (for instance AbR genes). The interplay between these types of genes is different if they are located in the chromosome, where they generally cannot move, or in plasmids, where they can overreplicate the host. It has been found that plasmids contain more ‘public genes’ than do chromosomes (Nogueira et al., 2009). This example is mentioned here just to emphasize that we should expect to find specific types of genes in plasmids, sometimes for reasons that are not immediately obvious. Following this line of reasoning, there have been some attempts to characterize sets of genes as ‘typical’ of plasmids. A pure bioinformatic approach used phylogenetic profiling of completely sequenced plasmids and produced good results in the discovery of protein-coding backbone components when considering relatively closely related plasmids (Brilli et al., 2008). 
Similarly, an analysis of the proteins coded by 503 plasmids contained in the ACLAME database (http://aclame.ulb.ac.be) (Leplae et al., 2006) allowed a network representation of the relationships between plasmids, which is relevant for plasmid classification and phylogenetic analysis. In general, the explicative power of these attempts suffered from the lack of a hierarchy in the genes that form the obtained networks. In other words, it is difficult to assign a backbone of genes in the absence of an obvious core genome. To overcome this difficulty, we proposed to use plasmid relaxases as the core plasmid gene, that is, a sort of ‘16S-RNA clock’ to which the evolution of other plasmid genes could be anchored. By implementing this simple change in the point of view, it was possible to discern the phylogenetic relationships among plasmids far more easily (Garcillán-Barcia et al., 2009). As an example, an analysis of the extended IncW backbone allowed us to perceive some general trends in plasmid evolution (Fernandez-Lopez et al., 2006).

Following this idea, we carried out a bioinformatic analysis of plasmid mobility using the 1730 plasmids available in the GenBank database at the time of writing (Smillie et al., 2010). Basically, we established a computational protocol to identify and classify conjugation and mobilization genetic modules. The results of this analysis showed that plasmid diversity is as large as that of bacterial chromosomes (in the sense of occupation of the sequence space by backbone genes). Furthermore, comparative sequence analysis indicated that plasmids retain their backbone structure much better in evolution than bacteriophages, which show an extreme modular, even combinatorial, structure (Lima-Mendez et al., 2007). An important and perhaps surprising finding of the analysis of global plasmid size distribution was its multimodality (Smillie et al., 2010): the distribution shows several clear maxima instead of the expected loosely fitting normal distribution (Fig. 2). The data are best interpreted if we think of plasmids as divided into three classes: conjugative, mobilizable and nonconjugative. As expected from the concepts put forward in the previous section, mobilizable plasmids are generally small, with a median size of about 5 kb. This is enough genetic content to code for a basic replication module plus one to three adaptive genes, which we therefore propose as the basic trend of mobilizable plasmids. However, there is a second broad and flat peak that includes mobilizable plasmids from 50 to 300 kb. This peak is difficult to interpret, but suggests that a significant fraction of plasmids are selected by evolution to be dependent on alien MPF systems in order to gain additional protein-coding sequence space. Two examples of this kind of plasmid, among many others, are the 57 121 bp Pseudomonas aeruginosa R-plasmid Rms149 (GenBank accession no. NC_007100) and the 65 158 bp Acidithiobacillus caldus plasmid pTcM1 (NC_010600), which confers resistance to arsenic. 
Alternatively, the loss of transfer capacity can be due to the deletion of conjugative genes [as is obvious in the sequence analysis of, for instance, pO157 (NC_007414), pETEC_73 (NC_009788), pSS_046 (NC_007385), pAsa4 (NC_009349), etc.], a situation that can alleviate the burden imposed on the host cell by expression of the conjugative machinery. In conjugative plasmids, which show a mean size of 100 kb, the increased size seems to be a necessity for accommodating the MPF and establishment modules (a minimum of roughly 30 kb), as explained in Genetic organization of plasmids. Besides transmissible plasmids, approximately 50% of the proteobacterial plasmids in DNA sequence databases carry no relaxase gene and are therefore assumed to be nontransmissible by conjugation. However, a fraction of these could still be transferred by conduction. Conduction is a mechanism of transfer by which a nonmobilizable plasmid forms a cointegrate with a transmissible plasmid, the cointegrate is transferred to the recipient and the plasmid reforms there by resolution of the cointegrate (Clark & Warren, 1979). The natural significance of this process, which is well known in the laboratory, has not been analyzed. Nontransmissible plasmids also show a multimodal distribution, with maxima at about 4, 35 and 400 kb. We interpret these maxima as the sizes that are optimal for other gene transfer mechanisms. The first maximum, at about 4 kb, could be related to transformation, which shows a clear dependence on size (Lorenz & Wackernagel, 1994), so that the smaller the DNA sequence, the higher the transformation frequency. The second maximum coincides with the size of lambda-like phages, which are very abundant and have an average size of 40–50 kb. Because this size limits the amount of DNA that can be encapsidated in the phages (Fineran et al., 2009), it also places a limit on the size of transducing particles. 
Finally, very large plasmids (sizes over 300 kb) are probably transfer-deficient remnants of conjugative plasmids that actively accumulated chromosomal genes and are thus in the process of converting to supernumerary chromosomes. Specifically, 90% of plasmids larger than 400 kb contain genes coding for essential proteins and show a higher coding density than smaller ones (Smillie et al., 2010). Besides, very large plasmids are preferentially hosted by prokaryotes with larger chromosomes (Slater et al., 2008) that, in turn, tend to reside in more complex environments (Bentley & Parkhill, 2004; Raes et al., 2007).
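
The three mobility classes discussed above can be summarized as a simple decision rule. The following Python sketch is illustrative only: the `Plasmid` record and its flags `has_relaxase` and `has_mpf` are hypothetical names, and the sequence searches used by Smillie et al. (2010) to detect relaxases and MPF systems are not reproduced here. The flag values in the usage example are assumptions made for illustration, apart from the two cases stated in the text (Rms149 is mobilizable; pK245 contains no relaxase).

```python
from dataclasses import dataclass


@dataclass
class Plasmid:
    """Hypothetical record; the flags would come from upstream sequence searches."""
    name: str
    size_bp: int
    has_relaxase: bool  # a relaxase (MOB) gene was detected
    has_mpf: bool       # a complete mating-pair formation system was detected


def mobility_class(p: Plasmid) -> str:
    """Assign one of the three mobility classes used in the text."""
    if not p.has_relaxase:
        # No relaxase: assumed nontransmissible by conjugation.
        return "nontransmissible"
    # A relaxase plus its own MPF system makes the plasmid self-transmissible;
    # a relaxase alone depends on the MPF system of a helper plasmid.
    return "conjugative" if p.has_mpf else "mobilizable"


examples = [
    Plasmid("Rms149", 57121, has_relaxase=True, has_mpf=False),
    Plasmid("pK245", 98264, has_relaxase=False, has_mpf=False),
    Plasmid("hypothetical_conjugative", 100000, has_relaxase=True, has_mpf=True),
]
for p in examples:
    print(p.name, mobility_class(p))
```

Plasmids that have lost transfer genes by deletion, or that move by conduction, are not distinguished by this rule; it only encodes the relaxase and MPF criteria.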

Figure 2.

 Mobility of plasmids according to their size. Distribution of conjugative (i.e. self-transmissible by conjugation), mobilizable (i.e. transmissible by conjugation only in the presence of a helper conjugative plasmid) and nontransmissible plasmids, according to their size. Curves were created from a polynomial interpolation of the histograms of each class. The figure is an update from Smillie et al. (2010), using the database as of October 2010.

Another general trend found by bioinformatic analysis was that plasmids show adaptation to a preferred bacterial host, as indicated by the amelioration of their di- and trinucleotide frequencies (Campbell et al., 1999; Suzuki et al., 2010), even when they could potentially transfer to distantly related bacteria (Koksharova & Wolk, 2002). Only some plasmids, like REPW and REPQ1, appear to change host so frequently that they do not show signs of amelioration to any sequenced bacterial genome. Phylogenies of conjugative VirB4-like and T4CP-like proteins also showed that most plasmid classes were circumscribed to relatively narrow bacterial taxonomic clusters (Smillie et al., 2010), suggesting reduced plasmid mobility between phyla. It can be supposed that different plasmid backbones carry different strategies for adaptation. Thus, many evolutionary strategies can exist in plasmids, which are engraved in plasmid sequences by the inheritance of specific sets of genes. We know almost nothing of the relevance of many of the genes contained in plasmid backbones, as discussed in Genetic organization of plasmids. The existence of a functional specialization is illustrated, for instance, by the relationship between plasmid size and MOB type (Fig. 3), which is based on an analysis of 257 plasmids from Gammaproteobacteria. As can be seen in the figure, in which the bimodal distribution of plasmid sizes is obvious, certain MOB types include only large plasmids while others are typical of small plasmids. This result has to be interpreted as a specialization of each MOB type for certain genome architectures. Thus, MOBF and MOBH plasmids are usually large, implying a strategy of more extended and perhaps more sophisticated backbones. This can perhaps be related to the fact that those plasmids can conjugate in a liquid medium, an additional complication that brings with it new sets of genes (e.g. those encoding mating-pair stabilization proteins). On the other hand, MOBQ plasmids are small, with almost no exception. MOBP plasmids are distributed across a large range of sizes, suggesting a versatile and successful genetic constitution. Although there are few MOBV plasmids in Proteobacteria, these few follow the small size characteristic of their relatives in Firmicutes. This differential distribution is certainly nonrandom, although we are far from having a mechanistic explanation for it. Clearly, experiments in which different modular organizations are compared will shed some light on these intriguing plasmid properties.
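
To make the notion of amelioration more concrete, the sketch below compares the dinucleotide composition of two sequences. It is a simplified, single-strand odds-ratio measure and is not claimed to be the exact statistic used in the cited studies; the function names are hypothetical, and the input is assumed to contain only the letters A, C, G and T.

```python
from collections import Counter
from itertools import product


def dinucleotide_signature(seq: str) -> dict:
    """Observed/expected ratio for each of the 16 dinucleotides.

    Expected frequency is the product of the two mononucleotide frequencies.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    signature = {}
    for a, b in product("ACGT", repeat=2):
        fa, fb = mono[a] / n, mono[b] / n
        fab = di[a + b] / (n - 1)
        # A base that is absent gives an undefined ratio; report 0.0 instead.
        signature[a + b] = fab / (fa * fb) if fa and fb else 0.0
    return signature


def signature_distance(seq1: str, seq2: str) -> float:
    """Mean absolute difference between two dinucleotide signatures."""
    s1 = dinucleotide_signature(seq1)
    s2 = dinucleotide_signature(seq2)
    return sum(abs(s1[k] - s2[k]) for k in s1) / 16
```

Under this simplified measure, a plasmid that has ameliorated to its host would be expected to show a small `signature_distance` to the host chromosome, whereas a plasmid that changes host frequently would not be expected to be especially close to any one genome.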

Figure 3.

 Assortment of 261 relaxases placed in 257 gammaproteobacterial plasmids according to plasmid size and MOB type. Each MOB type is denoted by a different color, as shown in the color code at the right. The horizontal axis distributes plasmids according to the size windows shown. The vertical axis denotes the number of relaxases in each size window.

In summary, although the theory that plasmids are formed by the accretion of functional modules is a well-accepted one in plasmid biology, the data presented in this section demonstrate that module shuffling is a slow process, which is ‘filtered’ by selection. By this, we mean that, although there are infinite ways in which plasmids can exchange modules and produce all types of hybrids in the laboratory, these processes seem to occur at a slow pace in nature. Out of the genetic melting pot, specific plasmid backbones emerge that seem to be reasonably stable over time and account for a large proportion of the elements that can be extracted from a given ecosystem. As a rough guide, half of all gammaproteobacterial plasmids are transmissible by conjugation (either conjugative or mobilizable), while the remaining half are not (Smillie et al., 2010). The latter are supposed to propagate by transduction, transformation or conduction. The abundance of transposons and insertion sequences in nontransmissible plasmids argues in favor of the importance of conduction. The relative importance of conduction in plasmid transmission should be analyzed in more detail.

Population genetics of proteobacterial AbR plasmids

For many MDR pathogens, resistance is mediated by the acquisition of genes by lateral gene transfer (LGT). In these cases, resistance does not usually appear in the treated human (or animal) host. Rather, the causative microbial agent or genetic platform is acquired from the community (Lipsitch & Samore, 2002). This was recently confirmed, for example, by the revealing work of Sommer et al. (2009), which showed that most AbR genes identified in the human gut by culture-independent methods were clearly different from known AbR genes. By contrast, nearly half of the AbR genes identified in cultured aerobic gut isolates (which represent roughly only 1% of the gut microbiome) were identical to AbR genes harbored by major pathogens. Thus, the indigenous gut microbial communities and the population of hosts for AbR gene platforms are largely separate entities, with the corollary that AbR genes in human pathogens come from environmental reservoirs. If this is the general case, treating patients with antibiotics will result in further selection and dissemination of the responsible MDR organism (Lipsitch & Samore, 2002). If AbR genes and their platforms are acquired from community reservoirs, these reservoirs, and the routes by which the genes travel down to the final human pathogen that causes an infection, should be found. An in silico analysis (Beiko et al., 2005) was used to identify some highways by which bacteria exchange genetic information, but little is known about the experimental validation of presumed routes. For instance, conjugation in soil is enhanced in the rhizosphere of plants (Smit et al., 1998), while conjugation in liquid media is enhanced by the presence of protozoa in the medium (McCuddin et al., 2006).

Once an AbR-encoding plasmid has been stabilized in a given host, arresting the use of the antibiotic becomes ineffective as a strategy to control AbR spread, as demonstrated for apramycin- (Yates et al., 2006) and trimethoprim-resistance plasmids (Sundqvist et al., 2009; Brolund et al., 2010). In fact, when the cost of resistance is low, the time required for sensitive populations to displace AbR populations after ending drug treatment may be long, as shown by mathematical models and experimental evolution studies carried out on plasmid pB10 (De Gelder et al., 2004). Even when only a small fraction of the resistant population remains in the environment, reintroduction of the antibiotic could cause the resistant population to quickly reverse its previous decline, as predicted by both theoretical and modeling approaches (Levin et al., 1997; Austin et al., 1999; Heinemann et al., 2000).

In more practical terms, experimental evolution studies shed light on the mechanisms that explain the persistence of plasmids in bacterial populations (Lenski, 1997). In those experiments, a plasmid-containing host is propagated for several generations without selective pressure (in media that do not select for the plasmid-encoded trait). The stability of the plasmid through generations is checked by replica plating in selective and nonselective media. To test for the burden imposed by the plasmid on the host, a competition experiment between the plasmid-free and the plasmid-bearing host is implemented, starting a co-culture under nonselective conditions with the same amount of both subpopulations. The number of cells containing and lacking the plasmid is checked by replica plating at controlled intervals. If there is no difference in fitness between the competing strains, the selection cost or burden due to plasmid carriage is 0. If the plasmid-free subpopulation overgrows, it can be said that the plasmid imposes a cost on the host. Overgrowth of the plasmid-containing subpopulation means an increase in host fitness due to plasmid carriage. Fitness cost experiments that include the original strain carrying the evolved plasmid, or the evolved host containing the ancestral plasmid, allow researchers to infer whether the genetic changes leading to a burden decrease occurred in the plasmid or in the host chromosome.
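
The outcome of such a competition experiment is often summarized as a selection coefficient per generation. The sketch below uses one common formulation, the change in the natural logarithm of the ratio of plasmid-bearing to plasmid-free cells divided by the number of generations; it is offered as an assumption about how the counts might be analyzed, not as the method of any of the cited studies, and the function and parameter names are hypothetical.

```python
import math


def selection_coefficient(bearing_start: float, free_start: float,
                          bearing_end: float, free_end: float,
                          generations: float) -> float:
    """Per-generation selection coefficient of the plasmid-bearing strain.

    s = ln(R_end / R_start) / generations, with R = plasmid-bearing / plasmid-free.
    s == 0: no detectable burden; s < 0: the plasmid imposes a cost;
    s > 0: plasmid carriage increases host fitness.
    """
    ratio_start = bearing_start / free_start
    ratio_end = bearing_end / free_end
    return math.log(ratio_end / ratio_start) / generations


# Illustrative colony counts (not data from the source): a 1:1 co-culture in
# which the plasmid-free subpopulation overgrows during 10 generations.
s = selection_coefficient(500, 500, 300, 700, generations=10)
print("plasmid imposes a cost" if s < 0 else "no cost detected")
```

The three qualitative outcomes described in the text correspond to the sign of `s`: zero, negative or positive.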

Plasmids such as R1 or RP4 were shown to impose an initial burden on ‘naïve’ E. coli cells. However, after several hundred generations in batch culture, the plasmids were stable and the cost was reduced through genetic mutation, both in the plasmids and in the bacterial chromosome. In fact, the evolved plasmids no longer imposed a cost on their host when transferred to the ‘naïve’ ancestral E. coli. In parallel, the evolved strain exhibited a lowered cost for carrying the ancestral plasmids (Dahlberg & Chao, 2003; Dionisio et al., 2005). These results suggest that, even in the absence of selection, a conjugative plasmid would remain in the population.

Fitness gains are initially rapid in constant environments, but tend to decline over time (Elena & Lenski, 2003). Sporadic selection for plasmid-encoded genes, typical in heterogeneous environments, seemed to be a determinant factor for plasmid persistence (Eberhard, 1990; Turner et al., 1998). Periods of high plasmid loss alternate with periods in which the relative frequency of segregants remains unchanged, because plasmid cost could be counterbalanced by environmental fluctuations (Ponciano et al., 2007). The initial ratio of plasmid-free and plasmid-carrying cells necessary for plasmid-bearing bacteria to persist depended on the environment. For example, in mixed environments (e.g. liquid serial batch), when selection is present, the coexistence of both populations depended on a high initial cell density, while in spatially structured environments (e.g. soft agar matrix), the initial cell density had no effect (Chao & Levin, 1981; Ellis et al., 2007; Slater et al., 2008).

What does conjugative transfer have to do with plasmid persistence? Plasmid stable maintenance could be guaranteed if rates of plasmid loss due to segregation and fitness costs were compensated either by a fitness increase of the host, as described above, or through plasmid reinfection. Bacterial conjugation is the main route for transmissible plasmids to reach new recipients as complete units, rather than natural transformation (Lorenz & Wackernagel, 1994). Early studies using chemostats found that plasmids could be maintained only when cell density and conjugative transfer rates were large enough for the transmission of the plasmid to compensate for its loss through segregation and selection against plasmid-carrying bacteria (Stewart & Levin, 1977).

The IncP-1 plasmid pB10 was unstable in Pseudomonas putida H2, where the plasmid conferred a high cost. Evolution experiments of pB10-containing H2 populations were carried out, with or without concomitant plasmid transfer, in the presence of an antibiotic selective for the plasmid. The plasmid became stable in strain H2 after 1000 generations. However, its stability, as well as the host fitness, significantly increased when partially evolved plasmids were periodically transferred to naïve plasmid-free H2 hosts (Heuer et al., 2007). Thus, regular horizontal plasmid transfer may positively affect plasmid adaptation to an unfavorable host. In a different experiment, Dahlberg & Chao (2003) showed that evolved RP4 clones exhibiting lower fitness costs also exhibited decreased conjugation frequency (further analysis indicated mutations in genes for pilus production). In parallel experiments, plasmid R1 evolved clones also showed reduced transfer rates, but only in the evolved host, an indication that this phenotype was not plasmid R1 encoded. Turner et al. (1998) also examined how the cost of plasmid carriage depended on plasmid transmissibility. They carried out a 500-generation experiment using a conjugative plasmid isolated from nature and analyzed 10 derived plasmids. Five of them yielded higher rates of conjugative transfer than the ancestral plasmid, while five others yielded lower rates (including two that became unable to conjugate). Similarly, the plasmids that evolved lower conjugation rates were less costly to their host than the ancestral plasmid, whereas those that evolved higher conjugation rates became more costly. This behavior was explained by a mathematical model (Ponciano et al., 2007) predicting that high plasmid loss (due to segregation or high burden) must be balanced by high transfer frequency, while a burden reduction would allow plasmid invasion of the population. 
The model also predicts that, within a certain range of parameter combination (burden, segregation frequency and conjugation frequency), plasmid-carrying and plasmid-free bacterial populations will coexist indefinitely.
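
The balance between loss and transfer described above can be illustrated with a minimal mass-action sketch. This is not the model of Stewart & Levin (1977) or of Ponciano et al. (2007); it is a simplified frequency equation that assumes a constant total cell density, and all parameter names and values are hypothetical. With p the fraction of plasmid-bearing cells, the sketch integrates dp/dt = p[(1 − p)(γN − αψ) − τ], where γ is the transfer rate constant, N the cell density, α the fractional cost, ψ the growth rate and τ the segregation rate.

```python
def simulate_plasmid_fraction(p0: float, transfer: float, density: float,
                              cost: float, growth: float, segregation: float,
                              dt: float = 0.01, steps: int = 100_000) -> float:
    """Euler integration of dp/dt = p * ((1 - p) * (gamma*N - alpha*psi) - tau).

    Returns the fraction of plasmid-bearing cells after steps * dt time units.
    """
    net_gain = transfer * density - cost * growth  # infection gain minus burden
    p = p0
    for _ in range(steps):
        dp = p * ((1.0 - p) * net_gain - segregation)
        p = min(1.0, max(0.0, p + dp * dt))  # keep the fraction within [0, 1]
    return p


# Illustrative parameter values only (assumptions, not measurements).
high_density = simulate_plasmid_fraction(0.5, transfer=1e-9, density=1e9,
                                         cost=0.1, growth=1.0, segregation=0.01)
low_density = simulate_plasmid_fraction(0.5, transfer=1e-9, density=1e7,
                                        cost=0.1, growth=1.0, segregation=0.01)
print(f"high density: {high_density:.3f}, low density: {low_density:.3g}")
```

In this sketch the plasmid persists only when γN exceeds αψ + τ, and it then settles at p* = 1 − τ/(γN − αψ). Whenever τ is greater than 0 this equilibrium is below 1, so plasmid-carrying and plasmid-free cells coexist, which is qualitatively in line with the coexistence described above.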

The above experiments were carried out using a small set of model plasmids (R1, RP4, pB10 and a few others). In order to have a true knowledge of the diversity of plasmid evolutive strategies, similar assays will need to be carried out using a variety of plasmid systems (backbones) and their embodied differential properties. Fortunately, existing genomic data allow us to get a general idea of the existing plasmid diversity in Gammaproteobacteria, the most-studied group of bacteria. Based on the phylogeny of their relaxases, we assorted most transmissible plasmids originating from Gammaproteobacteria into subfamilies, as shown in Table 1. Each subfamily could be amplified by a specific set of oligonucleotide pairs (Alvarado et al., manuscript in preparation). Table 1 includes not only plasmids adapted to hospital environments, but also environmental plasmids. Interestingly, some of these are occasionally also found in hospital settings (our unpublished data). They come mainly from the family Enterobacteriaceae, although representatives of other gammaproteobacterial families and even broad-host-range plasmids are also included, as indicated in the table.

The selected subfamilies belong to one or another of the six reported relaxase families (Garcillán-Barcia et al., 2009; Smillie et al., 2010) (Fig. 5) and cover more than 95% of the transmissible gammaproteobacterial plasmids present in GenBank. For instance, the MOBF relaxase family is almost completely represented by two subfamilies, MOBF11 and MOBF12, in the gammaproteobacterial plasmids. MOBF11 includes relaxases of plasmids belonging to Inc groups W, N and P9, while MOBF12 groups relaxases of plasmids of the IncF complex. Similarly, the MOBH1 class includes relaxases encoded by plasmids of several Inc groups (H, T, A/C, P7) as well as ICEs such as R391 and SXT. MOBH2 relaxases are mainly encoded by ICEs (such as PAPI-1 and clc). Several MOBP classes are widely represented in gammaproteobacterial R-plasmids: MOBP11 clusters relaxases of IncP plasmids; MOBP12 corresponds to IncI1, K and B/O; MOBP13 to IncL/M; MOBP14 to relaxases of the mobilizable plasmids of IncQ2/G group; MOBP3, MOBP4, and MOBP6, relaxases of IncX, IncU, and IncI2 plasmids, respectively; and MOBP5, ColE1-like mobilizable plasmids. MOBQ and MOBC families cluster relaxases of gammaproteobacterial plasmids into subfamilies MOBQ1 and MOBC1, which, respectively, include mobilizable plasmids RSF1010 and CloDF13. A more descriptive view of the MOB plasmid classification can be found in Francia et al. (2004), Garcillán-Barcia et al. (2009) and Smillie et al. (2010). As could be expected, analysis of gammaproteobacterial plasmids from genera phylogenetically distant from Enterobacteriaceae can produce a significant proportion of plasmids that could not be adequately classified, as shown by Bertini et al. (2010). Their relaxases fall in as yet badly resolved phylogenetic subfamilies, for example, Qu and Vu (see Table 1). 
High-throughput plasmid sequencing, which is expected to occur in the next few years, will resolve these uncertainties and result in a more robust and comprehensive plasmid classification.
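
For practical use, the correspondences between MOB subfamilies and Inc groups listed in this section can be collected in a lookup table. The sketch below only transcribes the groupings stated in the text; the names `MOB_TO_INC` and `mob_subfamilies_for` are hypothetical, and the table is not a complete classification.

```python
# MOB subfamily -> Inc groups (or plasmid types) named in the text.
MOB_TO_INC = {
    "MOBF11": ["IncW", "IncN", "IncP9"],
    "MOBF12": ["IncF complex"],
    "MOBH1": ["IncH", "IncT", "IncA/C", "IncP7"],
    "MOBP11": ["IncP"],
    "MOBP12": ["IncI1", "IncK", "IncB/O"],
    "MOBP13": ["IncL/M"],
    "MOBP14": ["IncQ2/G"],
    "MOBP3": ["IncX"],
    "MOBP4": ["IncU"],
    "MOBP5": ["ColE1-like"],
    "MOBP6": ["IncI2"],
}


def mob_subfamilies_for(inc_group: str) -> list:
    """Return the MOB subfamilies whose listed groups include inc_group."""
    return sorted(mob for mob, incs in MOB_TO_INC.items() if inc_group in incs)


print(mob_subfamilies_for("IncN"))
print(mob_subfamilies_for("IncR"))
```

An empty result, as for IncR, means only that the group is not listed in this table; it does not by itself show that the plasmids lack a relaxase.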

Figure 5.

 Inc/REP family distribution of gammaproteobacterial plasmids according to relaxase type. Two hundred and sixty-nine relaxases contained in 257 gammaproteobacterial plasmids in the NCBI database (Smillie et al., 2010) were distributed into the six MOB families. The Inc or REP types associated with each MOB family are indicated.

As exemplified in Fig. 4, the REP types described by Gotz et al. (1996), Greated & Thomas (1999), Carattoli et al. (2005), Garcia-Fernandez et al. (2009), Bertini et al. (2010) and Villa et al. (2010) are much more restrictive in the plasmids they can amplify than the MOB types. In spite of this, the REP types include most of the backbone classes that are commonly found in clinical isolates of R-plasmids, for which they were devised. The MOB classification proposed by Garcillán-Barcia et al. (2009) and Smillie et al. (2010) misses only a few REP types (Fig. 5, Table 1), suggesting that most plasmid types that play a significant role in AbR dissemination are transmissible by conjugation. The sole exception within the Enterobacteriaceae is the IncR plasmid pK245 (Chen et al., 2006), which contains no relaxase. Thus, the MOB type can be used as a single token for extensive studies that do not call for a massive sequencing effort. A recent report on plasmids from Acinetobacter baumannii (Bertini et al., 2010) classified them into 19 REP groups, mostly unrelated to the existing REP types and mostly nontransmissible. Some of these groups contained completely sequenced plasmids, and are thus included in Table 1. They should be used as an example that further inspection of the Gammaproteobacteria will still uncover new REP (and MOB) groups.

Figure 4.

Phylogeny of the MOBF1 family of relaxases. The first 300 amino acid residues of protein TrwC_R388 (black square) were used as query in a psi-blast search (threshold=10e−8; matrix: BLOSUM62), as explained (Garcillán-Barcia et al., 2009). The search was filtered to retrieve only plasmid sequences from Gammaproteobacteria. The search converged at the third iteration and retrieved 102 hits above the threshold. Phylogeny reconstruction was performed using mega 4.0 (Tamura et al., 2007). Nomenclature of the branches refers to groups of plasmids robustly solved during phylogeny reconstruction, the most important branches shown in different colours: F111 (green), F112 (red), F113 (blue) and F121 (brown). The two columns at the right of phylogeny indicate the MOB (Alvarado et al., manuscript in preparation) and the REP (Gotz et al., 1996; Greated & Thomas, 1999; Carattoli et al., 2005; Garcia-Fernandez et al., 2009; Bertini et al., 2010; Villa et al., 2010) types used for plasmid classification. MOB data were obtained by comparing the DNA sequences of relaxase genes with the pair of oligonucleotides designed to amplify them. As explained in Box 1, amplification is obtained only when there is a perfect match with the 3′-terminal 12 nucleotides of both primers. REP data were obtained similarly by searching the DNA sequences for targets of the probes designed by Gotz et al. (1996), Greated & Thomas (1999), Carattoli et al. (2005), Garcia-Fernandez et al. (2009), Bertini et al. (2010) and Villa et al. (2010). Positive identification required a perfect match in the 3′-terminal 12 nucleotides of the two primer oligonucleotides used for amplification. A dash indicates the absence of these sequences. Plasmids underlined are not yet available in databases and were added by us to the psi-blast hit list. Plasmid pAA-SP42 was obtained from hospital Sant Pau i la Santa Creu, Barcelona (accession no. JF421285.1). pMBUI4 is a plasmid isolated from soil and sequenced by E. Top (unpublished data). Xalbi stands for Xanthomonas albilineans.
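The matching rule stated in the legend (a perfect match with the 3′-terminal 12 nucleotides of both primers) can be illustrated with a short in silico check. The Python sketch below is only one possible reading of that rule, not the authors' actual pipeline. The function names and the primer and template sequences are hypothetical and chosen for illustration; they are not the oligonucleotides of Box 1. Degenerate primer positions are interpreted with the standard IUPAC nucleotide codes, and the template is assumed to contain only A, C, G and T.

```python
# Standard IUPAC nucleotide codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def site_matches(primer_tail, site):
    """True if every template base is allowed by the primer symbol at that position."""
    return len(primer_tail) == len(site) and all(
        base in IUPAC[symbol] for symbol, base in zip(primer_tail, site)
    )


def find_tail_sites(primer, strand, tail_len=12):
    """Start positions on `strand` that perfectly match the primer's 3' tail."""
    tail = primer.upper()[-tail_len:]
    strand = strand.upper()
    return [
        i for i in range(len(strand) - len(tail) + 1)
        if site_matches(tail, strand[i:i + len(tail)])
    ]


def predicts_amplicon(forward, reverse, template, tail_len=12):
    """True if both primer tails match and the primers face each other.

    The forward primer is searched on the given strand, the reverse primer
    on its reverse complement. Bases upstream of the 3' tail are ignored.
    """
    n = len(template)
    forward_sites = find_tail_sites(forward, template, tail_len)
    reverse_sites = find_tail_sites(reverse, reverse_complement(template), tail_len)
    for f in forward_sites:
        forward_end = f + tail_len  # first top-strand position after the forward tail
        for r in reverse_sites:
            # Top-strand coordinate of the reverse primer's 3' end.
            reverse_start = n - (r + tail_len)
            if reverse_start >= forward_end:
                return True
    return False


# Hypothetical primers and template, for illustration only.
forward_primer = "TTTACGTRCATGGCA"   # R allows A or G
reverse_primer = "CCCGATTYGACCTGA"   # Y allows C or T
template = ("GG" + "ACGTACATGGCA" + "T" * 10
            + reverse_complement("GATTCGACCTGA") + "GG")
print(predicts_amplicon(forward_primer, reverse_primer, template))
```

In this sketch a single mismatch inside either 12-nucleotide tail gives a negative call, which mirrors the dash used in the MOB and REP columns of the figure, whereas mismatches in the 5′ portion of a primer are tolerated.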

We would like to illustrate the kind of phylogenetic analysis allowed by the MOB classification by looking at the phylogeny of MOBF1 relaxases, as shown in Fig. 4. It should be remembered at all times that relaxase evolution mirrors the evolution of the complete plasmid backbone, as shown in Smillie et al. (2010). Figure 4 shows the MOBF1 relaxase phylogenetic tree and the coverage of REP and MOB typing methods for each branch of the tree. As can be seen, REP typing identifies specific terminal branches within the tree, while MOB typing (Box 1) yields much broader results due to the use of degenerate oligonucleotide primers (in this regard, the REP and MOB strategies are complementary). Specific MOB classes are later identified by sequencing of the resulting MOB amplicons. Used in this way, the MOB method uncovers most of the plasmid diversity found in Gammaproteobacteria (as represented in DNA databases) and provides an example of the utility of this type of analysis to classify the plasmids according to the evolutionary links of their relaxases. In the figure, we have included the main MOBF1 types, F11 and F12, and their subtypes, together with the REP types corresponding to them. The MOB subtypes were assigned after sequencing the amplicons obtained using the set of oligonucleotides corresponding to the MOB types. For instance, MOBF111 corresponds to REPW, MOBF112 corresponds to REPN, etc. However, REP types are less comprehensive. For instance, REPN leaves out pCT14 (Bramucci et al., 2006), plasmI (accession no. FP340279) and pAA-SP42 (accession no. JF421285.1); REPW leaves out plasmII (accession no. FP340278) and the recently discovered environmental plasmid pMBUI4 (E. Top, pers. commun.). The objective of the comparison shown in Fig. 4 is not to claim that one method is better than the other, because the two were designed with different objectives. While REP aims to ascertain what there is in the R-plasmid world in the simplest manner, MOB was developed to uncover new players that populate deeper branches of the known relaxase families (see Adaptive module). As an example, no REP probes exist for MOBP14 plasmids, but we found several hits with the MOBP14 probes in clinical isolates (our unpublished data). These hits correspond to the prototype plasmid Rms149 (Haines et al., 2005), assigned to the IncG/IncP6 incompatibility group (Haines et al., 2006). These plasmids have remained unnoticed in clinical surveys until now because of the lack of suitable probes.

Figure 4 is also useful when looking at the evolution of MOBF plasmids. As can be seen, MOB type F11 consists of several well-defined subtypes, including REPW and related plasmids (MOBF111), REPN and related plasmids (MOBF112) and a set of plasmids related to the IncP9 group of Pseudomonas plasmids (MOBF113). These three subtypes are clearly defined and represent true phylogenetic groups (coherent with trees constructed from VirB4s or T4CPs; see Smillie et al., 2010). This tree therefore indicates that plasmids belonging to the REPW, REPN and REPP9 groups are more closely related to each other than to those of any other REP type. This relatedness most likely extends to a large fraction of the plasmid backbone and thus defines a series of plasmids that may share similar evolutionary strategies, as discussed in Fernandez-Lopez et al. (2006). The next exercise is to compare F11 with F12. F12 contains the well-known members of the REPF plasmid complex, which accounts for close to 25% of the clinical isolates of R plasmids in E. coli (Carattoli, 2009). Small changes in the incompatibility determinants of REPF plasmids lead to compatibility (Lopez et al., 1989), allowing the coexistence of several REPF plasmids. Coexistence within the same host would facilitate AbR exchange by homologous recombination as well as by cointegrate formation (Hopkins et al., 2006; Chaudhuri et al., 2010; Villa et al., 2010). Although the F121 subtype is a heavily populated branch, there is no more genetic distance among its members than there is among members of the F111 or F112 groups. Therefore, real plasmid types cluster in well-resolved monophyletic groups, a trend confirmed by the inclusion of many new isolates.

The very existence of this kind of tree, which is the rule rather than the exception in the plasmid world (see Garcillán-Barcia et al., 2009), also indicates that plasmids exchange functional modules, but not to the extent of confounding phylogenetic trees. If they did, the relaxase trees would not be coherent with the trees obtained with other backbone proteins. Generally speaking, we observed backbone gene exchanges only in deep branches of the trees (although we do find exceptions, we believe many are due to the transient formation of plasmid chimeras as a consequence of strong selective pressures). Thus, plasmid backbones should be considered as stable as those of bacterial chromosomes. This parallelism should be understood just in the sense that we can use the reflexes trained for bacterial nomenclature on plasmid nomenclature; we are seeing very similar trends. For instance, the differences in the genetic structure of REPW plasmids are as great (or as small) as those we find in the genus Escherichia (Fernandez-Lopez et al., 2006; Revilla et al., 2008). Hence, we can speak of the population biology of plasmid backbones or, so to speak, the ecology of plasmids. Each successful module combination will have its own ecology. However, we know close to nothing about this. Ideally, research should strive to obtain a ‘plasmid specification sheet’ for each relevant plasmid backbone. These specification sheets should contain data on the behavior of the respective REP type (that is, replication, copy number and stability in different hosts), MOB and MPF types (that is, conjugation frequencies to and from different hosts, conjugation kinetics and other physiological details of the conjugation apparatus, as well as other relevant genes contained in the establishment module). These parameters could then be used for first attempts at mathematical modeling of the dynamics of plasmid propagation and persistence (Krone et al., 2007).
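To make the idea of a specification sheet feeding a model more concrete, the Python sketch below stores a few such parameters in a record and uses them in a generic, well-mixed mass-action model of plasmid-free and plasmid-bearing cells. It is a minimal illustration under stated assumptions: it is not the model of Krone et al. (2007), the class and function names are hypothetical, and every numerical value is a placeholder rather than a measured property of any plasmid.

```python
from dataclasses import dataclass


@dataclass
class PlasmidSpecSheet:
    """Hypothetical record of backbone properties (illustrative fields only)."""
    rep_type: str
    mob_type: str
    mpf_type: str
    conjugation_rate: float   # transfers per unit donor x recipient density per hour
    segregation_rate: float   # plasmid loss per plasmid-bearing cell per hour
    fitness_cost: float       # fractional reduction in host growth rate


def simulate(sheet, free0, plasmid0, growth_rate=1.0, capacity=1.0,
             hours=50.0, dt=0.01):
    """Euler integration of a two-population mass-action model.

    free    : density of plasmid-free cells
    plasmid : density of plasmid-bearing cells
    Both populations share one logistic carrying capacity.
    """
    free, plasmid = free0, plasmid0
    for _ in range(int(round(hours / dt))):
        crowding = 1.0 - (free + plasmid) / capacity
        transfer = sheet.conjugation_rate * free * plasmid
        loss = sheet.segregation_rate * plasmid
        d_free = growth_rate * free * crowding - transfer + loss
        d_plasmid = (growth_rate * (1.0 - sheet.fitness_cost) * plasmid * crowding
                     + transfer - loss)
        free += d_free * dt
        plasmid += d_plasmid * dt
    return free, plasmid


# Placeholder values, not measurements.
sheet = PlasmidSpecSheet(
    rep_type="REPW", mob_type="MOBF111", mpf_type="unspecified",
    conjugation_rate=1.0, segregation_rate=0.01, fitness_cost=0.1,
)
free, plasmid = simulate(sheet, free0=0.09, plasmid0=0.01)
print(f"plasmid-bearing fraction after 50 h: {plasmid / (free + plasmid):.3f}")
```

In this toy model a plasmid persists only when gains by conjugation outweigh segregational loss and fitness cost, which is why host-specific values for these parameters would be needed before any realistic modeling attempt.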

Conclusions and further work

Although the relevance of LGT for the shaping of bacterial genomes is without question (de la Cruz & Davies, 2000), it appears that the vertical line of evolution preserves enough phylogenetic idiosyncrasy so that bacterial taxa are still highly informative with respect to the overall genetic constitution and physiology of a given bacterium (Beiko et al., 2005; Valas & Bourne, 2010). A similar situation applies to plasmids, which also share a relatively stable backbone of core genes among related members for long evolutionary periods (Smillie et al., 2010). Thus, it also makes sense to talk about plasmid species. As a consequence, the identification and characterization of plasmid species provides relevant information with respect to their physiology and, of special relevance in this review, their modes of transmission. A central concept of this review is that the identification of the relaxase gene is a good descriptor of the complete plasmid backbone. Therefore, the MOB classification of plasmids has a value comparable to the 16S rRNA gene classification of bacteria.

Once we know which plasmid species are significant in an ecosystem, and how to identify and follow them, we can discover their dynamics in complex bacterial populations and the genetic parameters that define their behavior. However, this review suggests that we know little of the comparative advantages and adaptation cues present in a given plasmid backbone that could explain the present ecology of bacterial plasmids. This can change dramatically in the coming years because of the opportunities opened by recent technological breakthroughs. First is massive DNA sequencing, which will allow us unbiased access to plasmid diversity in microbial ecosystems. Second, systems biology approaches will allow us to analyze the multidimensional response of an ecosystem to systematic perturbations by modeling and experimentally testing the hypotheses that form the base of those models.

To advance along these lines, new tools can now be used that provide enough analytical power to start unveiling the main routes that genes (e.g. AbR genes) use to travel from environmental reservoirs to human pathogens. On the one hand, the MOB classification method will help by providing an inexpensive and easily automatable PCR-amplification technique that can cover most of the present-day diversity of transmissible gammaproteobacterial plasmids. More research and identification of plasmids have to be conducted before this approach can be efficiently used for the analysis of other bacterial groups. On the other hand, the characterization of the properties of relevant plasmid species will provide enough starting data to formulate hypotheses that can be modeled and experimentally tested in a systems biology approach.

This knowledge should be applied in the search for agents that can control the propagation of relevant dissemination platforms (plasmids, integrons, bacteriophages, ICEs, etc.) and therefore their cargoes (AbR genes). Potential antidissemination drugs, including compounds used as cotherapies to improve and preserve the efficacy of antibiotics (Smith & Romesberg, 2007; Williams & Hergenrother, 2008), as well as a number of Eco-Evo interventions in particular infection-prone environments (Baquero et al., 2011), will then become more easily testable for their efficacy in real but simplified ecosystems, as proof of principle that the approach can work. It is hoped that these kinds of interventions can ultimately lead to the control of the dissemination of AbR and will thus help to solve one important and increasing threat to human health.

Acknowledgements

This work was supported by grant BFU2008-00995/BMC from Ministerio de Ciencia e Innovación (MCINN, Spain), grant REIPI RD06/0008/1012 from Instituto de Salud Carlos III and grant no. 248919/FP7-ICT-2009-4 from the European VII Framework Program. M.P.G.-B. was the recipient of a JAE-Doc postdoctoral contract from Consejo Superior de Investigaciones Científicas (CSIC). A.A. was partially funded by the Ist Plan Regional de I+D+i de Cantabria.
