Multiple plasmid origin-of-transfer substrates enable the spread of natural antimicrobial resistance to human pathogens

Antimicrobial resistance poses a great danger to humanity, in part due to the widespread horizontal transfer of plasmids via conjugation. Modeling of plasmid transfer is essential to uncovering the fundamentals of resistance transfer and for development of predictive measures to limit the spread of resistance. However, a major limitation in the current understanding of plasmids is the inadequate characterization of the DNA transfer mechanisms, which conceals the actual potential for plasmid transfer in nature. Here, we consider that the plasmid-borne origin-of-transfer substrates encode specific DNA structural properties that can facilitate finding these regions in large datasets, and develop a DNA structure-based alignment procedure for typing the transfer substrates that outperforms mere sequence-based approaches. We identify thousands of yet undiscovered DNA transfer substrates, showing that actual plasmid mobility can in fact be 2-fold higher and span almost 2-fold more host species than is currently known. Over half of all mobile plasmids contain the means to transfer between different mobility groups, which links previously confined host ranges across ecological habitats into a robust plasmid transfer network. We show that this network in fact serves to transfer antimicrobial resistance from the environmental genetic reservoirs to human pathogens, which might be an important driver of the observed rapid resistance development in humans and thus an important point of focus for future prevention measures.


Introduction
Horizontal gene transfer of antimicrobial resistance (AMR) genes occurs via the processes of transformation and conjugation. The former mediates especially narrow range, intra-genus transfers 1,2 , whereas the latter is implicated in a wider range of transfer hosts 3,4 and potentially enables AMR to overcome the toughest phylogenetic and ecological transmission barriers [5][6][7][8][9][10] .
Consequently, the interaction between conjugative relaxase enzymes and their DNA origin-of-transfer (oriT) substrates facilitates the majority of all AMR transfers in nature 11,12 and is especially important for ones related to human infection complications 13 . However, the current knowledge on conjugative transfer mechanisms and systems 4,14-16 is unable to describe the unprecedented amount of observed horizontal transfer 7,8,12,17 that seems to transcend all transfer barriers between resistance reservoirs and human hosts [5][6][7]9,10,12,13,18 .
The standard approach for characterization of plasmid mobility involves classification of conjugation and mobilization genes 14 , especially typing of relaxase enzymes into the respective mobility (Mob) groups 15,19 . However, besides the possibility of yet unidentified enzymes and mobility groups 15,[20][21][22][23][24] , multiple new processes have recently been uncovered that might confer additional mobility to plasmids and involve the origin-of-transfer (oriT) DNA substrate. These include (i) broadened relaxase binding specificities to multiple different oriT sequence variants [25][26][27][28][29] , which, according to the evolutionary theory of such DNA regions 4,30,31 , indicates the possibility of plasmids carrying multiple functional secondary oriTs, and (ii) trans-mobilization of plasmids carrying oriTs triggered by relaxases from co-resident plasmids acting in trans on the non-cognate oriTs [32][33][34][35] . The latter mechanism demonstrates that oriT regions are the only elements of the conjugation machinery required in cis 21 and suggests that many plasmids classified as non-mobile due to the absence of putative relaxases may in fact be mobilizable 33 .
However, although typing the oriT enzymatic substrates instead of the genetic scaffolds might present improvements to the current understanding of plasmid mobility, no systematic studies of oriTs across sequenced plasmids have yet been performed, likely due to the lack of available data and tools that would enable such oriT typing.
A major problem with uncovering oriT regions is that, apart from being experimentally laborious, it is computationally challenging due to multiple molecular mechanisms and a variety of DNA Zrimec 2019 sequence elements present and coevolving in the DNA substrate 4 , even among plasmids belonging to a single species such as Staphylococcus aureus 32 . OriT typing requires algorithms beyond simple sequence-based alignment 36,37 , which can recognise and process more complex molecular motifs such as inverted repeats and hairpins [38][39][40] as well as underlying DNA physicochemical and conformational features. These underpin key protein-DNA readout and activity mechanisms 4,41-43 as well as define conserved niches of structural variants that enable good resolution between Mob groups and subgroups 4 . The use of DNA structural representations has indeed led to improvements in algorithms for identification of other regulatory regions, such as promoters and replication origins [44][45][46][47][48] . Despite this, instead of using tools that probe the actual relaxase-oriT interaction potential by identifying molecular properties that are the basis of such interactions, conventional approaches for oriT analysis still rely on mere nucleotide sequence-based methods 32,36,37 .
Hence, here we prototype a DNA structure-based alignment algorithm for finding oriT variants, enabling us to locate and Mob-type oriT regions across thousands of sequenced plasmids.
Based on the newly uncovered oriT variants, since they can facilitate both in cis and in trans plasmid transfer, we re-analyse the amount of potential mobile plasmids and mobile plasmid-carrying host species. We then evaluate if and how the uncovered fraction of oriTs helps to overcome the known barriers to horizontal gene transfer, by reconstructing and analysing the network of potential AMR transfers between different species and habitats, especially those from the environmental reservoir to the human flora.

Structural alignment algorithm improves oriT-typing performance
Since a DNA alignment algorithm performs multiple comparisons between a query and a target sequence by evaluating a distance function, we first developed a structural distance function, termed s-distance (Fig S1-1 where region sizes were varied stepwise to cover a single relaxase enzymatic site of 40 bp up to the whole oriT region of 220 bp containing multiple binding sites ( Fig 1A). Furthermore, the comparison included discrimination of both (i) positive oriTs (aligned to nic site) from negative non-oriT sequences and (ii) Mob groups. With structurally encoded DNA, significant (Anova p -value < 1e-4) discrimination of both positive/negative and Mob groups was achieved in the whole oriT size range (Fig 1B,C), compared to the bare nucleotide sequences, where results were significant (Anova p -value < 0.05) only with oriT regions equal to or shorter than 120 bp.
This was corroborated with the Jaccard distance, which significantly (Ranksum p -value < 1e-16) decreased by 40% with increasing oriT region size when using structurally encoded k-mers (Methods M2, Fig S1-2), whereas it increased with nucleotide k-mers (Ranksum p -value < 1e-9). The results suggested that our structural encoding approach leveraged the chemical information in longer query regions and could thus improve multiple sequence comparisons with alignments by increasing the statistical depth ( Fig S1-3).
We next prototyped an alignment framework ( Fig 1D) that employed the s-distance measure to find target hits to query oriTs, where p -values were obtained via permutation tests (Fig S1-1, Methods M2). A 4 Mob group dataset of 106 query oriT regions and 170 testing plasmids 15,36,49 was compiled (Fig S1-4, Table S1-1, Methods M1) to test the algorithm's performance by assessing the oriT/nic location and Mob type of the highest scoring alignment hits. By using full-length 220 bp query regions, on average, 25% more significant (Permutation test p -value < 1e-14) oriT hits were recovered and Mob group predictions increased 28% compared to using 40 bp query size (Fig 1E, Fig S1-5). This corroborated that the use of longer queries indeed led to improved algorithm performance (Fig 1B,C). Furthermore, compared to Blast 37 and to Zrimec 2019 non-encoded DNA using the p-distance (Methods M2), our approach uncovered, on average, 27% more significant (Permutation test p -value < 1e-10) oriT hits and correctly predicted 22% more Mob groups (Fig 1E, Fig S1-5). By analysing machine learning metrics to better understand the algorithm's performance (Methods M3), we observed that a marked~40% increase was achieved with Recall ( Fig 1F) at a relatively constant Precision and Specificity , which corresponded to recovering a larger amount of the correct oriTs (Fig S1-6). Our method thus correctly located and Mob typed, on average, 55% of oriTs in the cross-validation and testing datasets (Fig S1-7, S1-8). We further validated the capability of the algorithm to identify specifically nic sites using a plasmid dataset with experimentally determined oriT regions that was not Mob typed 36 (Methods M1). Out of 13 such plasmids with 14 oriT sites, it correctly identified (Permutational test p -value < 1e-12) 6 oriT regions in 5 plasmids with 100% sequence identity and aligned to within ±1 bp of the nic sites 49 (Fig S1-7, Table S1-2). This indicated that the lack of diversity in the query dataset altogether missed certain oriTs in the testing datasets, which was also confirmed by using smaller query datasets that lowered the algorithm's performance especially for locating oriT regions (Fig S1-9). Nevertheless, despite the limited oriT data availability, the results experimentally verified the algorithm's capacity for oriT typing. subsets, for oriT typing as well as discrimination of MOB groups and subgroups. Error bars denote 95% confidence intervals. (F) Comparison of machine learning performance metrics between our algorithm (Strast) and Blast, and by using 220 bp or 40 bp oriT subsets. Error bars denote 95% confidence intervals.

OriT typing reveals a two-fold increase in the number of known mobile plasmids
We used the structural alignment algorithm to explore the diversity of oriT regions in natural plasmids. To cover all available oriT regions, we expanded the query dataset to 112 unique oriTs from 6 Mob groups that included, besides oriTs from the major Mob groups F, P, Q and V, also 3.6% and 0.9% of elements from groups C and T, respectively ( with Mob groups determined by relaxase amino acid homology analysis 50 ( Fig 2B). Here, 28.7% of plasmids were mobile (1307 plasmids that contained 1377 distinct relaxases) with the highest represented Mob groups F, P, Q and V (Fig 2A, Fig S2-2). The target dataset contained 893 distinct host species from 22 distinct phyla, with the mobile plasmids harboured by 40% of the distinct species with 88.1% from the phyla Proteobacteria and Firmicutes (Fig S2-1).
Contrary to the expectations based on the 1377 distinct relaxases found in the target dataset to merely 14 with relaxase typing (Fig S2-1). Furthermore, out of the 907 plasmids where both oriTs and relaxases were identified, the same Mob group, indicating that the oriT was cognate to the relaxase, was identified in 75% of cases ( Fig 2E). In the remaining 25% of these plasmids, the uncovered orits could have been secondary oriTs 30,31 or corresponded to either unknown or in trans acting 32 relaxases. The distribution of the oriT-identified Mob groups was found to be comparable to the one expected according to relaxase typing (Fig 2E), whereas a larger diversity of oriTs may exist within certain Mob subgroups than others (Table S2-3).
Multiple oriT regions were identified in over 63% of both the relaxase-typed as well as untyped mobile plasmids, where on average, 2 orits from 2 different Mob groups were identified per plasmid ( Fig 2F). This supported the notion that, besides secondary and in trans oriTs, the untyped plasmids likely carried un-identified relaxases 15,20-24 .

Zrimec 2019
Since the amount of query oriTs was the limiting factor in our analysis of plasmid mobility Additionally, we observed that in order to achieve a full overlap with the relaxase-typed plasmids, a considerably larger query dataset than is currently available is required, comprising

Over one half of mobile plasmids can transfer between different Mob groups
A large part of the newly uncovered oriTs were additional regions to the primary ones that corresponded to the plasmid cognate relaxases (Fig 2F, Fig 3A inset). This resulted in 1331 multi-oriT plasmids (64% of all oriT-typed mobile plasmids, of which 43% were relaxase-typed) that carried on average 5 oriTs (Fig S3-1). First, we analysed the connectivity (co-occurrence network) between the different oriT regions (nodes), when they were shared across the multi-oriT plasmids (edges), which represented an undirected multi-edged graph (Methods M6).
Since each oriT hit was characterised only by its closest-associated query oriT, the actual oriT node diversity was limited to the 102 query oriTs that returned hits (Fig 2E). The graph Zrimec 2019 contained a total of 79,004 connections between 552 interacting oriT node pairs (Fig 3B), with an average of 6 connections and up to 3528 connections per oriT node pair (ie. oriTs were shared across this many plasmids, Fig 3A). Among the multiple small connected components in this graph, one was larger than 2 nodes and comprised 76 oriT nodes (75% of all oriTs, Fig 3B) that proportionally represented all 6 Mob groups except Mob V, of which only 38% was in the fully connected subgraph (Fig 2E). The observations of a power law of the degree distribution ( Fig S3-2: degree exponent was 0.97) indicated that this oriT network obeyed the laws of natural biological scale-free networks, while the decrease of the clustering coefficient with increasing degree hinted at the possibility of a hierarchical topology (Fig S3-2: global clustering coefficient was 0.47) 52 . In fact, we observed that specific oriT regions acted as hubs and connected multiple other regions across the Mob groups ( Fig 3B). The most highly connected hubs were for instance pNL1-, BNC1 Plasmid 1-and pBBR1-like oriTs from Mob F, Q and V, respectively, that connected over 50 unique oriTs from all 6 Mob groups via~1500 connections each (see details in Table S3-1).
We next focused on the connectivity of Mob groups, since over 90% of the multi-oriT plasmids contained on average 2 unique Mob groups and 3 unique Mob subgroups (Fig S3-1). Indeed,  Table S3-2). Therefore, by switching the nodes of the above oriT graph with the corresponding Mob groups, we were able to analyse the high connectivity between Mob groups within the plasmids (Fig 3C, Table S3-3, Methods M6). In correspondence with the most frequently co-occurring Mob groups F, Q and V (Fig S3-2), the highest flux was also observed between these groups. The most strongly connected and highly promiscuous was Mob Q, as

High connectivity across habitats facilitates transfer of resistance genes to humans
We explored how the increased potential for plasmid transfer between different host species (Fig 3) contributes to the connectivity across different environmental habitats and whether it facilitates the spread of AMR genes to humans. For this task, we expanded the plasmid and oriT data by mapping the host species across 9 habitat supertypes (Table S4- between the habitats (Fig 4A). On average, each plasmid was present in 6 habitats and had access to 5 unique host species (Fig 4B) as well as 6 habitats (Fig S4-3). The number of mobile plasmids and oriTs across the habitats were both strongly correlated to the number of unique species (Pearson's r was 0.961 and 0.720, p -value < 0.03, respectively), with an average of 3 oriTs per plasmid (Fig S4-4 and S4-5). Interestingly, among the highest amount of oriTs per plasmid was found in industrial (includes food production and water treatment facilities) and animal habitats, known to harbor resistance 61 (Fig S4-4).
The number of outbound and inbound connections differed across the habitat types (Fig S4-6 respectively). Conversely, the Thermal habitat displayed 43% less inbound than outbound connections, which might also have been due to the lack of representation of plasmids from that habitat 55 or the lowest amount of species diversity (Fig S4-1). In order to analyse the transfers to and within the Human habitat more comprehensively, we expanded the habitat taxonomy to include commensal and pathogen types as well as tissue subtypes (Table S4-1, Methods M6).
Strikingly, a 2-fold higher amount of inbound vs. outbound connections was observed with commensals ( Fig 4C), whereas pathogens displayed a mere 11% increase. The highest amounts of connections to humans were from the Freshwater and Soil habitats (Fig S4-7A), especially via the gut and oral microbiomes (Fig S4-8A). Notably, all tissues except respiratory ones displayed an increase of inbound connections compared to outbound ones, with unassigned bacteria harboring some of the most highly receptive species (Fig S4-8B). Further analysis of the transfer within the human system itself showed an approximately equal rate of outbound and inbound connections (Fig S4-7B), suggesting that commensal bacteria act as the main interface for horizontal uptake of genes from the environment 62 , which they then further disseminate to the pathogens within the human body (Fig 4C inset).
By predicting the AMR genes in the plasmids (Methods M6), we found a moderate positive correlation (Pearson's r = 0.462, p -value < 1e-16), between the number of oriTs and the number of resistance genes in a plasmid (Fig 4D). Thus, 33% of the multi-oriT plasmids carried on Zrimec 2019 average 4 genes from 3 different AMR classes (Fig 4D). When viewed from the perspective of the above cross-habitat transfer network (Fig 4A), the amount of inbound connections of resistance genes from the environment to the human flora indicated that, as expected, the most abundant flow of genes corresponded to the oldest and most widely used classes of antibiotics, for which also resistance is most developed and widespread 63 (Fig 4A, Fig S4-8). Moreover, in this case, the amount of inbound connections to pathogens surpassed that of commensals by almost 20% (Fig 4E), supporting that AMR transfer routes generally serve a different portion of microbes compared to plasmid transfer in general (Fig 4C). Zrimec 2019

Discussion
Here, we explore the potential for horizontal transfer of natural plasmids by attempting to identify all conjugative DNA origin-of-transfer substrates coded within them. By prototyping a structural alignment approach to find and characterize oriT regions across plasmids (Fig 1D), we discover an almost 8-fold larger amount of oriTs than expected according to relaxase typing (Fig 2B).
Analysis of these regions suggests that the actual number of transferable plasmids can be as much as 2-fold higher and span almost 2-fold more host species than is currently known ( Fig   2B), with the amount of plasmid-borne AMR genes in fact proportional to this assessment of plasmid mobility (Fig 4D). We show that over half of all mobile plasmids contain the means to transfer between different mobility groups (Fig 3C), potentially linking multiple host ranges that were previously thought to be confined 3,4 . By analysing this network of potential plasmid transfers across ecological habitats (Fig 4A), we find that human and animal microbiomes display a large amount of plasmid connections with the environment. The considerably larger influx of plasmids to humans and animals compared to other environmental habitats (Fig 4C) might be a consequence of the increased amount of AMR transfers to these organisms 63 . In accordance with previous findings 62 , the network shows that human commensals might act as the main interface for horizontal uptake of genes from the environment in general (Fig 4C), whereas, unsurprisingly, the transfer of the specific widespread AMR genes is generally more highly targeted at pathogens (Fig 4E). Nevertheless, the potential increase in plasmid mobility uncovered here might be an important driver of the observed rapid resistance development in humans 63,64 and thus an important point of focus for future prevention measures.
Our oriT-typing procedure is a result of rationally expanding DNA alignment algorithms to incorporate enzymatically-relevant properties of the oriT substrates (Fig 1A), where the conservation of structural properties is detected across the whole 220 bp region compared to mere~40 bp of nucleotide sequence in the core relaxase-binding site 4 (Fig 1C,D). By allowing the use of at least 2-fold longer query sequences, structural alignment achieves a much larger statistical depth than sequence alignment (Fig S1-3), which means that oriTs can be efficiently sought across whole plasmids instead of just the vicinity of relaxases, as is the constraint with current methods 36 (Fig 1E). Since, due to the nature of the conserved structural properties, each enzymatic substrate corresponds to multiple possible sequence variants, the benefit of the DNA structural encoding is that it exposes these sequence variants by accessing the search space of the enzymatic co-evolutionary constraints (i.e. DNA structural background) 4 (Fig 1A   and 2D). The identified candidate oriT regions serve as starting points that can be further verified by typing other known molecular features 32 , such as inverted repeats 65,66 (Fig S2-4) and nucleotide sequence properties of the core enzymatic binding 38,67 and nicking sites 49,68 ( Fig   2D).
Besides finding the majority of expected oriT regions of known cognate relaxases (Figs 2E), almost ⅔ of the oriT-bearing plasmids in fact carry multiple oriTs (Fig 2F). The newly discovered oriTs are frequently located where they are expected, in non-coding regions and within genes related to horizontal mobility (conjugation, transposition and integration) 69 . However, the amount of Mob groups and depth of enzymatic substrate diversity that could be analysed within each group was constrained by the size of the query set of available oriT regions and nic sites 19 ( Fig S1-9 and S2-5). By simulating the availability of a larger set of query sequences, a linear relationship between the amount of uncovered and query oriTs indicates that our current sampling is likely still an underestimation of the actual plasmid transfer potential that could possibly span all plasmids 51,70 (Fig 2I). Our results of a 2-fold higher plasmid mobility compared to relaxase typing, with an almost similar increase in the amount of mobile plasmid-bearing host species (Fig 2B), point to multiple possibilities that further undermine the paradigm of a one relaxase-one oriT conjugative plasmid system spanning less than ⅓ of plasmids 50 : (i) a massive underidentification of relaxase enzymes 51,71 , (ii) relaxase promiscuity 21,31 and oriT evolutionary mechanisms 30 leading to many functional secondary oriTs, and (iii) a system-wide adoption of relaxase in trans mechanisms 21,72 .
Since plasmids are vehicles for transfer and long-term storage of 'common goods' that include, besides AMR, also virulence, heavy metal resistance and other genes 73 70,76 . Firstly, we found that the plasmids bearing multiple oriTs connect different Mob groups and subgroups (Fig 3C), thus transcending phylogenetic and host range barriers as well as potentially creating a global cross-habitat network that is reminiscent of a dense information Zrimec 2019 network 70,77 (Fig 4A). This network interconnects environmental genetic reservoirs and carriers as well as commensal vectors 62 , which likely act as long-term and shorter-term storage units 75 , respectively, with known human pathogens 64 (Fig 4C). Secondly, with a glimpse of the oriT network topology via the closest-associated query oriTs reminiscent of scale free and even hierarchical networks (Fig 3B and S3-2), the transfer networks display robust fault tolerant properties. As sparsely connected nodes without many direct neighbors are linked to highly connected hubs, even in case of absence of a large number of nodes, the remaining ones are likely still well connected 52,78 . The finding that specific types of oriTs serve as hubs linking to most of the other oriT types (Fig 3B,C)

M2. Development and testing of alignment algorithms.
We developed and tested a DNA structure based alignment algorithm, termed Strast, based on (i) encoding DNA into a structural representation 4 and (ii) a procedure that finds the most similar segments of target sequences to query sequences using a structural distance measure. In the first stage of Strast, for each DNA sequence, its set of kmers is transformed into a precalculated structural encoding, termed s-mers 4 (Fig S1-1B). The s-mers are based on clustered principal components of values obtained from 64 models of physicochemical and conformational DNA Zrimec 2019 properties (Table S1-3). Here, the s-mer size was set to 7 bp, the number of principal components was 18 (out of 64) to capture over 0.99 of data variance, and the k-means clustering algorithm with k = 128 clusters was used. The s-distance between two DNA sequences is defined as the sum of Euclidean distances between each consecutive pair of s-mers in the DNA sequence encodings (Fig S1-1B). The Jaccard distance between structurally encoded DNA sequences is obtained by using s-mers instead of k-mers. The second stage of Strast queries a target sequence for regions with the most similar and statistically significant s-distance (Fig S1-1C: see algorithm pseudocode, Methods M3).
For comparison of Strast to traditional non-encoded sequence based alignments two algorithms were used: (i) Blast v2.24 37 (www.ncbi.com) and (ii) the Strast procedure using the p-distance measure instead of s-distance. To assess the performance of the alignment algorithms for oriT-typing in target sequences, we evaluated the correctness of both (i) oriT and nic location finding to within ± 1 bp 49 and (ii) typing of Mob groups and subgroups. Additionally, the effect of the length of query regions in both types of tests was assessed (Fig 1B).

M3. Statistical analysis and machine learning metrics.
The F-test was performed using Permanova 80 with sequence bootstraps. The statistical significance of s-distance scores was evaluated using permutational tests, where bootstrap resampling (n_bootstraps = 1e6 per sequence) of randomly selected query oriT sequences (n_seq = 10) was used to estimate the s-distance scores at different p-value cutoffs (from 1e-6 to 1e-1). Next, to obtain a mapping function of s-distance to permutational p-values in the whole range of 1e-132 to 1e-1 (Fig S1-1D), least squares curve fitting to a second order polynomial function was performed, where the theoretical limit of~1e-132 was set to correspond to an s-distance of 0. For additional statistical hypothesis testing, the Python package Scipy v1.1.0 was used with default settings.
The following machine learning performance metrics were used to assess alignment algorithm performance: Precision, Recall/Sensitivity, Specificity, Accuracy, F1-score and Matthews correlation coefficient (Table S1- negative value was assigned to results below the significance cutoff that corresponded or did not correspond, respectively, to the known value.

M4. Analysis of alignment hits.
The newly uncovered regions were analysed by comparing the features of the oriT alignment hits with those of the query dataset, which included sequence properties and inverted repeats.

M5. Simulations of plasmid mobility.
To estimate the results that would be obtained with a larger oriT query dataset, the following procedure was applied. The oriT alignment results with the 4602 target plasmid dataset were diluted according to 10-fold dilutions of the 102 query regions (Fig 2C) used to identify the hits (10 repetitions were used). Least-squares curve fitting was performed (Python package Scipy v1.1.0) using a linear function and the dataset dilutions -specifically between the size of the query oriT dataset and the variables corresponding to the numbers of oriT hits, mobile plasmids, mobile plasmid carrying species and overlap with relaxase-typed plasmids.

M6. Analysis of transfer between environmental microbiomes.
Host species of the oriT alignment results with the subset of multi-oriT plasmids were mapped across 9 habitat supertypes (see habitat groupings in Tables S4-1, S4-2) according to published data on environmental 55 and human microbiomes [57][58][59][60] . The habitat taxonomy was further expanded to include commensal and pathogen types 57 as well as human tissue subtypes 55 . For typing antimicrobial resistance genes in the plasmids, the webserver version of ResFinder v3.2 82 was used with default settings.
To study the connectivity between the different oriT regions or Mob groups as nodes, shared across the multi-oriT plasmids as edges, an undirected multi-edged graph was constructed. To study the connectivity and AMR transfer potential between different habitats, a directed graph was constructed, where habitat nodes were connected by multi-oriT plasmids as edges.
Habitats of donor hosts carrying the mobile plasmids (outbound connections) were connected to habitats of potential acceptor hosts, deduced from the query oriTs (inbound connections). The Python package NetworkX v2.2 was used.