Recombination between RNA viruses and plasmids might have played a central role in the origin and evolution of small DNA viruses


  • Mart Krupovic

    Corresponding author
    1. Department of Microbiology, Institut Pasteur, Molecular Biology of the Gene in Extremophiles Unit, Paris, France



The finding that viruses with RNA and DNA genomes can recombine to produce chimeric entities provides valuable insights into the origin and evolution of viruses. It also substantiates the hypothesis that certain groups of DNA viruses could have emerged from plasmids via acquisition of capsid protein-coding genes from RNA viruses.

The origin of viruses is shrouded in a veil of mystery. Although it is generally accepted that their origin is polyphyletic and very ancient 1–5, it remains largely unclear by what means and on how many independent occasions viruses have emerged during evolution. Especially obscure is our understanding of the evolutionary relationships between viruses with RNA and DNA genomes. Typically, viruses with different nucleic acid (NA) types are perceived as vestiges of different epochs of cellular evolution. According to this view, RNA and DNA viruses represent the ancient RNA and modern DNA worlds, respectively, and reverse-transcribing viruses provide a missing link between these two worlds. Notably, comparative genomic and structural analyses have revealed that many small RNA and DNA viruses build their capsids using related proteins, pointing towards a link between these two groups of viruses. To explain this relationship, several alternative scenarios have been proposed. For example, it has been suggested that DNA viruses could have gradually evolved from more ancient RNA viruses 6 or that RNA and DNA viruses have emerged independently, directly from the primordial genetic pool 4. In addition, it has been proposed that certain groups of DNA viruses (e.g. geminiviruses and circoviruses) could have emerged in the course of recombination between contemporary RNA viruses and DNA plasmids 7, 8. Recently, Diemer and Stedman 9 have reported in Biology Direct on the discovery of a novel virus-like genome, which represents a chimera between an RNA and a DNA virus. This finding provides important insights into the origin(s) and evolution of viruses, which I discuss below.

Discovery of the chimeric viral genome

The genome of the RNA-DNA hybrid virus (RDHV) was sequenced while investigating the diversity of viruses in the acidic, high-temperature Boiling Springs Lake (Lassen Volcanic National Park, USA) using a viral metagenomics approach. The circular DNA genome of RDHV (∼4.1 kb) contains four open reading frames (ORFs 1–4), two of which share significant similarity to sequences in the public databases 9. ORF1 encodes a rolling-circle replication initiation protein (RCR Rep), most similar to those of circoviruses [small eukaryotic viruses with circular single-stranded (ss) DNA genomes]. The product of ORF2 is related to the capsid proteins previously found only in RNA viruses, including members of the family Tombusviridae and two unclassified oomycete-infecting viruses, SmV-A and PhV-A. Thus, RDHV appears to be an offspring of two viruses with different NA types (Fig. 1).

Figure 1.

Emergence of the chimeric RNA-DNA hybrid virus (RDHV) genome. Recombination between a circovirus-like DNA genome and an RNA tombusvirus-like genome, which, respectively, donate genes for the RCR Rep and the capsid protein, leads to the emergence of the RDHV genome. The genome maps are drawn approximately to scale and the colour key for the open reading frames is indicated. The three-dimensional structures of the Tomato bushy stunt virus (TBSV, Tombusviridae; PDBID: 2TBV) and Porcine circovirus 2 (Circoviridae; PDBID: 3R0R) virions are shown to scale next to their genomes. The P (projection) domain of the TBSV capsid protein, which forms the characteristic protrusions on the capsid surface, is highlighted in blue, while the S (shell) domain responsible for capsid shell formation is shown in cyan. Since the capsid protein of RDHV is derived from a tombusvirus and has the same domain organization, the RDHV virion is likely to resemble that of tombusviruses. The virion maps were downloaded from the VIPER database (

RNA-DNA recombination might be more frequent than currently recognized

It is not clear how the recombination that gave rise to RDHV could have occurred. Although it is beyond the scope of this article to discuss in detail the possible mechanisms at play during RNA-DNA recombination, a growing body of evidence suggests that this process might be more frequent in nature than currently recognized. The prevalence of different non-retroviral RNA virus genomes endogenized (chromosomally integrated in the host germ cells) in diverse eukaryotic lineages 10 supports this possibility. In fact, RDHV and other chimeric viruses could have emerged in the context of endogenous viruses, i.e. following tandem integration of an RNA virus genome (or its DNA copy) and a DNA replicon (virus or plasmid) into the cellular chromosome and subsequent exogenization of a novel chimeric virus with a DNA genome. Alternatively, the initial recombination could have occurred at the RNA level, between the viral RNA genome and the RCR Rep transcript of a DNA virus, followed by reverse transcription. Indeed, copy-choice recombination, in which the viral RNA-dependent RNA polymerase (RdRp) switches from one RNA template to another, is a well-known mechanism mediating recombination not only between related viruses but also among distantly related viruses, and even with non-viral RNAs 11. It is now important to elucidate the molecular mechanisms of RNA-DNA recombination that lead to the emergence of chimeric RDHV-like viruses and to the endogenization of RNA viruses.

RNA viruses and DNA plasmids might be at the origin of small DNA viruses

The discovery of RDHV might also help us understand the enigmatic relationship between small icosahedral viruses with RNA and DNA genomes that build their virions using structurally related jelly-roll capsid (JRC) proteins. For the following discussion, it is important to mention that structural similarity between viral capsid proteins most likely represents homology, rather than convergence (for discussions see 3, 5). The jelly-roll fold is one of the most prevalent structural folds found in the capsid proteins of small icosahedral RNA viruses belonging to a number of different families (Fig. 2). In contrast, the diversity of viruses with DNA genomes utilizing the same type of capsid proteins is much more limited (Fig. 2). All of these viruses are among the simplest known DNA viruses (e.g. circoviruses, Fig. 1). Interestingly, all DNA viruses with JRC proteins utilize RCR Rep-like proteins for genome replication 12. In some (e.g. geminiviruses, microviruses), these proteins function as genuine RCR initiators, while in others (polyomaviruses and papillomaviruses) they have evolved to perform other genome replication-associated functions, albeit preserving the ancestral structural fold 12. RCR Reps are also typical for bacterial and archaeal plasmids 13, pointing to an evolutionary link between small DNA viruses and plasmids. Indeed, in the light of the RDHV discovery, the hypothesis that certain groups of DNA viruses have emerged from plasmids via acquisition by the latter of the virion formation modules from RNA viruses 8 becomes more credible.

Figure 2.

Three possible evolutionary scenarios for the origin of DNA viruses with jelly-roll capsid proteins. The names of taxonomic groups of RNA (green) and DNA (cyan) viruses with jelly-roll capsid proteins are listed on the right side of the figure. See the main text for details.

There are several possible scenarios that could account for the observed evolutionary relationship between JRC DNA viruses, small icosahedral RNA viruses, and DNA plasmids. Here I focus on three of them (Fig. 2): (i) according to the “independent emergence” scenario, RNA and DNA viruses could have originated from non-viral RNA and DNA replicons by acquiring the pre-capsid gene (i.e. cellular gene for a protein capable of forming icosahedral containers dedicated to genuine cellular functions) independently from each other. However, the fact that the JRC RNA viruses are much more diverse than the corresponding DNA viruses (as reflected by the numbers of corresponding different virus families; Fig. 2) suggests that the former were the first to emerge. This conclusion is also consistent with the general view that RNA viruses are relics of the ancient RNA world and thus predate DNA viruses 4, 6. Besides, repeated independent capture of cellular genes by different replicons, and the adaptation of proteins tailored to perform functions that have nothing to do with the viral life-cycle, appears unlikely. (ii) The second, “gradual transition,” scenario suggests that JRC DNA viruses could have gradually emerged directly from JRC RNA viruses, possibly through an intermediate reverse-transcription stage (Fig. 2). Although such a sequence of events is appealing because of its conceptual linearity, it lacks support from available genomic and structural data. Firstly, the genome replication proteins utilized by JRC RNA and DNA viruses (RdRps and RCR Rep-like proteins, respectively) are evolutionarily unrelated, i.e. RCR Rep could not have evolved from RdRp, and thus must have come from a distinct source. Secondly, none of the known reverse-transcribing viruses builds its virion using JRC proteins. (iii) The discovery of the RDHV genome provides support for the third, “RNA-to-DNA jump,” scenario (Fig. 2), whereby JRC DNA viruses emerge from the JRC RNA viruses in an evolutionary leap involving replacement of the RdRp gene with a plasmid-borne gene for an RCR Rep. The packing of NAs by JRC RNA viruses is not always strictly specific, as has been recently demonstrated in the case of Flock house virus (Nodaviridae) 14. Furthermore, since the persistence lengths of ssRNA and ssDNA molecules are similar 15, the encapsidation of the newly formed viral DNA genome by a JRC protein of an RNA virus appears to be feasible. Such “jumps” from an RNA virus to a DNA virus could have occurred on multiple independent occasions, during different evolutionary periods, some possibly in a pre-LUCA (last universal common ancestor) environment and others in the context of modern cells, producing several ancestral JRC DNA viruses. The latter could then diversify to produce the extant diversity of DNA viruses with JRC proteins. Once in existence, such viruses could engage in a secondary gene exchange with RNA viruses, an event that supposedly led to the emergence of RDHV (Fig. 1; ref. 9). Consequently, small DNA viruses with JRC proteins do not necessarily share a common viral ancestor, and caution should be used when interpreting evolutionary relationships among them. Furthermore, although plasmids have certainly played an important role in the evolution of diverse DNA viruses 16, the “RNA-to-DNA jump” scenario might not necessarily extend to other (non-JRC) groups of DNA viruses, some of which could have emerged by totally different mechanisms (e.g. by “gradual transition” 6).

Chimeric viruses illuminate the pitfalls of the genome-based virus classification scheme

Horizontal gene transfer (HGT) plays a profound role in the evolution of viruses 16. This mechanism of evolution generates the constantly increasing complexity in the virosphere, and, as biologists, we marvel at it. However, at the same time, it brings us to a conceptual conundrum: how to classify viruses that are mosaics of genes with different evolutionary histories? The smaller the viral genome, the more tangible is the effect of HGT on its identity. The RDHV genome presents an excellent illustration of this point. The two RDHV genes (occupying 80% of its genome) with homologues in the databases have been acquired from different viruses (Fig. 1); the gene for the capsid protein shares evolutionary history with tombusviruses, while the one for RCR Rep apparently originates from a virus related to circoviruses. So, to which of the two viral groups should RDHV be affiliated? Is it a “circovirus-like” or a “tombusvirus-like” virus? Indeed, both options are legitimate. Diemer and Stedman 9 denote RDHV as a circovirus-like entity. Similarly, previous viral metagenomics studies have revealed a number of “circovirus-like” genomes, which, in most cases, can be linked to circoviruses only via their RCR Reps 17. However, considering the similarity between the capsid proteins and genome sizes of RDHV and tombusviruses, the RDHV virion is likely to be very similar to those of tombusviruses (both in size and appearance), but radically different from virions of circoviruses (Fig. 1). From this perspective, RDHV no longer resembles a circovirus. Consequently, the way we classify such chimeric viruses will largely depend on our professional interests and preferences. Scientists focusing on viral (meta)genomics will have no doubt that RDHV is a circovirus-like entity, while for structural virologists it will be obvious that RDHV is a tombusvirus-like virus. I, personally, lean towards the latter position. Firstly, it is the virion (defined by the type of the capsid protein encoded by a virus) that distinguishes viruses from other replicons, such as plasmids (for the discussion on “what makes a virus a virus” see ref. 18). Secondly, genome replication modules are known to be swapped between evolutionarily unrelated viruses as well as between viruses and plasmids, and even between viruses and their hosts 16, 19. This is especially evident in the case of prokaryotic viruses with larger genomes, where the directionality of HGT is easier to trace.

Obviously, virus classification is a human invention, which is not always faithful to viral biology. Thus, sceptics might question whether it is at all worthwhile struggling with virus classification instead of being content with the realization that what actually happens in the virosphere is a continuous shuffling contest, where different functional modules coming from diverse replicons are being mixed and matched to produce the increasing global complexity. This said, virus classification has always provided a framework for understanding the evolution of viruses; in order to continue advancing in this direction, and not to get lost in the maze of horizontal interactions, we need to devise a scheme for classification of chimeric viruses, such as RDHV. Along these lines, together with Dennis Bamford 18, we have previously suggested that virion structure may be considered as a vertical element of any virus. In this context it may be used as a guideline for the high-level virus taxonomy, which would help in revealing deep evolutionary connections between distantly related viruses, and eventually in bringing the much-desired order to the virosphere.

Concluding remarks

The origin of viruses is one of the most exciting questions in biology, but it is also among the ones most difficult to tackle. The finding of Diemer and Stedman that RNA and DNA viruses recombine to produce new chimeric entities might explain the presence of homologous genes in RNA and DNA viruses (e.g. related fusion proteins in herpesviruses and rhabdoviruses 20 or receptor-binding proteins in reoviruses and adenoviruses 21). It therefore appears that the sequence space that viruses can (and do) explore is practically limitless, spanning both RNA and DNA virospheres. The latter realization allows us to consider new scenarios for the origin and evolution of different viral groups. Finally, metagenomics provides a glimpse into future discoveries in virus research. Thus, we should be on the lookout for such chimeric viruses, and be prepared to classify them in a meaningful way.


I would like to thank Drs. Purificacion Lopez-Garcia and Andrew Moore for inviting me to write this commentary.