Cross‐species transmission of deltacoronavirus and the origin of porcine deltacoronavirus

Abstract Deltacoronavirus is the most recently identified genus of the family Coronaviridae. Unlike other coronavirus (CoV) genera, which mainly infect either birds or mammals, deltacoronaviruses (δ‐CoVs) reportedly infect both. Recent studies show that a novel δ‐CoV, porcine deltacoronavirus (PDCoV), can also infect calves and chickens and has the potential to infect humans, raising the possibility of cross‐species transmission of δ‐CoVs. Here, we explored the deep phylogenetic history and cross‐species transmission of δ‐CoVs. Virus–host cophylogenetic analyses showed that δ‐CoVs have undergone frequent host switches in birds and that sparrows may serve as unappreciated hubs for avian‐to‐mammal transmission. Our molecular clock analyses show that PDCoV possibly originated in Southeast Asia in the 1990s and that the PDCoV cluster shares a common ancestor with Sparrow‐CoV dating to around 1810. Our findings contribute valuable insights into the diversification, evolution, and interspecies transmission of δ‐CoVs and the origin of PDCoV, providing a model for exploring the relationships of δ‐CoVs in birds and mammals.

Host range, defined as the number of host species a virus can infect, is a viral trait central to understanding the epidemiology and pathogenicity of pathogens. Although host range is one of the major factors in viral evolution, viruses rarely spread effectively in new hosts that were previously unexposed or uninfected (Lau & Chan, 2015). The cross-species transmission of viruses poses a continuing threat to public health. Increased contact between humans and other animal species raises the possibility of cross-species transmission and subsequent disease outbreaks (Parrish et al., 2008). Among RNA viruses, CoVs contain the largest RNA genomes, ranging from 26 to 32 kilobases in length, which makes them genetically more permissive to genome modification. About two-thirds of each CoV genome comprises two overlapping ORFs (Orf1a and Orf1ab). The remaining genome encodes the structural proteins, namely spike (S), envelope (E), membrane (M), and nucleocapsid (N). Differences also exist in the accessory proteins encoded by different CoVs (Gorbalenya, Luis, John, & Snijder, 2006). Previous studies have reported that cross-species transmission of CoVs is closely associated with the S protein, the nonstructural protein 3 (nsp3) (papain-like protease, PLpro), and the accessory protein(s) (Cui, Li, & Shi, 2019; Forni, Cagliani, Clerici, & Sironi, 2017; Forni et al., 2016).
Coronaviruses have a broad host spectrum; through mutation and recombination, CoVs have an increased probability of interspecies host jumping and of novel CoVs emerging under specific conditions (Su et al., 2016). The SARS-CoV and more recent MERS-CoV epidemics have demonstrated the ability of coronaviruses to cross species barriers and emerge rapidly in humans. Presently, the greatest genetic diversity in α- and β-CoVs is documented in bats, and previous studies have confirmed that SARS-CoV and MERS-CoV originated from bats (Hu et al., 2017; Memish et al., 2013); however, the evolutionary history of δ-CoVs is poorly understood.
Genetic diversity levels in δ-CoVs suggest that they are likely to be widely distributed in avian species around the world. Although no δ-CoV has been found in humans so far, we cannot exclude the possibility that the host range of δ-CoVs (birds, pigs, and leopard cats) might expand through cross-species transmission to other hosts, including humans. The possible zoonotic spread of new viruses to humans would therefore pose a significant threat to global public health.
Here, we investigated the origin and evolution of δ-CoVs. The data we have obtained will help with the preparation of countermeasures against the possible future risk of zoonotic transmission of δ-CoV to humans.

| Phylogenetic and sequence distance analyses
Briefly, the complete genomes and selected primary genes (Orf1ab, S, E-M-N) from δ-CoVs were aligned using MAFFT (Katoh & Standley, 2013). Maximum-likelihood (ML) phylogenetic trees were constructed using IQ-TREE (Nguyen, Schmidt, von Haeseler, & Minh, 2015) with 1,000 bootstrap replicates; the best-fitting nucleotide substitution model was determined automatically by the program. The results were visualized using iTOL v.4 (http://itol.embl.de/). Sequence distance analyses were performed with the SSE v1.3 package (Simmonds, 2012), and sequence divergence scans for the different viral hosts were generated with its inbuilt Sequence Distance program.
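As a rough illustration of the sequence distance step, the p-distance between two aligned sequences is the proportion of compared sites at which they differ. The sketch below is not the SSE implementation; the function names, the rule of skipping sites that carry a gap or an N in either sequence, and the toy sequences are assumptions made for illustration only.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites with a gap ('-') or an ambiguous 'N' in either
    sequence are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def mean_inter_group_distance(group_a, group_b):
    """Mean p-distance over all pairs drawn from two groups of aligned sequences."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(distances) / len(distances)


# Toy aligned fragments (hypothetical, not taken from the study data).
pig_like = ["ACGTACGT", "ACGTACGA"]
sparrow_like = ["ACGTTCGT"]
inter = mean_inter_group_distance(pig_like, sparrow_like)
```

Averaging pairwise values within a host group and between host groups gives intra- and inter-group distances of the kind compared in the divergence analyses.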

| Codon usage analyses
Cross-species transmission in viruses can involve a change in codon preference, which allows the virus to adapt to the new host and support self-proliferation (Bahir, Fromer, Prat, & Linial, 2009).
Genomic GC content is one of the most reliable signals of cross-species codon usage variation (Chen, William, Hottes, Lucy, & Mcadams, 2004), and wobble in the third codon position (GC3s) may indicate viral evolution (Bera et al., 2017). The nucleotide contents (GC, GC12s, and GC3s) of each δ-CoV coding sequence were calculated using the Galaxy website (https://galaxy.pasteur.fr) and CodonW software. GC contents were plotted against GC12s and GC3s using GraphPad Prism v6.01. The codon usage heatmap was drawn with TBtools (Chen, Xia, Chen, & He, 2020); as there was no significant difference in codon usage among the PDCoVs, we treated all PDCoVs as a single group. Frequently used codons, with higher relative synonymous codon usage values, are represented by the largest red circle, moderately used codons by a smaller circle, and rarely used codons by a large green circle.
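The quantities named above can be illustrated with a short sketch. It is not the CodonW or Galaxy implementation: the function names are hypothetical, GC3s is approximated here as the GC fraction at third positions of codons that have synonymous alternatives (Met, Trp, and stop codons excluded), and the standard genetic code is assumed.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc_fraction(bases):
    """Fraction of G or C among the given bases (0.0 for an empty input)."""
    bases = list(bases)
    return sum(b in "GC" for b in bases) / len(bases) if bases else 0.0


def gc_profile(cds):
    """Overall GC, GC at codon positions 1 and 2, and GC3s (simplified, see lead-in)."""
    codons = split_codons(cds)
    synonymous = [c for c in codons if CODON_TABLE.get(c) not in (None, "M", "W", "*")]
    return {
        "GC": gc_fraction("".join(codons)),
        "GC12": gc_fraction(c[i] for c in codons for i in (0, 1)),
        "GC3s": gc_fraction(c[2] for c in synonymous),
    }


def rscu(cds):
    """Relative synonymous codon usage: observed count divided by the count
    expected if all codons for that amino acid were used equally."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy coding fragment (hypothetical): Met-Leu-Leu-Leu-Gly.
profile = gc_profile("ATGCTGCTGCTAGGC")
usage = rscu("ATGCTGCTGCTAGGC")
```

An RSCU value above 1 marks a codon used more often than expected under equal usage of its synonyms, which is the kind of value a codon usage heatmap encodes.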

| Bayesian evolutionary analysis of PDCoVs
To better understand the relationship between the Sp-CoVs and PDCoVs, the time of viral origin was explored through a time-scaled phylogenetic tree constructed in BEAST 2 using the standard Yule model (Bouckaert et al., 2014). To further compare the time of origin for the different genotypes of the PDCoV isolates, a Bayesian time-scaled tree and phylogeographic tree for the PDCoVs were prepared in BEAST 2 using the Mascot package (Muller, Rasmussen, & Stadler, 2018). This algorithm entirely avoids migration history sampling. The genomic dataset was analyzed with a strict clock under a single GTR + gamma substitution model. The chain was run for 100 million MCMC steps with states sampled every 10,000 steps, and the first 10% of samples were discarded as burn-in. Parameter estimates were checked by estimating the effective sample sizes with Tracer 1.7 (Rambaut, Drummond, Xie, Baele, & Suchard, 2018), and trees were visualized using FigTree v1.4.3 (tree.bio.ed.ac.uk/software/figtree/).
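The sampling settings above imply a simple piece of bookkeeping, sketched here together with a basic effective sample size (ESS) estimate. This is an illustration under stated assumptions, not the BEAST 2 or Tracer code: the function names are hypothetical, the initial state is ignored when counting samples, and the ESS estimator (sample count divided by one plus twice the sum of autocorrelations up to the first non-positive lag) is a textbook simplification that may differ from what Tracer computes.

```python
def retained_samples(chain_length, sample_every, burnin_fraction):
    """Return (total logged samples, samples discarded as burn-in, samples kept)."""
    total = chain_length // sample_every
    burnin = round(total * burnin_fraction)
    return total, burnin, total - burnin


def effective_sample_size(trace):
    """Simplified ESS: n / (1 + 2 * sum of autocorrelations).

    The sum stops at the first lag whose autocorrelation is not positive.
    Runs in O(n^2) time, which is acceptable for a sketch.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("trace is constant; ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        rho = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * variance)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Settings stated in the text: 100 million steps, sampled every 10,000, 10% burn-in.
total, burnin, kept = retained_samples(100_000_000, 10_000, 0.10)
# total = 10000 logged states, burnin = 1000 discarded, kept = 9000 retained
```

A strongly autocorrelated trace yields an ESS far below the number of retained samples, which is why the ESS is checked rather than the raw sample count.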

| Accessory protein analyses
VGAS (http://cefg.uestc.cn/vgas/) was used to search for potential protein-encoding segments in deltacoronaviruses, which were visualized using IBS v1.0 (Liu et al., 2015). We arranged the viruses according to the ML phylogenetic tree, and the similarity of PLpro, RNA-dependent RNA polymerase (RDRP), S, and some accessory proteins was compared with Dabbling Duck-CoV and PDCoV.

| Phylogenetic analyses and genetic divergence of δ-CoVs
To characterize the genetic diversity of δ-CoVs among different hosts, we constructed a phylogenetic tree based on the 118 complete δ-CoV genomes. The sequences of 18 avian deltacoronaviruses (ADCoVs) and 100 PDCoVs were obtained from the National Center for Biotechnology Information (NCBI, https://www.ncbi.nlm.nih.gov) for analyzing phylogenetic and host-virus evolutionary relationships (Table S1). Using the genome from infectious bronchitis virus (IBV, a γ-CoV genus member) as the outgroup, maximum-likelihood (ML) phylogenetic trees were constructed based on the complete genomes (Figure 1a) of δ-CoVs using IQ-TREE. Based on the phylogenetic tree topologies and viral hosts, the δ-CoVs separated into three groups: PDCoVs, Sp-CoVs, and other bird-CoVs. Moreover, we constructed three phylogenetic trees based on the ORF1ab, S, and E-M-N gene sequences of δ-CoVs. The topological structure is largely consistent between the genome and ORF1ab trees; however, Sp-CoV strains showed inconsistent topology in the phylogenetic trees of the S and E-M-N genes, differing from the ORF1ab gene phylogenetic tree (Figure S1a). These inconsistent topologies of the S and E-M-N genes of Sp-CoVs, in which outlier sequences were found in other bird-CoV subgroups in the phylogenetic tree (Figure S1a), might be attributed to cross-species transmission and/or genomic recombination.
FIGURE 1 Phylogenetic and host-virus evolutionary analyses for δ-CoVs. (a) ML phylogenetic tree for the δ-CoV genomes. The best-fitting nucleotide substitution model was determined automatically by the program, with 1,000 bootstrap replicates, and the phylogenetic trees were visualized using iTOL v.4 (Interactive Tree of Life, http://itol.embl.de/). The PDCoVs collapsed into one node are shown as a blue triangle, the Sparrow-CoV is shown in red font, and the other ADCoVs are shown in green font; bootstrap support values higher than 95 are shown with orange dots. The sequence names of δ-CoVs are shown in a uniform format (NCBI accession number-Strain name-Target host-Isolated country-Isolated year). (b) Genome-based recombination analysis using SimPlot v3.5. The settings were as follows: window size, 500; step size, 20; gap stripping, on; Kimura distance model. (c) The different event costs used for the host-virus phylogeny congruence test. All possible cospeciation, duplication, and host switch events are shown. Sp-CoV is marked in red font. PDCoV is marked in blue font.
To further explore the evolution of δ-CoVs, we performed phylogenetic analysis of the complete genome and the S gene across the family Coronaviridae. The complete genomes of δ-CoVs share a close kinship with those of γ-CoVs, while the S gene from δ-CoVs shares a close kinship with those from α-CoVs (Figure S2, Table S2). Moreover, intra- and inter-group genetic distance (p-distance) analyses were used to further quantify the genetic divergence of δ-CoVs. As shown in Figure 1b,c, the δ-CoVs shared a higher sequence similarity in the ORF1ab and E-M-N genes. In contrast, considerable genetic diversity is shown in the S gene. Moreover, the genetic diversity in ADCoV is generally higher than that in PDCoV (Figure S1b). Interestingly, ML trees and p-distance analyses showed that for all genes Sp-CoV is closest to PDCoV, suggesting a strong correlation between these viruses (Figure 1a and Figure S1).

| Host-virus evolutionary relationships for δ-CoVs
Sequence divergence for primary genes, the intra- and inter-group p-distances, and the synonymous and nonsynonymous p-distances were calculated. The results showed that PLpro, the S gene, and the accessory proteins from the δ-CoVs displayed high levels of molecular variability (Figure S1b and Figure S3). Interestingly, the synonymous and nonsynonymous mutation rates of the S gene from Sp-CoVs fluctuated over a range spanning that of all δ-CoVs. Moreover, our sequence similarity and recombination analyses showed that the newly discovered Sp-CoV (GenBank accession number MG812375) is more similar to PDCoV across the whole genome (Figure 1b), suggesting that the origin of PDCoV may have occurred via recombination between different Sp-CoVs. A statistically significant signal for phylogenetic incongruence in δ-CoVs showed that PDCoV might have evolved from a recombination event, with the 5′ part of the S gene acquired from one Sp-CoV (ISU690 isolate) and the remaining genomic regions acquired from another Sp-CoV (HKU17 isolate) (Figure 1b).
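A sliding-window scan of the kind used in recombination analysis can be sketched as follows. This is not SimPlot; it is a minimal illustration that assumes a pairwise alignment, the Kimura two-parameter (K2P) distance d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)] with P and Q the proportions of transitions and transversions, and pairwise removal of gapped or ambiguous sites. The function names are hypothetical; the default window and step follow the settings reported in the Figure 1 caption (500 and 20).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Returns None when no sites are comparable or the distance is undefined
    because the sequences are too divergent (saturation).
    """
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # strip gaps and ambiguous bases pairwise
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        return None
    p = transitions / sites
    q = transversions / sites
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return None
    return -0.5 * math.log(term1 * math.sqrt(term2))


def window_scan(query, reference, window=500, step=20):
    """Return (window midpoint, K2P distance) pairs along a pairwise alignment."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window], reference[start:start + window])
        results.append((start + window // 2, d))
    return results


# Toy example (hypothetical sequences) with a small window for readability.
scan = window_scan("ACGTACGT", "ACGTACGA", window=4, step=2)
```

Running the scan of one query against two candidate parents and noting where the closer parent changes along the genome is the usual way a putative breakpoint, such as one in the 5′ part of the S gene, is located.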
To further discern the spread of δ-CoVs among different hosts, we performed event-based cophylogenetic reconstructions using the Jane program (version 4), because it can analyze five types of events in a host-virus phylogeny (cospeciation, duplication, duplication and host switching, loss, and failure to diverge), with each event having an associated cost. The different event costs produced the same result, namely that there is a large amount of host switching in δ-CoVs  Figure S4). We also listed all the potential accessory proteins from δ-CoVs according to the ML phylogenetic tree, and our results reveal that the 3′-tail sequence of PDCoV is shorter than in other δ-CoVs (Figure S3).
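The idea of event costs can be made concrete with a small sketch: each candidate reconstruction is scored as the sum of its event counts weighted by the cost of each event type, and the lowest-scoring reconstruction is preferred. This is not Jane's search algorithm, only the scoring step. The function names, the two cost schemes, and the event counts below are hypothetical values chosen for illustration, not results from this study.

```python
EVENTS = (
    "cospeciation",
    "duplication",
    "duplication_and_host_switch",
    "loss",
    "failure_to_diverge",
)


def reconstruction_cost(event_counts, event_costs):
    """Total cost of one host-virus reconstruction: sum of count * cost per event type."""
    missing = [e for e in EVENTS if e not in event_costs]
    if missing:
        raise KeyError(f"no cost given for: {missing}")
    return sum(event_counts.get(e, 0) * event_costs[e] for e in EVENTS)


def cheapest(reconstructions, event_costs):
    """Name of the reconstruction with the lowest total cost."""
    return min(
        reconstructions,
        key=lambda name: reconstruction_cost(reconstructions[name], event_costs),
    )


# Two illustrative cost schemes (assumed values).
scheme_a = {"cospeciation": 0, "duplication": 1, "duplication_and_host_switch": 2,
            "loss": 1, "failure_to_diverge": 1}
scheme_b = dict(scheme_a, duplication_and_host_switch=1)

# Two hypothetical reconstructions of the same host-virus tanglegram.
candidates = {
    "switch_heavy": {"cospeciation": 5, "duplication_and_host_switch": 6, "loss": 1},
    "loss_heavy": {"cospeciation": 9, "duplication": 2, "loss": 10},
}

best_a = cheapest(candidates, scheme_a)  # costs: switch_heavy 13, loss_heavy 12
best_b = cheapest(candidates, scheme_b)  # costs: switch_heavy 7, loss_heavy 12
```

Because the preferred reconstruction can change with the cost scheme, a conclusion that holds across several schemes, as reported for host switching here, is more robust than one obtained under a single scheme.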

| Origin and evolution of PDCoVs
To better understand the potential evolutionary origin of PDCoVs, the time-scaled phylogenetic tree of PDCoVs was constructed in BEAST 2. As shown in Figure S5a (Park, Park, Song, How, & Jung, 2019). Current research shows that α-CoVs and β-CoVs only infect mammals, whereas γ-CoVs mainly infect birds. Most δ-CoVs infect birds, while some members can infect mammals, indicating that they have undergone host-switching events between these animals. Here, our results reveal that the S protein from δ-CoVs showed a close relationship with those from α-CoVs. Because the S protein plays an essential role in CoV entry, this protein in δ-CoVs may determine whether viral infection in birds or mammals is successful, making successful viruses a potential threat if their mammalian host range expands.
Currently, δ-CoVs have been detected in many birds, and substantial genetic differences occur among different viral species.
Moreover, genetic diversity in ADCoV exceeds that in PDCoV, suggesting a long evolution of δ-CoVs in birds. Thus, a possible risk is that δ-CoVs will spread more widely in birds. Southeast Asia contains one of the most abundant and diverse bird populations in the world, and previous research indicates that avian influenza frequently originates in Asia, especially Southeast Asia (Martin et al., 2006). Whether a high risk for new ADCoV infections in mammals exists in Southeast Asia deserves future attention. Certainly, the close genetic similarity, codon usage, and GC bias in δ-CoVs isolated from sparrows and pigs make the Sp-CoV/PDCoV lineage particularly attractive for researchers in the context of cross-species transmission. The synonymous and nonsynonymous mutation rates of the S gene from Sp-CoVs spanned a wide range across all δ-CoVs, which may allow the virus to spill over into new hosts. The Munia-CoV and PDCoV S genes are highly similar, prompting speculation that PDCoV may have arisen from a recombination event between Sp-CoV and Munia-CoV (Lau et al., 2018). However, our results suggest that the origin of PDCoV may be the result of recombination between different Sp-CoVs.
FIGURE 3 Maximum-clade-credibility tree showing ancestral time and locations for the PDCoVs inferred from the structured coalescent. The branch colors indicate the location states, and the tip shapes of the tree represent the isolated countries of the PDCoVs, as shown in the regional legend. The node at the root of the tree represents the root state posterior probability of the SEA.
Interestingly, the GC content of PDCoVs is slightly below that of Sp-CoVs, possibly promoting adaptation in the avian-derived virus to replicate in mammals. Wong, et al. reported  Generally, cross-species transmission of CoV is closely associated with the viral S protein, PLpro, and the accessory proteins. As one of the main structural proteins, the S protein participates in receptor binding and host adaptability (Cui et al., 2019). PLpro is involved in processing the viral polyproteins and regulates the innate immune response (Forni et al., 2016). Our results showed that fewer sites underwent positive selection in the PLpro gene than in the S gene (2 sites versus 12 sites), which implies that the S gene has played a more important role during PDCoV evolution. As for the accessory proteins, the open reading frames (ORFs) of δ-CoVs encode a wide variety of accessory proteins, some of which are host derived, while some have been lost during viral evolution. Studies have shown that the accessory proteins in PDCoV target the host's antiviral innate immune responses, which is also thought to promote viral adaptation to the host (Fang et al., 2018). Our results reveal that PDCoV has fewer accessory proteins than the other δ-CoVs. This suggests that some of the accessory genes are not essential for PDCoV replication in pigs and are also not the most important factor for the host switch from birds to mammals.
In summary, our analyses provide in-depth insights into the diversification, evolution, and interspecies transmission of δ-CoVs and the origin of PDCoV. Increasing evidence strongly implicates wild birds as the reservoir hosts for δ-CoVs, though how the virus is transmitted within bird populations remains unknown. Given that birds such as sparrows share an ecological niche with domestic mammals, sparrows might act as a potential intermediate host, playing a role in the transmission of δ-CoVs to pigs. Following initial pig infection, pig-to-pig transmission is a predominant feature of PDCoV outbreaks. Given that pigs are in frequent contact with humans and wild animals, the lower interspecies hurdles in pigs would make them a potential mixing vessel for δ-CoVs. Thus, there is still a risk that δ-CoVs may spread to more mammals, including humans. Although sparrows are suspected to be the primary source of infection in pigs and the δ-CoV genomes from pigs and sparrows are highly similar, the routes of direct or indirect interspecies transmission remain unknown. Therefore, detailed case-control studies are needed to unravel the exact transmission routes.

CONFLICT OF INTEREST
None declared.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available in Appendix S1 of this article.