Genomic analysis of porcine circovirus type 2 from southern China

Abstract Background Porcine circovirus type 2 (PCV2) is recognized as a virulent porcine pathogen and has been linked to porcine circovirus diseases (PCVD). However, there remain many unknowns regarding the spread and epidemic growth of PCV2. Methods To assess the genetic diversity of PCV2 in southern China, a total of 92 sequences of PCV2 strains from this region were retrieved from GenBank and were subjected to amino acid variation and phylogenetic analyses together with 28 representative sequences from different swine‐producing countries, based on the sequence of the ORF2 gene. Results All 92 PCV2 strains shared between 93.7% and 100% sequence similarity and could be divided into four genotypes (PCV2a, PCV2b, PCV2d and PCV2h), of which PCV2d has surpassed PCV2b and become the most prevalent PCV2 genotype in this region. Alignment of the deduced amino acid sequences of the capsid protein revealed that the obtained PCV2 strains possess two major heterogenic/hypervariable regions (positions 52–68 and 185–191), which were within or close to the epitopic regions in the capsid (Cap) protein. Meanwhile, the 92 PCV2 sequences also show evidence of at least five unique recombination events. Conclusion The data in this study indicate that the PCV2 strains in southern China are undergoing constant genetic variation and that the predominant strain and its antigenic epitopes in this area have been gradually changing in recent years.

PCV2 is a non-enveloped, single-stranded DNA virus whose virion is icosahedral and 17 nm in diameter (Lv, Guo, & Zhang, 2014). PCV2 has an ambisense, closed, circular genome of 1,766-1,768 or 1,777 nucleotides that is computationally predicted to possess 11 overlapping open reading frames (ORFs) (Nguyen et al., 2012). To date, six ORFs have been characterized in detail: ORF1 codes for two replication-associated proteins (Rep and Rep'), ORF2 codes for the capsid protein (Cap) involved in the host immune response, ORF3 codes for the apoptotic protein, ORF4 codes for the anti-apoptotic protein (Lv, Guo, Zhang, & Zhang, 2016), ORF5 codes for a novel, potentially endoplasmic reticulum stress-inductive protein (Lv, Guo, Xu, Wang, & Zhang, 2015) and ORF6 codes for a newly discovered protein that may be involved in caspase regulation and the expression of multiple cytokines in PCV2-infected cells (Li, He, et al., 2018).
In particular, ORF2 is a common target gene for epidemiological and phylogenetic analyses of PCV2 strains, because ORF2-based analysis has been shown to be representative of full-genome analysis. Previous studies have substantiated that the Cap protein encoded by ORF2 possesses three specific antigenic sites (aa 69-83, aa 117-131 and aa 169-183) and three spatially overlapping antigenic epitopes (aa 47-63, aa 165-200 and aa 230-233) (Lv et al., 2014).
PCV2 can be divided into eight genotypes (PCV2a to PCV2h) according to the updated phylogeny-based genotype definition described by Franzo and Segalés (2018). PCV2a and PCV2b are the most common genotypes. In approximately 2003, there was a global-scale shift in PCV2 from genotype PCV2a to PCV2b, which is now highly prevalent in many countries. PCV2c was first identified in Denmark (Dupont, Nielsen, Baekbo, & Larsen, 2008), while PCV2d and PCV2e were discovered later in China and other countries (Guo, Lu, Wei, Huang, & Liu, 2010; Wang et al., 2009). PCV2f is a novel genotype that was first identified by Bao et al. (2018), while PCV2g and PCV2h are two further genotypes that were recently proposed by Franzo and Segalés (2018). Additionally, under another PCV2 genotype definition, PCV2 can be divided into group 1 (PCV2b) and group 2 (PCV2a) with eight clusters (1A-1C and 2A-2E) (Olvera, Cortey, & Segalés, 2007).
The link between PCV2 group and disease status has been investigated many times, but the results are not clear-cut. It is nevertheless generally accepted that PCV2a, PCV2b and PCV2d are able to experimentally reproduce PCV2-SD under appropriate circumstances, such as co-infection with other swine pathogens or immunostimulation by vaccines or adjuvants (Gillespie, Opriessnig, Meng, Pelzer, & Buechner-Maxwell, 2009; Opriessnig et al., 2014; Segalés, 2015).
In China, PCV2 infection was first recognized in 1996 and was subsequently identified in most pig farms in different regions (Bao et al., 2018). Several studies have revealed that Chinese strains share a high nucleotide sequence identity and mainly belong to genotypes PCV2b and PCV2d, with PCV2d becoming more dominant. There are known recombinant events of PCV2 within the ORF2 gene that are considered to be type-specific and closely related to pathogenesis (Cai et al., 2012; Wang et al., 2009). However, there remain many unknowns regarding the spread and epidemic growth of PCV2. In the present study, we obtained the complete genomes of 92 PCV2 strains from the Guangxi Beibu Gulf economic zone in China from GenBank and subjected them to amino acid variation and phylogenetic analyses, along with 28 PCV2 strains from different geographic regions throughout the world, based on the sequence of ORF2. This information may provide valuable insights into the genetic variation and phylogenetic characteristics of PCV2 populations circulating in this area and shed new light on the choice of vaccines that are ultimately used.

| Phylogenetic analyses of PCV2 sequences
To provide better insight into the extent of genetic heterogeneity among PCV2 strains in southern China, a phylogenetic tree of the ORF2 gene was constructed with MEGA v.10.0.5 software using the neighbour-joining (NJ) method (Kumar, Stecher, & Tamura, 2016). A bootstrap value was calculated using 1,000 replicates. The newly proposed method for genotyping by Franzo and Segalés (2018) was applied.
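The bootstrap step described above can be illustrated with a minimal Python sketch. It is not the MEGA implementation: the toy alignment, the function names and the use of a simple p-distance are assumptions made for illustration only, since the text does not state which distance model was used to build the neighbour-joining tree.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ("-") in either sequence are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def distance_matrix(alignment):
    """Pairwise p-distances, keyed by (name_a, name_b)."""
    names = sorted(alignment)
    return {
        (p, q): p_distance(alignment[p], alignment[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Hypothetical toy alignment; real input would be the aligned ORF2 sequences.
toy = {"s1": "ATGACGTATC", "s2": "ATGACGTACC", "s3": "ATGTCGTACC"}
rng = random.Random(0)  # fixed seed so the replicate is reproducible
replicate = bootstrap_alignment(toy, rng)
replicate_distances = distance_matrix(replicate)
```

In a full analysis, a tree would be built from each of the 1,000 replicate distance matrices, and the bootstrap value of a branch would be the percentage of replicate trees that contain it.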

| Analysis of antigenic structure of PCV2 ORF2 gene
It is widely accepted that variation in the antigenic structure of the deduced capsid protein encoded by the ORF2 gene is important evidence for viral adaptability. Therefore, we performed an epitope cluster analysis on the ORF2 genes used in this study via the newly developed cluster-breaking algorithm with an identity threshold of 80% (http://tools.iedb.org/cluster2/). In addition, predictions of B-cell epitopes, secondary structures and surface locations were performed on this target gene using the BepiPred method (Minin, Bloomquist, & Suchard, 2008). These synthetic datasets displayed the variation in this gene sequence among different PCV2 strains.

| Nucleotide and amino acid substitutions analysis of PCV2 ORF2 sequences
Substitution rates of the nucleotide sequences and the deduced amino acid sequences of ORF2 gene products were analysed by the Tamura-Nei method using MEGA v.10.0.5. Tajima's neutrality test was performed to calculate the Θ, π and D values, which are the common indicators used in the neutrality test to evaluate the selection pressure on the group being tested, following the procedures described elsewhere (Nielsen, 2001).
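For readers unfamiliar with the quantities named above, the following Python sketch computes Watterson's Θ, the mean pairwise difference π and Tajima's D using the standard formulas from Tajima (1989). It is an illustration rather than the MEGA procedure used in the study: the function name and toy sequences are hypothetical, gap-free sequences are assumed, and Θ and π are returned per sequence rather than per site.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (theta_w, pi, d) for equal-length, gap-free aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance term is zero below that)")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    theta_w = s / a1  # Watterson's estimator
    d = (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, d


toy = ["AAAAA", "AAAAT", "AAATT", "AATTT"]  # hypothetical sequences
theta_w, pi, d = tajimas_d(toy)
```

A D value below zero means that π is smaller than Θ, that is, there is an excess of low-frequency variants, which is the pattern usually read as a sign of population expansion or selection.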

| Recombination analysis of PCV2 sequences
To investigate the recombination rates, putative breakpoints and potential parental sequences of the PCV2 genomes, the Recombination Detection Program (RDP v.4.97) was used according to the recommendations of previous studies (Mu et al., 2012). To confirm the identified recombinant events and ensure an acceptably low rate of false positives, the seven recombination detection methods implemented in the RDP4 software were then applied again, namely RDP, MaxChi, GeneConv, BootScan, SiScan, 3Seq and Chimaera, which are abbreviated to R, M, G, B, S, T and C, respectively. The correlation parameters were identical to those reported by Li, He, et al. (2018).
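The confirmation step, keeping only events supported by every detection method, can be sketched as follows. The event names and the flag-string representation are hypothetical and do not reproduce the actual RDP4 output format; only the seven single-letter abbreviations come from the text.

```python
# Single-letter abbreviations of the seven methods, as listed in the text.
METHODS = "RMGBSTC"


def supported_by_all(events):
    """Return the names of events flagged by every method in METHODS."""
    return [name for name, flags in events.items() if set(METHODS) <= set(flags)]


def support_count(flags):
    """Number of distinct methods that detected an event."""
    return len(set(flags) & set(METHODS))


# Hypothetical example: each event maps to the methods that detected it.
events = {
    "event_1": "RMGBSTC",  # detected by all seven methods
    "event_2": "RMB",      # detected by three methods
    "event_3": "MGBSTC",   # missed by the RDP method
}
confirmed = supported_by_all(events)
```

Requiring agreement across all seven methods is the strictest possible filter; a looser rule, such as a minimum `support_count`, would trade more detected events for a higher risk of false positives.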

| Nucleotide sequences analysis
The sequence analysis showed that the complete genomes of these

| Phylogenetic relationships of PCV2 isolated in southern China
Based on the subgroup terminology in a previous report [13], the phylogenetic analysis indicated that the 92 strains could be divided into four genotypes: PCV2a (7/92), PCV2b (37/92), PCV2d (47/92) and PCV2h (1/92), with PCV2d being the currently prevalent genotype (Figure 2; Table 1) [18]. Further genotyping of these PCV2 strains showed that strains from the same region may fall into different subgroups, and strains from different regions may fall into the same subgroup (e.g. MH465458 and MH465420; EF675240 and MH465419) (Table 1; Figure 1) (Wang et al., 2009; Zhang et al., 2014). This is similar to the findings of a previous study, which hypothesized that the prevalence of different PCV2 subgroups does not have a significant geographical characteristic (Li et al., 2010).
Consistent with this, the genetic distances between the Chinese PCV2 strains were relatively large (data not shown) and the strains of each genotype were interspersed with PCV2 strains of the same genotype from around the world.
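The genotype counts reported above can be converted to proportions with a few lines of Python. The counts are taken from the text; the variable names are arbitrary.

```python
# Genotype counts among the 92 strains, as reported in the text.
counts = {"PCV2a": 7, "PCV2b": 37, "PCV2d": 47, "PCV2h": 1}

total = sum(counts.values())
shares = {genotype: 100 * n / total for genotype, n in counts.items()}
dominant = max(shares, key=shares.get)

for genotype, share in shares.items():
    print(f"{genotype}: {counts[genotype]}/{total} ({share:.1f}%)")
```

On these counts PCV2d accounts for just over half of the strains and PCV2b for about two fifths, so the two genotypes together cover more than 90% of the data set.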

| Amino acid analysis
To investigate variation in the amino acid sequences of the putative capsid protein, the deduced amino acid sequences of the ORF2 gene of 100 PCV2 strains, including the 92 Chinese strains and 8 selected representative strains, were subjected to pairwise alignment.
As shown in Figure 3, the divergence at the amino acid level was greater than that at the nucleotide level (Table 3), with similarity ranging from 78.6% to 100% (data not shown). This information may provide new evidence for PCV2 phenotyping.
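A similarity range of this kind, together with the positions that vary across strains, can be derived from an alignment as in the sketch below. The toy peptide sequences and function names are hypothetical, and identity is computed by simple column matching on pre-aligned sequences, which may differ from the similarity measure used in the study.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of matching columns between two aligned sequences.

    Columns that are gaps ("-") in both sequences are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    cols = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def identity_range(alignment):
    """Lowest and highest pairwise identity across all sequence pairs."""
    values = [
        percent_identity(alignment[p], alignment[q])
        for p, q in combinations(alignment, 2)
    ]
    return min(values), max(values)


def variable_positions(alignment):
    """1-based alignment positions where more than one residue is observed."""
    seqs = list(alignment.values())
    return [i + 1 for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]


# Hypothetical 10-residue fragments; real input would be the aligned Cap sequences.
toy = {"cap_a": "MTYPRRRYRR", "cap_b": "MTYPRRRFRR", "cap_c": "MTYPRRRYRR"}
lowest, highest = identity_range(toy)
variable = variable_positions(toy)
```

Runs of consecutive entries in `variable` are a simple way to locate candidate hypervariable regions such as the ones reported for positions 52-68 and 185-191.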

| Recombination analysis
Based on the analysis performed using the RDP v.4.97 software, a total of 18 potential recombination events were detected within all   (Table 4). The only recombination event that was confirmed separately by all seven methods implemented in RDP4 is shown in Figure 5.

| DISCUSSION
Homology analysis demonstrated that the overall diversity of PCV2 strains in southern China is still low, as the lowest sequence similarity observed between any two Chinese strains was 93.7%, which is similar to the sequence similarity of 94.6% found in a previous study in 2009 (Harmon et al., 2015) and the sequence similarity of 92.7% described in another report in 2012 (Table 3) (Mu et al., 2012). However, phylogenetic and other genetic variation analyses that aim to elucidate the evolution and spread of PCV2 are still compelling, because the ORF2 gene of PCV2 is relatively free of recombination, which is a prominent feature of PCV2 evolution (Olvera et al., 2007).
Based on the ORF2 gene, a phylogenetic tree was generated using the neighbour-joining method. The results indicated that the 92 PCV2 isolates from southern China fell mainly into two major genotypes (PCV2b and PCV2d), with PCV2d being the predominant PCV2 genotype in this region (Figure 2 and Table 1). Meanwhile, we found that the ORF1 region in three Chinese strains shared high homology with that of a PCV2c strain that was first recognized in Denmark and reported in a recent study, although no evidence of the presence of PCV2c was found in this region (Liu, Wang, Zhu, Sun, & Wu, 2016). It is still unclear, however, whether the three Chinese strains were derived from Denmark through international trade or arose through genetic variation. Notably, the size of the PCV2d group is significantly larger than those reported by Wang et al. (2009) and Mu et al. (2012), and is significantly larger than the PCV2b group, which indicates that PCV2d is undergoing an increase in population size due to some directional selection and is becoming the predominant genotype in China. This conclusion was further supported by a nucleotide substitution rate analysis, in which the D value derived from Tajima's neutrality test was significantly less than zero. Nevertheless, all 92 PCV2 strains analysed in this study were collected after 2006 (Table 1) (Wang et al., 2009; Zhang et al., 2014), which coincided with an increase in the severity of PCVD cases in China (Li et al., 2010). Moreover, the findings are consistent with a previous report indicating that the genotypic shift from PCV2b to PCV2d likely occurred in approximately 2010 (Franzo & Segalés, 2018; Yang et al., 2018). Similarly, global genetic analysis indicated that PCV2 evolution has proceeded from PCV2a to PCV2b to PCV2d and that this shift has occurred in many countries (Franzo & Segalés, 2018).
In the USA, a variant PCV2 mutant strain designated as mPCV2b, and now grouped in PCV2d, was detected in several PCVD cases in 2012 and its prevalence appears to have increased in recent years, indicating that a genotype shift from PCV2b to PCV2d is under way on a global scale (Franzo & Segalés, 2018; Jiang et al., 2017; Xiao, Halbur, & Opriessnig, 2012). Currently, a series of commercial vaccines based on PCV2a or PCV2b are extensively used in this region, which may also contribute to the evolution of PCV2 from the PCV2b to the PCV2d genotype, as PCV2 is able to evolve through mutation and recombination in response to the widespread application of these vaccines. Further phylogenetic analysis indicated that the PCV2b strains obtained from different parts of the region were closely related to each other but were more and another two different clusters, shaded in grey and pink respectively, were also discovered within the PCV2a genotype (Figure 2).
As reported previously, the Cap protein is considered the most variable structural protein of PCV2, and the amino acid variation in this region might be associated with pathogenicity and/or immunogenicity (Fenaux, Opriessnig, Halbur, Elvinger, & Meng, 2004; Mu et al., 2012). Therefore, an amino acid alignment of the Cap protein, which is encoded by the ORF2 gene, was also conducted. Our results show that there are two major regions of variation, residues 52-68 and 185-191, that correspond to two of the three dominant immunoreactive areas identified by Lekcharoensuk et al. (2004). In addition, these data show that patterns specific to each group exist; for example, Chinese PCV2 strains clustered within PCV2h had one amino acid marker region located at positions 57-63 and two specific amino acid variations at positions 124-125. Surprisingly, however, little or no significant variation was observed within the newly reported antigenic recognition regions (residues 117-131, 132-146, 156-162, 195-202 and 230-233) (Li et al., 2010; Shang et al., 2009). All of the ORF2 genes of the 47 PCV2d strains from southern China encoded 234 aa, while the vast majority of the remaining strains encoded 233 aa, which is the same number that was encoded by ORF2 in the PCV2 strains isolated from other parts of the world (Figure 2) (Gu et al., 2012). A similar phenomenon occurred in the antigenic epitope 26-RPWLVHPRHRY-36 in the nuclear localization signal region of the PCV2 Cap protein (Guo, Lu, Huang, Wei, & Liu, 2011).
Interestingly, the N-terminus of the Cap protein encompassing the nuclear localization signals was also found to be fairly well conserved in all 92 PCV2 sequences (Figure 3), further confirming the inferred importance of this site. Epitope cluster analysis revealed that there were eight potential epitopes within the Cap protein in the 92 PCV2 strains (Figure 4), of which two epitopes (residues 5-40 and 130-190) were mainly related to antigen recognition and were highly conserved in all 92 analysed PCV2 sequences. We noted, however, that the remaining short epitopes do not correspond to the six dominant immunoreactive areas identified in other studies, and this finding may therefore present new evidence for PCV2 adaptability.
FIGURE 4 Epitopes and predictions in genes of part of the species. Epitopes: Positions above epitope threshold. Predictions: The protein sequence is displayed with an orange gradient, illustrating BepiPred-2.0 predictions. Structural: Helix (H; pink probability gradient), Sheet (E; blue probability gradient) and Coil (C; orange probability gradient) predicted. Surface: Buried (B)/Exposed (E) and orange gradient illustrating predicted relative surface accessibility.

ETHICS STATEMENT
All the 120 PCV2 full-genome sequences were retrieved from GenBank and thereby no ethical approval was required. GK-AA17202037), and we would like to thank PhD Xiao-Hu Hu for technical assistance.

CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.