Evolutionary study of COVID‐19, severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) as an emerging coronavirus: Phylogenetic analysis and literature review

Abstract Emerging coronaviruses, especially severe acute respiratory syndrome coronavirus (SARS‐CoV) and Middle East respiratory syndrome coronavirus (MERS‐CoV), have long been a global human health concern. After a novel coronavirus (called SARS‐CoV‐2) was identified in Wuhan, China, in December 2019, many researchers focused on its epidemiology and its virological and clinical features. SARS‐CoV‐2 is classified in the genus Betacoronavirus and the subgenus Sarbecovirus (lineage B). The virus shows great similarity to SARS‐CoV and bat SARS‐like coronaviruses. In this study, we evaluate SARS‐CoV‐2 phylogeny and evolution using current virus and related sequences.

Based on this classification, SARS-CoV and MERS-CoV are classified in Sarbecovirus and Merbecovirus, respectively. Based on phylogenetic study, it could also be concluded that the Wuhan seafood market pneumonia virus, SARS-CoV-2 (2019-Novel Coronavirus), will be classified as a Sarbecovirus.

Epidemics of SARS-CoV infection, an emerging coronavirus, began in Guangzhou, China in 2002 and by 2003 had spread to 29 countries, leading to 8,089 cases and a 9.6% mortality rate. SARS-CoV was a recombinant virus from horseshoe bats that was transmitted to the human population through civet cats and has not been reported since 2004 (Hui et al., 2020). The virus phylogeny was comprehensively studied by Luk et al. (2019).
Another emerging coronavirus is MERS-CoV, which was first reported in the Kingdom of Saudi Arabia in 2012 and is still circulating; it is transmitted to the human population through camels (Azhar et al., 2019). MERS-CoV has caused 2,465 confirmed cases with an approximate 35% mortality rate (Hui et al., 2020). The most recent emerging coronavirus is SARS-CoV-2 (2019-Novel Coronavirus), or Wuhan seafood market pneumonia virus. It was initially reported in patients presenting with pneumonia of unknown cause associated with the seafood market in Wuhan, China in December 2019 (Bogoch et al., 2020). Based on World Health Organization statistics and situation reports (August 30, 2020), the virus has circulated all around the world and led to nearly 25 million confirmed cases and 800,000 deaths worldwide (WHO, 2020). Further investigations established its diagnosis guidelines, virus features, and related morbidity and mortality rates (Hui et al., 2020; https://www.who.int/emergencies/diseases/novel-coronavirus-2019/technical-guidance; https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200128-sitrep-8-ncov-cleared.pdf?sfvrsn=8b671ce5_2). In this study, we tried to assess the virus phylogeny and evolution using current virus and related sequences.

| SARS-COV-2 CLASSIFICATION AND GENOME ORGANIZATION OF SARBECOVIRUS
Most of the virological features and clinical data regarding SARS-CoV-2 were established in a study published in January 2020, which reported the SARS-CoV-2 classification and genome organization using Oxford Nanopore Technology sequencing. The results showed that the virus is related to the bat SARS-like coronavirus. The virus was also classified as lineage B of the Betacoronavirus genus. Its genome includes six accessory ORFs (3a, 6, 7a, 7b, 8 and 10; NCBI reference sequence: NC_045512.2). After the complete genome of SARS-CoV-2 was published in January 2020, comparison with bat SARS-like coronaviruses (including bat-SL-CoVZXC21) showed 88%-99% identity for different proteins, whereas identity between SARS-CoV-2 and SARS-CoV was 68%-100%. Furthermore, a study by Zhou et al. (2020) indicates that the strain most similar to SARS-CoV-2 is bat SARS-related coronavirus RaTG13 (GISAID accession number: EPI_ISL_402131), with 96.2% similarity based on assessment of the full-length genome, RNA-dependent RNA polymerase (RdRp) and spike (S) genes. The most conserved protein between SARS-CoV-2 and SARS-CoV was the helicase (nsp13). The SARS-CoV-2 protease also seems to be more similar to that of the bat SARS-like coronavirus. The amino acid sequence that differed most between SARS-CoV-2 and SARS-CoV was ORF 3b. The ORF 3b protein in SARS-CoV is responsible for Type I interferon inhibition and induces apoptosis (Liu et al., 2014; Luk et al., 2019). A novel protein in ORF 3b of SARS-CoV-2 was also identified, which needs further investigation.

FIGURE 2
Phylogenetic analysis of a 300-base region of the RNA-dependent RNA polymerase gene of coronaviruses by the neighbour-joining method using 1,000 bootstrap replicates. The red triangle shows the SARS-CoV-2 reference sequence. The coronavirus genera are shown as alpha (blue font), gamma (green font) and beta (red font). The beta genus is divided into four subgenera (highlighted): Sarbecovirus, Nobecovirus, Merbecovirus and Embecovirus. SARS-CoV-2, severe acute respiratory syndrome coronavirus 2

Recent investigations reported novel variants of the RdRp gene of SARS-CoV-2 in patients from Nevada (Hartley et al., 2020). A study of intrahost virus evolution in cancer patients by Siqueira and colleagues highlights the importance of continued monitoring of SARS-CoV-2 mutations and evolutionary patterns (Siqueira et al., 2020). Meanwhile, Bai et al. (2020) released a comprehensive profile of evolution and mutations in SARS-CoV-2 based on the assessment of 16,373 genome sequences. [...] also prompted further study (Lim et al., 2019; Xu et al., 2019).
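The identity percentages quoted in this section are computed from aligned sequences. As a rough illustration only, the sketch below computes percent identity for two sequences that are assumed to be already aligned to the same length, with `-` as the gap character and gapped columns skipped. Alignment tools differ in how they treat gaps and in the denominator they use, so this will not necessarily reproduce the published values. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same alignment (equal length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy, made-up fragments: one mismatch in ten compared columns.
    print(percent_identity("ATGTTCGTTT", "ATGTACGTTT"))  # 90.0
```

The same function works for amino acid alignments, since it only compares characters column by column.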

| SARS-COV-2 PHYLOGENETIC ANALYSIS AND PROBABLE EVOLUTION
Most of the phylogenetic analyses were performed using MEGA X software and sequences obtained from NCBI GenBank (https://www.ncbi.nlm.nih.gov/). Phylogenetic trees were constructed by the neighbor-joining method except for the evolutionary [...]. The statistical assessment of the trees routinely used 1,000 bootstrap replicates (branches with bootstrap values below 70 were excluded from the trees; Saitou & Nei, 1987).
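The neighbor-joining procedure (Saitou & Nei, 1987) referred to above can be sketched in a few lines. This is a minimal textbook-style illustration working from a precomputed distance matrix, not MEGA X's implementation. The function name `neighbor_joining`, the internal node labels (`node0`, `node1`, ...) and the five-taxon toy matrix are assumptions made for illustration and are not data from this study.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Build an unrooted tree from a symmetric distance matrix.

    Returns a list of (node, neighbour, branch_length) edges.
    Assumption: taxon names are unique and do not start with "node".
    """
    # d[x][y] is the current distance between active nodes x and y.
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names) if b != a}
         for i, a in enumerate(names)}
    edges = []
    next_id = 0

    while len(d) > 2:
        n = len(d)
        totals = {x: sum(d[x].values()) for x in d}

        # Join the pair that minimises Q(i, j) = (n - 2) d(i, j) - r_i - r_j.
        i, j = min(
            combinations(d, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - totals[p[0]] - totals[p[1]],
        )
        dij = d[i][j]
        new = f"node{next_id}"
        next_id += 1

        # Branch lengths from the joined pair to the new internal node.
        len_i = 0.5 * dij + (totals[i] - totals[j]) / (2 * (n - 2))
        len_j = dij - len_i
        edges.append((new, i, len_i))
        edges.append((new, j, len_j))

        # Distances from the new node to every remaining node.
        new_row = {k: 0.5 * (d[i][k] + d[j][k] - dij)
                   for k in d if k not in (i, j)}
        del d[i]
        del d[j]
        for k in d:
            del d[k][i]
            del d[k][j]
            d[k][new] = new_row[k]
        d[new] = new_row

    # Connect the last two active nodes.
    a, b = d
    edges.append((a, b, d[a][b]))
    return edges


if __name__ == "__main__":
    taxa = ["a", "b", "c", "d", "e"]
    toy = [
        [0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0],
    ]
    for parent, child, length in neighbor_joining(taxa, toy):
        print(parent, child, length)
```

In practice the distance matrix would be estimated from the aligned sequences under a substitution model, and each bootstrap replicate would repeat the tree construction on resampled alignment columns.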

| Envelope
Research on the envelope (E) protein of coronaviruses has revealed that it is a short membrane protein containing three main domains: the N-terminus, a large hydrophobic transmembrane domain (TMD) and the C-terminus (Schoeman & Fielding, 2019). Its domains are illustrated in Figure 3. The E protein is important in virus assembly, in release from the host cell via the TMD, and in pathogenesis in SARS-CoV (Hogue & Machamer, 2008; Jimenez-Guardeno et al., 2014; Schoeman & Fielding, 2019; Ye & Hogue, 2007). The SARS-CoV-2 E protein has been shown to have 100% similarity to bat-SL-CoVZXC21 and 95% similarity to SARS-CoV. The phylogenetic study of the E gene sequence in our study showed great similarity with sarbecoviruses, which points to the probable importance of the E protein in the pathogenesis of SARS-CoV-2 (Figure 4).

| Spike, Membrane, and Nucleocapsid
The M gene phylogenetic study shows the similarity between the SARS-

| ORF3
The neighbour-joining phylogenetic analysis of full-length ORF3 of SARS-CoV-2 shows the greatest similarity with bat SARS-CoV, bat SARS-CoV HKU-3 and MERS-CoV (Figure 10). A study of MERS-CoV with deletions of ORFs 3, 4 and 5 shows that replication and pathogenesis of this mutant virus, including anti-inflammatory responses, are lower than those of the wild-type virus in human respiratory cell cultures and animal models (Menachery et al., 2017). ORF3b in SARS-CoV-2 seems to be unique in comparison with SARS-CoV or SARS-related CoVs. It was also revealed that this ORF is important in pathogenesis due to its anti-IFN activity and, despite the differences, it was suggested that this ORF could still be functional in anti-IFN activity.

FIGURE 7
Phylogenetic analysis of a 1,259-base region of the N gene of coronaviruses by the neighbor-joining method using 1,000 bootstrap replicates. The red triangle shows the SARS-CoV-2 reference sequence. The scale represents 0.1 substitutions per nucleotide position. All accession numbers and full strain names are listed. SARS-CoV-2, severe acute respiratory syndrome coronavirus 2
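The bootstrap values attached to the trees come from resampling alignment columns with replacement. The sketch below illustrates only that resampling step on a deliberately simplified statistic: the percentage of replicates in which a query sequence is strictly closer (by proportion of differing sites) to one reference than to another. A real tree bootstrap rebuilds the whole tree for each replicate and counts how often each clade recurs, so this is not the procedure used for the figures. All function names, the default seed and the toy sequences are hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_support(query: str, near: str, far: str,
                      replicates: int = 1000, seed: int = 1) -> float:
    """Percent of replicates in which `query` is strictly closer to `near` than to `far`."""
    if not (len(query) == len(near) == len(far)) or not query:
        raise ValueError("sequences must be non-empty and aligned to the same length")

    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(query)
    hits = 0
    for _ in range(replicates):
        # One bootstrap replicate: draw n column indices with replacement.
        cols = [rng.randrange(n) for _ in range(n)]
        q = "".join(query[c] for c in cols)
        a = "".join(near[c] for c in cols)
        b = "".join(far[c] for c in cols)
        if p_distance(q, a) < p_distance(q, b):
            hits += 1
    return 100.0 * hits / replicates


if __name__ == "__main__":
    # Toy, made-up sequences; 70 mirrors the cut-off mentioned in the text.
    support = bootstrap_support("ACGTACGTAC", "ACGTACGTAA", "ACGAACTTGC")
    print(support, "kept" if support >= 70 else "excluded")
```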

| ORF8
Multiple sequence alignment (MSA) of the complete genome from SARS-CoV-2 revealed that ORF8 has unique sequences (Ceraolo & Giorgi, 2020). The phylogenetic study of ORF8 of the virus shows high similarity with bat SARS-like coronaviruses (Figures 8 and 9).
The importance of ORF8's similarity to SARS-like coronaviruses and SARS-CoV stems from the history of the SARS-CoV epidemic. At the start of the SARS-CoV epidemic, ORF8 encoded a single protein, while in later periods of the epidemic this ORF encoded two proteins. It has been hypothesized that this change in ORF8 could be a possible explanation for SARS-CoV attenuation and the reduction in replication of the virus (Muth et al., 2018; Oostra et al., 2007; Zhang & Liu, 2020).

| RdRp (nsp12) and helicase (nsp13)
The RdRp sequence is used to classify the viruses into different lineages, as discussed in Sections 2 and 5. Furthermore, nsp13 of the coronaviruses is the viral helicase. Nsp12, or RdRp, of the virus seems to be important in the unwinding process of the helicase (Jia et al., 2019). Neighbour-joining analysis shows that the helicase sequence of SARS-CoV-2 is similar to those of MERS-CoV, Human Coronavirus OC43, HKU-1 and some bat coronaviruses, as illustrated in Figure 10. Neighbour-joining analysis of SARS-CoV-2 nsp13 suggests that the gene shows sequence heterogeneity between different isolates of SARS-CoV-2 (Figure 11), which highlights the importance of further investigation. The interacting domain of the helicase seems to be present in all coronaviruses and might be a good target for potential therapeutic agents (Jia et al., 2019).

FIGURE 8
Phylogenetic analysis of a 365-base region of the ORF8 gene of coronaviruses by the neighbour-joining method using 1,000 bootstrap replicates. The red triangle shows the SARS-CoV-2 reference sequence. The scale represents 0.05 substitutions per nucleotide position. All accession numbers and full strain names are listed. SARS-CoV-2, severe acute respiratory syndrome coronavirus 2

| GEOGRAPHICAL DISTRIBUTION OF SARS-COV-2 AND COMPLETE GENOME ALIGNMENT
The first case of SARS-CoV-2 outside China was reported in Thailand in January 2020 (Cheng et al., 2020). In total, 25 million confirmed cases and more than 800,000 deaths due to COVID-19 had been recorded worldwide as of 30 August 2020 (WHO, 2020). The virus evolution studies established by using the

| CURRENT THERAPEUTIC OPTIONS FOR SARS-COV-2
The history of anti-protease drugs for the treatment of coronavirus infections dates back to the SARS-CoV epidemic (Chu et al., 2004).
The use of lopinavir/ritonavir seemed to be potentially beneficial in treating SARS-CoV patients. Apart from these anti-HIV drugs, cinanserin (an older inhibitor of the 3-chymotrypsin-like (3C-like) protease) could be useful in treatment. Another drug that also seems to be useful in the treatment of SARS-CoV-2 is chloroquine phosphate, owing to its anti-inflammatory functions. Furthermore, chloroquine phosphate could be effective by increasing endosomal pH and preventing virus fusion with the host cell (Gao et al., 2020).
In the study conducted by Zhang et al., it was suggested that supportive treatments, coronavirus-specific treatments and antiviral treatments in infected patients, and the use of the influenza vaccine in non-infected people, could all be helpful, but that one important factor is the patients' nutritional status. Regarding nutritional status, the role of vitamin D remains controversial (Borst et al., 2011). Use of the Bacillus Calmette-Guérin vaccine has been recommended (Dara et al., 2020). Recently, blocking virus attachment to ACE-2 receptors and the use of convalescent plasma therapy also seem to be useful in treatment.

| ETHICS
This study was approved by the Ethical Committee of Iran University of Medical Sciences, Tehran, Iran (code: IR.IUMS.REC.1398.1327).

CONFLICT OF INTEREST
None declared.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/vms3.394.