Genomic surveillance of SARS‐CoV‐2 strains circulating in Iran during six waves of the pandemic

Abstract Background SARS‐CoV‐2 genomic surveillance is necessary for the detection, monitoring, and evaluation of virus variants, which can have increased transmissibility, disease severity, or other adverse effects. We sequenced 330 SARS‐CoV‐2 genomes during the sixth wave of the COVID‐19 pandemic in Iran and compared them with those from the five previous waves to identify SARS‐CoV‐2 variants and to understand the genomic behavior and characteristics of the virus. Methods After viral RNA extraction from clinical samples collected during the COVID‐19 pandemic, next generation sequencing was performed using the NextSeq and Nanopore platforms. The sequencing data were analyzed and compared with reference sequences. Results In Iran, the V and L clades were detected during the first wave. The second wave was characterized by the G, GH, and GR clades. The clades circulating during the third wave were GH and GR. In the fourth wave, GRY (alpha variant), GK (delta variant), and one GH clade (beta variant) were detected. All viruses in the fifth wave belonged to the GK clade (delta variant). In the sixth wave, the Omicron variant (GRA clade) was circulating. Conclusions Genome sequencing, a key strategy in genomic surveillance systems, helps to detect SARS‐CoV‐2 variants and monitor their prevalence, track the evolution of the virus, identify new variants for disease prevention, control, and treatment, and inform public health measures in this area. With this system, Iran could be ready for surveillance of other respiratory viral diseases besides influenza and SARS‐CoV‐2.

They are not vital for viral replication; however, they are involved in pathogenesis. 6,7 Coronaviruses are biologically diverse and mutate rapidly. 8 Most changes have little or no effect on the virus's properties. However, some changes may affect properties such as how easily the virus spreads, the severity of the disease it causes, or how well vaccines, therapeutic medicines, diagnostic tools, or other social and public health measures work. 9 Since the beginning of the COVID-19 pandemic, different genetic lineages of SARS-CoV-2 have emerged and spread worldwide. 9 WHO has divided the SARS-CoV-2 variants that may pose an increased risk to public health into three groups: variants under monitoring (VUMs), variants of interest (VOIs), and variants of concern (VOCs). A VUM is a variant with genetic changes that are thought to affect the characteristics of the virus, with some indications that it may pose a threat to public health and safety in the future. VOIs are variants that have been found to cause community spread in multiple cases, clusters, or countries. A VOC is defined by an increase in transmissibility and virulence or a decrease in the efficacy of current public health, social, and therapeutic measures. 10 There are currently 11 clades in the GISAID nomenclature system, which is based on shared marker mutations.
The L and S clades formed early in the pandemic, before L split into V and G. The GR, GH, GV, and GK clades split from base clade G. GR evolved into GRY, which later developed into GRA, the current dominant clade. The O clade contains all sequences that have not been classified. 11 The gold standard for monitoring and identifying new variants of SARS-CoV-2 is whole genome sequencing (WGS) using next generation sequencing (NGS). All SARS-CoV-2 genes can be sequenced using this method, including those encoding non-structural proteins and other regions. 12 It is essential to maintain constant monitoring of the genetic diversity of SARS-CoV-2 in order to (a) ensure that vaccines and immune-based diagnostic or therapeutic interventions are effective, (b) offer a treatment that is much more stable, and (c) observe the pattern of the virus's geographic spread during the ongoing pandemic. 13,14

2 | METHODS

| Data analysis
All reads were mapped to the SARS-CoV-2 reference genome assembly for analysis. The assembled viral genomes were of high quality and contained no unknown nucleotides. The assembled genomes were analyzed with the CoVsurver mutations application in GISAID and aligned using the sequence alignment program BioEdit. Finally, all sequences were submitted to GISAID.
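As a rough illustration of what comparing an aligned sample with the reference involves, the following minimal sketch lists amino acid substitutions and deletions between two pre-aligned protein sequences. It is not the CoVsurver or BioEdit workflow used in this study; the function name `call_substitutions`, the gap character `-`, the use of `X` for an unknown residue, and the toy sequences are all assumptions made for the example.

```python
def call_substitutions(ref_aln: str, sample_aln: str, protein: str = "") -> list[str]:
    """List amino acid changes between two aligned protein sequences.

    Hypothetical helper: both inputs are assumed to be rows of the same
    alignment, with '-' marking gaps. Positions are numbered along the
    reference, so gap columns in the reference (insertions in the sample)
    are skipped in this sketch.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have the same length")

    changes = []
    pos = 0  # 1-based position in the ungapped reference
    for ref_aa, sample_aa in zip(ref_aln, sample_aln):
        if ref_aa == "-":
            continue  # insertion relative to the reference; not reported here
        pos += 1
        if sample_aa == ref_aa or sample_aa == "X":
            continue  # identical residue, or unknown residue (assumed 'X')
        if sample_aa == "-":
            label = f"{ref_aa}{pos}del"
        else:
            label = f"{ref_aa}{pos}{sample_aa}"
        changes.append(f"{protein}-{label}" if protein else label)
    return changes


# Toy sequences (made up, not SARS-CoV-2 data).
example = call_substitutions("MDKS", "MGK-", protein="Spike")
```

The labels follow the notation used in this article (for example, a substitution written as D614G or a deletion written as S1265del), with an optional protein prefix.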

| RESULTS
In this study, 330 confirmed COVID-19 cases from the sixth wave of COVID-19 in Iran were subjected to NGS. The specimens were oropharyngeal swabs collected from all over the country. The variants and amino acid changes in structural, non-structural, and accessory proteins were compared with those of the SARS-CoV-2 strains that circulated in Iran during the first five waves, which were evaluated in our previous study. 3,15 Amino acid changes in structural proteins are listed in Table 1, and those in non-structural proteins are listed in Table 2. It should be noted that amino acid substitutions in accessory proteins were detected in only a limited number of strains in the sixth wave. The highest rate of substitution in these proteins was 1.3%, observed in the BA.1 and BA.2 variants: 1.3% of BA.1 strains had the NS7a-P99S substitution, and 1.3% of BA.2 strains had the NS3-H78Y substitution.
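The per-variant substitution rates reported in this section amount to counting, within each variant, the fraction of genomes that carry a given change. The sketch below shows one way such a tally could be computed. It is an illustration under stated assumptions rather than the pipeline used in this study: the function names, the input layout (one `(variant, substitutions)` pair per genome), and the toy records are hypothetical, and the 0.7 cutoff is exposed as a parameter.

```python
from collections import Counter, defaultdict


def substitution_frequencies(samples):
    """Return {variant: {substitution: fraction of genomes carrying it}}.

    `samples` is an iterable of (variant, substitutions) pairs, one per
    genome (assumed layout for this sketch).
    """
    totals = Counter()
    counts = defaultdict(Counter)
    for variant, subs in samples:
        totals[variant] += 1
        counts[variant].update(set(subs))  # count each change once per genome
    return {
        variant: {sub: n / totals[variant] for sub, n in counts[variant].items()}
        for variant in totals
    }


def frequent_substitutions(freqs, threshold=0.7):
    """Keep substitutions found in more than `threshold` of each variant."""
    return {
        variant: sorted(sub for sub, f in per_sub.items() if f > threshold)
        for variant, per_sub in freqs.items()
    }


# Toy records (made up for illustration, not results from this study).
toy_samples = [
    ("BA.1", {"S-D614G", "NS7a-P99S"}),
    ("BA.1", {"S-D614G"}),
    ("BA.2", {"S-D614G"}),
]
toy_freqs = substitution_frequencies(toy_samples)
toy_frequent = frequent_substitutions(toy_freqs)
```

Counting each change once per genome keeps a fraction interpretable as "share of strains with this substitution", which is how the rates in this section are expressed.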
TABLE 1 Amino acid changes detected in structural proteins of SARS-CoV-2 strains circulating in the sixth wave of the pandemic (changes detected in more than 70% of each variant), compared with the amino acid changes shared with the first five waves in Iran.

Columns: Genes; 6th wave; Shared changes with previous waves

As an RNA virus, SARS-CoV-2 has a high mutation rate, resulting in ongoing evolution that could affect replication, infectivity, transmissibility, virulence, and immunogenicity. 16 Increased transmissibility, pathogenicity, and capacity to evade natural or vaccine-induced immunity are all potential outcomes of emerging variants. 17 Analysis of whole-genome sequences is essential for monitoring the virus's potential for increased transmissibility and altered virulence. In this study, we reported the circulation of distinct lineages of

Nonstructural: NSP3-A1892T, NSP3-K38R, NSP3-L1266I, NSP3-S1265del

SARS-CoV-2 variant that emerged in late 2019. BA.1 quickly became the most common variant worldwide after being discovered, and it has since developed into several other lineages. 23 The spike protein of BA.1 has more than 30 mutations that make it less sensitive to vaccine-induced antibody neutralization 25

severe. 34,37,38 In late January 2020, the spike protein's D614G mutation was occasionally observed in both Europe and China. This mutation first spread to Europe and then gradually spread worldwide. It remains the predominant spike substitution globally. 37,39 In our study, the D614G mutation was identified continuously from the second wave to the sixth (with more than 70% frequency), and it was the major substitution. In the last wave, BA.1, BA.2, and mixed lineages (ML) showed this mutation at high frequency.
The N501Y mutation is localized to the RBD and helps in achieving higher binding affinity to host cells, potentially leading to

P681H at the S1/S2 spike cleavage site is thought to increase furin cleavage, potentially affecting viral cell entry. 42 The P681H mutation in Omicron's spike protein is believed to increase spike protein cleavage and contribute to Omicron's rapid transmission. 43 The presence of this substitution in Omicron raised concern, as it may be associated with higher virulence and infectivity. 44 In Iran, P681H was detected in the fourth and sixth waves. In the last wave, this mutation was found in BA.1, BA.2, and mixed lineages at high frequencies.
Omicron has been described as a highly mutated variant with an "unusual constellation of mutations." 45 Free energy perturbation and computational mutagenesis confirmed that the Omicron RBD binds ACE2 2.5 times more strongly than the prototype SARS-CoV-2. Notably, three substitutions, T478K, Q493K, and Q498R, nearly doubled the electrostatic potential (ELE) of the RBD_Omic-ACE2 complex and made a significant contribution to the binding energies. 46 Moreover, the Omicron variant and other VOCs share the T478K and E484A mutations, which have been found to increase resistance to neutralizing antibodies and are associated with immune escape. 47 In this study, the T478K substitution was identified in the last two waves (fifth and sixth). In the sixth wave, this mutation was present in all BA.1, BA.2, and mixed-lineage samples.
The Omicron VOC is also described by the four-point mutation in

It is important to note that nonstructural proteins of SARS-CoV-2 (NSPs) primarily affect the innate immune responses of humans, facilitating immune escape.
NSP3 cleaves nsps via its PLpro domain, including self-cleavage of NSP3. 49 This nonstructural protein has several immune escape mechanisms that make it easier for the virus to replicate, such as hindering ISG15 modification and inhibiting IFN production. 50-52 In our study, NSP3 mutations were detected in BA.1, BA.2, and the mixed groups in the sixth wave, but were not present in previous waves.
The transmembrane proteins nsp3, nsp4, and nsp6 hijack and rearrange the membranes of the host endoplasmic reticulum, subsequently inducing the formation of double membrane vesicles (DMVs). 53 We observed nsp4 substitutions in the fourth, fifth, and sixth waves, whereas nsp6 mutations were identified only in the sixth wave, in BA.1, BA.2, and mixed groups.
NSP5 is the main protease (Mpro) of SARS-CoV-2. In addition to processing the long viral polyproteins, NSP5 cleaves NLRP12 and TAB1. This protein is essential for viral infection. 54

ACKNOWLEDGMENTS
We would like to thank all the patients who kindly participated in our study. We also thank the staff of the NIC located at

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
Data are openly available in a public repository that issues datasets with DOIs.