Resistance‐associated mutations to the anti‐SARS‐CoV‐2 agent nirmatrelvir: Selection not induction

Mutations associated with severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) resistance to antiprotease nirmatrelvir were reported. We aimed to detect them in SARS‐CoV‐2 genomes and quasispecies retrieved in our institute before drug availability in January 2022 and to analyze the impact of mutations on protease (3CLpro) structure. We sought for 38 3CLpro nirmatrelvir resistance mutations in a set of 62 673 SARS‐CoV‐2 genomes obtained in our institute from respiratory samples collected between 2020 and 2023 and for these mutations in SARS‐CoV‐2 quasispecies for 90 samples collected in 2020, using Python. SARS‐CoV‐2 protease with major mutation E166V was generated with Swiss Pdb Viewer and Molegro Molecular Viewer. We detected 22 (58%) of the resistance‐associated mutations in 417 (0.67%) of the genomes analyzed; 325 (78%) of these genomes had been obtained from samples collected in 2020−2021. APOBEC signatures were found for 12/22 mutations. We also detected among viral quasispecies from 90 samples some minority reads harboring any of 15 nirmatrelvir resistance mutations, including E166V. Also, we predicted that E166V has a very limited effect on 3CLpro structure but may prevent drug attachment. Thus, we evidenced that mutations associated with nirmatrelvir resistance pre‐existed in SARS‐CoV‐2 before drug availability. These findings further warrant SARS‐CoV‐2 genomic surveillance and SARS‐CoV‐2 quasispecies characterization.

these mutations in SARS-CoV-2 quasispecies for 90 samples collected in 2020, using Python.SARS-CoV-2 protease with major mutation E166V was generated with Swiss Pdb Viewer and Molegro Molecular Viewer.We detected 22 (58%) of the resistance-associated mutations in 417 (0.67%) of the genomes analyzed; 325 (78%) of these genomes had been obtained from samples collected in 2020−2021.APOBEC signatures were found for 12/22 mutations.We also detected among viral quasispecies from 90 samples some minority reads harboring any of 15 nirmatrelvir resistance mutations, including E166V.Also, we predicted that E166V has a very limited effect on 3CLpro structure but may prevent drug attachment.Thus, we evidenced that mutations associated with nirmatrelvir resistance pre-existed in SARS-CoV-2 before drug availability.These findings further warrant SARS-CoV-2 genomic surveillance and SARS-CoV-2 quasispecies characterization.

| INTRODUCTION
2][3][4][5][6][7][8] This prodrug targets the main protease of this virus (3CLpro), a 3chymotrypsin-like protease that cleaves the viral polyprotein to generate active proteins 9 (https://pubchem.ncbi.nlm.nih.gov/compound/54752934).The question of its role in the appearance and prevalence of resistance-associated mutations needs to be analyzed in light of earlier studies, conducted on the resistance of bacteria to antibiotics by Joshua and Esther Lederberg, 10 and on the resistance of HIV-1 to antivirals. 11,12In these works, the existence of resistant mutants was detected in the absence of anti-infective drug (the antibacterial agent in the case of Escherichia coli and the antiviral agent in the case of HIV-1).As a matter of fact, mutants appeared spontaneously, and the role of the antibiotic or antiviral was in selecting, not inducing the resistance-associated mutations.Since some nirmaltrelvir resistance-associated mutations are known, we thought it would be interesting to identify, in SARS-CoV-2 genomes and quasispecies whose sequences were obtained in our institute, the pre-existence of these mutations before the use and even the marketing of this antiviral, in December 2021 in the United States and in Europe and in January 2022 in France.In addition, we analyzed the impact on the SARS-CoV-2 3CLpro structure of mutation E166V previously identified as the main one associated with nirmatrelvir resistance in vitro. 8

| Detection of nirmaltrelvir resistanceassociated mutations in SARS-CoV-2 genomes
We first screened a set of 62 673 genomes obtained by nextgeneration sequencing (NGS) in our institute from nasopharyngeal swab samples analyzed in the diagnosis setting and collected between late February 2020 and mid-November 2023, including 61 397 previously described 13 and 1276 most recently obtained since then.
NGS had used Illumina technology, the Nextera XT paired-end strategy, and the MiSeq instrument (Illumina Inc.); the Illumina COVID-seq protocol and the NovaSeq 6000 instrument (Illumina Inc.); or the Oxford Nanopore technology and the GridION instrument (Oxford Nanopore Technologies Ltd.), as previously described. 13,14The frequencies of amino acid mutations in consensus genomes was determined using an in-house Python script (https://www.python.org/).This Python script uses as input the output.csvfile of the Nextclade tool (https://clades.nextstrain.org/), 15and allows counting the number of each different nucleotide or amino acid mutation.In addition, it creates a tabulated.tsvfile with a raw for each different mutation while the numbers of mutations for the whole genome set, for the genomes of a same lineage, and for each individual genome are put into columns.This script is freely available at https://github.com/sanschiffre/ihuscripts/tree/main.7][8] At least five of them (E166V, P352L, T304I, S144A/E, and A173V) were associated with high to minimal resistance in mutant virus experiments. 8 addition, we sought, using the pileup function of the Pysam python module (https://www.python.org/)with default parameters (including a quality score > 13 to analyze NGS reads) for these mutations in SARS-CoV-2 quasispecies previously described for 90 nasopharyngeal samples collected between March and September 2020 that were part of a larger previously described set. 16Viral quasispecies had been determined from the mapping of NGS reads on the SARS-CoV-2 Wuhan-Hu-1 isolate complete genome (Gen-Bank Accession no.NC_045512.2) performed with the CLC genomics workbench software v7 (https://digitalinsights.qiagen.com/)and 0.8 as threshold for sequence coverage and 0.9 as threshold for nucleotide similarity.These NGS reads, analyzed here, had been obtained from nasopharyngeal samples directly sequenced, without prior PCR amplification that can affect sequencing accuracy, using the Illumina technology by the Nextera XT paired-end strategy on a MiSeq instrument (Illumina Inc.), as previous described. 16All analyzed mapping files were for assembled genomes with a coverage of ≥99% of genome no.NC_045512.2and a mean number of reads per position > 50.A threshold of 4% had been chosen for a significant intra-sample diversity at a given nucleotide position, as this corresponded to ≥2 reads for a minimum of 50 reads per position.

| Structural analysis of the SARS-CoV-2 protease
The structural models described in the present study are derived from the structure of the SARS-CoV-2 3CLpro accessed through the PDB file 7VH8 (https://www.rcsb.org/). 17The wild-type enzyme was submitted to energy minimization with the Polak-Ribière algorithm 18 of the Hyperchem eight program 19 as previously described. 20Mutation E166V was generated with the Swiss Pdb Viewer (https://spdbv.unil.ch/) 21followed by energy minimization.The electrostatic surface potential of the proteins was calculated with the Molegro Molecular Viewer (http://molexus.io/molegro-molecular-viewer/) as described previously. 20

| Detection of nirmaltrelvir resistanceassociated mutations in SARS-CoV-2 genomes
We detected the presence of 22 (58%) of the resistanceassociated mutations in 417 (0.67%) of the 62 673 consensus genomes analyzed (Table 1).A total of 325 (0.84%) of these genomes had been obtained from the 38 656 samples collected in 2020-2021, before nirmaltrelvir availability in France, while 92 (0.38%) had been obtained from the 24 017 samples collected in 2022-2023, among which all but one belong to an Omicron (BA.1, BA.2, or BA.5) variant.The most frequent mutations were P108S (in 81 genomes), A129V (69), G15S (62), T21I (57), and A191V (34).In addition, T21I, P252L and T304I, the three most frequent mutations reported to be selected in in vitro culture experiments with nirmatrelvir, 8 were detected in 57, 5, and 4 genomes, respectively.Mutation E166V reported to confer the greatest resistance level in in vitro assays 8,22 was not detected but T21I and L50F that compensate the replicative fitness loss due to this mutation 8  Beyond, we searched for the 38 amino acid mutations previously reported to confer nirmatrelvir resistance, [6][7][8] among NGS reads generated from 90 respiratory samples for which viral quasispecies had been determined 16 and genomes were assembled by mapping with a mean NGS depth per position ranging between X50 and X1471.We were able to detect among NGS reads generated some minoritary ones that harbored any of 15 codon changes associated with the 38 amino acid mutations, including the main mutation E166V (Table 2).This indicates that these mutations were present as part of viral quasispecies, although they were highly minority.These 15 mutations were harbored by reads generated from between 1 and 7 samples, and the mean prevalence per sample of these reads ranged between 0.50 and 5.56%.Major mutation E166V was detected from two samples.[25]

| Structural analysis of the SARS-CoV-2 main protease
We then studied the impact on the structure of SARS-CoV-2 3CLpro of mutation E166V as it was the one reported to induce the highest level of in vitro resistance to nirmatrelvir. 8,22As shown in Figure 1 with the superimposition of the secondary structures of the wild-type (gray ribbons) and mutant (blue ribbons) enzymes, we predicted that E166V has a very limited effect on the enzyme structure.Yet, the replacement of a glutamic acid by a valine in the E166V mutant may prevent the drug from attaching to the enzyme.Three molecular mechanisms may explain the resistance conferred by mutation E166V: (i) A kinetic effect due to the increase of the electropositive surface potential in a zone that was previously electronegative (yellow asterisk), which implies that the initial attraction of the drug by the enzyme is slowed down in the mutant; (ii) In contrast with glutamic acid (E166), the side chain of valine in the E166V mutant cannot interact with nirmatrelvir via an hydrogen bond; Finally; (iii) the reorientation of the side chain of valine in the E166V mutant induces a steric clash that pushes the drug out of its binding site.
T A B L E 2 Frequency of nirmatrelvir resistance-associated mutations in SARS-CoV-2 quasispecies from 90 respiratory samples collected in 2020.

Number of samples harboring the amino acid mutation
Frequency of the amino acid mutation in the sample(s)

Mean
Min Max Proportions and numbers of reads Note: The list of SARS-CoV-2 genomes for which quasispecies were analyzed can be found in Supplementary Table S1.
Here we were able to detect, although at low prevalence, mutations associated with nirmatrelvir resistance in SARS-CoV-2 genomes obtained in our center before the approval and use of this drug in France.This is congruent with a previous study that reported the natural presence of these mutations in 0.5% of 13 446 588 SARS-CoV-2 genomes obtained worldwide and deposited as of December 2022 in GISAID. 6In this latter work, two mutations reported to decrease by ≥2 log the inhibition by nirmatrelvir of virus replication or of 3CLpro activity were detected, albeit in <1 per million genomes. 6 a matter of fact, antiviral use can select the genomes that carry resistance-associated mutations among the viral quasispecies, 26,27 which does not occur for genomes replicated elsewhere during periods or in countries where the antiviral is not administered.In the case of HIV-1, considering its rapid replication, its mutation rate, and its genome size, it has been estimated that, on average, mutations will occur in every position in the genome multiple times each day. 28In this context, the probability that mutations that confer resistance to antivirals could spontaneously occur in absence of the drugs is high, providing that they are not deleterious for the structure of the corresponding mutated proteins.Similarly, for SARS-CoV-2, it was estimated that 10 5 −10 8 mutations occur per infected person based on an estimated 10 9 −10 11 viruses at the peak of infection, a mutation rate of 3 × 10 -6 mutations per nucleotide per replicative cycle, and a number of 3−7 replicative cycles per infection. 29This tremendous amount of mutations occurring during a single infection course likely encompasses every possible single nucleotide substitution over the almost 30 000 nucleotide long genome.[32][33][34] The case of the E166V mutation in SARS-CoV-2 3CLpro is particularly interesting.The fact that this mutation is not deleterious for the tridimensional structure of the SARS-CoV-2 main protease according to our present analysis is consistent with the spontaneous occurrence of this mutation among viral quasispecies in absence of nirmatelvrir, according to our sequence data, albeit at a very low prevalence.However, this does not go without a cost.Indeed, this mutation modifies the electrostatic surface potential of the enzyme (Figure 1), which may explain the fitness loss observed in the E166V mutant. 5,8Secondary mutations occurring at distant sites from E166V 5,8 may then compensate this loss of fitness by restoring a surface potential similar to that of the wild type enzyme.Similar mechanisms have been proposed to drive virus evolution through host membrane interactions. 35terestingly, we observed in the present study that nirmatrelvir resistance was frequently associated with base deaminations.These could be signatures of the well-known APOBEC3 cellular enzymes. 36 a matter of fact, these enzymes have been reported to edit the SARS-CoV-2 RNA, [37][38][39] and we have previously reported that APOBEC3 was associated with approximately one quarter of all the mutations harbored by 61 397 SARS-CoV-2 consensus genomes obtained in our laboratory. 13Beyond, although described as involved in a host defense mechanism for the case of their targeting of HIV F I G U R E 1 Structural analysis of the impact of the E166V mutation on protease structure and nirmatrelvir binding.The electrostatic surface potential of the wild-type (upper left) and E166V mutant (upper right) protease is illustrated with the following color code: blue, positive; red, negative; white/gray, neutral.The structural superimposition of the wild-type (gray ribbons) and E166V mutant (blue ribbons) is shown in the central panel.The red frame indicates the binding site of nirmatrelvir, which is described at the atomic scale for the wild type enzyme (lower left).In the case of the E166V mutant, a steric clash due to the side chain of valine may prevent nirmatrelvir binding to the protease (lower right).
genomes, 40 it appears that APOBEC enzymes edit the genomes of a broad range of viruses and organisms. 36Present data suggest that SARS-CoV-2 resistance to antivirals could be linked at least in part to the activity of these cellular enzymes.
Overall, these previous findings reinforce the justification of a SARS-CoV-2 genomic surveillance as well as of SARS-CoV-2 quasispecies characterization to enable the monitoring first, of the prevalence of majority and minority mutations within genes encoding antiviral targets, before and after the use of these drugs in patients, and second, of the transmission and of the virological and clinical impacts of drug resistance mutations.
Frequency of nirmatrelvir resistance-associated mutations in SARS-CoV-2 consensus genomes from 62 673 respiratory samples collected in 2020−2023.COLSON ET AL. | 3 of 7 were found, in 57 and 15 genomes, respectively.Moreover, signatures of the activity of cellular enzymes named apolipoprotein B mRNA editing enzyme, catalytic polypeptide 3 (APOBEC3) were found for 12 of the 22 mutations detected in our data set, which involved C > U (n = 7) or G > A (n = 5) changes.