MALDI‐TOF MS analysis of nasal swabs for the characterization of patients infected with SARS‐CoV‐2 Omicron

With the ongoing mutation of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) leading to various variants, there is an urgent need for new diagnostic methods for SARS‐CoV‐2 infection. The existing nucleic acid test and antigen test suffer from long assay time and low sensitivity, respectively. Matrix‐assisted laser desorption/ionization time‐of‐flight mass spectrometry (MALDI‐TOF MS)‐based nasal swab analysis has been demonstrated as a promising technique for SARS‐CoV‐2 infection screening. However, the applicability of the technique to the different variants of SARS‐CoV‐2 is uncertain. Given the prevalence of the Omicron variant since 2022, we developed a MALDI‐TOF‐based diagnostic method using nasal swab samples to detect infection by this variant. We collected 325 SARS‐CoV‐2‐positive and 221 SARS‐CoV‐2‐negative nasal swab samples and acquired molecular mass fingerprints from the samples by MALDI‐TOF MS. Using a random forest machine learning classification model to analyze the MALDI‐TOF molecular mass fingerprints, an accuracy of 97%, a false negative rate of 0%, and a false positive rate of 7.6% were achieved for the diagnosis of SARS‐CoV‐2 infection. Combining the MALDI‐TOF analysis with top‐down proteomics, we identified four potential protein biomarkers, that is, humanin‐like 4, thymosin beta‐10, thymosin beta‐4, and statherin, in the nasal swab for the diagnosis of coronavirus disease 2019. It was further found that the four protein biomarkers can also differentiate infection by the original SARS‐CoV‐2 strains from infection by the Omicron strains. These results suggest that MALDI‐TOF MS‐based nasal swab analysis is effective for diagnosing SARS‐CoV‐2 infection and shows promising potential for global application and extension to other infectious diseases.


INTRODUCTION
The diagnosis of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is mostly achieved through nucleic acid testing based on reverse transcriptase quantitative polymerase chain reaction (RT-qPCR) technology. While nucleic acid testing has high sensitivity and specificity, it is limited by the need for a central laboratory, resulting in a delay in detection. 1 The failure to obtain a timely COVID-19 diagnosis can cause significant delays in treating emergent patients. Antigen testing has been widely promoted as a fast detection method for COVID-19 using upper respiratory specimens, 1 but it suffers from false negative results. 1,2 To date, further development is still needed to improve the diagnostic efficiency for COVID-19.
It has been shown that immune response monitoring can provide sensitive detection of SARS-CoV-2 infection 3 and reveal information about disease severity and host characteristics. 4 Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has been widely used in the analysis of biomolecules such as peptides in various types of clinical specimens. 5 The technique offers rapid analysis, high sensitivity, and robustness in use. At the onset of the COVID-19 outbreak, Nachtigall et al. proposed a MALDI-TOF-based method for COVID-19 diagnosis by detecting the immune response against the virus in the nasal cavity. 6 Since late 2020, the virus has undergone several mutations, resulting in the emergence of different SARS-CoV-2 variants such as Alpha, 7 Beta, 8 Gamma, 9 Delta 10 and Omicron 11 that circulate strongly in different regions and countries. The suitability of previous MALDI-TOF-based methods for the current epidemic context is uncertain.
Centers for disease control and prevention (CDC) reported that the prevalent SARS-CoV-2 strains in China since 2022 primarily involved the Omicron variant with various evolutionary branches. Currently, the analysis of SARS-CoV-2 mutations is mainly based on next-generation sequencing. 12 However, the method is costly and time consuming. New methods have been developed for the detection and identification of SARS-CoV-2 variants, including PCR-based methods. 13 These methods share the same limitations as RT-qPCR in the detection of the original strains. Under these circumstances, we evaluated in this study the MALDI-TOF-based analysis of nasal swabs for the diagnosis of patients infected with SARS-CoV-2 Omicron. A total of 135 Chinese participants were involved in the study, resulting in 325 SARS-CoV-2-positive nasal swab samples and 221 SARS-CoV-2-negative samples. Based on unsupervised and supervised statistical analysis, 25 MALDI-TOF feature peaks were identified from a training dataset and then validated on an independent test dataset. Using a random forest (RF) classification model, an accuracy of 97%, a false negative rate of 0%, and a false positive rate of 7.6% were achieved for the diagnosis of SARS-CoV-2 Omicron infection. Combining the MALDI-TOF analysis with top-down proteomics, 14 we identified four potential protein biomarkers, that is, humanin-like 4 (HN4), thymosin beta-10 (Tβ10), thymosin beta-4 (Tβ4), and statherin, in the nasal swab for the diagnosis of COVID-19. It was further demonstrated that the four potential protein biomarkers can be used to differentiate patients infected by the original strains of SARS-CoV-2 from patients infected by SARS-CoV-2 Omicron strains.

Sample collection and storage
Positive and negative SARS-CoV-2 nasal swab samples from mildly or asymptomatically infected individuals as well as volunteers without SARS-CoV-2 infection were collected from Beijing Ditan Hospital during March-April of 2022. The study involved 135 volunteers, each of whom underwent three to six rounds of sampling. Nucleic acid testing was applied to the samples, and a Ct value of 40 was used as the cutoff for SARS-CoV-2 positivity. After collection, samples were kept in nasal swab preservation solution, shaken for 5 s, left to rest for 10 s, and then stored at room temperature. MALDI-TOF MS analysis was performed for all samples within 72 h of sampling. Participants were excluded or withdrawn from the study if any of the following conditions applied: age under 6 years, serious comorbidities, or discontinuation of sampling. Samples were excluded if there were no valid nucleic acid results, or if the sample collection or preservation did not meet the aforementioned requirements.

MALDI-TOF MS analysis
The nasal swab was placed in nasal swab preservation solution and mixed thoroughly. The swab was removed, and 1 µL of the solution was deposited on a stainless-steel MALDI target plate and dried at room temperature. One µL of matrix solution was pipetted to cover the sample spot and dried at room temperature for MALDI-TOF MS analysis using a Clin-TOF-II instrument (Bioyong Technologies Inc.). Mass calibration was performed using a standard calibration mixture of peptides and proteins to reach a mass tolerance of 500 ppm. Mass spectra were acquired in the mass range of m/z 2000-20,000. Fifty laser bombardment positions were selected for each sample spot, and each position was bombarded 10 times. The final superimposed spectrum was saved.

MALDI-TOF MS data processing and analysis
Raw MALDI-TOF MS data were processed with the R package MALDIquant, 15 with operations including square-root transformation, Savitzky-Golay smoothing with a half window size of 50, and sensitive nonlinear iterative peak-clipping (SNIP) baseline correction. Then, peak detection was performed using the median absolute deviation (MAD) method with a half window size of 70 and a signal-to-noise threshold of 4. The minimum frequency (the minimal frequency of features detected across all the samples) was set to 0.25. Peaks were then binned by binPeaks with a tolerance of 0.005 across all samples to obtain the features. The features, obtained as a matrix table, were then processed by log2 transformation and quantile normalization using MetaboAnalyst 16 (McGill University, Montreal, Canada, https://www.metaboanalyst.ca/). The processed MALDI-MS feature matrix is shown in Supporting Information S1.
Four machine learning algorithms, that is, partial least squares discriminant analysis (PLS-DA), logistic regression (LR), support vector machine (SVM), and RF, were used to generate the classification models. p-Values were calculated by the Mann-Whitney test between the COVID-19 cohort and the non-COVID-19 cohort. The diagnostic efficiency of the models was evaluated by the receiver operating characteristic (ROC) curve, accuracy, sensitivity, and specificity. Accuracy was defined as the ratio of correctly classified samples to the total number of samples, (TP + TN)/(P + N). Sensitivity is the ratio of correctly classified positive samples to the total number of positive samples (TP/P). Specificity is the ratio of correctly classified negative samples to the total number of negative samples (TN/N). TP stands for true positive, TN for true negative, P for positive, and N for negative. All the statistical analysis and machine learning were performed using MetaboAnalyst 16 (McGill University; https://www.metaboanalyst.ca/).
MS analysis was performed on an Orbitrap Exploris 480 (Thermo Fisher Scientific) with a Nanospray Flex ion source (Thermo Fisher Scientific). The ion transfer capillary temperature was set to 300 °C. The spray voltage was set to 1.9 kV. The primary MS acquisition range was from m/z 600 to 3000 with a mass resolution of 240k at m/z 200. The automatic gain control (AGC) target was set to 3 × 10^6 and the maximum injection time was set to 250 ms. The mass resolution of the tandem MS/MS was 120k at m/z 200. In data-dependent acquisition (DDA), the top 20 precursors were selected for MS/MS analysis. The AGC target value was set to 5 × 10^5. The maximum ion injection time was set to 250 ms. The precursor ion isolation window was set to m/z 1.4. The dynamic exclusion was set to 30.0 s, and the higher-energy C-trap dissociation (HCD) stepped normalized collision energy was set to 30%.
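As a worked illustration of the metric definitions above, the following Python sketch computes accuracy, sensitivity, and specificity from confusion-matrix counts. The example counts are an assumption: they are back-calculated from the test set sizes (98 positive, 66 negative) and the RF rates reported in this paper (false negative rate 0%, false positive rate 7.6%), not taken from a published confusion matrix. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute the metrics defined in the text from confusion-matrix counts."""
    p = tp + fn  # total positive samples (P)
    n = tn + fp  # total negative samples (N)
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": tp / p,
        "specificity": tn / n,
        "false_negative_rate": fn / p,
        "false_positive_rate": fp / n,
    }


# Assumed counts, inferred from the reported rates (see lead-in).
metrics = diagnostic_metrics(tp=98, fn=0, tn=61, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the accuracy is 159/164 and the false positive rate is 5/66, which round to the reported 97% and 7.6%.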

Top-down proteomics data analysis
ProteinGoggle 2.0 17 was used to process the top-down proteomic data. A database of human protein sequences was selected, with alkylation on cysteine as a fixed modification and with the following variable modifications: methylation, dimethylation, trimethylation, and acetylation on lysine; methylation and dimethylation on arginine; phosphorylation on serine, threonine, and tyrosine; and oxidation on methionine. ProteinGoggle was used to build the protein analysis database and to analyze the MS data. Based on the library search results, identifications with a post-translational modification (PTM) score ≥ 1, a proteoform score ≥ 1, and ≥ 6 matched fragment ions per proteoform were retained in the identification list. The false discovery rate (FDR) threshold was 1%. The data were de-duplicated according to the protein accession (accession), protein sequence (sequence), and modification (modifications) in the total protein identification list to obtain the identification list of differently modified proteins (Sheet 1 of Supporting Information S2). The data were also de-duplicated according to the protein accession (accession) alone to obtain the identification list of de-duplicated proteins (Sheet 2 of Supporting Information S2).
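The filtering and de-duplication steps above can be sketched in Python as follows. This is an illustration only, not ProteinGoggle's output format: the dictionary keys (`ptm_score`, `proteoform_score`, `matched_fragments`, `accession`, `sequence`, `modifications`) and the example rows are hypothetical.

```python
def filter_identifications(rows, min_ptm=1.0, min_proteoform=1.0, min_fragments=6):
    """Keep identifications that pass all three score thresholds."""
    return [
        r for r in rows
        if r["ptm_score"] >= min_ptm
        and r["proteoform_score"] >= min_proteoform
        and r["matched_fragments"] >= min_fragments
    ]


def deduplicate(rows, keys):
    """Keep the first row for each distinct combination of the given keys."""
    seen = set()
    unique = []
    for r in rows:
        signature = tuple(r[k] for k in keys)
        if signature not in seen:
            seen.add(signature)
            unique.append(r)
    return unique


def row(acc, seq, mods, ptm, pf, frags):
    return {"accession": acc, "sequence": seq, "modifications": mods,
            "ptm_score": ptm, "proteoform_score": pf, "matched_fragments": frags}


# Hypothetical identification list.
rows = [
    row("A", "S1", "Acetyl", 2.0, 3.0, 8),
    row("A", "S1", "Acetyl", 1.5, 2.0, 7),   # duplicate proteoform
    row("A", "S1", "", 1.0, 1.0, 6),         # same protein, different modification
    row("B", "S2", "", 0.5, 2.0, 9),         # fails PTM score
    row("C", "S3", "", 1.0, 1.0, 5),         # fails fragment count
    row("B", "S2", "Phospho", 1.2, 1.1, 9),
]
passed = filter_identifications(rows)
proteoforms = deduplicate(passed, ["accession", "sequence", "modifications"])
proteins = deduplicate(passed, ["accession"])
print(len(passed), len(proteoforms), len(proteins))
```

The two `deduplicate` calls correspond to the two lists described above: one keyed on accession, sequence, and modifications, and one keyed on accession alone.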

Annotation of MALDI-TOF MS peaks
Proteins identified by top-down proteomics were used to annotate the MALDI-TOF MS peaks. Only singly charged ions were considered for the MALDI-TOF MS peaks, with a mass tolerance of 1000 ppm.
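The matching rule above can be sketched in Python. The sketch assumes that "single charge" means a singly protonated ion, so the theoretical m/z is the neutral mass plus one proton mass; the paper does not state the adduct explicitly. The protein names and masses in the example are hypothetical placeholders, not values from the study.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def ppm_error(observed_mz, theoretical_mz):
    """Signed relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate_peaks(peak_mzs, protein_masses, tolerance_ppm=1000.0):
    """Map each peak m/z to the proteins whose [M+H]+ lies within tolerance."""
    annotations = {}
    for mz in peak_mzs:
        for name, neutral_mass in protein_masses.items():
            theoretical = neutral_mass + PROTON_MASS  # assumed singly protonated
            if abs(ppm_error(mz, theoretical)) <= tolerance_ppm:
                annotations.setdefault(mz, []).append(name)
    return annotations


# Hypothetical neutral masses (Da) and hypothetical peak list.
protein_masses = {"protein_A": 4962.0, "protein_B": 5300.0}
annotations = annotate_peaks([4963.0, 5310.0], protein_masses)
print(annotations)
```

At m/z 5000, a tolerance of 1000 ppm corresponds to a window of about ±5 Da, so several proteoforms may fall within one peak; the sketch therefore returns a list of candidates per peak.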

Ethical statement
The

RESULTS
The study aims to validate the effectiveness of MALDI-TOF analysis of nasal swab samples for the diagnosis of SARS-CoV-2 Omicron infection. Nasal swab samples were analyzed by MALDI-TOF MS to collect molecular mass fingerprints. Machine learning was then applied to extract feature peaks and to build a classifier, 5 which can diagnose SARS-CoV-2 Omicron infection (Figure 1). In practice, after the nasal swab samples are analyzed by MALDI-TOF, the obtained MS data can be input into the pre-established classifier to obtain the diagnostic results. We collected 325 SARS-CoV-2-positive nasal swab samples (P) and 221 SARS-CoV-2-negative nasal swab samples (N) at Beijing Ditan Hospital during March-April of 2022. A nucleic acid assay was applied to the nasal swab samples, and a Ct value of 40 was used as the cutoff for SARS-CoV-2 positivity. All the patients were infected by SARS-CoV-2 Omicron strains. The collected nasal swab samples were randomly divided into a training set and a test set. The training set consisted of 227 positive and 155 negative samples, while the test set included 98 positive and 66 negative samples (Figure 2A). The nasal swab samples were subjected to MALDI-TOF MS analysis. The resulting mass spectra were processed by MALDIquant for feature extraction and analyzed by MetaboAnalyst for multivariate statistical analysis and machine learning. In total, 236 peaks were obtained after data processing of all the MS raw files (Supporting Information S1).
In the mass range of m/z 2000-20,000, we observed a similar number of features in group P and group N (Figure S1). The spectral processing parameters were the same for both groups, using the MAD method with a half window size of 70 and a signal-to-noise ratio of 4, as detailed in Section 2. We focused on the mass range between m/z 2000 and 20,000 to detect changes in protein expression due to the human immune response to SARS-CoV-2 infection. This m/z range can capture a wide array of biomolecular changes, including those in proteins and polypeptides. For proteins with m/z larger than 20,000, the detection sensitivity of MALDI-TOF MS is low, and we could hardly observe any usable signal. Below m/z 2000, the signal is mainly from matrices, metabolites, lipids, and small peptides. Combining the information from metabolites, lipids, and small peptides may offer a more comprehensive analysis of the nasal swab samples. However, this would require different MALDI-TOF MS methods, such as the use of nanomaterials to assist the ionization of small molecules, 19 which can bring additional problems such as the consistency between experiments.
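The random allocation described above can be sketched in Python as a per-class shuffle with the reported training set sizes. This is an illustration under stated assumptions: the sample identifiers and the seed are hypothetical, and the paper does not state whether allocation was done per sample or per participant, so the sketch simply splits per sample.

```python
import random


def split_group(sample_ids, n_train, rng):
    """Shuffle one class and cut it into training and test subsets."""
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


rng = random.Random(2022)  # arbitrary seed, for reproducibility only

# Hypothetical identifiers for the 325 positive and 221 negative samples.
positives = [f"P{i:03d}" for i in range(325)]
negatives = [f"N{i:03d}" for i in range(221)]

# Training sizes as reported in the text: 227 positive and 155 negative.
train_pos, test_pos = split_group(positives, 227, rng)
train_neg, test_neg = split_group(negatives, 155, rng)

print(len(train_pos), len(train_neg), len(test_pos), len(test_neg))
```

Splitting each class separately keeps the positive-to-negative ratio nearly identical in the training and test sets.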
After data collection, machine learning was used to identify distinctive features between SARS-CoV-2-positive and -negative samples. The unsupervised principal component analysis did not reveal a significant difference between the two groups in the training set, indicating that there were only subtle differences between the two groups (Figure 2B). Then, distinctive features between the groups were extracted using PLS-DA (variable importance in projection [VIP] > 1.5) and volcano plot analysis (fold change [FC] > 1.5, p-value < .05) (Figure 2C,D; Figure S2), yielding 25 distinctive features from the training set. The Mann-Whitney U test, a non-parametric method for comparing two independent samples, was applied to each of the selected feature peaks in the training dataset to assess their difference between the positive group and the negative group. As shown in Figure 3, all the features differed significantly between the two groups, supporting their use in distinguishing between positive and negative samples.
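For readers unfamiliar with the Mann-Whitney U test, the following Python sketch shows one common way to compute it: average ranks for ties and a two-sided p-value from the normal approximation with tie correction and no continuity correction. This is not the implementation used in the study (the analysis was run in MetaboAnalyst), and the function names and example values are hypothetical.

```python
import math


def rank_with_ties(values):
    """Return 1-based average ranks and the size of each tie group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:  # all values identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical intensities of one feature in the two groups.
u, p = mann_whitney_u([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(u, p)
```

The normal approximation is reasonable for group sizes like those in the training set; for very small groups an exact test would be preferable.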
We further evaluated the potential of the 25 feature peaks to distinguish SARS-CoV-2-positive and -negative samples on the test dataset. In unsupervised cluster analysis using the 25 feature peaks, 89% of the test samples were correctly classified, demonstrating the potential of the features for the diagnosis of SARS-CoV-2 infection (Figure S3a). This was further confirmed by heatmap analysis of the 25 feature peaks on the test dataset (Figure S3b). Then, four machine learning methods, that is, PLS-DA, LR, SVM, and RF, were used to generate classification models from the training data with the 25 feature peaks, and the models were tested on the test dataset. PLS-DA is widely used in chemometrics and bioinformatics for high-dimensional data, reducing dimensionality while maximizing the differences between classes. LR is well suited to binary classification tasks such as medical diagnostics, since it predicts class probabilities rather than reducing data dimensions. SVM performs well in high-dimensional spaces, including text and image classification, by separating categories with an optimally determined boundary, and can handle more complex class boundaries than LR and PLS-DA. RF integrates multiple decision trees, which improves model stability and accuracy, especially for large datasets with high-dimensional features, and reduces the risk of overfitting compared to LR or SVM. 20 Because these methods have distinct advantages in data processing and problem solving, all four were used to build classification models for the diagnosis of SARS-CoV-2 infection. The area under the curve (AUC) values for all four machine learning methods were around 0.9, indicating good diagnostic performance (Figure 4A-D). The confusion matrices showed low numbers of false positive and false negative samples (Figure 4E-H). The AUC, accuracy, specificity, and sensitivity for the four machine learning methods can be found in Table S1. RF exhibited the highest accuracy of 97%, with a false negative rate of 0% and a false positive rate of 7.6%. Overall, RF was determined to be the optimal method for SARS-CoV-2 infection detection using nasal swab samples and MALDI-TOF analysis.
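The AUC reported above can be read as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative sample. A minimal Python sketch of that pairwise definition follows; the scores are hypothetical and the function is an illustration, not the routine used by MetaboAnalyst.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (higher means more likely positive).
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(round(auc, 3))
```

In this toy example 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.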
To identify the biomarkers correlated with the 25 MALDI-TOF feature peaks, top-down proteomics was employed to annotate the MALDI-TOF peaks (Supporting Information S2). Four proteins, that is, humanin-like 4 (HN4), thymosin beta-10 (Tβ10), thymosin beta-4 (Tβ4), and statherin, were identified from four of the 25 features, as shown in Table S2. The remaining features could not be matched to any of the proteins or protein fragments identified by top-down proteomics.
We further compared our data to those reported by Nachtigall et al., 6 which were from patients infected by the original strains of SARS-CoV-2 in 2020. A total of 211 positive samples from three labs were included in the study by Nachtigall et al., and we chose the 44 positive samples from lab 2 and lab 3 to compare with the 325 SARS-CoV-2 Omicron strain-infected samples in our study. The spectral processing parameters for the original strains were consistent with the aforementioned ones for the Omicron strains. A Monte Carlo cross-validation analysis was conducted to differentiate between samples infected with the original strains and those infected with the Omicron strains using the four identified protein biomarkers through RF. Two-thirds of the samples were used to evaluate the feature importance, and the remaining one-third of the samples were used for validation. The procedure was repeated multiple times to calculate the performance and confidence intervals of each model. The efficacy of these protein biomarkers in distinguishing between the SARS-CoV-2 original strains and the SARS-CoV-2 Omicron strains was evident from the ROC curve (AUC = 0.979) and the confusion matrix analysis results (Figure S4). Our findings demonstrate the potential of the four protein biomarkers in differentiating between the original COVID-19 strains from 2020 and the variant strains from 2022, highlighting their role in strain-specific identification and also the difference in human immune response against the different strains.
FIGURE 4 Receiver operating characteristic (ROC) curves and confusion matrices of the prediction results on the test samples for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron infection diagnosis. Four machine learning methods were used: random forest (RF), partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), and logistic regression (LR). The classifiers were built on the training dataset with the 25 distinctive features.
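The Monte Carlo cross-validation scheme described above (repeated random two-thirds/one-third splits) can be sketched in Python. The number of repeats, the seed, the percentile-based interval, and the `evaluate` callback are all assumptions for illustration; the paper states only that the procedure was repeated multiple times. In a real analysis `evaluate` would train a classifier on the training indices and return a metric such as AUC on the held-out indices.

```python
import random
from statistics import mean


def monte_carlo_cv(n_samples, evaluate, n_repeats=100, train_fraction=2 / 3, seed=0):
    """Repeat random train/test splits and summarize the returned scores."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_fraction)
    scores = []
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        scores.append(evaluate(indices[:n_train], indices[n_train:]))
    scores.sort()
    # Crude 95% interval taken from the empirical distribution of scores.
    lower = scores[int(0.025 * (n_repeats - 1))]
    upper = scores[int(0.975 * (n_repeats - 1))]
    return mean(scores), (lower, upper)


def held_out_fraction(train_idx, test_idx):
    """Placeholder metric: the share of samples held out for validation."""
    return len(test_idx) / (len(train_idx) + len(test_idx))


# 44 original-strain samples plus 325 Omicron samples, as in the text.
mean_score, (ci_low, ci_high) = monte_carlo_cv(44 + 325, held_out_fraction)
print(mean_score, ci_low, ci_high)
```

Unlike k-fold cross-validation, each repeat draws a fresh random split, so a sample may appear in the validation subset in several repeats or in none.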

DISCUSSION
Epidemic infectious diseases, such as the 2002 outbreak of severe acute respiratory syndrome (SARS) and the 2019 emergence of SARS-CoV-2, have caused significant damage to both human life and property. These diseases are characterized by rapid viral mutation, making it challenging to sustain investment in the development of prevention and treatment methods. To date, a number of methods have been developed for the diagnosis of SARS-CoV-2 infection. RT-PCR is considered the gold standard for SARS-CoV-2 infection detection, offering highly sensitive and specific detection, but it is time consuming and requires specialized equipment. Serological tests, such as ELISA, are effective in detecting past infections, but their utility is limited in identifying current infections. 21 Antigen testing offers a quick and cost-effective screening method, but with lower sensitivity than RT-PCR. 22 MALDI-TOF MS is less common but promising for SARS-CoV-2 infection detection: it provides rapid and broad-spectrum analysis and aids in understanding the immune response against the virus infection.
In 2020, Nachtigall et al. introduced a pioneering approach to diagnose COVID-19 by MALDI-TOF. 6 A total of 362 nasal mucus samples collected from the South American region were analyzed, with a focus on immune changes in the nasal cavity following COVID-19 infection. 6 Given the persistent and dynamic mutations of SARS-CoV-2, 23 disease severity and immune responses have correspondingly changed. As a result, the MALDI-TOF-based COVID-19 detection method may no longer be universally applicable. To assess the suitability of MALDI-TOF technology in response to changes in the epidemiological context, we analyzed nasal swab samples collected in China in 2022 and compared our findings to those of Nachtigall et al. 6 Taking into account the variance introduced by virus mutations, the capability of MALDI-TOF technology to diagnose COVID-19 caused by different SARS-CoV-2 variants is demonstrated. With the easing of COVID-19 policies, the gathering of large-scale COVID-19 samples has gained increased significance. Comparing MALDI-TOF-based diagnostic strategies developed for different COVID-19 strains and various regional populations offers a new perspective.
Drawing from this concept, our study mainly focuses on the immune response caused by the SARS-CoV-2 Omicron strains in the nasal cavity. We compared the 25 features identified in our study with those of Nachtigall et al. 6 Five common features (m/z 3334, 3354, 4167, 4963, and 5219) were discovered, suggesting that different SARS-CoV-2 strains may induce similar immune changes in humans. In contrast, unique features were also observed in each study, suggesting potential alterations linked to COVID-19 strains. 23 Among the 25 features identified in our study, four were further annotated by top-down proteomic analysis, including HN4 (m/z 3334), Tβ10 (m/z 4936), Tβ4 (m/z 4963), and statherin (m/z 5299). Notably, the features at m/z 3334 and m/z 4963 were also identified by Nachtigall et al. 6 We further employed the four features to differentiate patients infected by the original strains from those infected by the Omicron strains, demonstrating the potential of the four protein biomarkers for strain-level resolution of SARS-CoV-2 infection, which also highlights the difference in immune response against different SARS-CoV-2 strains. It should be noted that the main aim of the study is to demonstrate the utility of MALDI-TOF-based nasal swab analysis in the detection of SARS-CoV-2 Omicron infection, rather than to identify biomarkers specific to the Omicron strains, because of the difficulties in collecting nasal swab samples from patients infected by different strains of SARS-CoV-2.
Top-down proteomics, combined with MALDI-TOF MS, identified four protein biomarkers associated with infection by the SARS-CoV-2 Omicron strains. Top-down proteomics, which involves the analysis of intact proteins, enhances accuracy in proteoform analysis by preserving post-translational modifications and identifying protein fragments that can be missed by typical bottom-up methods relying on peptide-based analysis. 24 Top-down proteomics is more suitable than bottom-up proteomics for annotating MALDI-TOF MS features, since both top-down proteomics and MALDI-TOF MS directly observe the molecular weights of the proteins or protein fragments present in samples, whereas it is difficult to predict the exact proteoforms from bottom-up proteomics. However, due to technical difficulties, such as the separation of intact proteins, the ionization of large proteins, the fragmentation of proteins during tandem MS analysis, and the detection of ions with large m/z, top-down proteomics identifies a limited number of proteins. Therefore, only four features were annotated as protein biomarkers.
Tβ4, a small protein consisting of 43 amino acid residues with an acetylated N-terminus, is abundantly present across human tissues. Tβ4 plays a vital role in controlling actin polymerization and acts as an anti-inflammatory and antioxidant agent, further promoting wound recovery and angiogenesis. Yu et al. have demonstrated the protective effect of recombinant human thymosin beta-4 (rhTβ4) against coronavirus infections in mice. 25 Tβ10 is a member of the thymosin protein family, which plays crucial roles in cell morphology, motility, and cytoskeletal organization. Tβ10 has been shown to be involved in the malignant processes of many cancers. 26 Statherin, a molecule initially discovered in parotid and submandibular-sublingual secretions and expressed in submucosal glands, plays a crucial role in oral health. It contributes to enamel protection, maintains mineral homeostasis, provides lubrication, and aids in early microbial colonization. 27 In addition to its role in oral health, statherin is present in nasal secretions, where it may enhance the overall antimicrobial properties in conjunction with proteins like secretory phospholipase A2 and defensins. 28 Despite its significance, the existing literature offers limited insight into the variations of statherin in respiratory tract pathologies. 29 Humanin (HN) and its derivatives, including HN4, belong to a family of proteins known for their anti-apoptotic properties and regulatory role in cellular metabolism. HN, a small protein consisting of 24 amino acid residues, is recognized for its protective capacity against neuronal damage associated with Alzheimer's disease. While previous studies primarily focused on Alzheimer's disease, recent findings suggest that HN and its derivatives can suppress cell apoptosis across a variety of diseases and organ systems, implying a more extensive protective effect of the proteins. 30 The relationship between HN's anti-apoptotic capabilities and viral infections, as well as the underlying mechanisms, warrants additional investigation.
One limitation of the current study is that influenza infection or COVID-19 vaccination may also induce similar immune responses. 31 Therefore, there is a need to incorporate a more diverse range of samples in the control group. Additionally, the rapid evolution of SARS-CoV-2 necessitates continuous updating of markers to maintain diagnostic accuracy. Other limitations of the MALDI-TOF-based COVID-19 diagnostic model include the need for extensive validation of the identified markers in larger cohorts to ensure reliability across diverse clinical settings. Furthermore, assessing the practicality and cost of integrating this model into existing diagnostic workflows is crucial for routine clinical use. These factors underscore the importance of ongoing research and adaptation in response to the dynamic nature of viral pandemics.
In summary, we demonstrate that MALDI-TOF analysis of nasal swab samples performs well in diagnosing infection by SARS-CoV-2 Omicron strains, as it did in the previous study with the original strains of SARS-CoV-2. The relatively stable diagnostic ability is a promising feature of the technique for its potential use in global diagnostics. The discovery of four potential biomarkers, that is, HN4, Tβ10, Tβ4, and statherin, can provide valuable information for understanding the human immune response against the virus and for developing novel diagnostic methods based on the biomarkers. More broadly, MALDI-TOF-based molecular mass fingerprinting can be further applied to the diagnosis of various infectious diseases.

AUTHOR CONTRIBUTIONS
Sample collection and mass spectrometry analysis were performed by Rui Song and Xiaohua Hao under the supervision of Xiaoyou Chen. Dandan Li analyzed the data and wrote the draft of the manuscript. Qian Lyu assisted in the mass spectrometry analysis. Qingwei Ma was involved in the design of the study and supervised the mass spectrometry analysis. Liang Qiao designed the study, supervised the data analysis, and finalized the manuscript. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
This work was supported by the National Natural Science Foundation of China (NSFC, 22074022, 22374031).

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

FIGURE 1 Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF)-based diagnosis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron infection.
FIGURE 2 (A) Classification of samples. (B) Score plot of principal component analysis (PCA) for the matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) features of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) positive and negative samples in the training set. (C) Score plot of partial least squares discriminant analysis (PLS-DA) for the MALDI-TOF features of the SARS-CoV-2 positive and negative samples in the training set. (D) Volcano plot of the MALDI-TOF features of the SARS-CoV-2 positive and negative samples in the training set. Fold change: N/P. (E) Top 15 differential features between the SARS-CoV-2 positive and negative samples by variable importance in projection (VIP) score of PLS-DA. (F) Heat map of the 25 distinctive MALDI-TOF features in the training samples.
Of the 25 distinctive features obtained from the training set under the combined criteria of PLS-DA and volcano plot (Figure 2C,D; Figure S2), the 15 peaks with the largest VIP are shown in Figure 2E. Figure 2F shows the heatmap of the 25 features between the SARS-CoV-2-positive and -negative samples in the training set. Features such as m/z 3334, 2419, 3219, and 4936 were downregulated in the SARS-CoV-2-positive samples, while features such as m/z 4189, 4149, and 4167 were upregulated in the SARS-CoV-2-positive samples.
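The combined selection criteria can be expressed as a simple filter, sketched here in Python. Two points are assumptions rather than statements from the paper: that "combined" means a feature must pass both the PLS-DA and the volcano plot criteria, and that the fold-change threshold is applied in both directions (above 1.5 or below 1/1.5), which would be consistent with both up- and downregulated features being reported. The feature labels and statistics are hypothetical.

```python
def select_features(stats, vip_min=1.5, fc_min=1.5, p_max=0.05):
    """Return the features that pass the VIP, fold-change, and p-value cutoffs."""
    selected = []
    for feature, s in stats.items():
        fc = s["fold_change"]
        # Assumed two-sided fold-change criterion (up- or downregulated).
        changed = fc > fc_min or fc < 1 / fc_min
        if s["vip"] > vip_min and changed and s["p_value"] < p_max:
            selected.append(feature)
    return sorted(selected)


# Hypothetical per-feature statistics.
stats = {
    "feature_a": {"vip": 2.1, "fold_change": 0.5, "p_value": 0.001},
    "feature_b": {"vip": 1.8, "fold_change": 1.2, "p_value": 0.0001},  # small change
    "feature_c": {"vip": 1.2, "fold_change": 2.0, "p_value": 0.01},    # low VIP
    "feature_d": {"vip": 1.6, "fold_change": 1.9, "p_value": 0.2},     # not significant
    "feature_e": {"vip": 1.7, "fold_change": 1.6, "p_value": 0.03},
}
selected = select_features(stats)
print(selected)
```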

FIGURE 3 (A-Y) The relative intensity of the 25 distinctive features between the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-positive and SARS-CoV-2-negative samples in the training set. Error bars show standard deviation among samples. p-Value was calculated by the Mann-Whitney test. Asterisks represent statistically significant differences between groups (**p-value < .01, ****p-value < .0001). N: SARS-CoV-2-negative sample; P: SARS-CoV-2-positive sample.