Advancing Glycan Analysis: A New Platform Integrating SERS, Boronic Acids, and Machine Learning Algorithms

Glycans are the most abundant fundamental biomolecules, but profiling glycans is challenging due to their structural complexity. To address this, a novel glycan detection platform is developed by integrating surface‐enhanced Raman spectroscopy (SERS), boronic acid receptors, and machine learning tools. Boronic acid receptors bind with glycans, and the reaction influences molecular vibrations, leading to unique Raman spectral patterns. Unlike prior studies that focus on designing a boronic acid with high binding selectivity toward a target glycan, this sensor is designed to analyze overall changes in spectral patterns using machine learning algorithms. For proof‐of‐concept, 4‐mercaptophenylboronic acid (4MBA) and 1‐thianthrenylboronic acid (1TBA) are used for glycan detection. The sensing platform successfully recognizes the stereoisomers and the structural isomers with different glycosidic linkages. The collective spectra that combine the spectra from both boronic acid receptors improve the performance of the support vector machine model due to the enrichment of the structural information of glycans. In addition, this new sensor could quantify the mole fraction of sialic acid in lactose background using the machine learning regression technique. This low‐cost, rapid, and highly accessible sensor will provide the scientific community with another option for frequent comparative glycan screening in standard biological laboratories.


DOI: 10.1002/adsr.202300052

Introduction
Glycans are the most abundant and diverse fundamental biomolecules in living organisms. Almost all cells in nature are covered with dense layers of glycans that mediate a wide range of biochemical reactions. [1] Because glycosylation is a post-translational modification (PTM), glycan profiles can be influenced by various factors, such as environmental conditions, temperature, cell growth cycles, and cell health. Therefore, frequent monitoring of glycosylation changes is necessary to explore cell activities. However, glycan analysis remains challenging due to the complex isomeric forms, glycosidic linkages, and branched structures of glycans. [4] Due to these challenges, an accessible glycan analysis tool that allows users to frequently monitor changes in glycosylation is highly desirable. [2,4]
Herein, we developed a new glycan sensing platform that integrates surface-enhanced Raman spectroscopy (SERS), boronic acid receptors, and machine learning techniques. Raman spectroscopy is a potential tool for glycan analysis because it offers unique fingerprint spectra of molecules. However, its inherently weak signals limit its applications. To overcome this barrier, a Raman enhancement technique, called SERS, was used to increase the Raman signals of molecules that are adsorbed on metallic nanoparticles. [5][8][9] To further improve the accuracy of glycan detection, the SERS substrate was functionalized with boronic acids. [12][13] By designing the intermolecular distances between boronic acid moieties, specially designed boronic acids could selectively bind to different types of glycans. [14] Researchers have used different types of boronic acids to detect sialic acids in cancer cells, [11,15] glycoproteins in serum samples, [16,17] and glycans on cell surfaces. [18,19][22] However, previous studies have mainly focused on boronic acids with high binding selectivity toward a specific targeted sugar. For instance, Kong et al. used two different boronic acids to simultaneously reach two pairs of hydroxyl groups on a single glucose molecule. [21] One of the boronic acids contains a metal carbonyl reporter that produces a distinct Raman peak in a SERS-silent region for biological samples. By monitoring this unique Raman peak, Kong et al. were able to distinguish glucose from galactose and fructose. In another study, Sharma et al. designed a special bisboronic acid with two boronic acid moieties in a single molecule for glucose binding. [22] Given the complexity of glycans, designing numerous receptors with the desired properties can be unrealistic. To detect more complex glycans, a different approach is needed.
A new detection principle was adopted to distinguish different glycan samples without requiring highly selective boronic acid receptors. Commercially available boronic acids were used in this study, allowing for greater convenience and cost-effectiveness. Boronic acid analogs are typically attached to aryl groups that offer distinct Raman vibrations. [23] The reaction between glycans and boronic acids causes changes in molecular vibrations. We hypothesized that various glycosidic linkages and isomeric structures of glycans could influence the vibrations of aryl groups differently, leading to observable spectral shifts. Moreover, in a complex glycan sample, different glycan molecules would competitively bind to the same boronic acid receptor. The contributions of Raman signals from different glycans would vary based on their concentrations, eventually leading to unique Raman spectra. We hypothesized that observing these Raman spectral changes could open a new path for comparative glycan analysis.
Because glycan-boronic acid reactions simultaneously influence multiple vibration modes, the spectral alterations appear over a broad frequency range. We could not rely on specific peaks for spectral analysis; therefore, machine learning algorithms were used to track overall changes in Raman spectra. Machine learning-based methods have been used to solve complicated qualitative and quantitative questions. [24,25] The same methods can be applied to Raman spectra since they can differentiate the spectra based on subtle changes and reveal the underlying data patterns from the whole spectra instead of individual peaks. [26] We integrated the discriminant analysis of principal components (DAPC) with various machine learning classifiers, including support vector machine (SVM), to classify SERS spectra collected from beverages. [9] The same machine learning strategy was applied to analyze the complex SERS spectra from various glycan samples.
For proof-of-concept, we studied the interactions of two boronic acids, 4-mercaptophenylboronic acid (4MBA) and 1-thianthrenylboronic acid (1TBA), with a group of monosaccharides and disaccharides (the molecular structures are shown in Figure S1, Supporting Information). Both boronic acids have similar selectivity trends toward monosaccharides, but 1TBA has one more aromatic ring, which may lead to additional spectral changes. [27,28] The reaction between boronic acids and glycans could trigger distinctive spectral changes, and the experimental spectra agreed with the simulation data calculated by density functional theory (DFT). The glycan-boronic acid reactions influenced the charge distributions of the aromatic structures on the boronic acids; thus, 1TBA and 4MBA receptors offered different types of glycan binding spectra. By combining the SERS spectra from 1TBA and 4MBA receptors, the "collective spectra" could enrich the available structural information. The results show that such collective spectra could improve classification accuracy. To evaluate its limits, the sensor was used to differentiate stereoisomers (glucose, mannose, and galactose), structural isomers with different types of glycosidic linkages (maltose vs. isomaltose), and the presence of glycosidic linkages (lactose vs. a glucose and galactose mixture). The classification accuracy of the selected glycans could reach 99.6%. To demonstrate the application, a classification of oligosaccharides extracted from different milk products was performed. In addition to qualitative analysis, the mole fraction of sialic acids in a lactose background was quantified using a machine learning regression technique. The quantification reached a coefficient of determination (R²) of 0.998 and a normalized mean square error (NMSE) of 0.00195.

Boronic Acid Coated SERS Substrate
Glass fiber filter papers coated with a dense layer of silver nanoparticles were used as SERS substrates. This nanoplasmonic substrate, called nanopaper, was prepared by the silver mirror reaction, offering a significant Raman signal enhancement. [6] The SERS spectra of glucose, galactose, mannose, and sialic acid on the nanopaper without boronic acid surface modification are shown in Figure S3 (Supporting Information). Glucose, galactose, and mannose are stereoisomers; the spectral difference among these isomers could be clearly observed. However, the intensities of the Raman spectra are still relatively low on the nanopaper. SERS is a near-field phenomenon that enhances Raman signals near metal nanoparticle surfaces. [29] Lower signals may have been caused by suboptimal contact between sugars and silver particles. Decorating additional receptor molecules on nanopapers could bring glycans near silver particle surfaces, eventually improving SERS signals. [30] The reporter molecule chosen here is boronic acid, owing to its covalent reactions with cis-diols on glycan molecules. Boronic acids could selectively bind to various pairs of hydroxyl groups on glycans, based on the intermolecular distances of the boronic acid moieties. Thus, the nanopaper surfaces were modified with boronic acid molecules to capture glycan molecules.
Figure 1. A schematic of the sensor design. SERS was conducted on nanopaper substrates (i.e., glass fiber papers decorated by silver nanoparticles). The surfaces of silver particles were modified by selected boronic acid receptors that could bind to hydroxyl groups on glycan molecules. The interactions between boronic acids and glycans influence molecular vibrations, leading to unique Raman spectral patterns. The complex Raman data were further analyzed by machine learning algorithms for glycan classification and quantification. The example spectra shown here are A) 4MBA, B) 1TBA, C) 4MBA with glucose, D) 1TBA with glucose.
The sensor design is shown in Figure 1. We modified the nanopaper surface with two types of boronic acids, 4MBA and 1TBA, and evaluated their performance. 4MBA has been extensively used for glycan sensor development, [11–13,16,31] but 1TBA has not been thoroughly investigated. Compared to 4MBA, 1TBA has two aromatic rings in its structure. The spectral shifts caused by 1TBA-glycan binding would differ from those caused by 4MBA-glycan binding. We hypothesized that the collection of spectral information from different boronic acids could improve the accuracy of glycan identification.

Comparison of Experimental and Theoretical Spectra
Raman scattering of 4MBA interacting with glycans has been studied extensively, [12,13,31,32] and the vibrational assignments from the literature are listed in Table S1 (Supporting Information). 1TBA has not been fully investigated; thus, we measured and calculated the Raman spectra of the 1TBA-glycan complex. Figure 2 shows the experimental SERS spectra of glucose and mannose on 1TBA functionalized nanopaper. The SERS signals are over 20 times stronger than the signals on the nanopaper without boronic acid functionalization. The spectral difference among no sugar (1TBA only), 1TBA-glucose, 1TBA-galactose, and 1TBA-mannose could be observed in Figure 2.
Instead of using specific peaks for analysis, the overall changes in the spectra were observed. Density functional theory (DFT) simulations were performed to understand the influence of molecular vibrations on Raman spectra. We conducted the simulations for 1TBA alone and for 1TBA binding to two stereoisomers, glucose and mannose. A prior study has shown that boronic acid derivatives could bind to various combinations of hydroxyl groups on monosaccharides. [33] The flexible hydroxyls on positions 4 and 6 of glucose are most likely to interact with boronic acids; [34,35] thus, we conducted the DFT calculations of 1TBA binding to the 4,6-diols on glucose and mannose. The experimental and theoretical vibrational frequencies, along with their corresponding vibrational assignments and intensities, are shown in Table S2 (Supporting Information). The DFT simulation suggests that the interactions between 1TBA and monosaccharides influence the charge distributions in the 1TBA aromatic ring structure (Figure S4, Supporting Information). In addition, the stereoisomers glucose and mannose could cause different charge distributions, leading to changes in Raman spectra. These findings correspond to the frontier molecular orbital (FMO) analysis done by Revanna et al. [27] For instance, DFT calculations show that the binding of glucose and mannose to 1TBA could intensify the peak at 428 cm⁻¹ (CCCC torsion and SCCC out-of-plane bending); moreover, 1TBA-glucose has a higher intensity at 421 cm⁻¹ compared to 1TBA-mannose at 423 cm⁻¹ (Movie S1, Supporting Information). The changes in simulated molecular vibrations can be observed in other Raman peaks as well, such as 1560 and 1568 cm⁻¹ (Movies S2 and S3, Supporting Information). The changes in these theoretical values correspond to what we observed in the experimental results. The theoretical calculations support our hypothesis that the binding of glycans to boronic acid could shift the vibration modes, leading to unique changes in Raman spectra.

Explore the Spectral Variations among Seven Common Monosaccharides
We first evaluated the capability of monosaccharide detection. The common monosaccharides in mammalian cells (glucose, mannose, galactose, fucose, GlcNAc, GalNAc, and sialic acid) were investigated. [36] Figure 3 shows the average normalized SERS spectra of the selected monosaccharides using 4MBA or 1TBA functionalized nanopapers. SERS intensities from 4MBA functionalized nanopapers were higher than the signals from 1TBA functionalized nanopapers. The thiol-silver bonding enhances the adsorption of 4MBA on silver nanoparticles, resulting in higher signal intensities. [37] The higher SERS signals of 4MBA-glucose nanopapers lead to a higher signal-to-noise ratio (S/N ≈ 417), compared to the spectra on 1TBA-glucose nanopapers (S/N ≈ 148).
Principal component analysis (PCA) was used to explore the major spectral variations among the selected monosaccharides. The use of multivariate analysis enables us to analyze the spectral changes over the full range instead of relying on specific peaks. The first PC explained 90% of the variability. The PCA contribution plots of PC1 for 4MBA and 1TBA nanopapers are shown in Figure S5 (Supporting Information). For 4MBA nanopaper, one of the major changes in the vibrational spectra could be observed at 1008 cm⁻¹, contributed by CC and OH stretching. The other major changes were located at 1570 and 1589 cm⁻¹, which are associated with CC stretching and CH bending on the aromatic ring of 4MBA. Compared with 4MBA spectra, more changes could be observed in 1TBA spectra, which likely means the 1TBA-glycan binding process influences more vibration modes. For 1TBA, there is a shift from a 1095 cm⁻¹ peak in 1TBA to an 1104 cm⁻¹ peak in 1TBA-glucose and 1TBA-mannose SERS spectra (Figure 3), which is assigned to CC stretching. The 1128 cm⁻¹ peak in 1TBA shifts to an 1139 cm⁻¹ peak in 1TBA-glucose and 1TBA-mannose spectra, associated with CC stretching and HCC in-plane bending. These 4MBA and 1TBA spectral observations show that monosaccharide binding causes detectable changes in Raman spectra.

Classification of Monosaccharides via Machine Learning Algorithm
PCA indicated that the binding of different monosaccharides to boronic acid functionalized nanopapers could cause spectral changes at various wavenumbers. The complex spectral variations could not be easily distinguished by visual inspection; thus, multivariate analysis and a machine learning algorithm were utilized to analyze the SERS spectra. SERS spectra were first processed by the multivariate analysis, DAPC, and then a classic machine learning classifier, SVM, was used to classify the spectra. DAPC could extract spectral features related to the key differences among the sample groups and minimize the influence of the variations within the same sample groups. [9,38] The combination of DAPC with SVM has shown superior performance for SERS spectra analysis. [9] The performance of the machine learning classification was expressed using the confusion matrix (Figure S6, Supporting Information).
The nanopapers functionalized with 4MBA show better classification performance than the nanopapers coated with 1TBA. For 4MBA nanopaper, the average classification accuracy could reach 99.7%, and the average sensitivities and average specificities of the selected monosaccharides exceed 99.5%. For 1TBA nanopaper, the average classification accuracy is 97.4%, and the sensitivities and specificities for the selected monosaccharides range from around 93% to 99%. 1TBA nanopaper particularly misclassified GlcNAc. The relatively poor performance of the 1TBA nanopaper may be explained by a lower S/N. For GlcNAc, the S/N on 4MBA and 1TBA nanopapers were 499 and 83.7, respectively. The acetyl group on position C2 of GlcNAc may influence the interaction of GlcNAc with the borate groups, leading to a weaker binding affinity toward 1TBA. Nevertheless, both boronic acids are capable of distinguishing the selected monosaccharides, including the stereoisomers.
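The accuracy, sensitivity, and specificity values quoted above are derived from a confusion matrix. The following is a minimal sketch of that derivation using the standard one-vs-rest definitions; the function name `per_class_metrics` and the example counts are hypothetical and are not taken from Figure S6.

```python
def per_class_metrics(confusion):
    """Return overall accuracy and per-class sensitivity/specificity.

    confusion[i][j] is the number of spectra whose true class is i and
    whose predicted class is j (rows: true, columns: predicted).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    metrics = []
    for k in range(n_classes):
        tp = confusion[k][k]                                   # class k, predicted k
        fn = sum(confusion[k]) - tp                            # class k, predicted other
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp  # other, predicted k
        tn = total - tp - fn - fp                              # other, predicted other
        metrics.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    return correct / total, metrics


# Hypothetical 3-class confusion matrix (illustrative counts only).
example = [
    [198, 2, 0],
    [1, 199, 0],
    [0, 3, 197],
]
accuracy, metrics = per_class_metrics(example)
print(f"accuracy = {accuracy:.4f}")
for k, m in enumerate(metrics):
    print(f"class {k}: sensitivity = {m['sensitivity']:.4f}, "
          f"specificity = {m['specificity']:.4f}")
```

Averaging the per-class values gives the "average sensitivity" and "average specificity" style of summary used in the text.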

Distinguish α-1,4 and α-1,6 Glycosidic Linkage in Maltose and Isomaltose
The types of glycosidic linkages between two saccharide units could influence biological function. [39] Identification of glycosidic linkages is essential in glycan detection. To evaluate the capability of the sensor, we first explored two disaccharides, maltose and isomaltose. Maltose and isomaltose are structural isomers composed of two glucose units. The glucose units of maltose are connected via positions 1 and 4 (an α-1,4 glycosidic linkage), while isomaltose has an α-1,6 glycosidic linkage. Figure 4 shows the SERS spectra of maltose, isomaltose, and glucose using 4MBA or 1TBA functionalized nanopaper.
PCA was performed again to observe the major variations among those three samples (Figure S7, Supporting Information). The relative peak contributions from 4MBA are similar to the ones for the monosaccharide cases. The 1570 and 1589 cm⁻¹ peaks in PC1 of 4MBA (Figure S7A, Supporting Information) contribute the most compared with other peaks, representing the CC stretching and CH bending on the aromatic ring structure of 4MBA. Glycan-borate ester formation can explain the changes in those two peaks. [13] In PC2 (Figure S7B, Supporting Information), we could observe the changes of ring vibrations in the 600-800 cm⁻¹ region and at 1074 cm⁻¹. [40,41] These peaks are associated with the charge transfer in the thiol-conjugated benzene ring. In the 1TBA spectra dataset (Figure S7C, Supporting Information), the main contributions to distinguishing the samples are from the 1078, 1134, 1547, and 1564 cm⁻¹ peaks, originating from the CC stretching and HCC in-plane bending. The 418 cm⁻¹ peak for the SC stretching and SCC in-plane bending does not help as much in the disaccharide case compared to the monosaccharide case. The machine learning classifier was used again to distinguish these three samples (Figure S8, Supporting Information). The 4MBA and 1TBA functionalized nanopapers performed well in differentiating maltose, isomaltose, and glucose spectra, with 99.8% and 100% accuracy, respectively.

Distinguish Glycosidic Linkage between Two Stereoisomers, Glucose and Galactose
The capability of detecting another common disaccharide, lactose, was also evaluated in this study. Lactose consists of one glucose and one galactose connected by a β-1,4 linkage. Prior studies have suggested that the flexible hydroxyls on positions 4 and 6 of glucose are most likely to interact with boronic acids. [34,35] The hydroxyl group of glucose at position 4 is blocked by the glycosidic linkage in lactose molecules. To observe the influence of the glycosidic bonds, we measured the SERS spectra of lactose and of an equal-volume mixture of glucose and galactose (Figure 5). Upon visual inspection, the spectra of lactose and the mixture of glucose and galactose are very similar; thus, PCA was performed to observe the small changes in SERS spectra (Figure S9A, Supporting Information). For 4MBA, the major spectral variations are similar to those in the prior monosaccharide and maltose/isomaltose cases. The shifts of SERS peaks were observed at the 472, 609, and 1073 cm⁻¹ regions. For 1TBA (Figure S9B, Supporting Information), PC1 shows a different result compared with the previous 1TBA PC1 plots for monosaccharides and disaccharides. We found that PC1 has increased contributions from the 579, 892, 1341, 1366, 1417, 1440, 1452, 1457, 1468, 1474, 1487, and 1493 cm⁻¹ regions; however, these peaks exhibit lower signals. The low S/N of the 1TBA receptor leads to poor classification performance (Figure S10, Supporting Information).

Enhancing Classification Accuracy Using the Collective SERS Spectra
4MBA functionalized nanopapers show good performance in distinguishing the seven selected monosaccharides, maltose/isomaltose, and lactose. 1TBA functionalized nanopapers failed to distinguish lactose but showed acceptable accuracy for classifying the other glycans. When the number of classification groups increases, the computational complexity of machine learning classifiers increases dramatically. [42] To evaluate the limitation of the machine learning classifier, we analyzed all of the spectra previously collected from 11 sample groups, including 7 monosaccharides, maltose, isomaltose, lactose, and the mixture of galactose and glucose (Figure S11, Supporting Information). Although the classification performance is still acceptable after increasing the number of classification groups, the average classification accuracies for 4MBA and 1TBA decreased by 0.8% and 8.8%, respectively.
As discussed above, glycan binding to different boronic acids could induce changes in their vibration modes, offering different structural information. We hypothesized that the collective SERS spectra from different boronic acid receptors could enrich the structural information, eventually improving the classification accuracy. This multi-receptor concept for SERS detection has been implemented to improve the accuracy of beverage classification. [43,44] To evaluate this strategy, we integrated the SERS spectra collected from 4MBA and 1TBA functionalized nanopapers and analyzed the collective spectra using the same machine learning algorithm. The confusion matrix of the collective spectra classification is shown in Figure S12 (Supporting Information). The major variations of the collective spectra could be observed in the PCA plots (Figures S13-S15, Supporting Information). In PC1 (which explained 38.88% of the variation), the major spectral variations were contributed by 4MBA. The information from the 1TBA nanopaper dominated PC2 (which explained 31.29% of the variation). Compared with the average classification accuracies obtained using the spectra from an individual boronic acid receptor, 99.2% for 4MBA and 91.2% for 1TBA, we observed an improvement in average classification accuracy, to 99.6%, from the collective spectra.

Milk Sample Classification
To demonstrate the capability of differentiating complex glycan samples, we performed a comparative analysis on glycan samples extracted from cow milk, goat milk, and soy milk. Milk oligosaccharides are essential sources of nutrition, [45] and comparative analysis of milk oligosaccharides can be used for monitoring milk adulteration. [46] Extracted milk oligosaccharides contain various types of glycans that may undergo competitive reactions with the boronic acid receptors. The Raman signal contributed by each glycan would vary with the glycan concentration, resulting in a unique spectral pattern for each glycan mixture. We measured three batches of glycans extracted from different milk samples and classified the Raman spectra (Figure S16, Supporting Information) using the collective spectra method, which achieved 99.8% accuracy (Figure S17, Supporting Information). The result demonstrates the potential of this sensing platform for comparative analysis of complex glycan samples.

Quantification of Sialic Acids
In addition to qualitative analysis, we asked whether the sensing platform could offer quantitative analysis. To evaluate the potential of glycan quantification, we measured the mole fractions of sialic acid in lactose. Similar to the milk oligosaccharide cases, sialic acid and lactose could competitively bind with boronic acids and result in different spectral changes based on the mole ratios of sialic acids. Such spectral information could assist in quantitative analysis.
Sialic acids play essential roles in cellular communication and the immune system. The amount of sialic acids in milk is critical for infant health. [47] For example, goat milk has a relatively higher sialic acid content than bovine milk products; [47,48] as such, goat milk is considered a better alternative to human breast milk. A quick and easy way to monitor the sialic acid concentration could assist in the quality control of dairy products.
Since lactose is highly abundant in milk, we mixed sialic acids with lactose to mimic the milk glycosylation environment. The mixtures of sialic acids and lactose were spotted on 4MBA and 1TBA functionalized nanopapers, and the SERS spectra were collected as described previously. The spectra were processed by DAPC, and the canonical variables were used to build a machine learning regression model. [51–53] We established Gaussian process regression (GPR) models using the SERS spectra from a single boronic acid (4MBA or 1TBA) and the collective spectra. The mole fractions of sialic acids spiked into the background lactose range from 0 to 2 mol%. This concentration range was chosen based on typical concentrations of sialic acids in goat milk in different lactation periods. [48] The quantification model performance was evaluated by 5-fold cross-validation.
4MBA shows better quantification performance than 1TBA (Figures S18 and S19, Supporting Information). The 4MBA model showed good prediction (R² of 0.9978) and a low NMSE (0.00214). The 1TBA model still exhibited acceptable quantification performance (R² of 0.9937 and NMSE of 0.00629). The quantification performance could be further improved using the collective spectra (Figure S20, Supporting Information). The model based on the collective spectra could reach the best R² of 0.9981 and the lowest NMSE of 0.00195.
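The two figures of merit reported here can be computed as sketched below. The text does not define NMSE, so this sketch assumes the common definition of mean squared error divided by the variance of the observed values; under that assumption NMSE equals 1 − R² when both are computed on the same predictions, which is at least roughly consistent with the reported pairs. The mole fractions and predictions are hypothetical values chosen for illustration.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nmse(observed, predicted):
    """Normalized mean square error.

    ASSUMED definition (not stated in the text): mean squared error
    divided by the population variance of the observed values.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    variance = sum((o - mean_obs) ** 2 for o in observed) / n
    return mse / variance


# Hypothetical sialic acid mole fractions (mol%) and model predictions.
observed = [0.0, 0.5, 1.0, 1.5, 2.0]
predicted = [0.02, 0.49, 1.01, 1.48, 2.0]

r2 = r_squared(observed, predicted)
err = nmse(observed, predicted)
print(f"R^2 = {r2:.4f}, NMSE = {err:.4f}")
```

In a cross-validated setting, the predictions passed to these functions would be the held-out predictions pooled over the folds.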

Conclusion
We have designed a machine learning-driven SERS glycan sensor that enables the classification of the selected glycans with over 99% accuracy. Two commercially available boronic acid receptors (4MBA and 1TBA) were tested, and they could effectively capture glycan molecules through selective cis-diol chemical interactions. With the help of multivariate analysis, we could detect the overall spectral variations over a broad frequency range. The DFT simulations show that glycan-boronic acid reactions alter the molecular vibrations and charge distributions, inducing distinct spectral variations. Even stereoisomers (glucose, mannose, and galactose) could trigger sufficient spectral changes that can be detected by the machine learning algorithm. Furthermore, glycan binding can influence the vibrations of the aromatic ring structures on the boronic acid receptors; therefore, each boronic acid receptor could offer a unique glycan binding spectrum. The unique spectra can be used for comparative studies of different glycans. By integrating the spectra obtained from 4MBA and 1TBA receptors, the structural information of the glycans was enriched, leading to improved classification accuracy.
1TBA exhibited relatively poorer performance than 4MBA in both classification and quantification due to its lower S/N. The strong thiol-silver bond improves the contact between 4MBA and silver nanoparticles, leading to optimal signal enhancement. The relatively weaker interaction between thianthrene and silver might result in suboptimal contact of 1TBA with silver nanoparticles. Boronic acids that can strongly bind to the SERS substrate (e.g., boronic acids with thiol or amine groups) would be better choices for glycan analysis.
Recognition of isomeric structures is one of the major challenges in glycan analysis. [2] This SERS glycan sensor distinguishes not only the stereoisomers (glucose, mannose, and galactose) but also the structural isomers with different glycosidic linkages. The sensor could recognize the α-1,4 and α-1,6 glycosidic linkages in maltose and isomaltose, as well as the β-1,4 glycosidic linkage between glucose and galactose in lactose molecules. The competitive binding of different glycans in a complex sample with boronic acids enables the differentiation of extracted milk oligosaccharides. In addition to qualitative classification, this sensor can quantify the mole fractions of sialic acids within a lactose background using a machine learning regression model. The quantification accuracy was further improved by using the collective spectra obtained from 4MBA and 1TBA receptors.
In summary, combining SERS, boronic acid receptors, and machine learning-driven chemometrics offers a rapid analytical approach for comparative glycan detection. The collective spectra from multiple types of boronic acid receptors can enrich the structural information of glycans, leading to improved classification and quantification accuracies. This study focused on the analysis of small oligosaccharides for proof-of-concept. For more complex glycan structures, the detection accuracy may be improved by using an array of boronic acid receptors.
Nanopaper Fabrication: Nanopapers were fabricated as previously reported. [6] In brief, Tollens' reagent containing 300 mM ammonia and 50 mM silver nitrate was prepared in a 2 L glass beaker in a 55 °C water bath. Glass microfiber papers were immersed in the solution, and 500 mM glucose solution was added to initiate the silver mirror reaction. After the reaction was complete, the filter papers were rinsed thoroughly with deionized water and 2-propanol. The resulting products, i.e., the nanopapers, were stored in 2-propanol. The storage container was covered with aluminum foil and placed in drawers to prevent light exposure.
Substrate Surface Modification and Glycan Detection: Before surface modification, nanopapers were cut into 1 cm × 0.5 cm rectangles. Nanopapers were coated with boronic acids by immersing the substrates in 50 mM 1TBA or 0.1 mM 4MBA in methanol for 1 h. For glycan detection, the boronic acid coated nanopaper was immersed in an aqueous solution containing 1 mM of the desired glycans for 1 h. Before Raman measurement, the paper was dried in an oven at 75 °C for 5 min.
Raman Measurements: Raman spectra were collected with a ThermoFisher Scientific DXR3 Raman microscope using laser excitation with a wavelength of 785 nm and an output power of 1 mW. This instrument was equipped with an Olympus BX41 optical microscope and a thermoelectrically cooled charge-coupled detector (ThermoFisher front-illuminated CCD system) with a 1024 × 256 pixel format, operating at −70 °C. The signal was calibrated with an internal polystyrene standard, and a 10× objective was used. The spot size was about 3.8 μm. For each sample, 200 SERS spectra were collected at different spots, each with an exposure time of 1 s and 5 accumulations.
Data Processing of Raman Spectra: The data analysis was performed using the same methodology reported in the previous study. [9] The data analysis scheme is shown in Figure S2 (Supporting Information). Briefly, the spectra were first processed using asymmetric least squares (ALS) baseline correction. Then, the baseline-corrected spectra were vector normalized and Savitzky-Golay smoothed (4th order polynomial, with a frame size of 37). Finally, multivariate analysis techniques and classification algorithms were applied in the spectral range 400-1650 cm⁻¹. Data processing was conducted with Matlab 2021b.
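The text names ALS baseline correction and vector normalization without giving formulas or parameter values. As a hedged illustration only, a widely used formulation of ALS (penalized least squares with asymmetric weights, as described by Eilers and Boelens) is written out below; it is an assumption that this variant was applied here, and the symbols λ (smoothness penalty), p (asymmetry), y (measured spectrum), and z (estimated baseline) are introduced for illustration.

```latex
\[
\begin{aligned}
\hat{z} &= \arg\min_{z}\; \sum_{i=1}^{N} w_i \,(y_i - z_i)^2
           \;+\; \lambda \sum_{i=3}^{N} \bigl(z_i - 2 z_{i-1} + z_{i-2}\bigr)^2,
\qquad
w_i =
\begin{cases}
p, & y_i > z_i,\\[2pt]
1 - p, & y_i \le z_i,
\end{cases}
\\[6pt]
x_i &= y_i - \hat{z}_i,
\qquad
\tilde{x}_i = \frac{x_i}{\sqrt{\sum_{j=1}^{N} x_j^{2}}}.
\end{aligned}
\]
```

The first line estimates the baseline iteratively, since the weights depend on the current baseline; the second line gives the baseline-corrected spectrum and its vector (unit Euclidean norm) normalization, after which Savitzky-Golay smoothing would be applied.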
Multivariate Analysis and Machine Learning Classification: Before applying classifiers, the smoothed spectra were processed using multivariate statistical analysis to reduce the complexity and extract the significant spectral features that explain the most variance. Discriminant analysis of principal components (DAPC) was used in this study. [38] Principal component analysis (PCA) was first applied to reduce the data complexity; then a supervised multivariate analysis, discriminant analysis, was used to further discriminate the dataset by correlating the variation in the data with the sample information.
After feature extraction, common machine learning classifiers were used to classify SERS spectra. Support vector machine (SVM) was selected because of its superior performance in Raman spectral analysis. [9] Fivefold cross-validation was performed to assess the suitability of the classification algorithm. [54] In brief, the training and validation sets were established by random selection from the Raman spectra data. The training dataset was used to generate a classification model, and the model predicted the validation dataset to evaluate the performance. The cross-validation approach was repeated five times, wherein the validation set in each repetition consisted of 280 randomly selected SERS spectra for the monosaccharide case, 120 for the disaccharide structural isomer case, 80 for the glycosidic linkage case, and 440 for the all-sample and collective spectra cases. The model's performance was evaluated using classification accuracy, sensitivity, and specificity.
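The validation-set sizes quoted above follow from simple arithmetic if each sample group contributes the 200 spectra stated in the Raman Measurements section and the data are divided into five equal folds. The sketch below checks those numbers and shows one plain way to generate random fivefold splits; the group counts per case are inferred from the text, and the function names and random seed are hypothetical.

```python
import random

SPECTRA_PER_GROUP = 200  # spectra collected per sample (from the text)
N_FOLDS = 5              # fivefold cross-validation (from the text)


def fold_size(n_groups, spectra_per_group=SPECTRA_PER_GROUP, n_folds=N_FOLDS):
    """Number of spectra in each validation fold."""
    return n_groups * spectra_per_group // n_folds


def k_fold_indices(n_samples, n_folds=N_FOLDS, seed=0):
    """Shuffle sample indices and return (training, validation) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]  # near-equal parts
    splits = []
    for i in range(n_folds):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, validation))
    return splits


# Number of sample groups per classification case (inferred from the text).
cases = {
    "monosaccharides": 7,
    "maltose/isomaltose/glucose": 3,
    "lactose vs. glucose+galactose mixture": 2,
    "all sample groups": 11,
}
sizes = {name: fold_size(n) for name, n in cases.items()}
for name, size in sizes.items():
    print(f"{name}: {size} validation spectra per fold")

splits = k_fold_indices(7 * SPECTRA_PER_GROUP)
```

Each spectrum appears in exactly one validation fold, so pooling the five folds yields one held-out prediction per spectrum.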
The collective spectra were constructed by combining the truncated 1TBA spectra (400-1650 cm⁻¹) and the truncated 4MBA spectra (400-1650 cm⁻¹). Then, the collective spectra went through the same multivariate analysis and classification algorithm as the individual spectra.
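One plausible reading of "combining" is end-to-end concatenation of the two truncated spectra into a single feature vector, which is what the sketch below does. This is an assumption, as is the pairing rule (a 1TBA spectrum is paired with a 4MBA spectrum from the same sample group); the function names and the toy three-point spectra are hypothetical.

```python
def collective_spectrum(spectrum_1tba, spectrum_4mba):
    """Concatenate two preprocessed spectra into one feature vector.

    ASSUMPTION: "combining" means joining the 1TBA intensities followed
    by the 4MBA intensities; the text does not spell out the operation.
    """
    return list(spectrum_1tba) + list(spectrum_4mba)


def build_collective_dataset(spectra_1tba, labels_1tba, spectra_4mba, labels_4mba):
    """Pair spectra position by position and concatenate each pair.

    ASSUMPTION: both lists are ordered so that paired spectra share the
    same sample label; a mismatch is treated as an error.
    """
    if list(labels_1tba) != list(labels_4mba):
        raise ValueError("1TBA and 4MBA spectra must be paired by sample label")
    combined = [
        collective_spectrum(a, b) for a, b in zip(spectra_1tba, spectra_4mba)
    ]
    return combined, list(labels_1tba)


# Toy spectra with three intensity points each (illustrative values only).
spectra_1tba = [[0.1, 0.7, 0.2], [0.3, 0.5, 0.4]]
spectra_4mba = [[0.6, 0.1, 0.9], [0.2, 0.8, 0.3]]
labels = ["glucose", "mannose"]

collective, collective_labels = build_collective_dataset(
    spectra_1tba, labels, spectra_4mba, labels
)
print(len(collective[0]), collective_labels)
```

The resulting vectors are twice as long as a single-receptor spectrum and can be passed to the same DAPC and classification steps.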