Random forest and live single‐cell metabolomics reveal metabolic profiles of human macrophages upon polarization

Human macrophages are innate immune cells with diverse, functionally distinct phenotypes, namely, pro‐inflammatory M1 and anti‐inflammatory M2 macrophages. Both are involved in multiple physiological and pathological processes, including would healing, infection, and cancer. However, the metabolic differences between these phenotypes are largely unexplored at single‐cell resolution. To address this knowledge gap, an untargeted live single‐cell mass spectrometry‐based metabolomic profiling coupled with a machine‐learning data analysis approach was developed to investigate the metabolic profile of each phenotype at the single‐cell level. Results show that M1 and M2 macrophages have distinct metabolic profiles, with differential levels of fatty acyls, glycerophospholipids, and sterol lipids, which are important components of plasma membrane and involved in multiple biological processes. Furthermore, we could discern several putatively annotated molecules that contribute to inflammatory response of macrophages. The combination of random forest and live single‐cell metabolomics provided an in‐depth profile of the metabolome of primary human M1 and M2 macrophages at the single‐cell level for the first time, which will pave the way for future studies targeting the differentiation of other immune cells.


| INTRODUCTION
Macrophages are critical immune cells known to be highly heterogenous with two major functionally distinct subtypes (Franken et al., 2016;Verreck et al., 2006;Verreck Frank et al., 2004). Namely, classically activated M1 (pro-inflammatory phenotype) and alternatively activated M2 macrophages (antiinflammatory phenotype), which represent the extreme polarization states of macrophages in response to different environmental changes and stimuli (Yao et al., 2019). M1 macrophages have an enhanced bactericidal and tumoricidal capacity fueled by the production of high levels of pro-inflammatory cytokines. On the other hand, M2 macrophages are commonly associated with the secretion of anti-inflammatory cytokines to promote the resolution of inflammation and facilitate tissue repair and remodeling (Cammarota et al., 2020;Shapouri-Moghaddam et al., 2018;Verreck et al., 2004Verreck et al., , 2006Zhang et al., 2017). The balance between these two macrophage phenotypes plays a central role in maintaining tissue homeostasis under steady-state conditions and after exposure to pathogens and tissue damage (Abuawad et al., 2020). It is well known that the progression of various inflammatory diseases is also determined by the balance of proper polarization and functioning of the two phenotypes (McWhorter et al., 2013;Raggi et al., 2017). However, the molecular events that control M1 and M2 polarization and the metabolic differences between human M1 and M2 macrophages that determine their distinct functional properties are not fully understood, especially at the single-cell level. There are several limiting factors, among them are the technical limitations associated with analyzing the small volumes of single cells, and the difficulty in extracting meaningful insights from the complex, multidimensional data sets generated from untargeted metabolomics experiments (Evers et al., 2019).
The usage of primary human M1 and M2 macrophages in metabolome comparison studies is currently limited. Previous studies were conducted on murine bone marrow-derived macrophages or immortalized macrophage cell lines (Diskin & Pålsson-McDermott, 2018), all of which have various extents of differences with human primary macrophages. These differences include but are not limited to cell surface markers expression, compound production (e.g., nitric oxide), arginine metabolism, transcriptomic and proteomic profiles, and so on (Andreu et al., 2017;Fuchs et al., 2019;Martinez et al., 2013;Murray & Wynn, 2011;Tedesco et al., 2018;Thomas & Mattila, 2014;Vrieling et al., 2020). Therefore, additional research is needed for better understanding of the correlation between the metabolic state and the phenotypic heterogeneity of macrophages, with particular emphasis on human primary macrophages. Metabolomics is a powerful tool that can achieve this objective by identifying changes in small molecule metabolite profiles and metabolic pathways. However, traditional population-level metabolomic approaches do not consider the inherent cellular heterogeneity and instead average out the metabolic profile across a large number of cells (Evers et al., 2021). This is especially relevant to macrophages, which are known to be heterogenous. Therefore, ignoring cellular heterogeneity may adversely affect biological insights gained from averaged population metabolic studies.
Single-cell metabolomics provide the capability to reveal the cellular metabolic heterogeneity of individual cells that is often masked in pooled analyses (DeVilbiss et al., 2021;Duncan et al., 2019;Kumar et al., 2020). Cellular heterogeneity is ubiquitous but still poorly understood. This is mainly due to the insufficient sensitivity and inapplicability of current technologies to low analyte abundances and limited sample volumes associated with single cells (Ali, Abouleila, Shimizu, Hiyama, Emara, et al., 2019;Yin et al., 2019). Live single-cell mass spectrometry (LSC-MS) is a promising technique that has the required sensitivity and selectivity for single-cell metabolic profiling (Ali, Abouleila, Shimizu, Hiyama, Watanabe, et al., 2019;Duncan et al., 2019;Evers et al., 2021;Okubo-Kurihara et al., 2022).
In LSC-MS, a live single cell or single organelle is sampled into a tapered glass microcapillary under microscopic observation with minimal disruption to the cellular microenvironment ( Figure 1a). The sampling capillary is then attached to a modified mass spectrometry source where the cellular contents are sprayed, and subsequently measured by mass spectrometry with minimal dilution . This technique has been successfully employed to discern metabolic differences and targeted metabolites analysis for single plant cells, and mammalian cells Fujii et al., 2015), but never been performed on primary human cells and immune cells. Another reason that hinders the adoption of single-cell metabolic studies is the difficulty of analyzing the complex spectral data sets generated from such experiments. This complexity is largely due to the lack of proper sampling processing and separation, two essential steps in traditional metabolomics experiments that are extremely difficult to implement in single-cell studies due to the low volumes of single cells.
Random forest (RF) is a machine-learning (ML) approach that uses an ensemble classification trees for classification and feature F I G U R E 1 Schematic of live single-cell mass spectrometry combined with random forest. (a) Single M1 or M2 macrophage samples were picked up using a tapered glass capillary attached to a micromanipulator, followed by mass spectrometry measurement after adding ionization solvent from the opposite end of the capillary. (b) Random forest model was constructed on the obtained mass spectrometry data sets after data preprocessing to characterize the potential distinct metabolic signature between human M1 and M2 macrophages. (Zhao et al., 2019). RF has been applied in the analysis of population-level metabolome data obtained from mass spectrometry measurements (Liebal et al., 2020), with its unique advantages, namely, high-classification performance, no need of kernel and complex parametrization adjustments, fast at prediction time, and can provide feature importance assessment for the results (Melo et al., 2018). Therefore, it represents a promising candidate as a highperformance classifier based on the mass spectral input data (massto-charge ratio [m/z value] × intensity) generated from the analysis of single-cell samples ( Figure 1b). Furthermore, features used in the classification can be extracted from the RF model, and then used to identify the peaks unique to each phenotype.
Here, we demonstrate for the first time the applicability of LSC-MS to metabolic analysis of primary human cells, specifically immune cells. To achieve this, an innovative single-cell methodology based on untargeted LSC-MS metabolomic profiling combined with RF was developed and applied to provide an accurate prediction model for metabolomics, and enable single-cell-based discrimination of primary human M1 and M2 macrophages. This platform succeeded in capturing the metabolic signature unique to M1 and M2 macrophages, and then leveraging it to classify each phenotype with a high degree of selectivity and sensitivity, all at the single-cell level. Furthermore, putatively annotated metabolites that are differentially present in each phenotype were extracted from the RF model, such as fatty acyls, glycerophospholipids, and sterol lipids, which are important components of plasma membrane and involved in multiple biological processes such as inflammation and cell differentiation. These findings showcase the potential of using single-cell metabolomics, coupled with RF models to do phenotypical classification based on single-cell metabolomics data, and subsequently, gain a deeper understanding of the metabolome of heterogenous cell populations. This will help fuel the future research that aims to explore modulatory mechanisms of immune cell differentiation at the single-cell level.
As quality control, macrophages were stained for surface expression of CD14, CD163, and CD11b acquired on a flow cytometry (BD LSRFortessa, BD Biosciences). Macrophage differentiation and activation status was determined by quantifying IL-12 and IL-10 secretion by ELISA following stimulation of cells in the presence of 100 ng/mL lipopolysaccharide for 24 h.

| Single-cell sample preparation
Single-cell sampling was achieved using a sampling setup which includes a 3D micromanipulator (Narishige) connected to a platinumcoated glass capillary (Humanix) and fixed onto a video microscope (Olympus). Macrophages cultured in a 35 mm petri dish were observed under microscopy, a single M1/M2 macrophage of interest was sucked into a glass capillary (CT-2; Humanix) by applying negative pressure on the sampling capillary with constant visual feedback from the microscope (n = 23 for single M1 and n = 33 for single M2 samples). Similarly, the culture medium was also sampled as control (n = 7). To each capillary, 2 μL of the ionization solvent (a mixture of 80% methanol, 10% dimethyl sulfoxide, and 0.1% formic acid) was introduced from the rear end, followed by the introduction of the capillary's content into the mass spectrometer by applying high voltage to the capillary, where the voltage differential between the capillary and the mass spectrometer's inlet causes the capillary contents to be sprayed into the mass spectrometer for measurement.

| Mass spectrometry measurement
Mass spectrometry measurements were performed on a Q-Exactive orbitrap mass spectrometer (Thermo Fisher Scientific) equipped with an offline nanoelectrospray source (Nanospray Flex; Thermo Fisher Scientific) to introduce the cellular contents into the instrument.
Since the sampling capillaries are coated with platinum, they also act as a nanospray emitter once a voltage differential is established between the capillary and the mass spectrometer inlet. To initiate measurements, the glass capillary was fixed onto the offline nanospray source at 2 mm away from the inlet. Spectra acquisition was done in positive mode, with spray voltage set between 1 and 1.5 kV, to maintain a spray current between 0.1 and 0.12 microampere. The inlet capillary temperature was set to 250°C, instrument resolution to 100,000 full-width half maxima, normalized automatic gain control target to 1e6, and maximum injection time to 500 ms. The mass spectra were obtained in successive SIM regions covering the m/z ranges from 100 to 880 m/z. Each SIM window width was set to 50 m/z. The instrument was calibrated according to the manufacturer's standard operating procedures before the start of measurements every day to make sure that the m/z error does not exceed the accepted values (5 ppm).

| Data processing
Measured data were converted from the vendor's raw proprietary format to text files by an in-house script after centroiding.
The resulting peak lists were aligned using MarkerView (Sciex) with an m/z threshold of 5 ppm. The aligned peaks were then imported in R statistical software, where the rest of data processing and analysis was done. To check the quality of the MS spectra obtained, the total number of peaks, median measured m/z, and the peak intensities were plotted. Based on the aforementioned data, a total of six samples were eliminated based on their extremely low number of peaks of intensities above 1000 absolute intensity value, indicating an unsuccessful MS run.
Of the removed samples, two were M1 cells (n = 21 left), and four were M2 cells (n = 29 left). Background subtraction was achieved by removing peaks in the samples appearing in 75% of the measured blanks. Furthermore, contaminants, salt clusters, and exogenous molecules were removed from the data set by calculating the mass defect of all measured peaks and eliminating those with a characteristically high mass defect (McMillan et al., 2016). Peak intensity values were then log2 transformed, and pareto scaled in preparation for the statistical analysis.

| Statistical analysis
RF was selected as the ML algorithm of choice due to its good performance in phenotypic discrimination and biomarker isolation (Chen et al., 2013), especially in high-dimensional data sets as is the case in this study.
Highly correlated peak pairs with a Pearson correlation coefficient of more than 0.75 were identified, and one of the peaks in each correlating set was removed to prevent deterioration of the statistical mode performance. Feature selection was then done using recursive feature elimination (RFE) with preset subsets. RFE is an efficient approach for feature selection by eliminating specified proportion of variables with the smallest importance upon each iteration, so that determines a minimal subset of variables needed for an effective model with good prediction accuracy (Acharjee et al., 2020;Darst et al., 2018;Degenhardt et al., 2019). The resultant features were then used in the first RF model. M1 and M2 samples were split randomly into a training set (70% of the samples) and a test set (30% of the samples). The training set was then used for cross-validation of discriminative features identification as shown below, while the test set was set aside for assessing the performance of the RF model later.
Cross-validation was done using successive splits of the training data set to test the performance of the generated model. This was done by manual cross-validation, where the data were randomly split into five subsets. In each subset, an RF model was built to classify both phenotypes. The features that were commonly used by all five cross-validation models were selected and designated as the most common discriminative features. The receiver operating characteristic (ROC) curves of previous models were then plotted.
The identified common discriminative features from the previous step were in the first RF model. The model was tuned by selecting the optimum mtry value that corresponds to the lowest out-of-bag (OOB) error rate. The mtry represents the number of variables randomly sampled as candidates at each split, which is important parameter in RF: individual trees would have poor predictions with a low mtry value, and collection of trees would not diverse enough with high value (Capitaine et al., 2020;Couronné et al., 2018;Wenck et al., 2022). Typically, around 2/3 of samples are used as training set while the remaining is used as test set, termed as the OOB samples (Acharjee et al., 2020;Zhao et al., 2019). OOB is used to access the prediction performance of the model, the lower the OOB error rate, the better the classification (Oza et al., 2019). For the number of trees selection, number of trees versus the OOB were plotted (Figures 2a and 3a) and 500 trees were chosen.
The first model was tested against the test set and by using confusion matrix (Figure 2b) Table S1).

| RF model and classification performance
The first RF model was established based on the most common 151 peaks. In total, 500 trees were grown and during the growing, proximities were computed for the cases. Similar cases may fall into the same terminal node or derive from the same parent. After tuning our RF model using 500 trees and mtry = 23 resulted a model had OOB estimate of error rate = 11.43%. The OOB error rate did not decrease with the number of trees constructed, and the RF algorithm could avoid overfitting to a certain extent as shown in Figure 2a. Its classification accuracy was tested using a confusion matrix against the main test data set that was set aside from the beginning and resulted in an accuracy of 0.93  (Figure 3d-f). T-test was implemented to test the significance of these metabolites by their peak intensity as summarized in Table 1. The classes of metabolites that could be putatively annotated were fatty acyls, sterol lipids, glycerophospholipids, amino acids, and others.

| DISCUSSION
In this study, we present a powerful connection of LSC-MS and RF prediction model for metabolomics data analysis. The model was built by combining RF algorithm with RFE and cross-validation to find the features necessary and sufficient to classify single-cell MS data by ranking feature importance. It was then applied to incorporate spectral data from all single-cell samples and isolate the most discriminant feature m/z peaks between polarized macrophages. Furthermore, using this model, we could isolate several putatively annotated molecules that contribute to discriminating macrophages upon different polarization. Although both LSC-MS and RF are wellestablished and have been largely utilized within their respective fields, the connection of these two provides an alternative to existing approaches given its effective performance on sparse, highdimensional data with collinear features and straightforward understandability for single-cell data (Park et al., 2020). Despite the F I G U R E 2 Classification performance of the first random forest (RF) model for discriminating M1 and M2 macrophages. (a) Plot of overall out-of-bag (OOB)/overall error rate for RF classification of M1 and M2 versus number of trees. The black curve is the OOB of the model, the green curve is the OOB error rate when classifying M1 (reference), and the red curve is the OOB error rate when classifying M2. (b) Confusion matrix generated after training the first model with the selected peaks. (c) Assessing the performance of RF using cross-validation. A 5-fold cross-validation of the RF model was performed to select the common important peaks by each cross-validated model. The final performance with area under the curve value of 0.917 demonstrates the overall stability of our RF model. (d) The contribution of individual peaks in the RF model. The mean decrease in accuracy and alternative approach using mean decrease in Gini were used to rank the relative importance of individual peaks for metabolic pattern recognition in the RF model. Only the 30 top-ranked peaks and variables for each cell type are shown. (e) Multidimensional scaling plot for mass spectrometry data of M1 and M2. (f) Heatmap of important peaks found in all collected M1 and M2. Heatmap visualizes the intensity of each significant peak measured in all M1 and M2 single-cell samples. The color scales between black and bright green represent lower intensity to higher intensity, respectively. Most of the putatively annotated metabolites belong to lipids. It has been reported previously that lipids, such as fatty acyls, glycerophospholipids, sterol lipids, and so on, are components of the cells' plasma membrane and are involved in multiple biological T A B L E 1 Summaries of the discriminatory metabolites between M1 and M2. | 2321 processes such as inflammation and cell differentiation (Glass & Olefsky, 2012;Masoodi et al., 2015;Zhang et al., 2017). Their metabolism has a role in the pro-or anti-inflammatory functions of macrophages by meeting energetic requirements and modulating membrane fluidity (Mukundan et al., 2009;Zhang et al., 2017 (Lee et al., 2016), which has been theorized to be a potential mechanism for inflammation (Na et al., 2016). Additionally, we observed M1 has increased level of TG 60:14. This is consistent with the results of previous work on bulk level that M1 has substantially higher levels of TG (Morgan et al., 2021). In particular, TG was recently demonstrated to play important antibacterial roles and its synthesis was demonstrated to be required for macrophage inflammatory functions especially the production of PGE2 (Bosch et al., 2020;Castoldi et al., 2020). Furthermore, cortisol (ST 21:3;O5) was observed, which is a potent mediator of the activity of macrophages and for the regulation of macrophage polarization, via intracrine interaction with glucocorticoid receptors, to finally determine the outcome of infection (Maciuszek et al., 2019(Maciuszek et al., , 2020. The only metabolites higher in M2 was FA 20:4;O (20-HETE) which is involved in arachidonic acid metabolism. This is consistent with previous findings which stated that this pathway is the most remarkable lipid metabolism disparity between M1 and M2. This finding reaffirms its pivotal role in determining the phenotype of M2 macrophages and highlights the potential of using it as a biomarker for M2 differentiation (Xu et al., 2021).
Despite its promise, it's worth noting that this method is not without its set of limitations. Single-cell sampling process suffers from low throughput and requires highly skilled operators, applying automated methods or microfluidic techniques may overcome these challenges for large-scale single-cell studies.
Although many metabolites were putatively annotated, the limited volume of a single cell makes it difficult to perform exhaustive MS/MS identification. It is important to note that the data obtained from single-cell measurements in the current study is semi-quantitative at best. This is due to the difficulties in incorporating internal standards and homogenizing them with single cell without significant dilution and consequently, loss of signal. Progress in enrichment and separation techniques such as capillary electrophoresis, or miniaturized sample preparation via microfluidics can be utilized to overcome this limitation in a future study. Furthermore, the lack of a separation technique adversely affects the selectivity of the method and makes it difficult to perform structure elucidation, especially in the case of lipids.
Thus, further studies involving targeted and quantitative analytical measurements at single-cell level are required to confirm the metabolomic alterations detected.

| CONCLUSIONS
This proof-of-concept study presents a powerful combination of high-resolution LSC-MS, an RF prediction model for data analysis and rapid database match to distinguish human M1 and M2 macrophages. This platform is sensitive enough to detect subtle changes in a wide range of metabolites from single-cell macro-

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.