ANPELA: Significantly Enhanced Quantification Tool for Cytometry‐Based Single‐Cell Proteomics

Abstract ANPELA is widely used for quantifying traditional bulk proteomic data. Recently, there has been a clear shift from bulk proteomics to single-cell proteomics (SCP), for which powerful cytometry techniques demonstrate a remarkable capacity for capturing the cellular heterogeneity that is completely overlooked by traditional bulk profiling. However, the in‐depth and high‐quality quantification of SCP data remains challenging and is severely affected by the large number of available quantification workflows and the extreme dependence of performance on the studied datasets. In other words, the proper selection of well‐performing workflow(s) for any studied dataset is elusive, and a significantly enhanced and accelerated tool is urgently needed to address this issue; however, no such tool has been developed yet. Herein, ANPELA is therefore updated to its 2.0 version (https://idrblab.org/anpela/), which is unique in providing the most comprehensive set of quantification alternatives (>1000 workflows) among all existing tools, enabling systematic performance evaluation from multiple perspectives based on machine learning, and identifying the optimal workflow(s) using overall performance ranking together with parallel computation. Extensive validation on different benchmark datasets and representative application scenarios suggests the great application potential of ANPELA in current SCP research for gaining more accurate and reliable biological insights.

Figure S1. The comparisons of six representative preprocessing workflows and their quantification performances for the PTI dataset PMID22902532 [1] (Table 2). Similar to Figure 5, the correspondence to/violation of known pathways was shown based on the resulting protein activation sequence. (a) and (b) separately represented the workflows that performed consistently superior/good and consistently poor (as shown in Figure 2), including the first- and last-ranked workflows demonstrated in Figure 5. The proteins marked in red font were the earliest activated in each pathway, and those filled in red squares were considered to violate the prior knowledge.

Figure S2. Circular stacked bar plots of the multi-criteria comprehensive performance evaluation for all benchmark datasets except those shown in Figure 2. For a certain workflow, the stacked bars from the inner to the outer represented Criteria Ca-Cc of CSI or PTI.

Figure S3. Analysis of robustly well-performing workflows across multiple CSI-FC datasets (Table 2) to identify common trends among them. (a-c) Circular heatmaps of workflows across datasets under each criterion (the outermost layer), their corresponding performances (the middle layer; the brighter the color of a cell, the better the performance) and their mean assessing values (the inner layer; dots colored in light green/red represented robustly well/poor-performing workflows, distinguished by the cutoff separating good and poor workflows under each criterion). Since few robustly well-performing workflows were commonly identified under all criteria (only three workflows, i.e., "NON+BOX+NON+FCL", "NON+BOX+NON+NON" and "NON+TRU+NON+FAI"), their distributions were not analyzed.

Figure S4. Analysis of robustly well-performing workflows across multiple PTI-MC datasets (Table 2) to identify common trends among them. (a-c) Circular heatmaps of workflows across datasets under each criterion (the outermost layer), their corresponding performances (the middle layer; the brighter the color of a cell, the better the performance) and their mean assessing values (the inner layer; dots colored in light green/red represented robustly well/poor-performing workflows, distinguished by the cutoff separating good and poor workflows under each criterion). (d) The distribution of the shared robustly well-performing workflows under all criteria (the intersection of green dots in a-c).

Figure S5. Analysis of robustly well-performing workflows across multiple PTI-FC datasets (Table 2) to identify common trends among them. (a-c) Circular heatmaps of workflows across datasets under each criterion (the outermost layer), their corresponding performances (the middle layer; the brighter the color of a cell, the better the performance) and their mean assessing values (the inner layer; dots colored in light green/red represented robustly well/poor-performing workflows, distinguished by the cutoff separating good and poor workflows under each criterion). (d) The distribution of the shared robustly well-performing workflows under all criteria (the intersection of green dots in a-c).

…at higher (positive and negative) values. [14]

QUA (QuadraticTransform; FC & MC): This method uses a quadratic function to transform the raw data. [5]

SCA (ScaleTransform; FC & MC): This method subtracts the corresponding minimum value from each specified protein expression value and then divides the data by the range, which scales the specified markers to [0, 1]. [7]

TRU (TruncateTransform; FC & MC): This method truncates the lower limit of the dynamic range at a pre-determined cutoff value and replaces any lower value with the cutoff value. [5]

Normalization

Bead-based Normalization (MC): This method first identifies the isotope-containing bead events, then converts the raw data to local medians, and finally computes the average across all files, which is utilized to calculate the slopes for each time point and is multiplied by all data acquired at the corresponding time. [3,15]

GSN (GaussNorm; FC & MC): This method identifies landmarks based on a pre-determined maximum number M and kernel density estimation, then classifies the detected landmarks into M classes and transforms the landmarks of each sample independently to align them. [16]

WPS (WarpSet; FC & MC): This method identifies landmarks based on kernel density estimation without a pre-determined maximum number, then estimates the most likely total number K of landmarks, performs K-means clustering to classify the landmarks, and finally transforms the data using the warping function estimated for each sample. [16]

Signal Clean

FlowAI (FC & MC): This method removes surges and large deviations of the trend from the median of the flow rate, then eliminates the acquisition regions whose statistics are shifted from the most stable acquisition region, and deletes outliers in the lower limit and margin.

Table S2. The CPU utilization and time spent running the stand-alone program of ANPELA 2.0 on two tested datasets (PMID25253674 and PMID33774709, as described in Table 2). For each dataset, three subsets of varying size (200, 500 and 1000 cells extracted from each FCS file) were separately tested on a personal computer (Windows 10 with a 2.9 GHz, 16-core Intel Core i7-10700F CPU and 32 GB RAM; 8 cores were actually utilized simultaneously by the program). Since major improvements in parallel computing together with memory management were realized in ANPELA, a comparison of the time spent with and without parallel computing & memory management was provided.

Method S1. Eight mutually independent assessing criteria integrated in ANPELA 2.0 for systematically evaluating the quantification performance of each preprocessing workflow. For both CSI and PTI studies, three criteria were provided in the absence of gold standards, and an additional criterion, "Correspondence", was given when gold standards were available for the studied dataset. The detailed information on each criterion and its diverse assessing metrics is described below, including the quantification property it examines, the definition of its assessing metric(s), the implementation steps of the algorithm, the way to interpret the resulting values, and the representative metric with corresponding well-defined cutoff values. Particularly, all resulting values of each assessing metric were rounded to three decimal places.

(1) Assessing criteria for CSI studies:
Criterion Ca: The Classification Accuracy of Distinct Phenotypes based on Cell Subpopulations

One of the significant applications of single-cell proteomics is biomarker discovery based on differential expression analysis, which compares different experimental groups and identifies significantly altered proteins in the studied cell subsets. [20] This criterion measured the accuracy of the classifier in distinguishing data between different biological conditions of each cell subpopulation. Two well-established measuring indexes were provided under this criterion: the area under the curve (AUC) and the F1 score. [21][22][23] AUC is a frequently-used metric that maps out the overall performance of a classifier using the receiver operating characteristic curve, which provides a good summary of the false positive rate and the true positive rate. Its value ranges between 0 and 1, where 1 denotes a perfect classifier. The F1 score is the harmonic mean of precision and recall, where precision is the number of true positives divided by the number of all positives, and recall is the number of true positives divided by the number of all samples that should have been identified as positive. Its value also ranges between 0 and 1, and a higher value indicates a better classifier.
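Both metrics can be computed from first principles, as the following sketch illustrates (shown in Python for brevity, while ANPELA itself relies on the R packages "pROC" and "MLmetrics"; the toy labels and scores below are hypothetical):

```python
# Illustrative computation of F1 score and AUC from first principles.
# Toy data (hypothetical): binary true labels, hard predictions, and classifier scores.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]                   # hard predictions for F1
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1]   # continuous scores for AUC

# F1 = harmonic mean of precision and recall.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# AUC = probability that a random positive is scored above a random negative
# (equivalent to the area under the ROC curve; ties count as 0.5).
pos = [s for s, t in zip(scores, y_true) if t == 1]
neg = [s for s, t in zip(scores, y_true) if t == 0]
auc = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

print(f1)   # 0.75 on this toy data
print(auc)  # 0.9375 on this toy data
```

The pairwise formulation of AUC used here is mathematically identical to integrating the ROC curve, which is what "roc" in "pROC" computes.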
The preprocessed data was first clustered by FlowSOM (the function "SOM" in the R package "FlowSOM") [24] or PhenoGraph (the function "Rphenograph" in the R package "cytofkit"). [25] These were chosen as the clustering methods because they were found to perform better than many other unsupervised tools in precision, coherence, and stability. [7] FlowSOM was preset as the default because it outperformed all the other clustering methods in an up-to-date and extensible performance comparison for CySCP data: it was identified as a good trade-off between quality and runtime, gave best or near-best performance across all datasets together with extremely fast runtimes, and was recommended as a first choice for analyzing CySCP data. [22,26] After clustering, feature selection (t-test via the function "t.test" in the R package "stats") and k-nearest neighbor (KNN) classification were applied to each subset to classify the data with distinct biological phenotypes. [27,28] Finally, the mean values of AUC and F1 score across the cell subsets were calculated by running the function "roc" in the R package "pROC" and the function "ConfusionDF" in the R package "MLmetrics", respectively. [22] Particularly, AUC was selected as the representative metric, and its value within [0.9, 1.0], (0.7, 0.9) and [0, 0.7] indicated Superior, Good, and Poor performances, respectively. [21,[29][30][31][32]

Criterion Cb: The Tightness of Clusters with or without Predetermined Manual Labels

This criterion measured the tightness of given clusters with or without predetermined manual labels, [33] including "coherence" as internal evaluation and "precision" as external evaluation. [7] The internal assessment criterion "coherence" was based on the hypothesis that an ideal clustering result should have high similarity within each cluster and high heterogeneity among clusters, while the external one measured the similarity between the clustering result and the true labels.
Therefore, for internal evaluations, the predetermined manual labels of given clusters were not taken into account, and the capacity to discover the inner structure of the data was directly examined by four indicators: the Xie-Beni index (XB), Calinski-Harabasz index (CH), Davies-Bouldin index (DB) and Silhouette coefficient (SC). [7,[33][34][35] Conversely, for "precision" as the external criterion, with its two metrics the Rand index (RI) and purity, [7,23,36] metadata with predetermined labels was utilized as the ground truth. In current single-cell studies, although both internal and external evaluations are frequently utilized, internal evaluations are more popular owing to their independence from additional true labels, and they can thus give fair results for both semi-supervised and unsupervised methods. [7] Therefore, internal evaluations were used as the representative metrics during the subsequent performance evaluation and ranking. Although external evaluations tend to favor semi-supervised methods over unsupervised methods, they were still retained in the online service of ANPELA to provide a comprehensive reference for researchers in the field of single-cell studies, especially users who are familiar with these metrics.
XB is calculated using intra- and inter-cluster distances and is formulated in terms of cluster compactness and the separation between clusters. It specifies the inter-cluster separation as the minimum square distance between cluster centers, and the intra-cluster compactness as the mean square distance between each data object and its cluster center. A lower XB value represents better clustering performance. CH, also known as the variance ratio criterion, is the ratio of the between-cluster dispersion to the within-cluster dispersion over all clusters; hence, the higher the index value, the better the clustering performance. The DB index describes the average "similarity" among clusters, comparing the distances between clusters with the sizes of the clusters themselves. In contrast to the CH index, a lower DB value denotes better clustering. SC is calculated using (a) the mean intra-cluster distance and (b) the mean nearest-cluster distance for each sample; the SC for a sample is (b-a)/max(a, b). Therefore, its value range is [-1, 1], and a value closer to 1 indicates a workflow with better defined clusters. RI specifies the probability of agreement between two partitions by calculating the percentage of pairwise assignments that are true positives or true negatives. The other "precision" metric, purity, measures the extent to which clusters contain a single class: for each cluster, the number of cells from the most common class in that cluster is counted, and the sum over all clusters is divided by the total number of cells. The values of both RI and purity lie within [0, 1] and approach 1 when the clustering results perfectly match the predetermined manual labels.
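As a concrete illustration of the most direct of these definitions, purity can be sketched as follows (in Python for brevity; the toy cluster assignments and labels are hypothetical):

```python
from collections import Counter

def purity(cluster_ids, true_labels):
    """Purity: for each cluster, count the cells of its most common true class,
    then divide the sum over all clusters by the total number of cells."""
    clusters = {}
    for c, t in zip(cluster_ids, true_labels):
        clusters.setdefault(c, []).append(t)
    majority = sum(Counter(members).most_common(1)[0][1]
                   for members in clusters.values())
    return majority / len(true_labels)

# Toy example: two clusters of T/B cells, one cell mis-assigned.
cluster_ids = [0, 0, 0, 1, 1, 1]
true_labels = ["T", "T", "B", "B", "B", "B"]
print(purity(cluster_ids, true_labels))  # 5/6, since one of six cells is mis-clustered
```

A purity of 1.0 would mean every cluster contains cells of a single class only.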
The assessing measures of the internal criterion "coherence" (XB, CH, DB and SC) could be obtained from the preprocessed data directly by running the function "intCriteria" in the R package "clusterCrit". For the external criterion "precision", the preprocessed data was first clustered by FlowSOM or PhenoGraph as described in Criterion Ca. The data of each cluster was then clustered a second time by FlowSOM so that the sub-cluster labels could be compared to the metadata with predetermined labels. FlowSOM rather than PhenoGraph was chosen as the secondary clustering method because FlowSOM is an unsupervised method allowing manual selection of the number of sub-clusters, which could be adjusted according to the number of labels in the metadata. The measurements of purity and RI were then obtained by calculating directly according to the definition and by running the function "rand.index" in the R package "fossil", respectively. The ultimate assessing value of "precision" was determined as the mean value over all clusters. [37][38][39][40]

Criterion Cc: The Robustness of Identified Biomarkers among Randomly Sampled Subsets

As mentioned above, single-cell proteomics possesses a powerful potential to enhance the discovery of cellular biomarkers under different conditions. [41][42][43] This criterion quantified the level of consistency among biomarkers identified from various randomly sampled portions of given clusters. Herein, two popular measures were adopted for this criterion: the consistency score (CS) [44] and the relative weighted consistency (CWrel). [45] CS is a well-defined stability measure, namely the weighted sum over subsets of the product of the number of overlapping biomarkers and the number of shared biomarker shortlists. [46] Intuitively, a workflow is more robust if it results in more proteins shared by more biomarker shortlists.
CWrel is another stability measure based on the frequency of the features after feature selection; it has the advantage over CS of suppressing the influence of the sizes of the subsets and thus allows more reliable comparisons of robustness. [47] Overall, the higher the CS or CWrel, the more differentially expressed features are detected in common among the different data subsets.
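The exact CS and CWrel definitions carry subset-size weights (see [44,45]); the underlying intuition — scoring how often the selected proteins recur across the shortlists — can be illustrated with a simplified frequency-based score (in Python; the shortlists and the score below are hypothetical simplifications, not ANPELA's exact formulas):

```python
from collections import Counter
from itertools import chain

# Three hypothetical biomarker shortlists from three randomly sampled subsets.
shortlists = [
    {"pSTAT3", "pERK", "pS6", "pAKT"},
    {"pSTAT3", "pERK", "pS6"},
    {"pSTAT3", "pERK", "pCREB"},
]

# Frequency of each selected protein across the shortlists.
freq = Counter(chain.from_iterable(shortlists))

# Simplified frequency-based stability: the average fraction of shortlists in
# which each selected protein appears (1.0 = every protein selected every time).
stability = sum(freq.values()) / (len(freq) * len(shortlists))
print(round(stability, 3))  # 10 selections over 5 proteins x 3 lists = 0.667
```

A workflow whose three shortlists were identical would score 1.0; disjoint shortlists would score 1/3, the minimum possible here.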
To be specific, after preprocessing and clustering by the strategy described in Criterion Ca, data subsets were obtained by random stratified sampling without replacement, performed three times on the whole dataset of each cluster via the function "strata" in the R package "sampling". This achieved a trisection of each cluster while keeping the ratio of cells in the different classes constant. [48] The t-test was then applied to identify differentially expressed proteins, and the three resulting shortlists were used to calculate the CS and CWrel values of each cluster. Finally, the mean values over all clusters were calculated. In particular, CWrel was regarded as the typical assessing metric under this criterion, and its value within [0.3, 1.0], (0.15, 0.3) and [0, 0.15] indicated Superior, Good, and Poor performances, respectively. [49,50]

Criterion Cd: The Correspondence between Identified Biomarkers and Reliable Reference

Performance evaluation based on prior biological knowledge is a widely used and reasonable assessment strategy. [45,51,52] Unlike the three general-purpose criteria discussed above, this criterion was context-specific and required datasets from well-characterized biological systems with available well-established biomarkers serving as the reliable reference or "ground truth". [34] In some cases, well-established differential proteins between specific conditions (e.g., disease & health, liver & kidney, T cell & B cell) have been reported; such known biomarkers could be recognized as the prior knowledge for assessing the reliability of the quantification results.
The frequently-used measure recall was utilized to quantify the correspondence, defined as the fraction of relevant biomarkers retrieved. Its value ranged between 0 and 1, where 1 indicates ideal retrieval. The t-test was applied to the preprocessed data for feature selection, and the recall was thus obtained. The value within [0.8, 1.0], (0.5, 0.8) and [0, 0.5] indicated Superior, Good, and Poor performances, respectively. [53][54][55][56]

(2) Assessing criteria for PTI studies:

Criterion Ca: The Conformance of the Inferred Pseudo-time with Physical Collection Time

Besides serving as a biomarker discovery platform, another major push in the field of single-cell proteomics was to enable the data-driven arrangement of cell states into pseudo-progression trajectories by TI methods for inferring cellular transitions. [26,57] For PTI studies, it was reasonable to assume that the physical collection time of the samples could roughly reflect the true cellular temporal hierarchy, [58,59] which also explains why the real sample collection time has been frequently used for evaluating pseudo-time reconstruction performance. [59,60] This criterion assessed the quality of the inferred ordering by measuring the consistency of the inferred trajectory with the collection time. The assessing metric under this criterion was the time conformance score, defined as the probability that two cells were arranged in the same order in the inferred dynamics as in the collection time. Ten thousand pairs of cells from different experimental time points were randomly sampled to sum up the number of orderings that were in agreement. Due to the dependence of the resulting inferred dynamics on the selection of the initial point, more than 100 randomly selected starting cells were used, and the maximum probability was then assigned to the workflow being evaluated. This score fell in the range of [0, 1], and a higher score indicated better conformance.
As a reference, the value within [0.8, 1.0], (0.6, 0.8) and [0, 0.6] indicated Superior, Good, and Poor performances, respectively. [59,60] By default, the preprocessed data was first dimensionally reduced via the function "reduce_dimensionality" in the R package "SCORPIUS", followed by trajectory reconstruction via the function "infer_trajectory" in "SCORPIUS". In particular, the parameter "dist" in the function "reduce_dimensionality" was set to "spearman", since it was found to be typically more robust than other correlation distances when high levels of noise were contained within the dataset. Other optional trajectory inference methods included "scorpius_distPear", "scorpius_distManh", "slingshot_tSNE", "prinCurves_tSNE", "slingshot_PCA", "slingshot_diffMaps" and "prinCurves_diffMaps".
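The time conformance score described above can be sketched as follows (in Python for brevity; the toy collection and pseudo-times are hypothetical, and the full procedure additionally repeats the computation over >100 random starting cells and keeps the maximum):

```python
import random

# Hypothetical toy data: each cell has a physical collection time (time point index)
# and an inferred pseudo-time.
collection_time = [0, 0, 1, 1, 2, 2]
pseudo_time = [0.1, 0.3, 0.2, 0.5, 0.8, 0.9]

def time_conformance(coll, pseudo, n_pairs=10000, seed=0):
    """Fraction of randomly sampled cell pairs (drawn from different collection
    time points) whose pseudo-time ordering agrees with their collection-time
    ordering."""
    rng = random.Random(seed)
    idx = range(len(coll))
    agree = trials = 0
    while trials < n_pairs:
        i, j = rng.sample(idx, 2)
        if coll[i] == coll[j]:  # only pairs from different time points count
            continue
        trials += 1
        agree += (coll[i] < coll[j]) == (pseudo[i] < pseudo[j])
    return agree / n_pairs

print(round(time_conformance(collection_time, pseudo_time), 2))
```

On this toy example the score approaches 11/12 ≈ 0.92, since exactly one of the twelve distinct-time cell pairs (pseudo-times 0.3 and 0.2) is ordered against its collection times.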

Criterion Cb: The Smoothness of Protein Levels through the Inferred Trajectory
Pseudo-time trajectory inference utilizes a large number of cross-sectional measurements from different single cells to investigate cellular changes between different states, and the cell state progression along the pseudo-temporal order is expected to be smooth. [58] This criterion evaluated the smoothness of the inferred trajectory by quantifying the variability in successive protein levels, which has been widely used for evaluating the performance of pseudo-time analysis in single-cell studies. [58,60,61] First, trajectory inference was performed on the preprocessed data as described in Criterion Ca. Second, cells were ordered according to their pseudo-times, followed by the calculation of the root mean square (RMS) of the differences between adjacent cells for each protein, which indicated the smoothness of the inferred trajectory (the lower the value, the smoother the trajectory). The RMS was obtained by running the function "calc.roughness" in the R package "DeLorean". [58] Similarly, the RMS value of the naive approach (randomly arranging cells over 100 iterations) was calculated for each protein. Finally, based on the hypothesis that the cell state progression along the pseudo-temporal order is expected to be smooth, a one-sided t-test was performed to determine whether the RMS value of the analyzed workflow was statistically significantly smaller than that of the naive approach. The lower the p-value, the smoother the inferred trajectory of the analyzed workflow, that is, the better the performance. In order to coordinate the generated p-value with the other assessing metrics (within [0, 1], with higher values indicating better performance), the reported Smoothness Score was calculated as: Smoothness Score = 1 − p-value. The traditional significance level α = 0.05 was used for Smoothness Score interpretations.
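The roughness statistic underlying this criterion (as computed by "calc.roughness") is simply the RMS of the differences between adjacent values along the ordering; a minimal sketch with hypothetical values:

```python
import math

def roughness(values):
    """RMS of the differences between adjacent cells in the given order
    (a lower value indicates a smoother profile)."""
    diffs = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# A hypothetical protein profile along the inferred pseudo-time ordering,
# versus the same values in a scrambled (naive, randomly arranged) ordering.
ordered = [0.0, 0.1, 0.25, 0.4, 0.6, 0.75, 0.9, 1.0]
scrambled = [0.9, 0.1, 1.0, 0.25, 0.75, 0.0, 0.6, 0.4]

print(roughness(ordered) < roughness(scrambled))  # the ordered profile is smoother
```

Repeating the scrambled ordering many times and testing whether the inferred ordering's roughness is significantly smaller is exactly the comparison the one-sided t-test above performs.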
[62] Particularly, α = 0.01 and 0.001 specified the thresholds partitioning performance into Superior, Good, and Poor.

Criterion Cc: The Robustness of the Inferred Trajectories among Randomly Sampled Subsets

As reported, computational tools that could infer cell state progression reproducibly from single-cell data were lacking, which highlighted the importance of evaluating the robustness of the data in reproducing trajectory inference results. [57] This criterion estimated the robustness and sensitivity of the inferred trajectories with respect to perturbations of the input data by means of data subsampling. [59,63] First, trajectory inference was performed on the preprocessed data as described in Criterion Ca, and the inferred ordering was recorded. Then, a uniform random sample of 20% of the single cells was drawn from the whole dataset, and the inference was repeated. Finally, the Spearman and Kendall rank correlation coefficients were employed to measure the robustness between these two runs via the function "cor" in the R package "stats". The Spearman rank correlation coefficient quantified the monotonicity of the inferred orderings between the common cells of the initial dataset and the subsampled cells, while the Kendall rank correlation coefficient evaluated how identical the relative orderings of the cells were. Owing to the inherent stochastic effect of the initial point on the resulting orderings, the maximum possible correlation between the whole dataset and the subsets was found by cyclically shifting the elements of the subsampled cell ordering. The subsampling and calculation process was repeated four times. The absolute values of both rank correlations approximate 1 under high robustness (+1 or -1 depending on the choice of the trajectory starting position). Herein, the average of the absolute values obtained from the four comparisons was used as the final metric for assessment.
Particularly, the Spearman rank correlation coefficient was selected as the representative measure, and its value within [0.85, 1.0], (0.5, 0.85) and [0, 0.5] indicated Superior, Good, and Poor performances, respectively. [59,60,63]

Criterion Cd: The Correspondence between Inferred Dynamics and Prior Biological Knowledge

This criterion was designed to serve as a biological validation of the consistency between the inferred dynamics and prior biological knowledge. [60] It was based on the hypothesis that signaling dynamics approximate a transient profile in which the activation of one protein is supposed to reach its maximum level prior to the activation of the later proteins in the sequence of the signaling cascade. In other words, the sequence of maximum pseudo-times should correspond to the sequence of the respective proteins in known signaling cascades (for which well-established signaling pathway hierarchies were required; the detailed information on this file is described in "Sample Input Files Following the Standard Form" in the "Materials and Methods" section). First, trajectory inference was performed on the preprocessed data as described in Criterion Ca. Second, a smoothing spline model of the preprocessed expression values of each protein was fit via the function "ns" in the R package "splines", and the peak time at which the expression reached its maximum level was located. Then, the peak times of all proteins were sorted, and all consecutive protein pairs were recorded. Finally, the extent to which each consecutive protein pair corresponded to the prior biological knowledge was examined by calculating the recall (the fraction of correct pairwise sequences retrieved). To avoid the inherent stochastic effect of the initial point on the resulting orderings, the recall values over 100 randomly selected starting points were calculated, and the maximum value was then determined.
For reference, the recall within [0.8, 1.0], (0.6, 0.8) and [0, 0.6] indicated Superior, Good, and Poor performances, respectively. [60]

Method S2. Default parameter settings of data preprocessing methods in ANPELA.
In this study, some parameters were pre-determined based on values widely accepted in the research community, while others were optimized by ANPELA's built-in statistical tests and estimation algorithms. The remaining parameters depended on the user's data and were thus hard to pre-determine; these were left open for users to modify.

a) Values widely-accepted in research community
Clearly, the ultimate goal of ANPELA is to provide researchers with performance evaluations of the various combinations of preprocessing methods that they frequently use (including the parameters that they frequently set). Therefore, a value was preferred when it conformed to user habits or was considered optimal. The default value of the parameter "logbase" for the method LOG, which corresponds to the base of the logarithm, was set to 10. [34,64] The default value of the parameter "threshold" for the methods ANN and ARN, which corresponds to the noise subtracted from each value, was set to 1 as recommended by Xshift and PhenoGraph. [7,25,65] The default value of the parameter "method" for the method CAT, which corresponds to the strategy for solving the linear system, was set to "nnls" because it outperformed the others. [3] Another parameter set to its recommended optimal value was the parameter "b" for the methods ACS, ANN and ARN, which was set to "1/150" for FC or "1/5" for MC. [26,66,67]
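For context, "b" is the scale argument of the inverse hyperbolic sine (asinh) transform that these methods apply before downstream analysis; a minimal sketch of its effect (in Python; the raw intensity value below is hypothetical):

```python
import math

def arcsinh_transform(x, b):
    """Inverse hyperbolic sine transform with scale argument b: asinh(b * x).
    Near zero it behaves approximately linearly; for large values it behaves
    like a logarithm, compressing the dynamic range."""
    return math.asinh(b * x)

raw = 1000.0
print(round(arcsinh_transform(raw, 1 / 5), 3))    # MC convention (b = 1/5)
print(round(arcsinh_transform(raw, 1 / 150), 3))  # FC convention (b = 1/150)
```

The smaller FC scale (1/150) compresses the much wider intensity range of fluorescence detectors, while the MC scale (1/5) suits the narrower range of mass cytometry counts.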

b) Values optimized by ANPELA's built-in algorithms
Unlike the above strategy of pre-determining parameters considered optimal, ANPELA automatically optimized some parameters by its built-in statistical tests and estimation, an approach that has been widely used in CySCP quantification. [8,68,69] The representative method FVS used in ANPELA removed the mean-variance correlations from the cell populations identified in each fluorescence channel by the inverse hyperbolic sine (asinh) transformation, whose parameters were optimally selected by Bartlett's likelihood-ratio test. [10]

c) Values allowed to be modified according to the studied data

Some parameters were hard to pre-determine because of their dependence on the studied data, and these were left open for users to modify. For example, the parameter "FSC" for the method FLC, which corresponds to the forward scatter parameter of the analyzed FCS files, may vary between "FS" and "FSC", and a suffix "-A", "-H" or "-W" is commonly added to indicate whether the area, height, or width of the signal pulse was captured. Therefore, this parameter should be modified according to the user's data before using ANPELA.
Parameters can affect the ultimate performance, and this impact is dataset-dependent. [69,70] Strictly speaking, discussions of data preprocessing method/workflow performance should therefore be based on the specific parameter settings and the applied dataset, so we further provided users with flexible parameter adjustability. On the one hand, in the "one-click" online service of ANPELA, we allowed users to adjust the key parameters, which keeps the service fast and user-friendly and significantly helps experimenters with little or no bioinformatics background explore their experimental data on the web. On the other hand, in the "out-of-the-box" local program of ANPELA, we provided greater freedom of parameter adjustment, with which bioinformaticians can freely tune all relevant parameters to obtain workflow performance under the settings they are used to or consider optimal. In general, ANPELA provides diverse and flexible parameter choices for general readers, especially non-bioinformaticians. A detailed description of the key parameters (including their corresponding default values) is given in Table S4.

Download and Install the Latest Release of R and RStudio
The latest version of ANPELA 2.0 was developed and tested on R version 3.6.1. The whole installation should be completed in two sequential steps. First, please install the "3.6.1" version of R by double-clicking the executable file (1-R-3.6.1-win.exe) and following the step-by-step instructions during the whole setup process (find more information at https://www.r-project.org/about.html). Second, install RStudio by double-clicking the executable file (2-RStudio-2022.02.1-461.exe) and following the step-by-step instructions during the whole setup process.

Download the Source Code of the Standalone Version of ANPELA 2.0
The source code of ANPELA 2.0 together with the supporting R packages can be downloaded from: http://idrblab.cn/anpela2023/ANPELA2.0Sourcecode.zip. Please decompress it by right-clicking and selecting "Extract to anpela\" in Your Preferred Directory.

Set Working Directory
After running RStudio, please change your working directory (in the RStudio environment) to "Your Preferred Directory\anpela\ANPELA 2.0_Sourcecode\" by typing and then running the following R command:

setwd("Your Preferred Directory/anpela/ANPELA 2.0_Sourcecode")

NOTE: (1) the R environment uses the forward slash (/) in file paths, unlike Windows CMD commands (backslash); (2) you can confirm your current working directory by typing and then running the following R command:

getwd()

Set Library Paths
All of the R packages that ANPELA 2.0 depends on are provided in the folder "Your Preferred Directory/anpela/ANPELA 2.0_Sourcecode/library". Please set the library trees within which packages are looked for by typing and then running the following R command:

.libPaths(c("./library", .libPaths()))

Load Required Scripts
All of the R scripts that load the required packages and define a number of functions are provided in the folder "Your Preferred Directory/anpela/ANPELA 2.0_Sourcecode/src". Please load these scripts by typing and then running the following R command:

sapply(paste0("./src/", list.files(path = "./src", recursive = T, pattern = ".R$")), source)
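Taken together, the setup commands above (working directory, library paths, and script loading) can be collected into one short block and run at the start of each session; the directory below is the placeholder path used throughout this guide, so adjust it to your actual extraction location:

```r
# Point R at the extracted ANPELA 2.0 source (placeholder path; adjust to yours)
setwd("Your Preferred Directory/anpela/ANPELA 2.0_Sourcecode")

# Prepend the bundled package library so ANPELA's dependencies are found first
.libPaths(c("./library", .libPaths()))

# Source every .R script under ./src to load packages and define ANPELA functions
sapply(paste0("./src/", list.files(path = "./src", recursive = TRUE, pattern = ".R$")),
       source)
```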

Conduct Systematic Workflows for Quantifying Raw SCP Data Generated by FC/MC
Please quantify your raw SCP data acquired from FC or MC by running the function "FCquan" or "MCquan", respectively. Particularly, since the output of these two functions is exactly the input of all other functions in ANPELA 2.0 except "ranking", it is recommended to save the output as an .RData file, restart the R session, and then load it when needed, in order to avoid memory explosion. Detailed information about the arguments of these two functions is provided in Table S4. Moreover, since the compensation methods "AutoSpill", "CATALYST" and "FlowCore" require additional files to assist quantification, sample files are given in the folder "Your Preferred Directory/anpela/ANPELA 2.0_Sourcecode/exampler".
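As a sketch of the recommended save/restart/load pattern for the FC case: the method choices below follow the allowable values listed in Table S4, while the raw-data folder argument name "datapath" and the local file paths are hypothetical placeholders (check Table S4 for the exact argument spellings):

```r
# Quantify raw FC data with a set of candidate workflows.
# "datapath" is a hypothetical name for the raw-data folder argument.
quan_res <- FCquan(
  datapath       = "D:/mydata/fcs_files",          # folder of raw .fcs files
  metadata       = "./exampler/metadata(CSI).csv", # metadata file (CSI study)
  normalizationM = c("GaussNorm", "WarpSet", "None"),
  signalcleanM   = c("FlowAI", "FlowCut", "None")
)

# Save the output, restart the R session, then reload it when needed
# to avoid memory explosion.
save(quan_res, file = "quan_res.RData")
# ---- restart the R session here ----
load("quan_res.RData")
```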

Analysis and Performance Assessment of the Workflow(s)
Please assess all workflows applied while running the function "FCquan" or "MCquan", based on their performance in CSI and PTI studies, by running the functions "Classassess" and "TIassess", respectively. Considering that the subsequent overall ranking is based on the collective consideration of representative assessing metric values and their corresponding performance grades (distinguished by well-defined cutoffs) under all criteria, the representative assessing metric under each criterion is recommended in this step. Particularly, since the output of these two functions is exactly the input of the function "ranking", it is recommended to save the output as an .RData file, restart the R session, and then load it when needed, in order to avoid memory explosion. Detailed information about the arguments of these two functions is provided in Table S4.
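For instance, a CSI-type assessment might look like the following sketch; it assumes the quantification result was previously saved as "quan_res.RData" (a placeholder filename), and uses only argument names and defaults listed in Table S4:

```r
# Reload the saved quantification result (placeholder filename)
load("quan_res.RData")

# Assess all workflows from the CSI perspective; FlowSOM meta-clustering
# with 8 clusters corresponds to the documented defaults.
assess_res <- Classassess(
  data        = quan_res,
  clusteringM = "FlowSOM",
  ncluster    = 8
)

# Save the assessment and restart the session before ranking, as recommended.
save(assess_res, file = "assess_res.RData")
```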

Identify the Proper Workflow(s) for the Studied Dataset by Comprehensive Ranking
Please rank all workflows assessed by the function "Classassess" or "TIassess" by running the function "ranking". Detailed information about the arguments and output of this function is provided in Table S4 and Table S5, respectively.
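A minimal sketch of this step, assuming the assessment result was saved as "assess_res.RData" (a placeholder filename):

```r
# Reload the saved assessment result (placeholder filename)
load("assess_res.RData")

# Rank all assessed workflows by the collective consideration of
# metric values and performance grades under each criterion.
rank_res <- ranking(data = assess_res)
```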

Visualize the Resulting Quantification Performance of Each Workflow
Please visualize the resulting quantification performance of each workflow for CSI and PTI studies by running the functions "Classplot" and "TIplot", respectively. Detailed information about the arguments and outputs of these two functions is provided in Table S4 and Table S5, respectively.
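Both plotting functions take the same quantification object as "Classassess"/"TIassess". A CSI-side sketch, assuming the quantification result was saved as "quan_res.RData" (a placeholder filename) and using argument names from Table S4:

```r
# Reload the saved quantification result (placeholder filename)
load("quan_res.RData")

# Visualize the CSI quantification performance of each workflow;
# the clustering settings should match those used for assessment.
Classplot(
  data        = quan_res,
  clusteringM = "FlowSOM",
  ncluster    = 8
)
```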

Export the Quantified CySCP Data Generated by the Specified Workflow(s)
Please export the quantified CySCP data in the format of FCS and/or an expression count matrix (markers in columns and events/cells in rows) by running the function "exportFCS". Detailed information about the arguments and output of this function is provided in Table S4 and Table S5, respectively.
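A sketch of the export step using the "ntop"/"ranking_data" route described in Table S4; the saved-object filename is a placeholder, and the ranking .csv filename follows Table S5:

```r
# Reload the saved quantification result (placeholder filename)
load("quan_res.RData")

# Export the quantified data for the 5 top-ranked workflows,
# in both FCS and expression-matrix formats.
exportFCS(
  data         = quan_res,
  ntop         = 5,
  ranking_data = "OUTPUT-ANPELA2023-Overall.Ranking.Data.csv",
  FCS          = TRUE,
  matrix       = TRUE
)
```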

(func1). FCquan()
Description: this function enables the quantification of raw SCP data acquired from FC by up to ~720 available workflows (each workflow being a distinct combination of compensation, transformation, normalization and signal-cleaning methods), which facilitates the subsequent performance assessment, ranking and plotting.

Argument Type Description of the Argument and the Allowable Values Default

- The absolute path of the folder storing the FCS raw data files. Default: /
- metadata — vector (character): The absolute filepath of the metadata file. Example files for the study types "CSI" and "PTI" are provided as "metadata(CSI).csv" and "metadata(PTI).csv" in the "exampler" folder.
- normalizationM — vector (character): The method(s) of normalization for flow cytometry data, including "GaussNorm", "WarpSet" and "None". Default: a vector of all methods
- signalcleanM — vector (character): The method(s) of signal cleaning for flow cytometry data, including "FlowAI", "FlowClean", "FlowCut" and "None". Default: a vector of all methods
- spillpath — vector (character): The filepath(s) to the .fcs file(s) of compensation beads or cells. The spillover information for a particular experiment is often obtained by running several tubes of beads or cells each stained with a single color, which are then used to determine a spillover matrix. The filenames of these .fcs files must correspond to the names of the stain channels; if the original .fcs file contains a pre-calculated spillover matrix as the value of the $SPILLOVER, $spillover or $SPILL keyword, this can be set to NULL. NOTE: only needed when "FlowCore" is included in the argument "compensationM".
- FSC — vector (character): The name of the forward scatter parameter.
- The absolute filepath of your .csv file defining the filenames and corresponding channels of the single-color controls. NOTE: only needed when "AutoSpill" is included in the argument "compensationM". Default: NULL
- logbase — vector (numeric): The base of the Log Transformation. NOTE: only needed when "Log Transformation" is included in the argument "transformationM".
- The cofactor of the Arcsinh Transformation. NOTE: only needed when "Arcsinh Transformation" is included in the argument "transformationM".
- The cofactor of Asinh with Non-negative Value. NOTE: only needed when "Asinh with Non-negative Value" is included in the argument "transformationM".
- The cofactor of Asinh with Randomized Negative Value. NOTE: only needed when "Asinh with Randomized Negative Value" is included in the argument "transformationM".
- NOTE: only needed when "QuadraticTransform" is included in the argument "transformationM".
- cores — vector (numeric): The number of CPU cores to be employed for parallel computing. To avoid memory explosion due to parallel computing, the default is the largest integer not greater than half the number of CPU cores on the current host. Default: floor(detectCores()/2)

(func2). MCquan()
Description: this function enables the quantification of raw SCP data acquired from MC by up to ~540 available workflows (each workflow being a distinct combination of compensation, transformation, normalization and signal-cleaning methods), which facilitates the subsequent performance assessment, ranking and plotting.

Argument Type Description of the Argument and the Allowable Values Default

- The absolute path of the folder storing the FCS raw data files. Default: /
- metadata — vector (character): The absolute filepath of the metadata file. Example files for the study types "CSI" and "PTI" are provided as "metadata(CSI).csv" and "metadata(PTI).csv" in the "exampler" folder. Default: /
- studytype — vector (character): The type of your study, including "CSI" (Cell Subpopulation Identification) and "PTI" (Pseudo-time Trajectory Inference). Default: /
- mergeM — vector (character): The method of merging multiple FCS files. When multiple FCS files are selected, cells can be combined using one of four different methods: "Ceil", "All", "Min" and "Fixed". Ceil: up to a fixed number (specified by fixedNum) of cells are sampled without replacement from each FCS file and combined for analysis.
- normalizationM — vector (character): The method(s) of normalization for mass cytometry data, including "Bead-based Normalization", "GaussNorm", "WarpSet" and "None". Default: a vector of all methods
- signalcleanM — vector (character): The method(s) of signal cleaning for mass cytometry data, including "FlowAI", "FlowCut" and "None". Default: a vector of all methods
- single_pos_fcs — vector (character): The absolute filepath of the .fcs file containing stained samples and control antibody-capture beads/pooled single-stained beads. NOTE: only needed when "CATALYST" is included in the argument "compensationM". Default: NULL
- single_pos_mass — vector (numeric): The numeric masses corresponding to barcode channels. NOTE: only needed when "CATALYST" is included in the argument "compensationM". Default: NULL
- CATALYSTM — vector (character): The method for solving the linear system, "flow" or "nnls". NOTE: only needed when "CATALYST" is included in the argument "compensationM". Default: "nnls"
- logbase — vector (numeric): The base of the Log Transformation. NOTE: only needed when "Log Transformation" is included in the argument "transformationM". Default: 10
- b1 — vector (numeric): The cofactor of the Arcsinh Transformation. NOTE: only needed when "Arcsinh Transformation" is included in the argument "transformationM".
- The cofactor of Asinh with Non-negative Value. NOTE: only needed when "Asinh with Non-negative Value" is included in the argument "transformationM".
- The cofactor of Asinh with Randomized Negative Value. NOTE: only needed when "Asinh with Randomized Negative Value" is included in the argument "transformationM". Default: 1/5
- Quadratica — vector (numeric): The quadratic coefficient "a" in the equation y = a*x^2 + b*x + c. NOTE: only needed when "QuadraticTransform" is included in the argument "transformationM".
- The marker indexes for data processing and performance evaluation.
- cores — vector (numeric): The number of CPU cores to be employed for parallel computing. To avoid memory explosion due to parallel computing, the default is the largest integer not greater than half the number of CPU cores on the current host. Default: floor(detectCores()/2)

(func3). Classassess()
Description: this function assesses the quantification performance of all workflows applied while running the function "FCquan" or "MCquan", based on comprehensive criteria (each with distinct underlying theories), from the perspective of CSI studies.

Argument Type Description of the Argument and the Allowable Values Default

- data — list: The resulting R object of the "FCquan" or "MCquan" function for the "CSI" study type. You can directly use the corresponding object stored in the R environment, or save it after running the "FCquan" or "MCquan" function and load it when needed.
- clusteringM — vector (character): The method of clustering the processed data prior to performance evaluation, including "FlowSOM" and "PhenoGraph". Default: "FlowSOM"
- ncluster — vector (numeric): The number of clusters for meta clustering in FlowSOM. NOTE: only needed when the argument "clusteringM" is set to "FlowSOM". Default: 8
- Cc_metric — vector (character): The assessing metric under Criterion Cc for the "CSI" study type. Default: "relative weighted consistency (CWrel)"
- ntop — vector (numeric): The number of the most differentially expressed markers that are truncated for calculating the CWrel value. This value must be less than the number of your selected markers. NOTE: only needed when the argument "Cc_metric" is set to "relative weighted consistency (CWrel)". Default: the largest integer not greater than the number of your selected markers
- cores — vector (numeric): The number of CPU cores to be employed for parallel computing. To avoid memory explosion due to parallel computing, the default is the largest integer not greater than half the number of CPU cores on the current host. Default: floor(detectCores()/2)

(func4). TIassess()
Description: this function assesses the quantification performance of all workflows applied while running the function "FCquan" or "MCquan", based on comprehensive criteria (each with distinct underlying theories), from the perspective of PTI studies.

Argument Type Description of the Argument and the Allowable Values Default

- data — list: The resulting R object of the "FCquan" or "MCquan" function for the "PTI" study type. You can directly use the corresponding object stored in the R environment, or save it after running the "FCquan" or "MCquan" function and load it when needed.
- Cc_metric — vector (character): The assessing metric under Criterion Cc for the "PTI" study type. Default: "Spearman correlation"
- pathwayhierarchy — vector (character): The absolute filepath of the pathway hierarchy file. An example is provided as "Pathway_Hierarchy.csv" in the "exampler" folder. Default: NULL
- cores — vector (numeric): The number of CPU cores to be employed for parallel computing. To avoid memory explosion due to parallel computing, the default is the largest integer not greater than half the number of CPU cores on the current host. Default: floor(detectCores()/2)

(func5). ranking()
Description: this function ranks all workflows assessed by the function "Classassess" or "TIassess", based on the collective consideration of metric values and performance grades (classified by well-defined cutoffs) under each criterion.

Argument Type Description of the Argument and the Allowable Values Default

- data — list: The resulting R object of the "Classassess" or "TIassess" function. You can directly use the corresponding object stored in the R environment, or save it after running the "Classassess" or "TIassess" function and load it when needed.

(func6). Classplot()
Description: this function visualizes the resulting quantification performance of each workflow for CSI studies.

Argument Type Description of the Argument and the Allowable Values Default

- data — list: The resulting R object of the "FCquan" or "MCquan" function for the "CSI" study type. You can directly use the corresponding object stored in the R environment, or save it after running the "FCquan" or "MCquan" function and load it when needed. Default: /
- clusteringM — vector: The method of clustering the processed data prior to plotting, including "FlowSOM" and "PhenoGraph". Default: "FlowSOM"
- rankingC — vector (character): A copy of the resulting ranking file of the "Classassess" function, from which the workflows that do not require plotting have been deleted. Default: NULL
- ncluster — vector (numeric): The number of clusters for meta clustering in FlowSOM. NOTE: only needed when the argument "clusteringM" was set to "FlowSOM" while calling the "FCquan" or "MCquan" function for obtaining "data".
- Cc_metric — vector (character): The assessing metric under Criterion Cc for the "CSI" study type, including "relative weighted consistency (CWrel)" and "consistency score (CS)". Default: "relative weighted consistency (CWrel)"
- ntop — vector (numeric): The number of the most differentially expressed markers that are truncated for calculating the CWrel value. This value must be less than the number of your selected markers. NOTE: only needed when the argument "Cc_metric" was set to "relative weighted consistency (CWrel)" while calling the "FCquan" or "MCquan" function for obtaining "data". Default: the largest integer not greater than the number of your selected markers
- cores — vector (numeric): The number of CPU cores to be employed for parallel computing. To avoid memory explosion due to parallel computing, the default is the largest integer not greater than half the number of CPU cores on the current host. Default: floor(detectCores()/2)

(func7). TIplot()
Description: this function visualizes the resulting quantification performance of each workflow for PTI studies.

Argument Type Description of the Argument and the Allowable Values Default

- data — list: The resulting R object of the "FCquan" or "MCquan" function for the "PTI" study type. You can directly use the corresponding object stored in the R environment, or save it after running the "FCquan" or "MCquan" function and load it when needed.
- "scorpius_distSpear"
- rankingC — vector (character): A copy of the resulting ranking file of the "TIassess" function, from which the workflows that do not require plotting have been deleted. Default: NULL
- Cc_metric — vector (character): The assessing metric under Criterion Cc for the "PTI" study type, including "Spearman correlation" and "Kendall Rank Correlation". Default: "Spearman correlation"
- pathwayhierarchy — vector (character): The absolute filepath of the pathway hierarchy file. An example is provided as "Pathway_Hierarchy.csv" in the "exampler" folder. Default: NULL
- cores — vector (numeric): The number of CPU cores to be employed for parallel computing. To avoid memory explosion due to parallel computing, the default is the largest integer not greater than half the number of CPU cores on the current host. Default: floor(detectCores()/2)

(func8). exportFCS()
Description: this function exports the quantified CySCP data in the format of FCS and/or an expression count matrix (markers in columns and events/cells in rows). It supports the export of quantification results generated by the top-performing workflows or by workflows of user interest.

Argument Type Description of the Argument and the Allowable Values Default

- data — list: The resulting R object of the "FCquan" or "MCquan" function. You can directly use the corresponding object stored in the R environment, or save it after running the "FCquan" or "MCquan" function and load it when needed. Default: /
- workflow — vector (character): The names of workflows of top performance or user interest (in the format "compensationM_transformationM_normalizationM_signalcleanM"). If NULL, the parameters "ntop" and "ranking_data" are used to specify the top-performing workflows. Default: NULL
- ntop — vector (numeric): The number of top-performing workflows. If NULL, the parameter "workflow" is used to specify the workflows of top performance or user interest. Default: NULL
- ranking_data — vector (character): The absolute filepath of the resulting file (.csv) of the "ranking" function. This parameter should be used in combination with the parameter "ntop" to specify the top-performing workflows.
- FCS — vector (logical): The logical value specifying whether to export in the format of FCS (TRUE) or not (FALSE). The parameters "FCS" and "matrix" cannot both be set to FALSE. Default: TRUE
- matrix — vector (logical): The logical value specifying whether to export in the format of an expression count matrix (TRUE) or not (FALSE). The parameters "FCS" and "matrix" cannot both be set to FALSE.

Table S5. A variety of output files generated by ANPELA 2.0 together with their descriptions.

Name Brief Description
OUTPUT-ANPELA2023-Overall.Ranking.Data.csv — A csv file containing all ranking information and the metric value under each criterion.
OUTPUT-ANPELA2023-Overall.Ranking.Figure.pdf — A heatmap illustrating the performance ranking of all workflows based on the user-selected metric under each criterion.
