Comprehensive multivariate evaluation of the effects on cell phenotypes in multicolor flow cytometry data using ANOVA simultaneous component analysis

This work proposes an approach to assess the effects observed in multicolor flow cytometry (MFC) experiments, for all markers and experimental factors simultaneously. It achieves this end by extending ANOVA simultaneous component analysis (ASCA), a multivariate version of ANOVA, to flow cytometry data. It is based on an initial multiset PCA model to describe the main variation patterns of cell marker expression, followed by an ASCA model on the histograms built from these PCA scores. This approach allows for determining the variations in cell phenotype distribution that are related to the experimental design. On a data set from a study of the immune response to prolonged physical exercise, the proposed method computed the effect size and statistical significance of all the experimental factors and their interactions. Most notably, it provided easily interpretable submodels for the overall effect of the walking exercise and for the interaction between exercise and the responsiveness to a bacterial stimulus. The application of a time‐guided sequential clustering algorithm to the ASCA scores revealed a stratification of the studied individuals based on their neutrophil activation dynamics. These effects were not clearly detectable using PCA alone. In comparison with pairwise classification models by DAMACY (a discriminant analysis method for MFC data), ASCA results were less detailed in describing differences between specific samples, but had the advantage of modeling several factors and levels simultaneously. Such characteristics make the proposed implementation of ASCA an effective and complementary addition to the chemometric methodologies for the analysis of MFC data.


| INTRODUCTION
Multicolor Flow Cytometry (MFC) is one of the most established and insightful methods for single cell analysis, 1,2 with many applications in medical and environmental sciences. It is based on aligning the cells of a given sample, usually after appropriate staining, along a narrow funnel and analyzing them one by one, typically by recording their fluorescence from a laser beam. In such way, all measured cells are characterized by the expression of a certain number of markers, normally 5-20 depending on the instrument. This high-throughput technique enables a very detailed description of the cell variability, allowing detection of pathologies and support identification of mechanisms underlying biological processes. 3 Recently, its clinical versatility has been further enhanced by the development of compact, (semi) automated MFC systems 4 that can analyze samples in situ. This greatly extends the employability of MFC from the laboratory to point-of-care diagnostics.
The large volume of data generated by MFC, from thousands to millions of cells per sample, is highly informationrich on the immune system. The measurements on many different cells within the same sample may identify cells with differential expressions of immune markers, either associated with emergence of cell types with complementary function in the immune system or with activation of the functionality in cells that were already present. This in turn represents changes in the entire sample under scrutiny that may indicate system-wide immunological changes of potential diagnostic value.
Nevertheless, this wealth of data poses considerable challenges to the analysis task. Conventionally, such data is analyzed with a multi-gating approach, restricting the analysis to two variables at a time. This is still widely used in clinical settings but becomes resource-intensive and time-consuming even with a low number of markers, compared to the emerging potential of automated, digital approaches. Manual gating strategies are furthermore strongly biased, as they rely strongly on previous interpretations and thus easily miss emerging, unknown patterns that do not align with the chosen strategy.
Various computational methods have been developed to perform a more efficient and less biased analysis. 5 Some of the most widely used are based on non-linear methods, such as stochastic neighbor embedding, 6,7 self-organizing maps, 8 density-normalized clustering, 9 or deep convolutional neural networks. 10 These methods are often accurate and efficient for tasks of cell classification and in some cases even cell sorting, but they do not allow for unambiguous interpretation of the biological differences among the resulting different types of cells, as the results of these methods are non-deterministic. Therefore, additional specific analysis for each subgroup is indispensable.
Deterministic linear models, such as Principal Component Analysis (PCA) or Partial Least Squares (PLS), provide this kind of information. Although they have been criticized for failing to grasp the complex non-linearities associated to changes in the immune system, the transparency of their results in terms of the immune markers measured on every cell is indisputably superior, because their loadings point explicitly to the direction of biological variability. 11 Following this approach, several methods dedicated to MFC data have been developed, such as DAMACY, 12 FLOOD 13 and ECLIPSE. 14 They all are based on a combination of models applied sequentially: an initial multiset PCA 15 to provide a compact description of the distributions of marker values throughout the analyzed samples, followed by other models to relate these distributions to the problem of interest, usually a discrimination between control and response cases. They have been applied successfully to data from clinical studies analyzing the immune responses to, for example, experimental endotoxemia, [12][13][14]16 hematological cancers 12 and asthma. 12,14 These methods have also been applied to non-dichotomous cases such as time series, notably in an investigation using FLOOD of the effect of several consecutive days of endurance exercise. 17 This was done by building a PCA model with the samples from the first time point, thus treated as a control case, and projecting all the remaining samples on that (control) PCA space. The results revealed the mobilization of certain cell phenotypes associated with systemic inflammation and immunosuppression. However, this mainly unsupervised method often struggles to detect small effects that are hidden by larger sources of variation such as differences among individuals. Furthermore, it does not provide a direct and quantitative estimation of several effects simultaneously. Both these goals are attainable if the information on the experimental design, for example, time points, case groups and any other factor, is incorporated into the data analysis model. ANOVA Simultaneous Component Analysis (ASCA) 18,19 is one of the leading methods to perform this task. It consists in a multivariate version of ANOVA that allows for identifying the multivariate responses associated to specific factors and interactions, as well as determining their statistical significance. It has been employed for a wide range of data from spectroscopy [20][21][22] to -omics, [23][24][25][26] but not in flow cytometry, except for a study in which it was used only on the main cell counts. 27 Therefore, the aim of the current work is to propose an implementation of ASCA for MFC that takes into account the complexity of this type of data. The advantage of such data analysis tool in medical and biological research would be to enable a comprehensive and simultaneous evaluation of all the effects of interest on all markers, without needing the laborious task of examining specific factors or subsets of markers at a time.
The proposed implementation is devised by extending DAMACY, 12 which is a method to perform discriminant analysis on MFC data. Moreover, we will demonstrate through a case study how the ASCA results can be integrated with those of complementary chemometric methods to obtain a more comprehensive analysis approach. These methods also include an adapted clustering approach that operates following the chronological order of the data and is thus particularly useful to distinguish the main time patterns.
The data under consideration comes from a study on the response of the innate immune system to prolonged physical exercise. 28 This is a topic of great interest 29,30 because such response is not fully understood and can sometimes be harmful, particularly on elderly subjects that are more prone to sub-optimal immune responses resulting in inflammation and/or insufficient protection. 31 Moreover, because several clinical conditions affect this physiological outcome, diagnostic tools that rely on the immune response at rest may be improved by observing the response in a challenged state. 32 A more informative chemometric analysis may, thus, help find the exercise load that optimizes the greatest health benefits, as well as biomarkers that enable targeted monitoring of the relevant health conditions. [33][34][35] 2 | MATERIALS AND METHODS

| Study design and measurements
The study cohort consisted of 45 subjects (aged 64 ± 6.8 years, 35% female) who participated to the 2018 edition of the 4-Day Marches (http://www.4daagse.nl/en/), a large walking event that takes place every July in Nijmegen, the Netherlands. Every participant walked an assigned distance (30, 40, or 50 km, depending on the age and physical conditions) at a self-selected pace, every day for four consecutive days. The cohort was also subdivided into three groups: healthy controls, users of statin (a commonly prescribed medication to lower blood cholesterol levels 36 ) without side effects of muscle complaints (here referred to as Statin 1) and statin users with muscle complaints (Statin 2), containing 15, 14 and 16 individuals, respectively. A more detailed description of this cohort is given elsewhere. 37 All subjects were measured on-site at baseline (1 or 2 days before the event) and immediately after every day of walking for the first 3 days, by taking a blood sample and loading it into an automated AQUIOS CL ® "Load & Go" flow cytometer (Beckman Coulter, Miami, FL, USA). 4 Each sample was split into two aliquots, of which one was analyzed directly, whereas the other was treated with N-formyl-methionyl-leucyl-phenylalanine (fMLF), a chemotactic factor commonly used to stimulate the innate immune system by mimicking the oligopeptides released by bacteria. 38 The two types of aliquots are here referred to as fMLFÀ and fMLF+, respectively. Subsequently, all samples were stained with an antibody mix containing CD16, CD62L, CD35, CD10 and CD11b, as described in detail in a previous work. 37  which has a 488 nm diode laser, two light scatter channels (forward scatter and side scatter), five fluorescence channels and an electronic volume measure. A scheme of the experimental design is shown in Table 1. This study was approved by the Medical Ethical Committee of the Radboud University Medical Center under protocol number CMO-nr 2007-148 and all participants gave written informed consent before participation. All procedures performed in this study were in accordance with the 1964 Helsinki declaration and its later amendments.

| Preprocessing and basePCA
The flow cytometry measurements were acquired by the AQUIOS CL ® instrument as .lmd files and exported into FlowJo ® analysis software (Tree Star Inc., Ashland, OG, USA). The data were gated for neutrophils, the most numerous cell type of the innate immune system, based on the expression of CD16 and CD62L 39 ; this gating was checked by two independent researchers. The final data structure consisted of 342 samples, containing 10,000-40,000 cells each, and measured over six variables, that is, the five above-mentioned antibody markers and Forward Scatter (FS). These data were analyzed according to the scheme depicted in Figure 1. The preprocessing consisted of several operations, beginning with compensating for the spillover among different fluorescence channels, and transforming using arcsinh. To avoid artifacts caused by values much lower than À1, prior to arcsinh transformation each column of the whole data was divided by a cofactor, which was initially chosen as the absolute value of the 99th percentile of all the negative values for that column. This cofactor was then checked by visual inspection of the resulting histogram, and modified if the latter deviated considerably from a Gaussian distribution or if spurious negative peaks were observed. 40 The arcsinh-transformed data were divided by the number of cells in each sample, then median-centered and scaled as done in a previous paper. 15 Depending on which method was used afterwards, the centering and scaling was executed either over the whole data set or over the sample that was identified as reference/control (see below).
The preprocessed data of all the concatenated samples were first modeled using simultaneous component analysis (SCA), 41 which is a family of methods for the analysis of data arrays with a shared mode. Its simplest form, also known as SCA-P 42 and employed here, consists in a multiset expansion of PCA to populations measured with a common set of variables; hence, the SCA model built on the preprocessed data is here referred to as basePCA. After projecting the samples onto the first two (base)PCs, a 2D-histogram (100 Â 100 bins) of each one was calculated on the PCA scores as described before. 12 These histograms, a few examples of which are shown in Figure 2, provide an overview of the multivariate information of all cells measured in each sample. The number of bins should be chosen so that the distribution is sufficiently representative but also described with sufficient detail. As a rule of thumb, this number should be equal or lower than the number of cells. All basePCA histograms were smoothed and normalized to the unit sum of all bins, then unfolded and concatenated into a single matrix of dimension N Â 10,000, with N the number of samples considered in each case: some models were built on the whole data set, others on specific subsets only, see Section 3. The F I G U R E 1 Scheme of the data analysis performed in this study. For the meaning of the acronyms (SCA, basePCA, topPCA, ASCA and DAMACY) see main text. The 2D histograms can also be generalized to histograms of several dimensions (see Section 3.1) F I G U R E 2 2D-histograms of basePCA scores for subject 14, fMLFÀ, on each measured day. The red spots depict the cell density with respect to the markers indicated by the loadings (black arrows); the yellow arrows indicate the shift of the cell density peak as compared to the previous time-point resulting matrix was used as input for three further chemometric methods. The first was another PCA model (here denoted as topPCA), which was calculated in the standard way after mean-centering the matrix of unfolded 2D-histograms. The other methods were ANOVA Simultaneous Component Analysis (ASCA) and Discriminant Analysis of Multi-Aspect CYtometry (DAMACY).

| ANOVA simultaneous component analysis (ASCA)
ASCA is a multivariate approach to ANOVA, able to highlight the patterns that correlate with different factors or interactions of an experimental design. 18,24 The method consists of two main steps: (i) decomposition of the data matrix into a sum of matrices that explain each effect, and (ii) dimensional reduction of each effect matrix by means of PCA (or, more specifically, simultaneous component analysis 43 ). The data decomposition used here was: X is the matrix of vectorized 2D-histograms of the basePCA model, built after median-centering and scaling over the data set that is being modeled. X m is a matrix with N identical rows containing the column means of X; X d , X s and X f are the effect matrices of the factors (walking) day, statin group and fMLF, respectively; X ds , X df and X sf are the effect matrices of their binary interactions, X 0 e is the residual of this first decomposition, which is in turn decomposed into X i (i.e., the effect matrix for the factor individual, nested in the statin factor) and the final residuals X 00 e . The final residuals also include the interactions of individual with day and with fMLF, as normally done for random factors. 44,45 Because ASCA does not perform an explicit assessment of random factors, the latter are accounted for only by how they affect the spread and statistical significance of the fixed factors. For suggestions on how to include the explicit treatment of random factors, see discussion in Section 3.2.3. This decomposition does not include quadratic terms, as commonly done in ASCA and other multivariate ANOVA methods. The unbalancedness of the design was handled by applying type III sum-of-squares corrections, 46 which calculate each effect after removing the portion that is explained by the other factors and interactions. As demonstrated by Thiel et al., 46 the consequence of not accounting for design unbalancedness may be an overestimation of some of the effects, especially interactions. The PCA models calculated on each effect matrix are referred to as ASCA submodels. The significance of each factor and interaction was tested using the sum-of-squares of the relevant effect matrix and comparing it with the null distribution generated by 500 permutations. 47 Additional ASCA models were built on fMLFÀ or fMLF+ samples only, after removing the fMLF factor and its relevant interactions from (1). Certain submodels were also calculated on effect matrices combining a factor with one of its interactions (e.g., X f þdf ¼ X f þ X df ), applying type III sum-of-squares corrections in this case, as well.

| Other employed methods: DAMACY and sequential clustering
DAMACY is an implementation of Orthogonal Partial Least Squares-Discriminant Analysis (OPLS-DA) 48 for flow cytometry data 12 and it was employed here to describe the local contrasts between specific days in more detail than ASCA. In particular, the constructed models discriminated between each walking day and the next, as well as between baseline and any other day. The median-centering and scaling was performed in a paired manner, using the median and median absolute deviation of the sample from the earlier of the two considered days. The prediction scores, obtained from a double cross-validation taking the pairs into account, indicates how well each sample can be distinguished between the days, and the weight vector, refolded into a 100 Â 100 matrix, points to which cells in the 2D histograms are more represented on which day. A p-value for the predicted classification was calculated by means of a permutation test (100 permutations).
A cluster analysis was performed on the matrix T of individual time-trajectories, built by collecting the first two scores of the ASCA submodel that combines the factor individual with the day-individual interaction. The scores of each subject were arranged on the same row in chronological order (PC1 baseline, PC2 baseline, PC1 day 1, PC2 day 1, …), for a final dimension of n Â 8, with n the number of individuals. Missing values were imputed with the Missing Data Imputation Toolbox 49 using the default options, that is, Trimmed Scores Regression, 5000 iterations, a tolerance of 10 À10 and 3 PCs as determined from cross validation. Subjects were clustered by applying agglomerative hierarchical clustering 50 on T, with Euclidean distance and Average linkage, in a sequential way. In particular, initial clusters were obtained from the distances calculated on the baseline scores (first two columns of T); each of these clusters were then subdivided according to the scores of day 1 (third and fourth column of T), and the same operation was repeated in nested fashion for days 2 and 3. This sequential procedure allows for incorporating information on the chronological order of the measurements. The cutoff distance used to determine the number of clusters, decided upon visual inspection of the dendrograms, was set as 0.02 (larger than all the 95% confidence intervals obtained by bootstrapping). Four isolated data points were merged with their nearest cluster.
All calculations were performed in Matlab R2018b (The Mathworks, USA).

| RESULTS AND DISCUSSION
Of the 45 measured subjects, three dropped out after 1 day of walking, two dropped out after 2 days and one, despite completing the event, did not show up for analysis at day 2. One subject (of those who dropped out after 2 days) turned out to be CD16-receptor deficient, which made him unsuitable to the analysis. After excluding this subject, the remaining data from 44 participants set comprised 336 samples, of which baseline and day 1 had both 88, whereas days 2 and 3 had both 80, always split exactly in half between fMLFÀ and fMLF+.

| PCA reveals only the largest patterns
The basePCA model built on the whole data set explained 58% and 19% of the total variance for the first and second PC, respectively. These two PCs also correspond to a minimum of the Predicted Residuals Sum of Squares (PRESS), as determined by a 10-fold cross validation. The basePCA loadings, depicted as the arrows in several plots, including Figure 2, indicate that CD10, CD11b and CD35 are strongly correlated, which is expected because they are all markers of neutrophil activation. 4 CD16 and CD62L, which are markers related to respectively cell maturity and adhesion, are instead more correlated to FS, which is a measure of the cell size and appears orthogonal to the activation markers. The basePCA 2D-histograms are an efficient way to represent all the major multivariate information of flow cytometry data in a single plot. 13 Here, they allow for an easy visualization of the changes occurring between different samples. For instance, the histograms in Figure 2 are a representation of the neutrophil phenotype of subject no. 14 during the measured days. The most apparent changes are shifts in the position of the distribution peak, indicating a different average phenotype of the cell population; a slight elongation is also observed on day 3. Even more detailed data representations would be obtained by using more than two basePCs, and the method described in this paper would remain essentially the same, apart from the higher memory requirements to compute a high-dimensional histogram (unless the bin resolution is reduced). However, such histogram would bring visualization challenges that are not in the scope of this paper.
On the matrix of the (unfolded) 2D-histograms of the basePCA, a second PCA model was constructed to highlight the main variations among the cell distributions throughout the data set. The loadings and scores from PCs 1-4 of this model, referred to as topPCA, are shown in Figure 3A. The loadings of the unfolded histograms are refolded into 2Dhistograms, with red and blue areas indicating positive and negative density with respect to the mean (as the histogram data are first mean-centered). The loadings of the basePCA are projected on the same plot as arrows. This superposition is valid because the variables of the topPCA loadings are the same as in the basePCA 2D-histograms, that is, the histogram bins; the same consideration is true for the loadings or regression coefficients of other models presented below (ASCA and DAMACY, see Section 3.2.1). As can be seen from the loading histograms, each PC describes a specific effect: PC1 corresponds to a shift of the cell density towards higher values of CD10, CD11b and CD35, that is, an increase in neutrophil activation; PC2 corresponds to a spreading of the cell density towards CD62L as well as the activation markers; PC3 and following indicate more complex deformations of the cell density.
The first two topPCA scores, shown in Figure 3B, are plotted separately for each day, using dots for fMLFÀ and crosses for fMLF+. Each point is also characterized with a color and a number indicating the statin group and the subject ID, respectively. Because the fMLF stimulation causes a strong neutrophil activation, all the fMLF+ samples appear on the right side of the plot. However, a few fMLFÀ samples show a similar level of activation, especially on days 1 and 2. From these plots we cannot detect any clear pattern related to the time sequence, nor to the statin groups. Moreover, the horseshoe distributions observable in every plot raise the suspicion of representation artifacts. 51 Subsequently, we built a base-and topPCA model on the fMLFÀ data only, whose loading and score plots are shown in Figure 4. This partial model removes the largest effect observed in the data, that is, the fMLF stimulation, and thus may facilitate the detection of patterns caused by smaller effects. The first two PCs of this basePCA explain 48% and 21% of the fMLFÀ data variance, respectively. Just like the full model, the loading of PC1 ( Figure 4A) describes neutrophil activation, but PC2 instead describes a shift in cell density aligned towards higher CD16, corresponding to an increase in cell age. PC3 describes a spreading of the cell density (similar to PC2 of the full model), whereas PC4 and following are related to more complex effects. The score plots ( Figure 4B) do not show any evident horseshoe distribution anymore, and we can also find some notable patterns and groupings. At baseline, the subjects are spread rather evenly along PC1, spanning different degrees of neutrophil activation. On days 1 and 2, the individuals cluster into two main groups of "activated" and "nonactivated" subjects. Interestingly, the individuals in each cluster are not the same between the 2 days, an observation that points to different patterns of activation-deactivation throughout the data set. The average PC2 values are lower on days 1 and 2 than baseline, especially for the non-activated groups. On day 3, the values of both PC1 and PC2 tend to shift towards the middle of the range, and the two clusters approximately merge back into a single one. Nevertheless, it is still not clear what patterns, if any, are related to the days of walking or the statin grouping.

| ASCA reveals all patterns associated to the experimental design
The explained variances and statistical significance of the effects for all ASCA models calculated in this paper are presented in Table 2. The model built on the full data set is dominated by the fMLF effect, with more than 47% of the explained variance. As expected, the corresponding submodel has a loading (not shown) very similar to the topPC1 in Figure 3, and simply describes the neutrophil activation occurring in the fMLF+ samples. However, ASCA also finds other effects that are statistically significant (i.e., at least one group/level is significantly different from the others) from the day and individual factors, as well as the fMLF-day and fMLF-statin interactions. A rather large portion of the variance is left in the residuals, which contain the individual deviations to the average fMLF and day effects. Figure 5A shows the scores (with the projected residuals) and loadings of the day submodel, explaining 5% of the total variance. The black arrows highlight the average trajectory of the time points: from baseline to day 2 the trend is towards the left, where from day 2 to 3 it points approximately in the opposite direction. However, the loadings are not easy to interpret, because at least two shifts in cellular density are present, most likely originating from the fMLFÀ and fMLF+ samples, respectively. These density regions are tentatively encircled with green continuous and dashed lines, but because of a considerable overlap (especially in the blue areas) we cannot determine with certainty which density shift is related to which type of samples.

| Effect of day factor
Therefore, it is convenient to build also ASCA models on half of the data (either fMLFÀ or fMLF+), analogously to what we did for the topPCA. For the model on fMLFÀ samples, the day submodel explains only 12% of the variance (of fMLFÀ data), of which PC1 explains about 80%. Its average score increases from baseline to day 1 (horizontal black arrow in the scores plot of Figure 5B), is approximately stationary at day 2 and decreases, coupled with an increase in PC2, on day 3 (diagonal arrow). The PC1 loading describes a decrease in FS, CD62L and CD16, together with a spreading along the direction of the activation markers (i.e., the horizontally elongated shape of the red density); PC2 is also associated with the spreading of cellular density along several markers. This result is in agreement with the corresponding topPCA model examined above: the large patterns of neutrophil activation (PC1 in Figure 4) do not have a correlation with the day-sequence common to the whole studied cohort (though they are likely related to the spreading found in the ASCA loadings), and the topPC2 scores also indicated an overall decrease in CD62L and CD16 on days 1-2, followed by a slight increase on day 3. A previous univariate analysis of the same data 37 reached the same conclusions as well. However, the ASCA model delineates the common response to the walking days much more directly and clearly, while avoiding a tedious search throughout the (unsupervised) PCs or the individual markers.
The day submodel built on the fMLF+ data explains a larger portion of the data variance (22%, as compared to 12% in the fMLFÀ model) and is more similar to the full model, both in average score trajectory and basePCA loadings, see Figure 5C. Its (ASCA) PC1 loading describes an increase in activation markers (and partially CD16), whereas PC2 is T A B L E 2 Explained variances (%) of the effects analyzed by the ASCA models used in this work aligned towards an increase in CD62L and CD16. This results in a decrease in CD62L and CD16 from baseline to day 1, followed by a decrease in activation markers on day 2 and a subsequent large increase on day 3. Because ASCA models several groups/levels and factors at once, for a given pair of data subsets it may not be as accurate as methods that model that instance specifically. To get an assessment of how the two outcomes may differ, we compared certain ASCA results with the corresponding ones from DAMACY, see Figure 6. In particular, the plots depict the average shift in cell density between baseline (blue) and day 3 (red), for both fMLFÀ (top row) and fMLF+ (bottom row). The ASCA plots (lefthand column) are obtained by multiplying the difference in scores (day 3 -baseline) F I G U R E 5 ASCA model for factor walking day, built on (A) full data set, (B) fMLFÀ and (C) fMLF+ samples. In the score plots (left side), the larger filled circles correspond to the day averages, whereas the smaller dots are the projected residuals of the model, which are also wrapped by the colored polygons. The black arrows highlight the chronological order of the time points: In all cases the direction towards day 3 is somewhat opposite to the trend from baseline to day 2. In the loadings plot of the full model (A), two shifts in cellular density can be recognized, originating from the fMLFÀ and fMLF+ samples and encircled with a green continuous and dashed line, respectively. All percentages indicate the proportion of explained variance with respect to the total variance of the data used for the relevant model by the loadings of the corresponding partial model, using three PCs (which is the theoretical maximum). The DAMACY plots (righthand column) are weight maps (i.e., refolded weight vector) from the models that discriminate baseline from day 3 for each fMLF subset; they all had an accuracy >90% and p < 0.01. All the loading arrows and the reported percentages of explained variance refer to the basePCA models built on the relevant subset.
F I G U R E 6 Comparison between ASCA (left) and DAMACY (right) results. All plots depict the average shift in cellular density between baseline (blue) and day 3 (red), as calculated for fMLFÀ (top row) and fMLF+ (bottom row) data. The ASCA plots are obtained by multiplying the difference in scores (day 3 -baseline) by the loadings of the corresponding partial model, using 3 PCs. The DAMACY plots are weight maps. The loading arrows and the percentages of explained variance (in parentheses along the axes) refer to the basePCA built on that specific subset. The color scales are analogous to those of the previous figures F I G U R E 7 ASCA model combining the fMLF factor with the day-fMLF interaction. Score plots (left-hand side): The large filled circles correspond to the day-fMLF averages, whereas the projected residuals are indicated with small dots (fMLFÀ) and crosses (fMLF+) and wrapped by the colored polygons. The colored straight arrows represent the average effect of response to the fMLF stimulation for each day. Loading plots (right-hand side): PC1 describes the average fMLF-induced neutrophil activation, whereas PC2 describes a rotation of the direction of cell density shift (green curved arrows). All percentages of explained variance refer to the total data set. The color scales are analogous to those of the previous figures The plots obtained by ASCA are similar to the ones from DAMACY and the main cell shifts are oriented in the same direction. However, there are some differences in the shape of the cell densities, especially in the peripheral regions originating from samples or cells that behave differently than the majority of the data set. The ASCA model is also affected by the rigidity imposed by a basePCA common to all factor levels. This fact is apparent by looking at the DAMACY model for fMLFÀ (top right), which reveals that in the baselineday 3 shift the activation markers are not so tightly correlated anymore, and that CD11b presents a higher overall increase than CD10 and CD35 (CD11b points more directly at the red cloud).

| Interaction of fMLF with the walking day
According to Table 2, the full ASCA model describes two significant interactions with fMLF, the largest of which (4.4% of explained variance) is the one with the factor day. Because the plots of pure interaction effects are often difficult to interpret, we calculated a submodel combining the fMLF factor with the fMLF-day interaction as described in Section 2.2.2, see Figure 7. This submodel facilitates the visualization of how the fMLF response changes during the different days. The scores plot shows that on all days most of the fMLF response (colored straight arrows) is explained by PC1, whose loading is associated to a neutrophil activation analogous to the ones seen in Figures 3 and 4, as was expected. On the other hand, the difference in such response among the different days is mainly explained by PC2, which expresses a change in the direction of cell density shift. In particular, an increase in PC2 corresponds to a counter-clockwise rotation of such shift, as pointed out by the green curved arrows. This rotation translates to an activation that is slightly more intense on CD10 and less on CD11b, a condition that occurs most strongly at baseline, decreases on days 1 and 2 and reverts on day 3.
Analogous plots for the interaction between fMLF and statin (shown in Figure S1) indicate that the fMLF response is slightly less intense for the Statin 2 group. Although the tiny effect size of the fMLF-statin interaction (0.4%) warrants caution towards this finding, this result could be the basis for further confirmatory studies, perhaps with a larger cohort.

| Individual stratification of the main walking effect
A drawback of the models presented above is that the ASCA loadings, for example, the ones in Figure 5B, do not indicate whether the plotted average response originates from a single response common to all samples (e.g., all individuals produce both activated and non-activated neutrophils) or instead from a sum of different individual responses (e.g., certain individuals respond with activation, others with non-activation of neutrophils). To shed light on this aspect, it is convenient to calculate the interactions of the individual factor with other effects, which were previously left in the residuals of (1). For example, the submodel combining the factor individual with the day-individual interaction expresses the deviation of each subject from the average walking-day effect (the effect shown in Figure 5). For the fMLFÀ subset, the first two loadings of this ASCA submodel describe neutrophil activation (PC1) and spreading of cell density along the direction of the activation markers and of CD16 (PC2), see bottom plots of Figure 8. The high variance (88%) explained by this submodel indicates that the individual deviations are quite large as compared to the average response.
To identify the major types of dynamics present in this subset, we performed a cluster analysis of the T matrix in a sequential manner as described in Section 2.2.3; the derived dendrograms are plotted in Figure 8. With this sequential approach, the study group is progressively split at each time point, so that, for example, two individuals differing strongly on the first day will fall into two separate clusters even if their responses on subsequent days are very similar. We believe that this clustering criterion is more suited to sort chronological sequences, as opposed to standard clustering algorithms that are not affected by the variable order; other clustering methods may be applied in other situations.
The clustering revealed a stratification of the subjects into seven clusters with distinct response patterns. Their perday averages are plotted in Figure 8 below the dendrograms, with staggered lines indicating the trajectory from baseline to day 3 according to the direction of the arrow. The color of these trajectories matches that of the rectangle surrounding the corresponding cluster in the last layer of dendrograms (day 3), whereas their thickness is proportional to the number of subjects in the cluster. The trajectories of most individuals, corresponding to the largest clusters plotted on the left, all start and end near the center of the axes, indicating a response that eventually returns to the average state.
What changes between these clusters is the direction of the response, for example, between baseline and day 1 the yellow and orange clusters proceed towards the right, corresponding to a higher neutrophil activation, whereas the blue and purple clusters go the opposite way. Clusters 5-7, plotted separately on the right, are instead characterized by a high neutrophil activation already at baseline, followed by an evolution to a state considerably different than the initial one.
Despite the trajectories' overlap, it is possible to appreciate the distinct time-patterns, especially with regards to the activation direction (PC1). For example, the yellow and the purple arrows follow opposite paths from baseline to day 1 (segment highlighted by the numbers "0" and "1"), corresponding to higher and lower neutrophil activation, respectively, before returning to a near-initial state on day 3. F I G U R E 8 Sequential clustering of individuals based on the ASCA scores of the submodel [individual + day-individual], calculated on the fMLFÀ data. For each day, new dendrograms are constructed using the scores of PC1 and PC2 relative to that day. The dashed red line indicates the threshold for cluster separation (a few isolated samples are merged with the nearest cluster). Middle plots: Average trajectories of ASCA scores (from baseline to day 3 according to the direction of the arrows) for each final cluster. The color of the lines match those of the rectangles in the bottom layer of dendrograms, and the line thickness is proportional to the cluster size. The clusters 5-7 are shown on a different plot for clarity. Bottom plots: PC1 and PC2 loadings of the considered ASCA submodel, describing neutrophil activation and spreading of cell density, respectively. All percentages of explained variance refer to the fMLFÀ subset. The color scales are analogous to those of the previous figures This mathematical tool enables a quick recognition of the main patterns of response to walking, highlighting which ones are more common and which are outlying. Such result may facilitate the understanding of which biological processes are prevalent in different subjects, as well as suggest what amount of exercise is most appropriate for each population group. Moreover, this approach, which was here limited to two PCs for ease of visualization, may be extended to further PCs to obtain a more detailed representation of the individual responses. For instance, PC3 (see Figure S2) describes a deformation of the cellular density along a bean-like shape, which is a well-known type of distribution already observed in neutrophils. 17 Finally, a further improvement on this methodology, particularly in assessing differences among individuals, could be to employ multivariate mixed modeling techniques that are able to determine random effects. Two recently published methods, Linear Mixed Model-PCA (LiMM-PCA) 52 and Repeated Measures-ASCA+ (RM-ASCA+), 53 appear very promising candidates for development in this direction.

| CONCLUSION
Multicolor Flow Cytometry is nowadays a mature and extremely valuable technique, increasingly widespread in the medical and biological field. Nevertheless, the size and complexity of the data that it produces, even with a low number of markers, still poses challenges in the detection, visualization and interpretation of the relevant distributions patterns of cell phenotypes. Ultimately, the full exploitation of the wealth of information that can be obtained from MFC depends on the availability of appropriate mathematical tools.
In this work, we proposed a further option for the analysis of MFC data, consisting in an implementation of ASCA based on previous linear multivariate MFC-dedicated methods, which have the advantage of showing explicitly the direction of biological variability of the highlighted cells. Compared to these methods, ASCA provides a much more comprehensive insight in the immunological variation associated with different factors and interactions in the experimental design, enabling transparency from changes on individual cells to systematic response-associated immunological changes on given time points. In our case study on the innate immune response to prolonged physical exercise, ASCA was able to reveal the effects on the cell populations caused by the known experimental factors and their interactions, directly quantifying their size and statistical significance. Most relevant were the average effect of exercise (an initial neutrophil mobilization on the first 2 days of walking, followed by a partial reversal on the third day) and the relationship between exercise and response to bacterial stimulation (fMLF). Using an appropriate time-guided clustering method, we also found a stratification of individual dynamics, for example, some individuals had a high neutrophil activation on day 1 and 2 before returning to the initial state on day 3, whereas another group of individuals had the opposite dynamic. Some of these findings were related to small effects that are normally hidden by larger sources of variation and are thus difficult to detect using only PCA, which does not take the information on the experimental design into account. This work also compared ASCA results with those from a discriminant analysis method (DAMACY), showing that the former may be less accurate at describing differences between specific groups of data, but it is more efficient at providing a general and simultaneous description of all the groups involved in the experimental design. This characteristic is especially useful to analyze sequences (such as a time series in this case), as opposed to pairwise classifications.
The results of this work show that the proposed implementation of ASCA is an effective way to disentangle variation in MFC measurements, particularly changes in relative expression of several cell markers simultaneously, caused by multifactorial experimental designs. It therefore constitutes a valuable addition to the mix of chemometric techniques to be employed for a comprehensive multivariate analysis of this type of data.