The proliferation of polychromatic flow cytometry, in terms of instrumentation (1, 2), reagents (3, 4), data analysis techniques (5, 6), and applications (1, 7), has led to the routine generation of highly complex datasets. The high dimensionality of these datasets poses enormous challenges for analysis and for the data reduction needed to interpret results. Flow cytometry data analysis software has been designed to address these challenges, and large-scale efforts toward automation are underway. However, these efforts have been directed primarily at single-sample analysis; the post-processing of complex datasets remains an area requiring innovation.

When analyzing T cell responses, many laboratories routinely measure multiple functional components on a cell-by-cell basis, e.g., production of IFNγ, IL2, and/or TNF. There is evidence that the pattern of production of these cytokines, termed “quality” (8), rather than the magnitude of the response, may be an important correlate of protection against pathogens (9–13). A single sample measurement may be thought of as a vector of responses; in this example, it would be a seven-element vector comprising the percentage of T cells that made each unique combination of the three cytokines (the eight possible combinations, less the one in which no cytokine is made). The complexity of this analysis (and the size of the measurement vector) grows geometrically with each additional measurement, such as CD4 vs. CD8, restriction to particular differentiation stages (14), or inclusion of additional functional outcomes. The goal of such analyses is often to identify the element (or combination of elements) of this vector whose magnitude correlates with a given biological outcome. As an example, protection afforded by a vaccine against Leishmania major was correlated with the magnitude of only those CD4 cells that simultaneously produced all three cytokines, a fraction of the total CD4 response (11).
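As an illustration of how such a measurement vector is indexed, the following minimal Python sketch enumerates every positive/negative combination of a set of markers and discards the all-negative one. The function name `response_combinations` and the ASCII marker labels are illustrative assumptions and are not taken from SPICE.

```python
from itertools import product


def response_combinations(markers):
    """List every +/- combination of the markers with at least one positive.

    Hypothetical helper for illustration; not part of SPICE.
    """
    combos = []
    for flags in product([True, False], repeat=len(markers)):
        if not any(flags):
            continue  # skip cells that make none of the markers
        label = "".join(
            name + ("+" if positive else "-")
            for name, positive in zip(markers, flags)
        )
        combos.append(label)
    return combos


combos = response_combinations(["IFNg", "IL2", "TNF"])
for label in combos:
    print(label)
print(len(combos))  # 7 elements for three markers
```

For n binary markers the vector has 2^n - 1 elements, so a fourth marker yields 15 elements and a fifth yields 31.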

To make such comparisons, it is necessary to analyze many measurement vectors simultaneously, grouped by the categorical variables that describe each sample, e.g., treatment, gender, age group, or other experimental conditions. Researchers require graphical interfaces that readily display measurement vectors in forms such as bar charts or pie charts, with subsets of individuals grouped on the basis of any category or combination of categories. In some cases, averaging (or other mathematical operations) across subsets of individuals is also desired.
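The grouping and averaging described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SPICE implementation: the sample records, the field names (`treatment`, `age_group`, `vector`), the function `mean_vectors`, and all numerical values are hypothetical, and the vectors are shortened to three elements for brevity.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: categorical descriptors plus a measurement vector.
samples = [
    {"treatment": "vaccine", "age_group": "adult", "vector": [1.0, 2.0, 0.5]},
    {"treatment": "vaccine", "age_group": "adult", "vector": [3.0, 4.0, 1.5]},
    {"treatment": "placebo", "age_group": "adult", "vector": [0.0, 1.0, 0.5]},
    {"treatment": "vaccine", "age_group": "child", "vector": [2.0, 2.0, 2.0]},
]


def mean_vectors(samples, keys):
    """Average measurement vectors element-wise within each group.

    Groups are defined by the combination of values of the categorical
    variables named in `keys`.
    """
    groups = defaultdict(list)
    for sample in samples:
        group = tuple(sample[key] for key in keys)
        groups[group].append(sample["vector"])
    # zip(*vectors) walks the vectors element by element.
    return {
        group: [mean(column) for column in zip(*vectors)]
        for group, vectors in groups.items()
    }


by_both = mean_vectors(samples, ["treatment", "age_group"])
by_treatment = mean_vectors(samples, ["treatment"])
print(by_both[("vaccine", "adult")])
print(by_treatment[("placebo",)])
```

Passing a different list of keys regroups the same samples without changing the data, which is the kind of interactive regrouping the text calls for; the mean could be replaced by any other per-element summary.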

To support this mode of data exploration and statistical analysis, we developed a set of algorithms implemented in an Apple Mac™-based software application named SPICE (“Simplified Presentation of Incredibly Complex Evaluations”). SPICE is supported and distributed by the National Institute of Allergy and Infectious Diseases, NIH, and is freely available (http://exon.niaid.nih.gov/spice). Currently, SPICE supports the analysis and display of a single measurement type (e.g., frequency or MFI); development to support multivariate analysis is underway.

Here we report the algorithms and techniques we used in developing this application and analysis platform, including a unique implementation of a statistical test that compares measurement vectors (distributions) between two groups of samples, so that developers can implement similar tests and displays in other software applications. In addition, we highlight important features of the analysis and presentation of this type of data. Although the original implementation and the examples shown here are based on the analysis of antigen-specific T cells, none of the algorithms is specific to that domain; we routinely use SPICE to analyze and present any complex dataset that is described by multiple categorical variables, including demographic data.