How to cite this article: Finn WG, Carter KM, Raich R, Stoolman LM, Hero AO. Analysis of clinical flow cytometric immunophenotyping data by clustering on statistical manifolds: Treating flow cytometry data as high-dimensional objects. Cytometry Part B 2009; 76B: 1–7.
Original Article
Analysis of clinical flow cytometric immunophenotyping data by clustering on statistical manifolds: Treating flow cytometry data as high-dimensional objects
Article first published online: 18 JUL 2008
DOI: 10.1002/cyto.b.20435
Copyright © 2008 Clinical Cytometry Society
Publication History
- Issue published online: 9 DEC 2008
- Article first published online: 18 JUL 2008
- Manuscript Accepted: 27 MAY 2008
- Manuscript Received: 13 FEB 2008
Funded by
- National Science Foundation. Grant Number: CCR-0325571
Keywords:
- flow cytometry;
- statistical manifold;
- information geometry;
- immunophenotyping;
- immunophenotype clustering
Abstract
Background
Clinical flow cytometry typically involves the sequential interpretation of two-dimensional histograms, usually culled from six or more cellular characteristics, following initial selection (gating) of cell populations based on a different subset of these characteristics. We examined the feasibility of instead treating gated n-parameter clinical flow cytometry data as objects embedded in n-dimensional space using principles of information geometry via a recently described method known as Fisher Information Non-parametric Embedding (FINE).
Methods
After initial selection of relevant cell populations through an iterative gating strategy, we converted four-color (six-parameter) clinical flow cytometry datasets into six-dimensional probability density functions, and calculated differences among these distributions using the Kullback-Leibler divergence (a measurement of relative distributional entropy shown to be an appropriate approximation of Fisher information distance in certain types of statistical manifolds). Neighborhood maps based on Kullback-Leibler divergences were projected onto two-dimensional displays for comparison.
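The density-estimation and divergence steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Gaussian kernel density estimate with a fixed bandwidth and a simple plug-in estimate of the Kullback-Leibler divergence, KL(p || q) = E_p[log p(x) - log q(x)], averaged over the events of the first sample. The divergence is symmetrized here so that it can serve as a dissimilarity, which the abstract does not specify. The function names, the bandwidth value, and the simulated six-parameter events are all hypothetical.

```python
import math
import random


def kde_log_density(x, sample, bandwidth):
    """Log of a Gaussian kernel density estimate at point x (any dimension)."""
    d = len(x)
    log_norm = -0.5 * d * math.log(2 * math.pi * bandwidth ** 2) - math.log(len(sample))
    exponents = [
        -sum((a - b) ** 2 for a, b in zip(x, s)) / (2 * bandwidth ** 2)
        for s in sample
    ]
    m = max(exponents)  # log-sum-exp trick keeps the sum numerically stable
    return log_norm + m + math.log(sum(math.exp(e - m) for e in exponents))


def kl_divergence(p_sample, q_sample, bandwidth=1.0):
    """Plug-in estimate of KL(p || q): mean of log p(x) - log q(x) over p's events.

    This resubstitution estimate is biased for small samples; it is a sketch only.
    """
    total = 0.0
    for x in p_sample:
        total += kde_log_density(x, p_sample, bandwidth)
        total -= kde_log_density(x, q_sample, bandwidth)
    return total / len(p_sample)


def symmetric_kl(p_sample, q_sample, bandwidth=1.0):
    """Symmetrized divergence KL(p || q) + KL(q || p) (an assumption of this sketch)."""
    return (kl_divergence(p_sample, q_sample, bandwidth)
            + kl_divergence(q_sample, p_sample, bandwidth))


def pairwise_dissimilarity(samples, bandwidth=1.0):
    """Symmetric matrix of divergences between every pair of gated samples."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = symmetric_kl(samples[i], samples[j], bandwidth)
            matrix[i][j] = matrix[j][i] = value
    return matrix


def simulate_events(rng, n_events, center, n_params=6):
    """Hypothetical stand-in for gated list mode data: Gaussian events in 6 parameters."""
    return [[rng.gauss(center, 1.0) for _ in range(n_params)] for _ in range(n_events)]
```

In this sketch, two samples drawn from the same simulated population should yield a smaller divergence than two samples drawn from shifted populations. The resulting dissimilarity matrix could then be passed to a dimensionality-reduction routine (for example, multidimensional scaling) to obtain a two-dimensional display; the abstract does not name the projection method used.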
Results
These methods effectively separated, by unsupervised clustering, cases of acute lymphoblastic leukemia from cases of expansion of physiologic B-cell precursors (hematogones) within a set of 54 patient samples.
Conclusions
The treatment of flow cytometry datasets as objects embedded in high-dimensional space (as opposed to sequential two-dimensional analyses) harbors the potential for use as a decision-support tool in clinical practice or as a means for context-based archiving and searching of clinical flow cytometry data based on high-dimensional distribution patterns contained within stored list mode data. Additional studies will be needed to further test the effectiveness of this approach in clinical practice.

