The distinction between germinal center B-cell lymphoma (GC-L), such as follicular lymphoma, and germinal center lymphoid hyperplasia (GC-H) can be challenging. Although for some specimens a diagnosis can be made readily using conventional histologic evaluation, for others additional phenotypic or genotypic evaluation is required. Flow cytometric immunophenotyping is a rapid technique that can assist with this distinction, but has some limitations (). Identification of a population of cells with immunoglobulin light chain restriction by flow cytometry provides strong support for a diagnosis of lymphoma, but such populations can occasionally be present in florid GC-H (). In addition, light chain restriction can easily be overlooked if the abnormal population is small, the lymphoma cells lack surface immunoglobulin expression, or the assay is limited by non-specific staining. Although bcl-2 protein overexpression can be identified by immunohistochemistry or flow cytometry in many cases of GC-L, some cases are bcl-2 negative ([3, 4]). In addition, bcl-2 staining of GC B-cells can be difficult to distinguish from that of T-cells and plasma cells by single color immunohistochemistry, and although flow cytometric analysis can isolate the B-cells of interest, it requires cell permeabilization, is prone to non-specific staining, and can be difficult to interpret. Abnormally increased or decreased expression of the surface antigens CD19, CD20, and CD10 can be found in many cases of follicular lymphoma, but evaluation of each antigen in isolation lacks sensitivity for the diagnosis of lymphoma ([5, 6]). The availability of higher color flow cytometric assays facilitates the analysis of multiple antigens simultaneously, but raises questions about which antigens to evaluate and how best to analyze the data.
In order to address the question “What is the best flow cytometric analysis strategy to distinguish GC-L from reactive lymphoid tissue using the 8-color antibody combination: anti-kappa, anti-lambda, CD19, CD20, CD10, CD5, CD38, and CD45 antibodies?” we applied the computational tools flowType and RchyOptimyx (cellular hieraRCHY OPTIMization), which will be briefly summarized in the discussion section of the manuscript ([7, 8]). This computational analysis highlighted the diagnostic utility of identifying CD10 positive, CD38 negative B-cells in the distinction between GC-L and GC-H. In addition, this study revealed some of the limiting factors that must be considered when applying computational analysis to clinical data sets.
MATERIALS AND METHODS
Lymphoid tissue biopsy specimens with the following features were identified from the pathology reports at the University of Pittsburgh Medical Center (UPMC; University of Pittsburgh Institutional Review Board IRB proposal PRO11060224):
- Flow cytometric immunophenotyping using an 8-color B-cell tube containing CD45 V500/CD20 V450/kappa fluorescein isothiocyanate (FITC)/lambda phycoerythrin (PE)/CD19 PE-Cy7/CD5 PerCP-Cy5.5/CD10 allophycocyanin (APC)/CD38 APC-H7 (BD Bioscience, San Jose, CA).
- Flow cytometric immunophenotyping using a tube containing Bcl-2 FITC/CD10 PE/CD20 PerCP-Cy5.5 (BD Bioscience, San Jose, CA).
- Presence of a population of CD10 positive germinal center B-cells.
- Diagnosis of reactive changes, including hyperplastic follicles (GC-H), or GC-L, confirmed by review of all diagnostic materials and using the criteria outlined in the 2008 WHO classification ().
Using these criteria, the following specimens were identified: GC-H (n = 48, 25 females, 23 males, median age 40 years) and GC-L (n = 52, 29 females, 23 males, median age 70 years). Cases of GC-L included follicular lymphoma, Grade 1–2 (n = 34) and follicular lymphoma, Grade 3A and/or diffuse large B-cell lymphoma (n = 18).
Flow cytometric immunophenotyping was performed on cells extracted by manual disaggregation. Viability was determined by Trypan Blue exclusion and ranged from 66 to 99% (median 84%). A suspension of 5 × 10⁵ cells/tube in phosphate-buffered saline (PBS) containing 0.1% sodium azide and 2% fetal bovine serum was incubated with the 8-color surface antibody combination for 15–30 min at 4°C. Lysis was performed using ammonium chloride, followed by washing with PBS. Stained cells were fixed with 2% formaldehyde. For the bcl-2 tube, cells were fixed and permeabilized (Fix and Perm Kit, Life Technologies, NY) as previously described (). Acquisition for both tubes was performed on the same day as staining using a BD FACS Canto II flow cytometer (BD Bioscience, San Jose, CA), with collection of 30,000 events. To ensure consistency of results, instrument setup was standardized using target CST beads (BD Bioscience, San Jose, CA), voltages were monitored with Levey–Jennings plots, settings were cloned between instruments, instrument spectral compensation was set up using Compbeads (BD Biosciences), and lot-to-lot reagent checks were performed.
The original manual analysis was performed in the UPMC clinical flow cytometry laboratory using FACS DIVA software (BD Bioscience, San Jose, CA), with a template that includes the following steps: exclusion of doublets using a plot of forward light scatter (FSC)-area versus FSC-height; exclusion of debris using a plot of CD45 versus side light scatter (SSC), by gating on cells with low SSC and staining for CD45; identification of B-cells through expression of CD19 and/or CD20; identification of CD38+(bright) plasma cells, CD10 positive GC B-cells, and CD5 positive B-cells as subsets; evaluation of each subset for immunoglobulin light chain restriction; and visual evaluation of each subset for altered expression of CD19, CD20, or CD10. A separate tube evaluating bcl-2 expression was either ordered up-front or added if the 8-color B-cell tube identified a CD10 positive germinal center B-cell population, but did not adequately distinguish between GC-H and GC-L. The results of this conventional flow cytometric analysis for the 8-color B-cell tube and a separate flow cytometric tube containing Bcl-2 FITC/CD10 PE/CD20 PerCP-Cy5.5 were reviewed.
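The gating template described above can be read as a chain of boolean filters applied to each event. The following is a minimal sketch under stated assumptions, not the laboratory's FACS DIVA template: the `Event` fields, the thresholds (`POS`, `BRIGHT`, `SSC_LOW`, `DOUBLET_TOL`), and the rule that singlets have FSC-area close to FSC-height are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    fsc_a: float
    fsc_h: float
    ssc: float
    cd45: float
    cd19: float
    cd20: float
    cd10: float
    cd38: float
    cd5: float


# Hypothetical cutoffs on an arbitrary transformed scale (assumptions,
# not values from the study; real cutoffs are set per assay).
POS = 1.0           # positivity cutoff for any marker
BRIGHT = 3.0        # "bright" CD38 cutoff used for plasma cells
SSC_LOW = 2.0       # upper bound for low side scatter
DOUBLET_TOL = 0.15  # allowed relative deviation of FSC-A from FSC-H


def is_singlet(e):
    # Assumed rule: singlets lie near the FSC-A == FSC-H diagonal.
    return abs(e.fsc_a - e.fsc_h) <= DOUBLET_TOL * e.fsc_h


def is_leukocyte(e):
    # Low SSC and CD45 staining, as in the debris-exclusion step.
    return e.ssc < SSC_LOW and e.cd45 >= POS


def is_b_cell(e):
    # B-cells identified through CD19 and/or CD20 expression.
    return e.cd19 >= POS or e.cd20 >= POS


def gate_subsets(events):
    """Apply the gates in order and return the B-cell subsets of interest."""
    b_cells = [e for e in events
               if is_singlet(e) and is_leukocyte(e) and is_b_cell(e)]
    return {
        "b_cells": b_cells,
        "plasma_cells": [e for e in b_cells if e.cd38 >= BRIGHT],
        "gc_b_cells": [e for e in b_cells if e.cd10 >= POS],
        "cd5_b_cells": [e for e in b_cells if e.cd5 >= POS],
    }
```

Because each subset is derived from the B-cell gate, a CD10 positive event with high side scatter (for example a granulocyte) never reaches the `gc_b_cells` subset in this sketch.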
Computational analysis was performed on de-identified flow cytometric data as previously described (). Briefly, logicle transformation of immunophenotypic data was performed and gates were determined for each marker so as to partition positive and negative cell populations (). Using these data, all possible phenotypes were extracted using flowType ([7, 8]). Receiver operating characteristic (ROC) analysis was then performed to identify the phenotypes associated with a statistically significant difference between GC-H and GC-L, and those with the strongest predictive power were selected for further analysis. The phenotypes selected were analyzed using RchyOptimyx to identify their most important parent populations.
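The phenotype enumeration and ROC steps can be illustrated with a small sketch. This is plain Python written for illustration and is not the flowType package (an R/Bioconductor tool) or its API. It assumes that each marker has already been reduced to a positive/negative call per cell, that every marker in a phenotype is either positive, negative, or ignored (giving 3^8 combinations for the 8 markers), and that the area under the ROC curve is computed through the Mann-Whitney relationship. The function names are hypothetical.

```python
from itertools import product

MARKERS = ["kappa", "lambda", "CD19", "CD20", "CD10", "CD5", "CD38", "CD45"]


def enumerate_phenotypes(markers):
    """Yield each phenotype as a dict marker -> True (+) or False (-).

    Markers in the 'ignored' state are simply left out of the dict.
    """
    for states in product((None, True, False), repeat=len(markers)):
        yield {m: s for m, s in zip(markers, states) if s is not None}


def phenotype_fraction(cells, phenotype):
    """Fraction of cells (dicts marker -> bool) that match the phenotype."""
    if not cells:
        return 0.0
    hits = sum(all(c[m] == s for m, s in phenotype.items()) for c in cells)
    return hits / len(cells)


def auc(neg_scores, pos_scores):
    """Area under the ROC curve: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

In use, `phenotype_fraction` would be evaluated once per specimen for a given phenotype, and `auc` would then compare the resulting fractions between the GC-H and GC-L groups; an AUC near 0.5 indicates no discrimination.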
Manual re-analysis of the flow cytometric data was then performed using FACS DIVA software (BD Bioscience, San Jose, CA) to further explore the cell populations identified by computational analysis: CD10+, CD10+CD38(−), CD19+CD10+CD38(−), CD5(−)CD19+CD10+CD38(−); CD10+CD38(−), CD10+CD38(−)CD45(−), CD19+CD10+CD38(−)CD45(−); Lambda+CD10(−), Kappa+CD10(−), Lambda+Kappa+CD10(−), Lambda+Kappa+CD10(−)CD38(−), Lambda+Kappa+CD10(−)CD20+. The proportion of events and median fluorescence intensity (MFI) were determined for each population identified in order to illustrate features previously reported using current manual methods and compare those with the computational results.
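As an illustration of the two summary values reported for each population, the sketch below computes the proportion of events and the MFI for one phenotype. It assumes events are stored as dictionaries of marker intensities and that a single threshold per marker separates positive from negative; the function name `population_stats` and the data layout are hypothetical and do not reflect FACS DIVA output.

```python
from statistics import median


def population_stats(events, phenotype, channel, thresholds):
    """Return (proportion of events, MFI of `channel`) for one phenotype.

    events:     list of dicts mapping marker name -> intensity
    phenotype:  dict mapping marker name -> True (positive) / False (negative)
    channel:    marker whose median fluorescence intensity is reported
    thresholds: dict mapping marker name -> positivity cutoff (assumed given)
    """
    selected = [
        e for e in events
        if all((e[m] >= thresholds[m]) == want for m, want in phenotype.items())
    ]
    proportion = len(selected) / len(events) if events else 0.0
    # MFI is undefined for an empty population, so None is returned.
    mfi = median(e[channel] for e in selected) if selected else None
    return proportion, mfi
```

For example, passing the phenotype `{"CD19": True, "CD10": True, "CD38": False}` with `channel="CD10"` would report how large the CD19+CD10+CD38(−) population is and how brightly it stains for CD10.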
Advances in instrumentation and reagent technologies have led to the widespread use of high-level multicolor flow cytometry. However, effective strategies for storing, representing, and interpreting the increasingly complex data have been lacking. Recognition of this need has led to the recent development of many automated gating tools, some of which are now employed in research applications and to support high-throughput technology ([10-12]). There is increasing interest in employing these tools in clinical flow cytometry laboratories to support the use of higher color systems, larger panels of reagents, and more sophisticated analysis strategies with complex hierarchical gating ([13, 14]). Manual analysis of flow cytometric data remains one of the largest variables in flow cytometric immunophenotyping (); it often relies on personal experience and is time-intensive, error-prone, and difficult to standardize. Automated analysis is poised not only to remove the burden of manual gating (), but also to take the next step and identify biological changes associated with disease. In addition, these tools have the potential to assist with manual analysis by maximizing the information obtained from a single multicolor tube, evaluating the relative importance of information in reaching a final interpretation, suggesting optimized gating strategies, and potentially decreasing the size of marker panels ([13, 14]).
In the current study, we used computational analysis to identify features that can best distinguish germinal center lymphoma from reactive germinal center cells. The utility of our computational pipeline has previously been demonstrated using 13-color flow cytometric data from T-cell subset evaluation of HIV positive subjects ([7, 8]). This pipeline consists of two independent open-source tools. In this setting, flowType was used to identify surrogate cell surface marker phenotypes that could overcome the need to detect intracellular markers and RchyOptimyx was used to simplify the gating strategy for the identified phenotypes (e.g., using CD45RO− CCR5− CCR7+ instead of the more complicated CD28+ CD45RO− CD57− CCR5− CD27+ CCR7+) and summarize a large list of phenotypes to identify three strong predictors of progression to AIDS (). In the current study, we utilized RchyOptimyx to objectively identify the optimal analysis strategy for the 8-color B-cell directed antibody combination: anti-kappa, anti-lambda, CD19, CD20, CD10, CD5, CD38, and CD45. Application of this computational tool highlighted the diagnostic utility of identifying CD38 negative, CD10 positive germinal center B-cells in association with lymphoma, a feature not emphasized in the original manual analysis.
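The idea of simplifying a gating strategy by ranking the parent populations of a phenotype can be sketched as a search over the order in which markers are added. The code below is a deliberately simplified, brute-force stand-in and is not the RchyOptimyx algorithm or its API: it assumes a caller-supplied `score` function (for example, a predictive-power measure per phenotype) and simply picks the marker order whose intermediate populations have the highest total score. All names are hypothetical.

```python
from itertools import permutations


def best_gating_path(target, score):
    """Find the marker order with the best-scoring intermediate populations.

    target: dict marker -> True/False describing the final phenotype
    score:  callable taking a frozenset of (marker, sign) pairs and
            returning a number (higher is better); supplied by the caller
    Returns (list of markers in gating order, total score along the path).
    Brute force over all orders, so only suitable for a handful of markers.
    """
    items = tuple(target.items())
    best_order, best_total = None, float("-inf")
    for order in permutations(items):
        # Score every population along the path: first marker only,
        # then first two markers, and so on up to the full phenotype.
        total = sum(score(frozenset(order[:i]))
                    for i in range(1, len(order) + 1))
        if total > best_total:
            best_order, best_total = order, total
    return [marker for marker, _ in best_order], best_total
```

If the early populations on the best path already score nearly as well as the full phenotype, the later markers add little, which is the sense in which a shorter gating strategy can replace a longer one.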
CD38 is a glycoprotein that is expressed by precursor B-cells, germinal center B-cells, and plasma cells, but is absent from naïve B-cells and memory B-cells ([17-19]). Its expression appears to be under tight control, with B-cells undergoing synchronous gain and loss of CD38 and CD10 expression as they enter and exit the germinal center (). Non-neoplastic B-cells with a CD38 negative, CD10 positive phenotype are not well recognized and therefore, the presence of this phenotype in GC-L may reflect aberrant antigen expression. This interpretation is supported by a previous study that reported significantly lower CD38 expression by the neoplastic cells in follicular lymphoma, as determined by MFI, when compared with reactive germinal center B-cells (). However, in both the previous and current study, evaluation of CD38 expression alone was insufficient to establish a diagnosis of lymphoma (). Although revision of the manual analysis strategy in the current study to emphasize CD38 negative, CD10 positive B-cells identified most of the cases of germinal center lymphoma, a small population of cells with this phenotype was also identified in some cases of reactive hyperplasia. The identification of a small population of CD38 negative, CD10 positive B-cells in reactive lymphoid tissue suggests that in GC-L this phenotype may represent expansion of this subset rather than aberrant antigen expression. It will be of interest to explore additional phenotypic findings that can assist in the distinction between normal and neoplastic CD10 positive, CD38 negative B-cells, and determine whether non-neoplastic B-cells with this phenotype reflect a transitional stage from naïve or immature/transitional B-cells to germinal center B-cells, or germinal center B-cells to memory B-cells ().
The main utility of the computational strategy utilized in this study was the ability to identify phenotypes that were associated with an outcome, in this case diagnosis, and thereby assess the relative utility of different antibodies and analysis strategies. However, although the computational analysis employed in the current study highlighted a cell population that had not been emphasized in the manual analysis, it did not supersede any of the existing analysis strategies. One factor that contributed to the decision to retain the other analysis components is the low tolerance for any false positive or false negative results in the clinical diagnostic setting. In addition, some of the diagnostically useful parameters utilized in the conventional analysis could not be identified with the computational analysis because of independent consideration of each marker and use of fixed thresholds. For example, the independent partition of each marker into positive and negative cell populations led to difficulty in distinguishing CD10 staining of granulocytes from that of germinal center B-cells, a problem that could have been minimized by initial gating on B-cells or possibly by use of side light scatter to gate out granulocytes. Similarly, this strategy could not distinguish B-cell staining from non-B-cell cytophilic staining for kappa and lambda, and was unable to assess for immunoglobulin light chain restriction, i.e., more homogeneous staining of B-cells for one light chain only. Other strategies for automated analysis of flow cytometric data, such as vector quantization, dimension reduction, and clustering algorithms, have the advantage of utilizing multiple parameters simultaneously to identify and evaluate populations of cells, as shown in the referenced examples ([22-24]).
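The light chain limitation described above arises because restriction is a property of the B-cell population rather than of a single marker threshold. The sketch below shows, under stated assumptions, how gating on B-cells first excludes cytophilic staining of non-B-cells before a kappa:lambda ratio is computed. The event layout, the cutoff `pos`, and the function name are hypothetical, and the sketch sets no diagnostic threshold for what ratio counts as restricted.

```python
def kappa_lambda_ratio(events, pos=1.0):
    """Kappa:lambda ratio among B-cells (CD19+ and/or CD20+).

    events: list of dicts mapping marker name -> intensity
    pos:    hypothetical positivity cutoff shared by all markers
    Returns None when no single-light-chain B-cells are found, and
    float('inf') when kappa-only B-cells exist but no lambda-only ones.
    """
    # Gate on B-cells first so that non-B-cells with cytophilic
    # kappa/lambda staining do not enter the ratio.
    b_cells = [e for e in events if e["CD19"] >= pos or e["CD20"] >= pos]
    kappa = sum(1 for e in b_cells
                if e["kappa"] >= pos and e["lambda"] < pos)
    lam = sum(1 for e in b_cells
              if e["lambda"] >= pos and e["kappa"] < pos)
    if lam == 0:
        return float("inf") if kappa else None
    return kappa / lam
```

Counting kappa and lambda positive events across all cells without the B-cell gate, as happens when each marker is partitioned independently, would mix in the non-B-cell staining that this sketch filters out.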
Multiparametric approaches such as these, or an ensemble of computational strategies, as highlighted through the FlowCAP challenges, might be more successful at population identification and characterization, and serve as a better comparison with manual analysis (). However, these multivariate approaches are limited in their ability to incorporate biological guidance for identifying cell populations, and often require a complex and subjective meta-clustering step for matching high-dimensional cell populations across different patients ([14, 25]). Another challenge in the design of automated computational strategies is the presence of technical and biologic variability in the data. Although the flow cytometric data utilized in the current study were generated employing clinical laboratory procedures to ensure consistency of data, there were some artifacts, such as the presence of non-specific antibody staining, doublet formation, and reagent aggregates, which could mislead automated computational analysis. As the development of clinically applicable computational tools progresses, it will be important to address these issues using controlled clinical data sets with associated outcome data, such as the one we describe here.