• minimal residual disease;
  • flow cytometry;
  • principal component analysis;
  • pattern classification;
  • leukemia;
  • Bayes theorem


  1. Top of page
  2. Abstract

Multiparameter flow cytometry has become an essential tool for monitoring response to therapy in hematological malignancies, including B-cell chronic lymphoproliferative disorders (B-CLPD). However, depending on the expertise of the operator, minimal residual disease (MRD) can be misidentified, because data analysis relies on expert-defined bidimensional plots in which an operator selects the subpopulations of interest. Here, we propose and evaluate a probabilistic approach based on pattern classification tools and the Bayes theorem for automated analysis of flow cytometry data from a group of 50 B-CLPD versus normal peripheral blood B-cells under MRD conditions, with the aim of reducing operator-associated subjectivity. The proposed approach provided a tool for MRD detection in B-CLPD by flow cytometry with a sensitivity of ≤8 × 10⁻⁵ (median of ≤2 × 10⁻⁷). Furthermore, in 86% of B-CLPD cases tested, no events corresponding to normal B-cells were wrongly identified as belonging to the neoplastic B-cell population at a level of ≤10⁻⁷. Thus, this approach, based on the search for minimal numbers of neoplastic B-cells similar to those detected at diagnosis, could potentially be applied with both a high sensitivity and specificity to investigate the presence of MRD in virtually all B-CLPD. Further studies evaluating its efficiency in larger series of patients, where reactive conditions and non-neoplastic disorders are also included, are required to confirm these results. © 2008 International Society for Advancement of Cytometry

In recent years, multiparameter flow cytometry immunophenotyping has become an essential tool for the diagnosis and monitoring of response to therapy in a wide spectrum of diseases, including leukemic B-cell chronic lymphoproliferative disorders (B-CLPD) (1–3). Among other advantages, flow cytometry immunophenotyping allows for a rapid quantitative assessment of multiple characteristics of millions of cells, information being recorded for individual cellular events (4). This provides a tool for accurate multiparameter identification and characterization of neoplastic cells among normal cells in peripheral blood (PB) and bone marrow (BM), even when neoplastic cells are present at very low frequencies (≤10⁻⁴) among a major population of normal cells, a situation known as minimal residual disease (MRD) (5–8).

Detection of MRD by multiparameter flow cytometry immunophenotyping is based on the existence of different patterns of protein expression in normal versus neoplastic cells. In the last decade, it has been shown that MRD evaluation is of great clinical utility to predict disease recurrence and patient outcome in B-CLPD such as chronic lymphocytic leukemia (CLL) (7–12). To achieve the sensitivity required for MRD investigation, large numbers of cells (typically hundreds of thousands to millions) have to be analyzed (5, 7–13). Accordingly, information about several (≥6) different cell-associated features is typically obtained and stored in a digital list mode data file format for each of the cells measured. Such data files typically contain >10⁶ individual data points, a number of entries which is by far larger than that of a typical data set containing information about a sample analyzed by DNA oligonucleotide microarray techniques (14–17).

In the past few years, important advances have been achieved in flow cytometers allowing for the measurement of an increasingly high number of parameters (≥10) in a more rapid way, through the evaluation of tens of thousands of cells per second (18). In contrast, analysis of the data recorded has not attained the same level of progress and it is still based on strategies, which were defined more than 20 years ago (4, 18, 19). Accordingly, with a few exceptions (20–26) analysis of flow cytometry immunophenotypic data typically relies on the definition of a variable number of bidimensional plots, where an experienced operator selects the subpopulations of interest (25–27). Often, depending on the expertise of the operator, specific cell populations—particularly those present at low frequencies—can be misidentified. Overestimation and/or underestimation of specific minor cell populations has a direct impact on the assessment of MRD with potential clinical/diagnostic consequences (5, 7, 28). More recently, we have described an alternative, automated method for analysis of flow cytometry immunophenotypic data (22, 25). With this new automated approach, we could detect neoplastic B-cells present in peripheral blood (PB) samples from patients with increased PB absolute lymphocyte counts, with a high efficiency and an increased reproducibility, by reducing expert-based data-analysis decisions. However, this method was only able to detect neoplastic cells in PB when they were present at relatively high frequencies (≥5% of the whole sample cellularity) (22, 25), its sensitivity being insufficient for MRD evaluation.

At present, consensus exists about the basic requirements for an adequate MRD technique. Ideally, MRD approaches should allow clear and specific identification of neoplastic cells at frequencies ≤10⁻⁴, an MRD level which has proven to be clinically relevant in B-CLPD (1, 3, 8). In such cases, flow cytometry measurements require a minimum number of events corresponding to the neoplastic cell population (e.g., ≥10 cells) to define it as present, and at least 100 events to achieve statistical precision in quantifying the neoplastic cells (1, 3).
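The arithmetic behind these requirements can be sketched in a few lines. This is only an illustration: the function name `events_needed` and the "1 cell in N" way of expressing the MRD frequency are assumptions introduced here, and the calculation ignores sampling variability (it uses the expected count only).

```python
def events_needed(min_cluster_events: int, one_in: int) -> int:
    """Total events to acquire so that, on average, `min_cluster_events`
    neoplastic events are expected when 1 cell in `one_in` is neoplastic."""
    return min_cluster_events * one_in

# MRD frequency of 10^-4, i.e. 1 neoplastic cell in 10,000 cells
to_detect = events_needed(10, 10_000)     # >=10 events to call the population present
to_quantify = events_needed(100, 10_000)  # ~100 events for precise quantification
print(to_detect, to_quantify)
```

Under these assumptions, on the order of 10⁵ events are needed to define a population at a frequency of 10⁻⁴ and 10⁶ events to quantify it, which is in line with the acquisition sizes mentioned in the text.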

In this article, we describe an automated strategy for the detection of MRD in B-CLPD based on pattern classification tools and the Bayes theorem (29). With this probabilistic approach, we were able to systematically identify MRD by flow cytometry with a sensitivity of ≤8 × 10⁻⁵, a sensitivity of ≤2 × 10⁻⁷ being reached for the large majority of cases (80%). Furthermore, this approach allows “a priori” definition (e.g., at diagnosis) of the sensitivity that will be reached for each case later on, during follow-up of the disease.



Patients and Samples

A total of 50 EDTA-anticoagulated diagnostic PB samples from 50 patients (28 males and 22 females; mean age of 51 years, ranging from 40 to 89 years) with different subtypes of leukemic B-CLPD were included in this study. Patients were classified according to the WHO criteria (30) into the following diagnostic categories: B-cell chronic lymphocytic leukemia (B-CLL), 31 patients (26 typical and 5 atypical B-CLL cases); mantle cell lymphoma (MCL), 8; splenic marginal zone lymphoma (SMZL), 3; mucosa-associated lymphoid tissue (MALT) lymphoma, 3; diffuse large B-cell lymphoma (DLBCL), 2; and follicular lymphoma (FL), 1. The other two cases had an unclassifiable B-CLPD. Median white blood cell (WBC) and lymphocyte counts were 31 × 10⁹ leukocytes/L (range: 2.9 × 10⁹ to 183 × 10⁹ leukocytes/L) and 23 × 10⁹ lymphocytes/L (range: 0.8 × 10⁹ to 164 × 10⁹ lymphocytes/L), respectively. Overall, the median percentage of neoplastic B cells in the 50 infiltrated specimens was 40% (range: 0.4–92%). Moreover, PB samples from three patients diagnosed with B-CLL, which were collected after therapy under MRD conditions, were also included in this study.

In addition, EDTA-anticoagulated PB samples from a total of 5 adult healthy individuals (3 males and 2 females) were also collected. Median WBC and lymphocyte counts, as well as B-cell percentages and absolute counts, were 6.4 × 10⁹ leukocytes/L (range: 4.6 × 10⁹ to 10 × 10⁹ leukocytes/L), 2.1 × 10⁹ lymphocytes/L (range: 1.6 × 10⁹ to 2.8 × 10⁹ lymphocytes/L), 3.8% of WBC (range: 3–4.7%), and 0.24 × 10⁹ B-cells/L (range: 0.16 × 10⁹ to 0.47 × 10⁹ B-cells/L), respectively.

All healthy volunteers and B-CLPD patients gave their informed consent prior to entering this study, which was previously approved by the local Ethical Committee of the University Hospital of Salamanca (Salamanca, Spain).

Multiparameter Flow Cytometry Immunophenotypic Studies

Multiparameter flow cytometric studies were performed for each PB sample using a panel of five combinations of fluorochrome-conjugated monoclonal antibodies (MAb): fluorescein isothiocyanate (FITC)/phycoerythrin (PE)/peridinin chlorophyll protein-cyanin 5.5 (PerCP-Cy5.5)/allophycocyanin (APC). Every combination systematically included CD19-PerCP-Cy5.5, in addition to FMC7-FITC/CD24-PE/CD34-APC, sIgλ-FITC/sIgκ-PE/CD5-APC, CD22-FITC/CD23-PE/CD20-APC, CD103-FITC/CD25-PE/CD11c-APC, and CD43-FITC/CD79b-PE, for a total of 15 different markers, of which one (CD19) was common to all combinations. The techniques used to stain the cells have been previously described in detail by Sanchez et al. (13). For each staining corresponding to each individual sample analyzed, information about >5 × 10⁴ cells was acquired in a FACSCalibur flow cytometer (Becton/Dickinson Biosciences, BDB, San José, CA) and stored using the CellQUEST software program (BDB). Overall, for each sample, information about 17 different biological features (15 immunophenotypic and 2 light scatter characteristics) was measured and stored for a total of >2.5 × 10⁵ events. For samples with low percentages of B cells (i.e., normal samples), an additional second-step acquisition was specifically performed to measure higher numbers of the B-cells contained in each stained aliquot. In this latter step, an electronic live gate set on a CD19 versus sideward light scatter (SSC) bivariate dot plot was used to collect information about the B-cells contained in a total of 10⁶ events/sample aliquot. Accordingly, for each PB sample, five different data files were stored, each containing information about five or six cell attributes: two attributes related to light dispersion, forward light scatter (FSC) and sideward light scatter (SSC), and three or four fluorescence attributes. Based on the panel of reagents used, three out of the 17 attributes measured were common to all five data files (FSC, SSC, and CD19); the other 14 parameters varied among the data files according to the reagents used in each staining from the panel described above.

Data Manipulation

Data files from the five multicolor stainings corresponding to the same PB sample (from either B-CLPD patients or healthy individuals) were merged using the INFINICYT™ software program (Cytognos SL, Salamanca, Spain), as previously reported (26). Subsequently, the “calculation” function of the INFINICYT software based on the nearest neighbor principle (26, 31, 32) was applied to the merged data files, to calculate the information about each individual attribute not actually measured for individual events in the merged data files, for the whole panel of markers tested. As a result, for each PB sample, a single data file was obtained containing information about all 17 cellular attributes measured for each event recorded.
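The general nearest neighbor idea behind such a calculation can be sketched as follows. The internals of the INFINICYT “calculation” function are not described here, so this is an illustration of the principle only: the function name `nearest_neighbor_fill`, the use of squared Euclidean distance on the shared attributes (FSC, SSC, CD19), and the toy values are all assumptions.

```python
def nearest_neighbor_fill(target_events, donor_events, shared, missing):
    """For each target event, copy the `missing` attributes from the donor
    event that is closest in the `shared` attributes (squared Euclidean)."""
    filled = []
    for event in target_events:
        donor = min(
            donor_events,
            key=lambda d: sum((event[k] - d[k]) ** 2 for k in shared),
        )
        filled.append({**event, **{k: donor[k] for k in missing}})
    return filled

# Toy data: tube 1 measured CD5, tube 2 measured CD20; both share FSC, SSC, CD19.
tube1 = [{"FSC": 300, "SSC": 100, "CD19": 800, "CD5": 50}]
tube2 = [
    {"FSC": 310, "SSC": 105, "CD19": 790, "CD20": 900},
    {"FSC": 600, "SSC": 400, "CD19": 10, "CD20": 5},
]
merged = nearest_neighbor_fill(
    tube1, tube2, shared=("FSC", "SSC", "CD19"), missing=("CD20",)
)
print(merged[0])
```

In this toy case the tube 1 event inherits the CD20 value of the first tube 2 event, which is its closest match on the three shared attributes.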

For each patient data file, a minimum of 2.5 × 10⁵ cellular events were measured. The number of cellular events corresponding to B-cells within the total events measured in each file varied between 3.2 × 10³ and 6.5 × 10⁵ (mean of 4.2 × 10⁵ ± 2.2 × 10⁵) according to the percentage of B-cells (mean of 34% ± 29%; range: 0.1–92%) present in each PB sample.

Data on normal cells were computationally generated by merging data from five different data files, each containing 1 × 10⁶ events corresponding to a normal PB sample (total events in the merged data file: 5 × 10⁶). The merged data file thus contained events corresponding to normal PB cells from five different normal PB samples, one from each of the five healthy volunteers. From this merged file, a pool of 79,856 normal B-cell events, denoted from here on as “normal-B-cell-pool data,” was obtained.

In diagnostic PB samples from B-CLPD patients, very few normal B-cells can typically be found, because almost all B-cells present in the sample are neoplastic. For this reason, artificial “B-CLPD diagnostic-files” were built electronically for each patient by mixing events corresponding to neoplastic B-cells from the patient with events corresponding to normal B-cells from the “normal-B-cell-pool file” at a 1:1 proportion. In addition, further dilutions of neoplastic B-cell events in the “normal-B-cell-pool data” were also performed at different concentrations, to simulate progressively lower levels of MRD (“MRD-files”), as described below in this section.

Data Analysis

For manual (operator-dependent) data analysis, the INFINICYT software program was used. Briefly, during this process, total B cells were identified as those CD19+ events showing low to intermediate FSC and SSC values, after specifically excluding platelets and cell debris (11). Normal PB B cells were identified as being CD19+, CD20hi, CD22++, CD23−/lo, CD43−, CD79b+, FMC7+, CD103−, CD25−/lo, CD11c−/+, CD5−/lo, and either sIgκ+/sIgλ− or sIgλ+/sIgκ−. Neoplastic B cells were all other PB B-lymphocytes showing an aberrant phenotype. After this step, B-cells present in each sample were gated and the information about those events corresponding to B-cells was stored in a new data file.

For automated analysis, a Principal Component Analysis (PCA) Transformation was applied to each of the artificial “B-CLPD diagnostic-files” (33). In sequence, first we restricted our attention to the data projection into the space defined by the first versus second principal components. This PCA projection of diagnostic-files containing a 1/1 mixture of normal and neoplastic B-cells, systematically showed the presence in the data file of three clearly defined groups of B-cell events (Fig. 1). In fact, normal PB B-cells constantly displayed a bimodal distribution where two subpopulations defined by the isotype of the immunoglobulin light chain (sIg) expressed (sIgκ+ or sIgλ+), were detected; the third group of B-cell events corresponded to neoplastic B cells.
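A projection onto the first two principal components can be sketched with NumPy as shown below. This is a minimal illustration rather than the software actually used: the function name `pca_project` is hypothetical, the attributes are only mean-centered (whether any scaling was applied is not stated in the text), and the random input merely stands in for a 17-attribute data file.

```python
import numpy as np

def pca_project(events: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project events (rows) onto the leading principal components."""
    centered = events - events.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    return centered @ eigvecs[:, order]

rng = np.random.default_rng(0)
events = rng.normal(size=(1000, 17))  # placeholder for 17 measured attributes
projected = pca_project(events)
print(projected.shape)
```

Each row of `projected` holds the coordinates of one event in the first versus second principal component plane, which is the kind of bivariate view shown in Figure 1.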


Figure 1. Illustrative bidimensional dot plot projections, in the space defined by the first versus second principal components (PCA), of B-cell events contained in diagnostic-files in which neoplastic B-cells from four different representative individual patients were mixed at a 1:1 proportion with normal B-cell populations from a pool of PB samples from five healthy volunteers (“Normal-B-cell-pool” file). For these four files, the two normal B-cell populations and the neoplastic one fall well apart. In all bivariate plots, the normal sIg Kappa+ and sIg Lambda+ B-cell populations are displayed as green and blue events, respectively; in turn, those events corresponding to the neoplastic B-cells are painted as red events.


The goal of our automated data analysis strategy was first to split the two normal subpopulations (sIgκ+ or sIgλ+) for each of the 50 merged diagnostic-files, by using a classical k-means algorithm (34). Then, the normal B-cell population (e.g., blue and green events in Fig. 1C) which appeared to be localized closer to the neoplastic B-cells (e.g., red events in Fig. 1C) was identified for each “B-CLPD diagnostic-file” using the distance between the means of each B-cell population in the data file in ℜ¹⁷; the other, more distant normal B-cell population (e.g., blue events in Fig. 1C) was temporarily discarded. Accordingly, at this point, we ended up with two B-cell populations for each case: one of the two normal B-cell populations and that of neoplastic B-cells.
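The two steps described above (a k-means split of the normal B-cells, followed by selection of the normal population whose mean lies closest to the neoplastic mean) might be sketched as follows. The function names, the deterministic initialization, the Euclidean distance, and the synthetic 17-attribute data are assumptions made for illustration; the text does not specify these details.

```python
import numpy as np

def two_means(points, n_iter=100):
    """Plain k-means with k = 2; returns labels (0 or 1) and both centroids."""
    # Deterministic start: the two extreme events along the widest-range attribute
    axis = int(np.argmax(points.max(axis=0) - points.min(axis=0)))
    start = [points[:, axis].argmin(), points[:, axis].argmax()]
    centroids = points[start].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.stack([points[labels == k].mean(axis=0) for k in (0, 1)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def closest_normal_population(normal_events, neoplastic_events):
    """Keep the normal cluster whose mean is nearest to the neoplastic mean."""
    labels, centroids = two_means(normal_events)
    gaps = np.linalg.norm(centroids - neoplastic_events.mean(axis=0), axis=1)
    return normal_events[labels == int(gaps.argmin())]

rng = np.random.default_rng(1)
kappa = rng.normal(0.0, 0.3, size=(200, 17))       # stand-in for one normal population
lam = rng.normal(0.0, 0.3, size=(200, 17))
lam[:, 0] += 5.0                                   # second normal population, shifted
neoplastic = rng.normal(0.0, 0.3, size=(100, 17))
neoplastic[:, 0] += 8.0                            # neoplastic population, nearer to `lam`
kept = closest_normal_population(np.vstack([kappa, lam]), neoplastic)
print(kept.shape)
```

In this synthetic example the shifted normal population is retained, because its mean is closer to the neoplastic mean, while the other normal population would be set aside.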

Afterward, we calculated the mean and the covariance matrices for the two populations of normal B-cells (sIgκ+ or sIgλ+) as well as for each population of neoplastic B-cells from individual B-CLPD diagnostic data files (n = 50). On the basis of these results, we estimated the likelihood for both populations, i.e., the probability distribution functions (pdf) p(x|normal) and p(x|neoplastic). Here, p(x|normal) is the pdf of an event assuming a value x, given that the population is known to be normal (and not neoplastic). This strategy may be applied because well-established conventional approaches for the investigation of MRD in B-CLPD indicate that, in general, under MRD conditions after therapy, one should search for a neoplastic population with features similar to those observed at diagnosis (7–13). Accordingly, we assume that the pdfs p(x|normal) and p(x|neoplastic) remain unchanged from diagnosis to sequential follow-up PB samples obtained after therapy. Note that what one really wants to know is P(normal|x), i.e., the probability that an event belongs to the normal population after measuring the attributes of this event. This goal may be achieved by applying the Bayes theorem (35) as follows (1):

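$$P(\mathrm{normal}\mid x) = K\,p(x\mid \mathrm{normal})\,P(\mathrm{normal}), \qquad P(\mathrm{neoplastic}\mid x) = K\,p(x\mid \mathrm{neoplastic})\,P(\mathrm{neoplastic}) \tag{1}$$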

Here, K is a constant that makes P(normal|x) + P(neoplastic|x) = 1, and P(neoplastic) and P(normal) are the “a priori” probabilities of the two classes (neoplastic and normal B-cells, respectively). These reflect the probability of finding an event in one of the two classes before one is able to “see” the real data. In many applications, these a priori probabilities can be easily estimated from the relative frequencies of the classes in the sample. However, in the MRD setting, the “a priori” probabilities estimated in the “B-CLPD diagnostic-files” are to be applied not at diagnosis but after therapy, during the follow-up period, when the relative proportion between normal and neoplastic B-cells is expected to change and variably increase. This introduces a new challenge, because the number of neoplastic B-cell events is exactly what we are looking for. To overcome this difficulty, we introduced the following iterative procedure:

  • Step 1: Let $N_i$ denote the number of neoplastic B-cell events at iteration i. Set the counter i = 0 and initialize $N_0$ such that the frequency of neoplastic B-cell events is initially overestimated.
  • Step 2: Estimate P(neoplastic) based on $N_i$ and apply the Bayes theorem as in (1) to calculate P(neoplastic|x) and P(normal|x).
  • Step 3: Allocate all observations to the neoplastic or the normal B-cell population by applying the Optimal Bayesian Decision Rule (36): an observation x is assigned to the neoplastic B-cell population if P(neoplastic|x) > P(normal|x); otherwise, it is assigned to the normal population.
  • Step 4: Increase i and recalculate $N_i$ as the number of observations allocated to the neoplastic population.
  • Step 5: If $N_i \neq N_{i-1}$, go to Step 2; otherwise, STOP.
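The iterative procedure can be illustrated with the following sketch. Several points are assumptions not confirmed by the text: the class-conditional pdfs are modeled as multivariate Gaussians built from the estimated means and covariance matrices, the loop stops when the neoplastic count no longer changes (or reaches zero), and the function names and the three-attribute synthetic data are hypothetical.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal, evaluated at each row of x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (mean.size * np.log(2 * np.pi) + logdet + maha)

def iterative_bayes(x, normal_params, neoplastic_params, n_start=1000, max_iter=100):
    """Boolean mask of the events allocated to the neoplastic class."""
    log_normal = gaussian_logpdf(x, *normal_params)
    log_neoplastic = gaussian_logpdf(x, *neoplastic_params)
    n_neo = min(n_start, len(x) - 1)  # Step 1: deliberately overestimated count
    is_neo = np.zeros(len(x), dtype=bool)
    for _ in range(max_iter):
        prior_neo = n_neo / len(x)  # Step 2: prior from the current count
        # Step 3: Bayes decision rule; the constant K cancels in the comparison
        is_neo = (log_neoplastic + np.log(prior_neo)
                  > log_normal + np.log1p(-prior_neo))
        new_count = int(is_neo.sum())  # Step 4: recount neoplastic events
        if new_count == n_neo or new_count == 0:  # Step 5: stop when stable
            break
        n_neo = new_count
    return is_neo

rng = np.random.default_rng(0)
# "Diagnostic" data used only to estimate the mean and covariance of each class
train_normal = rng.normal(0.0, 1.0, size=(500, 3))
train_neo = rng.normal(8.0, 0.5, size=(500, 3))
params_normal = (train_normal.mean(axis=0), np.cov(train_normal, rowvar=False))
params_neo = (train_neo.mean(axis=0), np.cov(train_neo, rowvar=False))

# "MRD file": 20,000 normal events followed by 5 spiked-in neoplastic events
mrd_file = np.vstack([rng.normal(0.0, 1.0, size=(20000, 3)),
                      rng.normal(8.0, 0.5, size=(5, 3))])
mask = iterative_bayes(mrd_file, params_normal, params_neo)
print(int(mask.sum()), "events classified as neoplastic")
```

Working with log-densities avoids numerical underflow when the two populations are far apart, and comparing the two unnormalized posteriors makes it unnecessary to compute K explicitly.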

For all experiments, we started with an initial estimate of 1000 neoplastic B-cell events, assuming that one is seeking a residual neoplastic B-cell population smaller than this number. To test the sensitivity of the method in detecting progressively lower levels of MRD, we selected decreasing quantities of observations corresponding to neoplastic B-cells from each diagnostic data file and added these events to the pool of normal events (“normal-B-cell-pool data” file). Accordingly, for each of the 50 B-CLPD samples, files containing only the neoplastic B-cell events were used to randomly build 88 different sets of data with 1, 2, 3, …, 50, 60, 70, …, 300, 350, 400, …, 1000 neoplastic B-cell events. Each of these sets was merged with the “normal-B-cell-pool data” file to generate files containing progressively lower levels of MRD (“MRD-files”). Consequently, for each of the 50 patients, 88 “MRD-files” were generated containing known proportions of between 1 and 1000 neoplastic B cells in the pool of 5 × 10⁶ normal cells (MRD frequencies of between 2 × 10⁻⁴ and 2 × 10⁻⁷). The overall procedure run for each individual B-CLPD case is summarized in Figure 2.
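The electronic dilution and the resulting MRD frequencies can be sketched as below. The helper names `spike_in` and `mrd_frequency` are hypothetical, the events are toy tuples rather than real list mode data, and the frequency is computed relative to the 5 × 10⁶ normal-cell pool, as in the figures quoted above.

```python
import math
import random

NORMAL_POOL_EVENTS = 5_000_000  # size of the normal pool, as stated in the text

def spike_in(neoplastic_events, normal_events, n, rng):
    """Randomly pick n neoplastic events and append them to the normal events."""
    return normal_events + rng.sample(neoplastic_events, n)

def mrd_frequency(n_neoplastic, n_normal=NORMAL_POOL_EVENTS):
    """Neoplastic events per normal event in the pool."""
    return n_neoplastic / n_normal

rng = random.Random(0)
neoplastic = [("neo", i) for i in range(1000)]   # toy stand-ins for real events
normal = [("normal", i) for i in range(5000)]    # small pool, for illustration only
mrd_file = spike_in(neoplastic, normal, 10, rng)

lowest = mrd_frequency(1)      # 1 event in 5 × 10⁶
highest = mrd_frequency(1000)  # 1000 events in 5 × 10⁶
print(len(mrd_file), lowest, highest)
```

The two extremes reproduce the range of simulated MRD levels given in the text, from 2 × 10⁻⁷ to 2 × 10⁻⁴.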


Figure 2. Schematic flowchart of the overall procedure run for each individual B-CLPD sample/case.


Two measures of performance were used for the 50 B-CLPD cases. The first relates to the minimal number of true neoplastic B-cell events the system is able to identify in a total of 5 × 10⁶ events, i.e., the sensitivity of the method; for this purpose, we defined sensitivity as the minimum number of true neoplastic B-cell events added to the “normal-B-cell-pool data” file that is associated with the detection of more than 60% of the neoplastic B-cell events merged. The second measure relates to the level of agreement observed between the number of neoplastic B-cell events added to the “normal-B-cell-pool data” file and the corresponding number of neoplastic B-cell events actually detected by the proposed procedure in that specific merged data file. For each B-CLPD case, we compared the MRD dilution frequencies of neoplastic B-cell events with the number of neoplastic B-cells identified by the approach described here in each of the individual MRD-files, and calculated the degree of correlation between the two measures (Pearson correlation coefficients).
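Both performance measures are simple to compute once the added and detected counts are known for each MRD-file. The sketch below uses hypothetical function names and invented counts for a single case; it is meant only to make the two definitions concrete.

```python
import math

def sensitivity_level(added, detected, threshold=0.60):
    """Smallest number of spiked-in events for which more than `threshold`
    of them were recovered; None if no dilution qualifies."""
    qualifying = [a for a, d in zip(added, detected) if d > threshold * a]
    return min(qualifying) if qualifying else None

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented example: events added to the pool and true neoplastic events detected
added = [1, 2, 3, 5, 10, 50, 100, 1000]
detected = [0, 1, 2, 4, 10, 49, 99, 998]

level = sensitivity_level(added, detected)
r = pearson_r(added, detected)
print(level, round(r, 6))
```

In this invented case the sensitivity corresponds to 3 events in the pool of 5 × 10⁶ (6 × 10⁻⁷), because the dilutions with 1 and 2 events do not exceed the 60% recovery threshold.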



Overall, a high degree of correlation was observed between the number of diluted and the number of computationally identified neoplastic B-cells for all 50 B-CLPD cases included in this study. In most cases (45/50 cases; 90%), Pearson correlation coefficients (r²) higher than 0.999 were detected. Those five cases showing the lowest correlation coefficients (0.964 ≤ r² ≤ 0.999) are displayed in Figure 3. Of note, for all these latter five cases, bivariate projections of the first versus the second principal components (Fig. 3, Column A) showed a clear, partial overlap between the neoplastic B-cells and one of the normal B-cell populations in ℜ¹⁷ (either sIgκ+ or sIgλ+ normal B-cells).


Figure 3. Illustrative examples of those cases (N = 5) showing the lowest correlation (r² ≤ 0.999) between the number of neoplastic B-cells identified and the number of neoplastic B-cells actually present in the “MRD-files” corresponding to each case. In Column A, bivariate dot plot projections of the first versus second PCA are shown for neoplastic as well as for normal B-cells from each patient (MRD data files). In Columns B and C, correlation plots are shown for the same cases (Column B), highlighting the region corresponding to the lowest dilutions, which is outlined in the Column B plots (Column C). In Column D, the sensitivity level specifically obtained for each case is displayed.


Regarding sensitivity, in most cases (40/50 cases; 80%), MRD detection at levels as low as 1 event in 5 × 10⁶ normal PB cells (2 × 10⁻⁷) was achieved. Interestingly, those five cases showing coefficients of correlation (r²) ≤ 0.999, which are represented in Figure 3, were among those 10 cases that showed a lower sensitivity, between 8 × 10⁻⁵ (0.008%) and 6 × 10⁻⁶ (0.0006%) (Fig. 3, Column D). The other five cases, in which the sensitivity achieved was >2 × 10⁻⁷ and the correlation coefficients (r²) were ≥ 0.999, showed sensitivity levels of between 6 × 10⁻⁶ (0.0006%) and 6 × 10⁻⁷ (0.00006%) (Fig. 4). Accordingly, overall sensitivity was >1 × 10⁻⁶ in only 7 cases (14%), with an upper limit of 8 × 10⁻⁵. Of note, cases with a lower sensitivity (of between 8 × 10⁻⁵ and 6 × 10⁻⁷) corresponded to three cases of MCL (3/8 patients), three SMZL (3/3 cases), one FL (1/1 patient), and two B-CLL (2/31 cases).


Figure 4. Illustrative examples of those cases (N = 5) showing correlation coefficients (r²) > 0.999 between the number of neoplastic B-cells identified and the number of neoplastic B-cells actually present in the “MRD-files,” but a sensitivity level ≤ 2 × 10⁻⁷. In Column A, bivariate dot plot projections of the first versus second PCA are shown for neoplastic as well as for normal B-cells from each patient (MRD data files). In Columns B and C, correlation plots are shown for the same cases (Column B), highlighting the region corresponding to the lowest dilutions, which is outlined in the Column B plots (Column C). In Column D, the sensitivity level specifically obtained for each case is displayed.


To evaluate the specificity for MRD detection of the proposed approach, the same strategy was applied to each “MRD-file” after subtracting all neoplastic B-cell events in the PCA projection. The aim was to verify whether any event corresponding to normal B-cells would then be erroneously classified as a neoplastic B-cell event. Of note, in most cases (43/50 cases; 86%), no events were wrongly identified as belonging to the neoplastic B-cell populations. In the remaining seven patients, only one (3/50 cases; 6%), four (2/50 cases; 4%), or at most five (2/50 cases; 4%) out of 5 × 10⁶ events were wrongly identified as belonging to the neoplastic B-cell population (≤1 × 10⁻⁶).
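The specificity figures above reduce to simple ratios, which can be checked with exact fractions. The variable names are hypothetical; the counts are those reported in the preceding paragraph.

```python
from fractions import Fraction

POOL_EVENTS = 5_000_000  # events per file, as stated in the text

# Number of misclassified normal events -> number of cases with that count
cases_by_false_positives = {0: 43, 1: 3, 4: 2, 5: 2}

total_cases = sum(cases_by_false_positives.values())
clean_fraction = Fraction(cases_by_false_positives[0], total_cases)
worst_rate = Fraction(max(cases_by_false_positives), POOL_EVENTS)

print(total_cases, float(clean_fraction), float(worst_rate))
```

The worst case, five misclassified events in 5 × 10⁶, corresponds to a false-positive frequency of exactly 1 × 10⁻⁶, consistent with the bound quoted above.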

To validate the method in patients with resistant disease as well as real MRD cases, PB samples from three different B-CLL patients were sequentially evaluated at diagnosis and after treatment. In all three cases, with the probabilistic approach here proposed, we were able to identify the presence of neoplastic B-cells after therapy (Fig. 5). For individual cases, 99%, 97%, and 92% of all neoplastic B-cells present in the follow-up samples according to an expert operator were correctly identified with the automated probabilistic approach. Furthermore, in all three cases, no events corresponding to normal B-cells as defined by an expert operator were wrongly classified as neoplastic B-cells. The specific percentages detected in the follow-up samples from these patients by an experienced operator versus the automated probabilistic approach here proposed were of 0.45%, 0.11%, and 48.9% vs. 0.45%, 0.1%, and 45%, respectively.


Figure 5. Performance of the proposed probabilistic approach in real PB samples with resistant disease and MRD obtained from three patients with B-CLPD after therapy (n = 3). In the left column (panels A, C, and E), bivariate projections in the first versus second PCA of the neoplastic B-cell populations from the three different B-CLPD patients at diagnosis and prior to therapy, merged with the “normal-B-cell-pool data,” are shown. In the right column (panels B, D, and F), projections in the first versus second PCA of all B-cell populations detected after therapy in the MRD samples from each of the three cases are displayed. Red events were classified by the proposed approach as neoplastic B-cells, whereas blue events were classified as normal B-cells. Of note, fewer B-cells were present in follow-up than in diagnostic samples (39%, 35.4%, and 88% at diagnosis versus 0.45%, 0.1%, and 45% at follow-up, for cases 51, 52, and 53, respectively), with even fewer (almost undetectable) normal residual mature B-lymphocytes due to chemotherapy-induced lymphopenia.




In the last decade, MRD detection by flow cytometry has been increasingly used to monitor response to therapy (5–11, 37). Among other advantages over molecular approaches used for MRD detection, flow cytometry is based on a relatively rapid and simple interrogation of hundreds of thousands to millions of cells, information being collected for each individual cell measured (1, 5, 27). In addition, it can be applied to the great majority of all leukemia/lymphoma patients (1, 5, 11, 37). Finally, it has been shown that conventional flow cytometry approaches for MRD detection are highly sensitive, allowing for the identification of down to around 1 × 10⁻⁴ (0.01%) neoplastic B-cells among a major population of normal PB and BM hematopoietic cells (1, 5, 7, 8, 38). However, it should be highlighted that for MRD investigation by flow cytometry, an expert operator with extensive and detailed knowledge about the patterns of protein expression associated with normal versus neoplastic B-cells is typically required for adequate data analysis (1, 5, 11, 28). This, together with the variable patterns of phenotypic aberrations detected among different subtypes of B-CLPD, has limited the establishment and implementation of standardized data analysis procedures that would facilitate the extended use of standardized flow cytometry MRD approaches in routine clinical diagnostic laboratories.

Here, we propose the use of a probabilistic approach for the evaluation of MRD by multiparameter flow cytometry in B-CLPD, through automated analysis of patient data files measured at diagnosis and the dilution of neoplastic B-cell events, at increasingly lower concentrations, into data files containing information about normal PB B-cells. Overall, our results show that this approach can be applied to virtually all B-CLPD with both a high sensitivity and specificity, whenever the search for minimal numbers of neoplastic B-cells similar to those detected at diagnosis is required. Accordingly, the strategy here proposed reached a sensitivity of 2 × 10⁻⁷ in 80% of the cases. Such sensitivity is significantly greater than the best sensitivity described in the literature for MRD by flow cytometry (1 × 10⁻⁵) in B-CLPD (7–13). Of note, the lower sensitivity displayed by the remaining cases, between 8 × 10⁻⁵ (0.008%) and 6 × 10⁻⁷ (0.00006%), could potentially be improved with additional markers. Such markers should be capable of increasing the differences already observed at diagnosis between normal and neoplastic B-cells. Inclusion of markers aimed at detecting aberrant B-cell phenotypes in SMZL, MCL, and FL (e.g., bcl2 and CD10) (39) would be particularly useful. To a great extent, this high sensitivity was reached because of the high specificity achieved with the proposed approach, because aberrant phenotypes were detected in most cases, in which no normal B-cell events were misclassified. Of note, with the automated approach proposed, information about the specificity and sensitivity associated with each patient is obtained already at diagnosis, to be used later on for MRD evaluation. This is particularly important because it is not possible to base the a priori evaluation on the prevalence of each cell population at diagnosis as in Tosetto et al. (40); in this regard, in our study we introduced a scheme that provides an iterative estimation for the a priori probability. However, it should be noted that normal control B-cells were obtained from a relatively small number of healthy adults; thus, further studies evaluating the efficiency of the approach here proposed in larger series of patients, where PB B-cells from reactive conditions and non-neoplastic disorders are also included as control B-lymphocytes, are still necessary to confirm our results.

The proposed approach is aimed at identifying MRD in B-CLPD based on the probability of each individual event measured to correspond to either a normal or neoplastic B-cell. In most cases, one event was sufficient to establish the presence of MRD. Nowadays, with conventional four-color approaches for MRD detection, ≥10 events presenting similar immunophenotypic features are required to define presence of MRD; in addition, around 100 events are necessary to precisely quantify their percentage among other cells in a sample (1–3). By increasing the number of parameters simultaneously measured, we also decreased the number of events required to define the presence of an abnormal cell population down to five events (maximum number of misclassified normal B-cells in a sample) for a 100% efficiency. Furthermore, each event identified as corresponding to a neoplastic B-cell by the probabilistic approach can be further examined by an expert operator using conventional methods of manual data analysis, based on the visualization of multiple bidimensional plots. In this way, the expert operator can use the established knowledge about the phenotypic features displayed by normal B-cells and the aberrant patterns of protein expression found in B-CLPD, to evaluate the consistency of the probabilistic approach here proposed, in real individual cases. Interestingly, similar results were obtained by diluting neoplastic B-CLL samples in normal PB (n = 2), prior to staining, instead of using electronic dilution of events corresponding to neoplastic B-cells in a pool of electronic events corresponding to normal PB B-cells (data not shown).

A potential limitation of the strategy proposed here is that, after therapy, neoplastic cells from B-CLPD patients may display variations in their immunophenotypic attributes with respect to those observed at diagnosis. Previous reports have shown that this may actually occur in a few B-CLPD patients (40–42). In such cases, the proposed approach could be associated with false negative results. Because of this, we tested the probabilistic approach by analyzing a few cases where both diagnostic and MRD samples from the same patient were available, confirming the reproducibility of the approach in real MRD samples. Further studies are required to investigate the utility of this strategy to detect infiltration by B-CLPD of tissues other than PB (e.g., bone marrow, cerebrospinal fluid) for a more reproducible staging of the disease (1). In line with this, the proposed probabilistic approach could also be useful in the evaluation of MRD in other hematological malignancies. This would be particularly useful in acute myeloid leukemias, where manual data analysis is much more complex owing to the coexistence of different abnormal cell populations in the same sample, a marked phenotypic heterogeneity (6, 28), and the effects of therapy (43).

In summary, here we propose and evaluate a probabilistic approach aimed at automating and standardizing the search for MRD, based on the immunophenotypic features of neoplastic cells observed at diagnosis in B-CLPD. Overall, the proposed strategy was associated with a higher specificity and sensitivity than previously reported for expert-based manual data analysis approaches, which points to its potential utility in other minimal disease situations in both B-CLPD and other hematological malignancies.


  • 1
    Davis BH, Holden JT, Bene MC, Borowitz MJ, Braylan RC, Cornfield D, Gorczyca W, Lee R, Maiese R, Orfao A, Wells D, Wood BL, Stetler-Stevenson M. 2006-Bethesda International Consensus recommendations on the flow cytometric immunophenotypic analysis of hematolymphoid neoplasia: Medical indications. Cytometry Part B 2007; 72B: S5–S13.
  • 2
    Ruiz-Argüelles A, Rivadeneyra-Espinoza L, Duque RE, Orfao A, Latin American Consensus Conference. Report on the second Latin American consensus conference for flow cytometric immunophenotyping of hematological malignancies. Cytometry Part B 2006; 70B: 39–44.
  • 3
    Stetler-Stevenson M. H43-A2 Clinical Flow Cytometric Analysis of Neoplastic Hematolymphoid Cells; Approved Guideline, 2nd ed. Clinical and Laboratory Standards Institute: Wayne, PA; 2007.
  • 4
    Edwards BS, Oprea T, Prossnitz ER, Sklar LA. Flow cytometry for high-throughput, high-content screening. Curr Opin Chem Biol 2004; 8: 392–398.
  • 5
    Szczepański T, Orfão A, van der Velden VH, San Miguel JF, van Dongen JJ. Minimal residual disease in leukaemia patients. Lancet Oncol 2001; 2: 409–417.
  • 6
    MRD-AML-BFM Study Group, Langebrake C, Creutzig U, Dworzak M, Hrusak O, Mejstrikova E, Griesinger F, Zimmermann M, Reinhardt D. Residual disease monitoring in childhood acute myeloid leukemia by multiparameter flow cytometry: The MRD-AML-BFM Study Group. J Clin Oncol 2006; 24: 3686–3692.
  • 7
    Sayala HA, Rawstron AC, Hillmen P. Minimal residual disease assessment in chronic lymphocytic leukaemia. Best Pract Res Clin Haematol 2007; 20: 499–512.
  • 8
    Rawstron AC, Villamor N, Ritgen M, Böttcher S, Ghia P, Zehnder JL, Lozanski G, Colomer D, Moreno C, Geuna M, Evans PA, Natkunam Y, Coutre SE, Avery ED, Rassenti LZ, Kipps TJ, Caligaris-Cappio F, Kneba M, Byrd JC, Hallek MJ, Montserrat E, Hillmen P. International standardized approach for flow cytometric residual disease monitoring in chronic lymphocytic leukaemia. Leukemia 2007; 21: 956–964.
  • 9
    Moreno C, Villamor N, Colomer D, Esteve J, Giné E, Muntañola A, Campo E, Bosch F, Montserrat E. Clinical significance of minimal residual disease, as assessed by different techniques, after stem cell transplantation for chronic lymphocytic leukemia. Blood 2006; 107: 4563–4569.
  • 10
    Moreton P, Kennedy B, Lucas G, Leach M, Rassam SM, Haynes A, Tighe J, Oscier D, Fegan C, Rawstron A, Hillmen P. Eradication of minimal residual disease in B-cell chronic lymphocytic leukemia after alemtuzumab therapy is associated with prolonged survival. J Clin Oncol 2005; 23: 2971–2979.
  • 11
    Montillo M, Tedeschi A, Miqueleiz S, Veronese S, Cairoli R, Intropido L, Ricci F, Colosimo A, Scarpati B, Montagna M, Nichelatti M, Regazzi M, Morra E. Alemtuzumab as consolidation after a response to fludarabine is effective in purging residual disease in patients with chronic lymphocytic leukemia. J Clin Oncol 2006; 24: 2337–2342.
  • 12
    Montillo M, Schinkoethe T, Elter T. Eradication of minimal residual disease with alemtuzumab in B-cell chronic lymphocytic leukemia (B-CLL) patients: The need for a standard method of detection and the potential impact of bone marrow clearance on disease outcome. Cancer Invest 2005; 23: 488–496.
  • 13
    Sanchez ML, Almeida J, Vidriales B, López-Berges MC, Garcia-Marcos MA, Moro MJ, Corrales A, Calmuntia MJ, San Miguel JF, Orfao A. Incidence of phenotypic aberrations in a series of 467 patients with B chronic lymphoproliferative disorders: Basis for the design of specific four-color stainings to be used for minimal residual disease investigation. Leukemia 2002; 16: 1460–1469.
  • 14
    Andersson R, Bruder CE, Piotrowski A, Menzel U, Nord H, Sandgren J, Hvidsten TR, Diaz de Ståhl T, Dumanski JP, Komorowski J. A Segmental Maximum A Posteriori Approach to Genome-wide Copy Number Profiling. Bioinformatics 2008; 24: 751–758.
  • 15
    Li F, Yang Y. Analysis of recursive gene selection approaches from micro-array data. Bioinformatics 2005; 21: 3741–3747.
  • 16
    Griffith M, Tang MJ, Griffith OL, Morin RD, Chan SY, Asano JK, Zeng T, Flibotte S, Ally A, Baross A, Hirst M, Jones SJ, Morin GB, Tai IT, Marra MA. ALEXA: A microarray design platform for alternative expression analysis. Nat Methods 2008; 5: 118.
  • 17
    Schlabach MR, Luo J, Solimini NL, Hu G, Xu Q, Li MZ, Zhao Z, Smogorzewska A, Sowa ME, Ang XL, Westbrook TF, Liang AC, Chang K, Hackett JA, Harper JW, Hannon GJ, Elledge SJ. Cancer proliferation gene discovery through functional genomics. Science 2008; 319: 620–624.
  • 18
    Perfetto SP, Chattopadhyay PK, Roederer M. Seventeen-colour flow cytometry: Unravelling the immune system. Nat Rev Immunol 2004; 4: 648–655.
  • 19
    Roederer M, Brenchley JM, Betts MR, De Rosa SC. Flow cytometric analysis of vaccine responses: How many colors are enough? Clin Immunol 2004; 110: 199–205.
  • 20
    Robinson JP, Durack G, Kelley S. An innovation in flow cytometry data collection and analysis producing a correlated multiple sample analysis in a single file. Cytometry 1991; 12: 82–90.
  • 21
    Robinson JP, Ragheb K, Lawler G, Kelley S, Durack G. Rapid multivariate analysis and display of cross-reacting antibodies on human leukocytes. Cytometry 1992; 13: 75–82.
  • 22
    Costa ES, Arroyo ME, Pedreira CE, García-Marcos MA, Tabernero MD, Almeida J, Orfao A. A new automated flow cytometry data analysis approach for the diagnostic screening of neoplastic B-cell disorders. Leukemia 2006; 20: 1221–1230.
  • 23
    Kitsos CM, Bhamidipati P, Melnikova I, Cash EP, McNulty C, Furman J, Cima MJ, Levinson D. Combination of automated high throughput platforms, flow cytometry, and hierarchical clustering to detect cell state. Cytometry A 2007; 71A: 16–27.
  • 24
    Zeng QT, Pratt JP, Pak J, Ravnic D, Huss H, Mentzer SJ. Feature-guided clustering of multi-dimensional flow cytometry datasets. J Biomed Inform 2007; 40: 325–331.
  • 25
    Pedreira CE, Costa ES, Arroyo ME, Almeida J, Orfao A. A Multidimensional Classification Approach for the Automated Analysis of Flow Cytometry Data. IEEE Transactions on Biomedical Engineering 2008; 55: 1155–1162.
  • 26
    Pedreira CE, Costa ES, Barrena S, Lecravisse Q, Almeida J, van Dongen JJ, Orfao A. Generation of flow cytometry data files with a potentially infinite number of dimensions. Cytometry Part A 73A: 834–846.
  • 27
    Orfao A, Schmitz G, Brando B, Ruiz-Arguelles A, Basso G, Braylan R, Rothe G, Lacombe F, Lanza F, Papa S, Lucio P, San Miguel JF. Useful information provided by the flow cytometric immunophenotyping of hematological malignancies: Current status and future directions. Clin Chem 1999; 45: 1708–1717.
  • 28
    Olaru D, Campos L, Flandrin P, Nadal N, Duval A, Chautard S, Guyotat D. Multiparametric analysis of normal and postchemotherapy bone marrow: Implication for the detection of leukemia-associated immunophenotypes. Cytometry B 2008; 74B: 17–24.
  • 29
    Duda RO, Hart PE, Stork GD. Bayesian Decision Theory. In: Duda RO, Hart PE, Stork GD, editors. Pattern Classification, 2nd ed. New York: Wiley; 2001. pp 20–82.
  • 30
    Harris NL, Jaffe ES, Diebold J, Flandrin G, Muller-Hermelink HK, Vardiman J, Lister TA, Bloomfield CD. World Health Organization classification of neoplastic diseases of the hematopoietic and lymphoid tissues: Report of the Clinical Advisory Committee Meeting—Airlie House, Virginia, November 1997. J Clin Oncol 1999; 17: 3835–3849.
  • 31
    Orfao A, Pedreira CE, Costa ES. A method for generating flow cytometry data files containing an infinite number of dimensions based on data estimation. U.S. Pat. No. 11/240,167; 2007.
  • 32
    Orfao A, Pedreira CE, Costa ES. Generation of flow cytometry data files with a potentially infinite number of dimensions derived from the fusion of a group of separate flow cytometry data files and their multidimensional reconstruction with both actually measured and estimated flow cytometry data. Eur. Pat. No. EP1,770,387; 2007.
  • 33
    Duda RO, Hart PE, Stork GD. Maximum-likelihood and Bayesian parameter estimation. In: Duda RO, Hart PE, Stork GD, editors. Pattern Classification, 2nd ed. New York: Wiley; 2001. pp 115–116.
  • 34
    Duda RO, Hart PE, Stork GD. Unsupervised learning and clustering. In: Duda RO, Hart PE, Stork GD, editors. Pattern Classification, 2nd ed. New York: Wiley; 2001. pp 526–527.
  • 35
    Spidlen J, Gentleman RC, Haaland PD, Langille M, Le Meur N, Ochs MF, Schmitt C, Smith CA, Treister AS, Brinkman RR. Data standards for flow cytometry. OMICS 2006; 10: 209–214.
  • 36
    Kern W, Haferlach C, Haferlach T, Schnittger S. Monitoring of minimal residual disease in acute myeloid leukemia. Cancer 2008; 112: 4–16.
  • 37
    Coustan-Smith E, Sancho J, Hancock ML, Boyett JM, Behm FG, Raimondi SC, Sandlund JT, Rivera GK, Rubnitz JE, Ribeiro RC, Pui CH, Campana D. Clinical importance of minimal residual disease in childhood acute lymphoblastic leukemia. Blood 2000; 96: 2691–2696.
  • 38
    Lucio P, Gaipa G, van Lochem EG, van Wering ER, Porwit-MacDonald A, Faria T, Bjorklund E, Biondi A, van den Beemd MW, Baars E, Vidriales B, Parreira A, van Dongen JJ, San Miguel JF, Orfao A; BIOMED-I. BIOMED I concerted action report: Flow cytometric immunophenotyping of B-ALL with standardized triple-stainings. Leukemia 2001; 15: 1185–1192.
  • 39
    Menendez P, Vargas A, Bueno C, Barrena S, Almeida J, De Santiago M, López A, Roa S, San Miguel JF, Orfao A. Quantitative analysis of bcl-2 expression in normal and leukemic human B-cell differentiation. Leukemia 2004; 18: 491–498.
  • 40
    Tosetto A, Castaman G, Rodeghiero F. Evidence-based diagnosis of type 1 von Willebrand disease: A Bayes theorem approach. Blood 2008; 111: 3998–4003.
  • 41
    Kroft SH, Dawson DB, McKenna RW. Large cell lymphoma transformation of chronic lymphocytic leukemia/small lymphocytic lymphoma. A flow cytometric analysis of seven cases. Am J Clin Pathol 2001; 115: 385–395.
  • 42
    Späth-Schwalbe E, Flath B, Kaufmann O, Thiel G, Brinckmann R, Dietel M, Possinger K. An unusual case of leukemic non-Hodgkin's lymphoma with blastic transformation. Ann Hematol 2000; 79: 217–221.
  • 43
    van Lochem EG, Wiegers YM, van den Beemd R, Hählen K, van Dongen JJ, Hooijkaas H. Regeneration pattern of precursor-B-cells in bone marrow of acute lymphoblastic leukemia patients depends on the type of preceding chemotherapy. Leukemia 2000; 14: 688–695.