The analysis of cells from multiple experimental groups by multiparameter flow cytometry leads to the generation of complex data sets, for which adequate analysis tools are not commonly available. We report here that software designed for transcriptomics applications can be used in multiparameter flow cytometry.
Lymphocytes isolated from nine different mouse organs were stained and subjected to 10-parameter flow cytometry. The resulting data set contained 594 different T cell subsets per organ per mouse and was organized into a so-called flow cytometry array (FCA).
Computation of a hierarchical tree revealed that lymph nodes and spleen were populated by similar T cell subsets, while T cells from peripheral organs displayed a diverse subset composition. Furthermore, organ-specific T cell subsets were identified.
The recent introduction of benchtop multicolor flow cytometers and the availability of a large array of fluorochromes have led to the development of up to 17-color multiparameter flow cytometry (1). However, the analysis of the complex data sets generated by this technology is hampered by the lack of appropriate, commonly available tools (1). In current data analyses, two parameters are plotted against each other in a so-called dot plot, followed by gating of the population(s) of interest (bivariate gating) and further two-dimensional plotting of the cells belonging to the gated population(s). After a certain level of complexity is reached, populations of interest are usually not sub-gated further. Instead, multiple parameters are plotted side by side in consecutive dot plots against the same reference parameter (e.g., reference (2)). From this point on, the information gathered in the experiments increases linearly with the number of parameters, although the true information content of the data increases exponentially with the number of parameters. Therefore, in most multiparameter flow cytometry experiments reported to date, potentially important information might have been ignored. Progress towards solving this problem has been made by automated clustering algorithms (1), with which populations of interest can be identified in the multi-dimensional flow cytometric space (3–5). Using the probability binning approach of Roederer et al. (3, 4), individual samples can be compared (6); however, the comparison of replicates of multiple experimental groups remains a major problem.
The challenge associated with the analysis of complex data sets is not restricted to flow cytometry. Since the introduction of DNA-chip microarrays (7), software has been developed to manage the large amount of data generated in experiments using these DNA-chips. We exploited the statistical power of chip analysis software to analyze the relationships between T cell populations present in mouse organs. Using 10-parameter flow cytometry, CD4 and CD8 T cells isolated from nine different mouse tissues were characterized by the expression of markers associated with different T cell lineages or stages of differentiation (Tables 1 and 2). To ensure the analysis of the complete information content of the samples, we calculated the fraction of each theoretically definable subset, which resulted in 594 different populations per organ and mouse. This number of populations cannot be searched manually in a comprehensive way for subsets that are differentially overrepresented in different organs. However, analysis with software originally designed for DNA-chip analysis quickly revealed the relationships of organ-specific T cell populations using hierarchical clustering. Furthermore, analysis of variance (ANOVA) led to the identification of subsets significantly overrepresented in individual organs.
Table 1. Parameters Acquired by Flow Cytometry
Isolated cells were stained in four parallel stainings (columns) with antibodies against several surface antigens and DAPI for dead cell exclusion. Parameters 1–4, 8, and 10 were used for T cell identification; parameters 5–7 and 9 were used for T cell characterization.
Table 2. Classification of T cell Subpopulations by Selected Surface Markers
Associated with subset
The antibody (clone 1B11) identifying effector cells detects the heavy, strongly glycosylated isoform.
Female C57BL/6 mice at the age of 8 weeks were obtained from Charles River (Sulzfeld, Germany). Mice were housed in individually ventilated cages under specific pathogen-free (SPF) conditions. All animal experiments were performed in compliance with governmental and institutional guidelines.
Isolation of Lymphocytes and Flow Cytometry
Lymphocytes from blood, liver, lung, kidneys, femurs, Peyer's patches, spleen, peripheral (inguinal and axillary), and mesenteric lymph nodes were isolated essentially as described (26, 28). Isolated cells were stained using the following antibodies in four parallel stainings per sample: anti-CD4-APC.Cy7 (GK1.5), hCD43-PE (1B11), CD45RB (16A), CD69-FITC (H1.2F3), Ly6-C/G-FITC (Gr-1, all from BD Becton Dickinson, Basel, Switzerland), CD8α-PE.Cy7 (53-6.7), CD44-PE.Cy5.5 (IM7), CD25-APC (PC61, all from eBioscience, San Diego, CA, USA), CD62L-PE (Mel-14, Caltag, Burlingame, CA, USA), anti-CCR2-Alexa488 (MC21 (29), labeled with an antibody labeling kit (Invitrogen, Basel, Switzerland)), CCR5-biotin (MC68 (29), labeled with an antibody labeling kit (Roche)), CXCR3 (220803, R&D Systems, Minneapolis, MN). Biotin-labeled antibody was detected with streptavidin (SAv)-PE (BD) or SAv-Alexa488 (Invitrogen, Basel, Switzerland), and unlabeled antibody with APC-labeled anti-rat antiserum, followed by blocking of the free binding sites of the secondary antibody with rat serum before addition of other antibodies. In one of the stainings, CD90.2-bio (53-2.1, BD) plus SAv-Alexa350 (Invitrogen) was used to improve T cell identification for the determination of absolute cell numbers. Dead cells were excluded by staining with DAPI (Invitrogen). The parameters examined in each staining are given in Table 1. Data were acquired on an LSR II flow cytometer (BD), and primary analysis was performed with FACSDiva Software 4.1 (BD) by separating populations into cells positive or negative for a particular antigen according to all-fluorescences-minus-one controls using hinged quadrants (30). If an antigen was expressed at either high or low/intermediate levels (i.e. hCD43, CD44, CD45RB, CD62L), the best possible unbiased separation was achieved by tethering quadrants to the populations expressing high antigen levels, followed by batch analysis of all samples originating from the same organ.
Secondary analysis steps were performed with Excel (Microsoft) and GeneSpring (Silicon Genetics, Redwood City, CA) software.
Fractions of individual T cell subsets were compared between different organs by one-way ANOVA assuming equal variances, because few replicates were performed (n = 3). Subsets were considered differentially represented in individual organs if P ≤ 0.05. The Benjamini and Hochberg false discovery rate correction was applied as multiple testing correction. This correction is the least stringent method and was used to avoid false negatives. Furthermore, populations identified as differentially represented in individual organs were considered reliable if they consisted of several related subsets.
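The false discovery rate correction named above can be illustrated with a minimal Python sketch of the Benjamini and Hochberg step-up procedure. The function name `benjamini_hochberg` and the raw P values are hypothetical, and applying the P ≤ 0.05 cutoff to the adjusted values is an assumption; the text does not state how the commercial software combines the two.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw ANOVA p-values for five subsets (illustrative only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
# Assumption: the 0.05 cutoff is applied to the adjusted values.
significant = [p <= 0.05 for p in adj_p]
print(adj_p, significant)
```

In this made-up example the two subsets with raw P values just below 0.05 no longer pass after correction, which shows why even the least stringent correction changes the list of reported subsets.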
Identification of T Cell Subpopulations
CD4 and CD8 lymphocytes isolated from nine different mouse tissues were characterized simultaneously by combinations of 4 out of 10 different antibodies against T cell surface markers, e.g. CD44, CD62L, CD25, and CD69 (Fig. 1), in four parallel stainings. Absolute numbers of populations defined by the presence or absence of each of the four characterizing surface markers were obtained by intersecting the populations defined by two markers, as illustrated in Figure 1a, e.g. CD44hi, CD62Llo, CD25−, and CD69+ cells were displayed in quadrants “D” and “1.” In total, 16 populations can be defined by four characterizing surface markers, i.e. “A1”-“D4” in Figure 1, and the fraction of each subpopulation within total CD4 or CD8 T cells was calculated. Subdivision of total T cell numbers into many small populations bears the risk that differences between organ-specific populations might be missed, because low event numbers might erroneously result in statistically nonsignificant differences between subsets (see below). Therefore, we also calculated the numbers and percentages of all populations defined by the presence or absence of one, two, or three surface markers. For one surface marker there are two populations definable, i.e. cells positive or negative for this marker, corresponding to CD44hi and CD44lo cells in Figure 1b. Since four surface markers were examined simultaneously, this led to $4 \times 2^1 = 8$ populations per staining. For two surface markers, there are $2^2 = 4$ populations definable, i.e. in Figure 1b all CD44/CD62L combinations; since the number of combinations for two out of four markers is 6, this results in $6 \times 2^2 = 24$ populations. Characterization of populations by three surface markers yields $4 \times 2^3 = 32$ populations. For both CD4 and CD8 cells the total number of populations definable is the sum of all these numbers, which is 16 + 8 + 24 + 32 = 80 populations for each staining.
For the other three antibody combinations analogous calculations were performed, which finally resulted in 4 × (80 + 80) = 640 CD4 and CD8 populations, for which absolute numbers and percentages were calculated.
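The counting argument can be checked with a short Python sketch. The variable names are illustrative; the only inputs taken from the text are four characterizing markers per staining, two states per marker, four stainings, and two lineages (CD4 and CD8).

```python
from math import comb

n_markers = 4  # characterizing markers per staining

# Populations defined by exactly k markers: choose k of the 4 markers,
# then each chosen marker is either positive or negative (2**k states).
per_k = {k: comb(n_markers, k) * 2 ** k for k in range(1, n_markers + 1)}

per_staining = sum(per_k.values())             # per lineage and staining
total = 4 * (per_staining + per_staining)      # 4 stainings, CD4 plus CD8

print(per_k)         # {1: 8, 2: 24, 3: 32, 4: 16}
print(per_staining)  # 80
print(total)         # 640
```

The per-staining sum equals 3^4 − 1, because each marker can be positive, negative, or left out of the definition, and the single case in which all markers are left out is not a subset.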
In practice, total numbers of cells belonging to each of the 16 populations defined by all four characterizing surface markers were obtained by intersecting the events in each of the quadrants on the left panel in Figure 1a with those on the right panel. These numbers were then added in different combinations to obtain the events recorded for a particular population. For example, the total number of CD4+CD62LloCD69+ cells was calculated by adding the events in the quadrant intersections “C1” + “C2” + “D1” + “D2.” Subsequently, this total number was used to compute the percentage of CD4+CD62LloCD69+ cells among all CD4 cells.
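The summation of quadrant intersections can be sketched in Python as follows. The event counts are invented, and only part of the quadrant layout is given in the text (“D” is CD44hi CD62Llo, “C” and “D” are CD62Llo, “1” is CD25− CD69+, “1” and “2” are CD69+); the assignment of the remaining quadrants “A”, “B”, “3”, and “4” is an assumption made for illustration.

```python
# Left panel quadrants: (CD44, CD62L) status. "A"/"B" assignment is assumed.
LEFT = {"A": ("lo", "hi"), "B": ("hi", "hi"), "C": ("lo", "lo"), "D": ("hi", "lo")}
# Right panel quadrants: (CD25, CD69) status. "3"/"4" assignment is assumed.
RIGHT = {"1": ("-", "+"), "2": ("+", "+"), "3": ("+", "-"), "4": ("-", "-")}

# Hypothetical event counts for the 16 intersections: A1=50, A2=100, ..., D4=800.
cells = [left + right for left in "ABCD" for right in "1234"]
counts = {cell: 50 * (i + 1) for i, cell in enumerate(cells)}


def subset_count(counts, cd44=None, cd62l=None, cd25=None, cd69=None):
    """Sum the intersections that match every marker state that is specified.

    A marker left as None is not used to define the subset.
    """
    total = 0
    for left, (s44, s62l) in LEFT.items():
        for right, (s25, s69) in RIGHT.items():
            wanted = ((cd44, s44), (cd62l, s62l), (cd25, s25), (cd69, s69))
            if all(w is None or w == s for w, s in wanted):
                total += counts[left + right]
    return total


all_cd4 = subset_count(counts)                           # all 16 intersections
cd62l_lo_cd69_pos = subset_count(counts, cd62l="lo", cd69="+")  # C1 + C2 + D1 + D2
percent = 100.0 * cd62l_lo_cd69_pos / all_cd4
print(cd62l_lo_cd69_pos, round(percent, 1))
```

Encoding the quadrant layout once and selecting intersections by marker state avoids writing a separate sum for each of the 80 subsets per staining.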
Generation of the Flow Cytometry Array and Flow Cytome Analysis
Following the calculation of the percentage of each population isolated from each organ, GeneSpring microarray analysis software was used to compare the subset structure between individual organs. For this evaluation, each population was regarded as a “gene” and the corresponding frequency (in percent) as the “signal intensity of the gene.” Data were imported into GeneSpring without further normalization. Since some antibodies were used in more than one of the four parallel stainings, the initial number of 640 T cell populations definable per organ was reduced to 594. The percentages with which these replicate populations were represented in the individual stainings were averaged during the import of the data into GeneSpring.
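The averaging of populations that were measured in more than one staining can be illustrated with a small Python sketch. The record layout, subset labels, and percentages are hypothetical; the sketch only mirrors the described step in which replicate populations collapse into a single array entry.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (staining number, subset label, percent of CD4 T cells).
records = [
    (1, "CD44hi", 21.0),
    (2, "CD44hi", 23.0),          # same subset measured in a second staining
    (1, "CD44hi CD62Llo", 12.5),
    (3, "CCR5+", 4.0),
]


def average_replicates(records):
    """Collapse subsets measured in several stainings into one mean value."""
    by_subset = defaultdict(list)
    for _staining, subset, percent in records:
        by_subset[subset].append(percent)
    return {subset: mean(values) for subset, values in by_subset.items()}


fca_column = average_replicates(records)
print(fca_column)
```

Here four records reduce to three array entries, in the same way that the 640 definable populations were reduced to 594.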
To visualize the relation between organ-specific T cell populations, organs were clustered into a hierarchical tree using standard correlation on the frequencies of these populations, as described by Eisen et al. for gene expression patterns (31). Peripheral lymph nodes were defined as the reference, because in naive mice this tissue contains the most naive T cells (Fig. 2a). This analysis led to the identification of three major groups of T cell populations: the first group included lymphatic organs and lung T cell populations; the second, peripheral group consisted of liver and kidney populations and was most distant from the lymphatic group; and the last group comprised blood and bone marrow populations and was situated between the former two groups (Fig. 2a). Within the lymphatic organs group, the highest degree of similarity was found between peripheral and mesenteric lymph nodes. To obtain the numbers of T cell populations over- or under-represented in individual organs, a Tukey test was applied to the results of the ANOVA (Fig. 2b). This test confirmed the relationships that were demonstrated by the hierarchical tree, e.g. the number of similar T cell subsets in liver and kidney is higher than the numbers obtained by comparing either of these two organs with any other organ. To address the issue of reproducibility, two more experiments similar to the one described were performed. The antibodies used in these experiments were the same (with two exceptions), although they were used in different combinations. The hierarchical trees obtained from these three experiments were nearly identical (Fig. 2c).
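The principle of clustering organs by the correlation of their subset frequencies can be sketched in plain Python. This is not the GeneSpring implementation: Pearson correlation and average linkage are used here as stand-ins, because the text does not specify the exact similarity measure or linkage rule, and the organ profiles are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles (non-constant input)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster(profiles):
    """Average-linkage agglomerative clustering; returns a nested tuple tree."""
    # Each cluster is (tree, list of member organ names).
    clusters = [(name, [name]) for name in profiles]

    def dist(a, b):
        pairs = [(m, n) for m in a[1] for n in b[1]]
        return sum(1 - pearson(profiles[m], profiles[n]) for m, n in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical subset frequencies (percent) for four organs and four subsets.
profiles = {
    "pLN": [60.0, 5.0, 2.0, 1.0],
    "mLN": [58.0, 6.0, 3.0, 1.5],
    "liver": [10.0, 40.0, 30.0, 20.0],
    "kidney": [12.0, 38.0, 28.0, 22.0],
}
tree = cluster(profiles)
print(tree)
```

With these made-up profiles the two lymph node samples join first with each other and the liver and kidney samples join with each other, before the two groups are merged at the root.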
Identification of Populations Overrepresented in Specific Organs
Since the aim of the experiment was the detection of differences in organ-specific subset compositions, only populations whose representation differed significantly in at least one organ were selected by ANOVA (P ≤ 0.05). The 498 subsets passing this test were further examined for the organs in which they were overrepresented using Excel software. A subset was regarded as overrepresented if its percentage in a particular organ was 1.5-fold greater than its average percentage in all organs. This was the case for either none (67 subsets), one (81), two (252), three (92), or four (6) organs. For each of these subsets the degree of overrepresentation was computed for each organ in which it was overrepresented by calculating the difference (Δ%) between the percentage in a particular organ and the average percentage of all organs in which this subset was not overrepresented. To identify subsets overrepresented in a certain organ, CD4 and CD8 subsets were ranked separately according to their degree of overrepresentation (Δ%) in this particular organ, as demonstrated for liver CD4 T cells (Table 3). Overrepresented CD44hi or CD62Llo subsets not expressing other activation markers were omitted because these marker combinations only characterize experienced T cells, but do not provide information on specific subsets. Complete lists of subsets overrepresented in each organ are available as Supplementary Table 1 online.
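The overrepresentation criterion and the Δ% value can be expressed as a short Python sketch. The function name `overrepresentation` and the organ percentages are hypothetical, and reading “1.5-fold greater” as “at least 1.5 times the average” is an assumption.

```python
def overrepresentation(percent_by_organ, fold=1.5):
    """Return {organ: delta_percent} for organs in which a subset is overrepresented.

    Assumption: "overrepresented" means percentage >= fold * mean over all organs.
    Delta% is the organ percentage minus the mean of the non-overrepresented organs.
    """
    overall_mean = sum(percent_by_organ.values()) / len(percent_by_organ)
    over = {o for o, p in percent_by_organ.items() if p >= fold * overall_mean}
    rest = [p for o, p in percent_by_organ.items() if o not in over]
    if not over or not rest:
        return {}
    baseline = sum(rest) / len(rest)
    return {o: percent_by_organ[o] - baseline for o in over}


# Hypothetical percentages of one CD4 subset in nine organs.
subset = {
    "liver": 40.0, "kidney": 35.0, "spleen": 5.0, "pLN": 4.0, "mLN": 4.0,
    "blood": 10.0, "bm": 12.0, "lung": 8.0, "PP": 2.0,
}
delta = overrepresentation(subset)
for organ, d in sorted(delta.items(), key=lambda item: -item[1]):
    print(organ, round(d, 1))
```

Ranking the Δ% values per organ, separately for CD4 and CD8 subsets, then gives lists of the kind shown for liver in Table 3.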
Table 3. Degree of Overrepresentation (Δ%) of Selected CD4 T cell Subsets Isolated from Liver
The populations are grouped by surface markers, and ranked according to their degree of overrepresentation in liver. Shaded subsets are characterized by one surface marker, boxed subsets by four markers.
bm, bone marrow.
Many CD4 T cells overrepresented in liver were CD69+, CD45RBlo, hCD43hi, and CCR5+. These properties were shared with T cells isolated from kidney, while some subsets present in blood and bone marrow also expressed hCD43 and CCR5. Within each population, it was unpredictable whether a subset defined by a single surface marker (CCR5+) or a subset defined by several surface markers (e.g. CD62LloCD25−CD69+) demonstrated the highest degree of overrepresentation in an organ. However, a considerable range of differential overrepresentation within a population was observed between subsets, e.g. within the hCD43hi population it was between 25.3 and 44.2 percentage points (Table 3). Both aspects justified the examination of all theoretically definable subsets.
The distribution profiles representing the organ-specific frequency of the T cell subsets listed in Table 3 are depicted in Figure 3. Subsets defined by a single surface marker exhibited the highest prevalence in all organs, whereas the subsets defined by four markers always exhibited the lowest frequency. However, this did not correlate with the degree of overrepresentation (Δ%), as demonstrated in Table 3.
Organ-Specific T Cell Subset Compositions
Analysis of the organ-specific T cell subset compositions revealed no profound differences between the CD4 and CD8 populations for most surface markers examined, with two exceptions: first, fewer CD8 cells from peripheral organs, in particular liver and kidney, expressed high levels of hCD43. Second, a considerable fraction of CD8 cells from the periphery (kidney, liver, lung, blood, and bone marrow) was positive for the memory marker Gr-1. The subsets enriched in blood, lung, and lymphatic tissues were overlapping, but those present in several of these organs differed in the degree of overrepresentation. In lymphatic organs only naive T cells were overrepresented, while CXCR3+ cells were detected in blood and bone marrow.
The analysis of data generated by multiparameter flow cytometry is hampered by two major problems. First, all subsets of interest have to be unequivocally identified, and second, in most experiments replicates are performed and often several experimental groups need to be compared. Identification of potentially important populations within the multi-dimensional flow cytometric space is tedious and time-consuming. It is a minor challenge if subsets are defined by only a few markers, as for example in conventional 4-color flow cytometry. However, it becomes a major task with an increasing number of populations and experimental groups, and an even more difficult undertaking if there is uncertainty about which populations have to be compared and which might be irrelevant. Recently, automated clustering algorithms have been implemented (1, 3–5), which can be used to identify populations significantly different from a control sample. However, these algorithms reach their limitations if replicates are to be compared. We report here that both challenges, identification of populations and comparison of replicates, can be addressed by taking advantage of software originally designed for the analysis of genomics or transcriptomics data. After comparison to the appropriate staining controls (30), antigen-positive and -negative cells can be discriminated manually, and further analysis is completed using statistics software. In contrast to previously published methods (3–5), statistics are performed on a different level, i.e. not on primary data, but on secondary data representing whole populations.
Our flow cytometry array (FCA) approach to multiparameter flow cytometry data analysis enables the evaluation of the complete information content of the acquired data by comparing all theoretically definable cell populations. In the present experiment, the characterization of T cells by up to four surface markers resulted in 80 CD4 and 80 CD8 T cell subsets per staining. We demonstrated that each of these subsets can be differentially represented in the individual experimental groups, irrespective of the number of parameters used for its definition (Table 3 and Fig. 3). For example, in one setting, Ly6C/G+ memory T cells might represent 50% of CD44hi cells, which are 50% of total CD8 T cells. In another, memory T cells might be 100% of CD44hi cells, which are 25% of total CD8 T cells. In the two settings, memory T cells contribute the same fraction to total CD8 T cells (25%), but clearly the two systems are very different. This difference would be missed if only subsets defined by both parameters (CD44 and Ly6C/G) were analyzed; in the FCA approach, however, it would be spotted through differential CD44 expression. Hence, it is not sufficient to analyze the tissue distribution of only those subsets that are defined by the complete array of surface markers. Instead, it is necessary to evaluate all theoretically definable subsets, which are characterized by any number of surface marker combinations. The number of theoretically definable subsets $n_s$ increases with each additional parameter $p$ according to the formula $n_s = 3^p - 1$, because each parameter can be positive, negative, or not considered, and the single combination in which no parameter is considered is excluded. This would result in 728 populations in an experiment in which subsets are characterized by up to six parameters. For an experiment using 15 parameters for subset characterization, which is technically feasible (1), more than 14 million populations would have to be analyzed and compared to reference samples. This huge number of comparisons can only be accomplished automatically, e.g. by using the FCA approach described here.
In the present report, populations characterized by up to four surface markers required the computation of the absolute numbers of 80 individual subsets, which was performed by adding different combinations of the 16 absolute numbers of the populations defined by all four markers (Fig. 1). This was performed manually for one staining and was extended to the others by copying and pasting the formulas. However, since the information content of a flow cytometry experiment increases exponentially with each additional parameter, an automated algorithm for calculating the absolute numbers of each theoretically possible subset would be advantageous for experiments applying more than four subset-characterizing parameters.
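One possible shape of such an automated algorithm is sketched below in Python. It is not the procedure used in this report, which relied on spreadsheet formulas; the function names and the event counts are hypothetical. Each parameter is treated as required positive, required negative, or ignored, and the counts of the fully defined populations are summed accordingly.

```python
from itertools import product


def n_subsets(p):
    """Number of definable subsets for p characterizing parameters."""
    return 3 ** p - 1


def all_subset_counts(full_counts, n_markers):
    """Derive every definable subset from the fully defined populations.

    full_counts maps tuples of booleans (marker positive?) to event counts.
    In each pattern, True/False requires a marker state and None ignores it.
    """
    result = {}
    for pattern in product((True, False, None), repeat=n_markers):
        if all(state is None for state in pattern):
            continue  # no marker considered: parent population, not a subset
        result[pattern] = sum(
            count
            for phenotype, count in full_counts.items()
            if all(s is None or s == v for s, v in zip(pattern, phenotype))
        )
    return result


# Hypothetical data: 10 events in each of the 2**4 fully defined populations.
full_counts = {ph: 10 for ph in product((True, False), repeat=4)}
subsets = all_subset_counts(full_counts, 4)

print(len(subsets))                       # 80
print(n_subsets(6), n_subsets(15))        # 728 14348906
```

Because this enumeration visits 3^p patterns and scans 2^p populations for each, it is only practical for moderate p; a larger panel would need a more economical summation scheme.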
Subdivision of a given number of events (cells) acquired by flow cytometry into distinct populations reduces the number of events per population, i.e. the percentages of individual populations characterized by four markers are lower than those of populations characterized by one marker (see also Fig. 3). Fewer events per subset may cause higher variability and therefore might erroneously lead to statistically nonsignificant results. One way to reduce this source of variability is the acquisition of more events, if the amount of sample is not limiting. An alternative or supplementary way to accomplish this is to use an FCA, because with this approach every possible array of surface markers is taken into account for subset characterization. Moreover, this mode of data analysis focuses on the populations containing the most events in at least one experimental condition of each array, and disregards the others. Thus, in an FCA approach, populations of interest are not predefined by the investigator; instead, populations displaying the most significant changes between experimental conditions are revealed by the unbiased analysis.
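The effect of event numbers on variability can be made concrete with a small Python sketch. It assumes simple binomial counting statistics, which is a modeling assumption that covers only sampling variability and not biological or staining variability; the event numbers and subset fractions are invented.

```python
from math import sqrt


def counting_cv(n_events, fraction):
    """Coefficient of variation of a subset count under binomial sampling."""
    return sqrt((1 - fraction) / (n_events * fraction))


# Hypothetical acquisition of 10,000 T cells.
cv_one_marker = counting_cv(10_000, 0.20)     # subset that is 20% of T cells
cv_four_markers = counting_cv(10_000, 0.005)  # subset that is 0.5% of T cells

# Acquiring ten times more events lowers the CV by a factor of sqrt(10).
cv_more_events = counting_cv(100_000, 0.005)

print(round(cv_one_marker, 3), round(cv_four_markers, 3), round(cv_more_events, 3))
```

Under this model the rare subset defined by four markers is about seven times noisier than the frequent subset defined by one marker at the same acquisition depth.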
Another advantage of data analysis using an FCA is the ease of comparing multiple experimental groups by hierarchical trees, which are quickly compiled after the data are imported into DNA-chip analysis software. These trees provide an overview prior to the analysis of individual populations. Furthermore, hierarchical trees are useful not only for comparing different experimental groups but also for quality control, because replicate samples should group very close to each other in a hierarchical tree. Besides using FCAs for the analysis of cell population compositions, we also applied this technique to compare absolute cell numbers and surface marker expression levels, i.e. mean fluorescence intensities. Furthermore, an FCA could be extended to intracellular staining, e.g. to follow signaling cascades, which is a flow cytometry application of high potential (32).
To control for experimental variability, individual mice were sacrificed on different days and their organs examined immediately afterwards. Furthermore, three similar experiments were performed. Indeed, T cells expressing similar marker combinations were over- or under-represented in individual organs in each experiment; for example, memory/effector subsets (CD44hi, CD62Llo, CD43hi, CD69+, CCR5+) were reproducibly overrepresented in liver and kidney. Therefore, a high degree of reproducibility was demonstrated with this FCA approach.
A certain bias is introduced into FCAs by the selection of the antibodies used in such an experiment. However, the introduction of a benchtop flow cytometer capable of acquiring 16 parameters simultaneously will facilitate the use of a large number of antibodies in few parallel stainings, thus reducing the bias caused by the experimental setup. Apart from this bias the FCA approach might be useful in discovering previously unknown or disregarded subsets that might serve as biomarkers in certain experimental or clinical applications. For example, in the blood of heart allograft recipient mice we identified two T cell subsets that correlated with rejection (manuscript in preparation).
The results obtained with this multiparameter FCA are in agreement with previously reported data, which were acquired by conventional flow cytometric characterization of lymphocyte subsets in different organs, e.g. the close relation of lymphoid organs, the predominance of CD44loCD62Lhi naive T cell populations in lymph nodes, and the occurrence of CD44hi cells mainly in peripheral organs (8). Besides this confirmation of existing data, previously unappreciated tissue-specific subsets were identified by this FCA, e.g. CD44hiCD62LloCD25−CD69+ and CD44hiCD45RBloCD25−CCR2− CD4 T cells in liver and kidney. The exact phenotypes and the biological roles of the populations detected by an FCA need to be established in additional experiments, which parallels gene identification by transcriptomics, where RT–PCR is necessary to confirm DNA-chip data.
In summary, the overall complex pattern of T cell subset localization in different tissues was analyzed and categorized using an FCA. The visualization of subset relationships by a hierarchical tree and the unbiased search for overrepresented populations emphasized the power of this array approach. FCA analysis has the potential to be used for any cell type that can be analyzed by flow cytometry and has its greatest advantages in experiments in which multiparameter flow cytometry is performed in replicates on multiple experimental groups and time points.
We thank Dr. Karl Welzenbach for helpful discussion on staining strategies, Dr. Lukas Roth for help with GeneSpring software, and Dr. Pius Loetscher for critical review of the manuscript.