P. K. Chattopadhyay, ImmunoTechnology Section, Laboratory of Immunology, Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 40 Convent Dr., Room 5612 (Bldg. 40), Bethesda, MD 20892-3015, USA. Tel.: (301) 594-8656; Fax: (301) 480-2788; Email: email@example.com; Senior author: M. Roederer, email: firstname.lastname@example.org
Multiparameter flow cytometry has matured tremendously since the 1990s, giving rise to a technology that allows us to study the immune system in unprecedented detail. In this article, we review the development of hardware, reagents, and data analysis tools for multiparameter flow cytometry and discuss future advances in the field. Finally, we highlight new applications that use this technology to reveal previously unappreciated aspects of cell biology and immunity.
The complexity of the immune system is remarkable: leucocytes are capable of expressing numerous proteins, combinations of which define functionally distinct cell subsets. To understand which subsets are important for a given immune response, researchers have relied on fluorescence-activated flow cytometry, the method of choice for enumerating and purifying cells. In this article, we review state-of-the-art technology in multiparameter flow cytometry and discuss recent research accomplishments stemming from it.
Beginning in the late 1960s, a series of studies defined unique subsets of B and T cells, setting the stage for multiparameter work. Among B cells, antibody-secreting plasma cells were first isolated in 1969 using single-laser flow cytometers1 and, in 1982, two-laser systems defined subpopulations expressing immunoglobulin M (IgM) and/or IgD.2,3 Subsequent studies defined several distinct peripheral B-cell populations, including B-1a, B-1b, B-2, and marginal zone B cells.4–8 Similarly, simple distinctions between CD4+ and CD8+ T cells were expanded to define naïve, memory, T helper type 1 (Th1), Th2, T cytotoxic type 1 (Tc1) and Tc2 subsets by expression of cell surface proteins.9–11 Beyond these phenotypic measurements, flow cytometry can also be used to measure cell function (e.g. cytokine production12 and cytotoxicity)13 or death14 on a cell-by-cell basis. In fact, a multitude of measurements can now be performed by flow cytometry; recent technical advances allow these measurements to be made simultaneously on individual cells.
In theory, such multiparameter analysis is quite powerful. It provides more data from less sample, a key consideration when patient samples are limited. Multiparameter analysis also allows more accurate identification of populations, by excluding unwanted cells that bind some reagents.15,16 Most importantly, the technology can identify cells with complex phenotypes, such as those responsible for haematological malignancies17 or those that may be relevant to immunity in vaccine or disease settings.18 As Fig. 1 demonstrates, bulk measurements (using only a few parameters) may not have the power to detect the complex phenotypes relevant to immunity.
This rationale underlies the field of multiparameter flow cytometry and has driven decades of successive advances in instrumentation and reagents. In particular, an explosion of advances since the 1990s (Fig. 2) has driven the field from five19 to 13 colours (in 2001),20 and recently to 18 colours (in 2006).21 The earliest advances occurred for hardware; these have largely been incorporated into commercial cytometers. Advances in fluorochromes and reagents followed, but some new fluorochromes were used for years in research laboratories before they became commercially available; the repertoire of commercial fluorescent reagents is still expanding rapidly. Finally, the least developed of the flow cytometry technologies is data analysis, where there is great potential (and a strong need) for significant new advances.
At the heart of today’s multiparameter flow cytometers are complex hardware and electronics, including lasers, optical filters, detectors and data acquisition boards. Predecessors to today’s instruments were reported in 1997 (10-parameter flow cytometers)22 and 2001 (13-parameter flow cytometers);20 however, these relied on high-powered lasers, were extensively customized, and required highly skilled operators.23 Recent advances, and the graduation of technologies from research to commercial settings, have made multiparameter flow cytometry much more accessible.
This is particularly evident for laser technology. Early lasers were available in only limited wavelengths and required cooling systems. Today, lasers are small and provide stable excitation sources in a wide variety of wavelengths. Blue (488 nm) and red (633 nm) lasers are typically found in commercial flow cytometers, allowing excitation of a wide range of fluorochromes, from fluorescein isothiocyanate (FITC) to allophycocyanin (APC). Although blue lasers are typically used to excite phycoerythrin (PE) and related tandem dyes (such as Cy5PE), a recent report shows that green (532 nm) lasers provide more sensitive measurement of these dyes, allowing better resolution of dimly-staining populations from background.24 Furthermore, expensive high-powered green lasers may not be necessary to realize this benefit. Simple green laser pointers, priced at around $150, may be used in instruments modified to allow slower transit of cells through the laser beam.25 This has important implications for the development of portable, low-cost flow cytometers.26
Although the three lasers described above can excite a wide variety of fluorochromes, other lasers are available for specialized applications. Ultraviolet (UV; 350 nm) lasers are routinely employed to excite Hoechst dyes, which measure DNA content in cell cycle applications.27 These can also excite quantum dots (QDs), a new class of fluorochromes based on nanotechnology. However, cells exposed to UV lasers are highly autofluorescent, which reduces staining sensitivity. Moreover, UV lasers are large and require much maintenance. Consequently, for QDs, violet (408 nm) lasers are preferred; they induce strong signals with much less cellular autofluorescence.21,28 Additionally, with the advent of yellow (560 nm) and orange (610 nm) lasers,29 excitation sources are now available for the complete range of visible light. These lasers may facilitate the use of new fluorescent dyes (such as red fluorescent proteins)30 and offer additional flexibility in multiparameter hardware. A system that employs violet, blue, green and red lasers can excite a broad range of fluorochromes, providing significant potential for multiplexed analysis.21
The development of high-throughput systems represents the most recent commercial advance in flow cytometry. These systems allow serial analysis of hundreds, if not thousands, of samples automatically and are particularly valuable in drug discovery efforts. In research and clinical settings, high-throughput systems allow the staining of multiple samples at once, thereby reducing manual labour and the potential for experiment-to-experiment variability. However, because high-throughput systems consume so many samples at once, it is critical that the flow cytometer is under rigorous quality control; otherwise, massive amounts of data can be lost before an instrumentation problem is noticed. In 2006, we published a research protocol describing our three-part approach to quality assurance.31 This approach consists of: (1) a system optimization step (to ensure that laser, mirror and filter configuration is optimal), (2) a calibration step (to determine the sensitivity and dynamic range of the detector, and develop a target value for detector gain), and (3) a monitoring step (to track precision, accuracy and sensitivity of fluorescence measurements over time). Analogous commercial efforts were recently introduced.
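As a rough illustration of what a monitoring step might involve, the sketch below flags daily measurements of a reference particle that drift away from an established baseline. It is not the published protocol: the function name `flag_drift`, the two-standard-deviation rule and all intensity values are assumptions chosen for illustration only.

```python
from statistics import mean, stdev

def flag_drift(baseline, new_runs, n_sd=2.0):
    """Return indices of new_runs outside baseline mean +/- n_sd standard deviations."""
    centre = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline runs
    return [i for i, value in enumerate(new_runs)
            if abs(value - centre) > n_sd * spread]

# Hypothetical median fluorescence of a reference bead in one detector
baseline_runs = [1000, 1010, 990, 1005, 995]
recent_runs = [1002, 1012, 960, 998]

print(flag_drift(baseline_runs, recent_runs))  # [2]
```

In practice a separate baseline would be kept for each detector, and the tolerance would be set from the calibration data rather than fixed in advance.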
In summary, recent advances in lasers (and previous advances in other hardware) have produced relatively mature instrument technology that may, in its current form, be nearing its peak. In the short term, improvements in high-throughput systems are likely to occur.32 In the long term, emerging advances include the use of silicon avalanche photodiodes,33 for enhanced detection of far-red and infrared light, and microfluidic/acoustical focusing34,35 for instrument miniaturization. Beyond these, advances to the next generation of flow cytometers will rely on non-fluorescent probes36–39 to extend the number of measurable parameters.
Our understanding of the complexity of the immune system drove the need for new fluorochromes beyond the classical organic dyes FITC, PE and APC. By coupling cyanines (Cy5, Cy5.5 and Cy7) and Alexa dyes with PE and APC to produce resonance energy tandem dyes, the repertoire of fluorochromes was broadened through the late 1980s and 1990s. For example, tandem dyes produced with Alexa fluors and PE or APC often perform as well as (or better than) similar organic dyes in terms of fluorescent signal and stability. Many of these dyes are also commonly used alone, directly conjugated to antibodies (e.g. Alexas 488, 647 and 680). More recently, the development of inorganic dyes has provided even more choices (Fig. 3), and allowed researchers to evaluate the best fluorochromes for their instrumentation.
In particular, the development of a new class of inorganic dyes, based on nanotechnology, has generated considerable interest. QDs are semiconductor nanocrystals with physical and fluorescence properties suitable for flow cytometry.21,28 Compared to organic fluorochromes, these properties seem tailor-made for multicolour flow cytometry. Most importantly, organic fluorochromes have narrow excitation spectra, such that multiplexed analysis of these fluorochromes usually requires multiple lasers for excitation.40 The QDs, however, have broad excitation spectra21,28 so that all (QD525–QD800) are excited by the same laser source with bright signals. Consequently, even a single-laser flow cytometer can be used for complex multicolour experiments.21
The emission properties of QDs also offer advantages over organic fluorochromes. Most organic fluorochromes have broader emission spectra than QDs,21,40 particularly on the red side of the emission band. Therefore, when organic fluorochromes are multiplexed, their signals overlap significantly with each other, often reducing the ability to detect dimly-staining populations.41 In contrast, a five-QD system can be constructed with minimal fluorochrome overlap and excellent sensitivity.21 Moreover, when used in staining panels with organic fluorochromes, very little light from the organic dyes is detected in QD channels.21 Bright QDs are therefore particularly useful in multicolour panels that attempt to detect low-level expression.23,28,42
As Fig. 3 shows, there are now over 30 different fluorochromes used as tags with monoclonal antibody conjugates; this significantly exceeds the number of detectors available in even the most technologically advanced systems.23 Most fluorochromes differ in fluorescent and/or chemical properties so the wide variety provides researchers with many options, both in terms of applications and antibody–fluorochrome combinations. The latter is a particularly important consideration because not all reagent combinations work together in multiparameter systems.42
In summary, the development of reagent technology has lagged somewhat behind instrumentation (Fig. 2); still, research efforts to develop new fluorochromes have reached a point of diminishing returns. In contrast, the commercial availability of more and varied fluorochrome–antibody combinations is still increasing.
As multiparameter flow cytometry becomes more complex, the volume of information generated grows exponentially. A typical four-colour experiment, identifying only positive and negative populations, could generate 16 potential subsets for analysis, while eight- and 18-colour experiments could generate 256 and 262 144 possible subsets, respectively. This sometimes exceeds the capabilities of researchers, computer processors and software. Fortunately, strategies and tools for the presentation and analysis of complex, multiparameter datasets are slowly emerging; however, there remains a critical need for additional work in this area. Analysis systems that rely heavily on the researcher (i.e. ‘manually managed’) are the most developed, but in the future we may rely increasingly on computer-driven analysis.
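The subset counts quoted above follow from treating each marker as either positive or negative, so that n markers give 2^n possible combinations. A minimal sketch of that arithmetic (the function name is ours, for illustration):

```python
def possible_subsets(n_markers: int) -> int:
    """Number of positive/negative combinations for n_markers two-state markers."""
    return 2 ** n_markers

for colours in (4, 8, 18):
    print(colours, possible_subsets(colours))
# 4 16
# 8 256
# 18 262144
```

If markers were divided into more than two levels (for example negative, dim and bright), the base of the exponent would rise accordingly and the number of subsets would grow faster still.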
Historically, flow cytometry data have been plotted on a simple log scale to accommodate the wide dynamic range of the data – and because many fluorescence distributions are log-normal. However, in multiparametric fluorescence analyses, the distributions of events at low fluorescence are no longer log-normal; furthermore, the measurements can include values that approach zero or are even negative (this arises from relatively large errors in measurements propagated by fluorescence spillover compensation).41 The end result is distributions that are difficult to interpret, and that can include a tremendous number of events ‘piled up’ on the axis of the log scale.43 Recently, new methods to scale the data (e.g. Logicle scaling)43,44 have become widely available to address this problem. This scaling smoothly combines a logarithmic scaling for highly fluorescent distributions with a linear scaling for measurements near zero, resulting in data displays that are far more intuitive and understandable to researchers. While seemingly arcane, this change in the visualization of multiparameter data has revolutionized the analysis and presentation of these complex data.
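The qualitative behaviour of such a display scale can be sketched with the inverse hyperbolic sine, which is approximately linear near zero, approximately logarithmic for large values, and defined for negative values. This is not the Logicle transform itself, only a simpler function with similar properties; the function name and the cofactor of 150 are arbitrary assumptions for illustration.

```python
import math

def asinh_scale(value: float, cofactor: float = 150.0) -> float:
    """Display coordinate: ~linear for |value| << cofactor, ~logarithmic above it."""
    return math.asinh(value / cofactor)

# Negative and near-zero values get distinct positions instead of
# piling up on the axis, as they would on a pure log scale.
for raw in (-200.0, 0.0, 50.0, 1_000.0, 100_000.0):
    print(f"{raw:>10.1f} -> {asinh_scale(raw):7.3f}")
```

The cofactor sets where the scale changes from linear to logarithmic; a larger cofactor widens the linear region around zero.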
Manually managed analysis
When data analysis is managed manually, the choice of which relationships are examined is typically hypothesis driven, and a limited number of parameters are compared. This can provide focused and biologically plausible results; however, such analyses can miss data relationships that researchers had not envisioned. Moreover, the analysis methods of the researchers may influence outcomes. In this regard, certain methods may be used to ensure that populations of interest are identified correctly, easily quantified, correctly compared and optimally presented.
Once cell populations are correctly identified, they can be quantified and their frequency compared across groups. Generating gates to define every distinct subset can be cumbersome; therefore, we define cells positive for each parameter individually, and then use Boolean gating algorithms to combine positive and negative subsets for each marker in every possible combination. The result, in a six-colour experiment, is 64 distinct subsets automatically generated from the single parameter gates. Clearly, this process is highly dependent on how the researcher determines the cut-off between positive and negative populations, a potential source of bias that could be avoided using computer-driven analysis techniques (e.g. clustering). In any case, the frequency of each subset can be displayed and compared across groups using the recently developed pestle and spice software suite (M. Roederer, Vaccine Research Center, NIAID, NIH), which can also mask specific variables or display the frequency of cells expressing a partial (e.g. three-colour) phenotype.
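The Boolean combination of single-parameter gates can be sketched as follows. This is a minimal illustration, not the implementation used in any particular software package: the function names, the cut-off values and the event intensities are all invented, and three markers are used instead of six to keep the output short.

```python
from collections import Counter
from itertools import product

def boolean_subset(event, thresholds, markers):
    """Tuple of True/False (positive/negative) per marker for one event."""
    return tuple(event[m] >= thresholds[m] for m in markers)

def count_subsets(events, thresholds, markers):
    """Count events in every positive/negative combination, including empty ones."""
    counts = Counter(boolean_subset(e, thresholds, markers) for e in events)
    return {combo: counts.get(combo, 0)
            for combo in product((False, True), repeat=len(markers))}

# Hypothetical events and researcher-chosen cut-offs
markers = ("CD4", "CD45RA", "CD57")
thresholds = {"CD4": 200, "CD45RA": 200, "CD57": 200}
events = [
    {"CD4": 900, "CD45RA": 50, "CD57": 20},
    {"CD4": 850, "CD45RA": 700, "CD57": 10},
    {"CD4": 30, "CD45RA": 650, "CD57": 600},
    {"CD4": 920, "CD45RA": 40, "CD57": 15},
]

subsets = count_subsets(events, thresholds, markers)
print(len(subsets))                    # 8
print(subsets[(True, False, False)])   # 2
```

Every count depends on the values in `thresholds`, which is the point at which the researcher's choice of cut-off enters the analysis.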
Optimal presentation of multiparameter data remains a challenge because the human mind is conditioned to view and understand data in two dimensions at a time. Over the years, options that overlay additional data on classical bivariate dot plots have emerged. Examples include the addition of rough quantitative information with contour, density and pseudocolour plots.45 Notably, however, these graphs share the disadvantage that they do not provide any information about data relationships beyond the two markers displayed. Recently, polychromatic plots were developed to overcome this limitation.46 In these plots, shades of colour are overlaid on bivariate graphs to communicate the values of other parameters. For example, for a bivariate dot plot of CD45RA versus CCR7, the expression of interleukin-2 (IL-2) and interferon-γ (IFN-γ) for each event in the plot can be shown by assigning two colours (red and blue, respectively) to the cytokines; cells that coexpress IL-2 and IFN-γ are purple, while cells expressing IL-2 alone are red. By changing how these plots are scaled and which colour takes priority, certain populations with shared characteristics can be emphasized or de-emphasized. In this way, data can be explored manually in up to five dimensions.
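The colour assignment in the IL-2/IFN-γ example can be sketched as a simple mapping from two expression levels to a red-green-blue value. This is an illustrative assumption about how such a mapping could work, not the algorithm of the cited polychromatic plots; the function name and the scaling of expression to the range 0 to 1 are ours.

```python
def polychromatic_colour(il2: float, ifng: float):
    """Map two expression levels (scaled 0..1) to an (R, G, B) tuple.

    IL-2 drives the red channel and IFN-gamma the blue channel, so
    coexpressing cells appear purple.
    """
    def clamp(x):
        return max(0.0, min(1.0, x))
    return (round(255 * clamp(il2)), 0, round(255 * clamp(ifng)))

print(polychromatic_colour(1.0, 0.0))  # (255, 0, 0)   IL-2 alone: red
print(polychromatic_colour(0.0, 1.0))  # (0, 0, 255)   IFN-gamma alone: blue
print(polychromatic_colour(1.0, 1.0))  # (255, 0, 255) both: purple
```

Each event in the bivariate plot would then be drawn at its CD45RA and CCR7 coordinates in the colour returned for its cytokine levels.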
In summary, there are a number of means by which to analyse multiparameter data manually; however, these tools depend on human input, so confusion, bias and artefacts may arise if samples are not analysed carefully and consistently. Also, these analyses can be labour intensive. Finally, manually managed analyses tend to ignore relationships between markers that had not been envisioned by the researcher. In this way, they constrain the discovery of new cell types relevant to disease and immunity.
Because of the high content of today’s flow cytometry data, the potential to find new, biologically important cell types is enormous; however, without increasingly sophisticated analysis tools, multiparameter data cannot be mined easily. Techniques developed for genomic and proteomic studies are now being applied to flow cytometry.
For example, once values (such as population frequencies) are generated in conventional flow cytometry software, the relationships among parameters can be explored using automated analysis tools, such as principal component analysis and hierarchical clustering. These techniques have been used to identify differences between in vitro cultured cell populations, age-related memory T-cell frequency, and the profiles of various inflammatory and malignant diseases.47–53 The disadvantage of this approach is that manual data-processing (to determine population frequencies) is still required.
In contrast, raw flow cytometry data (in the form of list mode data files) can be used to cluster data automatically (without any researcher effort or intervention).54–57 These methods are very similar to those used to analyse microarray data; however, list mode data have much higher information content than microarrays. For example, 100 microarrays contain around five million summarized measures, while just one 10-colour list mode file of 500 000 events contains more than six million data-points. This highlights one of the challenges facing this field: cluster analysis of high content, multicolour clinical data from large cohorts of patients will require enormous computing power.
The identification of the best clustering tools also remains an important challenge. A good clustering algorithm should describe not only visually evident populations within the data but also more intricate clusters distributed in multidimensional space. To describe populations and subsets based on phenotypic characteristics, a partitioning technique (such as k-means clustering) is frequently used. However, the asymmetric data distributions often found in flow cytometry data are not handled well by this algorithm. Moreover, with the simplest k-means tools the number of clusters to generate must be determined by the researcher.58,59 To eliminate this requirement, and allow completely unsupervised (computer-based) analysis, a number of recent studies have explored Gaussian Mixture modelling.55,60,61
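The dependence of simple k-means on a researcher-supplied number of clusters can be seen in a minimal one-dimensional sketch. The implementation below is a bare version of Lloyd's algorithm with a deterministic starting point, written for illustration only; the function names and the fluorescence values are invented, and real list mode data would be multidimensional.

```python
def assign(values, centres):
    """Index of the nearest centre for each value."""
    return [min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            for v in values]

def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm in one dimension; k must be supplied by the caller."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        labels = assign(values, centres)
        updated = []
        for i in range(k):
            members = [v for v, lab in zip(values, labels) if lab == i]
            # Keep the old centre if a cluster received no events
            updated.append(sum(members) / len(members) if members else centres[i])
        if updated == centres:
            break
        centres = updated
    return centres

# Three visually distinct groups of hypothetical intensities
values = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 9.8, 10.1, 10.0]

print(kmeans_1d(values, k=3))  # one centre per group
print(kmeans_1d(values, k=2))  # the two lower groups are merged
```

With k set to 3 the centres fall on the three groups, whereas with k set to 2 the algorithm still returns an answer but merges two of them. Nothing in the output signals that the second choice is poorer, which is why the choice of k matters.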
In addition, to successfully employ automated/computer-driven analysis, it is very important to limit technical sources of variation in flow cytometry instrumentation, reagents and sample handling. In microarray studies, technical sources of variance are estimated simply by using control probes; however, in flow cytometry, multiple parameters (that are difficult to control for) can introduce variance. These include unique characteristics of the instrument, such as laser output and alignment, and staining-related issues such as pipetting, staining temperature and incubation time. To some extent, these sources of variance can be minimized with strict quality control and monitoring of instruments.31
As indicated in Fig. 2, data analysis remains a relatively nascent technology. There are few clear standards for data analysis,62–64 and the tools available are either extremely labour-intensive or require considerable expertise to apply. Moreover, basic questions remain. For example, what is a population, how can clustering tools identify populations, and how do we know whether the clusters discovered are ‘real’ and biologically relevant? Can clustering tools provide reproducible results when data are analysed on different instruments, on different days? These questions must be addressed before automated techniques for data analysis can be applied to clinical studies in a more extensive way.
In today’s era of systems biology,65,66 multiparameter flow cytometry can be a key tool for understanding protein networks. The remainder of this article highlights three applications where this technology is revealing previously unappreciated aspects of cell biology and immunity.
Recently, phosphoprotein-specific multicolour assays have been developed to explore the dynamics of the intracellular signalling pathways and map cell-signalling networks important in immunology and haematology.67 These assays combine intracellular and surface staining to determine the phosphorylation status of a number of intracellular signalling molecules, such as mitogen-activated protein kinases (MAPK), Jun N-terminal kinases (JNK), extracellular signal-regulated kinases (Erk), p38, Janus kinases/signal transducer and activator of transcription (JAK/Stat) and others. Studies using this technology have not only enabled the exploration of fundamental aspects of signalling, but have also paved the way for new clinical paradigms. By mapping aberrant signalling cascades in malignancies, and grouping diseases with shared networks, this technology can be used to reclassify cancers based on precise molecular (rather than nebulous pathological or clinical) traits, and may lead to novel disease therapies.68,69
Multiparameter flow cytometry has also reshaped our understanding of antigen-specific T-cell responses. In particular, recent studies show that cells with certain functional traits correlate with protective immunity to a wide variety of diseases or vaccines. These protective T cells cannot be identified solely by measurement of IFN-γ, but rather are defined by the combinations of cytokines they produce.
This was first described in the setting of human immunodeficiency virus type 1 (HIV-1) infection, where people with progressive disease have fewer cells expressing IL-2 alone or IL-2 + IFN-γ than people with non-progressing or treated disease.70,71 These studies were extended by developing nine-plus colour panels to measure tumour necrosis factor-α, macrophage inflammatory protein-1β, and the degranulation marker CD107a, which showed that cells expressing multiple functional molecules simultaneously correlated with reduced disease burden; cells producing IFN-γ alone were associated with progressive disease.72–74 Similarly, in settings where antigen load is low and transient (e.g. vaccinia immunization)75 more multifunctional cells are observed than when antigen is persistent (cytomegalovirus, HIV).72 Therefore, those immune responses that control viral infections were associated with multifunctional T cells, but it remained unclear whether these cells mediated control or whether their presence was a consequence of viral clearance.72 A mouse model for vaccine-induced protection provided the first evidence that this was not just a correlation.76 In this model, the protection afforded by different vaccine regimens could be predicted by the number of polyfunctional T cells induced before challenge. This study also revealed that cells making multiple cytokines express higher levels of IFN-γ on a per cell basis than cells that produce only one cytokine.76 The multifunctional measurement therefore revealed a strong correlate of vaccine-mediated protection as well as a mechanism underlying the protection. Because this important correlate of immunity is multidimensional, it could only be revealed through multiparameter flow cytometry.
Finally, in recent years, a wide variety of applications have emerged from QD technology. For example, in 2006, we analysed the maturity of various antigen-specific T-cell populations using a 17-colour staining panel.21 This panel consisted of seven QDs and 10 organic fluorochromes, which were measured simultaneously in the same sample. The QD reagents used were conjugates with conventional antibodies (against CD4, CD45RA and CD57), as well as peptide major histocompatibility complex class I (pMHCI) multimers designed to detect those antigen-specific T cells directed against HIV, Epstein–Barr virus and cytomegalovirus epitopes. Before this study, only FITC-, PE- and APC-tetramers were available, which limited panel design because many novel or dimly-staining antibodies are only found on these fluorochromes. The QDs also display higher valency than PE or APC, so more pMHCI molecules could be bound to a QD than to a PE or APC-streptavidin (SAV), allowing brighter signals and better staining resolution. This work demonstrated the power of a multiplexed approach: by identifying multiple phenotypically distinct subsets within each antigen-specific T-cell population, the remarkable intricacy of T-cell immunity can be appreciated. Moreover, QDs also allow us to measure many antigen-specific populations simultaneously, an important factor when sample availability is limited.
Since the development of a one-laser system for purification of plasma B cells, flow cytometry technology has come a long way. Early advances occurred in instrumentation, and over time these have produced relatively mature systems that can be customized extensively and are suitable for hundreds of applications. These applications would not be possible, however, without significant development of reagents and fluorochromes. Currently, in the research setting, enough fluorochromes are available to cover most instrumentation configurations; however, commercially this is not yet the case. Finally, data analysis technology lags significantly behind hardware and reagent technology, and is likely to provide the most dramatic advances in the future. The next few years will see this remarkable technology being applied to a wide variety of systems, undoubtedly spurring many advances in our knowledge of the immune system.