Flow cytometry is an analytical technique designed to make measurements on single cells in suspension. As instrumentation, reagents and analytical tools have improved, flow cytometry has been applied brilliantly to the study of heterogeneous “liquid” tissues such as blood and bone marrow, as well as other easily dissociated solid tissues such as lymph node and spleen. Much of what has been discovered in the fields of immunology and hematology is the direct result of the ability to study such heterogeneous tissues at the level of the single cell. Classical hematologists were able to discover differentiation pathways and make inferences about the identity of hematopoietic progenitor cells by grouping cells on the basis of their morphological features and affinity for dyes (1). These techniques reached their limitations when cells of different function but identical morphology (e.g., CD4 and CD8 T lymphocytes) could not be distinguished, and when important cell populations, inferred by biological experiments, evaded detection because of their low frequency (e.g., definitive hematopoietic stem cells). These problems (2, 3), and many others, have yielded to flow cytometry because of its robust analytical capabilities, and also because it has been a preparative method since the inception of the fluorescence-activated cell sorter (4, 5).
Flow cytometry has the potential to advance the study of solid tissue differentiation, maintenance and repair in the same way, but its application has been challenging since it was first used to characterize ovarian carcinoma (6), pancreas (7), and hepatocytes (8). Many of the difficulties that early investigators faced are still problematic: how to tease strongly adherent cells into a single-cell suspension while maintaining viability; how to quantify the selection bias that occurs when some cell types survive the process better than others; how to determine which markers and functions are perturbed by the process of disaggregation, and when such loss is irreversible; how to identify and eliminate cellular debris and other sources of artifact from the analysis; and finally, having identified discrete cell populations, how to understand this information in the context of whole tissue architecture. This report will attempt to address these problems using the detection of a panel of putative stem/progenitor markers in non-small cell lung carcinoma and normal adjacent lung tissue as an example.
The hyaluronic acid receptor CD44, a ubiquitous adhesion molecule, was implicated early in tumor metastasis (9) and is the principal marker proposed by Clarke and coworkers to identify tumorigenic breast cancer cells (10). In normal stem cell biology, CD44 has been proposed to play a role in the migration and homing of mesenchymal stem cells (11). CD90 appears to be a lineage-independent adult tissue stem cell marker and was first described on primitive human bone marrow stem cells (12). It is also expressed on oval cells of the liver (13), and on perivascular stem cells (14), which are closely related to mesenchymal stem cells (15). We have proposed CD90 as a principal cancer stem/progenitor cell marker in a variety of epithelial cancers and demonstrated its presence on cytokeratin+/ABCG2+ cells in lung, ovarian, gastric and breast cancer (16, 17). CD45-/CD90+ cells have been detected in liver tumors and in the circulation of liver cancer patients (18). When CD44 and CD90 are coexpressed on cytokeratin+ cells, they mark highly tumorigenic cells in breast cancer. As few as 50 cells directly sorted from clinical isolates are capable of tumor formation when coinjected with irradiated tumor (16, 19) or adipose-derived stromal cells (20). Further, in liver cancer, CD44+/CD90+ cells demonstrated a more aggressive phenotype than their CD44 negative/CD90+ counterparts and formed metastatic lesions in the lungs of immunodeficient mice (18). Like CD90, the type III tyrosine kinase CD117 (KIT, stem cell factor receptor) marks stem/progenitor cells in a variety of tissues, but is also present in terminally differentiated cells such as mast cells (21). Activating mutations in CD117 are implicated in gastrointestinal stromal tumors (GISTs) (22) and myeloid leukemia (23). In normal stem cell biology, CD117 was recently used to identify and sort-purify human lung stem cells capable of regenerating bronchioles, alveoli, and pulmonary vessels (24).
In the bone marrow, CD133 marks hematopoietic stem/progenitor cells (25). It is present on human prostate epithelial basal cells (26) and has also been identified on putative cancer stem cells in a variety of tumors (27–29), and most recently in poor-risk lung cancer (30).
An annotated version of this methods section with commentary is available as online Supplement 1 (Supporting Information) to this article.
Tissue Procurement and Transport
Non-small cell lung cancer samples and paired adjacent normal lung tissue were obtained from 17 patients at the time of surgical resection of the tumor. Specimens were collected under protocols approved by the University of Pittsburgh Institutional Review Board (UPCI 99-053, 020391, 0503126, 07090247). The tissues were immediately immersed in sterile heparinized tissue culture medium (sodium heparin, 10 U/mL) and transported to the laboratory on an ice pack in a cooler.
After the tissue was accessioned, it was weighed and photographed, a physical description was recorded, and a sample was taken for formalin fixation and paraffin embedding. A schematic diagram of the tissue processing workflow is provided in Supporting Information Figure 1, and the expected cell recovery of several tissue types prepared by mechanical dissociation (scalpels and screens) and collagenase digestion is shown in Supporting Information Table S1.
In the present study, single-cell suspensions were prepared from malignant lesions and tumor-free adjacent lung tissue as previously described (31). Briefly, tumors and lung tissue were minced with paired scalpels, digested with type I collagenase (0.4% in RPMI 1640 medium, Cat. No. C-0130, Sigma Chemicals, St. Louis MO) and DNase (350 KU/mL, Cat. No. D-5025, Sigma Chemicals, St. Louis MO), and disaggregated through 100-mesh stainless steel screens. Undigested tissue clumps were subjected to repeated rounds of digestion. Viable cells were separated from erythrocytes and debris on a Ficoll-Hypaque gradient (Histopaque 1077, Sigma Chemicals). Erythrocytes were lysed using an ammonium chloride lysing solution without fixative (Beckman Coulter, Cat. No. IM3630d). The complete laboratory procedure for tissue disaggregation is provided in online Supplement 2 (Supporting Information).
Histology and Immunohistostaining
Normal and tumor tissues were fixed for 24 hours in neutral buffered formalin (Sigma, Cat. No. F5554). Paraffin sections (5-6 μm) were prepared from embedded tissues. Tissue sections were heated (60°C, 20 min), deparaffinized (3 washes in xylenes), rehydrated by successive washes in absolute ethanol, 90% ethanol, 75% ethanol and deionized water, and rinsed twice in Dako wash buffer (Dako). Antigen retrieval was performed at 125°C for 20 min in pH 9.0 EDTA buffer (Dako). After 2 washes in Dako wash buffer, the tissue sections were incubated for 1 hour in a blocking solution (PBS, 5% goat serum, 0.05% Tween 20) to reduce nonspecific antibody binding. Immunofluorescent staining was performed using rabbit polyclonal anti-CD117 (1:400 (35.7 μg/mL), Dako, Cat. No. A4502). For negative controls, the primary rabbit antibody was replaced with Dako Universal Negative Control for Rabbit Antibodies (ready to use, Dako, Cat. No. N1699). Primary antibody and control were incubated overnight at 4°C. Tissue sections were washed twice in Dako wash buffer prior to applying biotinylated secondary goat anti-rabbit antibody (1:500 (1.52 μg/mL), Dako, Cat. No. E0432) for 1 hour at room temperature. Tissue sections were washed twice with Dako wash buffer and incubated with streptavidin-Cy3 (1:500 (2 μg/mL), Sigma, Cat. No. 6402) for 30 minutes at room temperature. Slides were washed again and tissue sections were incubated with Alexa 488-conjugated anti-pan-cytokeratin antibody (1:200 (2.5 μg/mL), clone AE1-AE-3, eBiosciences, Cat. No. 53-9003-80) for 1 hour at room temperature. Stained tissue sections were washed twice more in Dako wash buffer, and nuclei were stained by a 10-minute incubation with DAPI (7.15 μM, Invitrogen, Cat. No. D1306). Slides were washed twice in PBS-A and mounted in Prolong Gold anti-fade reagent (Invitrogen, Cat. No. P36934). Immunofluorescent staining was observed and photographed using an epi-fluorescence microscope (Nikon Eclipse TE 2000-U).
Flow Cytometric Staining
Non-specific binding of fluorochrome-conjugated antibodies was minimized by preincubating pelleted cell suspensions for 5 minutes with neat decomplemented (56°C, 30 minutes) mouse serum (5 μL) (17). Prior to intracellular cytokeratin staining, cells were stained for surface markers (2 μL each added to the cell pellet, 15-30 minutes on ice; CD44-PE (Beckman Coulter, Cat. No. A32537), CD90-biotin (BD, Cat. No. 555594), Streptavidin-ECD (Beckman Coulter, Fullerton, CA, Cat. No. IM3326), CD14-PECy5 (Beckman Coulter, Cat. No. IM2640U), CD33-PECy5 (Beckman Coulter, Cat. No. IM2647U), Glycophorin A-PECy5 (BD Biosciences, Cat. No. 559944), CD133-APC (Miltenyi Biotec, Cat. No. 130-090-854), CD117-PC7 (Beckman Coulter, Cat. No. IM3698), CD45-APCCy7 (BD, Cat. No. 348805)), and fixed with 2% methanol-free formaldehyde (Polysciences, Warrington, PA). Cells were then permeabilized with 0.1% saponin (Beckman Coulter) in phosphate buffered saline with 0.5% human serum albumin (10 minutes at room temperature). Cell pellets were incubated with 5 μL of neat mouse serum for 5 minutes, centrifuged and decanted. The cell pellet was disaggregated and incubated with 2 μL of anti-pan cytokeratin-FITC (Beckman Coulter, Cat. No. IM2356) for 30 minutes. Cell pellets were diluted to a concentration of 10 million cells/400 μL of staining buffer, and DAPI (Life Technologies, Grand Island NY, Cat. No. D1306) was added 10 minutes before sample acquisition, to a final concentration of 7.7 μg/mL and 40 μL/10^6 cells (17).
Multi-dimensional flow cytometric acquisition was performed using a 10-color Gallios cytometer (Beckman Coulter, Miami FL). An effort was made to acquire a total of 1.8 million events per sample at rates not exceeding 10,000 events/second. For DAPI staining, PMT gain was optimized for linear (cell cycle) detection of 2N cells (tissue lymphocytes). Before each use, the cytometer was calibrated to predetermined photomultiplier target channels with SpectrAlign beads (DAKO, Cat. No. KO111) and 8-peak Rainbow Calibration Particles (Spherotech, Libertyville, IL, Cat. No. RCP-30-5A). Offline compensation and analyses were performed using VenturiOne software designed for multidimensional rare event problems (Applied Cytometry, Dinnington, Sheffield, UK). Spectral compensation matrices were calculated for each experiment using single-stained mouse IgG capture beads (Becton Dickinson, Cat. No. 552843) for each tandem antibody and hard-stained beads (Calibrite, BD) for single molecule dyes (Becton Dickinson, FITC, PE (Cat. No. 349502), APC (Cat. No. 340487)).
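The idea behind a spectral compensation matrix can be illustrated with a minimal two-color Python sketch. This is not the VenturiOne implementation used in the study; the spillover values, the function names, and the restriction to two detectors are assumptions made only for illustration. Each row of the spillover matrix describes how the signal of one single-stained control distributes across the detectors, and compensation multiplies the observed signal by the inverse of that matrix.

```python
def invert_2x2(m):
    """Return the inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("spillover matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def compensate(observed, spillover):
    """Recover per-dye signal from observed detector values.

    observed  : [FITC detector, PE detector] for one event
    spillover : rows = dyes (FITC, PE), columns = detectors (FITC, PE),
                each row normalized so the primary detector equals 1.0
    Model: observed = true_signal x spillover (row vector times matrix),
    so true_signal = observed x inverse(spillover).
    """
    inv = invert_2x2(spillover)
    return [
        observed[0] * inv[0][0] + observed[1] * inv[1][0],
        observed[0] * inv[0][1] + observed[1] * inv[1][1],
    ]


# Hypothetical spillover: 15% of FITC signal appears in the PE detector,
# 2% of PE signal appears in the FITC detector.
SPILLOVER = [[1.00, 0.15],
             [0.02, 1.00]]

# An event carrying only FITC (true signal 1000, 0) is observed as (1000, 150).
fitc_only_observed = [1000.0, 150.0]
fitc_true, pe_true = compensate(fitc_only_observed, SPILLOVER)
```

In this sketch the apparent PE signal of the FITC-only event returns to approximately zero after compensation, which is the behavior single-stained controls are used to verify.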
Immunofluorescent Staining of a Primary Lung Adenocarcinoma
In order to determine the histologic location of stem/progenitor marker positive cells in primary adenocarcinoma of the lung, we prepared FFPE sections and stained for histology and the expression of key markers used in the flow cytometry panel. Cytokeratin, which identifies epithelial cells, and DAPI, which stains nuclei, were used in combination with other key markers. Figure 1 shows expression of CD117 on cytokeratin+ tumor cells from a primary adenocarcinoma of the lung. In this specimen most tumor cells, which are distinguished from normal cells by their histologic features, express CD117. CD117 was also detected on solitary cytokeratin negative mast cells. Other markers used in this study (CD44, CD90, α-SMA, Ki67), which were validated for FFPE sections and flow cytometry but are not shown here, are detailed in Supporting Information Methods.
Artifacts of Tissue Digestion
In order to detect rare events in disaggregated lung tumor and adjacent tissue, we removed several potential sources of noise and artifact (Fig. 2). For “doublet discrimination” (Fig. 2, row 1 column I), forward scatter pulse area was plotted versus forward scatter pulse width. Next (Fig. 2, II), forward scatter was plotted versus DAPI fluorescence to eliminate events with <2N DNA content. The events to the far left of the histogram are subcellular debris. The events smearing leftward from the 2N peak are either early apoptotic cells that have begun to degrade their DNA or hypodiploid tumor cells. Figure 2, column III is a one-parameter histogram of cytokeratin-FITC fluorescence, used to eliminate the last 10 channels, which contain saturating FITC fluorescence that cannot be spectrally compensated. Events with saturating FITC fluorescence represented only 0.2% of “clean” events, but spilled over into the PE and PE-Texas Red channels as false positive events if not removed. The remaining rows of Figure 2 illustrate the properties of the events that were eliminated during artifact removal. The details are presented in the figure legend.
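The sequential artifact-removal steps described above can be thought of as a chain of boolean filters applied to each event. The Python sketch below is illustrative only: the field names, the 1024-channel scale, and all numeric gate boundaries are hypothetical, and the doublet gate is simplified to a single pulse-width threshold, whereas in practice it is a region drawn on the area versus width plot.

```python
from dataclasses import dataclass


@dataclass
class Event:
    fsc_area: float   # forward scatter pulse area
    fsc_width: float  # forward scatter pulse width
    dapi: float       # linear DAPI fluorescence (DNA content)
    ck_fitc: int      # cytokeratin-FITC channel number


# All values below are assumptions chosen for the example.
N_CHANNELS = 1024          # histogram resolution
SATURATED_CHANNELS = 10    # top channels removed, as in the text
MAX_SINGLET_WIDTH = 120.0  # wider pulses are treated as doublets/clumps
MIN_2N_DAPI = 180.0        # lower edge of the 2N peak


def is_singlet(e: Event) -> bool:
    """Doublet discrimination, simplified to a pulse-width cutoff."""
    return e.fsc_width <= MAX_SINGLET_WIDTH


def has_at_least_2n_dna(e: Event) -> bool:
    """Reject debris and events with <2N DNA content."""
    return e.dapi >= MIN_2N_DAPI


def is_not_saturated(e: Event) -> bool:
    """Reject events falling in the last SATURATED_CHANNELS FITC channels."""
    return e.ck_fitc < N_CHANNELS - SATURATED_CHANNELS


def clean_events(events):
    """Apply the three gates in sequence and return the surviving events."""
    return [
        e for e in events
        if is_singlet(e) and has_at_least_2n_dna(e) and is_not_saturated(e)
    ]
```

Keeping each gate as a separate predicate makes it straightforward to count what each step removes, which mirrors the way the eliminated events are examined in the lower rows of Figure 2.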
A dump gate was used to eliminate events that are known to be outside the domain of events of interest. It has the advantage of also removing events that bind antibody nonspecifically, as well as events with autofluorescence at the detection wavelength. The use of the dump gate (Fig. 2, column IV) requires some explanation. In earlier iterations of this panel we simply used CD45 versus cytokeratin to identify and eliminate CD45+ (hematopoietic) cells. The sporadic appearance of a puzzling cytokeratin+/CD45+ population led us to add a myeloid/erythroid lineage cocktail to clarify this issue.
Figure 3 provides an investigation of the populations that we eliminate using this 2-parameter dump gate. Color-event gating is used to identify the CD45-/lineage- population (orange), cytokeratin+ events (green), and 3 major populations outside the CD45-/lineage- gate: lymphocytes (blue) and two myeloid populations (red, turquoise). Although the two populations staining for myeloid markers appear to spread into cytokeratin+ events, analysis of DNA staining shows that they are diploid and therefore probably not tumor cells. In contrast, CD45-/lineage- cells (orange) have a discernible population with DNA >2N, which is even more prominent in CD45-/lineage-/cytokeratin+ cells (green). We investigated the apparent CD45+/cytokeratin+ population further using imaging flow cytometry, which revealed monocytoid cells with cytokeratin+ cytoplasmic inclusions (Supporting Information Fig. S2). We conclude that heme lineage+/cytokeratin+ events should be excluded from analysis.
Choosing Classifiers and Outcomes for Multidimensional Flow Cytometry Data Analysis
In the present example, stem/progenitor marker expression (CD44, CD90, CD117, CD133) is compared between tumor cells and cells from adjacent normal lung in paired samples from 17 patients. After limiting the analysis to nonhematopoietic cells, we classified cells based on cytokeratin expression (epithelial versus nonepithelial or pre-epithelial) and ploidy (2N versus >2N), yielding four classes of cells on which to examine outcomes (stem/progenitor markers and light scatter, Fig. 4). The reason for using ploidy as a classifier is that in tumor samples, we could be certain that the majority of aneuploid cells were of bona fide tumor origin (as opposed to normal stromal or epithelial cells). It should be noted that the converse is not true: not all 2N cells are normal, and pseudodiploid tumor cells are well documented (32, 33). Had our question or hypothesis been different, we might have chosen to use ploidy as an outcome rather than as a classifier. Tumor infiltrating lymphocytes, identified by CD45 expression, were used as internal standards defining 2N DNA and lymphoid (i.e. small cell) light scatter.
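The four-class scheme described above (cytokeratin expression crossed with ploidy, with lymphocytes serving as the internal 2N standard) can be sketched in Python as follows. The function names, the use of the lymphocyte median as the 2N reference, and the cutoff of 1.15 times the 2N value are assumptions for illustration, not the gates used in this study.

```python
from collections import Counter
from statistics import median


def two_n_reference(lymphocyte_dapi):
    """Estimate the 2N DAPI value from CD45+ lymphocytes (internal standard)."""
    return median(lymphocyte_dapi)


def classify(ck_positive, dapi, ref_2n, cutoff=1.15):
    """Assign a nonhematopoietic event to one of four classifier populations.

    cutoff is a hypothetical ratio to the 2N reference above which an event
    is called >2N (aneuploid or proliferating).
    """
    ploidy = ">2N" if dapi / ref_2n > cutoff else "2N"
    cytokeratin = "CK+" if ck_positive else "CK-"
    return f"{cytokeratin}/{ploidy}"


def class_counts(events, ref_2n):
    """Count events per class; events are (ck_positive, dapi) pairs."""
    return Counter(classify(ck, dapi, ref_2n) for ck, dapi in events)
```

Outcome variables such as stem/progenitor marker expression and light scatter would then be summarized separately within each of the four classes.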
Figure 4 also illustrates the combination of a histogram array and data table, in which a representative sample is chosen to illustrate the analytical strategy, but the region statistics are based on the entire study population (17 patients). For the outcome variables, mean values, with 95% confidence intervals in parentheses, are provided to facilitate comparisons.
Scanning Figure 4, it is apparent that the majority of cells with >2N DNA (aneuploid/proliferating cells) are in the cytokeratin+ population and the majority of small cells are in the cytokeratin negative population. Inspection of the cytokeratin+ population reveals that cells bearing the stem/progenitor associated markers CD44, CD117 and CD133 are more prevalent among cells with >2N DNA. Although there is some marker coexpression, the CD44+, CD117+ and CD133+ populations are largely distinct. Although CD117 is the most prominent stem/progenitor marker in this series, some care must be taken in the analysis. Tumor samples in our dataset dichotomized on the basis of CD117 expression, and the example shown is CD117+, illustrating the difficulty in choosing a truly representative sample.
The cytokeratin negative population is more interesting than might have been assumed a priori. Particularly in the subset with >2N DNA, there is a prominent population of CD117+ cells, unlikely to be mast cells because of their ploidy. There is also a small but prominent population of low light scatter cells which, again because of aneuploidy, may represent small undifferentiated (i.e. cytokeratin negative) tumor cells. Finally, among the 2N population there is a robust population of cells coexpressing CD44 and CD90, most likely identifying tumor-associated mesenchymal stromal cells, an important component of the tumor niche.
An analogous graphic representing analysis of adjacent grossly normal lung tissue is shown in Supporting Information Figure S3. Cells with >2N DNA content are scant. Cytokeratin+ diploid cells include a small population of CD117+ cells, reported to be normal lung stem cells (24), as well as CD117-/CD133+ cells, absent or reduced in lung tumors. Mesenchymal stromal cells (CD44+/CD90+) are present and low light scatter cells are prominent in the diploid cytokeratin negative population.
Of 86 quantitative variables extracted from the data by conventional analysis (gates and regions), 22 were significantly different (p ≤ 0.05) between tumor and normal lung in an uncorrected bivariate comparison (Fig. 4, bottom panel). Like all multivariate cytometry problems, rigorous comparison of tumor and normal lung is complicated by the fact that there are more variables than observations, many of the variables are heavily correlated, and multiple comparisons increase the risk of chance association. This problem is thoroughly treated, using this dataset, in a companion article (34). However, even a simple bivariate analysis reveals that three of the four largest effects in the data set (p = 0.001) are those identifiable by morphology and simple immunohistochemistry: 1) CKP_2N = cytokeratin+/euploid (tumors have less); 2) CKP>2N = cytokeratin+/aneuploid (tumors have more); 3) CKP_SM = cytokeratin+/lymphoid light scatter (tumors have less). The fourth variable, CN2117P133N = cytokeratin negative/diploid/CD117+/CD133 negative, does not correspond to a known population, as mast cells were removed from the analysis by gating on CD45 negative events.
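The scale of the multiple-comparison problem can be made concrete with a short calculation. The Python sketch below uses the counts given in the text (86 variables, uncorrected threshold of 0.05); the Bonferroni correction is shown only as a familiar point of reference and is an assumption here, not the method applied in the companion article.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests with p <= alpha if no true differences exist."""
    return n_tests * alpha


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that holds the family-wise error rate at alpha."""
    return alpha / n_tests


N_VARIABLES = 86   # quantitative variables extracted (from the text)
ALPHA = 0.05       # uncorrected significance threshold (from the text)

chance_hits = expected_false_positives(N_VARIABLES, ALPHA)    # about 4.3
per_test_alpha = bonferroni_threshold(ALPHA, N_VARIABLES)     # about 0.00058
```

Under these assumptions, roughly four of the 86 uncorrected comparisons would be expected to reach p ≤ 0.05 by chance alone, which is the context in which the 22 nominally significant variables should be read.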
FFPE sections (Fig. 1) are critical to the interpretation of flow cytometry performed on digested tissues. These preparations provide a histologic context for key markers used in flow cytometry and provide a standard by which single-cell suspensions may be evaluated for selection bias. The principal disadvantages of immunohistostaining (the small number of cells that can be evaluated, the subjective nature of analysis, and the technical difficulties associated with polychromatic staining) are easily overcome by flow cytometry. In the present data set, immunofluorescent staining provided several important cues for interpretation of flow cytometric data: 1) Cytokeratin negative CD117+ cells are mast cells; 2) Some morphologically identifiable tumor cells are cytokeratin negative; 3) The dichotomy observed by flow cytometry between patients with CD117+ and CD117- tumors was confirmed; and 4) Tissue processing for flow cytometry results in overrepresentation of hematopoietic cells, especially lymphocytes. In previous studies, combining immunohistostaining with flow cytometry allowed us to localize the cytokeratin+/CD44+/CD90+ population observed by flow cytometry to the invasive edge of breast tumors (19), and to determine the histologic location of CD45-/CD146-/CD31-/CD34+ adipose stem cells (35).
Among the first flow cytometric applications for disaggregated tumors was the study of tumor infiltrating immune cells (36). Compared to the study of epithelial tissues, which are complex vascularized structures in which cells are organized by avid adhesion to extracellular matrices and each other, tissue infiltrating immune cells are weakly associated and readily recovered as viable single cells. Tumors can be challenging to dissociate because they may be hardened by fibrosis and contain necrotic areas. However, careful observation and sequential digestion of tumor tissue will actually yield more cells per gram than normal tissue (Supporting Information Table S1).
Even the most careful tissue digestion will result in undigested tissue clumps, apoptotic cells, dead cells and subcellular debris. All of these will interfere with analysis and interpretation unless identified in the data set and removed by logical gating. The methods described here have previously been used for adipose tissue (35) and for normal breast and breast cancer tissues (19). Stem/progenitor populations are rare in most tissues and require special considerations for their detection and enumeration (37, 38). Most importantly, it is necessary to examine a sufficient number of “clean” events to yield an appropriate number of analyzable events. A population of 100 cells in an analytical region will give a Poisson counting coefficient of variation of 10%, as originally worked out by Student for the hemocytometer (39). To take an example from a recently published article describing multipotential adult human lung stem cells (24), which were present at a frequency of 1/24,000, a minimum of 2.4 million events (post artifact removal) must be acquired to attain a counting CV of 10%. Dealing with such large data sets requires specialized analytical software designed for parallel processing. Several packages are available for offline analysis. A hierarchical approach to data analysis, such as the artifact removal, classifier, outcome method described here, helps to focus data exploration and analysis. However, the problem of more quantifiable features (i.e. analytical regions) than cases, with many variables highly correlated, is inherent in multidimensional cytometry data, and argues ultimately for an automated approach to data analysis. In its simplest form, this entails applying modern multivariate statistical techniques (40, 41) to the results of conventional gate/region type analyses such as those described here.
Eventually, it may be possible to replace manual gate/region-based analysis with automated cluster-finding algorithms, but this can be a double-edged sword if attaining complete objectivity requires us to relinquish a wealth of a priori knowledge concerning the biological constraints imposed on marker expression.
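The counting-statistics argument for rare-event detection can be checked with a few lines of Python. For a Poisson-distributed count of N events, the coefficient of variation is 1/sqrt(N), so the number of cells needed in a region is 1/CV^2 and the number of clean events to acquire is that value divided by the population frequency. The function names below are hypothetical; the numerical inputs are those quoted in the text.

```python
import math


def counting_cv(n_events_in_region):
    """Poisson counting coefficient of variation for a region with N events."""
    return 1.0 / math.sqrt(n_events_in_region)


def clean_events_required(one_in, target_cv):
    """Clean events needed so a population at frequency 1/one_in
    reaches the target counting CV.

    The intermediate value is rounded before taking the ceiling to guard
    against floating-point error (for example, 1 / 0.1**2 evaluating to
    slightly less than 100).
    """
    cells_needed = math.ceil(round(1.0 / target_cv ** 2, 9))
    return cells_needed * one_in


cv_for_100_cells = counting_cv(100)                # 0.10, i.e. 10%
events_needed = clean_events_required(24_000, 0.10)  # 2,400,000
```

The same function shows how quickly the requirement grows: halving the target CV to 5% quadruples the number of clean events that must be acquired.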
In this data set, three of the four most significant distinguishing features identified by bivariate analysis involved a combination of morphology (light scatter), cytokeratin expression, and DNA content, features long used to identify tumor cells. Prior to analysis of stem/progenitor marker expression on nonhematopoietic cells, we chose to identify four classifier populations on the basis of cytokeratin expression and DNA content. In tumor samples, cytokeratin+ cells with >2N DNA are clearly tumor cells, but this does not exclude the possibility of cytokeratin negative or pseudodiploid tumor cells. Similarly, normal lung airway cells have a proliferative (and therefore >2N) component (Supporting Information Fig. S3).
After subsetting the data on the basis of cytokeratin expression and DNA content, we found a striking similarity between stem/progenitor marker patterns in tumor and adjacent tumor-free lung. The conservation of expression patterns suggests that these proteins may play important functional roles in both tumor and normal tissues (24). Similarly, we (17) and others (42–44) have demonstrated that constitutive self-protection mediated by ABC transporter activity in normal tissue stem cells can be retained or re-expressed in a subset of malignant cells. These data support the interpretation that airway stem cells and their malignant counterparts share at least some of these growth factor receptors and adhesion molecules, as has been demonstrated in colon cancer and normal colon (45). For example, CD44/CD90 expression on cytokeratin negative cells is consistent with mesenchymal stem cells in normal tissue, but in metastatic cancer, CD44/CD90 coexpression on cytokeratin positive cells (19) may signal epithelial to mesenchymal transition (46).
Taken together, our finding that tumor cells share stem/progenitor and adhesion markers with tumor-free chronically injured lung tissue is consistent with the hypothesis that the self-renewing, self-protected tumorigenic cell can take the form of a stem-progenitor hybrid in aggressive epithelial neoplasms such as lung cancer (17). Combining stem-like self-renewal and protection with high proliferative capacity, they need not be rare to exploit mechanisms employed by normal tissue stem cells for their renewal and survival.
The authors would like to acknowledge our clinical collaborators James D. Luketich and Adam M. Brufsky, as well as Dr. Ludovic Zimmerlin, James Arbore and E. Michael Meyer for their assistance in the development of the methods presented here.