Isolation and Molecular Characterization of Cancer Stem Cells in MMTV-Wnt-1 Murine Breast Tumors



In human breast cancers, a phenotypically distinct minority population of tumorigenic (TG) cancer cells (sometimes referred to as cancer stem cells) drives tumor growth when transplanted into immunodeficient mice. Our objective was to identify a mouse model of breast cancer stem cells that could have relevance to the study of human breast cancer. To do so, we used breast tumors of the mouse mammary tumor virus (MMTV)-Wnt-1 mice. MMTV-Wnt-1 breast tumors were harvested, dissociated into single-cell suspensions, and sorted by flow cytometry on Thy1, CD24, and CD45. Sorted cells were then injected into syngeneic FVB/NJ female recipient mice. In six of seven tumors examined, Thy1+CD24+ cancer cells, which constituted approximately 1%–4% of tumor cells, were highly enriched for cells capable of regenerating new tumors compared with cells of the tumor that did not fit this profile (“not-Thy1+CD24+”). Resultant tumors had a phenotypic diversity similar to that of the original tumor and behaved in a similar manner when passaged. Microarray analysis comparing Thy1+CD24+ tumor cells to not-Thy1+CD24+ cells identified a list of differentially expressed genes. Orthologs of these differentially expressed genes predicted survival of human breast cancer patients from two different study groups. These studies suggest that there is a cancer stem cell compartment in the MMTV-Wnt-1 murine breast tumor and that this model has clinical utility for the study of cancer stem cells.

Disclosure of potential conflicts of interest is found at the end of this article.


Evidence for the existence of cancer stem cells, first described in blood tumors, has recently emerged in solid tumors. In the cancer stem cell model, a tumor has a small subset of cells that give rise to both tumorigenic and nontumorigenic cancer cells that make up the malignant cell component of the tumor. The nontumorigenic cancer cells appear to have limited capacity to replicate and thus do not have the ability to propagate the tumor [1, 2]. In acute myeloid leukemia, only a subpopulation of cells is capable of driving leukemic engraftment in NOD/SCID mice [3, 4]. These leukemia initiating cells (L-ICs) express some of the same cell surface markers as normal hematopoietic stem cells, leading to the suggestion that the L-ICs may have arisen from a normal hematopoietic stem cell whose capacity for self-renewal is deregulated.

We have previously demonstrated that a small population of tumorigenic (TG) cells, isolated from human breast tumors and characterized by the expression of the cell surface markers CD44+CD24−/lowLineage−, was capable of regenerating the phenotypic heterogeneity of the original tumor when injected subcutaneously into NOD/SCID mice [5]. In the tumors of most patients tested, these cells represented a minority population of cancer cells, were capable of being serially passaged, and retained the ability to reconstitute the heterogeneous population of cells within the original tumor upon serial passages using as few as 100 cells. Similarly, a TG cell population has been isolated from human brain tumors, characterized by the expression of CD133 [6].

Normal stem cells are an obvious possible target of transformation given their similarity to cancer stem cells in their ability to self-renew and proliferate. Supporting this, a recent study identified a population of murine lung epithelial cells with stem cell properties in culture that were expanded in early lung tumors [7]. In normal stem cells of adult tissues, the process of self-renewal and proliferation is tightly regulated to prevent an uncontrolled expansion of the stem cell pool. Disruption of genes involved in this regulation could result in unlimited expansion of self-renewing cells and may form the basis of tumorigenesis. Predictably, it has been shown that some oncogenes function to regulate self-renewal, the best-described being the Wnt/β-catenin signaling pathway, which plays a pivotal role in both self-renewal of normal stem cells and malignant transformation [8–10].

The Wnt pathway was first discovered in mouse mammary tumor virus (MMTV)-induced murine breast tumors in which proviral insertion resulted in deregulated expression of Wnt-1 and promoted the formation of mammary tumors [11]. Wnt signaling leads to the stabilization and accumulation of β-catenin, which is normally targeted for degradation. Consequently, β-catenin translocates into the nucleus, binds to the LEF/TCF family of transcription factors, and activates the transcription of genes that promote proliferation. Mutations in the Wnt signaling pathway that result in the constitutive activation of β-catenin have been implicated in tumorigenesis of colon cancer, as well as certain brain and skin cancers [12–16]. Although mutations in APC or β-catenin are rare in breast cancer [17], a recent study documented that autocrine Wnt secretion activates the canonical β-catenin signaling pathway in approximately 25% of breast cancer cell lines that were tested [18].

To begin to elucidate the cellular origin of TG cancer cells and the process by which transformation by different oncogenes occurs, we turned to mouse models since the study of cancer stem cells in humans is limited by the difficulty in investigating the molecular regulation of human organ development. In mice, normal breast development has been shown to involve breast stem cells with regenerative potential [19–22]. We used MMTV-Wnt-1 transgenic mice, which develop mammary tumors as a consequence of activation of β-catenin signaling, to prospectively identify a subpopulation of breast tumor cells that are highly enriched for cancer cells able to establish tumors when transplanted into syngeneic recipients. More specifically, as few as 50 tumor cells with the immunophenotype Thy1+CD24+CD49f+CD45− were capable of regenerating tumors with phenotypical heterogeneity similar to that of the original tumor when injected into the mammary fat pad of recipient mice. In addition, these cells retained their ability to replicate and differentiate after serial transplantation. Interestingly, CD24+CD49f+CD45− cells isolated from dissociated normal murine mammary tissue were capable of regenerating mammary tissue in vivo when implanted into cleared mammary fat pads of recipient mice, strongly suggesting that the phenotype of the TG cancer cells is similar to that of normal mammary epithelial cells with duct-regenerative capacity [21, 22]. In addition, microarray analysis of MMTV-Wnt-1 murine TG and nontumorigenic (NTG) cancer cells revealed a gene expression signature that was able to predict survival of human breast cancer patients. These results suggest that MMTV-Wnt-1 TG cells are relevant to the study of human disease. We propose our data as supporting evidence of cancer stem cells' relevance to human disease and for the use of the MMTV-Wnt-1 TG cells as an in vivo model for the study of cancer stem cells.

Materials and Methods

Tumor Harvest and Dissociation

Tumors from MMTV-Wnt-1 FVB/NJ (002934; Jackson Laboratory, Bar Harbor, ME) female transgenic mice were harvested when the tumors were approximately 1–2 cm³ (2–2.5 g), minced with a razor blade, and suspended into 20 ml of Medium 199 (Gibco-BRL, Gaithersburg, MD) with 20 mM Hepes buffer. One hundred Kunitz units of DNase I (D4263; Sigma-Aldrich, St. Louis), 8 Wünsch U of Liberase Blendzyme 2 (1998433; Roche Diagnostics, Basel, Switzerland), and 8 Wünsch U of Liberase Blendzyme 4 (1988476; Roche) were added. Digestions lasted for 2.5 hours at 38°C with pipetting every 30 minutes through a 10-ml pipette for manual dissociation. After 2 hours of digestion, another 100 Kunitz U of DNase I was added. Once digested, 80 ml of RPMI (BioWhittaker, Lonza Group Ltd., Basel, Switzerland) with 10% calf serum (HyClone, Logan, UT) was added to the digestion solution to inactivate the collagenases. Forty-micrometer nylon filters were used to filter the sample. Cells were spun at 190 relative centrifugal force for 5 minutes. The cell pellet was resuspended for 1 minute in 5 ml of ACK buffer for red blood cell lysis. Hanks' balanced saline solution (HBSS; BioWhittaker) with 2% heat-inactivated calf serum (HICS) was used to dilute the ACK buffer, and cells were filtered again through a 40-μm nylon filter. The filtered cells were spun down and resuspended in 4–6 ml of HBSS with 2% HICS (staining medium).

Cell Staining and Flow Cytometry

Cells were stained at a concentration of 1 × 10⁶ cells per 100 μl of HBSS with 2% HICS. Rabbit IgG (1 mg/ml) at a 1:100 dilution and antibodies at appropriate dilutions (CD24-phycoerythrin (PE), clone 30-F1, eBioscience Inc. [San Diego]; Thy1.1-allophycocyanin, clone HIS51, eBioscience; CD49f-fluorescein isothiocyanate, clone GoH3, BD Pharmingen [San Diego]; CD45-PE-Cy5, clone 30-F11, BD Pharmingen) were added. Staining lasted 20 minutes on ice, with light agitation of the staining vessels every 5 minutes. Cells were then washed with staining medium and resuspended in staining medium containing 7-aminoactinomycin D (7-AAD; final concentration, 1 μg/ml) or 4′,6-diamidino-2-phenylindole (DAPI; final concentration, 1 μg/ml).

The stained specimens were then analyzed using FACSVantage (BD Biosciences, San Diego) or FACSAria with either Diva or CellQuest software (BD Biosciences). Selection criteria that included side scatter and forward scatter profiles, depletion of 7-AAD- or DAPI-positive cells, and depletion of CD45+ cells were used. Cells with appropriate CD24, Thy1, and CD49f status were then collected. To reduce the rate of contamination by cells that did not fit the requested cell profile, all collected cells were sorted a second time (double sort). A small sample of the double-sorted cells was reanalyzed for purity. Final cell purity was greater than 95%. The cell counter of the flow cytometers was used to determine cell numbers. Cells were collected into RPMI or HBSS with 2% HICS.

Tumor Injection

FVB/NJ female mice (4–8 weeks of age) were sedated via intraperitoneal injection with a mixture of ketamine (Wyeth, Madison, NJ) and xylazine (Vedco, Inc., St. Joseph, MO) in phosphate-buffered saline. Sorted cells were suspended in 100 μl of collection medium, which was then mixed with 100 μl of Matrigel (354234; BD Biosciences). The cell mixture was then injected near the upper mammary fat pads of the mice using a 23-gauge, 1-inch needle that tracked caudally subcutaneously from the anterior rib border. Vetbond (1469SB; 3M, St. Paul, MN) was used to close the injection site. Mice were observed weekly for 6–8 months for tumor formation. Some resultant tumors were analyzed and injected in the same manner as de novo tumors. All tumorigenic cell frequencies were calculated using L-Calc by StemCell Technologies (Vancouver, BC, Canada). The Thy1+CD24+ 2,000-cell dose was an outlier (Fig. 1D) and was excluded when calculating stem cell frequencies, because the L-Calc program was unable to analyze the data when this dose was included.

Figure 1.

Only a subset of the mouse mammary tumor virus (MMTV)-Wnt-1 tumor cells has tumor-forming capacity. Flow cytometry was used to separate MMTV-Wnt-1 cells based on surface antigen expression. The collected populations of cells were injected into recipient background mice and observed for tumor formation. Black columns represent the numbers of injections, and gray columns represent tumors resulting from those injections. (A): CD45− cells were injected at the listed cell doses in a limiting dilution manner. (B): CD24+CD45− cells were injected in 1,000-cell doses. (C): Thy1+CD45− cells were injected in 1,000-cell doses. (D): The indicated cell populations from six different tumors were injected at the dosages listed. Denominators in the table represent the number of injections, and numerators represent the number of resultant tumors from the injected tumor cells. Abbreviation: K, thousand.

Real-Time Polymerase Chain Reaction

Triplicate collections of 1,000 cells of both CD24+Thy1+CD45− and “not-CD24+Thy1+CD45−” cells were collected in Trizol. RNA was collected and cDNA was made by common molecular biology techniques. TaqMan Gene Expression Assays (Applied Biosystems, Foster City, CA) were used in performing real-time (RT) polymerase chain reaction (PCR). Hprt-1 (Mm0046968; Applied Biosystems) and Krt1-19 (Mm00492980; Applied Biosystems) detection oligos were obtained and RT-PCR assays were done per product protocol in 20-μl PCR volumes. RT-PCRs were run in the University of Michigan Array Core.

Mouse Microarray

Three MMTV-Wnt-1 tumors were harvested and cells were sorted into CD24+Thy1+CD45− and not-CD24+Thy1+CD45− populations of 10,000 cells each using the above-mentioned protocol. RNA isolation was accomplished using RNAqueous-Micro (1931; Ambion, Austin, TX). The RNA was used by the University of Michigan to produce array data using the NuGEN Ovation Biotin labeling system (San Carlos, CA) and Affymetrix Mouse 430 2.0 GeneChips (Santa Clara, CA).

Microarray Data Analysis


Perfect match chip density and RNA degradation plots were analyzed to ensure array quality. Expression values for each array element were calculated using the robust multiarray average algorithm.

Defining the Mouse Signature.

Differentially expressed (DE) genes were obtained by using fold change rank ordering with a nonstringent p value cut-off [23, 24]. To generate our DE gene list, we selected genes that had a normalized intensity of ≥100 on at least one array, that varied by more than twofold between the TG and NTG samples, and that had nominal paired t test p values of ≤.05. This resulted in a list of 274 array elements representing 252 unique genes.
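The three selection criteria can be sketched in code. This is a minimal illustration under stated assumptions, not the analysis pipeline used for the study: the function names are hypothetical, fold change is taken as the difference of mean log2 intensities, the paired t test is run on log2 intensities, and the p value uses the closed form that holds only for three pairs (2 degrees of freedom), matching the three tumors profiled here.

```python
import math
from statistics import mean, stdev

def paired_t_p_value_3_pairs(a, b):
    """Two-sided paired t test p value for exactly three pairs (2 degrees of freedom)."""
    diffs = [x - y for x, y in zip(a, b)]
    if len(diffs) != 3:
        raise ValueError("the closed form below is only valid for three pairs")
    sd = stdev(diffs)
    if sd == 0:
        return 0.0 if mean(diffs) != 0 else 1.0
    t = mean(diffs) / (sd / math.sqrt(3))
    # Student t with 2 df has the closed-form CDF 1/2 + t / (2 * sqrt(2 + t^2)),
    # so the two-sided p value is 1 - |t| / sqrt(2 + t^2).
    return 1 - abs(t) / math.sqrt(2 + t * t)

def is_differentially_expressed(tg, ntg, min_intensity=100, min_fold=2.0, max_p=0.05):
    """Apply the intensity, fold-change, and p value filters to one array element.

    tg and ntg hold the normalized intensities of the element in the paired
    TG and NTG samples (assumption: same tumor order in both lists).
    """
    # Criterion 1: normalized intensity of at least 100 on at least one array.
    if max(tg + ntg) < min_intensity:
        return False
    log_tg = [math.log2(v) for v in tg]
    log_ntg = [math.log2(v) for v in ntg]
    # Criterion 2: more than twofold change in either direction.
    if abs(mean(log_tg) - mean(log_ntg)) <= math.log2(min_fold):
        return False
    # Criterion 3: nominal paired t test p value of at most .05.
    return paired_t_p_value_3_pairs(log_tg, log_ntg) <= max_p

# Made-up intensities for illustration only.
print(is_differentially_expressed([800, 1000, 900], [200, 240, 230]))
print(is_differentially_expressed([120, 90, 150], [100, 110, 95]))
```

Applying the intensity and fold-change filters before the t test keeps weakly expressed or barely changed elements out of the list regardless of their p value, which is the point of combining fold change ranking with a nonstringent p value cut-off.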

Array Annotation.

For Affymetrix arrays, the latest annotation files were downloaded from the Affymetrix Web site and used for all further analysis. For the Rosetta/Netherlands Cancer Institute (NKI) oligonucleotide array, oligo sequences were downloaded from the Rosetta Web site, and a Blast search was performed between oligos and sequences from the National Center for Biotechnology Information genes database to annotate the array. Array elements from different array platforms were mapped to each other by gene symbols.

Data Transformation.

For the Affymetrix data in the downloaded data sets, signal intensity values of probes were transformed into log ratios, using the average intensity of each probe across all samples within the data set as the denominator.
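As a minimal sketch of this transformation, assuming base-2 logarithms and an arithmetic mean (neither is specified in the text), each probe's intensities can be converted as follows. The function name is hypothetical.

```python
import math

def to_log_ratios(intensities):
    """Convert one probe's intensities across all samples into log2 ratios.

    The denominator is the probe's average intensity across the samples
    of the same data set (assumption: arithmetic mean, log base 2).
    """
    denominator = sum(intensities) / len(intensities)
    return [math.log2(value / denominator) for value in intensities]

# Made-up intensities for illustration: the mean is 200, so the middle
# sample maps to 0 and the first sample to -1 (half the mean).
print(to_log_ratios([100, 200, 300]))
```

Expressing each probe relative to its own data-set mean puts the Affymetrix values on a scale comparable to the log ratios of a two-color platform.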

Mouse/Human Probe Set Mapping for Genes in Gene Signature.

An ortholog mapping file (Mouse430_2_ortholog.csv) between human and mouse probe sets was downloaded from the Affymetrix Web site. Human orthologs were identified for 205 of the mouse genes in the signature. One hundred sixty-eight genes were present in the NKI database. If there were multiple probes mapping to a single gene in the TG/NTG list, only the probe with the highest intensity was used for further analysis. In the two human data sets, if multiple probes were annotated to a single gene, the probe with the largest standard deviation (i.e., largest variation) among patients was used.
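The probe-selection rule for the human data sets can be illustrated with a short sketch. The function name, probe identifiers, and values below are hypothetical and serve only to show the rule of keeping the most variable probe per gene.

```python
from statistics import stdev

def pick_most_variable_probe(probes_by_gene, expression):
    """For each gene, keep the probe with the largest standard deviation across patients.

    probes_by_gene: gene symbol -> list of probe identifiers annotated to it
    expression:     probe identifier -> list of expression values, one per patient
    """
    return {
        gene: max(probes, key=lambda probe: stdev(expression[probe]))
        for gene, probes in probes_by_gene.items()
    }

# Made-up example: probe_b varies more across patients than probe_a.
probes_by_gene = {"KRT19": ["probe_a", "probe_b"], "ELF5": ["probe_c"]}
expression = {
    "probe_a": [0.1, 0.0, -0.1, 0.05],
    "probe_b": [1.2, -0.8, 0.3, -1.1],
    "probe_c": [0.4, 0.2, -0.3, 0.1],
}
print(pick_most_variable_probe(probes_by_gene, expression))
```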

Statistical Analyses.

Average linkage clustering was carried out using the Cluster 3.0 software [25] and visualized using TreeView software (Lawrence Berkeley National Lab, Berkeley, CA) [26]. Gene expression data for the 168 and 205 aforementioned genes were extracted from the NKI database and the Karolinska Institute/Hospital database [27, 28], respectively. The 295 patients of the NKI database and the 395 patients of the Karolinska Institute/Hospital database were split into two groups using agglomerative hierarchical clustering. Kaplan-Meier survival analysis was performed using the GraphPad Prism software (version 4.03) (GraphPad Software, Inc., San Diego). Statistical significance between the curves from different groups of patients was assessed using log-rank tests. Overall survival was defined by death from any cause.
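For readers unfamiliar with the Kaplan-Meier estimator, the following sketch shows the product-limit calculation for a single patient group. It is an illustration only, not the GraphPad Prism implementation; the function name and the follow-up data are made up, and the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time point with at least one death.

    times[i]  is the follow-up time of patient i
    events[i] is True if the patient died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, died in zip(times, events) if ti == t and died)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve

# Hypothetical follow-up (years) for five patients; False marks censoring.
times = [2, 4, 4, 6, 8]
events = [True, True, False, False, True]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 2))
```

Censored patients contribute to the number at risk until they leave the study, which is why the estimate differs from a simple fraction of patients still alive.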


Results

CD24 and Thy1 Tumor Cells Are Enriched for Tumorigenic Cells

We first tested viable CD45-negative cells for tumorigenicity since CD45 is an established exclusion marker for hematopoietic cells. CD45− cells from the MMTV-Wnt-1 tumors were injected subcutaneously near the upper nipple lines of syngeneic (FVB/NJ) mice in a limiting dilution analysis. Cell doses of 1,000 cells or fewer did not produce new tumors (Fig. 1A). At the 2,000-cell dose, 4 of 15 injections produced tumors. At the 10,000-cell dose, tumors arose from five of six injections. This limiting dilution experiment showed that the frequency of TG cells was 1 in 7,496 (95% CI of 4,865–11,550) tumor cells. This calculation did not take several variables into account, including potential local environmental issues and cell-cell interaction requirements, and thus could have underestimated the frequency of TG cells.

We next tested Thy1 and CD24 as additional markers to help segregate TG from NTG populations since these markers had been previously shown to be differentially expressed during development in other systems [5, 29, 30]. Tumor cells demonstrated differential expression patterns for both markers, consistent with tumors being composed of heterogeneous cell populations. CD24 and Thy1 were both useful markers in differentiating TG from NTG populations by flow cytometry. When injected into the breast of syngeneic mice, 15 of 38 injections of 1,000 CD24+CD45− cells formed tumors, whereas only 1 of 30 injections of 1,000 CD24−CD45− cells did so (p = .004) (Fig. 1B). This translates to a TG frequency in the CD24+CD45− population of 1 in 2,060 (95% CI of 1,236–3,434) tumor cells. Ten of 25 injections of Thy1+CD45− cells resulted in tumors, but only 1 of 15 injections of Thy1−CD45− cells gave rise to tumors (p = .03) (Fig. 1C). The calculated TG frequency of the Thy1+CD45− population is 1 in 1,958 (95% CI of 1,046–3,663) tumor cells. This suggested that TG cells existed within both the CD24+CD45− and Thy1+CD45− subpopulations.
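When all injections use a single cell dose, the frequency estimate follows directly from the fraction of injections that failed to form tumors. The sketch below assumes the single-hit Poisson model that commonly underlies limiting dilution analysis; whether L-Calc computes single-dose estimates in exactly this way is an assumption, and the function name is hypothetical. Applied to the Thy1+CD45− data (10 tumors from 25 injections of 1,000 cells), it gives the 1 in 1,958 figure reported above.

```python
import math

def tg_frequency_single_dose(tumors, injections, cells_per_injection):
    """Return N such that 1 in N cells is tumorigenic, from a single cell dose."""
    fraction_negative = 1 - tumors / injections
    if not 0 < fraction_negative < 1:
        raise ValueError("needs both tumor-forming and tumor-free injections")
    # Poisson zero term: P(no tumor) = exp(-dose / N), so N = -dose / ln(P(no tumor)).
    return -cells_per_injection / math.log(fraction_negative)

print(round(tg_frequency_single_dose(10, 25, 1000)))  # prints 1958
```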

Flow cytometry was then used to isolate cells based on the combination of CD24, Thy1, and CD45. Approximately 1%–4% of the tumor cells were Thy1+CD24+CD45− (Fig. 2A, 2B). The bulk of these cells appeared to express low levels of Thy1 and low to medium levels of CD24 (Fig. 2B). Limiting dilutions were done to determine whether the Thy1+CD24+CD45− tumor cells were enriched for TG cells. In six of the seven de novo tumors examined, this was indeed the case. Tumors formed in eight of nine, eight of nine, three of five, and four of five injections of 1,000, 500, 100, and 50 Thy1+CD24+CD45− cells, respectively (Fig. 1D). This represents a TG frequency of 1 in 213 (95% CI of 114–396) (Fig. 1D). At the 2,000-cell dose of Thy1+CD24+ cells, only 4 of 10 injections resulted in tumor formation. The lower-than-expected frequency of engraftment from the Thy1+CD24+ cells could be due to damage to the cells during isolation that day. Alternatively, the secondary mutations leading to malignant transformation in that particular tumor could have resulted in a tumor whose tumorigenic cells were different from those driving the growth of the other tumors [31]. In six of seven tumors tested, the remaining tumor cells that were not-Thy1+CD24+CD45− were significantly depleted of cells capable of forming tumors. Only 2 of 25 injections of 1,000 of the remaining tumor cells formed tumors (Fig. 1D). The tumor-forming-cell frequency in the not-Thy1+CD24+CD45− population was 1 in 10,447 (95% CI of 5,215–20,927). These data suggest that the TG cells reside specifically in the small minority of cells that make up the Thy1+CD24+ subpopulation. Based on limiting dilution analysis, using both Thy1 and CD24 simultaneously enriched for TG cells by approximately 8–10-fold compared with using the markers independently.
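With several cell doses, the frequency is usually obtained by maximum likelihood. The sketch below again assumes the single-hit Poisson model and solves the likelihood equation by bisection; it is not the L-Calc algorithm, and the function names are hypothetical. Using the four Thy1+CD24+CD45− doses listed above (the 2,000-cell dose excluded, as in the Methods), the estimate lands close to the reported 1 in 213.

```python
import math

def score(frequency, doses):
    """Derivative of the single-hit log-likelihood with respect to the TG frequency."""
    total = 0.0
    for cells, injections, tumors in doses:
        p_tumor = -math.expm1(-cells * frequency)  # 1 - exp(-cells * frequency)
        total += tumors * cells * (1 - p_tumor) / p_tumor
        total -= (injections - tumors) * cells
    return total

def mle_cells_per_tg_cell(doses):
    """Return N such that 1 in N cells is tumorigenic (maximum likelihood estimate)."""
    low, high = 1e-9, 1.0  # search bounds for the frequency per cell
    for _ in range(200):
        mid = math.sqrt(low * high)  # bisect on a log scale
        if score(mid, doses) > 0:
            low = mid
        else:
            high = mid
    return 1 / math.sqrt(low * high)

# (cells per injection, number of injections, number of tumors)
thy1_cd24_doses = [(1000, 9, 8), (500, 9, 8), (100, 5, 3), (50, 5, 4)]
print(round(mle_cells_per_tg_cell(thy1_cd24_doses), 1))
```

The score function decreases monotonically with the frequency, so bisection is sufficient and no external optimization library is needed.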

Figure 2.

Wnt-1 tumor Thy1+CD24+CD45− cells make up 1%–4% of total tumor cells and are enriched for tumor-forming cells. (A): Fluorescence-activated cell sorting plots of a representative mouse mammary tumor virus-Wnt-1 tumor in a single-cell suspension stained with CD24 and Thy1 and depleted for CD45+ (hematopoietic cells) and 4′,6-diamidino-2-phenylindole+/7-aminoactinomycin D+ cells (dead or dying cells). Percentages of tumor cells in each quadrant are indicated. The Thy1+CD24+CD45− cell frequency was 1%–4% in the tumors analyzed. The bold box indicates the selection gate of Thy1+CD24+CD45− used to sort cells. (B): Thy1+CD24+CD45− cells make up 1%–4% of the total tumor cells. As can be seen in (B), most of the Thy1+CD24+CD45− cells are Thy1lo/CD24lo. The remaining cells that do not fit the Thy1+CD24+CD45− profile, referred to as not-Thy1+CD24+CD45−, are shown in (C). The Thy1+CD24+CD45− cells, when injected into recipient mice, resulted in tumor formation, whereas the not-Thy1+CD24+CD45− cells were depleted of tumorigenic cells. (D): The resultant tumor from injection of the Thy1+CD24+CD45− cells from (A) showed a marker profile similar to that of the original tumor. Abbreviations: APC, allophycocyanin; PE, phycoerythrin.

Tumorigenic Cells Give Rise to Tumors Containing Cells Whose Phenotypic Diversity Resembles That of the Original Tumor

We next performed serial transplantation studies to determine whether Thy1+CD24+CD45− cells were able to self-renew and recapitulate the heterogeneity found in the original tumors. Flow cytometry analysis of secondary tumors generated by as few as 50 Thy1+CD24+CD45− cells showed that they contained populations of TG and NTG cells that were phenotypically similar to those of the original tumor (Fig. 2D). Serial transplantation experiments showed that the TG and NTG cells from secondary tumors had tumor-forming capacities similar to those of the primary tumors (supplemental online Fig. 1). Tertiary tumors also contained cancer cells that were phenotypically similar to those found in de novo tumors (supplemental online Fig. 2). These results demonstrate that TG cancer cells are enriched in the Thy1+CD24+CD45− cell fraction, are capable of self-renewal, and can give rise to fractions of NTG cells similar to those in the original tumor when analyzed by flow cytometry. The few tumors that grew from the not-Thy1+CD24+CD45− fraction could have been due to the inability to completely eliminate Thy1+CD24+ cells by sorting. Alternatively, there could have been a small population of TG cells that did not express CD24 or Thy1. We next asked whether CD49f, a marker expressed at high levels by normal mammary stem cells [21], could further enrich for TG cells. At least 80% of the Thy1+CD24+CD45− TG cancer cells were also CD49f+ (supplemental online Fig. 3). Cells that were not-Thy1+CD49f+CD24+CD45− were even further depleted of tumor-forming cells than not-Thy1+CD24+CD45− cells (Fig. 1D). The tumor-forming-cell frequency in the Thy1+CD24+CD49f+CD45− population is 1 in 203 (95% CI of 71–581), which, as predicted from the flow cytometry analysis, is almost identical to the 1 in 213 cell frequency seen in the Thy1+CD24+CD45− population. The tumor-forming-cell frequency in the not-Thy1+CD24+CD49f+CD45− population is 1 in 54,905 (95% CI of 20,638–146,066).

Microarray Analysis of the MMTV-Wnt-1 Tumorigenic Cells

To identify molecular differences between TG and NTG cells, Thy1+CD24+CD45− (TG) cancer cells and not-Thy1+CD24+CD45− (NTG) cells from tumors in three different mice were used to generate probes for Affymetrix Mouse 430 2.0 oligonucleotide arrays. To begin, we were interested in analyzing the expression of developmentally regulated cytokeratins in our cell populations. The TG cells tended to overexpress basal keratins, such as Krt5, Krt14, and Krt17, whereas the NTG cells overexpressed luminal keratins, including Krt18 and Krt19. Interestingly, the expression pattern of these genes mirrored their expression in normal mammary stem cells and their progenitors in a recently published microarray data set [21] (supplemental online Fig. 4), indicating that TG cells share some attributes of normal breast stem or early progenitor cells. To validate these findings, we evaluated the expression of Krt19 by quantitative RT-PCR in an independent tumor and confirmed its overexpression in NTG cells (supplemental online Fig. 5).

We next set out to identify a list of differentially expressed probes between the two populations, which we termed the TG/NTG signature. To do so, we used fold change rank ordering with a nonstringent p value cut-off [32, 33] and selected well-expressed genes that varied by more than twofold between the TG and NTG samples and that had a nominal paired t test p value of ≤.05. This identified 274 differentially expressed probes representing 252 unique genes (supplemental online Fig. 6).

Figure 3 depicts results of agglomerative hierarchical cluster analysis using the TG/NTG signature. Genes overexpressed in TG cells included Notch4, a gene previously implicated in MMTV-induced breast cancer [34]. TG cells also overexpressed B-cell chronic lymphocytic leukemia/lymphoma 6, member B (Bcl6b), a transcriptional repressor recently shown to be critical for spermatogonial stem cell self-renewal and survival [35]. This gene may play a similar role in self-renewal of TG cells. Among genes overexpressed in NTG cells were genes implicated in luminal breast cell function, such as whey acidic protein (Wap) and lactotransferrin (Ltf), as well as the luminal breast epithelial marker Elf5 [36] and claudin 1 (Cldn1) [37]. Elf5 is an epithelial-specific ETS transcription factor that is required for proliferation and differentiation of mouse mammary alveolar epithelial cells during pregnancy and lactation [38]. These data concur with our observations of differential cytokeratin expression and indicate that the NTG population includes cells that display a more mature epithelial cell phenotype than TG cells.

Figure 3.

Identification of genes differentially expressed between tumorigenic and nontumorigenic mouse mammary tumor virus-Wnt-1 breast cancer cells. Analyses were performed as described in the text to generate a list of 252 genes that were differentially expressed between tumorigenic and nontumorigenic cells. The data are displayed as a hierarchical cluster, where rows represent genes and columns represent samples. Colored pixels reflect the magnitude of expression: shades of red and green represent induction and repression, respectively, relative to the mean for each gene. A complete list of differentially expressed genes is available in the supplemental online data.

The TG/NTG Signature Predicts Survival of Breast Cancer Patients

Gene expression patterns of whole tumors can be used to predict survival and outcomes of patients with cancer [27, 28, 39, 40]. In addition, we have recently shown that gene signatures derived from tumorigenic cells of human breast cancers can predict overall survival and metastasis-free survival in breast cancer patients [41]. We wished to determine whether the mouse breast cancer signature derived from comparing TG and NTG cells could also be used to predict outcomes of human breast cancer patients. We reasoned that genes that are important for breast cancer stem cell machinery could well be conserved across species. Human orthologs were found for 205 of the TG/NTG genes. Of these, 168 genes were present in the previously published NKI whole tumor gene expression data set of 295 patients with stage I or II primary breast carcinoma [40]. Hierarchically clustering the patients using only these 168 genes separated them into two groups with markedly different outcomes. Kaplan-Meier analysis revealed overall survival of 75% versus 49% (p < .0003) at 12 years for the two groups (Fig. 4A; supplemental online Figs. 7, 8). Even though the 168 genes were a subset of the 252 initial mouse genes, their expression still separated the mouse TG and NTG populations (supplemental online Fig. 9). Thus, genes differentially expressed in TG and NTG cells of MMTV-Wnt-1 mouse tumors predict clinical outcomes in patients with breast cancer.

Figure 4.

The mouse tumorigenic/nontumorigenic gene expression signature predicts survival of breast cancer patients. One hundred sixty-eight human orthologs of the 252 tumorigenic/nontumorigenic signature genes were present in the previously published Netherlands Cancer Institute (NKI) whole tumor gene expression data set of 295 patients with stage I or II primary breast carcinoma [40]. Using the NKI gene expression data for these 168 genes, patients were separated into two groups by agglomerative hierarchical clustering. The good prognostic group contained 214 patients, whereas the poor prognostic group contained 81 patients. Kaplan-Meier survival curves for the two groups were calculated using overall survival as the clinical end point (A). Agglomerative hierarchical clustering was similarly performed on the Karolinska Institute/Hospital microarray gene expression data set of 395 human breast tumors [27, 28]. Patients were again stratified by overall survival, with the good prognostic group containing 207 patients and the poor prognostic group containing 188 patients (B). The respective p values are shown.

As patient prognosis is dependent on clinical grading criteria, we were interested to see whether our prognostic groups correlated with clinical grade. Related to this, as MMTV-Wnt-1 tumors are classified as estrogen receptor (ER)-positive [31], and human ER+ tumors have better prognosis than ER− tumors, we were interested to examine the distribution of ER+ and ER− tumors within our two prognostic groups. As expected, the vast majority of the ER− patients fell within the worse prognosis group, as did patients with other poor prognostic markers such as basal cell subtype and poor histologic grade (Fig. 5). This agreement between our prognostic groups and established clinical prognostic indicators further supports the model's relevance to human disease.

Figure 5.

Netherlands Cancer Institute patient characteristics for the good and poor prognostic groups identified by the tumorigenic/nontumorigenic signature. Patients were divided into good and poor prognostic groups as described in Figure 4.

To verify these results, we also analyzed two published databases from the Karolinska Institute/Hospital [27, 28] consisting of breast tumor microarray expression profiles from 395 patients. Using the same gene signature to hierarchically cluster this validation data set, we could again separate patients into two groups, one with better survival than the other. Kaplan-Meier analysis revealed survival of 85% versus 66% (p < .0001) at 10 years (Fig. 4B).


Discussion

We show that in six of the seven de novo breast tumors from MMTV-Wnt-1 transgenic mice, a minority population of phenotypically distinct cancer cells defined by the cell surface marker expression Thy1+CD24+CD49f+CD45− was highly enriched for cells able to form tumors when transplanted into syngeneic recipients. In multiple serial passage experiments, the TG cancer cells gave rise to tumors in which the ratio of Thy1+CD24+CD49f+CD45− TG cells to not-Thy1+CD24+CD49f+CD45− NTG cells was similar to that in the originating tumor. Moreover, NTG cells overexpressed luminal cytokeratins, Elf5, Cldn1, Wap, and Ltf, whereas TG cells overexpressed basal cytokeratins, suggesting that at least some of the NTG cells are more differentiated than TG cells. Thus, a small subset of tumor cells (a) repopulates itself, (b) gives rise to a majority population of cells that is phenotypically distinct, and (c) maintains similar ratios of the two cell populations. These observations are evidence that the NTG cells of MMTV-Wnt-1 model tumors are generated by TG cells [1].

The surface markers used to identify the mouse MMTV-Wnt-1 TG cells proved to be interesting. The mouse TG cells express CD49f and CD24, markers that previously have been shown to enrich normal mouse mammary stem cells [21, 22]. This similarity raises the possibility that the TG cells in many MMTV-Wnt-1 mouse tumors are the transformed counterpart of the normal mammary stem or early progenitor cells. Interestingly, it has previously been shown that the breast stem cell population is increased in premalignant MMTV-Wnt-1 mice [22]. Thy1, which is used to identify both human and mouse hematopoietic stem cells [43, 44], was expressed by a subset of the CD49f+CD24+ breast cancer cells. Interestingly, the Thy1+ fraction was enriched for the TG cells. This is the first use of Thy1 in the identification of an epithelial cancer stem cell population.

Our 252-gene signature derived from MMTV-Wnt-1 TG cells provided strong evidence for human biologic relevance since it had prognostic power in breast cancer patients. A simple explanation for the ability of the TG cancer cell signature to distinguish prognostic groups in breast cancer is that a closer identity with the signature represents a higher cancer stem cell frequency in a particular tumor. Alternatively, this signature could reflect particular mutations that affect biological properties of the TG and NTG cells in poor prognosis patients, resulting in progeny NTG cells with gene expression patterns more closely resembling those of the cancer stem cells. Since many oncogenic mutations block differentiation, it follows that mutations that inhibit phenotypic maturation might give rise to tumors whose gene signatures share similarities with the TG gene signature.

Although transgenic mouse models may be considered artificial systems, they are nonetheless valuable tools that will enable us to dissect molecular pathways involved in both normal and cancer stem cell biology. This approach has already been used successfully in leukemic stem cell studies to explain the phenotypic heterogeneity in leukemia stem cells seen in human and mouse leukemias. The combination of murine leukemia models [45–48] and concurrent lineage mapping of normal hematopoiesis has shed light on the hematopoietic stem cell and progenitor compartments that are targeted by various leukemogenic genetic defects. Similarly, we expect that a better understanding of the cellular hierarchy of epithelial cancer stem cells will be gained from future work using a combination of our MMTV-Wnt-1 cancer stem cell model and studies of normal murine breast development. It is likely that such studies will shed light on why the phenotype of the cancer stem cells can sometimes vary when tumors from individual patients or MMTV-Wnt-1 mice are examined [5, 49]. The ability to genetically manipulate mouse strains will allow functional studies of candidate TG-specific genes in an in vivo setting.

In addition, we expect to study the interaction between stromal and cancer cells using our model. One of the drawbacks of human xenograft cancer stem cell models is that the effects of the interaction between mouse stromal cells and the human cancer cells are unclear [50]. In our model, because the TG cells are grown in syngeneic mice, normal stromal interactions that affect cancer stem cells can be studied in a more physiologic manner. Furthermore, a system of readily available tumors is essential for the study of cancer stem cells, which are difficult to collect because the cells are rare and human samples are hard to obtain. These studies demonstrate that the MMTV-Wnt-1 mouse model should serve as a powerful and clinically relevant tool to dissect the properties of tumorigenic cancer cells, which will ultimately yield insights into their human cancer stem cell counterparts.

Disclosure of Potential Conflicts of Interest

The authors indicate no potential conflicts of interest.


This work was supported by NIH CA104987 (to M.F.C.). This work was also supported by the Breast Cancer Research Foundation and the Virginia and D.K. Ludwig Foundation (to M.F.C.). M.D. was supported by an ASTRO Research Seed grant. Author contributions were as follows: R.W.C. was responsible for the identification of Wnt-1 TG cells, mouse array data, and RT-PCR data and contributed to the writing and preparation of this article. M.D. contributed to the bioinformatic analyses of the array data and to the preparation of this article. G.Y.C. contributed to the molecular analysis and the writing and preparation of this article. X.W., M.D., A.G., J.L., and K.S. contributed to the statistical analysis of mouse versus human array data and the predictive survival analysis. M.F.C. is the primary investigator.