Identification of cellular and genetic drivers of breast cancer heterogeneity in genetically engineered mouse tumour models

The heterogeneous nature of mammary tumours may arise from different initiating genetic lesions occurring in distinct cells of origin. Here, we generated mice in which Brca2, Pten and p53 were depleted in either basal mammary epithelial cells or luminal oestrogen receptor (ER)‐negative cells. Basal cell‐origin tumours displayed similar histological phenotypes, regardless of the depleted gene. In contrast, luminal ER‐negative cells gave rise to diverse phenotypes, depending on the initiating lesions, including both ER‐negative and, strikingly, ER‐positive invasive ductal carcinomas. Molecular profiling demonstrated that luminal ER‐negative cell‐origin tumours resembled a range of the molecular subtypes of human breast cancer, including basal‐like, luminal B and ‘normal‐like’. Furthermore, a subset of these tumours resembled the ‘claudin‐low’ tumour subtype. These findings demonstrate that not only do mammary tumour phenotypes depend on the interactions between cell of origin and driver genetic aberrations, but also multiple mammary tumour subtypes, including both ER‐positive and ‐negative disease, can originate from a single epithelial cell type. This is a fundamental advance in our understanding of tumour aetiology. © 2014 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.


Introduction
Breast cancer is a heterogeneous disease encompassing different histological and molecular subtypes, with distinct clinical behaviours [1][2][3][4]. The biological basis of this heterogeneity remains poorly understood; improving this understanding is key to better patient stratification. Although distinct molecular events occurring in different target cells may explain the variety of breast cancer phenotypes [5,6], there is not necessarily a direct correlation between tumour phenotype and its cell of origin. For instance, breast cancers of 'basal-like' subtype were proposed to arise from basal stem cells [7][8][9][10], but current models suggest that a substantial proportion, if not all, of these tumours derive from luminal-progenitor cells [11][12][13]. Disentangling the complex relationship between tumour-initiating genetic events, target cells and tumour phenotypes is ideally suited to studies using genetically engineered mouse models.
We previously demonstrated that when Brca1 and p53 loss were targeted to either basal or luminal ER-negative mammary (lumER neg ) cells in mouse models, the balance of tumour phenotypes depended on the cell of origin. Although all tumours were molecularly classified as 'basal-like', histologically the basal cell-origin tumours were mostly adenomyoepitheliomas (AMEs), while the lumER neg cell-origin tumours were high-grade invasive ductal carcinomas of no special type (IDC-NSTs) [13]. It remains to be defined, however, whether the cell of origin is the prime determinant of tumour subtype, or whether initiating genetic hits also play a role in shaping phenotype, in addition to simply stimulating tumourigenesis.
To address this question, we generated conditional mouse models in which Brca2, p53 and/or Pten were deleted in distinct cell populations of the mouse mammary gland. To fully describe the tumours these animals developed, detailed histopathological, immunohistochemical and gene expression analyses were performed. We demonstrate that the relative contributions of cell of origin and molecular lesion to determining mammary tumour heterogeneity are context-dependent. The final tumour phenotype is the result of both interactions between the cell of origin and genetic aberrations, and epistatic interactions between genetic aberrations within a cancer.

Gene expression microarray analysis
Samples which underwent gene expression analysis were morphologically checked to be representative. Microarray hybridization was performed by UCL Genomics (UCL, London, UK), using the Affymetrix GeneChip Mouse Genome 430 2.0 Array (Affymetrix, Santa Clara, CA, USA). Data were read using the Affymetrix package in R (v 2.11.0) and annotated using Bioconductor 2.8. Arrays were normalized with the RMA method in Expression Console 1.1 and annotated with corresponding human orthologue annotation based upon the Mouse Genome Informatics database (http://www.informatics.jax.org/). Subgroup assignment was performed based upon nearest-centroid Spearman rank correlation over 0.1, as described [13,17], using published centroid data [18]. Meta-analysis of the mouse tumour signatures in human breast cancers is fully described in the online supplementary material (see Supplementary experimental procedures). MIAME-compliant data are available (ArrayExpress, E-MEXP-3663).
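The subgroup assignment step can be illustrated with a minimal sketch. It assumes that each sample and each subtype centroid is a mapping from gene symbol to expression value, and that a sample is left unassigned when no centroid reaches a Spearman correlation above 0.1. The function names (`spearman`, `assign_subtype`) and the data layout are hypothetical and are not taken from the published pipeline or centroid files.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def assign_subtype(sample, centroids, threshold=0.1):
    """Return the best-correlated subtype name, or None if no centroid
    exceeds the threshold (assumed here to mean 'unassigned')."""
    best_name, best_rho = None, threshold
    for name, centroid in centroids.items():
        genes = sorted(set(sample) & set(centroid))  # shared genes only
        if len(genes) < 3:
            continue
        rho = spearman([sample[g] for g in genes],
                       [centroid[g] for g in genes])
        if rho > best_rho:
            best_name, best_rho = name, rho
    return best_name
```

In practice the centroids would come from the published human data after mapping mouse probes to human orthologues; the toy gene names used to exercise this sketch are placeholders.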

Results
To determine how different cells of origin interact with different initiating genetic lesions to drive tumour heterogeneity, we generated mouse cohorts carrying conditional alleles of Brca2, p53 and Pten together with either K14Cre or BlgCre, which preferentially target tumour formation to basal or lumER neg cells, respectively [13]. Cohorts of virgin/parous BlgCre animals were established. For additional information about mouse cohorts, cells of origin of the tumours and full tumour details, see supplementary material (Supplementary experimental procedures and Tables S2, S3).
Cell of origin drives tumour phenotype in Brca2-deleted mammary tumours
[...] (Figure 1A). Significant reduction in conditional p53 and Brca2 expression was shown in all tumours relative to control spleens, concordant with deletion of floxed exons (see supplementary material, Figure S1). Droplet digital PCR (ddPCR) demonstrated that tumours consistently had fewer copies of floxed Brca2 and p53 exons compared to unfloxed exons (see supplementary material, Figure S2). However, the presence of infiltrating immune cells (see supplementary material, Table S2) and likely contamination of tumour samples by other wild-type host cells meant that tumours rarely showed a floxed allele number which approached zero.
All tumours had lower expression for Pten floxed exon 4 compared with exon 6 (see supplementary material, Figure S3A, B), confirming recombination of the conditional allele during tumourigenesis. Expression of p53 exon 4 was higher in Pten f/f tumours, similar or lower in Pten f/f :p53 f/+ tumours and always reduced in Pten f/f :p53 f/f tumours relative to control spleen (see supplementary material, Figure S3C). This is consistent with Pten loss causing p53 induction in p53 +/+ mice and a dose-dependent reduction in this response following loss of one or two p53 alleles [19]. Again, ddPCR demonstrated that tumours from BlgCre:Pten f/f :p53 f/+&f/f mice had fewer copies of floxed p53 exons compared to unfloxed exons (see supplementary material, Figure S2). The same caveats regarding infiltrating immune cells apply (see supplementary material, Table S2). For technical reasons, the ddPCR assay could not be performed on the floxed Pten allele.
Addition of p53 conditional alleles into the BlgCre:Pten cohort increased the ratio of malignant:benign tumours, with 20/21 (95%) tumours being malignant (see supplementary material, Table S2). It also shifted the spectrum of histopathological phenotypes closer to that seen with BlgCre:Brca2 f/f :p53 f/f cohorts (Figure 1B [...] Figure S1) nuclear pleomorphism, lack of tubule formation and high MI (Figure 1I). Tumours had mixed borders, with central/multifocal necrosis. Spindle (5-100%) and squamous (1-50%) metaplastic cells were seen in all tumours. All were K14/K18-positive (Figure 1J, K), but staining tended to be at low levels in IDC-NSTs [...] Figure S5) including IDC-NSTs (Figure 4F, R), but was either absent or expressed at low levels in MSCTs. As in the Brca2 cohorts, PR staining was concordant with ER staining, with fewer cells PRA-positive than ER-positive, and fewer still PRB-positive than PRA-positive (see supplementary material, Figure S5, Table S2). Double immunofluorescence staining of benign and malignant BlgCre:Pten f/f AMEs, as well as malignant BlgCre:Pten f/f :p53 f/+&f/f AMEs, demonstrated that in benign tumours p63-positive and ER-positive cell populations were mutually exclusive, but in malignant tumours most p63-positive cells were also ER-positive (see supplementary material, Figure S6). This double positivity for ER/p63 suggests an aberrant differentiation in these tumour cells.
Therefore, BlgCre:Brca2 f/f :p53 f/f and BlgCre:Pten f/f mouse models showed distinct differences in tumour latency and phenotype, despite the initiating genetic lesions being targeted to the same cell population. This demonstrated that in these cases the cell of origin was not the sole determinant of tumour phenotype. Rather, the initiating genetic hits underpinned tumour behaviour and phenotype. Targeted deletion of both Pten and p53 to lumER neg cells accelerated tumour formation and, notably, resulted in a range of phenotypes that once again included IDC-NSTs and MSCTs. BlgCre:Pten f/f :p53 f/+&f/f IDC-NSTs, however, were strongly ER-positive, unlike IDC-NSTs from other cohorts (BlgCre:Brca2 f/f :p53 f/f and BlgCre:Brca1 f/f :p53 +/− ) [13,20].

Luminal ER neg -origin tumours display diverse molecular profiles determined by the initiating genetic lesion
We performed whole-transcriptome analysis of a subset of tumours from each genotype (including a previous collection of K14Cre:Brca1 f/f :p53 +/− and BlgCre:Brca1 f/f :p53 +/− tumours) [13], using the Affymetrix MouseChip Genome platform. Unsupervised hierarchical clustering showed that the tumours broadly clustered into three molecular groups ( Figure 5). One included the Brca2:p53 and Pten:p53 tumours; the second group consisted of most of Pten-only tumours; the third group included Brca1:p53 and some Pten tumours. Pairwise SAM comparisons between groups delivered a list of significantly-associated genes, which were interrogated for GO terms and KEGG pathway analysis (see supplementary material, Table S4). The Brca2:p53/Pten:p53 group (group 1) and the Brca1:p53 group (group 3) genes were highly enriched for GO Bioprocess annotations associated with transcription, metabolism, biosynthesis and regulation of cell death. In contrast, the group 2 (Pten) genes were enriched for development, homeostasis, signalling and regulation of cell death bioprocesses and expressed genes involved in 'response to hormone stimulus' and 'steroid metabolic process'. Pathway analysis showed a great similarity between all tumour groups (see supplementary material, Table S4), although with some differences. For instance, group 1 was enriched for genes associated with adhesion, junctional complexes and JAK-STAT signalling pathways, group 2 with genes associated with calcium signalling and vascular smooth muscle pathways, and group 3 with genes associated with the cell cycle and DNA replication pathways. Interestingly, genes for cysteine and methionine metabolism pathways were enriched in groups 1 and 3, while genes for glycine, serine, threonine and tyrosine metabolism pathways were enriched in group 2, suggesting fundamental differences in the metabolism of these tumour groups.
Importantly, these molecular clusters were determined by the initiating genetic lesion (Figure 5C), with expression profiles being consistent across tumours carrying the same initiating lesion. Tumours with different lesions were not randomly interspersed, nor did tumours cluster by Cre promoter. Thus, the tumour molecular profile was governed by its initiating genetic lesion, not by the cell to which those lesions were targeted.
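As an illustration of how unsupervised hierarchical clustering groups expression profiles, the following sketch performs agglomerative clustering on a handful of samples. The distance metric (1 minus Pearson correlation) and the average-linkage rule are assumptions chosen for the example, since the text does not state which were used, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def correlation_distance(x, y):
    """1 - Pearson correlation; constant profiles get distance 1.0."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 1.0
    return 1.0 - cov / (vx * vy) ** 0.5


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles: dict mapping sample name -> list of expression values.
    Returns a list of frozensets of sample names.
    """
    names = list(profiles)
    # Pairwise distances between individual samples, computed once.
    dist = {
        frozenset((a, b)): correlation_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    clusters = [frozenset([n]) for n in names]

    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters
```

Checking whether samples that share a genotype fall into the same cluster, rather than samples that share a Cre driver, is the comparison the paragraph above describes.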

Luminal ER neg cells generate basal-like, 'normal breast-like', luminal A and luminal B tumours
We next asked which human breast cancer molecular subtypes the mouse tumours of this analysis most closely resembled, using a single sample predictor gene set (SSP) [18] (Table 1, Figure 5E; see also supplementary material, Table S5). Consistent with their lack of ER expression, 9/13 (69%) Brca2:p53 mouse tumours were classed as basal-like using the PAM50 gene set, irrespective of whether they were from the K14Cre or BlgCre cohorts. Of the Pten tumours, 17/21 (81%) were categorized as 'normal breast-like', three were classed as luminal A and one as basal-like. In contrast, Pten:p53 tumours were classified as luminal B (4/10), 'normal breast-like' (3/10) or luminal A (2/10), and one could not be assigned to any subtype. Differences in the proportions of the predominant subtypes within each genotype were highly significant (p < 0.0001, χ² test) in pairwise genotype comparisons (Table 1).
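A pairwise χ² comparison of subtype proportions between two genotypes can be sketched as follows. The code computes the Pearson χ² statistic for a genotype-by-subtype contingency table and a closed-form upper-tail p value. The example table uses the Pten and Pten:p53 counts quoted above, with the unassigned tumour excluded; this layout is an assumption, because the exact table tested in Table 1 is not reproduced here, so the result should not be read as a re-derivation of the reported p value. With expected counts this small the χ² approximation is also rough.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    # Empty rows/columns carry no information and do not add freedom.
    rows = sum(1 for t in row_totals if t > 0)
    cols = sum(1 for t in col_totals if t > 0)
    return stat, (rows - 1) * (cols - 1)


def chi_square_sf(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square distribution
    with integer dof, using the closed-form finite sums."""
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2
    if dof % 2 == 0:
        return math.exp(-half) * sum(
            half ** i / math.factorial(i) for i in range(dof // 2))
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (dof - 1) // 2 + 1))
    return tail


# Columns: normal breast-like, luminal A, basal-like, luminal B.
# Counts are those quoted in the text; the table layout is an assumption.
pten_vs_pten_p53 = [
    [17, 3, 1, 0],  # Pten
    [3, 2, 0, 4],   # Pten:p53 (one unassigned tumour excluded)
]
statistic, dof = chi_square_statistic(pten_vs_pten_p53)
p_value = chi_square_sf(statistic, dof)
```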
PAM50 analysis is sensitive to sample cohort normalization issues [17]. We therefore interrogated different human breast tumour transcriptome datasets [1,3,18,21-23], including three enriched for BRCA1/2 mutation carriers [24][25][26], using mouse tumour transcriptome signatures. We built three mouse molecular signatures based upon the top probes up- and down-regulated within each mouse group (Pten only; Brca1:p53 only; combined Brca2:p53/Pten:p53), identified by SAM pairwise comparisons. Signatures were applied to each sample from each dataset. Correlation heat maps for the mouse transcriptome signature in the human datasets (Figure 5F; see also supplementary material, Figure S8) confirmed, first, that the Brca1:p53 mouse signature was associated with the human basal-like subtype and with human BRCA1 breast cancers; second, that luminal A, normal breast-like and non-BRCA1/2 cancers were enriched in breast cancer samples with a gene signature similar to the Pten mouse tumours; and third, that the Brca2:p53/Pten:p53 signature was observed across the range of human breast cancer molecular subtypes. Notably, when testing human breast cancer datasets that included the claudin-low subtype, a particular enrichment for the Brca2:p53/Pten:p53 signature was obtained in this group. The Brca2:p53/Pten:p53 signature was not enriched in human BRCA2 tumours; indeed, in one study [25] the association was with BRCA1 tumours.
Figure 5 (caption, partially recovered). [...] [18] to identify the human breast cancer subtypes the mouse tumours most closely resemble; white squares indicate no association. Expression data of normal mouse populations and Brca1 tumours were taken from previously published work [13,34]. Pvclust analysis confirmed that the stability of the three main tumour molecular clusters was > 90% (see supplementary material, Figure S7 and Supplementary experimental procedures, for confirmatory analysis that microarray batch variation did not affect clustering). (F) Analysis for enrichment of up- and down-regulated gene sets in mouse signatures of groups 1 (enriched in Brca2 and Pten:p53 tumours), 2 (enriched in Pten tumours) and 3 (enriched in Brca1 tumours) in human breast cancer datasets [3,22]. Spearman rank correlation values for each signature were plotted against dataset molecular phenotypes as correlation heat maps. Note that group 2 and 3 signatures correlated with the human luminal A/normal breast-like and the basal-like subtypes, respectively (both heat maps), whereas the group 1 signature was highly correlated with the human claudin-low subtype (right heat map). (See also supplementary material, Figure S8)
These results showed that tumours deriving from the same cell of origin, lumER neg cells, not only had very different molecular features, depending on the initiating genetic lesions, but also spanned a broad range of human-equivalent molecular signatures. Hence, the 'intrinsic subtype' classification of a tumour does not necessarily reflect its cell of origin.

Luminal ER neg cells generate 'claudin-low' tumours
The claudin-low subtype is not distinguished by the PAM50 gene set. This subtype is characterized by up-regulation of mesenchymal-associated genes and down-regulation of genes related to epithelial cell-cell junctions, particularly claudins CLDN3, −4 and −7 and CDH1 [22]. A 'mesenchymal-like' appearance was typical of the MSCTs from our tumour cohorts. In addition, the Brca2:p53/Pten:p53 signature, which derives from the two tumour genotypes with high numbers of MSCTs, was enriched in the claudin-low subtype in those breast cancer datasets that included that group. We therefore analysed expression of Cldn3, Cldn4, Cldn7 and Cdh1 across the tumour panel categorized by histological phenotype. The results confirmed that MSCTs had lower expression levels of these four genes compared to other tumour types (Figure 6; the difference was statistically significant for all genes except Cldn4, p = 0.056), and indeed of the whole gene set reported as down-regulated in the claudin-low phenotype [22] (see supplementary material, Figure S9). This demonstrates that the transcriptomic signature of MSCTs recapitulates that of claudin-low tumours, suggesting that this tumour type can also originate from lumER neg cells.
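The Figure 6 legend reports a one-way ANOVA across histological phenotypes for each gene. A minimal sketch of the F statistic for such a comparison is given below. The function name and the grouping of values by phenotype label are illustrative assumptions, and any values passed to it here would be hypothetical rather than measured expression data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: dict mapping a group label (e.g. a histological phenotype)
    to a list of expression values for one gene.
    """
    group_means = {label: mean(values) for label, values in groups.items()}
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(values) * (group_means[label] - grand_mean) ** 2
        for label, values in groups.items())
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - group_means[label]) ** 2
        for label, values in groups.items() for v in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    if df_between < 1 or df_within < 1 or ss_within == 0:
        raise ValueError("need >= 2 groups with within-group variation")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

The resulting F value would then be compared against the F distribution with the returned degrees of freedom to obtain a p value, a step omitted here to keep the sketch dependency-free.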

Human metaplastic tumours have variable PTEN expression but express low claudin levels
Using a pilot cohort of human breast cancers, including some very rare human AMEs, we examined whether there was an association between PTEN expression and the human histological phenotypes equivalent to those in our mouse cohorts. We found that staining of metaplastic tumours and IDC-NSTs was variable, but the few human AMEs we examined were very strongly PTEN-positive (see supplementary material, Table S6, Figure S10). We also examined the same tumour group for expression of CLDN3, CLDN4 and CDH1. Unlike PTEN staining, these results were concordant with the mouse data, as human IDC-NSTs expressed high protein levels, but there was absence of expression in non-epithelial areas of spindle-cell carcinomas and metaplastic carcinomas with mesenchymal differentiation (see supplementary material, Table S6, Figure S10).

Discussion
Inter-tumour heterogeneity must arise from different (epi)genetic lesions occurring in different cells of origin. Here, we have applied histopathological and molecular pathology approaches to analyse tumours arising in genetically engineered mouse models from different initiating lesions in distinct cells of origin. We show that, in our model system, targeting tumour-initiating lesions to basal cells results primarily in adenomyoepitheliomas, whereas targeting lumER neg cells results in tumours with a range of histopathological features including metaplastic tumours and invasive ductal carcinomas [4]. Importantly, we have generated both ER-positive luminal-like and ER-negative basal-like tumours from this target population. We have also shown that the initial genetic lesion is the prime determinant of the molecular profile of the subsequent tumours arising from these cells. This suggests that, rather than being a truly stochastic process, the aetiology of tumour formation is largely deterministic and depends on the earliest events in carcinogenesis (ie the founder genetic/epigenetic events).
Germline mutations in human BRCA2 predispose to breast and ovarian cancers [27]. Although 66-93% of BRCA2-associated human cancers are ER-positive in > 10% tumour cells [28,29], only 3/15 (20%) of the Brca2 IDC-NSTs described here were ER-positive (1-10% tumour cells). As the same cells of origin could generate ER-positive IDC-NSTs in the Pten:p53 model, this is not likely explained by a lack of potential to differentiate along this lineage. Moreover, most of the Brca2 mouse tumours had a molecular profile similar to human basal-like breast cancers (Table 1, Figure 5F) and did not resemble a typical human BRCA2 tumour profile (see supplementary material, Figure S8). It should be noted, however, that a subset (13-19%) of human BRCA2-mutated breast cancers have a basal-like molecular profile and are also ER-negative [30], which would be consistent with the similarity of the Brca2:p53/Pten:p53 signature to human BRCA1 tumours in data from one study [25]. It is possible that BlgCre:Brca2:p53 tumours model the basal-like subset of human BRCA2 breast cancers.
Figure 6. Low expression of claudin-related genes in metaplastic spindle cell carcinomas. Box plots showing expression levels for mouse orthologues of human genes (Cldn3, Cldn4, Cldn7 and Cdh1) characteristically down-regulated in the human claudin-low molecular subtype. Mouse tumours were grouped in five categories, based solely on histological phenotype: MSCT, metaplastic spindle cell tumours; IDC-NST, invasive ductal carcinoma of no special type; AME, malignant adenomyoepithelioma; ASQC, adenosquamous carcinoma; Benign AME, benign adenomyoepithelioma. Note that MSCTs overall show lower expression for each of the genes as compared with the other phenotypes. These differences were statistically significant (one-way ANOVA, * p < 0.05) for all genes except Cldn4 (p = 0.056). (See also supplementary material, Figure S9)
Loss of PTEN expression is recurrent in human breast cancers, in both basal and luminal subtypes [3]. In our study, targeting conditional depletion of Pten alone to mouse lumER neg -cells resulted in a different effect to Brca1/2:p53 loss, leading to the development of benign and malignant AMEs and ASQCs. Notably, AMEs were highly differentiated and ER-positive. In contrast, analysis of a pilot cohort of human breast cancers, including very rare human AMEs, found strong PTEN expression in these tumours. Larger numbers are required to confirm these findings, but they suggest that human AMEs are not associated with somatic PTEN loss, unlike in the mouse. Notably, however, breast tumours from germline PTEN loss-syndrome families are enriched for molecular apocrine differentiation, which is characterized by elevated levels of androgen signalling [31], and our mouse Pten tumour cohort also expressed high levels of the androgen receptor (see supplementary material, Table S4). The mouse tumour phenotypes were altered when conditional Pten:p53 alleles were combined, resulting in the development of MSCTs and ER-positive IDC-NSTs. In these tumours, the molecular changes observed (loss of claudins in MSCTs) were reflected in the equivalent human tumours.
Our findings show that a broad spectrum of tumour phenotypes can emerge from the lumER neg cell population, they suggest that p53 loss-of-function is a prime driver of histopathological phenotype, and they demonstrate that, in contrast, cell of origin is not a strict driver of tumour phenotype. Our results are consistent with the notion that mammary tumour heterogeneity is a result of context-dependent interactions between cell of origin and early genetic hits. In the K14Cre basal-origin tumours we describe, the AME phenotype is the default tumour type, irrespective of the driving genetic lesion. Conversely, lumER neg cells are able to generate a broad spectrum of tumour histological and molecular phenotypes, including highly aggressive ER/PR-negative and ER/PR-positive neoplasms. Since tumours had long latency periods, and additional genetic mutations must have arisen in all genetic backgrounds to permit tumour formation, the stability of histological phenotypes within each genetic background was notable. Either any additional genetic hits were stochastic and had little effect on overall tumour phenotype, or each cell of origin/genetic background combination developed a set of stereotypical lesions that contributed to the tumour phenotype. Future massively-parallel sequencing studies may lead to a deeper understanding of the mutational changes in these genetic backgrounds.
Interestingly, other groups described K14Cre-driven models (K14Cre:Brca1 f/f :p53 f/f and K14Cre:Ecad f/f : p53 f/f ) [32,33], in which the predominant tumour phenotype was not an AME but rather a more typical luminal-like tumour. We have discussed this issue previously [13] but our current results support a model in which the Brca1 and Ecad alleles used by Liu and colleagues and Derksen and colleagues are dominant over the K14Cre cell of origin in driving tumour phenotype, in a way in which the alleles we have used are not.
Our study has important limitations. While the BlgCre transgene preferentially drives tumour formation in lumER neg cells, we cannot definitively exclude that promoter 'leakiness' may, in a modest number of cases, result in tumours originating from other cell types, or that initial gene deletions may affect cell differentiation and thus alter the phenotype of the cell that finally transforms (see supplementary material, Supplementary experimental procedures); whereas equivalent mouse and human mammary epithelial cell types can be inferred (ie cells which are luminal or basal, ER-positive or ER-negative), the cell types in which allele recombination occurs in the mouse have not been directly mapped to human cell types; the mouse strains we have used, while mainly on a C57Bl6 background, are not pure-bred (see supplementary material, Supplementary experimental procedures) and there may be background strain-specific alleles linked to the conditional alleles which could affect tumour phenotypes; in our models, and in all current mouse models involving more than one conditional allele, it is not possible to control the order in which allele recombination occurs; and finally, we have not yet observed tumours that resemble sporadic human ER-positive IDC-NSTs with a luminal A molecular profile. We hypothesize that either lumER pos progenitors will need to be targeted as the cell of origin for this tumour type, or that these tumours are simply too indolent to be modelled within the mouse lifespan. In general we note that, while mouse models are important as models of breast cancer, mice are not humans and caution must be exercised in extrapolating results between species, as is illustrated by the case of PTEN expression in AMEs.
Despite these limitations, this study does provide a fundamental advance in our understanding of the origins of mammary tumour heterogeneity. We provide multiple lines of evidence to demonstrate that the phenotype of a cancer is not a mere reflection of its cell of origin, calling into question conclusions about the histogenesis of malignancies derived from histopathological, immunophenotypical and transcriptomic analyses of fully developed tumours.

SUPPLEMENTARY MATERIAL ON THE INTERNET
The following supplementary material may be found in the online version of this article:
Table S1. Primers used for genotyping and TaqMan gene expression assays
Table S2. Full malignant tumour histological features and immunohistochemical findings
Table S3. Full benign tumour histological features and immunohistochemical findings
Table S4. Genes up-regulated in tumour molecular clusters determined by SAM pairwise comparisons and analysed by Gene Ontology (GO) and KEGG pathway analysis