Recent gene expression profiling studies using cDNA microarrays in relatively small series of breast cancers have identified biologically distinct groups with apparent clinical and prognostic relevance. Such new taxonomies should be validated on larger series of cases before they are accepted in clinical practice. Tissue microarray (TMA) technology provides a methodology for high-throughput, concomitant analysis of multiple proteins on large numbers of archival tumour samples. In our study, we applied immunohistochemistry to TMA preparations of 1,076 cases of invasive breast cancer to study the combined protein expression profiles of a large panel of well-characterized, commercially available biomarkers related to epithelial cell lineage, differentiation, hormone and growth factor receptors and gene products known to be altered in some forms of breast cancer. Hierarchical clustering identified 5 groups with distinct patterns of protein expression. A sixth group of only 4 cases was also identified but was deemed too small for further detailed assessment. Further analysis of these clusters was performed using a multilayer perceptron (MLP) artificial neural network (ANN) with a back-propagation algorithm to identify the key biomarkers driving membership of each group. Two large groups were characterized by luminal epithelial cell phenotypic characteristics, hormone receptor positivity, absence of basal epithelial phenotypic characteristics and lack of c-erbB-2 protein overexpression. Two additional groups were characterized by high c-erbB-2 positivity and negative or weak hormone receptor expression but differed in MUC1 and E-cadherin expression. The final group was characterized by strong basal epithelial characteristics, p53 positivity, absent hormone receptor expression and weak or low luminal epithelial cytokeratin expression. 
In addition, we found significant differences between the clusters in this series with respect to established prognostic factors, including tumour grade, size and histologic tumour type, as well as differences in patient outcome. The different protein expression profiles identified in our study confirm the biologic heterogeneity of breast cancer and demonstrate the clinical relevance of classification in this manner. These observations could form the basis for revising existing traditional classification systems for breast cancer. © 2005 Wiley-Liss, Inc.
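
To illustrate the kind of grouping that hierarchical clustering of protein expression profiles produces, the following is a minimal sketch in Python. It is not the analysis pipeline used in the study: the six cases, the five marker columns, the 0 to 1 score scale, the Euclidean distance and the average-linkage rule are all assumptions chosen for illustration, and the names `CASES`, `MARKERS`, `average_linkage` and `agglomerate` are hypothetical. The MLP-ANN step is not shown.

```python
from itertools import combinations
from math import dist

# Hypothetical immunohistochemistry scores on a 0-1 scale.
# These values are invented for illustration and are NOT data from the study.
MARKERS = ["ER", "luminal_CK", "c-erbB-2", "basal_CK", "p53"]
CASES = {
    "case_01": [0.9, 0.8, 0.0, 0.0, 0.1],  # luminal-like, hormone receptor positive
    "case_02": [0.8, 0.9, 0.1, 0.0, 0.0],
    "case_03": [0.1, 0.5, 0.9, 0.0, 0.2],  # c-erbB-2 overexpressing
    "case_04": [0.0, 0.6, 1.0, 0.1, 0.3],
    "case_05": [0.0, 0.1, 0.0, 0.9, 0.9],  # basal-like, p53 positive
    "case_06": [0.0, 0.2, 0.1, 0.8, 1.0],
}


def average_linkage(a, b, profiles):
    """Mean Euclidean distance over all case pairs drawn from clusters a and b."""
    total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerate(profiles, n_clusters):
    """Bottom-up clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(profiles)]
    while len(clusters) > n_clusters:
        # Find the pair of cluster indices (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], profiles),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters


def mean_profile(cluster, profiles):
    """Average score per marker across the cases in one cluster."""
    return [
        sum(profiles[case][k] for case in cluster) / len(cluster)
        for k in range(len(MARKERS))
    ]


clusters = agglomerate(CASES, n_clusters=3)
for cluster in clusters:
    summary = dict(zip(MARKERS, (round(v, 2) for v in mean_profile(cluster, CASES))))
    print(cluster, summary)
```

On this toy input the three recovered groups correspond to the luminal-like, c-erbB-2 overexpressing and basal-like profiles that were built into the invented data. In practice the number of clusters would be read from the dendrogram rather than fixed in advance.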